File bcftools.changes of Package bcftools
-------------------------------------------------------------------
Fri May 14 10:42:08 UTC 2021 - Ferdinand Thiessen <rpm@fthiessen.de>
- Update to version 1.12
* The output file type is determined from the output file name
suffix, where available, so the -O/--output-type option is often
no longer necessary.
* Make F_MISSING in filtering expressions work for sites with
multiple ALT alleles
* Fix N_PASS and F_PASS to behave according to expectation when
reverse logic is used (#1397). This fix has the side effect of
query (or programs like +trio-stats) behaving differently with
these expressions, operating now in site-oriented rather than
sample-oriented mode.
* bcftools annotate:
* New --rename-annots option to help fix broken VCFs
* New -C option to read a long list of options from a file,
which avoids very long command lines.
* New append-missing logic allows annotations to be added for
each ALT allele in the same order as they appear in the VCF.
* bcftools concat:
* Do not phase genotypes by mistake if they are not already
phased with -l
* bcftools consensus:
* New --mask-with, --mark-del, --mark-ins, --mark-snv options
* Symbolic <DEL> should have only one REF base. If there are
multiple, take POS+1 as the first deleted base.
* Make consensus work when the first base of the reference genome
is deleted.
* bcftools +contrast:
* The NOVELGT annotation was previously not added when requested.
* bcftools convert:
* Make the --hapsample and --hapsample2vcf options consistent with
each other and with the documentation.
* bcftools call:
* Revamp of call -G: previously, sample grouping by population was
not truly independent and could still be influenced by the
presence of other sample groups.
* Optional addition of INFO/PV4 annotation with call -a INFO/PV4
* Remove generation of useless HOB and ICB annotation;
use +fill-tags -- -t HWE,ExcHet instead
* The call -f option was renamed to -a to (1) make it consistent
with mpileup and (2) to indicate that it includes both INFO and
FORMAT annotations
* bcftools csq:
* Fix a bug which caused incorrect FORMAT/BCSQ formatting at sites
with too many per-sample consequences
* Fix a bug which incorrectly handled the --ncsq parameter and
could clash with reserved BCF values, consequently producing
truncated or even incorrect output of the %TBCSQ formatting
expression in bcftools query.
* bcftools +fill-tags:
* MAF definition revised for multiallelic sites: the second most
common allele is considered to be the minor allele
* New FORMAT/VAF, VAF1 annotations to set the fraction of
alternate reads provided FORMAT/AD is present
* bcftools gtcheck:
* support matching of a single sample against all other samples
in the file with -s qry:sample -s gt:-.
* bcftools merge:
* Make merge -R behavior consistent with other commands and pull
in overlapping records with POS outside of the regions
* Bug fix
* bcftools mpileup:
* Add new optional tag mpileup -a FORMAT/QS
* bcftools norm:
* New -a, --atomize functionality to decompose complex variants,
for example MNVs into consecutive SNVs
* New option --old-rec-tag to indicate the original variant
* bcftools query:
* Incorrect fields were printed in the per-sample output when a
subset of samples was requested via -s/-S and the order of
samples in the header was different from the requested -s/-S order
* bcftools +prune:
* New options --random-seed and --nsites-per-win-mode
* bcftools +split-vep:
* Transcript selection now works also on the raw CSQ/BCSQ annotation.
* Bug fix: samples were dropped on VCF input and VCF/BCF output
* bcftools stats:
* Changes to QUAL and ts/tv plotting stats: avoid capping QUAL to
predefined bins, use an open-range logarithmic binning instead
* plot dual ts/tv stats: per quality bin and cumulative as if
threshold applied on the whole dataset
* bcftools +trio-dnm2:
* Major revamp of +trio-dnm plugin, which is now deprecated
and replaced by +trio-dnm2.
* The original trio-dnm calling model used genotype likelihoods
(PLs) as the input for calling.
* This new version also implements the DeNovoGear model.
* For more details see http://samtools.github.io/bcftools/trio-dnm.pdf
- Update use_python3.patch
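
The revised +fill-tags MAF rule for multiallelic sites noted in this entry (the second most common allele is treated as the minor allele) can be illustrated with a short Python sketch. This is not the plugin's code: the function name `minor_allele_frequency` and the list-of-counts input are hypothetical and chosen only for the illustration.

```python
def minor_allele_frequency(allele_counts):
    """Return the frequency of the second most common allele.

    allele_counts: observed allele counts, REF first and then each ALT
    (a hypothetical input layout used only for this illustration).
    """
    total = sum(allele_counts)
    # A site with no observations or a single allele has no minor allele.
    if total == 0 or len(allele_counts) < 2:
        return 0.0
    ranked = sorted(allele_counts, reverse=True)
    # ranked[0] is the major allele, ranked[1] the minor allele.
    return ranked[1] / total


# Triallelic site: REF seen 70 times, ALT1 20 times, ALT2 10 times.
print(minor_allele_frequency([70, 20, 10]))  # 0.2
```

With the counts above, the rarest allele (10 observations) is ignored and the frequency of the second most common allele is reported.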
-------------------------------------------------------------------
Thu May 13 00:53:30 UTC 2021 - Ferdinand Thiessen <rpm@fthiessen.de>
- Update to version 1.11
* Breaking change in -i/-e expressions on the FILTER column.
The new behaviour is:
Expression     Result
FILTER="A"     Exact match, for example "A;B" does not pass
FILTER!="A"    Exact match, for example "A;B" does pass
FILTER~"A"     Both "A" and "A;B" pass
FILTER!~"A"    Neither "A" nor "A;B" pass
* Fix in commutative comparison operators, in some cases reversing
sides would produce incorrect results
* Better support for filtering on sample subsets
* bcftools annotate:
* Previously it was not possible to use --columns =TAG with INFO
tags and the --merge-logic feature was restricted to tab files
with BEG,END columns, now extended to work also with REF,ALT.
* Make annotate -TAG/+TAG work also with FORMAT fields.
* ID and FILTER can be transferred to INFO and ID can be populated
from INFO.
* bcftools consensus:
* Fix in handling symbolic deletions and overlapping variants.
* Fix --iupac-codes crash on REF-only positions with ALT=".".
* Fix --chain crash
* Preserve the case of the genome reference.
* Add new -a, --absent option which allows setting positions with
no supporting evidence to "N" (or any other character).
* bcftools convert:
* The option --vcf-ids now works also with --haplegendsample2vcf.
* New option --keep-duplicates
* bcftools csq:
* Add misc/gff2gff.py script for conversion between various
flavors of GFF files. The initial commit supports only one type
* Add missing consequence types.
* Allow overlapping CDS to support ribosomal slippage.
* bcftools +fill-tags:
* Added new annotations: INFO/END, TYPE, F_MISSING.
* bcftools filter:
* Make --SnpGap optionally filter also SNPs close to other variant
types.
* bcftools gtcheck:
* Complete revamp of the command. The new version is faster and allows
N:M sample comparisons, not just 1:N or NxN comparisons. Some
functionality was lost (plotting and clustering) but may be added back
on popular demand.
* bcftools +mendelian:
* Revamp of user options, output VCFs with mendelian errors annotation,
read PED files
* bcftools merge:
* Update headers when appropriate with the '--info-rules *:join'
INFO rule.
* Local allele merging that produces LAA and LPL when requested, a
draft implementation of samtools/hts-specs#434
* New --no-index option which allows merging unindexed files.
* Fixes in gVCF merging.
* bcftools norm:
* Fixes in the --check-ref s reference-setting feature with non-ACGT bases.
* New --keep-sum switch to keep vector sum constant when splitting
multiallelics.
* bcftools +prune:
* Extend to allow annotating with various LD metrics: r^2, Lewontin's D'
* bcftools query:
* New %N_PASS() formatting expression to output the number of samples
that pass the filtering expression.
* bcftools reheader:
* Improved error reporting to prevent user mistakes.
* bcftools roh:
* The --AF-file description incorrectly suggested "REF\tALT"
instead of the correct "REF,ALT".
* RG lines could have negative length.
* New --include-noalt option to also allow ALT=. records.
* bcftools scatter:
* New plugin intended as a convenient inverse to concat
* bcftools +split:
* New --groups-file option for more flexibility of defining
desired output
* New --hts-opts option to reduce required memory by reusing
one output header and allow overriding the default hFile's
block size
* Add support for multisample output and sample renaming
* bcftools +split-vep:
* Add default types (Integer, Float, String) for VEP subfields
and make --columns - extract all subfields into INFO tags
in one go.
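
The FILTER comparison rules tabulated at the top of this entry can be modelled with a short Python sketch. It only reproduces the four rows of the table for a single filter name; the function name `filter_matches` is hypothetical, and queries naming several filters are deliberately out of scope because the table does not describe them.

```python
def filter_matches(filter_field, operator, name):
    """Model the 1.11 FILTER rules for a single filter name.

    filter_field: value of the FILTER column, e.g. "A;B"
    operator:     one of "=", "!=", "~", "!~"
    name:         a single filter name from the expression, e.g. "A"
    """
    present = set(filter_field.split(";"))
    if operator == "=":    # exact match: the record carries only this filter
        return present == {name}
    if operator == "!=":   # negated exact match
        return present != {name}
    if operator == "~":    # the filter is among those set on the record
        return name in present
    if operator == "!~":   # the filter is not set on the record
        return name not in present
    raise ValueError("unsupported operator: " + operator)


for op in ("=", "!=", "~", "!~"):
    print(op, filter_matches("A", op, "A"), filter_matches("A;B", op, "A"))
```

Each printed row shows the result for a record with FILTER "A" followed by the result for a record with FILTER "A;B".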
-------------------------------------------------------------------
Tue Feb 25 11:46:58 UTC 2020 - Pierre Bonamy <flyos@mailoo.org>
- Changed python dependencies from python3 to python3-base and
python3-matplotlib
-------------------------------------------------------------------
Wed Feb 12 15:10:09 UTC 2020 - Todd R <toddrme2178@gmail.com>
- Add use_python3.patch to switch from python2 to python3
-------------------------------------------------------------------
Wed Feb 5 19:01:01 UTC 2020 - Todd R <toddrme2178@gmail.com>
- Update to 1.10.2
* This release fixes crashes reported on files including integer
INFO tags with values outside the range officially supported
by VCF. It also fixes a bug where invalid BCF files would be
created if such values were present.
- Update to 1.10.0
+ Numerous bug fixes, usability improvements and sanity checks were added to prevent common user errors.
+ The -r, --regions (and -R, --regions-file) option should never create unsorted VCFs or duplicate records again. This also fixes rare cases where a spanning deletion makes a subsequent record invisible to bcftools isec and other commands.
+ Additions to filtering and formatting expressions
* support for the spanning deletion alternate allele (ALT=*)
* new ILEN filtering expression to be able to filter by indel length
* new MEAN, MEDIAN, MODE, STDEV, phred filtering functions
* new formatting expression %PBINOM (phred-scaled binomial probability), %INFO (the whole INFO column), %FORMAT (the whole FORMAT column), %END (end position of the REF allele), %END0 (0-based end position of the REF allele), %MASK (with multiple files indicates the presence of the site in other files)
+ New plugins
* +gvcfz: compress gVCF file by resizing gVCF blocks according to specified criteria
* +indel-stats: collect various indel-specific statistics
* +parental-origin: determine parental origin of a CNV region
* +remove-overlaps: remove overlapping variants.
* +split-vep: query structured annotations such as INFO/CSQ created by bcftools csq or VEP
* +trio-dnm: screen variants for possible de-novo mutations in trios
+ annotate
* new -l, --merge-logic option for combining multiple overlapping regions
+ call
* new bcftools call -G, --group-samples option which allows grouping samples into populations and applying the HWE assumption within but not across the groups.
+ csq
* significant reduction of memory usage in the local -l mode for VCFs with thousands of samples and 20% reduction in the non-local haplotype-aware mode.
* fixes a small memory leak and formatting issue in FORMAT/BCSQ at sites with many consequences
* do not print protein sequence of start_lost events
* support for "start_retained" consequence
* support for symbolic insertions (ALT="<INS...>"), "feature_elongation" consequence
* new -b, --brief-predictions option to output abbreviated protein predictions.
+ concat
* the --naive command now checks header compatibility when concatenating multiple files.
+ consensus
* add a new -H, --haplotype 1pIu/2pIu feature to output first/second allele for phased genotypes and the IUPAC code for unphased genotypes
* new -p, --prefix option to add a prefix to sequence names on output
+ +contrast
* added support for Fisher's test probability and other annotations
+ +fill-from-fasta
* new -N, --replace-non-ACGTN option
+ +dosage
* fix some serious bugs in dosage calculation
+ +fill-tags
* extended to perform simple on-the-fly calculations such as calculating INFO/DP from FORMAT/DP.
+ merge
* add support for merging FORMAT strings
* bug fixed in gVCF merging
+ mpileup
* a new optional SCR annotation for the number of soft-clipped reads
+ reheader
* new -f, --fai option for updating contig lines in the VCF header
+ +trio-stats
* extend output to include DNM homs and recurrent DNMs
+ VariantKey support
-------------------------------------------------------------------
Thu Sep 6 08:43:05 UTC 2018 - flyos@mailoo.org
- Update to 1.9
* `annotate`
- REF and ALT columns can now be transferred from the annotation
file.
- fixed bug when setting vector_end values.
* `consensus`
- new -M option to control output at missing genotypes
- variants immediately following insertions should not be skipped.
Note however, that the current fix requires normalized VCF and may
still falsely skip variants adjacent to multiallelic indels.
- bug fixed in -H selection handling
* `convert`
- the --tsv2vcf option now makes the missing genotypes diploid,
"./." instead of "."
- the behavior of -i/-e with --gvcf2vcf changed. Previously only
sites with FILTER set to "PASS" or "." were expanded and the -i/-e
options dropped sites completely. The new behavior is to let the -i/-e
options control which records will be expanded. In order to drop
records completely, one can stream through "bcftools view" first.
* `csq`
- since the real consequences of start/splice events are not known,
the amino acid positions at subsequent variants should stay unchanged
- add `--force` option to skip malformatted transcripts in GFFs
with out-of-phase CDS exons.
* `+dosage`: output all alleles and all their dosages at multiallelic
sites
* `+fixref`: fix serious bug in -m top conversion
* `-i/-e` filtering expressions:
- add two-tailed binomial test
- add functions N_PASS() and F_PASS()
- add support for lists of samples in filtering expressions; with
many samples it was impractical to list them all on the command line.
Samples can now be read from a file, e.g., GT[@samples.txt]="het"
- allow multiple perl functions in the expressions and some bug
fixes
- fix a parsing problem, '@' was not removed from '@filename'
expressions
* `mpileup`: fixed bug where, if samples were renamed using the `-G`
(`--read-groups`) option, some samples could be omitted from the
output file.
* `norm`: update INFO/END when normalizing indels
* `+split`: new -S option to subset samples and to use custom file
names instead of the defaults
* `+smpl-stats`: new plugin
* `+trio-stats`: new plugin
* Fixed build problems with non-functional configure script produced
on some platforms
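
The N_PASS() and F_PASS() functions added in this release can be pictured with a minimal Python sketch. It assumes that N_PASS counts the samples passing an expression and F_PASS gives the corresponding fraction. The helper names `n_pass` and `f_pass`, the per-sample depth values and the threshold are all hypothetical, and the sketch does not reflect how bcftools evaluates expressions internally.

```python
def n_pass(sample_values, predicate):
    """Count the samples whose value satisfies the predicate."""
    return sum(1 for value in sample_values if predicate(value))


def f_pass(sample_values, predicate):
    """Fraction of samples whose value satisfies the predicate."""
    if not sample_values:
        return 0.0
    return n_pass(sample_values, predicate) / len(sample_values)


# Hypothetical per-sample read depths at one site.
depths = [12, 3, 25, 8]
print(n_pass(depths, lambda dp: dp > 10))  # 2
print(f_pass(depths, lambda dp: dp > 10))  # 0.5
```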
-------------------------------------------------------------------
Thu Jul 12 08:58:12 UTC 2018 - flyos@mailoo.org
- Cleaned spec file using spec-cleaner
- Update to 1.8
* `-i, -e` filtering: Support for custom perl scripts
* `+contrast`: New plugin to annotate genotype differences between groups of samples
* `+fixploidy`: New options for simpler ploidy usage
* `+setGT`: Target genotypes can be set to phased by giving `--new-gt p`
* `run-roh.pl`: Allow to pass options directly to `bcftools roh`
* A number of bug fixes
* `-i, -e` filtering: Major revamp, improved filtering by FORMAT fields
and missing values. New GT=ref,alt,mis etc. keywords; check the documentation
for details.
* `query`: Only matching samples are printed when both the -f and -i/-e
expressions contain genotype fields. Note that this changes the original
behavior. Previously all samples were output when one matching sample was
found. This functionality can be achieved by pre-filtering with view and then
streaming to query. Compare
bcftools query -f'[%CHROM:%POS %SAMPLE %GT\n]' -i'GT="alt"' file.bcf
and
bcftools view -i'GT="alt"' file.bcf -Ou | bcftools query -f'[%CHROM:%POS %SAMPLE %GT\n]'
* `annotate`: New -k, --keep-sites option
* `consensus`: Fix --iupac-codes output
* `csq`: Homs always considered phased and other fixes
* `norm`: Make `-c none` work and remove `query -c`
* `roh`: Fix errors in the RG output
* `stats`: Allow IUPAC ambiguity codes in the reference file; report the number of missing genotypes
* `+fill-tags`: Add ExcHet annotation
* `+setGT`: Fix bug in binom.test calculation, previously it worked only for nAlt<nRef!
* `+split`: New plugin to split a multi-sample file into single-sample files in one go
* Improve python3 compatibility in plotting scripts
* New `sort` command.
* New options added to the `consensus` command. Note that the `-i, --iupac`
option has been renamed to `-I, --iupac`, in favor of the standard
`-i, --include`.
* Filtering expressions (`-i/-e`): support for `GT=<type>` expressions and
for lists and ranges (#639) - see the man page for details.
* `csq`: relax some GFF3 parsing restrictions to enable using Ensembl
GFF3 files for plants (#667)
* `stats`: add further documentation to output stats files (#316) and
include haploid counts in per-sample output (#671).
* `plot-vcfstats`: further fixes for Python3 (@nsoranzo, #645, #666).
* `query` bugfix (#632)
* `+setGT` plugin: new option to set genotypes based on a two-tailed binomial
distribution test. Also, allow combining `-i/-e` with `-t q`.
* `mpileup`: fix typo (#636)
* `convert --gvcf2vcf` bugfix (#641)
* `+mendelian`: recognize some mendelian inconsistencies that were
being missed (@oronnavon, #660), also add support for multiallelic
sites and sex chromosomes.
-------------------------------------------------------------------
Mon Jul 10 21:28:20 UTC 2017 - flyos@mailoo.org
- Update to 1.5
- Fixed some runtime dependencies (perl and python-matplotlib)
-------------------------------------------------------------------
Sun May 22 09:16:49 UTC 2016 - flyos@mailoo.org
- Initial release