Overview

Request 717710 accepted

- Update to version 1.74
* Our core sequence objects (``Seq``, ``UnknownSeq``, and ``MutableSeq``) now
have a string-like ``.join()`` method.
* The NCBI now allows longer accessions in the GenBank file LOCUS line, meaning
the fields may not always follow the historical column based positions. We
no longer give a warning when parsing these. We now allow writing such files
(although with a warning as support for reading them is not yet widespread).
* Support for the ``mysqlclient`` package, a fork of MySQLdb, has been added.
* We now capture the IDcode field from PDB Header records.
* ``Bio.pairwise2``'s pretty-print output from ``format_alignment`` has been
optimized for local alignments: If they do not consist of the whole sequences,
only the aligned section of the sequences are shown, together with the start
positions of the sequences (in 1-based notation). Alignments of lists will now
also be prettily printed.
* ``Bio.SearchIO`` now supports parsing the text output of the HHsuite protein
sequence search tool. The format name is ``hhsuite2-text`` and
``hhsuite3-text``, for versions 2 and 3 of HHsuite, respectively.
* ``Bio.SearchIO`` HSP objects has a new attribute called ``output_index``. This
attribute is meant for capturing the order by which the HSP were output in the
parsed file and is set with a default value of -1 for all HSP objects. It is
also used for sorting the output of ``QueryResult.hsps``.
* ``Bio.SeqIO.AbiIO`` has been updated to preserve bytes value when parsing. The
goal of this change is make the parser more robust by being able to extract
string-values that are not utf-8-encoded. This affects all tag values, except
for ID and description values, where they need to be extracted as strings
to conform to the ``SeqRecord`` interface. In this case, the parser will
attempt to decode using ``utf-8`` and fall back to the system encoding if that
fails. This change affects Python 3 only.
* ``Bio.motifs.mast`` has been updated to parse XML output files from MAST over
the plain-text output file. The goal of this change is to parse a more
structured data source with minimal loss of functionality upon future MAST
releases. Class structure remains the same plus an additional attribute
``Record.strand_handling`` required for diagram parsing.
* ``Bio.Entrez`` now automatically retries HTTP requests on failure. The
maximum number of tries and the sleep between them can be configured by
changing ``Bio.Entrez.max_tries`` and ``Bio.Entrez.sleep_between_tries``.
(The defaults are 3 tries and 15 seconds, respectively.)
* All tests using the older print-and-compare approach have been replaced by
unittests following Python's standard testing framework.
* On the documentation side, all the public modules, classes, methods and
functions now have docstrings (built in help strings). Furthermore, the PDF
version of the *Biopython Tutorial and Cookbook* now uses syntax coloring
for code snippets.
* Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite, and there has been further work to follow the
Python PEP8, PEP257 and best practice standard coding style.

Request History
Todd R's avatar

TheBlackCat created request

- Update to version 1.74
* Our core sequence objects (``Seq``, ``UnknownSeq``, and ``MutableSeq``) now
have a string-like ``.join()`` method.
* The NCBI now allows longer accessions in the GenBank file LOCUS line, meaning
the fields may not always follow the historical column based positions. We
no longer give a warning when parsing these. We now allow writing such files
(although with a warning as support for reading them is not yet widespread).
* Support for the ``mysqlclient`` package, a fork of MySQLdb, has been added.
* We now capture the IDcode field from PDB Header records.
* ``Bio.pairwise2``'s pretty-print output from ``format_alignment`` has been
optimized for local alignments: If they do not consist of the whole sequences,
only the aligned section of the sequences are shown, together with the start
positions of the sequences (in 1-based notation). Alignments of lists will now
also be prettily printed.
* ``Bio.SearchIO`` now supports parsing the text output of the HHsuite protein
sequence search tool. The format name is ``hhsuite2-text`` and
``hhsuite3-text``, for versions 2 and 3 of HHsuite, respectively.
* ``Bio.SearchIO`` HSP objects has a new attribute called ``output_index``. This
attribute is meant for capturing the order by which the HSP were output in the
parsed file and is set with a default value of -1 for all HSP objects. It is
also used for sorting the output of ``QueryResult.hsps``.
* ``Bio.SeqIO.AbiIO`` has been updated to preserve bytes value when parsing. The
goal of this change is make the parser more robust by being able to extract
string-values that are not utf-8-encoded. This affects all tag values, except
for ID and description values, where they need to be extracted as strings
to conform to the ``SeqRecord`` interface. In this case, the parser will
attempt to decode using ``utf-8`` and fall back to the system encoding if that
fails. This change affects Python 3 only.
* ``Bio.motifs.mast`` has been updated to parse XML output files from MAST over
the plain-text output file. The goal of this change is to parse a more
structured data source with minimal loss of functionality upon future MAST
releases. Class structure remains the same plus an additional attribute
``Record.strand_handling`` required for diagram parsing.
* ``Bio.Entrez`` now automatically retries HTTP requests on failure. The
maximum number of tries and the sleep between them can be configured by
changing ``Bio.Entrez.max_tries`` and ``Bio.Entrez.sleep_between_tries``.
(The defaults are 3 tries and 15 seconds, respectively.)
* All tests using the older print-and-compare approach have been replaced by
unittests following Python's standard testing framework.
* On the documentation side, all the public modules, classes, methods and
functions now have docstrings (built in help strings). Furthermore, the PDF
version of the *Biopython Tutorial and Cookbook* now uses syntax coloring
for code snippets.
* Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite, and there has been further work to follow the
Python PEP8, PEP257 and best practice standard coding style.


Todd R's avatar

TheBlackCat accepted request

openSUSE Build Service is sponsored by