Burrows-Wheeler Alignment Tool
BWA is a fast, lightweight tool that aligns relatively short sequences (queries) to a sequence database (target), such as the human reference genome. It implements two different algorithms, both based on the Burrows-Wheeler Transform (BWT). The first algorithm is designed for short queries up to ~200bp with a low error rate (<3%). It performs gapped global alignment with respect to the queries, supports paired-end reads, and is one of the fastest short-read alignment algorithms to date while also visiting suboptimal hits. The second algorithm, BWA-SW, is designed for long reads with more errors. It performs heuristic Smith-Waterman-like alignment to find high-scoring local hits (and thus chimeras). On low-error short queries, BWA-SW is slower and less accurate than the first algorithm, but on long queries it performs better.
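To make the role of the BWT concrete, here is a toy sketch of the transform and of exact-match counting by backward search over it. It illustrates the general idea only and is not BWA's implementation: the function names are hypothetical, the index is built naively from sorted rotations, and only exact matches are counted, whereas BWA uses a compact index and supports mismatches and gaps.

```python
def bwt(text):
    """Return the Burrows-Wheeler Transform of text, using '$' as end marker."""
    s = text + "$"  # '$' sorts before letters in ASCII
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def count_occurrences(bwt_str, pattern):
    """Count exact occurrences of pattern using backward search on the BWT."""
    # C[c] = number of characters in the text that sort strictly before c.
    C = {}
    for i, c in enumerate(sorted(bwt_str)):
        C.setdefault(c, i)

    def occ(c, i):
        # Number of occurrences of c in bwt_str[:i] (naive scan for clarity).
        return bwt_str[:i].count(c)

    lo, hi = 0, len(bwt_str)  # half-open interval of matching suffixes
    for c in reversed(pattern):
        if c not in C:
            return 0
        lo = C[c] + occ(c, lo)
        hi = C[c] + occ(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)
    print(count_occurrences(transformed, "ana"))
```

The interval [lo, hi) that backward search maintains is a range of suffix array positions, which is why a separate step is needed to turn hits into positions in the original sequence.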
For both algorithms, the database file in FASTA format must first be indexed with the ‘index’ command, which typically takes a few hours. The first algorithm is implemented via the ‘aln’ command, which finds the suffix array (SA) coordinates of good hits for each individual read, and the ‘samse’/‘sampe’ commands, which convert SA coordinates to chromosomal coordinates and, in the case of ‘sampe’, pair the reads. The second algorithm is invoked by the ‘bwasw’ command. It works for single-end reads only.
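The order of the commands described above can be sketched as a small Python helper that builds the command lines for each workflow. This is a minimal sketch under stated assumptions: the helper names, the file names, and the ‘.sai’ suffix for intermediate files are hypothetical choices for illustration, and each step's output is assumed to be captured from standard output.

```python
import subprocess


def short_read_commands(ref, reads1, reads2=None, out_sam="aln.sam"):
    """Build (argv, stdout_file) steps for the index/aln/samse|sampe workflow."""
    steps = [(["bwa", "index", ref], None)]
    sai1 = reads1 + ".sai"  # hypothetical name for the SA-coordinate file
    steps.append((["bwa", "aln", ref, reads1], sai1))
    if reads2 is None:
        # Single-end: convert SA coordinates to chromosomal coordinates.
        steps.append((["bwa", "samse", ref, sai1, reads1], out_sam))
    else:
        # Paired-end: align each end separately, then pair with sampe.
        sai2 = reads2 + ".sai"
        steps.append((["bwa", "aln", ref, reads2], sai2))
        steps.append(
            (["bwa", "sampe", ref, sai1, sai2, reads1, reads2], out_sam)
        )
    return steps


def long_read_commands(ref, reads, out_sam="aln.sam"):
    """Build (argv, stdout_file) steps for the single-end BWA-SW workflow."""
    return [
        (["bwa", "index", ref], None),
        (["bwa", "bwasw", ref, reads], out_sam),
    ]


def run(steps):
    """Run each step in order, redirecting stdout to a file where one is given."""
    for argv, out_path in steps:
        if out_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(out_path, "wb") as fh:
                subprocess.run(argv, check=True, stdout=fh)
```

For example, `run(short_read_commands("ref.fa", "r1.fq", "r2.fq"))` would index the reference, align both ends, and pair them, assuming `bwa` is on the PATH and the files exist. Indexing only needs to be done once per reference, so in practice that first step would be skipped on later runs.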