Comparing WHAM with BWA using wgsim
This document presents results from an evaluation of WHAM against BWA using the wgsim tool.
Methodology:
First, we generate 100000 single-end 100bps reads using wgsim. The command is exactly same as the one used on this page.
wgsim -N 100000 -r 0.01 -1 100 -S11 -d0 -e0 hs37m.fa r0.fq /dev/null
Then, we issued the following command to build a WHAM index on the sequence file hs37m.fa. The options -l 100 and -v 4 specify that the index will be used to align 100bps reads with up to 4 errors. It takes about one hour to build this index on a machine with a 2.67GHz Intel Xeon CPU and 24GB memory.
wham_build -l 100 -v 4 -p 5 --mask hs37m.fa indexes/wgsim
After building the index, we used two commands (described below) to align the simulated reads with two different error models. The SAM outputs are stored in wham1.sam and wham2.sam, respectively.
The first command find the best matches with up to 4 errors and 2 gaps. Specifying –e 0 instructs WHAM to report quality scores for all alignments. The command takes around 7 seconds to run on our machine. This command is:
wham -v 4 -g 2 –e 0 -S r0.fa indexes/wgsim wham1.sam
The second command finds alignments with up to 8 errors and 6 gaps. Since the index is built with 4 errors, WHAM can’t guarantee that it find all matched alignments under this error model, but it can find more alignments compared to the first command. The command takes around 21 seconds to run. This second command is:
wham -v 8 -g 6 –e 0 -S r0.fa indexes/wgsim wham2.sam
Next, we used the wgsim script to compare the accuracy of the outputs of WHAM and BWA-SW. The results are shown in the following tables. In each table, we show the number of true positives and the number of false positives in the second and third columns respectively. For BWA-SW, we only show the result for quality scores lower than 36. Note that the quality scores for BWA-SW and WHAM are not identical (both use different methods), so one can’t compare the quality scores directly.
| Quality Score | #True Positives | #False Positives |
| 36 | 77353 | 2 |
| 33 | 87851 | 12 |
| 28 | 90354 | 88 |
| 8 | 90949 | 88 |
| 6 | 91186 | 97 |
| 5 | 91621 | 97 |
| 4 | 91764 | 103 |
| 3 | 93319 | 945 |
| 2 | 93869 | 1309 |
| 1 | 94586 | 2026 |
| Quality Score | #True Positives | #False Positives |
| 36 | 77344 | 2 |
| 33 | 87842 | 12 |
| 28 | 90712 | 86 |
| 8 | 91312 | 86 |
| 6 | 91552 | 95 |
| 5 | 91985 | 95 |
| 4 | 92131 | 101 |
| 3 | 94313 | 1520 |
| 2 | 94864 | 1887 |
| 1 | 95581 | 2606 |
| Quality Score | #True Positives | #False Positives |
| 36 | 83999 | 1 |
| 35 | 84426 | 1 |
| 34 | 85590 | 1 |
| 33 | 86840 | 1 |
| 32 | 87413 | 1 |
| 31 | 87564 | 1 |
| 30 | 87928 | 2 |
| 29 | 88039 | 2 |
| 28 | 88180 | 2 |
| 27 | 88306 | 2 |
| 26 | 88437 | 2 |
| 25 | 88581 | 2 |
| 24 | 88721 | 2 |
| 23 | 88885 | 2 |
| 22 | 89051 | 2 |
| 21 | 89329 | 2 |
| 20 | 89695 | 3 |
| 19 | 89854 | 3 |
| 18 | 90071 | 3 |
| 17 | 90217 | 3 |
| 16 | 90431 | 3 |
| 15 | 90642 | 4 |
| 14 | 90844 | 4 |
| 13 | 91114 | 4 |
| 12 | 91346 | 4 |
| 11 | 91843 | 5 |
| 10 | 92474 | 7 |
| 9 | 92714 | 7 |
| 8 | 93095 | 10 |
| 7 | 93347 | 12 |
| 6 | 93794 | 15 |
| 5 | 94068 | 17 |
| 4 | 94638 | 24 |
| 3 | 94862 | 35 |
| 2 | 95577 | 103 |
| 1 | 95798 | 184 |
As we can see from tables above, the first WHAM command produces fewer true positives. The second command finds similar number of correct alignments as BWA, but outputs more false positives than BWA.
To analyze the reason why WHAM outputs more wrong alignments, we use the following command to output up to 5 best alignments for each read, and use wgsim to verify these alignments.
We found that with this command, WHAM finds 97708 true positives, more than 95798 of BWA. This means that WHAM finds the correct alignments based on its error model, but has difficulty in ranking the correct alignment to the best match due to its simple model of quality score. We are actively looking for life sciences collaborators to give input on the scoring model, and collaborate on customizing it for their environment. Please send an email via the comment form below if you would like to work with us.
Scoring method used in WHAM
In the current scoring model in WHAM, the quality score of an alignment depends on the number of errors in this alignment, and also the number of other alignments for a given read.
Suppose that a given read aligns to multiple positions on the target genome, i.e. nk positions for k errors. Also assume that the best alignment has kmin errors. The posterior probability that the best alignment is actually correct is P[kmin errors] / (P[kmin errors] + Σ k * P[k errors]). This probability is then mapped to a Phred quality score.
The probability of an alignment with k errors, P[k errors], is calculated using a binomial distribution. Suppose a read has l bases, and each one has a probability of p to be wrong. Then, we have P[k errors] = C(l, k) * pk*(1-p)l-k. In the current version of WHAM, p = 0.01, which is a Phred quality score of 20. In future version, we might use base quality from FASTQ format to evaluate an alignment.
Acknowledgements
Thanks to Toby Bloom for pointing us to the wgsim tool and suggesting this evaluation with BWA!