Published May 9, 2008 | Version v1
Journal article Open

Viral Population Estimation Using Pyrosequencing

  • 1. University of Chicago
  • 2. University of California, Berkeley
  • 3. Stanford University Medical Center
  • 4. Stanford University
  • 5. ETH Zurich

Description

The diversity of virus populations within single infected hosts presents a major difficulty for the natural immune response as well as for vaccine design and antiviral drug therapy. Recently developed pyrophosphate-based sequencing technologies (pyrosequencing) can be used for quantifying this diversity by ultra-deep sequencing of virus samples. We present computational methods for the analysis of such sequence data and apply these techniques to pyrosequencing data obtained from HIV populations within patients harboring drug-resistant virus strains. Our main result is the estimation of the population structure of the sample from the pyrosequencing reads. This inference is based on a statistical approach to error correction, followed by a combinatorial algorithm for constructing a minimal set of haplotypes that explain the data. Using this set of explaining haplotypes, we apply a statistical model to infer the frequencies of the haplotypes in the population via an expectation–maximization (EM) algorithm. We demonstrate that pyrosequencing reads allow for effective population reconstruction by extensive simulations and by comparison to 165 sequences obtained directly from clonal sequencing of four independent, diverse HIV populations. Thus, pyrosequencing can be used for cost-effective estimation of the structure of virus populations, promising new insights into viral evolutionary dynamics and disease control strategies.
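The frequency-estimation step described above can be illustrated with a minimal EM sketch. This is not the article's implementation. It assumes the reads have already been error-corrected, that a candidate haplotype set is given, and that a read is equally likely to come from any haplotype it matches exactly. The names `read_matches` and `em_haplotype_frequencies`, and the `(start, sequence)` read format, are hypothetical and used only for illustration.

```python
def read_matches(seq, start, haplotype):
    """Return True if the read equals the haplotype over the positions it covers."""
    return haplotype[start:start + len(seq)] == seq


def em_haplotype_frequencies(reads, haplotypes, n_iter=200, tol=1e-9):
    """Estimate haplotype frequencies from (start, sequence) reads by EM.

    Assumption (not from the article): reads are error-free, so a read is
    either compatible (1.0) or incompatible (0.0) with each haplotype.
    """
    # compat[r][h] is 1.0 when read r could have come from haplotype h.
    compat = [
        [1.0 if read_matches(seq, start, h) else 0.0 for h in haplotypes]
        for start, seq in reads
    ]
    # Reads explained by no candidate haplotype carry no information here.
    compat = [row for row in compat if sum(row) > 0.0]
    if not compat:
        raise ValueError("no read is compatible with any haplotype")

    k = len(haplotypes)
    freqs = [1.0 / k] * k  # start from the uniform distribution

    for _ in range(n_iter):
        # E-step: share each read among its compatible haplotypes,
        # weighted by the current frequency estimates.
        counts = [0.0] * k
        for row in compat:
            weights = [f * c for f, c in zip(freqs, row)]
            total = sum(weights)
            if total == 0.0:  # guard against numerical underflow
                continue
            for h in range(k):
                counts[h] += weights[h] / total

        # M-step: new frequencies are the normalized expected read counts.
        norm = sum(counts)
        new_freqs = [c / norm for c in counts]

        converged = max(abs(a - b) for a, b in zip(freqs, new_freqs)) < tol
        freqs = new_freqs
        if converged:
            break

    return freqs
```

Reads that match several haplotypes are split in proportion to the current estimates, so only the reads covering distinguishing positions drive the estimates apart. A fuller model would replace the 0/1 compatibility with a per-read likelihood that allows for residual sequencing error.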

Files

journal.pcbi.1000074.pdf

Files (534.5 kB)

Name Size
Article
md5:d96d2c0e6a80e81886c048551cdf277d
442.6 kB
md5:be57c777555d94400e80f83549fb0f22
91.9 kB

Additional details

Identifiers

DOI
10.1371/journal.pcbi.1000074
Other
oai:uchicago.tind.io:10216

Funding

National Science Foundation
DMS-0603448
National Science Foundation
CCF-0347992
Bill and Melinda Gates Foundation
Grand Challenges in Global Health Initiative

UChicago Information

Division(s)
Physical Sciences Division
Department(s)
Statistics