PacBio Blog: Data Release: ~54x Long-Read Coverage for PacBio-only De Novo Human Genome Assembly

Sunday, August 31st, 2014

We are pleased to make publicly available a new shotgun sequence dataset of long PacBio® reads from a human DNA sample. We previously released ~10x coverage of this sample, generated with Single Molecule, Real-Time (SMRT®) Sequencing, which is sufficient for
reference-based detection of structural variation. Today we expand on that release with additional data that increases the total sequencing coverage to ~54x. This long-read data has enabled the generation of the first de novo human genome assembly from PacBio-only sequence reads. Download the 54x long-read coverage dataset.

The dataset was generated by sequencing a well-studied human cell line (CHM1htert), which is being used as part of a National Institutes of Health project to sequence and assemble an alternate reference genome (the “platinum genome”). This NIH project is being led by Rick Wilson from Washington University in St. Louis and Evan Eichler from the University of Washington in collaboration with investigators from the National Center for Biotechnology Information.