Jun Wang1,2,3,4,12, Wei Wang1,3,12, Ruiqiang Li1,3,4,12, Yingrui Li1,5,6,12, Geng Tian1,7, Laurie Goodman1, Wei Fan1, Junqing Zhang1, Jun Li1, Juanbin Zhang1, Yiran Guo1,7, Binxiao Feng1, Heng Li1,8, Yao Lu1, Xiaodong Fang1, Huiqing Liang1, Zhenglin Du1, Dong Li1, Yiqing Zhao1,7, Yujie Hu1,7, Zhenzhen Yang1, Hancheng Zheng1, Ines Hellmann9, Michael Inouye8, John Pool9, Xin Yi1,7, Jing Zhao1, Jinjie Duan1, Yan Zhou1, Junjie Qin1,7, Lijia Ma1,7, Guoqing Li1, Zhentao Yang1, Guojie Zhang1,7, Bin Yang1, Chang Yu1, Fang Liang1,7, Wenjie Li1, Shaochuan Li1, Dawei Li1, Peixiang Ni1, Jue Ruan1,7, Qibin Li1,7, Hongmei Zhu1, Dongyuan Liu1, Zhike Lu1, Ning Li1,7, Guangwu Guo1,7, Jianguo Zhang1, Jia Ye1, Lin Fang1, Qin Hao1,7, Quan Chen1,5, Yu Liang1,7, Yeyang Su1,7, A. san1,7, Cuo Ping1,7, Shuang Yang1, Fang Chen1,7, Li Li1, Ke Zhou1, Hongkun Zheng1,4, Yuanyuan Ren1, Ling Yang1, Yang Gao1,6, Guohua Yang1,2, Zhuo Li1, Xiaoli Feng1, Karsten Kristiansen4, Gane Ka-Shu Wong1,10, Rasmus Nielsen9, Richard Durbin8, Lars Bolund1,11, Xiuqing Zhang1,6, Songgang Li1,2,5, Huanming Yang1,2,3 & Jian Wang1,2,3

Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual’s genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics.
