[13] LEGRAIN P, AEBERSOLD R, ARCHAKOV A, et al. The human proteome project: current state and future direction[J]. Molecular & Cellular Proteomics, 2011, 10(7): M111.009993.
[14] GILBERT J A, MEYER F, ANTONOPOULOS D, et al. Meeting report: the terabase metagenomics workshop and the vision of an earth microbiome project[J]. Standards in Genomic Sciences, 2010, 3(3): 243.
[15] ROBINSON G E, HACKETT K J, PURCELL M M, et al. Creating a buzz about insect genomes[J]. Science, 2011, 331(6023): 1386.
[16] JOLY Y, DOVE E S, KNOPPERS B M, et al. Data sharing in the post-genomic world: the experience of the international cancer genome consortium (ICGC) data access compliance office (DACO)[J]. PLoS Comput Biol, 2012, 8(7): e1002549.
[17] WU X D, ZHU X Q. Data mining with big data[J]. IEEE Transactions on Knowledge and Data Engineering, 2014, 26(1): 97-108.
[18] CHRISTLEY S, LU Y, LI C, et al. Human genomes as email attachments[J]. Bioinformatics, 2009, 25(2): 274-275.
[19] BRANDON M C, WALLACE D C, BALDI P. Data structures and compression algorithms for genomic sequence data[J]. Bioinformatics, 2009, 25(14): 1731-1738.
[20] KOZANITIS C, SAUNDERS C, KRUGLYAK S, et al. Compressing genomic sequence fragments using SlimGene[J]. Journal of Computational Biology, 2011, 18(3): 401-413.
[21] WANG C, ZHANG D. A novel compression tool for efficient storage of genome resequencing data[J]. Nucleic Acids Research, 2011, 39(7): e45.
[22] FRITZ M H Y, LEINONEN R, COCHRANE G, et al. Efficient storage of high throughput DNA sequencing data using reference-based compression[J]. Genome Research, 2011, 21(5): 734-740.
[23] MILLER J R, KOREN S, SUTTON G. Assembly algorithms for next-generation sequencing data[J]. Genomics, 2010, 95(6): 315-327.
[24] BONFIELD J K, MAHONEY M V. Compression of FASTQ and SAM format sequencing data[J]. PLoS One, 2013, 8(3): 1453-1456.
[25] COX A J, BAUER M J, JAKOBI T, et al. Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform[J]. Bioinformatics, 2012, 28(11): 1415-1419.
[26] HACH F, NUMANAGIĆ I, ALKAN C, et al. SCALCE: boosting sequence compression algorithms using locally consistent encoding[J]. Bioinformatics, 2012, 28(23): 3051-3057.
[27] SELVA J J, CHEN X. SRComp: short read sequence compression using burstsort and Elias omega coding[J]. PLoS One, 2013, 8(12): e81414.
[28] PATRO R, KINGSFORD C. Data-dependent bucketing improves reference-free compression of sequencing reads[J]. Bioinformatics, 2015: btv248.
[29] JONES D C, RUZZO W L, PENG X, et al. Compression of next-generation sequencing reads aided by highly efficient de novo assembly[J]. Nucleic Acids Research, 2012, 40(22): e171.
[30] METZKER M L. Sequencing technologies - the next generation[J]. Nature Reviews Genetics, 2010, 11(1): 31-46.
[31] WOOLEY J C, GODZIK A, FRIEDBERG I. A primer on metagenomics[J]. PLoS Comput Biol, 2010, 6(2): e1000667.
[32] POP M, PHILLIPPY A, DELCHER A L, et al. Comparative genome assembly[J]. Briefings in Bioinformatics, 2004, 5(3): 237-248.
[33] KECECIOGLU J, JU J. Separating repeats in DNA sequence assembly[C]// The 5th Annual International Conference on Computational Biology, April 22-25, 2001, Montreal, Canada. [S.l.:s.n.], 2001: 176-183.
[34] PRIDE D T, MEINERSMANN R J, WASSENAAR T M, et al. Evolutionary implications of microbial genome tetranucleotide frequency biases[J]. Genome Research, 2003, 13(2): 145-158.
[35] WU Y W, YE Y. A novel abundance-based algorithm for binning metagenomic sequences using l-tuples[J]. Journal of Computational Biology, 2011, 18(3): 523-534.
[36] PRAKASH T, TAYLOR T D. Functional assignment of metagenomic data: challenges and applications[J]. Briefings in Bioinformatics, 2012, 13(6): 711-727.
[37] QIN J, LI R, RAES J, et al. A human gut microbial gene catalogue established by metagenomic sequencing[J]. Nature, 2010, 464(7285): 59-65.
[38] QIN J, LI Y, CAI Z, et al. A metagenome-wide association study of gut microbiota in type 2 diabetes[J]. Nature, 2012, 490(7418): 55-60.
[39] BORODOVSKY M, MCININCH J. GENMARK: parallel gene recognition for both DNA strands[J]. Computers & Chemistry, 1993, 17(2): 123-133.
[40] LUKASHIN A V, BORODOVSKY M. GeneMark.hmm: new solutions for gene finding[J]. Nucleic Acids Research, 1998, 26(4): 1107-1115.
[41] BESEMER J, LOMSADZE A, BORODOVSKY M. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions[J]. Nucleic Acids Research, 2001, 29(12): 2607-2618.
[42] SALZBERG S L, DELCHER A L, KASIF S, et al. Microbial gene identification using interpolated Markov models[J]. Nucleic Acids Research, 1998, 26(2): 544-548.
[43] DELCHER A L, BRATKE K A, POWERS E C, et al. Identifying bacterial genes and endosymbiont DNA with Glimmer[J]. Bioinformatics, 2007, 23(6): 673-679.