Whole genome amplification by the multiple displacement amplification (MDA) method allows sequencing of DNA from single cells of bacteria that cannot be cultured. Assembling a genome is challenging, however, because MDA generates highly nonuniform coverage of the genome. Here we describe an algorithm tailored for short-read data from single cells that improves assembly through the use of a progressively increasing coverage cutoff. Assembly of reads from single Escherichia coli and Staphylococcus aureus cells captures >91% of genes within contigs, approaching the 95% captured from an assembly based on many E. coli cells. We apply this method to assemble a genome from a single cell of an uncultivated SAR324 clade of Deltaproteobacteria, a cosmopolitan bacterial lineage in the global ocean. Metabolic reconstruction suggests that SAR324 is aerobic, motile and chemotaxic. Our approach enables acquisition of genome assemblies for individual uncultivated bacteria using only short reads, providing cell-specific genetic information absent from metagenomic studies.
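The idea of a progressively increasing coverage cutoff can be illustrated with a toy model. The sketch below is not the authors' implementation: the `Contig` class, the edge-set graph representation, the merge rule, and the example numbers are all assumptions made for illustration. It only shows why raising the cutoff step by step, and merging contigs between steps, can rescue a genuine low-coverage region that a single fixed cutoff would discard.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    length: int      # contig length in bases
    coverage: float  # average read coverage


def merge_unambiguous(contigs, edges):
    """Merge u -> v whenever v is u's only successor and u is v's only predecessor.

    The merged contig keeps the name u; its coverage is the length-weighted mean.
    """
    merged = True
    while merged:
        merged = False
        for u, v in sorted(edges):
            succ = [b for a, b in edges if a == u]
            pred = [a for a, b in edges if b == v]
            if u != v and len(succ) == 1 and len(pred) == 1:
                cu, cv = contigs[u], contigs[v]
                total = cu.length + cv.length
                cov = (cu.coverage * cu.length + cv.coverage * cv.length) / total
                contigs[u] = Contig(total, cov)
                del contigs[v]
                # Drop the merged edge and redirect v's outgoing edges to u.
                edges = {(u if a == v else a, b) for a, b in edges if (a, b) != (u, v)}
                merged = True
                break
    return contigs, edges


def drop_low_coverage(contigs, edges, cutoff):
    """Remove contigs with coverage below the cutoff, together with their edges."""
    kept = {n: c for n, c in contigs.items() if c.coverage >= cutoff}
    kept_edges = {(a, b) for a, b in edges if a in kept and b in kept}
    return kept, kept_edges


def fixed_cutoff(contigs, edges, cutoff):
    """Baseline: apply one cutoff once, then merge."""
    kept, kept_edges = drop_low_coverage(contigs, edges, cutoff)
    return merge_unambiguous(kept, kept_edges)[0]


def progressive_cutoff(contigs, edges, max_cutoff):
    """Raise the cutoff from 1 to max_cutoff, merging after every removal step."""
    contigs, edges = merge_unambiguous(dict(contigs), set(edges))
    for cutoff in range(1, max_cutoff + 1):
        contigs, edges = drop_low_coverage(contigs, edges, cutoff)
        contigs, edges = merge_unambiguous(contigs, edges)
    return contigs


# Hypothetical example: A is well covered, B is a genuine but poorly
# amplified neighbour, E is a short erroneous branch off A.
example_contigs = {
    "A": Contig(1000, 50.0),
    "B": Contig(200, 3.0),
    "E": Contig(30, 1.0),
}
example_edges = {("A", "B"), ("A", "E")}

fixed_result = fixed_cutoff(example_contigs, example_edges, 4)
progressive_result = progressive_cutoff(example_contigs, example_edges, 4)
```

In this example the fixed cutoff of 4 deletes both `B` and `E`, leaving only the 1000-base contig `A`. With the progressive cutoff, `E` is removed at cutoff 2, `A` and `B` then merge into a single 1200-base contig whose averaged coverage stays above every later cutoff, so the low-coverage region `B` is retained.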
Sequence Analysis is still sexy: Dual Descriptor Method for Biological Sequence Analysis
The emergence of “Systems Biology” in recent years highlights a systematic viewpoint on the modeling of biological systems. Against this background, the Dual Descriptor Method, a generic methodology for biological sequence analysis, is proposed. From a systematic perspective, the Dual Descriptor is defined as a two-element set consisting of a Composition Weight Map and a Position Weight Function, which aim to reflect the composition and permutation information of a sequence. An alternate training algorithm is provided to obtain an optimal description of the building patterns of the sequences. In this paper, the dual descriptor method is applied to two typical problems in molecular biology: gene identification and the prediction of protein function. Satisfactory and insightful results are achieved. Owing to its generality, the dual descriptor method has broad application prospects for many pattern recognition problems, especially those involved in “Systems Biology”.
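To make the two-element definition concrete, here is a minimal sketch of how a composition weight map and a position weight function could jointly score a sequence. The abstract does not give the actual formulas, so everything here is an assumption: the function name `dual_descriptor_score`, the choice to multiply the two weights and sum over positions, the use of single letters (`k=1`) as units, and the example weights are all hypothetical and are not taken from the paper.

```python
def dual_descriptor_score(seq, composition_weight, position_weight, k=1):
    """Toy score: sum over positions of (weight of the k-mer) * (weight of the position).

    composition_weight: dict mapping k-mers to numeric weights (hypothetical values).
    position_weight: function mapping a 0-based position to a numeric weight.
    The product-and-sum combination is an assumption for illustration only.
    """
    total = 0.0
    for i in range(len(seq) - k + 1):
        unit = seq[i:i + k]
        total += composition_weight.get(unit, 0.0) * position_weight(i)
    return total


# Hypothetical weights: the map captures which units occur (composition),
# the position function captures where they occur (permutation).
toy_composition = {"A": 1.0, "T": 1.0, "G": -1.0, "C": -1.0}


def toy_position(i):
    # Assumed period-3 weighting, chosen only to make the example non-trivial.
    return 1.0 if i % 3 == 0 else 0.5


score_atg = dual_descriptor_score("ATG", toy_composition, toy_position)
score_gta = dual_descriptor_score("GTA", toy_composition, toy_position)
```

The sequences `ATG` and `GTA` contain the same letters, yet under these assumed weights they receive different scores, because the position function weights each letter by where it sits. A composition-only descriptor could not distinguish them.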