Posts

Showing posts with the label Biological sequence analysis

New Algorithm for detection of viral sequence fragments of HIV-1 subfamilies

Background: Methods for determining whether a particular HIV-1 sequence stems, completely or in part, from an unknown HIV-1 subtype are important for the design of vaccines and molecular detection systems, as well as for epidemiological monitoring. Nevertheless, only a single algorithm, the Branching Index (BI), has been developed for this task so far. Moving along the genome of a query sequence in a sliding window, the BI computes a ratio quantifying how closely the query sequence clusters with a subtype clade. In its current version, however, the BI does not provide predicted boundaries of unknown fragments. Results: We have developed Unknown Subtype Finder (USF), an algorithm based on a probabilistic model, which automatically determines which parts of an input sequence originate from an as-yet-unknown subtype. The underlying model is based on a simple profile hidden Markov model (pHMM) for each known subtype and an additional pHMM for an unknown...
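To make the sliding-window idea concrete, here is a minimal Python sketch. It is not the published BI or USF: the clade-based ratio is replaced by a plain per-window identity against a single pre-aligned reference sequence, and the names `sliding_windows`, `identity`, and `scan`, along with the window size and step, are all assumptions made for illustration.

```python
def sliding_windows(seq, size, step):
    """Yield (start, window) for every full-length window along seq."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def identity(a, b):
    """Fraction of positions at which two equal-length strings agree."""
    if len(a) != len(b):
        raise ValueError("windows must have equal length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def scan(query, reference, size=4, step=2):
    """Score each query window against the same window of a reference.

    Assumes query and reference are already aligned and equally long.
    The identity score is only a stand-in for a clustering ratio.
    """
    if len(query) != len(reference):
        raise ValueError("query and reference must be aligned")
    return [
        (start, identity(window, reference[start:start + size]))
        for start, window in sliding_windows(query, size, step)
    ]


if __name__ == "__main__":
    for start, score in scan("ACGTACGTAC", "ACGTTTTTAC"):
        print(start, score)
```

Windows whose score drops well below the rest are the kind of region a method like this would flag for closer inspection. Placing exact fragment boundaries is a separate problem, which the excerpt says the current BI does not address.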

PubDNA Finder

Courtesy: Dr Raghava & Bioclues

PubDNA Finder is an online repository that links PubMed Central manuscripts to the nucleic acid sequences appearing in them. It extends the search capabilities provided by PubMed Central by enabling researchers to perform advanced searches involving nucleic acid sequences. This includes, among other features, (1) searching for papers that mention one or more specific nucleic acid sequences and (2) retrieving the genetic sequences that appear in different articles. These additional query capabilities are provided by a searchable index created from the full text of the 176,672 papers available at PubMed Central at the time of writing and the nucleic acid sequences appearing in them. An original method has been developed to automatically extract the genetic sequences occurring in each paper. The database is updated monthly by automatically connecting to the PubMed Central FTP site to retrieve and index...
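The excerpt does not describe PubDNA Finder's extraction method, so the following Python sketch substitutes a deliberately naive rule to show the general shape of a sequence-to-paper index. Treating any run of at least `min_len` uppercase A/C/G/T/U letters as a sequence is an assumption, as are the function names and the paper identifiers in the usage example.

```python
import re


def extract_sequences(text, min_len=10):
    """Return candidate nucleic acid sequences found in text.

    Naive assumed rule: a standalone run of >= min_len characters
    drawn from A, C, G, T, U. Real papers need far more careful handling.
    """
    pattern = re.compile(r"\b[ACGTU]{%d,}\b" % min_len)
    return pattern.findall(text)


def build_index(papers, min_len=10):
    """Map each extracted sequence to the set of paper ids containing it."""
    index = {}
    for paper_id, text in papers.items():
        for seq in set(extract_sequences(text, min_len)):
            index.setdefault(seq, set()).add(paper_id)
    return index


if __name__ == "__main__":
    # Hypothetical paper ids and texts, for illustration only.
    papers = {
        "paper-1": "Primer ACGTACGTACGT was used.",
        "paper-2": "We reused ACGTACGTACGT and GGGGCCCCAAAA here.",
    }
    index = build_index(papers)
    print(sorted(index["ACGTACGTACGT"]))
```

With an index of this kind, both query types mentioned in the post reduce to lookups: finding the papers for a given sequence reads the index directly, and listing the sequences in a paper runs the extraction on that paper's text.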

Wheat's Genetic Code Cracked

A team of UK researchers, funded by the Biotechnology and Biological Sciences Research Council (BBSRC), has publicly released the first sequence coverage of the wheat genome. The release is a step towards a fully annotated genome and makes a significant contribution to efforts to support global food security and to increase the competitiveness of UK farming. The genome sequences released comprise five read-throughs of a reference variety of wheat and give scientists and breeders access to 95% of all wheat genes. This is among the largest genome projects undertaken, and the rapid public release of the data is expected to significantly accelerate the use of the information by wheat breeding companies. The team involved Prof Neil Hall and Dr Anthony Hall at the University of Liverpool, Prof Keith Edwards and Dr Gary Barker at the University of Bristol, and Prof Mike Bevan at the John Innes Centre, a BBSRC-funded institute. The genome data released are in a 'raw' format, comprisin...

Sequence Analysis is still sexy: Dual Descriptor Method for Biological Sequence Analysis

The emergence of “Systems Biology” in recent years highlights the systematic viewpoint of bio-system modeling. Against this background, the Dual Descriptor Method, a generic methodology for biological sequence analysis, is proposed. From a systematic perspective, the Dual Descriptor is defined as a two-element set consisting of a Composition Weight Map and a Position Weight Function, which aim to reflect the composition and permutation information of a sequence. An alternate training algorithm is provided to obtain an optimum description of the building patterns of the sequences. In this paper, the dual descriptor method is applied to two typical problems of molecular biology: gene identification and the prediction of protein function. Satisfactory and insightful results are achieved. Owing to its generality, the dual descriptor method has broad application prospects for many problems of pattern recognition, especially those involved in “Systems Biology”. Be a part of ...