Bio Saga Headlines

Bio Saga

Monday, May 31, 2010

GSK and Online Communities Create Unique Alliance to Stimulate Open Source Drug Discovery for Malaria

- GSK becomes first company to freely share chemical structures on 13,500 molecules from its compound library
- Alliances formed with leading scientific research communities from private industry and public-domain data providers

courtesy CDD Blog
GlaxoSmithKline (GSK) has teamed up with leading public-domain data providers the European Bioinformatics Institute (EMBL-EBI) and the U.S. National Library of Medicine (NLM), and with the US-based informatics service provider Collaborative Drug Discovery (CDD), to make freely available key scientific information on more than 13,500 compounds that could ultimately lead to new treatments for malaria.

The release of this data marks the first time that a pharmaceutical company has made available the structures of so many compounds and is made possible through the collaboration of the web hosts and their specialist research tools, which will be available at no cost to researchers. The information, which is hosted on websites regularly used by researchers, includes high quality scientific data about the molecules from GSK’s own compound library which have demonstrated potency against the most deadly malaria parasite, P. falciparum.

“We are delighted that EMBL-EBI, NLM and CDD have joined us in this worthwhile endeavour to apply the principles of open source to drug discovery for malaria,” said Patrick Vallance, head of drug discovery at GSK. “Defeating this disease will require many scientific minds working together. We hope researchers from across the world will now use this information to drive further studies, and that other groups from pharmaceutical industry to academia will add their information to this on-line resource.”

This type of data is the first step on the road to developing new medicines. With the structure of the compounds and information about where they affect the malaria parasite, scientists could then carry out further research on these compounds for drug discovery or to understand how these might be used to inhibit the parasite’s life cycle and ultimately lead to new medicines. Opening up this information widely is essentially an example of an ‘open source’ tactic being applied to drug discovery.

“Making life-science information openly available to the research community is at the heart of the EMBL-EBI’s mission,” added John Overington, leader of the EMBL-EBI’s ChEMBL team, which will act as the primary repository for the data through its ChEMBL resource. “We’re proud to be able to add value to the GSK data by incorporating it into ChEMBL and linking it with a vast array of information that could help researchers to find new treatments for malaria. This is the beginning of a new era of public–private collaboration in drug research.”

“NLM is excited to be involved in this groundbreaking release of information to the public,” said Steve Bryant, head of NLM’s PubChem database, which is housing the data. “By making these data available through public resources such as PubChem, GSK is greatly facilitating the research process, as the information is linked to related compounds, bioactivity results, published literature, and other resources that will assist researchers in making new discoveries to combat malaria.”

“CDD is delighted to be playing a role in this truly historic event,” commented Barry A. Bunin, CEO of Collaborative Drug Discovery. “In decades of medical breakthroughs from Big Pharmas, this is the first time a group is openly sharing all the chemical and biological data – not just the few hits. Furthermore, for phenotypic screens, the CDD tools allow researchers to begin to hypothesize and validate the targets from the whole cell screens.”

EMBL-EBI will act as the primary repository for the data on this compound set, and will index and format further information that is contributed. GSK will add more information as it is generated, and external scientists researching these compounds and the data will be asked to do the same.

About the data

The data contains the ‘hits’ or results from a screening of the 2 million compounds in GSK’s compound library to determine the effect of these compounds on the malaria parasite. The screening project identified ~13,500 compounds that showed strong inhibition on the parasite.

Kinase inhibitors constituted a large proportion of the molecules with previously known activity and now identified as antimalarial hits. The data includes the chemical families that GSK is currently researching for this indication and the ‘mechanisms of action’ for those compounds which the company has previously tested for other indications.

Most of the compound structures identified have been classified as capable of being converted into medicine.

The current microbiological information for the compounds and the structures have been put on online resources that are easily accessed by researchers. The EMBL-EBI site has been constructed so that scientists globally can add their data to the information there, with access free to all. The value of the release of information is enhanced by the collaboration of the web hosts and the specialist research tools on the site, which are being made available to researchers at no cost to them.

GSK gratefully recognises the support of Medicines for Malaria Venture, which contributed funding for this project.

Full information can be viewed online at:


Wednesday, May 26, 2010

Simple tips to deal with Big Data In Bioinformatics

Here is something I came across recently, a relatively old article which I found on Bioinformatics Zen by Michael Barton. Simple, practical and worldly wise.

Bioinformatics usually involves shuffling data into the right format for plotting or statistical tests. I prefer to use a database to store and format data as I think this makes projects easier to maintain compared with using just scripts. I find a dynamic language like Ruby and libraries for database manipulation like ActiveRecord make using a database relatively simple.

Using a database, however, stops being simple when you have to deal with very large amounts of data. Here I’m outlining my experience of analysing gigabytes of data with millions of data points and how I tried to improve my software’s performance when manipulating this data. I’ve ordered the approaches with what I think is the most pragmatic first.

The simple things

Obvious but sometimes overlooked.

1. Use a bigger computer

Using a faster computer might seem like a lazy option compared with optimising your code, but if the analysis works on your computer it should work the same, but faster, on a more powerful computer. Using a faster computer is probably one of the few things I tried which didn’t involve modifying my code and therefore shouldn’t introduce any bugs. I used unit tests to make sure the code still worked as expected though.

2. Add database indices

Since I’m using a database, making sure it runs as fast as possible is another cheap way to improve performance. Properly indexed database columns reduce running times when searching or joining tables, as an index means rows are looked up much faster. Database indices are also relatively easy to implement: just specify which columns need to be indexed, either using SQL or, in my case, using ActiveRecord.
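As a rough illustration of the indexing tip, here is a minimal sketch in Python using the standard-library sqlite3 module rather than the Ruby and ActiveRecord setup described in the post. The table, column and index names are hypothetical. It asks SQLite for its query plan before and after the index is created, which is one way to check that a lookup actually uses the index.

```python
import sqlite3

# Hypothetical table: one row per protein residue (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE residues (id INTEGER PRIMARY KEY, protein_id INTEGER, score REAL)"
)
conn.executemany(
    "INSERT INTO residues (protein_id, score) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10000)],
)

query = "SELECT score FROM residues WHERE protein_id = ?"


def query_plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,))
    return " ".join(row[-1] for row in rows)


plan_before = query_plan(query)  # no index yet, so the table is scanned

# Index the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_residues_protein_id ON residues (protein_id)")

plan_after = query_plan(query)  # the plan should now mention the index
```

Printing plan_before and plan_after shows how the database intends to run the query in each case.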

3. Use a faster language interpreter

Most of the time the standard version of Ruby is sufficient for running my code. In the last two years though, different but faster versions have been created, such as REE, JRuby and Ruby 1.9. Therefore when I was encountering long running times from processing millions of database rows I thought it was worth trying a faster Ruby version. I used Ruby 1.9 and it did improve performance. One caveat though was that I had to make my code compatible with the newer version, specifically for the CSV library. These code changes were still relatively cheap to implement given the noticeable performance benefits.

Delete stuff

After the above three points I generally had to start digging around in my code - which is bad because changing working code usually creates broken code. A good way to optimise code, without introducing too many new problems, is just to delete it entirely.

4. Delete unnecessary data and analysis

I find that I often generate variables which I think might be useful at some future time. As you might expect just deleting the code that produces these variables removes the time required to compute them. More often than not I never ended up needing the variable anyway.

5. Remove database table joins

I’m using a database because usually I want to compare two or more sets of data and therefore I need to format them in a way that makes them comparable. Once formatted I join each variable in the database and then print the results as a CSV file.

The problem with joining a large number of database records, even with database indices, is that it can take a very long time. The amount of time required also increases the more the data is normalised. To try and fix this I found that I could drop the smaller of two variables I was joining and instead do the join further into my workflow.

For example, I had two variables. The first contained millions of entries, each one corresponding to a protein residue. The second contained only around 100 entries, each one corresponding to one of twenty amino acids. Merging these two variables in my database required millions of joins and took a long time. Instead I joined my amino acid data to my protein data after I had calculated the mean of each protein residue. This reduced the number of joins from millions down to around 100. I did the join as I plotted it using the merge function in R.
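The idea of aggregating first and joining afterwards can be sketched as follows. This is an illustration in Python with sqlite3, not the author's Ruby and R workflow, and the tables, columns and values are made up for the example. The large table is reduced to one mean per amino acid, and only that small summary is matched against the lookup table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a large per-residue table and a small lookup table.
conn.execute("CREATE TABLE residues (amino_acid TEXT, value REAL)")
conn.execute("CREATE TABLE amino_acids (code TEXT PRIMARY KEY, hydrophobic INTEGER)")
conn.executemany(
    "INSERT INTO amino_acids VALUES (?, ?)", [("A", 1), ("G", 0), ("L", 1)]
)
conn.executemany(
    "INSERT INTO residues VALUES (?, ?)",
    [("A", 1.0), ("A", 3.0), ("G", 2.0), ("L", 4.0), ("L", 6.0)],
)

# Step 1: aggregate the large table so only one row per amino acid remains.
means = conn.execute(
    "SELECT amino_acid, AVG(value) FROM residues GROUP BY amino_acid"
).fetchall()

# Step 2: join the small summary to the small lookup table in memory.
lookup = dict(conn.execute("SELECT code, hydrophobic FROM amino_acids"))
summary = {code: (mean, lookup[code]) for code, mean in means}
```

The join now touches one row per amino acid instead of one row per residue.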

Code optimisation

When I was encountering performance problems I left optimising code as a last resort. There were three reasons for this. The first is that premature optimisation may be the root of all evil. The second reason is that the enemy of good-enough code is perfect code - when I start optimising code I tend to keep going more than is necessary. Code doesn’t need to be as fast as possible though, just fast enough to get the results I need. The third point is that optimising code means changing code, which introduces bugs, and so the more the code is optimised the more chance of bugs. Code optimisation was a necessity though because my analysis was still taking days to run. I should also point out that my code optimisation was combined with thorough unit testing and benchmarking - which I think is usually how it should be done.

6a. Batch load database query results

One easy way I reduced running times was by batch loading the database table rows rather than trying to load a big table all at once. Pulling all the database records into memory means that most of the running time is spent loading the data into memory rather than actually dealing with it. Batch loading instead pulls subsets of records into memory at a time and each subset is then processed before the next set of rows is retrieved. This means less memory is used each time. An example of this in Ruby is the ActiveRecord method find_in_batches.
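A minimal sketch of batch loading, written in Python with sqlite3 rather than with ActiveRecord's find_in_batches. The helper iter_batches and the table name are hypothetical. It uses the cursor's fetchmany call to pull a fixed number of rows at a time, so only one batch is held in memory while it is processed.

```python
import sqlite3


def iter_batches(cursor, batch_size):
    """Yield lists of rows, at most batch_size rows at a time."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO measurements (value) VALUES (?)",
    [(float(i),) for i in range(10)],
)

total = 0.0
batch_sizes = []
cursor = conn.execute("SELECT value FROM measurements ORDER BY id")
for batch in iter_batches(cursor, batch_size=4):
    # Each batch is processed and discarded before the next one is fetched.
    batch_sizes.append(len(batch))
    total += sum(value for (value,) in batch)
```

With ten rows and a batch size of four, the loop runs three times, the last batch holding the two remaining rows.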

6b. Association loading

Association loading means that when a row of Table A is retrieved from the database, all the rows associated with it in Table B are also retrieved. This will usually mean that the database is only queried twice, once to find the records from Table A and once to find the records from Table B. The alternative option is to use a loop to retrieve each required row from Table B, but this will mean as many database queries as there are rows - and more queries means more running time.
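The two-query pattern can be sketched like this. It is a hand-written illustration in Python with sqlite3, not ActiveRecord's own implementation, and the proteins and residues tables are hypothetical. One query fetches the Table A rows, and a second fetches every associated Table B row in a single pass, grouped in memory by the foreign key.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
# Hypothetical Table A (proteins) and Table B (residues).
conn.execute("CREATE TABLE proteins (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE residues (id INTEGER PRIMARY KEY, protein_id INTEGER, code TEXT)"
)
conn.executemany("INSERT INTO proteins VALUES (?, ?)", [(1, "p1"), (2, "p2")])
conn.executemany(
    "INSERT INTO residues (protein_id, code) VALUES (?, ?)",
    [(1, "A"), (1, "G"), (2, "L")],
)

# Query 1: the rows from Table A.
proteins = conn.execute("SELECT id, name FROM proteins ORDER BY id").fetchall()

# Query 2: every associated row from Table B, fetched in one go.
ids = [protein_id for protein_id, _ in proteins]
placeholders = ",".join("?" for _ in ids)
residues_by_protein = defaultdict(list)
for protein_id, code in conn.execute(
    f"SELECT protein_id, code FROM residues "
    f"WHERE protein_id IN ({placeholders}) ORDER BY id",
    ids,
):
    residues_by_protein[protein_id].append(code)

# Assemble the result in memory, with no further queries.
loaded = {name: residues_by_protein[protein_id] for protein_id, name in proteins}
```

The number of queries stays at two however many proteins are loaded, whereas a per-row loop would issue one query per protein.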

6c. Database querying in loops

I found that large loops which query the database often accounted for the majority of my software’s running time. I improved this by moving the database calls up or out of the loops as much as possible and caching rows in memory beforehand. This meant the looping code was looking things up in memory rather than querying the database each time. A similar approach can also be used to avoid object creation inside loops, which also seems to improve performance. Combining this approach with the one below was what most improved the running time in my analysis.
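A small sketch of moving a query out of a loop, again in Python with sqlite3 and with a hypothetical lookup table whose weights are placeholder values. The lookup table is read once into a dictionary, and the loop then does only in-memory lookups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical lookup table; the weights are placeholders, not real data.
conn.execute("CREATE TABLE amino_acids (code TEXT PRIMARY KEY, weight REAL)")
conn.executemany(
    "INSERT INTO amino_acids VALUES (?, ?)", [("A", 1.0), ("G", 2.0), ("L", 3.0)]
)

sequence = "GALGA"

# One query before the loop, cached as a dictionary keyed by code.
weight_by_code = dict(conn.execute("SELECT code, weight FROM amino_acids"))

total_weight = 0.0
for code in sequence:
    # In-memory lookup: no database call inside the loop.
    total_weight += weight_by_code[code]
```

This only works when the cached rows fit comfortably in memory, which is usually the case for small lookup tables like this one.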

6d. Use raw SQL

Object-relational mapping (ORM) libraries like ActiveRecord allow the database to be manipulated using object-oriented programming, which generally makes using a database a lot easier. Using an ORM does however add a performance penalty because it’s an extra layer on top of the database. When I was doing millions of database updates I found that skipping the ORM and directly using raw SQL contributed to a large saving of processing time. The advantages of this technique are neatly outlined by Ilya Grigorik.
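As an illustration of what an update in raw SQL can look like, here is a sketch in Python with sqlite3. There is no ORM in this example, so it shows only the raw SQL side, and the table and the scaling factor are assumptions. A single set-based UPDATE statement changes every row inside the database, instead of loading each row as an object, modifying it and saving it back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with a derived column to fill in.
conn.execute(
    "CREATE TABLE measurements (id INTEGER PRIMARY KEY, value REAL, scaled REAL)"
)
conn.executemany(
    "INSERT INTO measurements (value) VALUES (?)",
    [(float(i),) for i in range(1, 6)],
)

# One statement, one transaction: the database does the work for all rows.
with conn:
    cursor = conn.execute("UPDATE measurements SET scaled = value * ?", (0.5,))
    rows_updated = cursor.rowcount

scaled = [row[0] for row in conn.execute("SELECT scaled FROM measurements ORDER BY id")]
```

The trade-off is that any validation or callback logic an ORM would normally run is bypassed, so this suits bulk updates where that logic is not needed.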

That’s it.

Quite a long post, I know. I know code performance is a weighty topic and probably what I’m outlining here isn’t the best way to go about dealing with large data. I’m sure there are better ways and better technologies to manage large amounts of data too, e.g. map/reduce or schemaless databases. I’m not a trained computer scientist or a software engineer, but a biologist, and what I’ve outlined is what allowed me to produce the results I need. I’d be happy to read any further suggestions in the comments though.

Monday, May 24, 2010

Abbott Laboratories Buys Piramal Healthcare Limited Biz for $3.72B

On May 21, Abbott announced a definitive agreement with Piramal Healthcare Limited to acquire full ownership of Piramal's Healthcare Solutions business (Domestic Formulations), a leader in the Indian branded generics market, for an up-front payment of $2.12 billion, plus $400 million annually for the next four years, giving Abbott the No. 1 position in the Indian pharmaceutical market. This further accelerates Abbott's emerging markets growth following the recent acquisition of Solvay Pharmaceuticals and announcements last week of Abbott's collaboration with Zydus Cadila as well as the creation of a new stand-alone Established Products Division to focus on expanding the global markets for its leading branded generics portfolio.

"This strategic action will advance Abbott into the leading market position in India, one of the world's most attractive and rapidly growing markets," said Miles D. White, chairman and chief executive officer, Abbott. "Our strong position in branded generics and growing presence in emerging markets is part of our ongoing diversified pharmaceutical strategy, complementing our market-leading proprietary pharmaceutical offerings and pipeline in developed markets."

"Emerging markets represent one of the greatest opportunities in health care not only in pharmaceuticals but across all of our business segments. Today, emerging markets represent more than 20 percent of Abbott's total business," said Mr. White.

"With this deal, the combined Healthcare Solutions and Abbott businesses will become the clear market leader in India, with a market share of approximately 7 percent," said Ajay Piramal, chairman, Piramal Group. "This was our collective vision and I am glad that those who are part of Piramal's Healthcare Solutions business will realize this dream."

The Indian Pharmaceutical Market

India is one of the world's fastest-growing pharmaceutical markets, due in large part to branded generics. The market will generate nearly $8 billion in pharmaceutical annual sales this year, a number that is expected to more than double by 2015. Abbott estimates the growth of its Indian pharmaceutical business with Piramal to approach 20 percent annually, with expected sales of more than $2.5 billion by 2020.

Branded generics have significant brand equity in many international markets, providing durable, sustainable franchises for future growth. Piramal markets the products in its Healthcare Solutions business in India only and does not market traditional generic products. Today, branded generics account for 25 percent of the global pharmaceutical market, have the majority of market share in the largest emerging markets, and are expected to outpace growth of patented and generic products.

The Mumbai-based Piramal Healthcare Solutions business has a comprehensive portfolio of branded generics with annual sales expected to exceed $500 million next year in India, and market-leading brands in multiple therapeutic areas, including antibiotics, respiratory, cardiovascular, pain and neuroscience. This business grew 23 percent in 2010 (fiscal year ended March 31, 2010), faster than the market in India. Piramal has a strong commercial presence, including the largest sales force in India with a unique model that includes dedicated sales personnel in rural areas inhabited by 70 percent of the population. The combined Abbott and Piramal sales forces will be the industry's largest in India.

Piramal's Healthcare Solutions business will become part of Abbott's newly created, stand-alone Established Products Division. Piramal's Healthcare Solutions business employs more than 5,000 people in India. Abbott, which is celebrating its 100th year in India, has more than 2,500 employees across all of its businesses there.

Abbott's Established Products Strategy

Throughout the past decade, Abbott has built a leading portfolio of branded generics, through its own products as well as those acquired with the 2001 acquisition of Knoll's pharmaceutical business. In 2007, the company established a separate business unit within its international pharmaceutical division dedicated to established products.

Additionally, a new geographic region focused on Russia, India and China was created, which resulted in the doubling of Abbott's growth rate in those countries.

Most recently, the company acquired Solvay Pharmaceuticals, obtaining a diverse branded generics portfolio and providing significant critical mass in key emerging markets.

As a result of these combined actions, Abbott is now among the leading multinational health care companies in numerous emerging markets. Approximately 20 percent of Abbott's pharmaceutical sales today are in emerging markets.

"We have assembled a market-leading branded generics portfolio tailored to the unique needs of emerging markets, strongly positioning Abbott to meet the current and future geographic and market dynamics in pharmaceuticals," said Olivier Bohuon, executive vice president, global pharmaceuticals, Abbott. "Piramal has built a reputation for high-quality, well-known and trusted pharmaceutical brands. We look forward to welcoming the accomplished staff of Piramal's Healthcare Solutions business to Abbott."

Pharmaceutical sales in emerging markets are expected to grow at three times the rate of developed markets and account for 70 percent of pharmaceutical growth over the next several years. This explosive growth is occurring as demographics, rising incomes, modernization of health systems and an increase in the treatment of chronic disease create greater demand for medicines.

Financial Highlights

Under terms of the agreement, Abbott will purchase the assets of Piramal's Healthcare Solutions business for a $2.12 billion up-front payment with payments of $400 million annually for the next four years, beginning in 2011. The transaction will not impact Abbott's ongoing earnings per share guidance in 2010. Abbott plans to fund the transaction with cash on the balance sheet.

This transaction is subject to shareholder approval of Piramal Healthcare Limited and other customary closing conditions, and is expected to close in the second half of 2010. This transaction is being conducted by a wholly-owned subsidiary of Abbott, resulting in full ownership of the assets of Piramal's Healthcare Solutions business (Domestic Formulations).

Abbott Conference Call

Abbott will conduct a special conference call today at 7:30 a.m. Central time (8:30 a.m. Eastern time) to provide an overview of the transaction. The live Web cast will be accessible through Abbott's Investor Relations Web site at

For more information on today's announcement, please go to Abbott's press kit at

Friday, May 21, 2010

The Argument Continues - Blu-ray or HD, Intel or AMD and now Illumina or Life Tech?

Nick Loman at Pathogens: Genes and Genomes says that the key players in the emerging third-generation sequencing market are comparable to the Intel x86 family and its famed competitor, reduced instruction set computing (RISC) chips, in the early 1990s. "Despite the seeming obvious killer advantages … RISC chips resoundingly failed in the desktop PC market, never challenging Intel’s dominance," Loman recalls. He writes that the choice facing labs considering which third-gen sequencing instruments to invest in is "very similar to the common nerd dilemma: buy a new laptop now, or wait for the next model?" Loman wonders whether the theory behind Moore's law will hold true for the transition from second- to third-generation sequencing technologies. "I propose that Illumina are Intel, and the Genetic Analyzer family (GA1, GA II, GAIIx, HiSeq 2000) are x86. Life Tech is AMD, producing similar technology with much reduced market share," Loman writes, adding that "the third-generation technologies could end up repeating the RISC story."

Continuing the analogy, Loman notes that when RISC was introduced, "there was a huge base of proprietary applications available that only ran on Intel x86 architecture," much like today, where "both the academic community and the commercial companies have invested in the second-generation space," and many have obtained de novo assemblers and aligners specifically for Illumina-generated data. "Finally, Intel managed to outmaneuver the threat from RISC by copying some of the best ideas from RISC and integrating them into the x86 family," Loman writes, evoking the possibility that Illumina could perform a "bolt-on upgrade" to the HiSeq using technologies from Oxford Nanopore, given the January 2009 marketing agreement between the firms. The blogger also says that Helicos BioSciences has "had a torrid time, failing to get any kind of market penetrance" with its single-molecule sequencer. Loman's best guess? "Certain third-generation technologies will be successful but not in direct competition with Illumina," he writes.

Thursday, May 20, 2010

OGI Starts Open Access Genomics Fund

The Ontario Genomics Institute has started a new fund that will be used to make genomics research papers available as open access from the date of their publications in journals.

The OGI Genomics Publication Fund (GPF) will contribute up to C$3,000 ($2,900) per publication to genomics researchers in Ontario who want to make their papers available.

The GPF will be open to researchers at Ontario-based academic, industry, or government institutions, and its goal is to maximize access to important genomics publications and to increase the visibility and citations of genomics research conducted in the province.

OGI expects to support up to 35 open access publications over the next 12 months, and it will either reimburse special fees charged by traditional publishers to make individual manuscripts open access or defray publication costs for manuscripts published in open access journals.

"OGI's program is targeting those publications with the greatest potential reach. Open sharing of knowledge should help to foster cross-disciplinary collaborations and to catalyze further research," James Till, university professor emeritus at the University of Toronto, said in a statement.

"The general public also doesn't have easy access to publications, which is incredibly unjust considering it's their tax dollars that are often funding research and indirectly paying for the literature through grants," Richard Roberts, who is CSO of New England Biolabs, said.

Roberts also said that greater access to such genomic research information "will help the public understand what the current treatments are, what trials are taking place, and what new medicines might be available to them."

Wednesday, May 19, 2010

Computational Biologist @ Pfizer UK

Computational Biologist

Computational Sciences are integral to Pfizer’s strategy to accelerate
the discovery of important new medicines. This is your opportunity to
join a dynamic, innovative team, focusing on the development and use
of novel computational methods and algorithms to enhance the
efficiency and effectiveness of drug discovery.

We are looking for a Computational Biologist with skills in working
with genomics data, developing novel algorithms and generating new
biological hypotheses to join our Computational Sciences Center of
Emphasis in Sandwich, UK.

Specific areas of interest include:

• Identification of new datasets and data mining techniques relevant
to drug discovery
• Developing and testing novel methods and algorithms, working with
academic collaborators where opportunities arise
• Building and testing prototype systems (both interfaces and
algorithm libraries)
• Working with scientists to apply tools to everyday problems
• Awarded or about to obtain Ph.D. in a life sciences, computational /
quantitative sciences discipline or equivalent experience
• Familiar with key recent publications, techniques and algorithms
within computational biology and genomics
• Knowledge of applied mathematics / statistics and algorithm development.
• The ability to apply understanding of statistics / machine learning
to new problems and datasets
• A sound understanding of molecular biology and application of
algorithms to solve biological problems
• A sound understanding of current state of the art methods in omics
data analysis, geneset enrichment, network analysis and genetics
• Advanced knowledge of a prototyping and scripting language (e.g.
R/Python/Ruby etc.) to work up new methods and algorithms and deliver
prototype code
• Working knowledge of production languages such as Java

Job Information
Position Type: Industrial and Commercial
Reference (Job ID number): 940074
Start Date: ASAP
Duration: Full Time
Status: open

Contact Information
Computational Sciences CoE
Ketan Patel

Tuesday, May 18, 2010

Pune-based Serum Institute of India (SII) has developed an H1N1 vaccine

In a major advancement in influenza science, India is ready with its first indigenous vaccine against H1N1 swine flu.

Pune-based Serum Institute of India (SII) has developed an H1N1 vaccine, not a painful syringe shot but a harmless nasal spray, which can be used by anybody above the age of three except pregnant women.

SII will apply to the drug controller general next week for licensure of the vaccine, which is expected to cost around Rs 150.

Scientists, who are presently completing tabulation of results from the vaccine’s phase-III clinical trial, say it is safe and effective with side-effects being runny nose and a bout of sneezing.

Interestingly, the breakthrough comes exactly a year after India reported its first case of swine flu (May 15, 2009).

Confirming this to TOI, SII’s executive director (operations) Adar Poonawala said, “Our nasal mist vaccine is now ready. We will apply for licensure next week. It had no side-effects which are synonymous to injectable vaccines like fever, swelling or convulsions.”

Poonawala added, “India now has the capability to make its very own seasonal influenza vaccines. With the technology now in place, all we have to do is switch the pandemic H1N1 strain with the seasonal flu virus.”

The vaccine will be delivered into your nose through a device fitted on top of a syringe. A quick spray in each nostril, the major route that the flu virus takes to enter, and the body develops antibodies to protect against H1N1.

“It is a live attenuated vaccine containing weakened forms of the H1N1 virus designed not to cause the flu. The strain was given to us by the World Health Organisation once H1N1 was declared a pandemic,” said Serum’s H1N1 vaccine project director Dr Rajeev Dhere.

Explaining the clinical trials of this vaccine, Serum’s additional medical director Dr Prasad Kulkarni said it was a double blind placebo control trial involving 330 people of which 110 were 18-49 years, 110 were above 50 years and the rest children aged 3-17 years.

This means half of them were given the vaccine while half were given placebo. Testing of the samples was jointly done by Serum and the National Institute of Virology (Pune). Trials were conducted in three institutes from Pune — KEM hospital, D Y Patil Medical College and Bharatiya Vidyapith — and one each from Indore and Ahmedabad — Chacha Nehru hospital and Lambda Lab.


Agilent Completes Varian Acquisition

Agilent Technologies said after the close of the market Friday that it has completed the roughly $1.5 billion acquisition of Varian.

Completion of the deal was announced a day after Agilent said that the European Commission had informed the firm that it had met the conditions set forth in January for the acquisition of Varian. Those conditions included the divestiture of certain product lines, which were recently sold to Bruker.

Agilent said today that the majority of Varian's product lines will become part of the Chemical Analysis Group, with some key businesses being housed in Agilent's Life Sciences Group. Adding to Agilent's portfolio of mass spectrometry, liquid and gas chromatography, and array products, the firm's life sciences business gains Varian's nuclear magnetic resonance, MRI, and X-ray products.

"These technology platforms will open new doors for Agilent and its customers," Nick Roelofs, Agilent's senior VP and president of LSG, said in a statement. "This technology will play a key role in Agilent's growth through applications such as pharmaceutical and therapeutics."

Agilent is scheduled to report its second-quarter 2010 financial results after the close of the market today.

Monday, May 17, 2010

Opening for Senior Bioinformatics Analyst @ Ocimum Bio Solutions

Senior Bioinformatics Analyst


- PhD or equivalent experience in life science / biology.

- 4+ years of demonstrated success working in the life sciences, preferably in an industry research setting.

- In-depth understanding, analysis and processing of data from various post-genomic technologies and their applications.

- In-depth knowledge of gene expression (genotyping, chromosomal copy number, SNPs, gene signatures) and sequencing.

- Wide understanding of commercially available bioinformatics / clinical genomics tools and databases.

- Ability to handle complex / multiple projects in a rapidly changing environment.

- Good knowledge of Perl, C / Java and any relational database.


- Analyzing and documenting project requirements.

- Developing biocomputing algorithms for tools development and data analysis.

- Performing biological analysis on data from various domains such as gene expression, sequencing and molecular lab.

- Co-ordination with the sales team on projects to understand the expectations and requirements of scientists / labs.

- Co-ordination with other teams from IT, statistics, QA and sales.

Interview process consists of:

1. Analytical test paper - 30 minutes

2. Technical Paper: Bioinformatics – 1 hr

3. Presentation on Bioinformatics Topic (Candidate can choose the topic)

4. Face-to-Face Interviews (at least a couple)

Note: Only shortlisted candidates will progress through the various steps outlined above.

Telephone interviews can be arranged for candidates who cannot attend the written test. For the face-to-face round, the client will pay the train fare (up to 3-tier).

To apply contact:

Chandra Gadde,
Hand Phone: 9000385559 | Board: 040-64614559

Thursday, May 13, 2010

Lockheed Martin to Apply Text Mining to Medical Records to Merge Phenotypic, Genomic Data

Lockheed Martin is working with researchers at Johns Hopkins University and medical informatics firm Sage Analytica to mine a "unique" set of medical records with the aim of integrating clinical and genomic data for prostate cancer research.

The researchers are using Lockheed Martin's rule-based natural language processing platform, called ClinRead, to extract phenotypic descriptors from a set of clinical records tracking 33 men with metastatic prostate cancer for over 15 years before they succumbed to the disease. (Read Full Article)

Wednesday, May 12, 2010

EMBL Launches Genomics Data Resource

The European Molecular Biology Laboratory (EMBL) has launched a genomics resource called the European Nucleotide Archive (ENA) that consolidates three DNA and RNA sequence databases.

EMBL's European Bioinformatics Institute (EMBL-EBI) will host the ENA resource, which is made up of the EMBL Nucleotide Sequence Database, the European Trace Archive, and the Sequence Read Archive (SRA).

The European Trace Archive, formerly maintained at the Wellcome Trust Sanger Institute, contains raw data from electrophoresis-based sequencing machines, while the SRA is a new repository for raw data from next-generation, array-based sequencing platforms.

The ENA research team plans to launch new features for the resource over the coming year, including enhancements for the browser, improved interactive submission tools, and organism- and project-centered portals into ENA data.

"ENA has been designed to provide our users with improved access both to annotated and to raw sequence data through the same user-friendly interface," Guy Cochrane, ENA's team leader, said in a statement.

"It provides graphical browsing, web services, text search, and a new rapid sequence similarity search. ENA also provides access to related information, with over 190 million cross references to external records, many of which are in other EMBL-EBI data resources," Cochrane added.

"As major generators of DNA sequence data, it is important to us that the research community has ready access not only to annotated sequence information, but also to raw data," Tim Hubbard, head of informatics at the Wellcome Trust Sanger Institute, added in the statement.

Funding for the ENA is provided by EMBL, the Wellcome Trust, and the European Commission's Framework Programme 7.

Monday, May 10, 2010

BioTorrents: A file sharing service for scientific data

BioTorrents is a website that allows open-access sharing of scientific data using the popular BitTorrent peer-to-peer file sharing technology. BioTorrents allows files to be transferred rapidly by sharing bandwidth across multiple institutions, and it provides more reliable file transfers thanks to the error checking built into the file sharing technology. BioTorrents offers multiple features, including keyword searching, category browsing, RSS feeds, torrent comments, and a discussion forum.
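As a rough illustration of the error checking mentioned above: BitTorrent splits a file into fixed-size pieces and stores a SHA-1 hash of each piece in the torrent metadata, so a downloader can verify every piece it receives and re-request only the damaged ones. The sketch below shows that idea in Python. The function names, the tiny piece size, and the sample data are assumptions made for illustration; this is not code from BioTorrents or from any BitTorrent client.

```python
import hashlib

# Tiny piece size so the example is easy to follow (an assumption for
# illustration; real torrents use far larger pieces).
PIECE_LENGTH = 4


def piece_hashes(data: bytes, piece_length: int = PIECE_LENGTH) -> list[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


def corrupted_pieces(received: bytes, expected: list[bytes],
                     piece_length: int = PIECE_LENGTH) -> list[int]:
    """Return the indices of pieces whose hash does not match the expected one.

    Pieces that are missing entirely from `received` are reported as well.
    """
    actual = piece_hashes(received, piece_length)
    bad = [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
    bad.extend(range(len(actual), len(expected)))
    return bad


if __name__ == "__main__":
    original = b"ACGTACGTTTGA"      # hypothetical data set shared by the uploader
    published = piece_hashes(original)  # hashes that would travel with the torrent

    damaged = b"ACGTACGATTGA"       # one byte altered in transit
    print(corrupted_pieces(damaged, published))
```

Only the pieces reported by `corrupted_pieces` would need to be fetched again, which is why a transfer of a large data set does not have to restart from the beginning after a single error.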

BioTorrents is available at

A complete description of BioTorrents is given in the manuscript at

Friday, May 7, 2010

Opportunity: Microarray & Next-generation Sequence Curator for Gene Expression Omnibus (GEO) curation team

Computercraft seeks a highly motivated Molecular Biologist to join the Gene Expression Omnibus (GEO) curation team onsite at the National Institutes of Health (NIH) in Bethesda, MD. GEO is the largest fully public repository for functional genomic data, primarily microarray and next-generation sequence datasets. More information on GEO can be found on the web site

We are currently looking for someone with a background in molecular biology, genomics, or biomedicine who is capable of working with large datasets. This person will be a member of the GEO curation team, helping to review and process incoming data submissions. The successful candidate will work at NIH's National Center for Biotechnology Information (NCBI) in the National Library of Medicine (NLM).

The successful candidate will perform the following tasks:
- Review and evaluate data submissions for structural integrity and content.
- Communicate extensively with researchers, resolving issues relating to submission procedures, formats, content, and site navigation.
- Utilize UNIX C shell commands and scripts to assemble, edit, and upload large data files to the database.
- Perform advanced database curation and assembly of comparable datasets to reflect biological variables and experimental design.
- Test, develop, and troubleshoot new query, data display, and analysis features.

- This challenging position requires a Ph.D. or M.Sc in molecular biology or related field.
- Excellent general computer skills, including familiarity with spreadsheets, are required.
- Excellent written/verbal communication skills are an absolute requirement.

- Practical experience with microarrays or high-throughput sequencing is highly desirable but not required to apply.
- Experience with LINUX/UNIX is highly desired.

Computercraft offers a competitive salary and an excellent benefits package including PPO health insurance with 100% company paid premiums, 401K program with matching, paid time off and holiday pay, life insurance, flexible spending and disability coverage. We offer an excellent work life balance with a standard 40 hour work week and the chance to work alongside accomplished scientists at NIH/NCBI.

To apply for this position or learn about other Computercraft job opportunities, please visit the Careers section of our website:

Computercraft is an equal opportunity employer.

Wednesday, May 5, 2010

BioSLAX - Bioinformatics Live-CD Linux Slackware Suite

BioSLAX is a new live CD/DVD suite of bioinformatics tools
that has been released by the resource team of the
BioInformatics Center (BIC), National University of Singapore (NUS).
Bootable from any PC, this CD/DVD runs the compressed
SLACKWARE flavour of the LINUX operating system also known as SLAX.

SLAX is becoming the live CD/DVD of choice because of its ability to
modularize almost any application and plug it into the system on the fly.

The system can also be installed to USB thumb drives or
directly to the PC as a regular Linux using the BioSLAX installer provided.

It consists of commonly used Bioinformatics Applications,
Software and Algorithms.

For Download,

Monday, May 3, 2010

Position Open - R&D - Bioinformatician

R&D - Bioinformatician
Reputed client of GlanzHR Services Private Limited

Experience: 5 - 10 Years
Location: Bengaluru/Bangalore
Education: UG - Any Graduate - Any Specialization PG - M.Sc - Microbiology
Industry Type: Pharma/ Biotech/Clinical Research
Role: Research Scientist
Functional Area: Healthcare, Medical, R&D
Posted Date: 28 Apr

Desired Candidate Profile
M.Sc. / Advanced diploma in Bioinformatics or Genetics / Biotechnology / life sciences with good bioinformatics knowledge and analytical and reasoning skills. Knowledge of population genetics is a plus.

Job Description
Analyze genomics data from multiple perspectives to derive new knowledge. Understand, benchmark and implement tools available for the above task. Develop new genomics data analysis methods. Understand, implement and maintain quality control.

Keywords: M.Sc. / Advanced diploma in Bioinformatics or Genetics / Biotechnology / life sciences with good bioinformatics knowledge. Population genetics is a plus. Analyze genomics data from multiple perspectives to derive new knowledge.

Company Profile
A biotech company.

Contact Details
Company Name: Reputed client of GlanzHR Services Private Limited
Email Address:

Reference ID: R&D - Bioinformatician
Hukum C. Rawal

Life Science and Informatics

What is this?
Is this a new industry?
Or old wine in a new bottle?

Well, Life Sciences and Informatics can be anything from computational biology, all the -omes and -omics, and core bioinformatics to curation, literature mining, and database creation in the biology, chemistry, and biochemistry space.

There are a number of companies in India, and Bangalore is at the forefront as a major bio-cluster with 20 to 30 companies in this sphere.

Now, how well are these companies doing?
How good are they in terms of international markets, and how profitable is their business?
What do they do?
Who are their clients?

These are some interesting things that could be discussed in this blog page...
