9 Feb 2009

From BioWiki

Topics:

  • Support for short-read sequencing
  • Distributing bioinformatics tools for Academic use via clouds?

Transcript of Minutes

Meeting Minutes

Brief Description:

After participants introduce themselves and Fran Lewitter leads a brief discussion of meetings at ISMB2009, we will discuss two topics.

Support for short-read sequencing (led by Hershel Safer & David Sexton)

A year ago we discussed computing hardware to support short-read sequencing. In this phone call we'll focus on practical aspects of helping biologists access and make sense of their data. Please come prepared to discuss current practices in your environment and what kinds of developments would be productive for your institution.

We will address the following two questions.

  • What kinds of analyses should we offer biologists who do short-read sequencing?
      • What standard processing ought we do beyond aligning reads to a genome?
      • What kinds of additional processing can be standardized?
      • What are the best ways of offering customized analyses?
  • Can and should the short-read sequencing community develop a collaborative, open-source LIMS? It would save the effort of every organization developing its own, which is more or less what is happening now. Several open-source and commercial LIMS are available, some directed at short-read sequencing and others applicable to more general biological research.
      • Is a community effort possible and worthwhile?
      • Might a general system serve the needs of many institutions? If so, how might such a project get off the ground?

Distributing bioinformatics tools for Academic use via clouds (led by Dawei Lin and Brent Richter)

One of the foundations of bioinformatics research and development is the availability of open-source tools. Dissemination of knowledge and encouragement of collective development have been a tremendous benefit for the community. Installing and deploying those tools, however, is often a non-trivial task for even the most seasoned bioinformatician. The major difficulty usually derives from the tools' technical dependencies on system and software libraries, which may not be packaged with the distribution.

Today, with the wide use of Linux and Virtual Machine (VM) technologies, it is possible to distribute a tool as a VM, complete with all the necessary dependencies included in the OS.

Such an idea has been prototyped in several bioinformatics cores to quickly build basic environments that include BioPerl tools, genome browsers (GBrowse, UCSC browser), data analysis pipelines, and customized LIMS. What are the limitations and benefits to the cores of packaging tools this way?

With VMs available such as Biskit for structural biology (http://biskit.pasteur.fr/), SnowFlock for bioinformatics clustering (http://sysweb.cs.toronto.edu/snowflock), DNALinux for a suite of bioinformatics tools (http://www.vmware.com/appliances/directory/963), and even databases such as WormBase from CSHL (http://www.wormbase.org/wiki/index.php/Virtual_Machines), has anyone in the bioinfo-core community had direct experience with such systems? What is their utility? Do they deliver on the promise of being easy for biologists to set up on their own desktops or servers?