ISMB 2012: Workshop Proposal


The final ISMB Bioinfo Core Workshop agenda can be found at [1]

Click on Show on the Workshop page if the agenda does not display.


Introduction

The 2012 ISMB meeting was held 15-17 July in Long Beach, California. The bioinfo-core workshop ran for 2 hours, split into two one-hour topics. Within each topic we had two short talks (~10 minutes each) by members of the group to introduce the topic, followed by an interactive panel discussion with all workshop participants in which the topic was further explored. Below you will find the materials from the conference.

You can also read through the workshop's tweets by searching on [2] at twitter.com.

As in previous years, we aimed to have one topic focused mostly on science and analysis, and another focused on the technical aspects of running a core facility.

Introduction to the Session by Brent Richter

  • Call for Nominations to the bioinfo-core organizing committee
  • 10 years of the Bioinfo-core

Topic 1 (science): Extracting biological information from diverse data sources

Scientists have typically thought of experiments in terms of using a single technology to answer a question. Even with the advent of genome-wide analyses, experiments have generally relied on a single type of assay (sequencing, microarray, mass spec, etc.). However, we are seeing that a complete biological picture of an experimental system can only really be gained by integrating multiple diverse data types. Increasingly, research departments and projects are finding that sequencing data alone limits their findings (gene associations are noisy in that the genes identified, once examined closely, usually point to other upstream or downstream factors) and are turning to data and analysis in the domains of network and systems biology. In some fields (e.g. epigenetics) the need for this type of integration of methylation, histone modification, nucleosome positioning, mass spec and RNA-seq data is already present, but the same principle applies to most biological systems.

In this session we will aim to look at the practicalities of this type of integration, including success stories, pitfalls and failures, and some discussion of the tools available to help.
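As a purely illustrative sketch of the kind of integration discussed above (not material from the workshop), the example below overlaps ChIP-seq peaks with differentially expressed genes using Bioconductor's GenomicRanges; the file names, column layout and 2 kb window are assumptions for illustration.

 ## A minimal sketch, assuming hypothetical input files: overlap ChIP-seq peaks
 ## with differentially expressed genes using GenomicRanges (Bioconductor).
 library(GenomicRanges)
 
 ## Hypothetical inputs: a 3-column BED-like peak table and a table of DE genes
 peaks <- read.table("h3k4me3_peaks.bed",
                     col.names = c("chrom", "start", "end"))
 genes <- read.delim("de_genes.txt")   # columns: chrom, start, end, gene, log2fc
 
 ## BED coordinates are 0-based half-open; GRanges are 1-based inclusive
 peak_gr <- GRanges(peaks$chrom, IRanges(peaks$start + 1, peaks$end))
 gene_gr <- GRanges(genes$chrom, IRanges(genes$start, genes$end),
                    gene = genes$gene, log2fc = genes$log2fc)
 
 ## Which differentially expressed genes have a peak within 2 kb of the gene body?
 hits   <- findOverlaps(gene_gr, peak_gr, maxgap = 2000)
 marked <- unique(gene_gr$gene[queryHits(hits)])
 cat(length(marked), "differentially expressed genes have a nearby peak\n")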

Speakers

  • Session Leaders: Jim Cavalcoli + Fran Lewitter
         Discussion: Simon Lin, CAMDA.  Can we leverage CAMDA as a standard dataset for testing tools?

Topic 2: Handling increasing demand and requirements from different sets of users within a core

Many core facilities deal with diverse sets of users who come with very different needs. Balancing the needs of these different sets of users is a challenging task that can be compounded by financial, regulatory or political pressures. Common headaches include the special requirements of users working with clinical samples, or pipelines "translating" from research use to production or even to a clinical sequencing service; the pressure to make this transition raises concerns about traceability, tracking and privacy that must be addressed. Additionally, external or commercial samples may come with service-level agreements that dictate the level, speed or quality of service required.

Keeping everyone happy in these situations is a potential minefield for a core, so in this session we will hear how some existing facilities have coped with these pressures, and what challenges remain. We will discuss in detail some of the specific problems and solutions core facilities have encountered when faced with new business drivers and the expectations of new communities coming to the core: from "translational" areas such as clinical sequencing to external demand from commercial or outside collaborators.

Speakers

  • Session Leaders: Brent Richter and David Sexton


         Discussion notes:
          • Helpdesk: there is no good, free bioinformatics helpdesk format. EMBOSS? There is no good way to transfer results. BLAST and other web resources? They require training and are slow. Excel? We just say no.
          • Statistics: we offer training. Our users cannot do this in Excel, but they lack expertise in other tools, so we promote the use of R.
          • What about modern lab scientists? We train them in R. Since it is a scripting language, they have switched from Perl to R, with the benefit of being trained in statistics and Bioconductor in the process (a short R sketch follows below).
          • Outstanding problem: displaying results in a genome browser. Does anyone have a good solution for displaying data in a genome-wide format?
          • Galaxy bridges the gap between beginner and more advanced users and gets them started; after climbing the tools' learning curve, users tend to transition to command-line tools. Galaxy does not replace the bioinformatician, the sysadmin or big-data infrastructure, and it needs maintenance.
          • There are still no good plasmid mapping tools; nothing available is perfect or ideal.
          • Dealing with different regulations in different countries remains a challenge.
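To ground the point above about moving users from Excel into R, here is a minimal sketch of the kind of first exercise a core might teach; the file name and column layout are hypothetical and not part of the workshop material.

 ## A minimal sketch, assuming a hypothetical tab-delimited counts file
 ## (genes in rows, samples in columns), of a first "Excel to R" exercise.
 counts <- read.delim("gene_counts.txt", row.names = 1)
 
 ## Per-sample library sizes: quick QC that is awkward in a spreadsheet
 summary(colSums(counts))
 
 ## Normalise to counts per million and plot the per-sample distributions
 cpm <- t(t(counts) / colSums(counts)) * 1e6
 boxplot(log2(cpm + 1), las = 2, main = "log2 CPM per sample")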

Speaker Notes