ISMB 2012: Workshop Proposal


The final ISMB Bioinfo Core Workshop agenda can be found at [1]

Click on Show on the Workshop page if the agenda does not display.



Introduction

The 2012 ISMB meeting will be held 15-17 July in Long Beach, California. The Bioinfo-core group is again proposing a workshop within the main body of the conference. This workshop will run for 2 hours, split into two one-hour topics. Within each topic, members of the group will give short talks (~10 mins each) to introduce the topic, followed by an interactive panel discussion with all workshop participants where the topic can be explored further.

This page will initially help us to put together a proposal for ISMB and, assuming our proposal is accepted, will become the official program for the workshop.

As in previous years we have aimed to have one topic mostly focused on science and analysis, and another which focuses on the technical aspects of running a core facility.

Topic 1 (science): Extracting biological information from diverse data sources

Scientists have typically thought of experiments in terms of using a single technology to answer a question. Even since the advent of genome-wide approaches, analyses have generally been performed using a single type of data (sequencing, microarray, mass spec, etc.). However, we are seeing that a complete biological picture of an experimental system can only really be gained by integrating multiple diverse data types. Increasingly, research departments and projects find that sequencing data alone limits their findings (gene associations are noisy, in that the genes identified, once examined closely, usually point to other upstream or downstream factors), and they are turning to data and analysis from the domains of network and systems biology. In some fields (e.g. epigenetics) the need to integrate methylation, histone modification, nucleosome positioning, mass spec and RNA-seq data is already present, but the same principle applies to most biological systems.

In this session we will aim to look at the practicalities of this type of integration, including success stories, pitfalls and failures and some discussion about the tools available to help.

Introduction

Speakers

  • Session Leaders: Jim Cavalcoli and Fran Lewitter
  • Talk 1: Stephen Turner, University of Virginia
  • Talk 2: James Cavalcoli, University of Michigan

Discussion: Simon Lin, CAMDA. Can we leverage CAMDA as a standard dataset for testing tools?

Topic 2: Handling the increase of demand and requirements from different sets of users within a core

Many core facilities deal with diverse sets of users who come with very different needs. Balancing the needs of these different sets of users is a challenging task which can be compounded by financial, regulatory or political pressures. Common headaches might include the special requirements of users working with clinical samples, or of pipelines "translating" from research use to production or even to a clinical sequencing service. Pressure to make this transition, and concerns about traceability, tracking and privacy, must be addressed. Additionally, external or commercial samples might come with service level agreements which dictate the level, speed or quality of service required.

Keeping everyone happy in these situations is a potential minefield for a core, so in this session we will hear how some existing facilities have coped with these pressures, and what challenges remain. We will discuss in detail some of the specific problems and solutions core facilities have had when faced with new business drivers and the expectations of new communities coming to the Core: from "translational" areas such as clinical sequencing to external demand such as from commercial or outside collaborators.

Speakers

  • Session Leaders: Brent Richter and David Sexton
  • Talk 1: Bioinformatics Helpdesk: plasmid maps to HiSeq, Hans-Rudolph Hotz, Friedrich Miescher Institute for Biomedical Research, Switzerland


         Discussion: There is no good, free bioinformatics helpdesk format. EMBOSS? There is no good way to transfer results. BLAST and other web sources? Training is required, but they are slow. Excel? We just say no. Statistics: we offer training. Our users can't use Excel, but they don't have expertise in other tools, so we promote the use of R. What about modern lab scientists? We train them in R. It is a scripting language, so they have switched from Perl to R, with the benefit of training them up in stats and Bioconductor in the process. Problem: displaying results in a genome browser.
         Does anyone have a perfect solution for displaying data in a genome-wide format? Galaxy does bridge the gap between the beginner and the more advanced user: it gets them started. Once powered up, though, they transition to command-line tools. Galaxy does not replace the bioinformatician, the sysadmin or big-data handling; it needs maintenance and has no plasmid mapping tools. It is neither perfect nor ideal.
  • Talk 2: Challenges with Medical Data, Reinhard Schneider, University of Luxembourg, Luxembourg
         Medical Data in Core Facilities
         Dealing with different regulations in different countries.

Speaker Notes