ISMB 2017: BioinfoCoreWorkshop

Revision as of 05:15, 24 July 2017

We are holding a Bioinfo-core workshop at the ISMB/ECCB meeting in Prague. We have been provided a half-day slot in the program on Monday, July 24, 2017, 10:00 am – 12:30 pm.

Standing on Two Legs: Managing Operations in a Core and Ensuring Scientific Reproducibility

Workshop Structure

The workshop is split into two sessions with a required break between them. Each session will have two 15-minute talks followed by a discussion.

  • The first slot will have two 15-minute talks on the topic of Managing Core Facilities, followed by a 40-minute audience discussion. After the break we will have two 15-minute talks about Ensuring Scientific Reproducibility, followed by a 30-minute panel discussion.

Workshop topics

Managing a Core Facility


Setting up a new bioinformatics core facility: a first year review
Speaker: Russell Hamilton, University of Cambridge
Time: 10:00-10:15


Managing people in a core facility
Speaker: Annette McGrath, CSIRO The University of Queensland
Time: 10:15-10:30


One constant of running a core facility is that you have to deal with people at all levels: funders, staff, investigators/clients, and so on. Differences among your clients affect what kind of staff, in terms of skills and personalities, a core needs to maintain. Consider the need for pipeline redundancy and how it translates to staffing, and whether staff need to interact directly with clients; customer service is important, for example. Managers have to take care of their people. Do they have what they need in computers, tools and training? Ensure they can take pride in their work, work out time and priority conflicts, and check that they have the big picture of the mission of the organization, the core and their clients. Talk to your staff about what motivates them and about ways to stay engaged. Assign projects that stretch their abilities and instill a sense of ownership of their projects. Celebrate successes. Annette has always advocated for the need for bioinformatics within her organizations, treating it as a priority.

Some questions:

  • What's your biggest horror story?
  • What are some of the traits you look for in a candidate?
  • What's the size of your team(s) and the reporting structure?

Management Panel: 10:30-11:10

Balance of operating a core and doing research: research is secondary, but how are you judged? How do you manage PhD students through the core? They are co-supervised.

A fair number of people in the audience maintain both research and service. How are these balanced? The administrative part of running the core comes first. How many people are hired into academic tracks vs. professional tracks?

The problem of career progression: there is none at all. People are hired into a position and remain at that hiring position. This was highlighted at a recent Cores conference in the UK. Interestingly, funders (Wellcome, etc.) were present and discussed putting together a professional development track to parallel the academic track. What is rewarded?

What is the balance between providing training and owning the expertise in a specific analytical domain? Should you train collaborators? How much of the analysis does a core do by itself, and how much does it empower others to perform? Self-help is important for getting started and for answering low-level questions. However, developing this body of knowledge as how-to articles or knowledge bases is challenging because of the time it takes.

Another potential topic for discussion is training and education: how much do you train, on what, and at what level? Does it cut into your business?

Training at CSIRO: develop a mission and stay focused. The aim is to give others enough literacy to understand some of the basic information. "Data internships" have a customer intern with a bioinformatician for a week or two.

How do you stay organized as you grow in people? Is it manageable: are you all working like mad, or do you have the same number of projects, now shared more widely (greater bandwidth)?

The "embedded" bioinformatician: how do you keep them engaged in the core? On the dynamic of research bioinformaticians embedded in the core: these people generally feel the most isolated, so reach out and engage them as a community.

Different organizational structures: a study/survey opportunity. Reporting to an academic? Working inside institutional funding?

Ensuring Reproducibility


Developing Reliable QC at the Swedish National Genomics Infrastructure
Speaker: Phil Ewels, SciLifeLab (Sweden)
Time: 11:30-11:45
File:Phil Ewels ISMB BioinfoCore 2017.pdf

Core facility for all of Sweden. Maintains a continuum of QC, from fully automated and rigorous QC using MultiQC to occasional QC for development. Uses continuous integration with GitHub, Docker and Travis CI. Uses Nextflow for NGI-RNAseq and other pipelines. Everything is on GitHub: http://opensource.scilifelab.se, http://multiqc.info, http://GitHub.com/SciLifeLab.

Reproducible and fully documented data analyses at the Functional Genomics Center Zurich
Speaker: Lennart Opitz, University of Zurich
Time: 11:45-12:00

Has leveraged an Open Technology Platform since 2002.

Reproducibility Panel: 12:00-12:30

Opitz: How do you deal with external clients and data backup? Data is backed up by policy. Users sign up for retention of their data for a period of time; industry and external users understand that their data is captured and kept for 3-6 months. QC results are kept forever, but the original data is purged.
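
The retention rule described above (original data purged after a fixed period, QC results kept indefinitely) can be sketched in a few lines of Python. This is only an illustration, not the center's actual system: the 180-day window is an assumption standing in for "3-6 months", and the StoredFile record and files_to_purge helper are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: 180 days stands in for the "3-6 months" mentioned in the discussion.
RAW_RETENTION = timedelta(days=180)


@dataclass(frozen=True)
class StoredFile:
    """Hypothetical inventory record for one stored file."""
    path: str
    kind: str        # "raw" for original data, "qc" for QC results
    received: date   # date the data was delivered to the core


def files_to_purge(files, today):
    """Return raw files older than the retention window; QC results are never purged."""
    return [
        f for f in files
        if f.kind == "raw" and today - f.received > RAW_RETENTION
    ]


if __name__ == "__main__":
    inventory = [
        StoredFile("run1/sample.fastq.gz", "raw", date(2017, 1, 10)),
        StoredFile("run1/qc_report.html", "qc", date(2017, 1, 10)),
        StoredFile("run2/sample.fastq.gz", "raw", date(2017, 7, 1)),
    ]
    for f in files_to_purge(inventory, today=date(2017, 7, 24)):
        print("purge:", f.path)
```

Keeping the rule as a pure function of an inventory and a date makes it easy to show clients exactly which files a given policy would remove before anything is deleted.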

What is everyone's definition of reproducibility? Is it that you are able to run the data again in 10 years, and are you assured of getting the same results? Ewels has everything on GitHub and can re-run data using specific versions of tools branched on GitHub. Singularity is used to run containers on HPC. ISO certification: validation steps, everything is audited, and there is a complete documentation system around the IT and informatics systems.
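
One way to make "same tools, same inputs" checkable before a re-run is to store a small manifest next to each result. The sketch below is an assumption for illustration, not the setup either speaker described: build_manifest and differences are hypothetical helpers, and the tool name and version numbers are made up.

```python
import hashlib
import json


def sha256_of(data: bytes) -> str:
    """Checksum used to detect any change in an input file's content."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(tool_versions, inputs):
    """Record tool versions ({name: version}) and input checksums ({name: bytes})."""
    return {
        "tools": dict(sorted(tool_versions.items())),
        "inputs": {name: sha256_of(data) for name, data in sorted(inputs.items())},
    }


def differences(old, new):
    """List human-readable reasons why a re-run may not match the original run."""
    problems = []
    for section in ("tools", "inputs"):
        for key in sorted(set(old[section]) | set(new[section])):
            before, after = old[section].get(key), new[section].get(key)
            if before != after:
                problems.append(f"{section}/{key}: {before} -> {after}")
    return problems


if __name__ == "__main__":
    # Hypothetical tool name and versions, for illustration only.
    original = build_manifest({"aligner": "2.5.0"}, {"sample.fastq": b"ACGT"})
    rerun = build_manifest({"aligner": "2.6.0"}, {"sample.fastq": b"ACGT"})
    print(json.dumps(original, indent=2))
    print(differences(original, rerun))
```

A manifest like this does not guarantee identical results on its own, but it turns "can we reproduce this in 10 years?" into a concrete list of what has changed since the original run.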

Do you have a sense of whether your users understand the investment you have made to set up such a rigorous system? A lot of people couldn't reproduce their old pipelines and didn't trust them. They've written the QC measures to demonstrate quality and value, and to improve reproducibility and trust. They try to develop interactive tools with Shiny. Training on experimental design and coaching, involving the core in experimental design. It's an advantage to work in a center of core labs, working together with the sequencing lab, the informatics lab, etc., so the kick-off is all done together. It is also an advantage to have the wet lab right next door. Idea: an interactive tool that exports a reproducible script.
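
The last idea, an interactive tool that exports a reproducible script, can be sketched as follows. This is a minimal Python illustration and not a Shiny application: export_script, the template, and the toy analysis (mean of values at or above a threshold) are all assumptions. The point is only that the user's interactive choices are frozen into a standalone file.

```python
import json

# Template for the exported file; the toy analysis inside it is an assumption.
TEMPLATE = '''#!/usr/bin/env python3
# Generated from an interactive session; re-running this file repeats the analysis.
import json

PARAMS = json.loads({params!r})

def run(values, params):
    kept = [v for v in values if v >= params["min_value"]]
    return sum(kept) / len(kept) if kept else None

if __name__ == "__main__":
    print(run(PARAMS["values"], PARAMS))
'''


def export_script(params):
    """Freeze the interactively chosen parameters into a standalone script."""
    return TEMPLATE.format(params=json.dumps(params, sort_keys=True))


if __name__ == "__main__":
    choices = {"min_value": 2, "values": [1, 2, 3, 4]}
    print(export_script(choices))
```

Serializing the parameters with sorted keys means the same choices always produce the same script text, so exported scripts can be version-controlled and compared.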
