DOE Joint Genome Institute home page MGM Workshop home page

Agenda

Monday

Provide feedback Please use this form to provide feedback on the workshop

8:30 AM

Registration

 

 

Introductory seminars (Methods & Technologies)

 

09:00–09:30

Welcome and Overview of the Workshop

Nikos Kyrpides

09:30–10:00

1. Introduction to the JGI

The powerful high-throughput DNA sequencing technologies catalyzed by the Human Genome Project, which have contributed to dramatic advances in biomedicine, are now being directed to characterizing the genomes of plants and microbes. Leading this effort is the US Department of Energy (DOE) Joint Genome Institute (JGI), a national user facility that unites the expertise of five national laboratories to advance genomics in support of the DOE mission areas of bioenergy, carbon cycling, and bioremediation.

Jim Bristow

10:00–10:45

2. New Sequencing Technologies

JGI's future depends on new sequencing technologies and on the applications built on them. With multiple sequencing platforms available, JGI's R&D team aims to develop sequencing applications that play to the strengths of each platform. Our areas of development are de novo whole genome shotgun sequencing, transcriptome sequencing, and metagenomic sample diversity studies. Examples of JGI's available sequencing applications in genomic research will be discussed.

Feng Chen

10:45–11:00

Break

11:00–11:45

3. Sequence Assembly Overview

While the ultimate goals of sequencing projects vary as much as the samples themselves, identifying gene content is a nearly universal goal. Recent work has shown that the lower limit for sequence lengths producing good annotation still exceeds the read lengths achievable with next-generation sequencing platforms. Assembly is therefore a common step in analysis pipelines, since it can increase sequence length and reduce complexity via clustering. This talk will provide a high-level overview of assembly and discuss its challenges and limitations, especially when using next-generation sequence data.

Alex Copeland

11:45–13:00

Lunch - JGI Facilities Tour

 

13:00–13:30

4. Single cell genomics

The bulk of finished microbial genomes to date are derived from bacteria and archaea that can be readily grown in culture. However, the vast majority of microorganisms on this planet elude current culturing attempts, severely limiting access to their genomes. While various enrichment methods as well as metagenomic approaches have been successfully applied to aid the genome analysis of such non-cultivable environmental microbes, these methodologies are not suitable for countless community members of interest. Single cell genomics is a new approach which aims to access the genome from an individual microbial cell. Single cells can be isolated from the community using optical tweezers, micromanipulators, flow-sorting, or serial dilutions. After cell lysis, the microbial genome is amplified using multiple displacement amplification (MDA), allowing random genome shotgun sequencing. The advantages of the single cell genomics approach, as well as the problems associated with it, will be discussed.

Tanja Woyke

13:30–14:00

5. Introduction to Metagenomics

Metagenomics, the application of high-throughput sequencing to environmental samples, is an emerging field that is rapidly advancing our understanding of how microbial communities function and evolve. This introductory talk will trace the roots of metagenomics, describe its current practice, and speculate on future developments in the field.

Susannah Tringe

14:00–14:30

6. Introduction to IMG

Nikos Kyrpides

14:30–14:45

Break

 

14:45–15:30

7. Sequence Clustering

The study of any novel object starts with establishing its relationships to other objects of a similar nature. Thus, annotating a protein starts with finding similarities to other proteins. Unprecedented progress in sequencing technology, and the resulting growth of sequence databases, makes the evaluation and analysis of all pairwise relationships impractical. Instead, sequences can be grouped into classes that share common properties. After such grouping, comparing a novel protein to the entire sequence database can be replaced with testing for membership in a far smaller set of clusters. The precise relationships can then be computed within a cluster. Hierarchical classifications reduce the search space even further. This talk discusses methods for efficient sequence clustering and applications of clustering in protein annotation and functional prediction, as well as in genome assembly and transcriptomic studies.

Denis Kaznadzey
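
To make the idea in the Sequence Clustering abstract concrete, here is a minimal sketch of representative-based clustering: each new sequence is compared only to one representative per cluster rather than to every sequence in the database. The k-mer Jaccard similarity, the 0.5 threshold, and the function names (`kmers`, `jaccard`, `greedy_cluster`, `assign`) are illustrative assumptions, not the methods presented in the talk.

```python
def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two k-mer sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(seqs, threshold=0.5, k=3):
    """Greedy clustering: the first member of each cluster is its representative.

    A sequence joins the first cluster whose representative it resembles
    at or above the threshold; otherwise it founds a new cluster.
    """
    clusters = []
    for seq in seqs:
        profile = kmers(seq, k)
        for members in clusters:
            if jaccard(profile, kmers(members[0], k)) >= threshold:
                members.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters


def assign(seq, clusters, k=3):
    """Index of the best-matching cluster, comparing to representatives only."""
    profile = kmers(seq, k)
    return max(
        range(len(clusters)),
        key=lambda i: jaccard(profile, kmers(clusters[i][0], k)),
    )


# Toy protein fragments (made up for illustration).
clusters = greedy_cluster(["MKTAYIAKQR", "MKTAYIAKQL", "GGSWPLLNDV"])
best = assign("MKTAYIAKQM", clusters)
```

With N sequences grouped into C clusters, `assign` performs C comparisons instead of N, which is the reduction in search space the abstract describes. Production tools use faster similarity filters and alignment-based scoring; this sketch only shows the control flow.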

 

15:30–16:15

8. Basic Bioinformatics Tools

An introduction to the concepts behind the most essential tools in computational biology and bioinformatics. These will include BLAST alignments, hidden Markov models, analysis using sequences, multiple sequence analysis, protein family classifications, and the basics of phylogenetics.

Amrita Pati

16:15–17:00

9. Data Sources - Annotation

Genome analysis and gene function prediction depend on the comparison of sequences to the existing information stored in databases. These databases can be either simple repositories of nucleotide or protein sequences, or they can contain curated information related to the function of the genetic elements. Used in combination, bioinformatics databases constitute the most powerful method for gene function prediction. In this presentation, databases commonly used for genome analysis will be discussed.

Kostas Mavrommatis

17:00–19:00

Poster Session, JGI tours and Dinner Reception

 

 

Tuesday

 

Microbial Genome Analysis &  IMG tutorial start

 

09:00–09:30

10. Submission to IMG/ER [Live Demo]

Marcel Huntemann

09:30–10:15

11. IMG-GOLD [Live Demo]

Genome Project selection and Metadata curation

Ioanna Pagani

10:15–11:15

12. IMG Genomes [Live Demo]

Microbial genome data analysis in IMG is set in the comparative context of multiple microbial genomes. IMG allows navigating the microbial genome data space along three key dimensions: genomes (organisms), functions (terms and pathways), and genes. In this section, IMG-based comparative analysis of gene families and genomes will be presented.  Tools that will be discussed include phylogenetic profiles and occurrences, homology-based and chromosomal context analysis, VISTA, abundance profiles, and genome clustering.

Nikos Kyrpides

11:15–12:00

Hands on Genomes [Exercises]

Users

12:00–13:00

Lunch

13:00–13:30

[continuation of exercises]

Users

13:30–14:00

13. Finding the genes in microbial genomes

Annotation of microbial genomes usually starts with finding the genes coding for stable RNAs (rRNA and tRNA) and the protein-coding genes (CDSs). The principles underlying gene prediction in microbial genomes will be discussed, along with different implementations of these algorithms and the most popular gene-finding tools.

Natalia Ivanova

14:00–14:30

14. Gene models Quality Control (Gene QC)

Accurate gene prediction is an indispensable step for correct subsequent genome analysis. All currently available tools for automatic gene finding have a 10-15% error rate. A methodology for gene model validation and manual curation will be presented.

Amrita Pati

14:30–15:30

15. IMG Genes [Live Demo]

Kostas Mavrommatis

15:30–17:00

Hands on Genes [Exercises]

Users

Wednesday

 

IMG Tutorial   (Genome annotation and analysis)

 

09:00–09:45

16. IMG Terms and Pathways

Description of the controlled vocabularies for the annotations in IMG (IMG Terms) and the curation of the IMG pathway database (IMG Pathways).

Natalia Ivanova

09:45–10:15

17. IMG Gene Context Analysis

Functional annotation is based on sequence similarity but can be facilitated by additional information provided by the gene context. The tools that exploit the gene context analysis will be presented.

Kostas Mavrommatis

10:15–11:15

18. IMG - Functions and Pathways [Live Demo]

IMG has several ways for users to interact with protein functions and pathways, including Clusters of Orthologous Groups (COGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. In addition, JGI is developing a controlled vocabulary for the representation of functions and pathways known as IMG Terms and Pathways. The use of the various Functional Groups and their Pathways and their importance in comparative genome analysis will be presented and discussed.

Iain Anderson

11:15–12:00

Hands on IMG Functions [Exercises]

Users

12:00–13:00

Lunch

13:00–13:30

[continuation of exercises]

Users

13:30–14:15

19. IMG-ER Curation Environment (MyIMG)

Natalia Ivanova

14:15–15:30

Hands on IMG-ER [Exercises]

Users

15:30–16:00

20. Expression Data in IMG

New tools for handling and exploring expression data (proteomics – transcriptomics) will be discussed.

 

 

Kostas Mavrommatis

16:00–16:30

21. A Genome Analysis test case

The methodology and steps to analyze a genome in IMG will be presented through a use case.

 

Kostas Mavrommatis

16:30–17:00

22. Analysis of Haloarchaeal Genomes

The application of IMG tools to the comparison of ten haloarchaeal genomes will be presented. The major tools covered will be the phylogenetic profiler, function profiles, and gene neighborhoods.

 

 

Iain Anderson

Thursday

 

IMG/M Tutorial (metagenome analysis)

 

09:00–09:30

23. Introduction to IMG/M

IMG/M  [Live Demo]

Natalia Ivanova

09:30–10:00

24. Pre-processing of metagenomic datasets

The rapid increase in metagenomic projects is leading to exponential growth of the sequence data, which in turn creates new challenges related to efficient data storage and analysis. This problem is expected to become more prominent as new sequencing technologies are adopted and large-scale sequencing projects are carried out. The methods used to process the data prior to their integration in IMG/MER will be presented. Methods developed in house that allow efficient compression and representation of the datasets without loss of sequence, contextual, and functional information will be presented as well.

Kostas Mavrommatis

10:00–10:30

25. Statistical analysis of metagenomic datasets

The systematic evaluation of the relative abundances of individual protein functions, as well as sets of protein functions, across various metagenomic datasets can yield statistically significant deductions about the over- and under-representation of protein functions and biological pathways in these communities. Statistical methods can be derived for comparing the relative abundances of both individual protein families and sets of protein families in two given metagenomic datasets. Statistical models of individual abundances, and methods for identifying protein families whose differences in abundance are statistically significant, will be presented.

Amrita Pati
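
As a rough illustration of comparing the relative abundance of one protein family between two metagenomic datasets, the sketch below applies a pooled two-proportion z-test. This is one simple, textbook choice made for illustration and is an assumption here, not necessarily the statistical model presented in the talk. The function name `two_proportion_z` and the example counts are hypothetical.

```python
import math


def two_proportion_z(count_a, total_a, count_b, total_b):
    """Pooled two-proportion z-test for one protein family.

    count_x: genes assigned to the family in dataset x
    total_x: all genes with a family assignment in dataset x
    Returns (z, two_sided_p) under a normal approximation.
    """
    p_a = count_a / total_a
    p_b = count_b / total_b
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        # Family absent from, or saturating, both datasets: no evidence of a difference.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 120 of 10,000 genes in dataset A versus 60 of 10,000 in dataset B.
z, p = two_proportion_z(120, 10_000, 60, 10_000)
```

When many families are tested at once, the resulting p-values need a multiple-testing correction, and the normal approximation is unreliable for families with very low counts.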

10:30–10:45

Break

10:45–11:45

26. Metagenome analysis in IMG/M  [Live Demo]

A snapshot of microbial community structure can be derived from analysis of metagenomic data. IMG/M methods and tools for establishing the taxonomic identity of community members will be presented along with tools for determining the fine population structure, genetic variation and genome dynamics of the dominant populations. Methods for assessing the diversity and abundance of microbial communities will be discussed.

Natalia Ivanova

11:45–14:00

Hands on IMG/M [Exercises]

Users

12:00–13:00

Lunch

14:00–14:30

27. Metatranscriptomics

Metatranscriptomics provides a snapshot of gene expression across the entire microbial community. Rapid technological advances in ultra-high-throughput sequencing are making sequencing-based transcriptomics (RNA-seq) a viable alternative to microarrays for microbial gene expression analyses. As the field is still in its infancy, some of the challenges and the progress made in metatranscriptomics will be discussed.

Shaomei He

14:30–15:00

28. Soil Metagenomics

Soil microbes are responsible for global cycling of carbon and nutrients, but most have never been isolated and their functions are not known. Sequencing of soil metagenomes has been particularly challenging due to the high complexity and diversity of soil microbial communities. Recently, JGI has launched ambitious soil metagenomics sequencing projects to tackle this challenge. Examples of ongoing soil metagenomics sequencing projects will be discussed.

Janet Jansson

15:00–15:30

29. A Metagenome analysis test case

The methodology and steps to analyze a metagenome in IMG/M-ER will be presented as a use case scenario.

Natalia Ivanova

15:30–15:45

Break

 

15:45–16:30

IMG Tutorial "postmortem"

 

Victor Markowitz

16:30–17:00

OPEN DISCUSSION - IMG future features

 

Friday

 

CAMERA, Greengenes & JGI Eukaryotic portal tutorials

 

09:00–11:00

32. CAMERA

CAMERA (http://camera.calit2.net/) stands for Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis. The aim of this project is to serve the needs of the microbial ecology research community by creating a rich, distinctive data repository and a bioinformatics tools resource that will address many of the unique challenges of metagenomic analysis.

Michael Chiu and Shulei Sun

11:00–11:15

Break

11:15–11:45

33. Accurate Estimation of Microbial Community Using Pyrotags

Pyrosequencing of small subunit ribosomal RNA amplicons (pyrotags) is rapidly gaining popularity as the method of choice for profiling microbial communities. It has revealed that the extent of rare microbial populations in several environments, the "rare biosphere", is orders of magnitude higher than previously thought. However, the large amount of data, and errors associated with the sequencing technology present significant analytical challenges. I will show how sequencing errors can potentially inflate diversity estimates. I will describe PyroTagger – a fast, scalable computational pipeline designed to ensure accurate estimates of microbial diversity.

Julien Tremblay

11:45–12:15

34. QIIME

Justin Kuczynski

12:15–13:00

Lunch

13:00–14:00

35. ARB [Live Demo]

ARB is a software package designed to allow the efficient analysis of ribosomal RNA sequences. It incorporates tools for database management, automatic and manual sequence alignment, phylogenetic tree calculation and the design of discriminatory oligonucleotides used as probes (e.g. for fluorescence in situ hybridization) and primers.

Christian Rinke  

14:00–15:00

36. Greengenes and PhyloChip

Greengenes (http://greengenes.lbl.gov/) is a web application assisting molecular ecologists with data analysis. Aligning 16S rRNA gene sequences, removing chimeras, and classifying the members of a microbial community against all of the five dominant bacterial and archaeal taxonomies will be covered. Two advanced methods will also be discussed: integration of PhyloChip community analysis with sequencing data and how to import your Greengenes pre-processed data into ARB for visualization. Participants may preview the online tutorial from the Greengenes website.

Todd DeSantis

15:00–15:15

Break

15:15–15:45

37. Annotation of Eukaryotic Genomes

Over 50 eukaryotic genomes from different taxonomic groups are annotated at JGI using the JGI Annotation Pipeline. The pipeline integrates several gene prediction, annotation, and analysis tools to annotate a diverse set of genomes in a high-throughput but genome-specific manner. To address gene prediction challenges in eukaryotes, which often display high repeat content, low gene density, and complex gene structure, we combine different gene predictors with available experimental data and comparative genomics analysis. The JGI Eukaryotic Portal provides web-based tools that enable user communities to carry out comprehensive genome analysis and manual curation of predicted genes and functions.

 

Igor Grigoriev

15:45–17:00

Mycocosm/ Fungal Portal Tutorial [Live Demo]

http://genome.jgi-psf.org/Tutorial

http://www.jgi.doe.gov/fungi

Andrea Aerts

 

End of Workshop

 

 

OnLine Tools

IMG

IMG/M

IMG-ER

IMG-EDU

Artemis

VISTA

Greengenes

ARB

GOLD

BLAST

ClustalX

COGs

EBI

Eukaryotic Portal

InterPro

KEGG

NCBI/GenBank

Pfam

PIR

Sequencher