Chlamydomonas reinhardtii genome project

Estimated genome size: 80-100 MB

Collaborators: Paul Lefebvre and Carolyn Silflow (U. Minnesota), Arthur Grossman (Carnegie Institution, Stanford CA), Libby Harris (Duke), David Stern (Cornell), and the

1. The importance of the genome to biomedical and biological research

The importance of the Chlamydomonas genome for biomedical and biological research rests on the unique experimental advantages of this model organism. Chlamydomonas is the only photosynthetic unicellular eukaryote to be widely used for genetic studies of photosynthesis and carbon assimilation. The power of a model system with haploid genetics and a six-hour generation time is evident, for example, in the critical role that the haploid yeast Saccharomyces cerevisiae has played in our understanding of basic cellular functions, including cell cycle control and signal transduction. Chlamydomonas fulfills a similar role in plant and algal functional genomics. The elucidation of the sequence of the nuclear genome will magnify the power of this system many fold.

Because it is a facultative photoheterotroph that can grow either by photosynthesis or by assimilating a fixed carbon source, Chlamydomonas is the only genetically tractable eukaryote in which mutations in carbon fixation and photosynthesis are conditional rather than lethal. Among the original contributions of Chlamydomonas to date are: the first demonstration of recombination in chloroplast genomes; the first mutations affecting the specificity of the chloroplast gene encoding ribulose-1,5-bisphosphate carboxylase (the key enzyme in carbon fixation and the most abundant protein in the biosphere); the first demonstration of transformation of a chloroplast genome using a cloned gene; and the first identification of a high-affinity nitrate transporter in a photosynthetic eukaryote.

Contributions from Chlamydomonas research extend far beyond photosynthesis and metabolism.
These include: the discovery of functions for two new members of the tubulin gene family (delta and eta tubulin), the first identification of a retinoblastoma homologue in a unicellular organism, and the elucidation of the function of the gene affected by the major inherited form of polycystic kidney (PCK) disease in humans.

2. The importance of the genome to DOE mission and stated goals

In the near term, the elucidation of the sequence of the Chlamydomonas genome will greatly accelerate research on the molecular basis of photosynthesis and carbon assimilation, leading to a greater understanding of biomass and biofuels production. In the longer term, this project will enhance prospects for the use of genetically engineered Chlamydomonas strains for multiple projects of interest to the DOE, including the production of commercially useful quantities of hydrogen and the detoxification of soils by removing heavy metals from contaminated sites.

3. The size and interest of the research community who will use the proposed genome sequence

The Chlamydomonas community is a large, closely knit group of researchers from many countries. As one measure of the size of the community, the latest of the biennial International Chlamydomonas Meetings attracted more than 220 researchers, representing slightly more than half of the approximately 200 laboratories using this organism as a primary research focus. As a measure of research productivity, a search of the PubMed database at the National Center for Biotechnology Information (NCBI) returns 2,861 publications containing the word "Chlamydomonas," compared with 2,490 for zebrafish and 6,864 for C. elegans, two other popular model systems. There are more than 40 current NIH grants for Chlamydomonas research; the NSF database has 193 grant listings since 1989.
Many other Chlamydomonas projects are funded by the USDA competitive grants program, by the DOE external grants program, and by the Department of Defense.

The community of Chlamydomonas researchers has a long history of close communication and collaboration. Dr. Elizabeth Harris directs the NSF-funded Chlamydomonas Genetics Center at Duke University, which maintains and distributes wild-type and mutant strains (currently numbering more than 3000), and maintains the Chlamydomonas web site and ChlamyDB, an AceDB-based searchable database with up-to-date information on all aspects of Chlamydomonas research. She has recently mounted a comprehensive web site for the Chlamydomonas Genome Project. The Genetics Center also delivers plasmids, lambda and BAC clones, and cDNA and genomic libraries to researchers worldwide for a nominal fee. The most recent group effort in this field is the "Chlamydomonas Genome Project" described below, which was funded by the NSF ($3.3 million over 3 years) beginning in 2000.

4. Resources available to complement the sequence

A full array of research resources and reagents is freely available to all in the Chlamydomonas community. In addition to the cDNA and genomic DNA libraries available through the Genetics Center, researchers have for decades distributed libraries, strains and reagents (such as antibodies) in the culture of cooperation that has always characterized research in this model organism. A large-insert BAC library, consisting of 15,000 clones arrayed on filters (8X coverage of the genome), is freely available to all. This library is the centerpiece of one aim of the Genome Project: to place overlapping BAC clone contigs on every part of the extensive genetic map. As of 12/01, more than 40% of the genetic map has been covered by ordered BAC contigs at the University of Minnesota (Lefebvre and Silflow laboratories).
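The coverage figure quoted for the BAC library can be related to the other numbers in this document with a short calculation. The sketch below is illustrative only: it assumes that coverage is defined as (number of clones x mean insert size) / genome size and that clones fall uniformly at random across the genome (a Poisson approximation that is not stated in the text). The function names are hypothetical, and the genome sizes are the 80-100 MB estimates given at the top of the document.

```python
import math


def implied_mean_insert_bp(n_clones, coverage, genome_bp):
    """Mean insert size implied by coverage = n_clones * insert / genome."""
    return coverage * genome_bp / n_clones


def prob_locus_covered(coverage):
    """Probability that a given locus lies in at least one clone.

    Assumes clones are placed uniformly at random (Poisson approximation);
    this is a modelling assumption, not a claim made in the text.
    """
    return 1.0 - math.exp(-coverage)


N_CLONES = 15_000   # from the text
COVERAGE = 8.0      # "8X coverage", from the text

for genome_mb in (80, 100):  # estimated genome size range from the text
    insert_bp = implied_mean_insert_bp(N_CLONES, COVERAGE, genome_mb * 1_000_000)
    print(f"{genome_mb} MB genome: implied mean insert ~{insert_bp / 1000:.0f} kb")

print(f"P(locus in at least one clone) ~ {prob_locus_covered(COVERAGE):.4f}")
```

Under these assumptions, essentially every locus should be represented in the library, which is consistent with the aim of placing BAC contigs on every part of the genetic map.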
The ordered library, combined with the ease of transformation of this system, makes positional cloning the method of choice for obtaining any gene of interest identified by mutation.

Another part of the Genome Project, based at the Carnegie Institution at Stanford University (Arthur Grossman, P.I.), is conducting large-scale EST sequencing, with the goal of sequencing full-length cDNAs for the great majority of expressed genes. The cDNA sequences are being obtained from libraries produced under many conditions of growth and nutrient stress, to identify transcripts from genes expressed under different conditions. To date, more than 65,000 EST sequences are available from this project and a parallel effort at the Kazusa Institute in Japan. The Grossman group is preparing high-density microarrays of cDNAs to be distributed to the community.

The complete sequences of both the chloroplast and mitochondrial genomes are available, and David Stern’s group at Cornell, as another part of the Genome Project, is producing a library of strains in which each expressed gene in the chloroplast genome has been knocked out by homologous gene disruption.

5. Other funding support possible for the Genome Sequencing Project

The NSF-funded Genome Grant does not propose genome sequencing, but the efforts currently funded by that grant will be of great assistance to the genome sequencing effort. For example, more than 280 molecular markers have been placed on the Chlamydomonas genetic map, and these have been correlated with their corresponding BAC contigs. This scaffold will be useful in the final stages of genome sequence assembly, to unite shotgun sequence contigs that cannot be merged based on sequence alone. The EST sequences will also be of great use in identifying expressed genes and establishing the intron/exon structure of genes suggested by the genome sequence. We have recently been given informal approval by the NIGMS to submit a $1 million request (for the Feb.
1 deadline) for direct support of the genome sequencing effort. Formal approval should follow the submission of a required letter of intent.

6. Indications of how the genome sequence will be used (next steps)

The genome sequence will greatly accelerate research in many areas. For example, combining the end sequences of the clones in the BAC library with the genomic sequence will rapidly yield a complete tiling path of BAC clones, anchored to the genetic map, that covers the entire genome with deep coverage (i.e., multiple clone breakpoints). The density of the molecular/genetic map is such that each point on the genome is within, on average, 2 centimorgans of a molecular marker. In Chlamydomonas, one centimorgan corresponds, on average, to 100 kb. As a result, the genome sequence combined with the BAC mapping project will make it possible to clone the gene affected by any mapped mutation within a few weeks. More than 100 mutants affecting different aspects of chloroplast assembly and photosynthesis are mapped, as well as more than 100 mutants affecting flagellar assembly and motility.

The genome sequence, combined with the ease of biochemical isolation of organelles in Chlamydomonas, will provide a valuable entry to proteomic approaches. Fingerprinting of complex peptide mixtures using mass spectrometry, coupled with a complete genome sequence, will allow Chlamydomonas researchers to identify each of the protein components of complex cellular structures. This powerful approach has made it possible to identify all of the protein components of the yeast spindle pole body, based on the complete genome sequence.

The first algal genome sequence will also be useful in evolutionary studies by extending the range of taxa available for comparing the genomes of photosynthetic organisms.
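The map-density argument in section 6 rests on simple arithmetic, which can be made explicit. The sketch below is a rough illustration only: it uses the average figures quoted in the text (2 centimorgans to the nearest marker, 100 kb per centimorgan, an 80-100 MB genome) and treats the centimorgan-to-kb ratio as uniform across the genome, which is a simplifying assumption. The function names are hypothetical.

```python
KB_PER_CM = 100          # average physical distance per centimorgan, from the text
MEAN_CM_TO_MARKER = 2    # average genetic distance to the nearest marker, from the text


def cm_to_kb(cm, kb_per_cm=KB_PER_CM):
    """Convert a genetic distance (cM) to an approximate physical distance (kb).

    Assumes a uniform recombination rate, which real genomes do not have.
    """
    return cm * kb_per_cm


def implied_map_length_cm(genome_mb, kb_per_cm=KB_PER_CM):
    """Total genetic map length implied by a genome size and an average kb/cM."""
    return genome_mb * 1000 / kb_per_cm


# Average physical distance from any point to its nearest molecular marker.
print(f"Mean distance to a marker: ~{cm_to_kb(MEAN_CM_TO_MARKER)} kb")

# Genetic map length implied by the estimated genome size range.
for genome_mb in (80, 100):
    print(f"{genome_mb} MB genome: ~{implied_map_length_cm(genome_mb):.0f} cM")
```

On these averages, a mapped mutation lies roughly 200 kb from a marker, a distance that a small number of overlapping BAC clones could span.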