Mark Yandell
Professor of Human Genetics and Adjunct Associate Professor of Biomedical Informatics
Bioinformatics and Comparative Genomics
Molecular Biology Program
Education
B.S. University of Texas, Austin
Ph.D. University of Colorado, Boulder
Research
Sequenced genomes contain a treasure trove of information about how genes function and evolve. Getting at this information, however, is challenging and requires novel approaches that combine computer science and experimental molecular biology. My lab works at the intersection of both domains, and research in our group can be summarized as follows: generate hypotheses concerning gene function and evolution by computational means, and then test these hypotheses at the bench. This is easier said than done, as serious barriers still exist to using sequenced genomes and their annotations as starting points for experimental work. Some of these barriers lie in the computational domain, others in the experimental. Though challenging, overcoming these barriers offers exciting training opportunities in both computer science and molecular genetics, especially for those seeking a future at the intersection of both fields. Ongoing projects in the lab are centered on genome annotation and comparative genomics. New areas of inquiry include high-throughput biological image analysis, and exploring the relationships between sequence variation and human disease.
Genome Annotation
One of the great ironies of the DNA sequencing revolution is that genome annotation, not genome sequencing, has become the bottleneck in genomics today. New genomes are being sequenced at a far faster rate than they are being annotated. As of 2007, there are nearly 700 eukaryotic genomes in the sequencing pipeline. Many of these genomes are associated with relatively small research communities who are finding themselves left in the lurch when it comes to annotating their genomes.
Over the past year my lab has been working on an easy-to-use genome annotation pipeline called MAKER. Our goal is to give research communities without extensive bioinformatics expertise the means to independently annotate their genomes and to distribute the results to the larger biomedical community. For proof of principle, we have collaborated with the S. mediterranea genome project led by Prof. Alejandro Sánchez Alvarado, Dept. of Neurobiology & Anatomy, University of Utah School of Medicine. To date, our successful annotation of this genome has produced three papers: one describing MAKER, one describing the genome database that we constructed from MAKER's outputs, and one describing our analyses of the S. mediterranea genome and its contents. The first two papers are now in press at Genome Research and Nucleic Acids Research, respectively; the third is under review at Science. Going forward, we plan to use the S. mediterranea genome annotations for functional genomics screens. This work will provide many opportunities for research with both computational and experimental components.
High-Throughput Biological Image Analysis
The production and analysis of large numbers of digital images is an emerging field of bioinformatics. High-throughput imaging screens typically involve placing living cells or embryos in 96-well plates, and then adding a different RNAi construct or small molecule to each well. An automated microscope is then used to capture the results as digital images. These screens combine computation, genomics and molecular biology in new ways: genome annotations are used to design RNAi constructs; cell lines and embryos expressing various fluorescent markers must be constructed; and software must be written to process the results. My lab is currently engaged in active collaborations with other groups on campus working in this area, as there is a pressing need to develop image-processing pipelines to analyze the data these screens produce.
In 2006, I helped to organize an R21 large-equipment grant application to purchase an automated confocal microscope for high-throughput image-based screens. The application was successful, and the university has now acquired a BD Pathway Bioimager. This instrument will provide a basic resource for university researchers carrying out high-throughput image-based screens.
In a continuation of my collaboration with the S. mediterranea genome project, Prof. Sánchez Alvarado and I are using the S. mediterranea genome annotations for a genome-wide, image-based RNAi screen for genes involved in cellular regeneration and wound healing. The Bioimager is essential equipment for this work. Our results to date demonstrate that S. mediterranea is an ideal organism for high-throughput image-based screening, in part because it is literally a flatworm. This fact allows us to circumvent some of the technological problems that limit the scope and power of image-based screens of (not so flat) D. melanogaster and C. elegans.
Sequence Variation and Human Disease
The Utah Population Database (UTPD) and the associated phenotype and clinical data collected through the Utah Genetic Reference Project (UGRP) offer unique resources for human genomics research. Tying the clinical and phenotypic data contained within these databases to the genome and genome annotations, however, is a challenging task. My lab is interested in characterizing large-scale trends in the UTPD and UGRP data, both with respect to sequence variation and demographics; developing methods to identify cohorts for clinical studies; and developing diagnostic devices for personalized medicine.