KEYNOTE PRESENTATIONS
|
Nancy J. Cox
Professor and Section Chief,
Section of Genetic Medicine, Department of Medicine
and Dept. of Human Genetics, U. of Chicago, IL,
U.S.A.
|
New Approaches to Understanding the Genetic Component to Common Human Disease
|
Although genome-wide association studies (GWAS) have enabled us to identify many new loci with highly significant and reproducible associations to common diseases and related quantitative traits, these discoveries have not yet given us much new understanding of the biology underlying disease, nor enabled us to develop accurate predictive risk models. In this talk I will describe a new approach to characterizing the genetic component to common diseases with complex inheritance that promises both a more comprehensive understanding of the biological basis of disease and practical utility for predicting risk. Examples of the application of this approach to data on such disparate complex traits as bipolar disorder, schizophrenia, type 2 diabetes and autism illustrate its value and demonstrate that it can be applied equally well to data generated through array genotyping or next-generation sequencing.
|
Trey Ideker
Division Chief of Genetics
Professor, Depts. of Medicine and Bioengineering
UC San Diego, CA, U.S.A.
|
Turning Protein Networks into Ontologies
|
Ontologies have been very useful for capturing knowledge as a
hierarchy of concepts and their interrelationships. In biology, a
prime challenge has been to develop ontologies of gene function given
only partial biological knowledge and inconsistency in how this
knowledge is curated by experts. I will present a method by which
large networks of gene and protein interaction, as are being mapped
systematically for many species, can be transformed to assemble an
ontology with equivalent coverage and power to the manually-curated
Gene Ontology (GO). The network-extracted ontology contains 4,123
biological concepts and 5,766 relations, capturing the majority of
known cellular components as well as many additional concepts,
triggering subsequent updates to GO. Using genetic interaction
profiling we provide further support for novel concepts related to
protein trafficking, including a link between Nnf2 and YEL043W. This
work enables a shift from using ontologies to evaluate data to using
data to construct and evaluate ontologies.
|
Takashi Gojobori
Vice-Director, National Institute of Genetics (NIG)
Professor, Center for Information Biology and DNA Data Bank of Japan
(DDBJ), NIG, Mishima, Japan
|
Big Data needs Good Tools: Translational Bioinformatics in Cell Innovation Project
|
Next-generation sequencing (NGS) technologies are rapidly changing the paradigm of genomic science. First, a huge amount of nucleotide sequence data now comes from medical institutions such as university medical schools and even city hospitals, rather than from basic science laboratories. Second, obtaining appropriate DNA or RNA samples in a timely manner under a given condition has become more crucial than acquiring expensive sequencing machines, because sequencing itself is no longer a limiting factor in genomic research in terms of either time or cost. Third, as almost all targets become sequence-based, we have to deal not only with SNPs but also with other types of variation such as CNVs, indels, and other DNA rearrangements. This may compel us to change the present course of GWAS, for example. Fourth and finally, the development of powerful and accurate bioinformatics tools, as well as the construction of appropriate databases, has become essential for analyzing so-called Big Data to produce significant outcomes. This paradigm change deserves particular emphasis when we focus on translational medical research. In Japan, we conduct research and development of NGS-based bioinformatics tools under the name of the Cell Innovation Project, in collaboration with RIKEN. I will present the current progress of this project, with special reference to translational bioinformatics.
|
Maricel Kann
Assistant Professor,
Depts. of Biological Sciences and Computational Sciences and
Engineering
University of Maryland Baltimore County, MD, U.S.A.
|
A Protein-Domain Approach for the Analysis of Disease Mutations
|
Identifying the functional context for key molecular disruptions in complex diseases is a major goal of modern medicine that will lead to earlier diagnosis and more effective personalized therapies. Most available resources for visualization and analysis of disease mutations center on gene analysis and do not leverage information about the functional context of the mutation. In addition, these gene-centric approaches are confounded by the fact that gene products (proteins) may share some functional sub-units or protein domains but not others. I will describe a resource for domain mapping of disease mutations, DMDM, a protein domain database developed by our group in which each disease mutation is aggregated and displayed by its protein domain location. We have also developed a methodology using domain significance scores (DS-Scores) to detect statistically significant disease mutation clusters at the protein domain level. When we applied the DS-Scores to human data, we identified domain hotspots in oncogenes and tumor suppressors, as well as in genes associated with Mendelian diseases. In addition, I will describe recent work on analyzing cancer somatic mutations from individual cancer patient genomes. We found that incorporating information about the classification of proteins and protein sites leads to new hypotheses regarding the role of tumor somatic mutations in cancer. Our analysis confirms that the domain-centric approach creates a framework for leveraging structural genomics and evolution in the analysis of disease mutations.
|
Jason Moore
Professor of Genetics, Professor of
Community and Family Medicine
Director of the Institute for Quantitative Biomedical Sciences
Director of the Graduate Program in Quantitative Biomedical Sciences
Associate Director for Bioinformatics, Norris-Cotton Cancer Center
Editor-in-Chief, BioData Mining
|
Computational Intelligence Strategies for Embracing the Complexity of Genetic Architecture
|
Given infinite time, humans would progress through modeling complex data in a manner that depends on prior knowledge of their domain, computer science and statistics, as well as their prior experience working with other data. For example, a human modeler interested in identifying genetic risk factors for type II diabetes might start by examining insulin metabolism genes. We will review extensions and enhancements to an artificial intelligence-based computational evolution system (CES) that has the ultimate objective of tinkering with data as a human would. The key to CES is the ability to identify and exploit expert knowledge from biological databases or prior analytical results. Our prior studies have demonstrated that CES is capable of efficiently navigating large and rugged fitness landscapes toward the discovery of biologically meaningful genetic models of disease predisposition.
|
Jessica Tenenbaum
Associate Director for Bioinformatics
Duke Translational Medicine Institute Biomedical Informatics Core
Duke University, NC, U.S.A.
|
Informatics to enable precision medicine: achievements, obstacles and opportunities
|
The field of translational bioinformatics (TBI) is at an exciting stage of progression. The past 5-10 years have seen the establishment of TBI as a widely recognized discipline unto itself, and the launch of a number of large-scale initiatives that TBI has enabled. A recent report from the National Academies describes how the recent explosion of molecular data, coupled with clinical data on actual patients, holds the potential to define an entirely new taxonomy of disease. In this new taxonomy, disease would be classified not solely by macroscopic symptoms, many of which have been observed for centuries, but rather by underlying molecular and environmental causes. This paradigm shift, enabled by novel methods for the generation, storage, analysis, and visualization of "big data" in biology and medicine, promises to do nothing short of rewriting the textbook of medicine. It will change the way we approach biomedical research and practice across the spectrum of scale, from molecules to populations.
As technology continues to advance, assay costs decrease, and methods are further refined, the next decade is likely to feature increasingly pervasive examples of applied translational bioinformatics, both in healthcare and in other areas of day-to-day life. In this talk I will highlight success stories and outstanding achievements in, or enabled by, translational bioinformatics. I will describe some important caveats and obstacles we face in this rapidly advancing field, as well as some ideas on how to address those hurdles. Finally, I will explore some of the tremendous opportunities we face in the years ahead.
|
Olga Troyanskaya
Associate Professor, Lewis-Sigler
Institute for Integrative Genomics
and Department of Computer Science,
Princeton University, Princeton, NJ, U.S.A.
|
Understanding complex human disease through cell-lineage specific networks
|
The ongoing explosion of new technologies in functional genomics
offers the promise of understanding gene function, interactions, and
regulation at the systems level. This should enable us to develop
comprehensive descriptions of genetic systems of cellular controls,
including those whose malfunctioning becomes the basis of genetic
disorders, such as cancer, and others whose failure might produce
developmental defects in model systems. However, the complexity and
scale of human molecular biology make it difficult to integrate this
body of data, understand it on a systems level, and apply it to the
study of specific pathways or genetic disorders. These challenges are
further exacerbated by the biological complexity of metazoans,
including diverse biological processes, individual tissue types and
cell lineages, and by the increasingly large scale of data in higher
organisms. I will describe how we address these challenges through the
development of bioinformatics frameworks for the study of gene
function and regulation in complex biological systems and through
close coupling of these methods with experiments, thereby contributing
to the understanding of human disease. I will specifically discuss how
integrated analysis of functional genomics data can be leveraged to
study cell-lineage specific gene expression, to identify proteins
involved in disease in a way complementary to quantitative genetics
approaches, and to direct both large-scale and traditional biological
experiments.
|
Naomichi Matsumoto
Professor, Dept. of Human Genetics,
Yokohama City University Graduate School of Medicine, Yokohama, Japan
|
Exome sequencing in Mendelian disorders
|
Disease-related genome analysis (DGA) has grown more sophisticated alongside technological advances. The advent and frequent updating of next-generation sequencers (NGSs) have brought the accuracy needed for mutation analysis and pushed DGA into a new stage. We now use the Illumina Genome Analyzer (GA) IIx and HiSeq 2000, which can produce as much as 60 Gb and 600 Gb of sequence in one run, respectively. To focus on genes, we use exon capture methods such as SureSelect (Agilent). The current NGS protocol uses 100-108-bp paired-end reads and usually produces 8-9 Gb of sequence per sample, which is enough for analysis of the whole exome: 90% of exome bait regions are covered by 8-10 reads or more. Sequences are aligned using MAQ, BWA, Novoalign and the commercial NextGENe software, all of which are able to extract nucleotide changes and small insertions/deletions. The most critical step is the prioritization scheme for selecting variants. We have been successful in identifying causative mutations in several Mendelian diseases. I will present the procedures used in our projects, including Coffin-Siris syndrome and others.
|
Steven E. Brenner
Professor, Depts. of Plant and
Microbial Biology and Molecular and Cell Biology
Affiliated Associate Professor, Dept. of Bioengineering
UC Berkeley.
|
Ultraconserved nonsense:
gene regulation by alternative splicing & RNA surveillance
|
Nonsense-mediated mRNA decay (NMD) is a cellular RNA surveillance system
that recognizes transcripts with premature termination codons and degrades
them. Using RNA-Seq, we discovered large numbers of natural alternative
splice forms that appear to be targets for NMD. This coupling of
alternative splicing and RNA surveillance can be used as a means of gene
regulation. We found that all conserved members of the human SR family of
splice regulators have an "unproductive" alternative mRNA isoform targeted
for degradation by NMD. Preliminary data suggest that this is used for
creating a network of auto- and cross-regulation of splice factors.
Strikingly, the splice pattern for each SR protein is shared with mouse,
and each alternative splice is associated with an ultraconserved or
highly-conserved region of ~100 or more nucleotides of perfect identity
between human and mouse--amongst the most conserved regions in these
genomes. Further, we recently discovered that the most ancient known
alternative splicing event is in this family and creates an alternate
transcript to be degraded by NMD. Despite conservation since the
pre-Cambrian, when the genes duplicate they change their regulation, so
that nearly every human SR gene has its own distinctive sequences for
unproductive splicing. As a result, this elaborate mode of gene
regulation has ancient origins and can involve exceptionally conserved
sequences, yet after gene duplication it evolves swiftly and often.
|
Yi-Xue Li
Professor,
Chairman of Department of Bioinformatics and Biostatistics, College of
Life Science and Biotechnology, Shanghai Jiao Tong University, Director
of Shanghai Center for Bioinformation Technology, Vice Director of Key
Laboratory of Systems Biology at the Shanghai Institutes for Biological
Science, Chinese Academy of Sciences, and director in the Shanghai
Society for Bioinformatics.
|
Dynamic conservation of gene co-expression and oncogene deciphering
Liyun Yuan, Guohui Ding, Y. Eugene Chen, Zhe Chen, Yixue Li
|
Gene expression profiling from patients provides much biological information for oncogene deciphering. Traditional methods, such as Student's t-test and clustering methods, identify differentially expressed genes through comparison of adjacent disease stages. These methods take no account of the time-conservative features and cooperative properties of gene signatures across all disease stages, and may yield a high false-positive rate in finding disease-related genes. Some newer methods, such as multiclass ordinal analyses, were developed to identify genes involved in cancer development by extracting gene signatures whose expression consistently increases or decreases, taking global changes of gene expression into account. Because gene expression profiling data are complicated and heterogeneous, mining disease-related genes remains a challenge.
In fact, applying different methods to the same gene expression data rarely gives consistent results. In light of this, we developed an algorithm that handles gene expression data with regard to their time-series conservation properties. In our method, any specifically expressed gene can be ranked with a time-conservative score to evaluate its importance in cancer progression and development. Compared with current methods, our algorithm can effectively and accurately identify functional gene sets by evaluating the global conservative properties of gene expression signatures. Using our approach, a total of 480 genes in 29 clusters were obtained, of which only 8 percent were identified by other studies. In a case study, 2 clusters were randomly selected and 9 genes were carefully annotated. All of those genes showed a strong functional link with carcinoma occurrence, and they themselves form a small gene regulatory network mediated by P53, c-Myc, Sp1, IRF1 and others.
Thus, to some extent, our methodology based on evolutionary conservation analysis compensates for the inherent weaknesses of current statistical methods and provides a new way of analyzing dynamic gene expression profiles.
|