Jan 5, 2017

Researcher Spotlight: Resolving short reads and distinguishing variants in PMS2

Shauna Clark

Dr. Charlly Kao, Center for Applied Genomics (CAG) at the Children’s Hospital of Philadelphia

Charlly Kao is a translational research scientist at the Center for Applied Genomics. He joined CAG in 2011, and his current focus is on translating discoveries at CAG into potential diagnostics and therapeutics, and on evaluating and developing technologies and partnerships that could facilitate the translation process. He completed his B.S. in Biochemistry and Biology at California State University, Los Angeles, received his Ph.D. from the University of Minnesota, Twin Cities, in Molecular, Cellular, Developmental Biology and Genetics with an emphasis in immunology, and did his post-doctoral training in immunology at the Wistar Institute and the University of Pennsylvania.

At ASHG this past year, Dr. Kao gave a great talk about using Linked-Read Technology in combination with standard, short-read sequencing to examine an area of the genome that was previously difficult to interrogate due to pseudogene interference. Recently, we got to talk to him a bit more about how Linked-Reads play a part in his current research, possible future applications, and why 10x technology is so exciting. Check out the full conversation below:

10x:  You have a Ph.D. in Molecular, Cellular, Developmental Biology and Genetics with an emphasis on immunology; what got you interested in pursuing this area of study?

CK:  I have been working at the Center for Applied Genomics as a translational scientist for the past 5 years, where one of my roles is to evaluate and implement new technologies that are relevant and potentially transformative for the fields of diagnostics and translational medicine. We believe that linked-read sequencing has the potential to be one of these transformative technologies because of the ease with which it can be appended to existing NGS workflows to add a new dimension to genomic and transcriptome sequencing.

10x:  With regard to your talk at ASHG this year, entitled "Combining linked read technology with standard target-enrichment NGS can accurately resolve short reads and distinguish variants in the Lynch/CMMRD syndrome gene, PMS2, from its pseudogene, PMS2CL": What challenges exist for analysis of PMS2 variants?

CK:  The PMS2 gene is currently difficult to characterize using NGS that relies on relatively short (~200-400 base) reads due to the presence of a pseudogene, PMS2CL, that is nearly identical in sequence to exon 9 and exons 11-15 of PMS2. Since the gene and pseudogene are each within a larger segmental duplication block located on the same chromosome (~700kb apart), there is also frequent recombination between the two regions that can generate "hybrid alleles", which prevents any common SNP or variant from being useful as a gene- or pseudogene-specific marker. So without any additional longer-range information, it is difficult to map any short read within the homologous region to its correct location in either the gene or the pseudogene. Current diagnostic screens require hybrid approaches involving multiple assays to definitively identify a variant in PMS2, which adds complexity and cost. This also significantly hampers the ability to include PMS2 within broader cancer screening workflows for both assessment of germline risk and determining pathogenic drivers of established tumors.

10x:  How did using 10x technology enable or facilitate your work?

CK:  Adding linked-read barcodes to NGS short reads provides an elegant means to attach long-range information that would otherwise be absent in traditional short reads. So we can now link a read within the gene/pseudogene regions to other reads that reside in non-ambiguous, unique regions farther away, thus enabling us to resolve whether that previously ambiguous read came from the gene or pseudogene. Notably, this technology also enables us to phase reads and variants, and to detect and resolve larger structural variants that would have been challenging for unlinked traditional NGS.
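
To make the barcode-linking idea above concrete, here is a minimal Python sketch of how an ambiguous read might be rescued. It is not the 10x software or Dr. Kao's analysis code: the function name `resolve_ambiguous_reads`, the record fields (`name`, `barcode`, `locus`), the toy barcodes, and the simple majority-vote rule are all assumptions made for illustration. It also assumes each barcode tags molecules from only one of the two loci, which real data will not always satisfy.

```python
from collections import Counter, defaultdict


def resolve_ambiguous_reads(reads):
    """Assign ambiguous reads to a locus using reads that share their barcode.

    `reads` is a list of dicts with hypothetical fields:
      name    - read identifier
      barcode - linked-read barcode attached to the read
      locus   - "PMS2" or "PMS2CL" if the read maps uniquely,
                None if it falls in the homologous (ambiguous) region
    Returns {read name: inferred locus or None} for the ambiguous reads only.
    """
    # Count, per barcode, how many uniquely mapped reads point to each locus.
    votes = defaultdict(Counter)
    for read in reads:
        if read["locus"] is not None:
            votes[read["barcode"]][read["locus"]] += 1

    resolved = {}
    for read in reads:
        if read["locus"] is not None:
            continue
        counts = votes.get(read["barcode"])
        if not counts:
            # No uniquely mapped read shares this barcode: stay unresolved.
            resolved[read["name"]] = None
            continue
        ranked = counts.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            # Equal evidence for both loci: do not guess.
            resolved[read["name"]] = None
        else:
            resolved[read["name"]] = ranked[0][0]
    return resolved


# Toy input; barcodes and read names are made up.
reads = [
    {"name": "r1", "barcode": "AAAC", "locus": "PMS2"},
    {"name": "r2", "barcode": "AAAC", "locus": None},
    {"name": "r3", "barcode": "GGTT", "locus": "PMS2CL"},
    {"name": "r4", "barcode": "GGTT", "locus": "PMS2CL"},
    {"name": "r5", "barcode": "GGTT", "locus": None},
    {"name": "r6", "barcode": "CCCA", "locus": None},
]

resolved = resolve_ambiguous_reads(reads)
print(resolved)  # {'r2': 'PMS2', 'r5': 'PMS2CL', 'r6': None}
```

In this toy input, r2 and r5 inherit the locus of the uniquely mapped reads that carry the same barcode, while r6 stays unresolved because nothing else shares its barcode. Leaving ties and orphan barcodes unassigned is a deliberately conservative choice for the sketch.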

10x:  What specific potential applications might this research hold?

CK:  Many other genes besides PMS2 have highly homologous pseudogenes or paralogs that are mostly inaccessible to traditional NGS approaches, and a number of these "dark matter" genes are involved in Mendelian diseases. We envision the current PMS2 project serving as a model to optimize the platform and demonstrate proof-of-concept for applying linked-read sequencing more broadly to other "dark matter" clinical genes, as well as in the context of whole-exome/whole-genome sequencing for unresolved cases of suspected genetic disorders. This platform also provides a more robust means for detecting and resolving structural variations, especially translocations and inversions, on a genome-wide level that would otherwise be impractical, expensive, or difficult (if not impossible) to do with any other current methodology.

10x:  What do you personally find most surprising or exciting or important about your work?

CK:  The most exciting part of this process is being able to see where the work can be directly applied in a way that can have a positive impact on diagnosis and treatment decisions in human health and disease. Overall, it is remarkable how rapidly the field of genomics is evolving, and being personally involved and contributing to this effort on a daily basis fulfills the vision that first led me to pursue a career in biomedical sciences.

10x:  Are there any other notable projects that you are currently working on or plan to start in the future that you think would benefit from 10x technology?

CK:  The single-cell transcriptome applications represent the second "arm" of the Chromium instrument, which we believe also holds tremendous promise. The platform offers a significant increase in throughput compared to most other single-cell platforms, and apparently also has a simpler and faster workflow. There are a number of applications where single-cell resolution of the transcriptome could be transformative, and one area that is particularly exciting is the prospect of applying this to biomarker discovery in the context of immunotherapy for cancer and autoimmunity.

10x:  Are there any other topics or projects that might be of interest?

CK:  We think that the ability to integrate Linked-Read sequencing within an automated end-to-end workflow is another exciting area, one that we are also in the process of testing and implementing. Longer term, we envision that there is a path to using Linked-Read panels (and possibly whole exomes) as Laboratory Developed Tests. The next challenges in getting to that point are the bioinformatics infrastructure to systematically process the Linked-Read output in a more "clinic-ready" format, as well as validation studies to confirm that adding Linked-Reads provides the necessary level of sensitivity/specificity in a clinical setting.

Watch the full ASHG presentation, "Resolving Short Reads and Distinguishing Variants in PMS2", available as part of our Scientific Seminars library.