Applied genomics for tuberculosis diagnosis and surveillance

  1. Goig Serrano, Galo Adrián
Supervised by:
  1. Iñaki Comas Espadas Supervisor

Defense university: Universitat de València

Defense date: 24 July 2020

Committee:
  1. Fernando González Candelas Chair
  2. Laura Gómez Valero Secretary
  3. Juan Antonio Gabaldón Estevan Member

Type: Thesis

Teseo: 631006 DIALNET

Abstract

Tuberculosis (TB) remains one of the main causes of death worldwide. In recent years, whole-genome sequencing (WGS) of Mycobacterium tuberculosis (MTB), its causative agent, has become an invaluable tool in the study, diagnosis and surveillance of the disease. Analyzing the complete genome of the bacterium allows accurate prediction of drug resistance, detection of transmission between patients and the study of outbreaks with unprecedented resolution, and provides new insights into the evolution and genetic diversity of this deadly pathogen. However, its effective use as a diagnostic tool has been hampered by its dependence on the long and cumbersome process of culturing MTB bacteria to obtain enough biomass for DNA extraction. For this reason, WGS of MTB directly from clinical specimens (dWGS) is expected to represent a major breakthrough in TB diagnosis and control. The aim of this thesis is to apply the cutting-edge genomic techniques used in the research laboratory to develop novel tools for the diagnosis and surveillance of TB, including a workflow for performing dWGS of MTB. In this thesis, we performed a large-scale comparative analysis to identify genetic markers that are completely specific to MTB bacteria. We found that markers used to date in TB tests, even in those endorsed by the World Health Organization, are non-specific. We provide a comprehensive catalog of MTB-specific markers for developing novel molecular assays for TB, and developed a highly specific qPCR assay that accurately quantifies MTB DNA in complex samples such as clinical specimens. We also developed a solid framework for the computational analysis of MTB WGS data, with special emphasis on data obtained directly from clinical specimens. Notably, we found that contaminant DNA is a major pitfall in WGS studies, introducing many errors that greatly bias variant analysis.
We implemented and validated a methodology to remove such contamination from sequencing data and demonstrated that it is pivotal when performing dWGS of bacterial organisms. Finally, we implemented a complete workflow to perform dWGS of MTB. We were able to sequence and analyze the complete genomes of the infecting MTB bacteria directly from respiratory samples of TB patients in less than a week. With our analysis, we were able not only to detect MTB in the samples, but also to provide a full report of antibiotic resistance and transmission between patients. We incorporated our data into the TB transmission network of the Comunidad Valenciana and showed, for the first time, the use of dWGS for high-resolution genomic epidemiology of TB.