Bioinformatics tools for metagenomic analysis of bacterial communities

    Metagenomics, the sequence analysis of DNA extracted directly from the environment, bypasses strain isolation and cultivation, and with them the limitations of conventional analyses that rely on these steps. It is a relatively recent field of study, enabled by the development of high-throughput sequencing techniques, and its data require extensive computational analysis to be usable. Software for this purpose is therefore still in need of development, which is what the present project aimed to address. To this end, we created two new methods, PaSiT and MAGISTA. PaSiT is designed to compute inter-genome distances efficiently; these distances can be used to determine the taxonomy of genomes recovered through metagenome analysis without requiring extensive computational infrastructure. MAGISTA is a machine-learning approach designed to provide an alternative to marker-gene-based approaches for estimating the quality of these putative genomes. In addition to developing new tools, we evaluated the quality of existing sequencing technologies, and of the tools that analyse their output, using a pre-defined mix of 227 bacterial strains, the most complex DNA mock created so far. The sequencing platforms considered were those produced by Illumina, Oxford Nanopore Technologies, and Pacific Biosciences. We concluded that, overall, Oxford Nanopore Technologies provided the best value for metagenomics, although the other technologies had their own use cases.
    Original language: English
    Qualification: Master of Science
    Awarding Institution
    • Universiteit Gent
    • Vandamme, Peter, Supervisor, External person
    • Van Houdt, Rob, SCK CEN Mentor
    Date of Award: 9 Nov 2021
    State: Published - 9 Nov 2021
