The genetics of ancient pathogens
Title: The genetics of ancient pathogens
DNr: Berzelius-2023-231
Project Type: LiU Berzelius
Principal Investigator: Mario Vicente <mario.vicente@su.se>
Affiliation: Stockholms universitet
Duration: 2023-10-06 – 2024-05-01
Classification: 60103
Homepage: http://palaeogenetics.com
Keywords:

Abstract

In this metagenomic project we aim to analyze DNA and RNA sequences extracted from ancient specimens to detect the presence of bacteria, eukaryotes, and viruses. Our goal is to better understand patterns of change in the human microbiome, past epidemics, and diet, and to link our results to archaeological observations. We have developed the aMeta workflow, tailored to ancient metagenomics, in collaboration with SciLifeLab Bioinformatics Long-term Support (WABI). We are now applying it to DNA sequences extracted from many ancient specimens, ranging from Mesolithic to medieval Scandinavia, Iberia after the Umayyad expansion, and Central Anatolia. The main idea is to assign DNA sequences to a taxonomic level. To do this, we compare ancient DNA reads to a reference genome collection. The size of the reference genome collection is important for minimizing false-positive identifications. In addition, such comparisons against genome databases require a fair number of computational hours. Previously we worked on the Kebnekaise cluster, as it had very large-memory nodes well suited to our specific needs. However, in December 2022 Kebnekaise was discontinued by SNIC, and other computer clusters such as Rackham or Snowy have very few or no nodes with more than 1 TB of memory. We then moved to Dardel, but since a crucial update in February 2023, Java users encounter a SIGBUS error. Some aspects of our pipeline depend heavily on Java, and we are therefore looking for an alternative server. We would like to apply for a small compute project to test whether Berzelius can provide a viable solution for our computational needs before considering an application for a medium-size project.