Authors: Alexandre Gouy
Abstract: straf is an application for the analysis of genetic markers in forensics, for example for parental testing or criminal investigations. It does quality control of the data and characterises genetic diversity and relatedness of the samples.
straf (STR Analysis for Forensics) is a web application dedicated to the analysis of genetic markers called STRs (short tandem repeats) in a forensic science context. STRs are an extremely popular type of genetic marker, routinely used to characterize the DNA of individuals. Forensic geneticists typically gather the DNA of a large sample of individuals in their countries and characterize them genetically in order to create a reference population. These populations are then used in forensic practice, for example for parental testing or criminal investigations. The application allows geneticists to assess the quality of their data and characterise the genetic diversity and relatedness of the populations they are interested in.
straf’s story is a tale of two disciplines, forensic genetics and population genetics, which have a lot in common. For example, DNA profiling, used in criminal investigations or parental testing, aims at matching different DNA samples and understanding how related they are in terms of DNA. In population genetics, a common goal is to characterise the genetic diversity of a set of populations, by looking at how related individuals are within and between populations. Both fields aim at understanding the relatedness of individuals based on their DNA.
straf was born from the encounter of two scientists: a forensic geneticist and a population geneticist, in 2017, in the beautiful city of Bern (Switzerland). Martin came to visit a population genetics lab, where Alexandre was pursuing his Ph.D. thesis at that time. This encounter led to a fruitful collaboration when they realised that some tools used in population genetics could be leveraged by the forensics community.
There was a strong need for such an application in forensics. Forensic parameters (metrics to be computed for any genetic dataset) were typically computed by geneticists using an old spreadsheet that had been created by one of the suppliers of an assay used to genotype samples. This spreadsheet was convenient because it was based on Microsoft Excel, hence anyone could use it. It has since been removed from the Internet, and forensic geneticists started sharing this spreadsheet among themselves, circulating it almost secretly, “under the cloak” as French speakers would say. As R was very popular in population genetics, Alexandre and Martin felt that something could be done to save the forensics community!
A few weeks later, straf was born, and four years on, straf had become a tool widely used by the forensics community. The positive reception of the software in the community motivated its development over the years until the release of straf 2.0 in 2021. straf’s story highlights the importance of communication between fields.
- Various statistical methods are implemented, including:
- the computation of forensic and population genetics statistics and quality metrics of interest
- PCA and MDS based on different genetic distance measures, to study relatedness between populations
- File conversion utilities to facilitate the use of common genetic data analysis tools
- straf has been implemented as an R package using Shiny modules
- an online book is being written using bookdown: The STRAF Book (work in progress!)
- The scientific publication has been cited almost 100 times as of May 2021! The software is now used worldwide by forensic geneticists.
- Looking at the list of citations truly makes you travel the world!
- It has also been used as a teaching aid for population genetics
- Some authors from the biomedical field also used it for their studies around prenatal diagnosis of Duchenne Muscular Dystrophy
- Forensic geneticists have finally replaced their old spreadsheet with a modern and more reliable application! Thousands of hours have been saved by avoiding manual conversion of data into several old and poorly designed formats. This time has been used to focus on their forensics work, or enjoy their personal lives!
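To give a feel for the kind of per-locus forensic and population genetics statistics mentioned in the feature list above, here is a minimal sketch. It is written in Python purely for illustration and is not taken from straf, which is an R package. The function name `locus_statistics`, the toy genotypes, and the choice of statistics (allele frequencies, gene diversity, observed heterozygosity, match probability, and power of discrimination, using their standard textbook definitions) are assumptions made for this example.

```python
from collections import Counter


def locus_statistics(genotypes):
    """Compute a few standard statistics for one STR locus.

    `genotypes` is a list of (allele, allele) pairs, one pair per individual.
    Hypothetical helper for illustration only; not part of straf.
    """
    n_individuals = len(genotypes)
    n_gene_copies = 2 * n_individuals  # diploid: two alleles per individual

    # Allele frequencies: count every allele copy in the sample.
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    allele_freqs = {a: c / n_gene_copies for a, c in allele_counts.items()}

    # Genotype frequencies: (12, 13) and (13, 12) are the same genotype.
    genotype_counts = Counter(tuple(sorted(pair)) for pair in genotypes)
    genotype_freqs = [c / n_individuals for c in genotype_counts.values()]

    # Gene diversity (expected heterozygosity) with small-sample correction.
    sum_sq = sum(p * p for p in allele_freqs.values())
    gene_diversity = n_gene_copies / (n_gene_copies - 1) * (1 - sum_sq)

    # Observed heterozygosity: share of individuals with two different alleles.
    observed_het = sum(1 for a, b in genotypes if a != b) / n_individuals

    # Match probability: chance that two random individuals share a genotype.
    match_probability = sum(f * f for f in genotype_freqs)

    return {
        "allele_freqs": allele_freqs,
        "gene_diversity": gene_diversity,
        "observed_heterozygosity": observed_het,
        "match_probability": match_probability,
        "power_of_discrimination": 1 - match_probability,
    }


# Toy data: four individuals typed at a single locus (made-up values).
example_genotypes = [(12, 13), (12, 12), (13, 14), (12, 13)]
stats = locus_statistics(example_genotypes)
print(stats)
```

In this toy sample, half of the allele copies are allele 12, and the match probability is 0.375, giving a power of discrimination of 0.625. Real reference populations contain hundreds of individuals and many loci, so the same computation would be repeated for each locus.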
Keywords: forensics, genetics, research, life sciences
Shiny app: https://agouy.shinyapps.io/straf/
Repo: GitHub - agouy/straf: STRAF: STR Analysis for Forensics
RStudio Cloud: RStudio Cloud