Forensic STR allele extraction using a machine learning paradigm.


We present Fragsifier, a machine learning approach to short tandem repeat (STR) sequence detection and extraction from massively parallel sequencing data. STRs are detected on each read by first locating the longest repeat stretches and then predicting the locus from k-mers using a machine learning sequence model. Reference flanking sequences are then aligned to determine precise STR boundaries. We show that Fragsifier produces genotypes that are concordant with profiles obtained using capillary electrophoresis (CE), and we compare its results with those of STRait Razor and the ForenSeq UAS. The data pre-processing and the training of the sequence classifier are readily scripted, allowing the analyst to experiment with different thresholds, datasets, loci of interest, and machine learning models.
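
The first two steps described above (locating the longest repeat stretch on a read, then deriving k-mers as classifier features) can be illustrated with a minimal sketch. This is not Fragsifier's code: the function names, the fixed repeat-unit length `unit_len`, the `min_copies` threshold, and the choice to take k-mers from the flanks are all assumptions made for illustration, and the locus classifier and flanking-sequence alignment are left out.

```python
def longest_repeat_stretch(read, unit_len=4, min_copies=3):
    """Return (start, end, unit) for the longest perfect tandem repeat, or None.

    Hypothetical helper: scans every start position, extends the run while
    the next unit_len bases equal the first unit, and keeps the longest run
    with at least min_copies copies. Ties keep the leftmost run.
    """
    best = None
    for start in range(len(read) - unit_len + 1):
        unit = read[start:start + unit_len]
        end = start + unit_len
        # Extend while the following block repeats the same unit.
        while read[end:end + unit_len] == unit:
            end += unit_len
        copies = (end - start) // unit_len
        if copies >= min_copies and (best is None or end - start > best[1] - best[0]):
            best = (start, end, unit)
    return best


def kmers(seq, k=3):
    """Return all overlapping k-mers of seq, in order (hypothetical feature step)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Toy read: 5-base left flank, five copies of TCTA, 5-base right flank.
read = "ACGGT" + "TCTA" * 5 + "GGCCA"
stretch = longest_repeat_stretch(read)
if stretch is not None:
    start, end, unit = stretch
    # Flank k-mers are one possible input for a locus classifier.
    features = kmers(read[:start]) + kmers(read[end:])
```

In a real pipeline the k-mer list would typically be turned into a count vector and passed to whichever sequence model the analyst has trained; the thresholds here are exactly the kind of parameter the abstract says can be varied.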
