SeqMatcher: efficient genome sequence matching with AVX-512 extensions

Bibliographic Details
Title: SeqMatcher: efficient genome sequence matching with AVX-512 extensions
Authors: Espinosa, Elena; Quislant-del-Barrio, Ricardo; Larrosa-Jiménez, Rafael; Plata-González, Óscar Guillermo
Source: RIUMA. Repositorio Institucional de la Universidad de Málaga; Universidad de Málaga
Publisher Information: Springer Science and Business Media LLC, 2024.
Publication Year: 2024
Subject Terms: 0301 basic medicine, Genome assembly, Approximate string matching, 0206 medical engineering, 02 engineering and technology, Myers algorithm, SIMD, 03 medical and health sciences, Computational algorithms, Hyyrö algorithm, Computer architecture, AVX-512
Description: The recent emergence of long-read sequencing technologies has enabled substantial improvements in accuracy and reduced computational costs. Nonetheless, pairwise sequence alignment remains a time-consuming step in common bioinformatics pipelines, becoming a bottleneck in de novo whole-genome assembly. Speeding up this step requires heuristics and the development of memory-frugal and efficient implementations. A promising candidate for all of the above is Myers’ algorithm. However, the state-of-the-art implementations face scalability challenges when dealing with longer reads and large datasets. To address these challenges, we propose SeqMatcher, a fast and memory-frugal genomics sequence aligner. By leveraging the long registers of AVX-512, SeqMatcher reduces the data movement and memory footprint. In a comprehensive performance evaluation, SeqMatcher achieves speedups of up to 12.32x for the unbanded version and 26.70x for the banded version compared to the non-vectorized implementation, along with energy footprint reductions of up to 2.59x. It also outperforms state-of-the-art implementations by factors of up to 29.21x, 17.56x, 13.47x, 9.12x, and 8.81x compared to Edlib, WFA2-lib, SeqAn, BSAlign, and QuickEd, respectively, while improving energy consumption with reductions of up to 6.78x.
Document Type: Article
Language: English
ISSN: 1573-0484, 0920-8542
DOI: 10.1007/s11227-024-06789-0
Access URL: https://hdl.handle.net/10630/36202
Rights: CC BY
Accession Number: edsair.doi.dedup.....5b9e47ea42e9c119e83d7cb46ade76d4
Database: OpenAIRE