HERP: Hardware for Energy Efficient and Realtime DB Search and Cluster Expansion in Proteomics

Published: November 5, 2025 | arXiv ID: 2511.03437v2

By: Md Mizanur Rahaman Nayan, Zheyu Li, Flavio Ponzina and more

Potential Business Impact:

Speeds up database search and clustering of proteomics data while using much less energy.

Business Areas:
Big Data, Data and Analytics

Database search and clustering are fundamental components of many data analytics problems, such as mass spectrometry-driven proteomics. Traditional full clustering and search algorithms suffer from high resource usage and long latencies. We introduce HERP, a lightweight incremental clustering method and a highly parallelizable database (DB) search platform that uses a 3T2MTJ SOT-MRAM-based content-addressable memory (CAM) in 7 nm technology for in-memory acceleration. A single hardware initialization using pre-clustered proteomics data allows continuous DB searching and local re-clustering, providing a more practical and efficient alternative to clustering from scratch. Heuristics derived from the initial pre-clustered data guide the incremental process, accelerating clustering by 20x at the cost of a 0.3% increase in clustering error, while DB search results overlap by 96% with state-of-the-art (SOTA) algorithms, validating search quality. For a 131 GB human genome proteomics dataset, HERP setup requires 1.19 mJ for 2M spectra, while a 1000-query search consumes only 1.1 µJ at SOTA accuracy. Bucket-wise parallelization and query scheduling provide an additional 100x speedup.
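
The general idea of incremental clustering from a pre-clustered starting point can be illustrated with a small software sketch. This is not the HERP hardware or its heuristics: the class name `IncrementalClusters`, the cosine similarity measure, the running-mean centroid update, the initial member count of 1, and the `threshold` value are all assumptions made for illustration. A new spectrum is compared against existing cluster centroids (the search step); it joins the best match if the similarity clears the threshold, and otherwise starts a new cluster, so nothing is re-clustered from scratch.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class IncrementalClusters:
    """Hypothetical illustration; not the method described in the paper."""

    def __init__(self, centroids, threshold=0.8):
        # Initialize once from pre-clustered data (one centroid per cluster).
        # Assumption: each initial cluster is treated as having one member.
        self.centroids = [list(c) for c in centroids]
        self.counts = [1] * len(self.centroids)
        self.threshold = threshold

    def search(self, query):
        """Return (index, similarity) of the best-matching cluster."""
        sims = [cosine(query, c) for c in self.centroids]
        best = max(range(len(sims)), key=sims.__getitem__)
        return best, sims[best]

    def add(self, spectrum):
        """Assign a spectrum to the best cluster or open a new one."""
        if self.centroids:
            idx, sim = self.search(spectrum)
            if sim >= self.threshold:
                # Local update only: running mean of the matched centroid.
                n = self.counts[idx]
                self.centroids[idx] = [
                    (c * n + s) / (n + 1)
                    for c, s in zip(self.centroids[idx], spectrum)
                ]
                self.counts[idx] = n + 1
                return idx
        # No sufficiently similar cluster: start a new one.
        self.centroids.append(list(spectrum))
        self.counts.append(1)
        return len(self.centroids) - 1


clusters = IncrementalClusters([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(clusters.add([0.9, 0.1, 0.0]))  # joins the first cluster
print(clusters.add([0.0, 0.0, 1.0]))  # dissimilar to both, opens a new cluster
```

In this sketch each insertion costs one pass over the centroids, which is the kind of compare-against-all operation that a content-addressable memory performs in parallel.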

Country of Origin
🇺🇸 United States

Page Count
7 pages

Category
Computer Science:
Databases