ReCross: Efficient Embedding Reduction Scheme for In-Memory Computing using ReRAM-Based Crossbar
By: Yu-Hong Lai, Chieh-Lin Tsai, Wen Sheng Lim, et al.
Potential Business Impact:
Makes computer recommendations faster while using less power.
Deep learning-based recommendation models (DLRMs) are widely deployed in commercial applications to enhance user experience. However, the large and sparse embedding layers in these models impose substantial memory bandwidth bottlenecks due to high memory access costs and irregular access patterns, leading to increased inference time and energy consumption. While resistive random access memory (ReRAM) based crossbars offer a fast and energy-efficient solution through in-memory embedding reduction operations, naively mapping embeddings onto crossbar arrays leads to poor crossbar utilization and thus degrades performance. We present ReCross, an efficient ReRAM-based in-memory computing (IMC) scheme designed to minimize execution time and enhance energy efficiency in DLRM embedding reduction. ReCross co-optimizes embedding access patterns and ReRAM crossbar characteristics by intelligently grouping and mapping co-occurring embeddings, replicating frequently accessed embeddings across crossbars, and dynamically selecting in-memory processing operations using a newly designed dynamic switch ADC circuit that considers runtime energy trade-offs. Experimental results demonstrate that ReCross achieves a 3.97x reduction in execution time and a 6.1x improvement in energy efficiency compared to state-of-the-art IMC approaches.
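The abstract describes three mechanisms: grouping co-occurring embeddings onto the same crossbar, replicating frequently accessed embeddings across crossbars, and dynamically choosing the in-memory reduction operation at runtime. Below is a minimal illustrative Python sketch of what a co-occurrence-aware placement with hot-row replication could look like; the crossbar size, replication fraction, and names such as CROSSBAR_ROWS and map_embeddings are assumptions for illustration, not the paper's actual mapping algorithm or its dynamic switch ADC logic.

# Illustrative sketch (not from the paper): greedily place embedding rows that
# co-occur in the same pooling lookups onto the same crossbar, and replicate
# the hottest rows on every crossbar so reductions stay local.
from collections import Counter
from itertools import combinations

CROSSBAR_ROWS = 128          # assumed number of rows per ReRAM crossbar array
REPLICATION_FRACTION = 0.05  # assumed share of "hot" rows to replicate

def map_embeddings(lookup_batches, num_crossbars):
    """Greedy placement of embedding indices onto crossbars.

    lookup_batches: list of index lists, one per embedding-reduction query.
    Returns: dict crossbar_id -> set of embedding indices placed on it.
    """
    # Access frequency and pairwise co-occurrence statistics from the trace.
    freq = Counter(i for batch in lookup_batches for i in batch)
    cooc = Counter()
    for batch in lookup_batches:
        for a, b in combinations(sorted(set(batch)), 2):
            cooc[(a, b)] += 1

    # Replicate the most frequently accessed rows on every crossbar.
    num_hot = max(1, int(len(freq) * REPLICATION_FRACTION))
    hot = {i for i, _ in freq.most_common(num_hot)}

    placement = {xb: set(hot) for xb in range(num_crossbars)}
    capacity = {xb: CROSSBAR_ROWS - len(hot) for xb in range(num_crossbars)}

    # Place remaining rows on the crossbar holding their strongest co-occurring
    # neighbours, so a single reduction touches as few crossbars as possible.
    for idx, _ in freq.most_common():
        if idx in hot:
            continue
        def affinity(xb):
            return sum(cooc[tuple(sorted((idx, j)))] for j in placement[xb])
        candidates = [xb for xb in placement if capacity[xb] > 0]
        if not candidates:
            break  # out of capacity in this simplified sketch
        best = max(candidates, key=affinity)
        placement[best].add(idx)
        capacity[best] -= 1
    return placement

if __name__ == "__main__":
    batches = [[1, 2, 3], [2, 3, 7], [1, 7, 9], [4, 5], [4, 5, 9]]
    print(map_embeddings(batches, num_crossbars=2))

In this sketch the co-occurrence counts stand in for the paper's grouping heuristic and the hot set stands in for its replication policy; the runtime choice between in-memory operations based on energy trade-offs is not modeled here.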
Similar Papers
A Time- and Energy-Efficient CNN with Dense Connections on Memristor-Based Chips
Hardware Architecture
Makes AI chips faster while using less power.
In-memory Training on Analog Devices with Limited Conductance States via Multi-tile Residual Learning
Machine Learning (CS)
Trains AI better with cheaper, simpler computer parts.
Leveraging Recurrent Patterns in Graph Accelerators
Hardware Architecture
Makes computer chips faster and longer-lasting.