Transformer-Based Modeling of User Interaction Sequences for Dwell Time Prediction in Human-Computer Interfaces

Published: December 19, 2025 | arXiv ID: 2512.17149v1

By: Rui Liu, Runsheng Zhang, Shixiao Wang

This study addresses dwell time prediction and proposes a Transformer framework built on interaction behavior modeling. The method represents each user interaction sequence by combining dwell duration, click frequency, scrolling behavior, and contextual features, then maps these into a unified latent space through embedding and positional encoding. A multi-head self-attention mechanism captures long-range dependencies, and a feed-forward network applies deep nonlinear transformations to model the dynamics of dwell time. The method is compared with BiLSTM, DRFormer, FEDformer, and iTransformer baselines under identical conditions. It achieves the best MSE, RMSE, MAPE, and RMAE among the compared models, indicating that it captures complex patterns in interaction behavior more accurately. Sensitivity experiments examine how the number of attention heads, the sequence window length, and the device environment affect prediction performance, and the results support the method's robustness and adaptability. Overall, the study offers a new approach to dwell time prediction, from both theoretical and methodological perspectives, and verifies its effectiveness across these experiments.
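The pipeline described in the abstract (interaction features, positional encoding, self-attention) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head with identity projections instead of learned multi-head weights, skips the learned embedding and the feed-forward network, and uses the standard sinusoidal positional encoding. The feature layout and the values in `steps` are hypothetical.

```python
import math

def positional_encoding(seq_len, d_model):
    """Standard sinusoidal positional encoding: sin on even dims, cos on odd dims."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            pe[pos][i] = math.sin(angle) if i % 2 == 0 else math.cos(angle)
    return pe

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections); a trained model would learn these projections.
    """
    d = len(x[0])
    out, weights = [], []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        out.append([sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)])
    return out, weights

# Hypothetical interaction steps: [dwell_seconds, clicks, scroll_depth, is_mobile]
steps = [
    [3.2, 1.0, 0.10, 1.0],
    [7.5, 0.0, 0.45, 1.0],
    [1.1, 2.0, 0.50, 1.0],
]

pe = positional_encoding(len(steps), len(steps[0]))
# Add position information to each step's feature vector.
x = [[f + p for f, p in zip(step, row)] for step, row in zip(steps, pe)]
context, attn = self_attention(x)
```

Each row of `attn` shows how strongly one interaction step attends to every other step in the window, which is the mechanism the abstract credits with capturing long-range dependencies. A full model would stack several such heads and feed `context` through a feed-forward network before regressing the dwell time.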

PinFM: Foundation Model for User Activity Sequences at a Billion-scale Visual Discovery Platform

Machine Learning (CS)

Helps apps show you things you'll like.

17 Jul 2025 0

87%

TIDFormer: Exploiting Temporal and Interactive Dynamics Makes A Great Dynamic Graph Transformer

Machine Learning (CS)

Helps computers understand changing online connections faster.

31 May 2025 1

87%

TimeFormer: Transformer with Attention Modulation Empowered by Temporal Characteristics for Time Series Forecasting

Machine Learning (CS)

Predicts future events better by learning from the past.

8 Oct 2025 1
