AgentSense: Virtual Sensor Data Generation Using LLM Agents in Simulated Home Environments
By: Zikang Leng, Megha Thukral, Yaqi Liu and more
Potential Business Impact:
Creates realistic smart home activity data for AI.
A major challenge in developing robust and generalizable Human Activity Recognition (HAR) systems for smart homes is the lack of large and diverse labeled datasets. Variations in home layouts, sensor configurations, and individual behaviors further exacerbate this issue. To address this, we leverage the idea of embodied AI agents: virtual agents that perceive and act within simulated environments guided by internal world models. We introduce AgentSense, a virtual data generation pipeline in which agents live out daily routines in simulated smart homes, with behavior guided by Large Language Models (LLMs). The LLM generates diverse synthetic personas and realistic routines grounded in the environment, which are then decomposed into fine-grained actions. These actions are executed in an extended version of the VirtualHome simulator, which we augment with virtual ambient sensors that record the agents' activities. Our approach produces rich, privacy-preserving sensor data that reflects real-world diversity. We evaluate AgentSense on five real HAR datasets. Models pretrained on the generated data consistently outperform baselines, especially in low-resource settings. Furthermore, combining the generated virtual sensor data with a small amount of real data achieves performance comparable to training on full real-world datasets. These results highlight the potential of using LLM-guided embodied agents for scalable and cost-effective sensor data generation in HAR.
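As a rough illustration of the last stage described above (fine-grained actions turned into ambient sensor readings), the following toy sketch may help. It is not the authors' implementation: it does not call an LLM or the VirtualHome simulator, the routine is hard-coded as a stand-in for LLM output, and every name in it (`Action`, `SensorEvent`, `MOTION_SENSORS`, `CONTACT_SENSORS`, `record_events`) is a hypothetical one chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One fine-grained step of a routine (hypothetical schema)."""
    time_s: int   # seconds since the start of the simulated day
    verb: str     # e.g. "walk", "open", "close"
    target: str   # object the agent acts on
    room: str     # room in which the action happens


@dataclass(frozen=True)
class SensorEvent:
    """One reading emitted by a virtual ambient sensor."""
    time_s: int
    sensor_id: str
    value: str


# Assumed sensor layout: one motion sensor per room and
# one contact sensor on the fridge door.
MOTION_SENSORS = {"bedroom": "M_bedroom", "kitchen": "M_kitchen"}
CONTACT_SENSORS = {"fridge": "D_fridge"}


def record_events(actions):
    """Turn a list of actions into a time-ordered list of sensor events."""
    events = []
    current_room = None
    for action in sorted(actions, key=lambda a: a.time_s):
        # A room change switches the old room's motion sensor off
        # and the new room's motion sensor on.
        if action.room != current_room:
            if current_room in MOTION_SENSORS:
                events.append(
                    SensorEvent(action.time_s, MOTION_SENSORS[current_room], "OFF"))
            if action.room in MOTION_SENSORS:
                events.append(
                    SensorEvent(action.time_s, MOTION_SENSORS[action.room], "ON"))
            current_room = action.room
        # Opening or closing an instrumented object triggers its contact sensor.
        if action.target in CONTACT_SENSORS and action.verb in ("open", "close"):
            value = "OPEN" if action.verb == "open" else "CLOSE"
            events.append(
                SensorEvent(action.time_s, CONTACT_SENSORS[action.target], value))
    return events


# Stand-in for a routine that an LLM would generate and decompose.
routine = [
    Action(0, "walk", "bed", "bedroom"),
    Action(60, "walk", "fridge", "kitchen"),
    Action(65, "open", "fridge", "kitchen"),
    Action(80, "close", "fridge", "kitchen"),
]

events = record_events(routine)
for event in events:
    print(event.time_s, event.sensor_id, event.value)
```

In a pipeline like the one the abstract describes, the simulator would presumably derive such events from the agent's position and object states rather than from a lookup table; the table is used here only to keep the sketch short.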
Similar Papers
AgentSense: LLMs Empower Generalizable and Explainable Web-Based Participatory Urban Sensing
Artificial Intelligence
Helps cities understand problems by asking people.
Scaling Human Activity Recognition: A Comparative Evaluation of Synthetic Data Generation and Augmentation Techniques
CV and Pattern Recognition
Creates fake motion data to train activity trackers.
LLM Agent-Based Simulation of Student Activities and Mental Health Using Smartphone Sensing Data
Human-Computer Interaction
Models student minds to improve well-being.