From Voices to Worlds: Developing an AI-Powered Framework for 3D Object Generation in Augmented Reality

Published: March 4, 2025 | arXiv ID: 2503.16474v1

By: Majid Behravan, Denis Gracanin

Potential Business Impact:

Creates 3D objects in augmented reality from spoken commands.

Business Areas:

Augmented Reality Hardware, Software

This paper presents Matrix, an advanced AI-powered framework designed for real-time 3D object generation in Augmented Reality (AR) environments. By integrating a cutting-edge text-to-3D generative AI model, multilingual speech-to-text translation, and large language models (LLMs), the system enables seamless user interactions through spoken commands. The framework processes speech inputs, generates 3D objects, and provides object recommendations based on contextual understanding, enhancing AR experiences. A key feature of this framework is its ability to optimize 3D models by reducing mesh complexity, resulting in significantly smaller file sizes and faster processing on resource-constrained AR devices. Our approach addresses the challenges of high GPU usage, large model output sizes, and real-time system responsiveness, ensuring a smoother user experience. Moreover, the system is equipped with a pre-generated object repository, further reducing GPU load and improving efficiency. We demonstrate the practical applications of this framework in various fields such as education, design, and accessibility, and discuss future enhancements including image-to-3D conversion, environmental object detection, and multimodal support. The open-source nature of the framework promotes ongoing innovation and its utility across diverse industries.
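The abstract names two efficiency measures: reducing mesh complexity and serving objects from a pre-generated repository instead of re-running the generator. The following is a minimal sketch of how those two ideas could fit together, not the paper's implementation. The abstract does not say which simplification algorithm Matrix uses, so vertex clustering is swapped in here purely for illustration. All names (`ObjectRepository`, `simplify_by_vertex_clustering`, the `generate` callback, the `cell` parameter) are hypothetical.

```python
import math
from typing import Callable, Dict, List, Set, Tuple

Vertex = Tuple[float, float, float]
Face = Tuple[int, int, int]
Mesh = Tuple[List[Vertex], List[Face]]


def simplify_by_vertex_clustering(mesh: Mesh, cell: float) -> Mesh:
    """Merge vertices that share a grid cell of edge length `cell`.

    Illustrative stand-in for the paper's (unspecified) mesh reduction step.
    """
    vertices, faces = mesh
    cell_to_index: Dict[Tuple[int, int, int], int] = {}
    sums: List[List[float]] = []  # per-cluster coordinate sums
    counts: List[int] = []        # per-cluster vertex counts
    remap: List[int] = []         # old vertex index -> cluster index

    for x, y, z in vertices:
        key = (math.floor(x / cell), math.floor(y / cell), math.floor(z / cell))
        idx = cell_to_index.setdefault(key, len(sums))
        if idx == len(sums):  # first vertex seen in this cell
            sums.append([0.0, 0.0, 0.0])
            counts.append(0)
        sums[idx][0] += x
        sums[idx][1] += y
        sums[idx][2] += z
        counts[idx] += 1
        remap.append(idx)

    # Each cluster is represented by the mean of its member vertices.
    new_vertices = [(s[0] / n, s[1] / n, s[2] / n) for s, n in zip(sums, counts)]

    new_faces: List[Face] = []
    seen: Set[frozenset] = set()
    for a, b, c in faces:
        a, b, c = remap[a], remap[b], remap[c]
        if a == b or b == c or a == c:
            continue  # triangle collapsed to an edge or a point
        signature = frozenset((a, b, c))
        if signature in seen:
            continue  # duplicate triangle created by merging
        seen.add(signature)
        new_faces.append((a, b, c))
    return new_vertices, new_faces


class ObjectRepository:
    """Cache of simplified meshes keyed by a normalized text prompt."""

    def __init__(self, generate: Callable[[str], Mesh], cell: float = 0.1):
        self._generate = generate  # hypothetical text-to-3D backend
        self._cell = cell
        self._store: Dict[str, Mesh] = {}
        self.generator_calls = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Lowercase and collapse whitespace so near-identical prompts match.
        return " ".join(prompt.lower().split())

    def get(self, prompt: str) -> Mesh:
        key = self._key(prompt)
        if key not in self._store:
            # Cache miss: pay the generation cost once, store the reduced mesh.
            self.generator_calls += 1
            self._store[key] = simplify_by_vertex_clustering(
                self._generate(key), self._cell
            )
        return self._store[key]
```

In this sketch, a repeated request such as "red chair" would be answered from the repository without calling the generator again, which is the kind of GPU-load reduction the abstract attributes to the pre-generated object repository.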

Transcending Dimensions using Generative AI: Real-Time 3D Model Generation in Augmented Reality

Graphics

Makes 3D models from pictures in AR.

27 Apr 2025 0

88%

Say It, See It: A Systematic Evaluation on Speech-Based 3D Content Generation Methods in Augmented Reality

Human-Computer Interaction

Creates 3D objects from words and pictures.

17 Aug 2025 1

87%

LatticeWorld: A Multimodal Large Language Model-Empowered Framework for Interactive Complex World Generation

Artificial Intelligence

Creates virtual worlds from text and images.

5 Sep 2025 1

Country of Origin

🇺🇸 United States

Page Count

7 pages
