TiInsight: A SQL-based Automated Exploratory Data Analysis System through Large Language Models
By: Jun-Peng Zhu, Boyan Niu, Peng Cai and more
Potential Business Impact:
Lets computers find answers in data.
SQL-based exploratory data analysis has garnered significant attention within the data analysis community. The emergence of large language models (LLMs) has facilitated a paradigm shift from manual to automated data exploration. However, existing methods generally lack the ability to perform cross-domain analysis, and the capabilities of LLMs remain insufficiently explored. This paper presents TiInsight, an SQL-based automated cross-domain exploratory data analysis system. First, TiInsight offers a user-friendly GUI that lets users explore data using natural language queries. Second, TiInsight offers a robust cross-domain exploratory data analysis pipeline: hierarchical data context (HDC) generation, question clarification and decomposition, text-to-SQL (TiSQL), and data visualization (TiChart). Third, we have implemented and deployed TiInsight in the production environment of PingCAP and demonstrated its capabilities using representative datasets. The demo video is available at https://youtu.be/JzYFyYd-emI.
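The four pipeline stages named in the abstract can be pictured with a toy sketch. This is not TiInsight's implementation: every function name below (`build_context`, `decompose`, `text_to_sql`, `render_chart`, `explore`) is hypothetical, the LLM calls are replaced by hard-coded stand-ins, and an in-memory SQLite table is assumed in place of a production database. It only illustrates how the stages might hand data to one another.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Step:
    question: str  # a decomposed sub-question
    sql: str       # SQL for it (hard-coded here; an LLM would generate it)


def build_context(conn):
    """Stage 1 (stand-in for HDC generation): list each table's columns."""
    context = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
        context[table] = [col[1] for col in cols]  # col[1] is the column name
    return context


def decompose(question):
    """Stage 2 (stand-in for question clarification and decomposition)."""
    # Assumption: a fixed single sub-question; a real system would ask an LLM.
    return [f"{question} (total per region)"]


def text_to_sql(sub_question, context):
    """Stage 3 (stand-in for text-to-SQL)."""
    assert "sales" in context  # the toy translator only knows this one table
    return Step(
        sub_question,
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region",
    )


def render_chart(rows):
    """Stage 4 (stand-in for visualization): a plain-text bar chart."""
    return "\n".join(f"{label:<6}{'#' * int(value)}" for label, value in rows)


def explore(conn, question):
    """Run the four stages in order and return one chart per sub-question."""
    context = build_context(conn)
    charts = []
    for sub in decompose(question):
        step = text_to_sql(sub, context)
        rows = conn.execute(step.sql).fetchall()
        charts.append(render_chart(rows))
    return charts


# Toy data standing in for a real dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 3), ("south", 5), ("north", 2)],
)
charts = explore(conn, "How do sales vary by region?")
print(charts[0])
```

In this sketch the schema summary is passed into the SQL-generation stage, which mirrors the ordering the abstract gives (context first, then decomposition, then SQL, then charts); how TiInsight actually structures or uses its hierarchical data context is not specified here.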
Similar Papers
An LLM-Based Approach for Insight Generation in Data Analysis
Artificial Intelligence
Finds hidden patterns in data automatically.
DB-Explore: Automated Database Exploration and Instruction Synthesis for Text-to-SQL
Computation and Language
Helps computers understand databases to answer questions.
Text-to-SQL for Enterprise Data Analytics
Computation and Language
Lets people ask questions about data using normal words.