Semantic-Driven Topic Modeling for Analyzing Creativity in Virtual Brainstorming
By: Melkamu Abay Mersha, Jugal Kalita
Potential Business Impact:
Finds good ideas in group brainstorming chats.
Virtual brainstorming sessions have become a central component of collaborative problem solving, yet the large volume and uneven distribution of ideas often make it difficult to extract valuable insights efficiently. Manual coding of ideas is time-consuming and subjective, underscoring the need for automated approaches to support the evaluation of group creativity. In this study, we propose a semantic-driven topic modeling framework that integrates four modular components: transformer-based embeddings (Sentence-BERT), dimensionality reduction (UMAP), clustering (HDBSCAN), and topic extraction with refinement. The framework captures semantic similarity at the sentence level, enabling the discovery of coherent themes from brainstorming transcripts while filtering noise and identifying outliers. We evaluate our approach on structured Zoom brainstorming sessions involving student groups tasked with improving their university. Results demonstrate that our model achieves higher topic coherence than established methods such as LDA, ETM, and BERTopic, with an average C_V coherence score of 0.687, outperforming these baselines by a significant margin. Beyond improved performance, the model provides interpretable insights into the depth and diversity of topics explored, supporting both convergent and divergent dimensions of group creativity. This work highlights the potential of embedding-based topic modeling for analyzing collaborative ideation and contributes an efficient and scalable framework for studying creativity in synchronous virtual meetings.
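To make the last stage of the pipeline more concrete, the sketch below shows one way topic words could be extracted once sentences have been clustered. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how topic extraction or refinement is done, so a class-based TF-IDF-style score (similar in spirit to the one BERTopic uses) stands in for it. The function names, the stopword list, and the sample sentences are all invented for this example, and the cluster labels are supplied by hand rather than produced by Sentence-BERT, UMAP, and HDBSCAN. The only convention borrowed from those tools is that HDBSCAN marks unclustered points with the label -1.

```python
import math
import re
from collections import Counter, defaultdict

NOISE_LABEL = -1  # HDBSCAN's convention for points it leaves unclustered


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms_per_cluster(sentences, labels, stopwords=frozenset(), top_n=3):
    """Rank words in each cluster with a class-based TF-IDF-style score.

    Assumption: this scoring is a stand-in for the paper's unspecified
    topic-extraction step. Sentences labeled as noise are skipped.
    """
    # Treat each cluster as one large document and count its words.
    cluster_counts = defaultdict(Counter)
    for sentence, label in zip(sentences, labels):
        if label == NOISE_LABEL:
            continue
        tokens = [t for t in tokenize(sentence) if t not in stopwords]
        cluster_counts[label].update(tokens)

    # For each word, count how many clusters contain it.
    cluster_freq = Counter()
    for counts in cluster_counts.values():
        cluster_freq.update(counts.keys())

    n_clusters = len(cluster_counts)
    topics = {}
    for label, counts in cluster_counts.items():
        total = sum(counts.values())
        scores = {
            word: (count / total) * math.log(1 + n_clusters / cluster_freq[word])
            for word, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        topics[label] = ranked[:top_n]
    return topics


# Invented sample utterances with hand-assigned cluster labels.
sentences = [
    "More parking near the library",
    "Cheaper parking permits for students",
    "Healthier food in the dining hall",
    "Longer dining hall hours",
    "Um can everyone hear me",
]
labels = [0, 0, 1, 1, NOISE_LABEL]
stopwords = frozenset({"the", "in", "for", "near", "more"})

topics = top_terms_per_cluster(sentences, labels, stopwords=stopwords)
for label in sorted(topics):
    print(label, topics[label])
```

In a full pipeline, the hand-written labels would instead come from clustering reduced sentence embeddings, and the off-topic utterance would be dropped only if the clustering step actually flagged it as noise.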
Similar Papers
LLM-Assisted Topic Reduction for BERTopic on Social Media Data
Computation and Language
Cleans up messy text to find clear ideas.
TopiCLEAR: Topic extraction by CLustering Embeddings with Adaptive dimensional Reduction
Computation and Language
Finds hidden topics in social media posts.
Unsupervised Document and Template Clustering using Multimodal Embeddings
Computation and Language
Groups similar papers by words, look, and layout.