An Automated Grey Literature Extraction Tool for Software Engineering
By: Houcine Abdelkader Cherief, Brahim Mahmoudi, Zacharie Chenail-Larcher and more
Potential Business Impact:
Helps researchers collect, filter, and rank practitioner-written software engineering sources in a reproducible way.
Grey literature is essential to software engineering research because it captures practices and decisions that rarely appear in academic venues. However, collecting and assessing it at scale remains difficult: its sources, formats, and APIs are heterogeneous, which impedes reproducible, large-scale synthesis. To address this issue, we present GLiSE, a prompt-driven tool that turns a research topic prompt into platform-specific queries, gathers results from common software-engineering web sources (GitHub, Stack Overflow) and Google Search, and uses embedding-based semantic classifiers to filter and rank results by relevance. GLiSE is designed for reproducibility: all settings are configuration-based, and every generated query is accessible. In this paper, we (i) present the GLiSE tool, (ii) provide a curated dataset of software engineering grey-literature search results classified by semantic relevance to their originating search intent, and (iii) conduct an empirical study on the usability of our tool.
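The pipeline described in the abstract (topic prompt, platform-specific queries, relevance filtering and ranking) can be illustrated with a minimal sketch. This is not GLiSE's implementation or API: the function names, the query templates, and the threshold are hypothetical, and a simple bag-of-words vector stands in for the embedding model, which the abstract does not specify.

```python
import math
import re
from collections import Counter


def build_queries(topic):
    """Turn one topic into per-platform query strings (illustrative templates only)."""
    return {
        "github": f"{topic} in:readme",
        "stackoverflow": f"{topic} is:question",
        "google": f'"{topic}" blog OR talk',
    }


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_results(topic, results, threshold=0.1):
    """Drop results scoring below the threshold, then sort by descending similarity."""
    topic_vec = embed(topic)
    scored = [(cosine(topic_vec, embed(text)), text) for text in results]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    topic = "microservices migration"
    for platform, query in build_queries(topic).items():
        print(platform, "->", query)
    hits = [
        "Microservices migration lessons from our team",
        "Best pizza recipes in Montreal",
        "Migration to microservices: pitfalls",
    ]
    for score, text in rank_results(topic, hits):
        print(f"{score:.2f}  {text}")
```

In a real setting, `embed` would be replaced by a sentence-embedding model, and the templates and threshold would live in a configuration file so that each run can be reproduced.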