A Reproducible LLM-Assisted Framework for Mapping AI Research in Gambling Studies
Session Title
AI in Gambling: Mapping the Research Landscape & Advancing Behavioral Risk Detection
Presentation Type
Paper Presentation
Start Date
26-5-2026 12:00 AM
Abstract
The intersection of artificial intelligence (AI) and gambling research remains poorly delineated within existing bibliographic databases, limiting systematic analysis of its evolution, structure, and research priorities. This work presents a scalable, AI-assisted bibliometric methodology for constructing and analyzing a high-precision corpus in domains lacking standardized taxonomies. Using the OpenAlex database, we first apply a deliberately high-recall retrieval strategy that combines topic-based queries with extensive keyword expansion, yielding over 100,000 candidate publications related to gambling. To recover precision, we employ a locally deployed large language model (LLM), DeepSeek-R1, to semantically filter and classify publications based on title and abstract content, reducing the corpus to approximately 30,000 gambling-related works. A second-stage LLM classifier identifies AI-relevant research, resulting in a curated corpus of 899 AI-and-gambling publications spanning 2010–2025. We then apply BERTopic-based clustering and LLM-assisted labeling to uncover dominant research themes and their temporal dynamics, enabling both macro-level trend analysis and fine-grained subdomain exploration. As a case study, we demonstrate the method’s utility by tracing the field’s shift from poker-centric AI research toward sports betting and consumer protection following regulatory and technological inflection points.
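The two-stage filtering funnel described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt wording, the function names, and the `ask_llm` callable (a stand-in for whatever client talks to the locally deployed model) are all assumptions made for illustration. The sketch relies on two known details: OpenAlex returns abstracts as an inverted index (`abstract_inverted_index`, mapping each word to its positions), and DeepSeek-R1 typically emits a `<think>...</think>` reasoning block before its final answer.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Illustrative prompts only; the wording used in the study is not given.
GAMBLING_Q = "Is this publication about gambling?"
AI_Q = "Does this publication use or study artificial intelligence?"


def rebuild_abstract(inverted_index: Optional[Dict[str, List[int]]]) -> str:
    """Restore plain text from an OpenAlex-style {word: [positions]} index."""
    if not inverted_index:
        return ""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    return " ".join(by_position[pos] for pos in sorted(by_position))


def parse_verdict(raw: str) -> bool:
    """Strip any <think>...</think> reasoning block, then read YES or NO."""
    answer = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    return answer.strip().upper().startswith("YES")


def make_prompt(question: str, work: dict) -> str:
    """Build a title-plus-abstract prompt whose first line is the question."""
    title = work.get("title") or ""
    abstract = rebuild_abstract(work.get("abstract_inverted_index"))
    return f"{question}\nTitle: {title}\nAbstract: {abstract}\nAnswer YES or NO."


def filter_works(works: List[dict], question: str,
                 ask_llm: Callable[[str], str]) -> List[dict]:
    """Keep only the works for which the model answers YES."""
    return [w for w in works if parse_verdict(ask_llm(make_prompt(question, w)))]


def two_stage_funnel(works: List[dict],
                     ask_llm: Callable[[str], str]) -> Tuple[List[dict], List[dict]]:
    """Stage 1 keeps gambling works; stage 2 keeps the AI-related subset."""
    gambling = filter_works(works, GAMBLING_Q, ask_llm)
    ai_and_gambling = filter_works(gambling, AI_Q, ask_llm)
    return gambling, ai_and_gambling
```

Passing the model client in as `ask_llm` keeps the funnel logic independent of any particular serving stack, so the same code can be exercised with a stub during testing and with a real local model in production.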