Award Date

5-15-2025

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Committee Member

Kazem Taghva

Second Committee Member

Mingon Kang

Third Committee Member

Laxmi Gewali

Fourth Committee Member

Emma Regentova

Fifth Committee Member

Fatma Nasoz

Sixth Committee Member

Ashok Singh

Abstract

Machine reading comprehension is a critical step in the development of applications that require the semantic understanding of human speech converted to text. Smart home devices such as the Amazon Echo Dot and Google Home, and smart assistants such as Apple Siri and Microsoft Cortana, are examples of these applications. The comprehension task involves a deeper understanding and recognition of named entities such as person names, locations, medical codes, quantities, abbreviations, and acronyms in speech or text data. In this dissertation, we explore and extend the approaches and techniques in modern research that tackle the recognition and definition of acronyms and abbreviations. We also offer techniques for disambiguating abbreviations, a problem caused by the abundance and frequent introduction of new abbreviations. We provide the following contributions: 1) A historical background on the rule-based and statistical methods for finding acronyms and their definitions. 2) A method based on the Bidirectional Encoder Representations from Transformers (BERT) question answering model to find acronym definitions in a document. Our experiments show that this model correctly predicts 94% of acronym expansions, assuming a Jaro–Winkler similarity threshold greater than 0.8. 3) An exploration of the approaches and techniques for resolving ambiguous abbreviations and their definitions. We reverse engineered the process of creating ad hoc abbreviations and found some preliminary statistics on what makes them easier or harder to define. In addition to the recognition and definition of acronyms and abbreviations, this dissertation contributes a systematic generative method to create datasets and use them to build a corpus for acronym expansion. Our approach to data generation can be used in many applications where there are no standard datasets.
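
As an illustration of the evaluation criterion mentioned in the abstract, the following is a minimal sketch of how a predicted acronym expansion might be compared to a reference expansion with a Jaro–Winkler score and a 0.8 cutoff. It is not taken from the dissertation: the function names (`jaro`, `jaro_winkler`, `is_correct`), the lowercase normalization, and the conventional prefix scale of 0.1 with a four-character prefix cap are all assumptions made for this example.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix (assumed p=0.1, cap of 4)."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * p * (1.0 - base)


def is_correct(predicted: str, reference: str, threshold: float = 0.8) -> bool:
    """Hypothetical helper: accept a prediction whose score exceeds the threshold."""
    return jaro_winkler(predicted.lower(), reference.lower()) > threshold


if __name__ == "__main__":
    # Illustrative strings only; not data from the dissertation.
    print(is_correct("natural language procesing", "Natural Language Processing"))
    print(is_correct("national league", "Natural Language Processing"))
```

Under a criterion of this kind, a prediction with a small spelling difference from the reference can still count as correct, whereas an exact-match criterion would reject it.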

Keywords

Abbreviations; Acronyms; Machine Learning; Machine Reading Comprehension; Natural Language Processing

Disciplines

Artificial Intelligence and Robotics | Computer Engineering | Computer Sciences

File Format

pdf

File Size

851 KB

Degree Grantor

University of Nevada, Las Vegas

Language

English

Repository Citation

Choi, Sing, "Developments on Abbreviations Towards Machine Reading Comprehension" (2025). UNLV Theses, Dissertations, Professional Papers, and Capstones. 5255.
http://dx.doi.org/10.34917/39206702

Rights

IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/

Included in

Artificial Intelligence and Robotics Commons, Computer Engineering Commons
