Document Type

Conference Proceeding

Publication Date

2024

Abstract

This study compares the interpretability of the topics produced by three topic modeling techniques, namely LDA, BERTopic, and RoBERTa. Using a case study of three healthcare apps (MyChart, Replika, and Teladoc), we collected 39,999, 52,255, and 27,462 reviews, respectively. Topics were generated for each app with the three models, and labels were assigned to the resulting topics. Comparative qualitative analysis showed that BERTopic, RoBERTa, and LDA perform similarly with respect to the human interpretability of the final list of topics. LDA achieved the highest rate of assigning labels to topics, but its labeling process was considerably more challenging than for BERTopic and RoBERTa, where labeling was easier and faster because each topic contained fewer, more focused words. BERTopic and RoBERTa also generated more cohesive topics than LDA.
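The abstract does not describe the authors' exact pipeline, but the following is a minimal sketch of how topics of this kind can be generated from review text using gensim's LDA and BERTopic. The function names, preprocessing, and parameters are illustrative assumptions rather than the paper's configuration; a RoBERTa-based variant could be approximated by supplying a RoBERTa embedding model to BERTopic.

# Illustrative sketch: topic generation from app reviews with LDA (gensim)
# and BERTopic. Preprocessing and parameters are assumptions for illustration,
# not the authors' actual setup.
from typing import List

from bertopic import BERTopic
from gensim.corpora import Dictionary
from gensim.models import LdaModel


def lda_topics(reviews: List[str], num_topics: int = 10):
    """Fit a bag-of-words LDA model and return the top words per topic."""
    tokenized = [review.lower().split() for review in reviews]
    dictionary = Dictionary(tokenized)
    corpus = [dictionary.doc2bow(tokens) for tokens in tokenized]
    lda = LdaModel(corpus=corpus, id2word=dictionary,
                   num_topics=num_topics, random_state=42)
    return lda.print_topics(num_words=10)


def bertopic_topics(reviews: List[str]):
    """Fit BERTopic (transformer embeddings + clustering) and return topic info."""
    topic_model = BERTopic()
    topic_model.fit_transform(reviews)
    return topic_model.get_topic_info()  # one row per topic with its top keywords


# Usage (with a real corpus of review texts, e.g. the tens of thousands of
# reviews per app collected in the study):
#     print(lda_topics(reviews))
#     print(bertopic_topics(reviews))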

Comments

Originally published as part of the AMCIS 2024 conference proceedings:

El-Gayar, Omar; Al-Ramahi, Mohammad; Wahbeh, Abdullah; Nasralah, Tareq; and Elnoshokaty, Ahmed, "A Comparative Analysis of the Interpretability of LDA and LLM for Topic Modeling: The Case of Healthcare Apps" (2024). AMCIS 2024 Proceedings. 22. https://aisel.aisnet.org/amcis2024/health_it/health_it/22
