Sentiment Analysis And Topic Modeling on User Reviews of Online Tutoring Applications Using Support Vector Machine and Latent Dirichlet Allocation

Authors

  • Mohammad Rezza Fahlevvi, Institute of Home Affairs Governance

DOI:

https://doi.org/10.30983/knowbase.v2i2.5906

Keywords:

Ruangguru, Google Play Store (GPS), Sentiment Analysis, Topic Modelling, Support Vector Machine, Latent Dirichlet Allocation (LDA), Confusion Matrix, ROUGE

Abstract

Ruangguru is an online non-formal education application in Indonesia with several appealing features that encourage students to study online. Its availability on the Google Play Store lets the developers receive feedback through the store's review feature. Users raise a wide range of topics and comments about Ruangguru in their reviews, which makes it difficult to identify public sentiment and topics of conversation manually, and these opinions merit further study. This study aims to classify user opinions into positive and negative classes and to model the topics in each class in order to find the topics most often discussed. The stages of the study are data collection, data cleaning, data transformation, classification with the Support Vector Machine (SVM) method, and topic modeling with the Latent Dirichlet Allocation (LDA) method. The LDA topic models for the positive and negative classes are assessed by their coherence value: the higher the coherence value of a topic, the easier the topic is for humans to interpret. The models were tested using a confusion matrix and ROUGE. Testing with the confusion matrix gave accuracy, precision, recall, and F-measure values of 0.9, 0.9, 0.9, and 0.89, respectively. Testing with ROUGE gave highest recall, precision, and F-measure values of 1, 0.84, and 0.91, respectively. The highest coherence value, 0.318, is found in the 20th topic. The Support Vector Machine and Latent Dirichlet Allocation algorithms are therefore considered adequate for sentiment analysis and topic modeling of reviews of the Ruangguru application.
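
The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `confusion_metrics` and `rouge_1`, the toy labels, and the toy review strings are all hypothetical. The abstract does not say which ROUGE variant was used or how the class-level metrics were averaged, so the sketch assumes unigram ROUGE-1 and reports metrics for the positive class only.

```python
from collections import Counter


def _f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def confusion_metrics(y_true, y_pred, positive="positive"):
    """Accuracy, precision, recall and F-measure from binary confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": _f_measure(precision, recall),
    }


def rouge_1(reference, candidate):
    """ROUGE-1: clipped unigram overlap between a reference and a candidate."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # multiset intersection of tokens
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    return {
        "recall": recall,
        "precision": precision,
        "f_measure": _f_measure(precision, recall),
    }


if __name__ == "__main__":
    # Toy labels, invented for illustration only (not data from the study).
    y_true = ["positive", "positive", "positive", "negative", "negative"]
    y_pred = ["positive", "positive", "negative", "negative", "positive"]
    print(confusion_metrics(y_true, y_pred))

    # Toy strings, invented for illustration only.
    print(rouge_1("video belajar mudah dipahami",
                  "video belajar mudah dipahami siswa"))
```

When every reference token also appears in the candidate, ROUGE-1 recall is 1 while precision stays below 1. The ROUGE figures reported in the abstract follow the same pattern.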

Submitted

2022-09-05

Accepted

2022-10-27

Published

2022-11-28