A Comparative Analysis of Machine Learning Regression Models for TikTok User Engagement Prediction

Authors

  • Lidiasonata Sitohang, Department of Computer Science, Faculty of Information Technology, Universitas Budi Luhur, Jakarta, Indonesia
  • Swinsikya Sitohang, Department of Computer Science, Faculty of Information Technology, Universitas Budi Luhur, Jakarta, Indonesia
  • Hari Soetanto, Department of Computer Science, Faculty of Information Technology, Universitas Budi Luhur, Jakarta, Indonesia

DOI:

https://doi.org/10.54518/rh.6.3.2026.1166

Keywords:

Elastic Net, Engagement Prediction, Linear Regression, Machine Learning, Predictive Modeling, SVR, TikTok

Abstract

The rapid growth of TikTok has made user engagement prediction a critical challenge for content creators and digital marketers, particularly given the high multicollinearity among interaction features such as likes, comments, and shares. This study compares three machine learning regression models, namely linear regression, elastic net, and support vector regression (SVR), for predicting TikTok user engagement levels. The methodology is quantitative and follows the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework, evaluating model performance with mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and the coefficient of determination (R²). The findings indicate that elastic net is the most reliable model, achieving an MAE of 3.98 and an RMSE of 9.37 with an R² of 1.000, supported by consistent results across five cross-validation folds. Linear regression produced trivially perfect scores because the target variable is a direct sum of the input features, while SVR performed poorly, with an MAE of 74.58, indicating difficulty in capturing the linear patterns in the data. These results suggest that regularization-based models offer a more practical and generalizable framework for social media engagement prediction and provide actionable insights for practitioners developing data-driven content strategies.
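The four evaluation metrics named in the abstract, and the reason a direct-sum target yields trivially perfect linear regression scores, can be illustrated with a minimal sketch. The toy rows, the `engagement` definition as likes + comments + shares, and the 0.99 shrinkage factor are assumptions chosen for illustration only; they are not the study's data, feature set, or fitted coefficients.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero targets)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy rows of (likes, comments, shares); NOT data from the study.
rows = [(120, 10, 5), (300, 25, 12), (50, 2, 1), (900, 80, 40)]

# Assumption: the target is the direct sum of the interaction features.
engagement = [likes + comments + shares for likes, comments, shares in rows]

# A linear model with unit coefficients reproduces such a target exactly,
# so every error metric is zero and R^2 is 1.
pred_unit = [1.0 * l + 1.0 * c + 1.0 * s for l, c, s in rows]

# Slightly shrunk coefficients (0.99 is an illustrative value) stand in for
# what a regularized model might produce: small but nonzero errors.
pred_shrunk = [0.99 * (l + c + s) for l, c, s in rows]

for name, pred in [("unit coefficients", pred_unit), ("shrunk coefficients", pred_shrunk)]:
    print(name, mae(engagement, pred), rmse(engagement, pred),
          mape(engagement, pred), round(r2(engagement, pred), 3))
```

On this toy data the shrunk predictions have a nonzero MAE while R² still rounds to 1.000 at three decimals, which shows how the two can coexist when the error is small relative to the variance of the target.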

Published

2026-06-25

How to Cite

Sitohang, L., Sitohang, S., & Soetanto, H. (2026). A Comparative Analysis of Machine Learning Regression Models for TikTok User Engagement Prediction. Research Horizon, 6(3), 1375–1386. https://doi.org/10.54518/rh.6.3.2026.1166
