APPLICATION OF LOGISTIC REGRESSION FOR HEALTH INSURANCE PREMIUM CLASSIFICATION

Authors

  • Celia Christy Merlinda Tantono, President University
  • Aqilla Dheanya Lucetta, President University
  • Putri Felicia, President University
  • Edwin Setiawan Nugraha, President University

DOI:

https://doi.org/10.24269/ijhs.v10i1.12271

Abstract

This study applies logistic regression to individual health insurance data to classify premiums as high or low based on ten medical and demographic predictors. The continuous premium variable is transformed into a binary class (high or low premium), which allows the study to identify the factors that significantly influence premium pricing decisions. Logistic regression was selected for its ability to model binary outcomes and to estimate the probability that a customer belongs to the high-premium category. The Likelihood Ratio Test and the Wald test were used to assess the overall model fit and the significance of the individual predictors; they identified Age and Weight as significant predictors of premium classification. The final logistic regression model shows excellent predictive ability, with an area under the curve (AUC) of 0.97 and an accuracy of 95%. These results indicate that logistic regression can improve risk classification and support data-driven policy adjustments in insurance underwriting.
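The workflow summarized in the abstract (binarize the premium, fit a logistic regression, compare it with an intercept-only model through a likelihood ratio statistic, then report accuracy and AUC) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, only two of the ten predictors (age and weight) are simulated, the median is assumed as the high/low cut-off, the model is fitted by plain gradient ascent, and every function name is hypothetical. The Wald test is omitted for brevity, and the printed figures have no relation to the results reported above.

```python
import math
import random
import statistics


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def binarize(premiums):
    # Assumption: "high" (1) means strictly above the sample median.
    cut = statistics.median(premiums)
    return [1 if p > cut else 0 for p in premiums]


def standardize(col):
    # Centre and scale a predictor so gradient ascent behaves well.
    mu, sd = statistics.mean(col), statistics.pstdev(col)
    return [(v - mu) / sd for v in col]


def predict_proba(beta, X):
    # beta = [intercept, b1, b2, ...]; returns P(high premium) per row.
    return [sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi))) for xi in X]


def fit_logistic(X, y, lr=0.5, epochs=500):
    # Batch gradient ascent on the mean log-likelihood.
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi, pi in zip(X, y, predict_proba(beta, X)):
            err = yi - pi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def log_likelihood(y, p):
    eps = 1e-12  # guards against log(0)
    return sum(yi * math.log(pi + eps) + (1 - yi) * math.log(1 - pi + eps)
               for yi, pi in zip(y, p))


def auc(y, scores):
    # Share of (positive, negative) pairs ranked correctly; ties count 0.5.
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic data (assumption): premium rises with age and weight plus noise.
rng = random.Random(0)
age = [rng.uniform(18, 66) for _ in range(300)]
weight = [rng.uniform(50, 120) for _ in range(300)]
premium = [200 * a + 50 * w + rng.gauss(0, 1500) for a, w in zip(age, weight)]

y = binarize(premium)
X = list(zip(standardize(age), standardize(weight)))
beta = fit_logistic(X, y)
p_hat = predict_proba(beta, X)

# Likelihood ratio statistic against the intercept-only model.
p_null = [sum(y) / len(y)] * len(y)
lrt = 2 * (log_likelihood(y, p_hat) - log_likelihood(y, p_null))

accuracy = sum((p >= 0.5) == (yi == 1) for p, yi in zip(p_hat, y)) / len(y)
auc_value = auc(y, p_hat)
print(f"LRT = {lrt:.1f}, accuracy = {accuracy:.2f}, AUC = {auc_value:.2f}")
```

Under the null hypothesis that no predictor matters, the likelihood ratio statistic is approximately chi-square distributed with degrees of freedom equal to the number of predictors (two in this sketch), so a large value indicates that the fitted model improves on the intercept-only model. In practice, an established statistics library would normally be used for the fit, and accuracy and AUC would be evaluated on held-out data rather than on the training sample.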

Published

2026-03-25

Section

Articles