Application of Poisson and negative binomial models to estimate the frequency of insurance claims



Generalized Linear Models (GLMs) are a modeling approach that accommodates nonlinear relationships and non-Gaussian distributions of residuals. This approach is very useful in general insurance analysis, where the distributions of claim frequency and claim amount are usually non-Gaussian. This article discusses the application of Poisson and Negative Binomial models to estimate the frequency of auto insurance claims. The accuracy of the models was compared, using R software, to choose the best model for determining pure insurance premiums. The data are two secondary datasets: the motor vehicle insurance dataset from Sweden named dataOhlsson and the motor vehicle dataset from Australia named ausprivauto0405. The exploration shows that the Poisson and Negative Binomial GLMs are both suitable for estimating the number of claims in the dataOhlsson dataset, where the two models give relatively similar parameter estimates and similar AIC and BIC values. Neither model, however, is suitable for estimating the number of claims in the ausprivauto0405 dataset. Further investigation using different models is needed to determine which model is more appropriate for estimating the frequency of insurance claims.
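
For readers unfamiliar with the two count models, the block below sketches a standard way of writing them and the two information criteria mentioned above. The abstract does not state the link function, the offset, or the Negative Binomial parameterization, so the log link, the exposure offset $e_i$, and the quadratic variance form with dispersion $\theta$ are assumptions made here for illustration, not the authors' confirmed specification.

```latex
% Assumed specification: log link with an exposure offset (not stated in the abstract).
% N_i: claim count for policy i; e_i: exposure; x_i: rating factors; \beta: coefficients.
\begin{align*}
\text{Poisson:} \quad & N_i \sim \mathrm{Poisson}(\mu_i), & \operatorname{Var}(N_i) &= \mu_i, \\
\text{Negative Binomial:} \quad & N_i \sim \mathrm{NB}(\mu_i, \theta), & \operatorname{Var}(N_i) &= \mu_i + \mu_i^2/\theta, \\
\text{both:} \quad & \log \mu_i = \log e_i + x_i^{\top}\beta.
\end{align*}
% Information criteria, with \hat{\ell} the maximized log-likelihood,
% k the number of estimated parameters, and n the number of observations.
\begin{align*}
\mathrm{AIC} &= -2\hat{\ell} + 2k, & \mathrm{BIC} &= -2\hat{\ell} + k \log n.
\end{align*}
```

Under this parameterization the Negative Binomial model approaches the Poisson model as $\theta \to \infty$, so closely matching parameter estimates and AIC/BIC values for the two models would be consistent with little overdispersion in the data.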


GLMs, frequency of claims, Poisson model, Negative Binomial model




DOI: 10.24815/jn.v23i1.26623

