Actuary, Statistics, Insurance, Data Science, Models, Claims, Severity, Automobiles


Insurance companies assess the risk of financial loss for their policyholders in order to price insurance policies accurately. Within the automobile insurance sector, the frequency of crashes and the associated liabilities began to increase in late 2013, after declining for close to a decade. This research examines variables that may be correlated with this change and could lead to a better understanding of it.

To undertake this task, we partnered with the Society of Actuaries, the Casualty Actuarial Society, and the American Property Casualty Insurance Association to obtain data on frequency, severity, and loss costs. These organizations also have resources within the insurance sector to share our findings with a wider audience.

The project relies primarily on a random forest model and its associated variable importance plot. This approach allowed us to determine which variables are most important for auto coverage.
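As an illustration of the general technique only, the sketch below fits a random forest to synthetic data and ranks the impurity-based variable importances that an importance plot would display. It assumes scikit-learn and NumPy are available; the feature names, the data, and the choice of impurity-based importance are hypothetical stand-ins, not the variables, data, or exact procedure used in this project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data. The feature names and values are hypothetical,
# not the variables or data used in the study.
rng = np.random.default_rng(0)
feature_names = ["miles_driven", "unemployment_rate", "gas_price", "noise"]
X = rng.normal(size=(500, len(feature_names)))

# The hypothetical severity depends strongly on the first feature,
# weakly on the second, and not at all on the remaining two.
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Fit the forest; random_state makes the run repeatable.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, y)

# Impurity-based importances are non-negative and sum to 1.
importances = model.feature_importances_

# Rank the variables from most to least important, as a plot would.
ranked = sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)
for name, score in ranked:
    print(f"{name:20s} {score:.3f}")
```

In this toy setup the feature given the largest coefficient should rank first. With real data, impurity-based importances can favor high-cardinality variables, so a permutation-based measure is a common cross-check.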

Our team examined five coverages: bodily injury, collision, comprehensive, personal injury protection, and property damage. This thesis focuses exclusively on personal injury protection, and on severity in particular, since that was my main contribution to the project.

Physical and Mathematical Sciences