Author

Eric Shin

Date of Award

2024

First Advisor

Eric Kramer

Second Advisor

Jack Burkart

Abstract

This thesis investigates the application of Markov chains and logistic regression to predicting tennis match outcomes, with the aim of improving predictive methods in sports analytics. Sports predictions captivate fans and drive a multibillion-dollar industry, influencing gambling markets, player recruitment strategies, and sponsorships. As a sports enthusiast with a background in tennis and a keen interest in data science, I was motivated to merge these passions to improve predictive methodologies in sports. The Markov chain model simulates entire tennis tournaments, providing insights into potential tournament outcomes based on probabilistic transitions between match states. This model’s effectiveness is benchmarked against sportsbook futures to assess how accurately it reflects the dynamics of tournament progression. In contrast, logistic regression is used to predict individual match outcomes from a wide array of match-related variables. Applied to data from the US Open, the logistic regression model predicts match outcomes with 84% accuracy, surpassing the 70.6% benchmark achieved by simply predicting victories for higher-ranked players, an improvement of 13.4 percentage points. This thesis underscores the logistic regression model’s superiority in capturing the complex factors influencing match results, offering valuable insights for enhancing competitive strategies and player performance. This comparative analysis contributes to sports analytics by elucidating the potential and constraints of both models, paving the way for future research to refine predictive analytics in tennis and beyond.
