
An Analysis of Car and Driver Impact on Formula 1 Success - Part 1


One of the most hotly debated topics in Formula 1 is who deserves more of the plaudits for success: the car or the driver. Opinions span the entire spectrum, from those who believe that the best car would win even with the worst driver to those who think that the best driver can single-handedly drag a middling car to greatness.

As a Formula 1 fan, I have always been intrigued by this question. The goal of my analysis is to take a quantitative look at this age-old question and lay a foundation for further empirical research into the topic. This post is adapted from a project that I worked on for a college Statistics for Data Science class.

In this project, I looked at the last 30 years of Formula 1 data, spanning 1994 to 2023. I considered a range of statistics that I thought would be insightful and eliminated metrics that were redundant or offered little room for exploration. Once I had settled on the most useful metrics, I narrowed the scope of the data. For each season from 1994 to 2023, I collected data for the winner of the Drivers’ Championship, the runner-up, and the best teammate of each. In seasons where the runner-up was the champion’s teammate, I instead used the third-place finisher and their teammate. Because Formula 1 seats are transient, drivers lower down the order are often replaced mid-season, which could skew the data. Hence, I limited myself to drivers who finished near the top of the championship, who tend to have more stability in team and driver selection throughout the season.
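The selection rule described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the project: the `Standing` record, the `best_teammate` and `select_drivers` functions, and the sample standings are all hypothetical, and the sketch assumes that every selected driver has at least one teammate in the standings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Standing:
    position: int  # final Drivers' Championship position (1 = champion)
    driver: str
    team: str


def best_teammate(standings, driver):
    """Return the highest-placed other driver on the same team."""
    mates = [s for s in standings
             if s.team == driver.team and s.driver != driver.driver]
    return min(mates, key=lambda s: s.position)


def select_drivers(standings):
    """Pick the champion, a rival from another team, and each one's best teammate."""
    ordered = sorted(standings, key=lambda s: s.position)
    champion = ordered[0]
    # The rival is the runner-up, unless the runner-up drives for the
    # champion's team; then the next-best driver from another team is used.
    rival = next(s for s in ordered[1:] if s.team != champion.team)
    return [champion, best_teammate(standings, champion),
            rival, best_teammate(standings, rival)]


# Hypothetical standings (placeholder names, not real results).
example = [
    Standing(1, "Driver A", "Team X"),
    Standing(2, "Driver B", "Team X"),
    Standing(3, "Driver C", "Team Y"),
    Standing(4, "Driver D", "Team Z"),
    Standing(5, "Driver E", "Team Y"),
]
for s in select_drivers(example):
    print(s.position, s.driver, s.team)
```

In this made-up season the runner-up is the champion's teammate, so the sketch falls through to the third-place driver and that driver's teammate, which mirrors the exception described above.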

Preliminary Variable Analysis

Before starting an in-depth analysis, I felt it was important to look at the individual variables and identify trends in the data, so that I could connect the results of the later analysis to the domain-specific knowledge I already had. The variables that I included in my data were:

1. Position: This is the place that the driver secured in the final Formula 1 Drivers’ Championship standings for the season in question. It is a ranking, with lower numbers being more prestigious and 1 indicating that the driver won the Drivers’ Championship that season.

2. Team: I created this variable to make the data easier to sort and analyze. Given that there were always only two teams per season in my data, I assigned 1 to the team of that season’s Championship winner and 2 to the runner-up’s team. The second driver of each team was assigned 1 or 2 according to the team they raced for that season.

3. Points: This is the total number of points that a driver scored through points-paying finishes over the course of the season. Points are the metric by which the Drivers’ Championship is decided, so I made them a key part of my analysis. Because Formula 1 has used several scoring systems over the years, I also had to introduce another variable to account for the changes in system.

4. Wins: This shows how many wins each driver had in that specific season.

5. Win Percentage: This statistic is the percentage of that season’s races that the driver won. It is calculated by dividing the number of races won by the total number of races held that year and multiplying by 100.

6. Poles: This shows how many pole positions each driver had in that specific season. A pole position means that the driver starts the race at the front of the grid.

7. Pole Percentage: This statistic is the percentage of that season’s races for which the driver took pole position. It is calculated by dividing the number of pole positions by the total number of races held that year and multiplying by 100.

8. Point System: I introduced this variable to explain some of the drastic fluctuations in points scoring. Over the past thirty years of Formula 1, there have been a few minor changes to the scoring system and one major change that had a large impact on points totals. This variable labels each season with the scoring system in effect (Systems 1 to 4) to correct for the sudden change in scores.
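As a quick illustration of the Win Percentage and Pole Percentage definitions in the list above, here is a minimal Python sketch. The `percentage` helper and the sample numbers are hypothetical and are not taken from the project's data.

```python
def percentage(count, races):
    """Return count as a percentage of the races held that season."""
    if races <= 0:
        raise ValueError("races must be a positive number")
    return 100 * count / races


# Hypothetical season with 20 races, 5 wins and 8 pole positions.
win_pct = percentage(5, 20)   # 25.0
pole_pct = percentage(8, 20)  # 40.0
print(win_pct, pole_pct)
```

Expressing wins and poles as a share of the races held makes seasons of different lengths comparable, which raw counts do not.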

 

After finalizing the variables, I plotted several graphs to identify macro trends within the data.





This graph shows the distribution of points scored for all drivers in the data set. There are a few takeaways. Firstly, the graph roughly resembles a bell curve that is heavily left-skewed. Secondly, there is a big jump in the points scored by drivers between 2009 and 2010, when the new scoring rules came into effect. Those rules raised the value of a win from 10 points to 25 and increased the points for the other scoring positions as well, and points totals have trended upward ever since. Finally, the seasons run under the systems that awarded 10 points for a win (Systems 1 and 2) are clustered more closely together than the seasons that awarded 25 points for a win (Systems 3 and 4), which suggests a higher level of competition under Systems 1 and 2.

 


The next graph shows the points totals of only the Championship winners in each season. Most of the trends from the earlier graph also hold here, but one pattern stands out. Under scoring system 2, which was in effect from 2003 to 2009, the points scored by Championship winners decreased almost every season. Under every other scoring system, the number of points required to win a championship followed a general upward trend. That makes the 2003-2009 seasons stand out even more, as there is no obvious reason why the points totals kept decreasing.


 

Next, I looked at the pole positions and wins secured by the championship winner in each season. I thought this would be an interesting area to explore, as Formula 1 teams sometimes sacrifice qualifying pace to set up their car for the race. For the most part, the champion’s pole positions and race wins followed roughly the same path, yet there were only three seasons in the data where the championship winner had exactly the same number of pole positions and wins.


                                        


This graph is similar to the graph of pole positions and wins above, except that it shows the data in percentage form. One thing I noticed was that there were quite a few seasons where the championship winner took fewer than 50 percent of the pole positions and fewer than 50 percent of the wins. This happened more often in the earlier seasons covered by the data, likely because cars were not as reliable then as they are in more recent Formula 1 seasons.
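The 50 percent observation can be checked mechanically once the champion's wins, poles and race counts are in a table. The sketch below is a rough illustration under assumed inputs: the `seasons` dictionary holds made-up placeholder numbers, and `below_half_in_both` is a hypothetical helper, not part of the original analysis.

```python
# Hypothetical champion seasons as (races, wins, poles). Not real results.
seasons = {
    "Season A": (16, 9, 6),
    "Season B": (17, 7, 8),
    "Season C": (20, 11, 12),
}


def below_half_in_both(races, wins, poles):
    """True when the champion took under 50% of the wins and under 50% of the poles."""
    # Comparing 2 * count with races avoids floating-point division.
    return 2 * wins < races and 2 * poles < races


flagged = [name for name, (races, wins, poles) in seasons.items()
           if below_half_in_both(races, wins, poles)]
print(flagged)
```

With these placeholder numbers only "Season B" is flagged, since it is the only season where both the win share and the pole share fall below half.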


                                     


The last thing I looked at was the year-over-year championship position of the winner’s and the runner-up’s teammates. For the most part, the winner’s teammate outperformed the runner-up’s teammate. Comparing how often each reached the highest position available to them, the winner’s teammate finished second in the championship 9 times, while the runner-up’s teammate finished third 6 times.

In part 2 of this project, I will pose three research questions and test them using various statistical analyses.
