
Guess Who? (A lesson in Cryptography)

 


In the last few years, there has been a growing shift towards the analytics side of basketball. Formerly unknown analytics maestros like Daryl Morey are now becoming more recognized, and their principles are being used to build teams that can contend for championships.

A core tenet of sports analytics is stripping away the bias that is inherently present in humans, so that players are evaluated on what they actually produce. Though the sports analytics movement has gained traction in NBA circles, many fans still haven’t embraced it, which keeps them from understanding certain trades and signings made by their favorite teams.

One way to eliminate bias without the benefit of supercomputers and reams of data is to use cryptography to encrypt players’ names. This ensures that players are evaluated based only on their stats.

Cryptography involves encoding plain text to make it secure (encryption) and decoding the coded text to retrieve the plain text (decryption). Both processes are carried out using keys.

There are primarily two types of encryption: symmetric key and asymmetric key encryption.

Symmetric key encryption involves the use of the same key for both encryption and decryption. Anyone who learns the key can decrypt the text, so the key has to be shared secretly between sender and recipient, which is the scheme's main vulnerability. One example of symmetric key encryption is the Advanced Encryption Standard (AES).
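To make the "same key both ways" idea concrete, here is a minimal sketch. It is not AES: it swaps in a toy XOR cipher with a random one-time key, chosen only because it fits in a few lines of standard-library Python. The helper name `xor_bytes` is made up for this illustration.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (toy cipher, NOT AES)."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


plain_text = b"Kawhi Leonard"

# The single shared secret: sender and recipient must both hold this key.
shared_key = secrets.token_bytes(len(plain_text))

coded_text = xor_bytes(plain_text, shared_key)  # encrypt
recovered = xor_bytes(coded_text, shared_key)   # decrypt with the SAME key

print(coded_text.hex())    # different on every run, because the key is random
print(recovered.decode())
```

Anyone who gets hold of `shared_key` can run the same function and read the name, which is exactly the weakness described above.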

Asymmetric key encryption involves the use of two keys: a public key to encrypt the plain text and a private key to decrypt the coded text. Because the decryption key never has to be shared, asymmetric key encryption avoids the key-sharing weakness of symmetric key encryption, but it also takes more time and computing power, due to the magnitude of the numbers involved. One example of asymmetric key encryption is the RSA cryptosystem.


The RSA cryptosystem was invented by Ron Rivest, Adi Shamir, and Leonard Adleman in 1977. The name RSA comes from the first letters of the inventors’ last names. The public and private keys are derived from two large prime numbers, whose product is published as part of the public key. Though the formulae to generate the keys are common knowledge, recovering the private key from the public key is difficult because of the time and computing power required to factorize the product of two large prime numbers.
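The key-generation formulae can be sketched with deliberately tiny primes. This is textbook RSA for illustration only; the values of `p`, `q`, `e`, and `message` are assumptions picked to keep the arithmetic readable, and real keys use primes hundreds of digits long. The modular inverse via `pow(e, -1, phi)` needs Python 3.8 or later.

```python
from math import gcd

# Two (tiny) primes; in practice these are kept secret and are enormous.
p, q = 61, 53

n = p * q                  # modulus, published as part of both keys
phi = (p - 1) * (q - 1)    # only computable if you know p and q

e = 17                     # public exponent, must share no factor with phi
assert gcd(e, phi) == 1

d = pow(e, -1, phi)        # private exponent: (e * d) % phi == 1

public_key = (n, e)
private_key = (n, d)

message = 65                       # a number smaller than n
cipher = pow(message, e, n)        # encrypt with the public key
decoded = pow(cipher, d, n)        # decrypt with the private key

print(public_key)    # (3233, 17)
print(private_key)   # (3233, 2753)
print(cipher)        # 2790
print(decoded)       # 65
```

Here anyone could factor 3233 back into 61 and 53 and recompute `d`. With primes of realistic size, that factoring step is what becomes impractical.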

In the RSA system, every participant has two keys, public and private. As the names suggest, a participant’s private key is known only to that participant, while their public key is known to all participants. The sender uses the recipient’s public key to encrypt the plain text, and the recipient uses their private key to decrypt the encrypted text. As the recipient’s private key is known only to the recipient, no one else can easily decrypt the message.

Using the RSA module in a simple Python program, I generated public and private keys. I then encrypted a plain text, the name Kawhi Leonard, into a sequence of bytes. The public key used to encrypt the plain text is this:



The plain text, Kawhi Leonard, when encrypted with the public key, looks something like this:



To decrypt the bytes and recover the plain text, I use the private key. Of course, I am not publishing the private key here, as it is supposed to be private.
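The encrypt-then-decrypt round trip for a name can be sketched in plain Python. This is not the RSA module used for the outputs in this post: it is unpadded textbook RSA, and the primes (two well-known Mersenne primes), the exponent, and the function names `encrypt_name` and `decrypt_name` are all assumptions made for the illustration. It should not be used to protect real data.

```python
# Toy primes: both are known Mersenne primes, so they are NOT secret.
# Real keys use randomly generated primes of 1024 bits or more.
P = 2**61 - 1
Q = 2**89 - 1

N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)


def encrypt_name(name: str, n: int = N, e: int = E) -> int:
    """Turn the name into one big integer and encrypt it with the public key."""
    m = int.from_bytes(name.encode("utf-8"), "big")
    if m >= n:
        raise ValueError("name is too long for this toy modulus")
    return pow(m, e, n)


def decrypt_name(cipher: int, n: int = N, d: int = D) -> str:
    """Decrypt with the private key and turn the integer back into text."""
    m = pow(cipher, d, n)
    return m.to_bytes((m.bit_length() + 7) // 8, "big").decode("utf-8")


cipher = encrypt_name("Kawhi Leonard")
print(cipher)                 # a large integer standing in for the name
print(decrypt_name(cipher))   # Kawhi Leonard
```

One difference worth noting: this sketch always maps the same name to the same number, whereas library implementations typically add random padding before encrypting, so their output changes from run to run.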

In a previous article, I wrote about how the San Antonio Spurs, using a linear regression model, can aim for another NBA championship. The table in that article listed combinations of the Spurs’ starting five and their predicted win-coefficients. Encrypting the players’ names will give us this table:

For reference, this is the table with the players’ names:

If the encrypted text is too large to fit on a page, or even to store efficiently, we can instead use Base64 encoding, which produces coded text much shorter than the RSA-encrypted text (only about a third longer than the original name). This is space friendly, but Base64 uses no key, so anyone can reverse it, and it provides no real security.

For example, encoding the text Kawhi Leonard via Base64 results in the encoded text:



Using a Base64 decoder, we can get the plain text from the encoded text.
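For anyone who wants to try this, a minimal sketch using Python's standard `base64` module follows. The variable names are only illustrative.

```python
import base64

name = "Kawhi Leonard"

# Encode: text -> bytes -> Base64 bytes -> printable ASCII string
encoded = base64.b64encode(name.encode("utf-8")).decode("ascii")

# Decode: no key needed, which is why this offers no security
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)   # S2F3aGkgTGVvbmFyZA==
print(decoded)   # Kawhi Leonard
```

The encoded form is 20 characters for a 13-character name, far shorter than an RSA ciphertext, but anyone with a Base64 decoder can read it.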

Thus, using cryptography, we can strip out the bias that is inherent in humans, allowing for a more objective evaluation of players’ talents, and maybe even learn a thing or two about coding, a very interesting subject in its own right!
