Introduction
Over the weekend, I attended my first hackathon! As it happens, it was focused on generative AI. As a team made up entirely of rookies, my group quickly built a project around our shared interests at the intersection of data analytics and basketball. The goal was to use LLMs to generate scouting reports that help basketball teams gain better insights and improve their strategies. You can find our repository here.
Data Visualizations
Our first goal was to build data visualizations that give coaches a clear way to examine their team’s strengths and weaknesses. We stumbled upon the spider chart through a TikTok video, felt it clearly showed the patterns in playstyle we were hoping for, and decided to use it in our scouting reports.
After deciding on our approach, we found an NBA dataset on Kaggle to use for our data visualizations. We used Plotly to build interactive graphs that allow users to compare players and teams from a single year, a range of years, or all years.
Here is an example comparing players from two different eras.
Or teams from two different eras.
And current teams.
Retrieval-augmented generation (RAG)
To generate the scouting reports, we used RAG. A blog post will follow about how we used RAG, but, in short, we fed the model examples of NBA scouting reports, NBA articles, and player data so that it could present the most accurate findings in a scouting report structure.
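To give a rough feel for the "retrieve, then generate" idea, here is a toy sketch of the retrieval half. It is not our actual pipeline: it assumes a simple bag-of-words cosine similarity in place of a real embedding model, the documents are placeholder snippets, and the function names (tokenize, cosine, retrieve, build_prompt) are hypothetical. The resulting prompt would then be sent to whichever LLM you are using.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Put the retrieved snippets in front of the request as context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Use only the context below to write a scouting report.\n\n"
        f"Context:\n{context}\n\nRequest: {query}"
    )


# Placeholder snippets standing in for scouting reports and articles.
DOCS = [
    "Scouting report: Team A pushes the pace in transition and shoots many corner threes.",
    "Article: Team B relies on post play and offensive rebounds.",
    "Recipe: how to bake sourdough bread at home.",
]

prompt = build_prompt("scouting report for Team A transition offense", DOCS, k=1)
print(prompt)
```

A production setup would usually swap the word-count similarity for embeddings and a vector store, but the overall shape (score documents against the request, keep the top few, prepend them to the prompt) stays the same.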
Overall, we were pleased with the results, but we noticed inconsistencies in the responses: the model often referenced players who were no longer on the team, despite being prompted to analyze a team from a specific year. I believe this could be due to older articles in either its training data or the articles we provided. In the future, I’d like to look into feeding the model data in the form of CSVs and providing it only ESPN articles written this year.
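One way the CSV idea might look is to filter the player data down to a single team and season before it ever reaches the model, so the prompt only contains that year's roster. This is an untested idea rather than something we built: the column names (season, team, player, points_per_game), the sample rows, and the helpers roster_for and roster_context are all hypothetical.

```python
import csv
import io

# Hypothetical CSV layout and made-up rows, for illustration only.
SAMPLE_CSV = """season,team,player,points_per_game
2015,Team A,Player One,21.3
2015,Team A,Player Two,14.8
2023,Team A,Player Three,25.1
2023,Team B,Player Four,18.0
"""


def roster_for(csv_text, team, season):
    """Return only the rows for the given team and season."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if row["team"] == team and int(row["season"]) == season
    ]


def roster_context(rows):
    """Render the filtered rows back into a small CSV snippet for the prompt."""
    lines = [f'{r["player"]},{r["points_per_game"]}' for r in rows]
    return "\n".join(["player,points_per_game", *lines])


rows = roster_for(SAMPLE_CSV, "Team A", 2023)
prompt = (
    "Only mention players listed in the roster below.\n\n"
    f"Roster for Team A, 2023 season:\n{roster_context(rows)}"
)
print(prompt)
```

Filtering up front does not stop the model from recalling older rosters from its training data, but it at least removes stale players from the context we control.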
Here is an example scouting report output:
The Interface
As shown below, the user enters the teams or players they would like to compare and specifies the years they would like to analyze.
Conclusion
I had a great time at my first hackathon exploring the world of generative AI through sports. In the future, I’d like to further explore LangChain as well as RAG. I’d also like to tweak my data and supply current articles and CSVs of data to produce more accurate scouting reports. Finally, I’d like to see if models like LLaVA can provide insights given the spider charts.