Megan Dernberger, Chaitu Konjeti, Taylor Smith, and Casey Van Kaer
Abstract
The SportVU camera system is an advanced optical recognition system with the
capability to record massive amounts of information, allowing for the tracking of player and ball
movement. Data is recorded 25 times per second and stored as an XML file that can be accessed
for further analysis. However, the enormous amount of data collected often has to be sent offsite
to be processed, which is expensive and time consuming. Software was developed, using Python
3.5 in the Integrated Development Environment (IDE) Spyder, to accelerate the processing of the
large amount of data and provide instantaneous results. The software generates both 2-
dimensional and 3-dimensional animations, which show instant playbacks of specified time
frames of the basketball game and can be used for visual analysis. Additionally, an individual
player distance analysis was generated, which allows fatigue rates and possible correlations
between players to be examined. Shot maps for individual players and entire teams can also be developed, which
provides a more efficient way to track all attempted shots and compare shot accuracy rates. Not
only does the software expedite the data processing, it also provides more efficient and impactful
results for the teams while cutting back on necessary expense and time.
Introduction
Big data research, the analysis of large data sets to reveal patterns, has seen a surge
of interest within the last decade [2][3]. The proliferation of computers and large-scale data
collection has required the development of algorithms and software to analyze data. Big data
research is used within several industries such as banking, healthcare, government, and education
[1]. Algorithms are used in these industries to provide insight that can boost quality and output,
minimize waste, risk, and fraud, and can help implement better systems.
Big data research can also be applied to sports as a method of understanding
the strengths and weaknesses of a team, and it can help to maximize success and minimize
in-game errors [5]. Before advanced technology like SportVU, plays and game statistics were
traditionally analyzed with pencil and paper. This method was moderately successful at recording
data, but left room for user error and was highly inefficient. However, current technologies can
be utilized to perform these tasks in a significantly shorter time period with efficient algorithms.
Big data analysis is currently used to evaluate the efficiency of key plays, influence
future coaching decisions, and track off-the-ball events. Algorithms have been developed that
analyze the acceleration of a player during key plays in a game to assess its relationship to the
outcome of that play [6]. Traditional statistics only offer after-the-fact information that is often
generalized and unusable, at least for immediate use. Companies such as STATS, LLC have
developed equipment, such as SportVU, that utilizes missile tracking technology to provide
accurate data regarding the players' activity in an immediately usable form [4].
The SportVU camera system is an advanced optical recognition system with the
capability to record various aspects of sports games, especially basketball games. The system
includes six cameras, one over each basket and two along the sideline of each half, that are able to
record the game clock, free throws, two-dimensional coordinates of each player and referee, and
three-dimensional coordinates of the ball [4]. It records and saves the coordinates twenty-five
times per second, resulting in massive amounts of data being generated and stored [4]. The data
is stored in an XML format, a file format that allows for data to be stored and transported. This
file format requires the use of software to access or visualize the desired data.
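As a minimal sketch of what accessing such a file involves, the example below parses a small XML string with Python's standard library. The element and attribute names (`moment`, `player`, `ball`, `clock`, `team`, `id`, `x`, `y`, `z`) are hypothetical stand-ins, since the actual SportVU schema is not described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout (not the real SportVU schema): one <moment> per sample,
# holding 2D player positions and a 3D ball position.
SAMPLE_XML = """
<game>
  <moment clock="1156.00">
    <player team="home" id="12" x="10.0" y="20.0"/>
    <player team="away" id="3" x="40.5" y="25.0"/>
    <ball x="11.0" y="20.5" z="4.2"/>
  </moment>
  <moment clock="1155.96">
    <player team="home" id="12" x="10.3" y="20.4"/>
    <player team="away" id="3" x="40.1" y="25.2"/>
    <ball x="11.4" y="20.9" z="3.1"/>
  </moment>
</game>
"""


def parse_moments(xml_text):
    """Return one dict per sample with the clock, player list, and ball position."""
    root = ET.fromstring(xml_text.strip())
    frames = []
    for moment in root.iter("moment"):
        players = [
            {
                "team": p.get("team"),
                "id": p.get("id"),
                "x": float(p.get("x")),
                "y": float(p.get("y")),
            }
            for p in moment.findall("player")
        ]
        ball_el = moment.find("ball")
        ball = tuple(float(ball_el.get(axis)) for axis in ("x", "y", "z"))
        frames.append(
            {"clock": float(moment.get("clock")), "players": players, "ball": ball}
        )
    return frames


frames = parse_moments(SAMPLE_XML)
print(len(frames), frames[0]["ball"])
```

Keeping each sample as a plain dictionary is only one possible design; it makes the later steps (animation, distance, shot detection) simple loops over a list.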
This project focuses on creating software that allows access to the game data,
generation of certain metrics, and visualization of the data. This paper presents software that
generates animations for visual analysis of specific time segments of the game. Through this
software, the distance traveled by individual players can also be calculated, with possible
implications such as fatigue rates.
Furthermore, this software allows for the creation of individual player and full team shot maps
that differentiate between made and missed shots. From this, teams are able to analyze aspects
such as shot rates, shot positioning, and, potentially, defensive effectiveness while looking at
shot maps from opponents. This software is designed to be universal to all SportVU basketball
data files and serve as a basis for calculating other metrics, performing further analytics, and
generating additional figures for visualization of the basketball game in the future.
Methods
Algorithm Development
An XML file from a 2013 college basketball game between Duke and Florida Atlantic
was acquired. Python (versions 2.7 and 3.5) was used to write the algorithms, and Spyder, an
integrated development environment (IDE), was used to develop the algorithms and to design the
graphs and animations. The developed algorithms were designed to parse through the XML
output file and isolate usable portions of data. The algorithms further manipulated the data set to
create a two-dimensional and three-dimensional animation of the game (Figure 2 and 3), graph
the movement of the ball to determine number of dribbles (Figure 4), determine the total distance
traveled by each player (Figure 5), and create shot maps for each team and each player (Figure
6).
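One way the number of dribbles could be read from the ball's movement is to count the low points of its height over time. The sketch below is an assumption about how such a count might work, not the method used in this project. The function name `count_dribbles` and the 1.0 ft floor threshold are hypothetical.

```python
def count_dribbles(z_values, floor_threshold=1.0):
    """Count local minima of ball height (feet) at or below floor_threshold.

    Each such minimum is treated as one bounce off the floor.
    The threshold value is an illustrative assumption.
    """
    bounces = 0
    for prev, cur, nxt in zip(z_values, z_values[1:], z_values[2:]):
        # Strictly lower than the previous sample so a flat run counts once.
        if cur <= floor_threshold and cur < prev and cur <= nxt:
            bounces += 1
    return bounces


# Two bounces followed by the ball rising, as it would on a shot.
heights = [3.0, 1.8, 0.6, 1.9, 3.1, 1.7, 0.5, 1.8, 3.0, 6.5, 9.0]
print(count_dribbles(heights))
```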
Animation
Both a two-dimensional and a three-dimensional animation were generated using the matplotlib
Python library. Initially, a two-dimensional animation was created by parsing through the XML
file and obtaining the X- and Y-coordinates for each element on the court. These coordinates
were then plotted and animated in order to provide opportunity for visual analysis. A dynamic
legend, indicating each team and player using various colors and shapes, was implemented in
order to allow real-time player identification and to accommodate frequent substitutions during
the game. Another aspect incorporated into the animation was a game clock that allowed user
input to view and analyze specific sections of the game. In an attempt to better view the
movement of the ball throughout the game and to provide data for further analysis, the
Z-coordinate of the basketball was incorporated into the animation to provide a dynamic
three-dimensional projection of the game.
This three-dimensional animation still incorporated all the aspects that were developed for the
two-dimensional animation. Additionally, a function was created that allows the user to
visualize the path of the ball during times specified by user input.
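Selecting the frames for a user-specified stretch of the game clock can be sketched as follows. This is an illustrative assumption about the data layout: each frame is a dictionary with a `clock` value in seconds remaining in the half, and the helper names `clock_to_seconds` and `frames_in_window` are hypothetical.

```python
def clock_to_seconds(clock):
    """Convert an 'MM:SS' game-clock string to seconds remaining in the half."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + float(seconds)


def frames_in_window(frames, start_clock, end_clock):
    """Return the frames between two clock readings, inclusive.

    The game clock counts down, so start_clock is the larger reading.
    """
    start = clock_to_seconds(start_clock)
    end = clock_to_seconds(end_clock)
    return [f for f in frames if end <= f["clock"] <= start]


# Synthetic frames sampled 25 times per second, counting down from 19:20.
all_frames = [{"clock": (29000 - i) / 25} for i in range(500)]
selected = frames_in_window(all_frames, "19:16", "19:14")
print(len(selected))
```

The selected frames could then be passed to whatever routine draws each frame, so that only the requested segment is animated.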
Distance
The distance algorithm was developed by calculating the distance between the X- and
Y-coordinates of each player at consecutive recorded time points using the formula
$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$. These distances were placed
into a list and, eventually, added together to create a total distance for the game. Graphs were
created for user visualization of the distance traveled during the game.
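The summation of point-to-point distances described above can be sketched in a few lines. The function name `total_distance` and the sample track are illustrative assumptions; positions are taken to be (x, y) pairs in feet, in time order.

```python
import math


def total_distance(positions):
    """Sum the straight-line distances between consecutive (x, y) samples."""
    steps = [
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(positions, positions[1:])
    ]
    return sum(steps)


# A player moves 5 ft diagonally, then 6 ft along the y-axis.
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
print(total_distance(track))  # 11.0
```

Because every small tracking error adds to the sum, a glitch in a single sample can inflate the total noticeably, so some filtering of implausible jumps may be worth considering.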
Shot Map
A shot map is a figure that depicts the locations from which a player took shots and
indicates whether each shot was made or missed. Shot maps were created using ball and player
coordinates in conjunction with one another. The algorithm determined if, during a specified
amount of time, a shot occurred. To do this, the algorithm identified the time frames at which
the ball was above a player's head at a height that typically indicated a shot, between 7.25 and
7.75 feet above the court. The Z-coordinates were used to determine the height of the ball. In
order to further ensure the precision of the algorithm, the ball's terminal location had to be close
to the rim. Once a shot was detected, the algorithm determined when the ball first left the
player's hand and then recorded this location as the place where the shot was taken. This
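The shot test described in this section can be approximated with a short sketch. It is a simplified reading of the description, not the project's actual algorithm. The release location is taken as the ball's position when it first enters the 7.25 to 7.75 ft band, and the rim coordinates, the 2 ft rim radius, and the 1 ft height tolerance are hypothetical values.

```python
import math

RIM_HEIGHT_FT = 10.0            # regulation rim height
RELEASE_BAND_FT = (7.25, 7.75)  # height band given in the text


def detect_shot(ball_track, rim_xy, rim_radius_ft=2.0):
    """Return the (x, y) release location if the track looks like a shot, else None.

    ball_track is a time-ordered list of (x, y, z) ball samples in feet.
    rim_xy and rim_radius_ft are assumed values for illustration.
    """
    release = None
    for x, y, z in ball_track:
        if release is None:
            # First sample inside the release band marks the candidate shot.
            if RELEASE_BAND_FT[0] <= z <= RELEASE_BAND_FT[1]:
                release = (x, y)
        else:
            # Confirm the shot only if the ball later arrives near the rim.
            near_rim = math.hypot(x - rim_xy[0], y - rim_xy[1]) <= rim_radius_ft
            if near_rim and z >= RIM_HEIGHT_FT - 1.0:
                return release
    return None


rim = (5.25, 25.0)  # assumed rim position in court coordinates
shot = [(20, 25, 6.0), (20, 25, 7.5), (18, 25, 11.0), (12, 25, 14.0),
        (7, 25, 12.0), (5.5, 25, 10.2)]
high_pass = [(20, 25, 6.0), (20, 25, 7.5), (25, 25, 7.6), (30, 25, 7.0)]
print(detect_shot(shot, rim), detect_shot(high_pass, rim))
```

At 25 samples per second a fast-rising ball can skip over a half-foot band between samples, so a practical version would likely test for crossing the band rather than for a sample inside it.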
Results
The 2-dimensional animation was developed to visualize the player and ball coordinates
that were collected from the XML file (Figure 2). This figure shows the game at 19:16 left in the
first half of the basketball game with the black team being Duke and the gray team being Florida
Atlantic. Building on the 2-dimensional animation, a 3-dimensional animation was
developed in order to depict a more dynamic view for further analysis of the basketball game
(Figure 3). This figure shows the game at 19:14 left in the first half with the black team being
Duke and the gray team being Florida Atlantic. The animation can be viewed from multiple
perspectives.
Figure 2: This 2-dimensional graph shows all locations of the home players, the away players, and the ball at 19:16 (1154.28
seconds) left in the first half of the Duke vs. Florida Atlantic game. The dynamic legend shows, in real time, the player IDs of
all players on the court, which are identified by color and shape. The black shapes are the home team, Duke, the gray shapes
are the away team, Florida Atlantic, and the light grey circle is the basketball.
Figure 3: This 3-dimensional graph shows all the locations of the home players, the away players, and the ball at 19:14 left in the
first half of the Duke vs. Florida Atlantic game. The dynamic legend shows, in real time, the player IDs of all players on the
court, which are identified by color and shape. The black shapes are the home team, Duke, the gray shapes are the away team,
Florida Atlantic, and the light grey circle is the basketball.
A graph showing the path of a ball during a series of dribbles, a pass, and a shot was
generated (Figure 4). The figure shows a 5 second time lapse, starting at 14:07 left in the first
half. The play starts with dribbles in the lower right corner, a pass occurs across the court, and
then a 3-point shot is made. A graphical representation of the distance data was created (Figure
5). This data is separated into the two teams as well as individual players. From this,
teams for a specified time frame. The generated maps show the attempted shots taken by each
team and whether each shot was made or missed, based on the color of the dot. A shot map
was created to show the locations of the shots, distinguishing between made and missed, from the
first half of the game (Figure 7). The official Duke Box Score shot map is shown for reference
(Figure 8) [7]. The developed algorithm detected 83 shots and the official Duke Box Score had 85
Figure 7: This shot map is the official Box Score for Duke vs. Florida Atlantic for the [7]. The filled in dots represent the made
shots whereas the open dots represent the missed shots. The dots on the left represent shots taken by the away team (Florida
Atlantic), and the dots on the right represent shots taken by the home team (Duke).
Discussion
Multiple analyses, including shot maps, distance graphs, and 3-dimensional representations, were
developed using Python. These metrics can be used by coaches in order to improve
strategy and minimize avoidable mistakes on the court. The developed metrics were shown to be
fairly accurate when compared to the official Duke Box Score with regard to made and missed
shots and location of shots. The 2-dimensional and 3-dimensional animations of the game, when
compared to the actual footage, were accurate representations except for time frames where the
cameras were unable to locate players or the ball on the court, leading to a glitch. Furthermore, all
desired aspects of the game, such as the game clock, lines on the court, and the dynamic legend, were
successfully plotted. Sports metrics, such as the ones in this project, are becoming a valuable
commodity [8]. Teams are now using big data and computer algorithms to analyze their teams
Some limitations that were encountered throughout this project were hardware
capabilities and glitches in the data due to camera problems. A lack of hardware and computing
power drastically reduced the speed and efficiency of the algorithms. The SportVU camera
occasionally had problems locating certain players or the ball, leading to minor glitches in the
data points. These glitches pose obstacles when trying to calculate the actual distance traveled by a
player, since some time points are not accurate. There were also limited data files to analyze, so
The software that was developed during this study could serve as a basis for numerous
future investigations. One such possibility would be to create an algorithm that shows real-time
points for both teams, not just a sum of points at the end of the specified time frame. This
would open up the opportunity for finding correlations between previously created algorithms
and the success of the play or set of plays. Another possible idea for future projects is to find a
correlation between the total distance that players run and the outcome of the game. Later projects could
include development of algorithms that can calculate percentages of made or missed contested
shots, defensive efficiency, number of dribbles a player takes during a game, and efficiency of
the interactions between players. The metrics can be used to visualize and understand each team
Conclusion
The development of advanced basketball metrics using the SportVU camera system is an
extension of big data analysis into sports. Big data analysis is used in a variety of fields, such as
accounting, banking, and education, and has recently been introduced into the realm of sports
statistics. Traditionally, statistics were recorded by hand by a trained observer. This led to high
rates of human error and, ultimately, faulty statistics. In recent decades, equipment such as
SportVU has been designed to correct this flaw. SportVU tracks and records the position of
each player, referee, and the ball during a basketball game. This data can then be stored and
analyzed. In this study, data gathered from a SportVU camera system during a Duke
basketball game in 2013 was animated for visual analysis and analyzed using several
different algorithms. These algorithms measured aspects of the game such as individual player
distance traveled and generated shot maps denoting made or missed shots. These measurements,
figures, and animations can potentially be used to improve a team's tactics and/or personnel,
References
[1] Home, "What is big data and why it matters," 2016. [Online]. Available:
http://www.sas.com/en_us/insights/big-data/what-is-big-data.html. Accessed: Nov. 10, 2016.
[2] W. Raghupathi and V. Raghupathi, "Big data analytics in healthcare: Promise and potential," Health
Information Science and Systems, vol. 2, no. 1, p. 3, 2014.
[3] K. Kambatla, "Trends in big data analytics." [Online]. Available:
http://barbie.uta.edu/~hdfeng/bigdata/Papers/Trends%20in%20big%20data%20analytics.pdf. Accessed: Nov.
10, 2016.
[4] "STATS SPORTVU WORKFLOWS." [Online]. Available:
http://docs.stats.com/Webinars/STATS_Webinar_SportVU.pdf. Accessed: Nov. 10, 2016.
[5] D. Hanchett, "Playing Hardball with BIG DATA," 2012. [Online]. Available:
http://www.emc.com/collateral/article/137534-sports-analysis.pdf. Accessed: Nov. 10, 2016.
[6] P. Maymin, "Acceleration in the NBA: Towards an Algorithmic Taxonomy of Basketball Plays."
[Online]. Available: http://www.sloansportsconference.com/wp-
content/uploads/2013/Acceleration%20in%20the%20NBA%20Towards%20an%20Algorithmic%20Taxonom
y%20of%20Basketball%20Plays.pdf. Accessed: Nov. 10, 2016.
[7] Florida Atlantic vs. Duke Shot Chart. 2013 [Online]. Available: http://www.espn.com/mens-college-
basketball/playbyplay?gameId=400499243. Accessed: Nov. 10, 2016
[8] F. Erčulj and E. Štrumbelj, "Basketball Shot Types and Shot Success in Different Levels of Competitive
Basketball," PLoS ONE, vol. 10, no. 6, Mar. 2015.
[9] S. J. Ibáñez, J. Sampaio, S. Feu, A. Lorenzo, M. A. Gómez, and E. Ortega, "Basketball game-related
statistics that discriminate between teams' season-long success," European Journal of Sport Science, vol. 8, no.
6, pp. 369-372, 2008.
[10] J. H. Fewell, D. Armbruster, J. Ingraham, A. Petersen, and J. S. Waters, "Basketball Teams as Strategic
Networks," PLoS ONE, vol. 7, no. 11, Jun. 2012.