
Development of basketball metrics using SportVU, an advanced

optical recognition system

Megan Dernberger, Chaitu Konjeti, Taylor Smith, and Casey Van Kaer

School for Science and Math at Vanderbilt, Nashville

Abstract

The SportVU camera system is an advanced optical recognition system with the

capability to record massive amounts of information, allowing for the tracking of player and ball

movement. Data is recorded 25 times per second and stored as an XML file that can be accessed

for further analysis. However, the enormous amount of data collected often has to be sent offsite

to be processed, which is expensive and time consuming. Software was developed, using Python

3.5 in the Integrated Development Environment (IDE) Spyder, to accelerate the processing of the

large amount of data and provide instantaneous results. The software generates both 2-

dimensional and 3-dimensional animations, which show instant playbacks of specified time

frames of the basketball game and can be used for visual analysis. Additionally, an individual

player distance analysis was generated, which allows fatigue rates and possible correlations

between players to be estimated. Shot maps for individual players and entire teams can also be developed, which

provides a more efficient way to track all attempted shots and compare shot accuracy rates. Not

only does the software expedite the data processing, it also provides more efficient and impactful

results for the teams while cutting back on necessary expense and time.
Introduction

Big data research, the analysis of large data sets to reveal patterns, has seen a surge

of interest within the last decade [2][3]. The spread of computers and large-scale data

collection has required the development of algorithms and software to analyze data. Big data

research is used within several industries such as banking, healthcare, government, and education

[1]. Algorithms are used in these industries to provide insight that can boost quality and output,

minimize waste, risk, and fraud, and can help implement better systems.

Big data research can also be used in a sports application as a method of understanding

the strengths and weaknesses of a team and can help to maximize success and minimize in-game

errors [5]. Before advanced technology like SportVU, plays and game statistics were traditionally

recorded with pencil and paper. This method was moderately successful at recording

data, but left room for user error and was highly inefficient. However, current technologies can

be utilized to perform these tasks in a significantly shorter time period with efficient algorithms.

Big data analysis is currently used to evaluate the efficiency of key plays, influence

future coaching decisions, and track off-the-ball events. Algorithms have been developed that

analyze the acceleration of a player during key plays in a game to assess its relationship to the

outcome of that play [6]. Traditional statistics only offer after-the-fact information that is often

generalized and unusable, at least for immediate use. Companies, such as STATS, LLC, have

developed equipment, such as SportVU, that utilizes missile tracking technology to provide

accurate data regarding the players' activity in an immediately usable form [4].

The SportVU camera system is an advanced optical recognition system with the

capability to record various aspects of sports games, especially basketball games. The system

includes six cameras, one over each basket and two along the sideline of each half, that are able to
record the game clock, free throws, two-dimensional coordinates of each player and referee, and

three-dimensional coordinates of the ball [4]. It records and saves the coordinates twenty-five

times per second, resulting in massive amounts of data being generated and stored [4]. The data

is stored in an XML format, a file format that allows for data to be stored and transported. This

file format requires the use of software to access or visualize the desired data.

This project focuses on creating software that allows access to the game data,

generation of certain metrics, and visualization of the data. This paper presents software that is

capable of animating a basketball game in two dimensions as well as three dimensions, allowing for

visual analysis of specific time segments of the game. Also through this software, the distance

traveled by individual players can be calculated, with possible implications for metrics such as fatigue rates.

Furthermore, this software allows for the creation of individual player and full team shot maps

that differentiate between made and missed shots. From this, teams are able to analyze aspects

such as shot rates, shot positioning, and, potentially, defensive effectiveness while looking at

shot maps from opponents. This software is designed to be universal to all SportVU basketball

data files and serve as a basis for calculating other metrics, performing further analytics, and

generating additional figures for visualization of the basketball game in the future.

Methods

Algorithm Development

An XML file from a 2013 college basketball game between Duke and Florida Atlantic

was acquired. The algorithms were written in Python (versions 2.7 and 3.5) using Spyder, an
integrated development environment (IDE), which was also used to design the

graphs and animations. The developed algorithms were designed to parse through the XML

output file and isolate usable portions of data. The algorithms further manipulated the data set to

create two-dimensional and three-dimensional animations of the game (Figures 2 and 3), graph

the movement of the ball to determine the number of dribbles (Figure 4), determine the total distance

traveled by each player (Figure 5), and create shot maps for each team and each player (Figure

6).

Figure 1: This flowchart goes through the process of how each algorithm was created. Each algorithm used the data from the
raw XML file, parsed through it, and obtained the coordinates of all players and the ball.
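
The parsing step described above can be sketched with Python's standard xml.etree.ElementTree module. The element and attribute names used here (moment, game_clock, ball, player) are hypothetical placeholders chosen for illustration; the actual SportVU schema is not given in this paper and may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout for illustration only; the real SportVU schema may differ.
SAMPLE_XML = """
<game>
  <moment game_clock="1154.28">
    <ball x="47.0" y="25.0" z="6.5"/>
    <player id="101" team="home" x="40.2" y="20.1"/>
    <player id="201" team="away" x="52.3" y="30.4"/>
  </moment>
  <moment game_clock="1154.24">
    <ball x="47.5" y="25.2" z="7.0"/>
    <player id="101" team="home" x="40.6" y="20.4"/>
    <player id="201" team="away" x="52.0" y="30.0"/>
  </moment>
</game>
"""


def parse_moments(xml_text):
    """Return one dictionary per recorded frame: game clock, ball, and players."""
    root = ET.fromstring(xml_text.strip())
    frames = []
    for moment in root.iter("moment"):
        ball = moment.find("ball")
        frames.append({
            "game_clock": float(moment.get("game_clock")),
            # Ball position is three-dimensional (x, y, z).
            "ball": tuple(float(ball.get(axis)) for axis in ("x", "y", "z")),
            # Player positions are two-dimensional, keyed by player ID.
            "players": {
                p.get("id"): (p.get("team"), float(p.get("x")), float(p.get("y")))
                for p in moment.iter("player")
            },
        })
    return frames


frames = parse_moments(SAMPLE_XML)
```

Keeping each frame as a plain dictionary is only one possible design; it lets the animation, distance, and shot-map steps all read from the same parsed structure.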

Animation

Both a two-dimensional and a three-dimensional animation were generated using the matplotlib

Python library. Initially, a two-dimensional animation was created by parsing through the XML file and

obtaining the X- and Y-coordinates for each element on the court. These coordinates were then plotted and animated in

order to provide opportunity for visual analysis. A dynamic legend, indicating each team and
player using various colors and shapes, was implemented in order to allow real-time player

identification and to accommodate frequent substitutions during the game. Another aspect

incorporated into the animation was a game clock that allowed user input to view and analyze

specific sections of the game. In an attempt to better view the movement of the ball throughout

the game and to provide data for further analysis, the Z-coordinate of the basketball was

incorporated into the animation to provide a dynamic three-dimensional projection of the game.

This three-dimensional animation still incorporated all the aspects that were developed for the

two-dimensional animation. Additionally, a function was created that allows the user to

visualize the path of the ball during times specified by user input.
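
As a minimal sketch of how user-specified game-clock input might select the frames to animate, the example below filters a list of frames by a clock window. The frame layout (a dictionary with a game_clock value in seconds remaining) and the function names are assumptions for illustration, not the authors' implementation.

```python
def clock_to_seconds(clock):
    """Convert an 'MM:SS' game-clock string to seconds remaining in the half."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + float(seconds)


def select_frames(frames, start_clock, end_clock):
    """Keep frames whose clock lies in the requested window.

    The game clock counts down, so the start of a segment has the larger value.
    """
    start = clock_to_seconds(start_clock)
    end = clock_to_seconds(end_clock)
    low, high = min(start, end), max(start, end)
    return [f for f in frames if low <= f["game_clock"] <= high]


# Synthetic frames recorded 25 times per second (0.04 s apart), counting down from 851 s.
frames = [{"game_clock": round(851.0 - i * 0.04, 2)} for i in range(250)]

# Select the segment from 14:11 to 14:07 left in the half.
window = select_frames(frames, "14:11", "14:07")
```

The selected frames could then be passed to an animation routine, for example matplotlib's FuncAnimation, which redraws the plot once per frame.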

Distance

The distance algorithm was developed by calculating the distance between the X and Y-

coordinates at each recorded time point of each player using the formula

$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$, where $d$ is distance. These distances were then concatenated

into a list and, eventually, added together to create a total distance for the game. Graphs were

created for user visualization of the distance traveled during the game.
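
The distance calculation can be sketched in a few lines of Python. This is an illustrative version, assuming each player's track is a list of (x, y) positions in feet ordered by time; the function name and the conversion to miles (the unit used in Figure 5) are assumptions.

```python
import math

FEET_PER_MILE = 5280


def total_distance(positions):
    """Sum the straight-line distances between consecutive (x, y) samples."""
    steps = [
        math.hypot(x2 - x1, y2 - y1)  # d = sqrt((x2 - x1)^2 + (y2 - y1)^2)
        for (x1, y1), (x2, y2) in zip(positions, positions[1:])
    ]
    return sum(steps)


# Four consecutive positions of one hypothetical player, in feet.
path = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0), (11.0, 16.0)]
feet = total_distance(path)      # 5 + 6 + 10 = 21 feet
miles = feet / FEET_PER_MILE
```

Because every pair of consecutive samples contributes to the sum, a single mislocated point adds spurious distance twice, once on the way to the bad point and once on the way back.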

Shot Map

A shot map is a figure that depicts the locations from which a player took shots and

indicates whether each shot was made or missed. Shot maps were created using ball and player

coordinates in conjunction with one another. The algorithm determined if, during a specified

amount of time, a shot occurred. To do this, the algorithm identified the time frames at which
the ball was above a player's head at a height that typically indicated a shot, between 7.25 and

7.75 feet above the court. The Z-coordinates were used to determine the height of the ball. To

further ensure the precision of the algorithm, the ball's terminal location had to be close

to the rim. Once a shot was detected, the algorithm determined when the ball first left the

player's hand and then recorded this location as the place where the shot was taken. This

location was then plotted on a two-dimensional graph of the court.
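
The shot-detection logic can be illustrated with the simplified sketch below. Only the 7.25 to 7.75 foot height band comes from the text; the rim tolerance, the look-ahead window, the rim coordinates, and the frame layout are assumptions. As a further simplification, the position of the player nearest the ball at the trigger frame stands in for the release point, and the sketch does not classify shots as made or missed.

```python
import math

RELEASE_BAND = (7.25, 7.75)  # ball-height band quoted in the text, in feet
RIM_HEIGHT = 10.0            # regulation rim height, in feet
RIM_TOL = 1.5                # assumed tolerance for "close to the rim", in feet
MAX_FLIGHT_FRAMES = 75       # assumed look-ahead: 3 seconds at 25 frames per second


def near_rim(ball, rim_xy):
    """True when the ball is within RIM_TOL of the rim horizontally and vertically."""
    x, y, z = ball
    horizontal = math.hypot(x - rim_xy[0], y - rim_xy[1])
    return horizontal <= RIM_TOL and abs(z - RIM_HEIGHT) <= RIM_TOL


def nearest_player(frame):
    """Return (player_id, (x, y)) for the player closest to the ball in the X-Y plane."""
    bx, by, _ = frame["ball"]
    return min(
        frame["players"].items(),
        key=lambda item: math.hypot(item[1][0] - bx, item[1][1] - by),
    )


def detect_shots(frames, rim_xy):
    """Find frames where the ball rises through the band and later reaches the rim."""
    shots = []
    i = 1
    while i < len(frames):
        z_prev, z_now = frames[i - 1]["ball"][2], frames[i]["ball"][2]
        rising_in_band = z_now > z_prev and RELEASE_BAND[0] <= z_now <= RELEASE_BAND[1]
        arrival = None
        if rising_in_band:
            last = min(len(frames), i + 1 + MAX_FLIGHT_FRAMES)
            arrival = next(
                (k for k in range(i + 1, last) if near_rim(frames[k]["ball"], rim_xy)),
                None,
            )
        if arrival is None:
            i += 1
            continue
        player_id, location = nearest_player(frames[i])
        shots.append({"player": player_id, "location": location})
        i = arrival + 1  # skip the rest of this shot's flight
    return shots


# Synthetic example: player "23" shoots from (20, 25) toward a rim assumed at (5.25, 25).
players = {"23": (20.0, 25.0), "4": (30.0, 10.0)}
shot_path = [(20, 25, 6.0), (20, 25, 7.5), (16, 25, 11.0), (12, 25, 13.0),
             (8, 25, 12.0), (5.5, 25, 10.2), (5.5, 25, 8.0)]
frames = [{"ball": ball, "players": players} for ball in shot_path]
shots = detect_shots(frames, rim_xy=(5.25, 25.0))
```

Requiring the ball to arrive near the rim is what separates a shot from a pass that also rises through the height band.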

Results

The 2-dimensional animation was developed to visualize the player and ball coordinates

that were collected from the XML file (Figure 2). This figure shows the game at 19:16 left in the

first half of the basketball game with the black team being Duke and the gray team being Florida

Atlantic. Building on the 2-dimensional animation, a 3-dimensional animation was

developed in order to depict a more dynamic view for further analysis of the basketball game

(Figure 3). This figure shows the game at 19:14 left in the first half with the black team being

Duke and the gray team being Florida Atlantic. The animation can be viewed from multiple

perspectives.

Figure 2: This 2-dimensional graph shows all locations of the home players, the away players, and the ball at 19:16 (1154.28 seconds)
left in the first half of the Duke vs. Florida Atlantic game. The dynamic legend shows, in real time, the player IDs of all players on
the court, which are identified by color and shape. The black shapes are the home team, Duke, the gray shapes are the away team,
Florida Atlantic, and the light grey circle is the basketball.

Figure 3: This 3-dimensional graph shows all the locations of the home players, the away players, and the ball at 19:14 left in the
first half of the Duke vs. Florida Atlantic game. The dynamic legend shows, in real time, the player IDs of all players on the
court, which are identified by color and shape. The black shapes are the home team, Duke, the gray shapes are the away team,
Florida Atlantic, and the light grey circle is the basketball.

A graph showing the path of a ball during a series of dribbles, a pass, and a shot was

generated (Figure 4). The figure shows a 5 second time lapse, starting at 14:07 left in the first

half. The play starts with dribbles in the lower right corner, a pass occurs across the court, and

then a 3-point shot is made. A graphical representation of the distance data was created (Figure

5). This data is separated into the two teams as well as individual players. From this,

comparisons between teams, players, and player positions can be made.


Figure 4: This example of the 3-dimensional path that the ball took during a shot at 14:07 (game clock = 851.00 to 846.56)
contains a dribble, a pass, and a made 3-point shot. The parabolic flight that basketballs typically exhibit is clearly shown.
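
The paper does not state how dribbles were counted from the ball's path. One simple possibility, offered here only as an assumption, is to count local minima of the ball's Z-coordinate that fall below a height threshold near the floor. The function name and the 1.0 foot threshold are hypothetical.

```python
def count_dribbles(heights, floor_threshold=1.0):
    """Count local minima of ball height (feet) that fall below floor_threshold."""
    bounces = 0
    for prev, cur, nxt in zip(heights, heights[1:], heights[2:]):
        # A bounce is a low point: not above the previous sample, below the next one.
        if cur < floor_threshold and cur <= prev and cur < nxt:
            bounces += 1
    return bounces


# Synthetic ball heights: two bounces followed by the rise of a shot.
z = [3.0, 2.0, 0.5, 2.0, 3.0, 2.0, 0.6, 1.8, 3.2, 5.0, 8.0, 11.0]
dribbles = count_dribbles(z)
```

A threshold of this kind would also count bounce passes and loose balls, so a fuller version would likely need to check which player controls the ball before and after each bounce.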

Figure 5: This figure shows the distance, in miles, that each player ran during the first half of the Duke vs. Florida Atlantic
game on November 15, 2013. It compares the distance traveled by the home players, depicted in black, to the distance
traveled by the away team, depicted in grey.

The shot maps created display the locations of all attempted shots from all players on both

teams for a specified time frame. The generated maps show the attempted shots taken by each

team and whether each shot was made or missed based on the color of the dot. A shot map

was created to show the locations of the shots, distinguishing between made and missed, from the

first half of the game (Figure 6). The official Duke Box Score shot map is shown for reference

(Figure 7) [7]. The developed algorithm detected 83 shots, whereas the official Duke Box Score had 85

shots recorded (Figure 8).

Figure 8: This table compares the BoxScore values with the values generated by the developed software for the first half of the
basketball game.

Figure 6: This shot map displays all the attempted shots from all players on both teams. The time frame for this map is the first
half of the game. The dots on the left represent the shots taken by the away team (Florida Atlantic), and the dots on the right represent
the shots taken by the home team (Duke). Dark dots indicate made shots, whereas light dots indicate missed shots.

Figure 7: This shot map is from the official Box Score for Duke vs. Florida Atlantic [7]. The filled-in dots represent the made
shots whereas the open dots represent the missed shots. The dots on the left represent shots taken by the away team (Florida
Atlantic), and the dots on the right represent shots taken by the home team (Duke).
Discussion
Multiple analyses, including shot maps, distance graphs, and 3-dimensional representations, were

developed using Python. These metrics can be implemented by coaches in order to improve

strategy and minimize avoidable mistakes on the court. The developed metrics were shown to be

fairly accurate when compared to the official Duke BoxScore with regard to made and missed

shots and the location of shots. The 2-dimensional and 3-dimensional animations of the game, when

compared to the actual footage, were accurate representations except for time frames in which the

camera was unable to locate players or the ball on the court, leading to glitches. Furthermore, all

desired aspects of the game, such as the game clock, lines on the court, and the dynamic legend, were

successfully plotted. Sports metrics, such as the ones in this project, are becoming a valuable

commodity [8]. Teams are now using big data and computer algorithms to analyze their

performance on and off the court [8].

Some limitations that were encountered throughout this project were hardware

capabilities and glitches in the data due to camera problems. A lack of hardware and computing

power drastically reduced the speed and efficiency of the algorithms. The SportVU camera

occasionally had problems locating certain players or the ball, leading to minor glitches in the

data points. These glitches pose obstacles when trying to calculate the actual distance traveled by a

player since some time points are not accurate. Only a limited number of data files were available to analyze, so

the robustness of the algorithms could not be tested.

The software that was developed during this study could serve as a basis for numerous

future investigations. One such possibility would be to create an algorithm that shows real-time

points for both teams, not just a sum of points at the end of the specified time frame. This

would open up the opportunity for finding correlations between previously created algorithms

and the success of the play or set of plays. Another possible idea for future projects is to find a
correlation between the total distance that players run and the outcome of the game. Later projects could

include development of algorithms that can calculate percentages of made or missed contested

shots, defensive efficiency, the number of dribbles a player takes during a game, and the efficiency of

the interactions between players. The metrics can be used to visualize and understand each team

as a network rather than individual players [10].

Conclusion

The development of advanced basketball metrics using the SportVU camera system is an

extension of big data analysis into sports. Big data analysis is used in a variety of fields, such as

accounting, banking, education, etc., and has been recently introduced into the realm of sports

statistics. Traditionally, statistics were measured by hand by a trained observer. This led to frequent

human error and, ultimately, faulty statistics. In recent decades, equipment such as

SportVU has been designed to correct this flaw. SportVU tracks and records the position of

each player, referee, and the ball during a basketball game. This data can then be stored and

analyzed. In this study, data gathered from a SportVU camera system during a Duke

basketball game in 2013 was animated for visual analysis and analyzed using several

different algorithms. These algorithms measured aspects of the game such as individual player

distance traveled and generated shot maps denoting made or missed shots. These measurements,

figures, and animations can potentially be used to improve a team's tactics and/or personnel,

which can lead to on- and off-the-court success.

References

[1] SAS, "What is big data and why it matters," 2016. [Online]. Available: http://www.sas.com/en_us/insights/big-data/what-is-big-data.html. Accessed: Nov. 10, 2016.
[2] W. Raghupathi and V. Raghupathi, "Big data analytics in healthcare: Promise and potential," Health Information Science and Systems, vol. 2, no. 1, p. 3, 2014.
[3] K. Kambatla, "Trends in big data analytics." [Online]. Available: http://barbie.uta.edu/~hdfeng/bigdata/Papers/Trends%20in%20big%20data%20analytics.pdf. Accessed: Nov. 10, 2016.
[4] "STATS SPORTVU WORKFLOWS." [Online]. Available: http://docs.stats.com/Webinars/STATS_Webinar_SportVU.pdf. Accessed: Nov. 10, 2016.
[5] D. Hanchett, "Playing Hardball with BIG DATA," 2012. [Online]. Available: http://www.emc.com/collateral/article/137534-sports-analysis.pdf. Accessed: Nov. 10, 2016.
[6] P. Maymin, "Acceleration in the NBA: Towards an Algorithmic Taxonomy of Basketball Plays." [Online]. Available: http://www.sloansportsconference.com/wp-content/uploads/2013/Acceleration%20in%20the%20NBA%20Towards%20an%20Algorithmic%20Taxonomy%20of%20Basketball%20Plays.pdf. Accessed: Nov. 10, 2016.
[7] "Florida Atlantic vs. Duke Shot Chart," 2013. [Online]. Available: http://www.espn.com/mens-college-basketball/playbyplay?gameId=400499243. Accessed: Nov. 10, 2016.
[8] F. Erčulj and E. Štrumbelj, "Basketball Shot Types and Shot Success in Different Levels of Competitive Basketball," PLoS ONE, vol. 10, no. 6, Mar. 2015.
[9] S. J. Ibáñez, J. Sampaio, S. Feu, A. Lorenzo, M. A. Gómez, and E. Ortega, "Basketball game-related statistics that discriminate between teams' season-long success," European Journal of Sport Science, vol. 8, no. 6, pp. 369-372, 2008.
[10] J. H. Fewell, D. Armbruster, J. Ingraham, A. Petersen, and J. S. Waters, "Basketball Teams as Strategic Networks," PLoS ONE, vol. 7, no. 11, Jun. 2012.
