This presentation discusses technological change and open knowledge, and how they have contributed to the explosion in the types and availability of sports data open to the public. This includes advanced and sabermetric stats, APIs, and applied open knowledge concepts in the modern-day sports media world.
2. Old
◦ Simple statistics – easily computable
◦ Limited amount of stats
◦ Stats only seen in newspapers
◦ Information available even during sports broadcasts
was limited
New
◦ Extremely complex statistics
◦ The number of “advanced statistics” has exploded
◦ The internet has made any stat available in seconds
◦ Broadcasters have all of this information at their
fingertips for any situation that comes up during a live
event
3. Technological change – anything the average
person wants to know is available online for
free
◦ Free databases
◦ Most people with televisions can pay for services to
watch every game in whatever sport they choose
◦ Growing popularity and access to fantasy sports
◦ The increased visibility and spotlight due to
technology have created a need for athletes’ value to
be analyzed in new and exciting ways, and
these new ways are now measurable due to the
amount of data available
4. Sabermetrics: defined as the search for objective
knowledge about baseball, but has become well
known as advanced statistics that go
beyond basic statistics in order to measure
the true value of a player.i
◦ An example would be adjusting a baseball player’s
hitting statistics based on the dimensions of their
home stadium (some stadiums are easier to hit in).
◦ Largest open database is baseball-reference.com
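As a rough illustration of the stadium adjustment described above, the sketch below scales a raw hitting stat by a park factor. The function name, the convention that a factor of 1.00 means a neutral park, and the averaging with 1.0 to reflect that only half of a team's games are at home are all assumptions made for illustration; this is not the formula used by any particular site.

```python
def park_adjust(raw_stat: float, park_factor: float) -> float:
    """Scale a raw hitting stat toward a neutral park.

    park_factor > 1.0 means a hitter-friendly home park (assumed convention).
    Only about half of a team's games are at home, so the factor is
    averaged with 1.0 before dividing.
    """
    effective_factor = (park_factor + 1.0) / 2.0
    return raw_stat / effective_factor


# Hypothetical hitter: .300 average in a park that inflates hitting by 10%.
adjusted = park_adjust(0.300, 1.10)
print(round(adjusted, 3))
```

A hitter in a neutral park (factor 1.00) is left unchanged, while a hitter in a hitter-friendly park is adjusted downward.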
5. Made famous through “Moneyball”
◦ Billy Beane, GM of the Oakland Athletics, had enormous
success using sabermetrics to build a winning team with
a low payroll
◦ Last 15 years: payroll 0.81 standard deviations below
the average (bottom 1/3), but a 54.8% win rate (top 3).
◦ In 10 of the past 15 seasons, they have had 10+ more
wins than would be expected based on payroll
◦ Beane’s general approach is to find batters who take a
lot of pitches (patient and selective), tend to walk a lot
(get on base a lot even if they don’t have a great batting
average), run the bases well, and play solid defense.ii
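The payroll comparison above rests on two simple calculations: how many standard deviations a team's payroll sits from the league average, and how many wins a team has beyond what its payroll would predict. The following is a minimal sketch of both. The payroll figures, the `wins_per_sd` slope, and the function names are hypothetical and chosen only to show the arithmetic; they are not the figures behind the numbers quoted above.

```python
from statistics import mean, pstdev


def payroll_z_score(team_payroll, league_payrolls):
    """Standard deviations between a team's payroll and the league mean."""
    mu = mean(league_payrolls)
    sigma = pstdev(league_payrolls)
    return (team_payroll - mu) / sigma


def wins_above_expected(actual_wins, z, wins_per_sd, league_avg_wins=81):
    """Actual wins minus the wins a simple linear payroll model predicts.

    wins_per_sd is an assumed slope: extra wins per standard deviation
    of payroll. 81 is the average win total in a 162-game season.
    """
    expected = league_avg_wins + wins_per_sd * z
    return actual_wins - expected


# Hypothetical league payrolls in millions of dollars.
payrolls = [60, 80, 100, 120, 140]
z = payroll_z_score(60, payrolls)
print(round(z, 2))
print(wins_above_expected(94, -1.0, 5.0))
```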
6. Technology used in Formula One racing to
monitor physical well-being is being applied
to performance across other sports.iii
◦ Can measure speed, efficiency in movements, and
fatigue – can help pinpoint how athletes are wasting
energy (e.g. a hockey player’s performance during a
45-second shift).
◦ Possible end result is “smart jerseys” – these would
allow real-time data to be relayed to coaches and
would give fans even more access to information
for analysis.
7. An API is a software intermediary that makes it
possible for application programs to interact with
each other and share data.
There are several APIs for sports, including ESPN,
SportsData, and Yahoo.
Together they create an enormous database with
comprehensive coverage of all major sports and
leagues all over the world.
The public benefits through apps and online
databases, both of which are mostly open sources
of information.
◦ Real-time reporting and analysis are becoming the norm
in APIs.
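To make the idea of an API sharing data between programs more concrete, here is a small sketch of an application reading a JSON response and pulling out finished games. The payload shape, field names, and team names are invented for illustration and do not reflect the actual ESPN, SportsData, or Yahoo response formats; no network request is made.

```python
import json

# Hypothetical response body; real sports APIs define their own schemas.
SAMPLE_RESPONSE = """
{"games": [
  {"home": "Team A", "away": "Team B",
   "home_score": 3, "away_score": 2, "status": "final"},
  {"home": "Team C", "away": "Team D",
   "home_score": 1, "away_score": 1, "status": "in_progress"}
]}
"""


def final_scores(payload):
    """Return one 'away N @ home M' line for each finished game."""
    data = json.loads(payload)
    return [
        f"{g['away']} {g['away_score']} @ {g['home']} {g['home_score']}"
        for g in data["games"]
        if g["status"] == "final"
    ]


print(final_scores(SAMPLE_RESPONSE))
```

An app built on a real API would do the same thing with the provider's documented fields, typically after authenticating with a developer key.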
8. Data collected includes live play-by-play,
scores/times, results/boxscores, standings,
player stats, team stats, leaders, rosters,
depth charts, profiles, transactions, splits
(such as home vs away or day vs night), and
historical stats.
◦ ESPN includes podcasts, headlines, draft data, and
feeds from SportsCenter.
Creates a meeting of geeks and sports freaks
9. ESPN – 3 tiers iv
◦ Internal – ESPN employees and contractors using the API
to build ESPN apps
◦ Strategic Partners working with ESPN to include ESPN
content in their products/services
◦ Public – Independent, pre-approved developers using
ESPN content.
The code is not open source, unlike Android
or Mozilla, but the information it generates is
distributed to the public
◦ Push notifications from the SportsCenter mobile
application with breaking news
◦ Real-time scores on espn.com and the SportsCenter
application for almost any league in the world
10. SportsData
◦ Major customers include Bleacher Report,
Bloomberg Sports, Google, IBM, and NBC Sports.
◦ These customers use SportsData in broadcasts,
media, fantasy sports scouting, and in the case of
Google, accurate and dependable data for Google’s
vast customer base and search demands.v
◦ The cost to these companies can be tens of
thousands of dollars per month, but the benefits
can easily outweigh the costs.
Instrumental in NBC’s online coverage of the Super
Bowl.
11. Sports APIs are really only available to insiders,
developers, strategic partners, and wealthy
corporations with media & data needs.
The benefits, however, are also received by
the general public in information availability
and in media.
Not truly “open” in that the full use is not
public and is not free, but
12. Citizen Journalism
◦ The availability of statistics and information has
translated into the average person tweeting,
blogging, and facebooking about sports.
◦ Many individuals with full time jobs can even get
part-time gigs writing for larger blogs.
◦ People with knowledge of advanced statistics are in
demand because they provide interesting viewpoints
on teams and players, which can often captivate
readers and bring the measures into the
public eye.
13. Data and Databases
◦ Massive amounts of information available. Any person
can now see, for example, how possession stats and
attempted shots correlate to wins and player efficiency
in hockey.
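One simple way to check how a possession stat relates to wins is a Pearson correlation. The sketch below computes it by hand for a handful of teams. The shot-attempt shares and win totals are made-up numbers for illustration only, and the function name is an assumption.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical teams: share of shot attempts (%) and season wins.
shot_share = [46.0, 48.5, 50.0, 52.0, 54.5]
wins = [30, 36, 41, 44, 50]
print(round(pearson(shot_share, wins), 2))
```

A value near 1 would suggest that teams controlling more of the shot attempts also tend to win more, while a value near 0 would suggest little linear relationship.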
Information Overload and Filters
◦ Is it too much? For example, most people will not
understand the majority of the stats on Baseball Reference
unless they take the time to read and comprehend the
breakdown of each calculation
◦ Important to be able to filter the information so that we
can find what we are seeking
Most of the time a question is simple, such as: is a player
getting unlucky, or is he/she simply performing poorly?
Managing your time while looking for the answer is important,
or else it can take a long time.
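One common filter for the "unlucky or performing poorly" question in baseball is batting average on balls in play (BABIP), compared against the player's own career figure. The sketch below is a minimal example under stated assumptions: the 30-point gap used as a threshold and the function names are hypothetical choices for illustration, not an established rule.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)


def likely_unlucky(season_babip, career_babip, gap=0.030):
    """Flag a season whose BABIP falls well below the player's career mark.

    gap is an assumed threshold chosen for illustration.
    """
    return career_babip - season_babip >= gap


# Hypothetical season line: 120 H, 20 HR, 500 AB, 100 K, 5 SF.
season = babip(120, 20, 500, 100, 5)
print(round(season, 3))
print(likely_unlucky(season, career_babip=0.300))
```

A large gap hints that more balls in play than usual are being turned into outs, which is often attributed to luck; a small gap suggests the player's results match his or her usual contact quality.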
14. Equity – The benefits can be gained only
where there is internet access and a mobile
network.
◦ Similar to most other open knowledge sources in
that the access is dependent on how privileged the
individual is.
15. i http://sabr.org/sabermetrics
ii http://fivethirtyeight.com/features/billion-dollar-billy-beane/
iii http://www.economist.com/blogs/babbage/2011/11/sports-technology
iv http://developer.espn.com/overview
v http://www.sportsdatallc.com/