"Big Data" is term heard more and more in industry – but what does it really mean? There is a vagueness to the term reminiscent of that experienced in the early days of cloud computing. This has led to a number of implications for various industries and enterprises. These range from identifying the actual skills needed to recruit talent to articulating the requirements of a "big data" project. Secondary implications include difficulties in finding solutions that are appropriate to the problems at hand – versus solutions looking for problems. This presentation will take a look at Big Data and offer the audience with some considerations they may use immediately to assess the use of analytics in solving their problems.
The talk begins with an idea of how big "Big Data" can be. This leads to an appreciation of how important "Management Questions" are to assessing analytic needs. The fields of data and analysis have become extremely important and affect nearly all facets of life and business. During the talk we will look at the two pillars of Big Data: Data Warehousing and Predictive Analytics. Then we will explore the open source tools and datasets available to NATO action officers working in this domain. Use cases relevant to NATO will be explored to show where analytics lies hidden within many of the day-to-day problems of enterprises. The presentation will close with a look at the future, where advances in semantic technologies continue. The much-acclaimed consultants at Gartner listed Big Data and Semantic Technologies as the first- and third-ranked top technology trends to modernize information management in the coming decade. They note there is an incredible value "locked inside all this ungoverned and underused information." HQ SACT can leverage this powerful analytic approach to capture requirement trends when establishing acquisition strategies, monitor Priority Shortfall Areas, prepare solicitations, and retrieve meaningful data from archives.
Big Data in NATO and Your Role
1. BIG DATA IN NATO: WHAT IT MEANS TO YOU
Jay Gendron
jaygendron
October 29, 2014
3. Our Journey
• BIG DATA
– What is BIG?
– What is DATA?
• Two Workhorses of Big Data
– Enterprise Data Warehousing
– Predictive Analytics
• Weaponry Available
– Open Source Tools
– Open Source Data
• Use Cases
• Future Trends
11. BIG data
There is no technical definition
“Big Data is at the heart of modern science and business…the
necessity of grappling with Big Data, and the desirability of
unlocking the information hidden within it, is now a key theme in
all the sciences – arguably the key scientific theme of our times.”
- Francis X. Diebold, University of Pennsylvania
A Personal Perspective on the
Origin(s) and Development of “Big Data”:
The Phenomenon, the Term, and the Discipline (2012)
3 V’s = Volume, Velocity, Variety
Laney, D. (2001)
12. Volume
Large Synoptic Survey
Telescope (LSST)
40TB/day
100+PB in 10-year lifetime
Illumina HiSeq 2000
DNA Sequencer
~1TB/day; 30 TB/month
Images: https://d396qusza40orc.cloudfront.net/datasci/lecture_slides/week1/005_escience.pdf
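As a rough check on the volume figures above, the sketch below multiplies the quoted daily rates out over their stated periods. It assumes decimal units (1 PB = 1,000 TB), a 365-day year, and a 30-day month, none of which the slide specifies; the function name is illustrative only.

```python
# Back-of-the-envelope check of the data volumes quoted on the slide.
# Assumptions (not stated in the source): decimal units,
# 365-day years, and 30-day months.
TB_PER_PB = 1_000


def total_tb(tb_per_day: float, days: int) -> float:
    """Total volume in TB produced at a constant daily rate."""
    return tb_per_day * days


# LSST: 40 TB/day over a 10-year lifetime
lsst_pb = total_tb(40, 365 * 10) / TB_PER_PB

# Illumina HiSeq 2000: ~1 TB/day over one month
hiseq_month_tb = total_tb(1, 30)

print(f"LSST over 10 years: {lsst_pb:.0f} PB")
print(f"HiSeq 2000 per month: {hiseq_month_tb:.0f} TB")
```

Under these assumptions the LSST total comes to roughly 146 PB, consistent with the "100+PB" quoted for its lifetime, and the sequencer figure matches the 30 TB/month on the slide.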
13. 250 miles Exosphere
186 miles Thermosphere
25 miles Mesosphere
6 miles Troposphere
40 TB: 10,000 x 4,000,000,000 sheets
high
14. How Big is the Internet?
Size of the Internet as of 31st Dec 2013
14.3 Trillion - Webpages live on the Internet.
48 Billion - Webpages indexed by Google Inc.
14 Billion - Webpages indexed by Microsoft's Bing.
672 Exabytes - 672,000,000,000 Gigabytes (GB) of accessible data.
Source: http://www.factshunt.com/2014/01/total-number-of-websites-size-of.html
1 EB = 1,000,000,000,000,000,000 = 1
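The unit conversion behind the 672 Exabytes figure can be sketched as follows. This assumes decimal (SI) prefixes, so 1 GB = 10^9 bytes and 1 EB = 10^18 bytes; the helper name is made up for illustration.

```python
# Convert exabytes to gigabytes using decimal (SI) prefixes.
# Assumption: 1 GB = 10**9 bytes and 1 EB = 10**18 bytes.
BYTES_PER_GB = 10**9
BYTES_PER_EB = 10**18


def eb_to_gb(exabytes: int) -> int:
    """Return the number of gigabytes in the given number of exabytes."""
    return exabytes * BYTES_PER_EB // BYTES_PER_GB


internet_gb = eb_to_gb(672)
print(f"672 EB = {internet_gb:,} GB")
print(f"1 EB = {eb_to_gb(1):,} GB")
```

With these prefixes, 672 EB works out to 672,000,000,000 GB, which agrees with the number on the slide.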
22. The “DataMaster”
• Hiring a room of PhDs won’t solve
Big Data
• They have a role…as does IT
• Ultimately Big Data will also be a
team effort like the web buildout
• …and You have a role on that team
Image: http://www.fivem.be/
24. Difficult Data is more apt
Enterprise Data Warehousing
Predictive Analytics
Images: Elephant - http://www.marcolotz.com/?p=77
Word Cloud - http://www.fotolia.com/id/36647313?by=serie
25. Enterprise Data Warehousing
• What began with MapReduce in 2004
• Evolved in open source like Hadoop
• Permanent contributions of evolution:
– Fault tolerance – running on many machines
and accounting for failures
– Schema-on-Read – more flexibility in working
with data in different forms
– User Defined Functions – giving developers
more freedom in where to place queries
Source: https://class.coursera.org/datasci-002/lecture/15
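To make the MapReduce idea above concrete, here is a minimal single-process sketch of the programming model: user-defined map and reduce functions are handed to a generic driver that groups values by key. It is an illustration only, not the Hadoop API, and it does not show distribution across machines or fault tolerance. All names and the sample lines are made up for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def map_reduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Run a map phase, group values by key, then run a reduce phase."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        # Map: each record may emit any number of (key, value) pairs.
        for key, value in mapper(record):
            # Shuffle: collect all values that share a key.
            groups[key].append(value)
    # Reduce: collapse each key's values into a single result.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line: str) -> Iterable[Tuple[str, int]]:
    """User-defined function: emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(key: str, values: List[int]) -> int:
    """User-defined function: sum the counts for one word."""
    return sum(values)


# Hypothetical input lines, for illustration only.
lines = ["big data is big", "data is difficult"]
counts = map_reduce(lines, word_mapper, count_reducer)
print(counts)
```

Because the mapper parses each raw line only when it is read, the sketch also hints at the schema-on-read idea: the structure is imposed by the function, not by the stored data.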
27. Desired End State
Image: http://www.ted.com/talks/nate_silver_on_race_and_politics?language=en
28. A team approach
BI
Predict
Stats
Viz Analyst
S/W
+ scale
+ algorithm
+ statistics
+ programming + data products
29. Impact of the Phenomenon
Theoretical
Empirical
Empirical + Computational
Images: Galileo - http://www.crystalinks.com/galileo.html; Formulae - https://msschwarzeducationstation.wordpress.com/page/2/; Computers -
http://www.utsystem.edu/blog/2011/09/26/ut-austin-awarded-50-million-build-faster-more-powerful-supercomputer; Book Cover -
http://radar.oreilly.com/2011/09/building-data-science-teams.html
30. Human-Computer Symbiosis
Sankar, S. (2012, June). The rise of human-computer cooperation [TEDGlobal 2012]. Podcast retrieved from
https://www.ted.com/talks/shyam_sankar_the_rise_of_human_computer_cooperation?language=en
31. People and Culture Count
People are “tinkerers” & “hobbyists”
Experience Perspective
We ALL have 1 thing in common
WE ARE ALL DIFFERENT!
40. …to make the point
…on page 13
Fawcett, T. & Provost, F. (2013, August 9). Data Science for Business. Retrieved from
https://www.safaribooksonline.com/library/view/data-science-for/9781449374273/.
42. Remember…You have a role
Informal poll at the University of Washington:
How much time do you spend
“handling data” as opposed to
“doing science”?
Most common response: 90%
47. Human-Computer Symbiosis
Sankar, S. (2012, June). The rise of human-computer cooperation [TEDGlobal 2012]. Podcast retrieved from
https://www.ted.com/talks/shyam_sankar_the_rise_of_human_computer_cooperation?language=en
49. Data Warehouse
Augmentation
Existing + NEW
Operational Efficiencies
More Data
Images: Arrow - http://canadawebservices.com/7-powerful-ways-increase-website-traffic/; Explore -
http://www.keadventure.com/page/explore_more!.html
50. Leveraging Use Cases
Business and Commerce
• Big Data Exploration
• 360 Degree View
• Security & Intelligence
• Operational Analysis
• Data Warehouse
Augmentation
NATO Enterprise
• FMN (“experiment” and
data mining)
• Enterprise (supporting
commands)
• Cyber Defence (threat
tactics)
• C2 (requirements text
analysis)
• Ent. Architecture &
Technology (req’ts)
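As one possible starting point for the requirements text analysis use case, the sketch below counts how often terms appear across a set of requirement statements. The statements, the stopword list, and the function name are all hypothetical and exist only to illustrate the idea; a real analysis would use actual requirement documents and likely a fuller text-processing toolkit.

```python
import re
from collections import Counter
from typing import Iterable

# Small, hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "shall", "be", "for", "in", "with"}


def term_counts(statements: Iterable[str]) -> Counter:
    """Count non-stopword terms across all requirement statements."""
    counts: Counter = Counter()
    for text in statements:
        # Lowercase the text and keep alphabetic tokens only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts


# Hypothetical requirement statements, invented for illustration.
requirements = [
    "The system shall exchange tracks with coalition networks.",
    "The system shall archive tracks for analysis.",
    "Operators shall search the archive.",
]

counts = term_counts(requirements)
for term, n in counts.most_common(3):
    print(term, n)
```

Even a simple count like this can surface recurring themes across documents, which is the kind of hidden analytics the use cases point to.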
52. According to Gartner
Image: http://www.forbes.com/sites/gilpress/2014/08/18/its-official-the-internet-of-things-takes-over-big-data-as-the-most-hyped-technology/
53. According to IBM
• More Analytics – Less Gut
• Data security and privacy
• Leaders with data knowledge
• Data-centric applications
• Integrating internal and external
• Investments in platforms
An example
Cho, I. (2013, February 3). 6 trends in big data and analytics [IBM Big Data Hub]. Podcast retrieved from
http://www.ibmbigdatahub.com/podcast/6-trends-big-data-and-analytics
54. Visual Analytics
Koblin, A. (2011, March). Visualizing ourselves ... with crowd-sourced data [TED2011]. Podcast retrieved from
http://www.ted.com/talks/aaron_koblin?language=en
55. Art and science meet
Miebach, N. (2011, July). Art made of storms [TEDGlobal 2011]. Podcast retrieved
from http://www.ted.com/talks/nathalie_miebach?language=en
56. Summary
• Big Data – has implications
• Big Data = Data + Analytics
• Open Source – tools and data
• Use Case – leverage others’ results
• Future
– More analytics and applications
– Need for data fluency among managers
– Need processes to encourage exploring
61. References
Cho, I. (2013, February 3). 6 trends in big data and analytics [IBM Big Data Hub]. Podcast retrieved from
http://www.ibmbigdatahub.com/podcast/6-trends-big-data-and-analytics.
Diebold, F.X. (2012, November 26). A personal perspective on the origin(s) and development of “big data”:
The phenomenon, the term, and the discipline. Retrieved from
http://www.ssc.upenn.edu/~fdiebold/papers/paper112/Diebold_Big_Data.pdf.
Fawcett, T. & Provost, F. (2013, August 9). Data Science for Business. Retrieved from
https://www.safaribooksonline.com/library/view/data-science-for/9781449374273/. (on page 13)
Howe, B. (2013). Data science in science [PDF document]. Retrieved from Lecture Notes Online Web site:
https://d396qusza40orc.cloudfront.net/datasci/lecture_slides/week1/005_escience.pdf.
Koblin, A. (2011, March). Visualizing ourselves ... with crowd-sourced data [TED2011]. Podcast retrieved from
http://www.ted.com/talks/aaron_koblin?language=en.
Laney, D. (2001, February 6). 3-D data management: Controlling data volume, velocity and variety [META Group Research Note]. Retrieved from
http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf.
Miebach, N. (2011, July). Art made of storms [TEDGlobal 2011]. Podcast retrieved from
http://www.ted.com/talks/nathalie_miebach?language=en.
Sall, E. (2013, February 12). Top 5 big data use cases [IBM Big Data Hub]. Podcast retrieved from
http://www.ibmbigdatahub.com/podcast/top-5-big-data-use-cases.
Sankar, S. (2012, June). The rise of human-computer cooperation [TEDGlobal 2012]. Podcast retrieved from
https://www.ted.com/talks/shyam_sankar_the_rise_of_human_computer_cooperation?language=en.