Applications of Big Data

  1. Applications of Big Data. By Prashant Kumar Jadia, Department of Computer Science and Engineering, Hong Kong University of Science and Technology
  2. What is Big Data: Various Definitions • McKinsey: "Big data" refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. • Gartner: Big data in general is defined as high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. • O'Reilly: Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it. Source: Infosys Blogs URL: Date: 14-February-15
  3. Four V’s of Big Data Volume • 300 hours of video uploaded to YouTube every minute • 10 billion posts on Facebook every day • 302 million monthly active users on Twitter Variety • 500 million tweets every day • Millions of wearables and health monitors • Billions of photos uploaded every day Velocity • Spread of sensor networks • Growth in world connectivity Veracity • Different sources have different data formats • In health care, the same data appears in various forms. Figures are as of May, 2015
  4. The Fifth ‘V’: Value • DFAS has saved approx. $4 billion in improper vendor payments • Savings of $100 million in erroneous claims • eCall will save around 2,500 lives every year • Estimated savings of $450 billion in US health care if Big Data is used. Figures are as of May, 2015
  5. History of Big Data • First documented use of the term "Big Data": 1997, in a paper from NASA: "Visualization provides an interesting challenge for computer systems: data sets are generally quite large, taxing the capacities of main memory, local disk, and even remote disk. We call this the problem of big data." • 3Vs first published in 2001: Gartner analyst Doug Laney introduced the 3Vs concept in a 2001 META Group research publication, "3D Data Management: Controlling Data Volume, Velocity and Variety". • Rapid growth since 2007 - Better Internet bandwidth - Cheaper storage - Increased computing power
  6. History of Big Data – Factors contributing to Growth. Figure: Number of “Big Data” papers published per year. Source: An overview of Big Data, Journal: The Next Wave | Vol. 20 | No. 4 | 2014
  7. History of Big Data – Factors contributing to Growth. Figure: Computing cost performance, 1992-2012. Source: From exponential technologies to exponential innovation URL: Date: 4-October-13
  8. History of Big Data – Factors contributing to Growth. Figure: Storage cost performance, 1992-2012. Source: From exponential technologies to exponential innovation URL: Date: 4-October-13
  9. History of Big Data – Factors contributing to Growth. Figure: Global Internet traffic. Figures are as of May, 2015
  10. History of Big Data – Factors contributing to Growth. Figure: Gartner Emerging Technologies 2012
  11. History of Big Data – Factors contributing to Growth. Figure: Google searches for the term “Big Data”, signifying public interest. Figures are as of May, 2015
  12. Big Data in Social Media • Recommendation Systems • Marketing • Electioneering • Influence Marketing • Credit Scoring • Candidature Check
  13. Big Data in Social Media (figure: The Conversation Prism) • What is Social Media? A group of Internet-based applications that build on the ideological and technological foundations of Web 2.0, and that allow the creation and exchange of user-generated content. • Social media is much more than Facebook and Twitter. • There are social media platforms for almost every sphere of life. • How big are these platforms? Twitter: 302 million users, 500 million tweets per day. Facebook: 936 million users, 55 million status updates per day. LinkedIn: 364 million users. YouTube: 1+ billion users, 432,000 hours of video per day. Figures are as of May, 2015
  14. Uses of Social Media Data • What can be mined out of this ocean of data? The possibilities are endless. A UN project showcased an exciting application: discovering an association between food price inflation and tweets about the price of rice.
  15. Social Media – Recommendation Systems Many types of recommendation systems • Facebook – Recommended Friends • LinkedIn – People You May Know • YouTube – Videos You May Like • Amazon – People Also Bought • Pinterest – Board Recommendation So, how do recommendation systems work?
  16. Social Media – Recommendation Systems People / Friend Recommendation - Use known information to predict ties - Friends of friends are likely to be friends Algorithm/research areas - Community detection - Structural holes
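The "friends of friends are likely to be friends" idea can be sketched as a mutual-friend count. This is a minimal illustration under stated assumptions, not how any of the platforms named in the slides actually rank suggestions; the function name `recommend_friends` and the sample graph are hypothetical.

```python
from collections import Counter

def recommend_friends(graph, user, top_n=3):
    """Rank people who are not yet friends of `user` by number of mutual friends.

    `graph` maps each person to the set of their friends (hypothetical data model).
    """
    friends = graph.get(user, set())
    mutual_counts = Counter()
    for friend in friends:
        for candidate in graph.get(friend, set()):
            # Skip the user themselves and anyone already connected.
            if candidate != user and candidate not in friends:
                mutual_counts[candidate] += 1
    # Sort by mutual-friend count (descending), then by name for stable output.
    ranked = sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

# Toy friendship graph, made up for illustration.
graph = {
    "ana": {"bo", "cy"},
    "bo": {"ana", "cy", "dee"},
    "cy": {"ana", "bo", "dee", "eli"},
    "dee": {"bo", "cy"},
    "eli": {"cy"},
}

print(recommend_friends(graph, "ana"))  # [('dee', 2), ('eli', 1)]
```

Here "dee" ranks first because she shares two friends with "ana". Community detection and structural-hole analysis, the research areas listed on the slide, go well beyond this simple count.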
  17. Social Media – Electioneering • What is Electioneering? - The activity of trying to persuade people to vote for a particular political party. • What is Big Data’s role in it? - Determine and target the most persuadable electoral base - Choose marketing media effectively for maximum reach for every dollar spent - Influence the influencers
  18. Social Media – Electioneering • Maximizing return per dollar – Match billing records (from the set-top box company) with the current voter list – Divide a day into 96 zones (15 minutes each) – Study the time-slot usage of the target electorate across 60 channels – Pick the slots with maximum reach per dollar • User Modelling – Rate each voter from 0 to 10 on how persuadable they are – Volunteers then call or visit voters with appropriate content • Micro-targeting – Monitor social media (Facebook, Twitter, etc.) – Micro-target voters by delivering custom messages to specific sub-groups
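The "maximum reach per dollar" step can be sketched as a greedy selection over (channel, zone) slots. This is only an illustration: the field names, reach figures, costs and budget are invented, and a greedy pick is a simple heuristic rather than the campaign's actual method or an optimal allocation.

```python
def slot_label(zone):
    """Start time of a zone, assuming 96 zones of 15 minutes each (zone 0 = 00:00)."""
    minutes = zone * 15
    return f"{minutes // 60:02d}:{minutes % 60:02d}"

def pick_slots(slots, budget):
    """Greedily buy slots in descending order of reach per dollar within `budget`."""
    ranked = sorted(slots, key=lambda s: s["reach"] / s["cost"], reverse=True)
    chosen, spent = [], 0
    for slot in ranked:
        if spent + slot["cost"] <= budget:
            chosen.append(slot)
            spent += slot["cost"]
    return chosen, spent

# Hypothetical slots: reach = target voters watching, cost = price in dollars.
slots = [
    {"channel": "A", "zone": 80, "reach": 5000, "cost": 1000},  # 5 voters per dollar
    {"channel": "B", "zone": 30, "reach": 1200, "cost": 100},   # 12 voters per dollar
    {"channel": "C", "zone": 50, "reach": 3000, "cost": 500},   # 6 voters per dollar
]

chosen, spent = pick_slots(slots, budget=700)
for slot in chosen:
    print(slot["channel"], slot_label(slot["zone"]), slot["reach"] / slot["cost"])
print("spent:", spent)
```

With a budget of 700, the sketch buys the slots on channels B and C and skips the expensive prime-time slot on channel A, even though A has the largest raw audience.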
  19. Social Media – Influence Marketing What is Social Influence? - Social influence occurs when one's opinions, emotions, or behaviours are affected by others, intentionally or unintentionally. What is Influence Marketing? - Discovering and predicting a user's influence on connected nodes and their ability to spread information.
  20. Social Media – Influence Marketing Use Case - Klout generates a score on a scale of 1-100 for a social media user to represent her/his ability to engage other people and inspire social actions. - In 2012, Cathay Pacific opened access to its SFO lounge to Klout users
  21. Big Data in Healthcare • Self-aware Medics • Sports and Fitness Tracking • Clinical Trials • Personalized Medicines • Genomics • Electronic Health Records
  22. Big Data in Healthcare • Data characteristics - 1.2 billion clinical documents are produced in the U.S. each year; 60% are in unstructured format - Health trackers - Genomics • Savings - Up to $450 billion can be saved if the healthcare industry uses big data analytics and patients make the right choices. - US Government recoveries from forfeiture, asset seizures and fines amounting to $4.3 billion. Figures are as of May, 2015
  23. Healthcare – Clinical Trials: Before the Big Data Era
  24. Healthcare – Personalized Medicine: Redefining Medicine
  25. Healthcare – Genomics Success Story Use Big Data and genomics to pinpoint the root cause of a disease Story - Joshua Osborn, a 14-year-old, was admitted to hospital with a high fever - An MRI showed brain swelling. However, the related series of tests all came back negative. - Doctors decided to run an experimental DNA technology - DNA was extracted from cerebrospinal fluid - Within 2 days, three million DNA sequences were extracted - From the sequences obtained, the team subtracted all known human elements - The remaining 0.02 percent belonged to a lethal bacterium called Leptospira - Treatment for the infection was started and within weeks Joshua was back home Underlying Big Data Technology - SNAP, a Spark-based sequence aligner
  26. Big Data in Smart Cities • Smart Transport • Traffic Management • Smart Governance • Smart Energy • Smart Economy
  27. Smart Cities – Internet of Things What is IoT? The Internet of Things, also called the Internet of Objects, refers to a network between objects such as household appliances; usually the network is wireless and self-configuring. (Wikipedia) Benefits - Dynamic control of life - Improved resource utilization - Automation support systems - Integrating physical systems with human society
  28. Smart Cities – Smart Transport Latest Use Case: eCall - Mandatory for all vehicles to have embedded impact sensors - Sensors can call emergency services in case of an impact - Devices are activated only when an accident occurs Savings - Expected to reduce response time by 40-50% - Time saved = lives saved: 2,500 lives annually Challenges - User privacy and concerns over being tracked and monitored
  29. Smart Cities – Smart Energy Use Case: Time-based energy pricing - Monitor energy usage using smart meters - Report usage to both the customer and the energy company in real time - Big data is used to predict and calculate pricing based on historical and current utilization Savings and benefits - Customers can better manage their energy usage - Potential to maximize savings on energy
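A small worked example shows how time-based pricing rewards shifting load away from peak hours. The peak window, the rates and the meter readings below are all hypothetical, and the sketch uses fixed price tiers; it does not attempt the predictive pricing from historical and current utilization that the slide describes.

```python
# Hypothetical tariff: 17:00-20:59 is peak; rates are in cents per kWh.
PEAK_HOURS = range(17, 21)
RATES_CENTS_PER_KWH = {"peak": 30, "off_peak": 10}

def tier_for(hour):
    """Return the price tier for an hour of the day (0-23)."""
    return "peak" if hour in PEAK_HOURS else "off_peak"

def bill_cents(readings):
    """Total cost in cents for smart-meter readings given as (hour, kWh) pairs."""
    return sum(kwh * RATES_CENTS_PER_KWH[tier_for(hour)] for hour, kwh in readings)

# One day of made-up readings, and the same day with the 18:00 load moved to 22:00.
readings = [(3, 1.0), (18, 2.0), (19, 1.5), (23, 0.5)]
shifted = [(3, 1.0), (22, 2.0), (19, 1.5), (23, 0.5)]

print(bill_cents(readings))  # 120.0
print(bill_cents(shifted))   # 80.0
```

Moving 2 kWh out of the peak window lowers the bill from 120 to 80 cents in this toy tariff, which is the kind of saving real-time usage reports are meant to make visible to the customer.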
  30. Smart Cities – Smart Energy Use Case: IBM HyREF - Cloud imaging technology tracks cloud movement - Sensors measure wind speed, temperature and direction - Can predict weather 1 month in advance at intervals of 15 minutes Savings and benefits - Can better manage the variable nature of wind - Better forecasts of energy generation - Enables integration of traditional sources of power generation in case of an outage
  31. Thank You