Big Data Predictive Analytics
Using social media to predict the results of
Dancing with the Stars

                                Rick Kawamura
                                      @r_kawamura
The Value of Big Data

The question: Is unstructured, social media data credible, and can it be used to accurately predict future events?
The Value of Big Data

The test: Determine who will be eliminated from the show the following day.

Semantic Analysis
            Collect data from Twitter, Facebook, and various fan sites.
            Cleanse data.
            Apply sentiment analysis.
            Organize, graph, and analyze.
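
The collect, cleanse, and sentiment steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word lists, the function names (`cleanse`, `label`, `tally`), and the sample comments are all hypothetical, and the deck does not say which sentiment tool was actually used.

```python
import re
from collections import Counter, defaultdict

# Hypothetical mini-lexicons for illustration only; a real analysis would
# need a far larger lexicon or a trained model.
POSITIVE = {"love", "great", "amazing", "gorgeous", "voted"}
NEGATIVE = {"hate", "awful", "worst", "terrible", "fixed"}


def cleanse(text):
    """Lowercase the text, drop URLs, and return its words."""
    without_urls = re.sub(r"https?://\S+", " ", text.lower())
    return re.findall(r"[a-z']+", without_urls)


def label(text):
    """Label a comment by counting positive and negative lexicon hits."""
    words = cleanse(text)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def tally(comments):
    """Count sentiment labels per contestant from (contestant, text) pairs."""
    counts = defaultdict(Counter)
    for contestant, text in comments:
        counts[contestant][label(text)] += 1
    return counts


# Made-up comments, not data from the study.
comments = [
    ("Kate", "I LOVE Kate! http://example.com/clip"),
    ("Kate", "Kate is the worst dancer, the show must be fixed"),
    ("Nicole", "Nicole was amazing tonight"),
]
counts = tally(comments)
print({name: dict(c) for name, c in counts.items()})
```

Counting lexicon hits is the simplest possible approach; it ignores negation and sarcasm, which is part of why the cleansing and tuning steps matter.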
Fascinating – Kate’s Story of Survival

Kate Gosselin (from Jon & Kate Plus 8) was the least talented of the 12 dancers, but survived 5 weeks before being eliminated.

For 5 weeks, Kate stole the headlines, not for her dancing, but for her meltdowns, fights with her partner, and how she continued to survive despite poor dancing performances.

While common sense would lead one to believe she was sure to be eliminated each week, the data revealed a completely different (more accurate) story.

Many comments throughout Twitter and Facebook showed viewers' disdain for Kate and a serious credibility problem for ABC and DWTS. How could the worst dancer continue to survive? “The show must be fixed.” “ABC is keeping her on for the ratings.”

Yet week after week, the data showed she was safe: America was voting to keep her on.
How the data showed Kate was safe
1. The week before she was eliminated, and similar to most other weeks, Kate received the lowest score from the judges.

2. The graph shows each contestant's percent of all negative comments: Kate received close to 80%. The negative sentiment was strong.

3. Positive sentiment, the best predictor of fan votes, showed Kate clearly had more support than four other contestants despite her large volume of negative comments.

4. Combining the judges’ scores with positive fan sentiment, it was clear Kate would be safe.

[Chart: % of Total Comments]
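
One way to combine judges' scores with positive fan sentiment is sketched below. The deck does not state the formula that was used, so the share-based combination here is an assumption, as are the function name `predict_elimination` and every number in the example.

```python
def predict_elimination(judges_scores, positive_comments):
    """Return (predicted_eliminee, combined_scores).

    Assumption: each contestant's combined score is their share of all
    judges' points plus their share of all positive comments. The lowest
    combined score is the predicted elimination.
    """
    total_points = sum(judges_scores.values())
    total_positive = sum(positive_comments.values())
    combined = {
        name: judges_scores[name] / total_points
        + positive_comments[name] / total_positive
        for name in judges_scores
    }
    return min(combined, key=combined.get), combined


# Hypothetical numbers: Kate has the lowest judges' score but solid fan support.
judges_scores = {"Kate": 15, "Nicole": 28, "Chad": 20, "Erin": 24}
positive_comments = {"Kate": 300, "Nicole": 400, "Chad": 100, "Erin": 200}

predicted, combined = predict_elimination(judges_scores, positive_comments)
print(predicted)
```

With these made-up inputs the contestant with the weakest fan support is predicted to leave, even though Kate has the lowest judges' score, which mirrors the pattern described in the four points above.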
The week Kate was eliminated – Data never lies
1. Every week, Kate had the lowest score from the judges. This week was no different.

2. Kate alone received 40% of all comments in social media, but 90% of them were negative.

3. In previous weeks, Kate had more positive comments than several of her competitors. However, this week, while her total volume remained high, her percent of positive comments dropped significantly.

4. Given that Kate had the lowest judges' score and the lowest number of positive comments, it was clear this week that she would be eliminated.

[Charts: Judges’ Scores; % of Total Comments; % of Positive Comments]
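
The three percentages discussed on this slide can be computed from simple per-contestant counts. The sketch below is illustrative only: the counts are hypothetical, chosen so that Kate matches the 40% and 90% figures quoted above, and neutral comments are assumed to have been dropped already.

```python
def comment_metrics(counts):
    """counts maps name -> (positive, negative). Returns per-contestant shares."""
    total = sum(pos + neg for pos, neg in counts.values())
    total_positive = sum(pos for pos, _ in counts.values())
    metrics = {}
    for name, (pos, neg) in counts.items():
        volume = pos + neg
        metrics[name] = {
            # Share of every comment that mentions this contestant.
            "share_of_all_comments": volume / total,
            # Fraction of this contestant's comments that are negative.
            "negative_rate": neg / volume,
            # Share of all positive comments that belong to this contestant.
            "share_of_positive_comments": pos / total_positive,
        }
    return metrics


# Hypothetical counts, not data from the study.
counts = {
    "Kate": (40, 360),
    "Nicole": (180, 20),
    "Chad": (130, 20),
    "Erin": (210, 40),
}
metrics = comment_metrics(counts)
lowest_positive = min(metrics, key=lambda n: metrics[n]["share_of_positive_comments"])
print(lowest_positive, metrics["Kate"])
```

Note the arithmetic behind the slide: a contestant with 40% of all comments, 90% of them negative, has positive comments amounting to at most 4% of total volume.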
Key Takeaways
                Social, Unstructured Big Data is Credible
                Social data contains true sentiment that can be applied to
                data models to provide insight and intelligence.


                                             Clarity of Data
                                             In some cases, the answer is obvious. Other times it is a
                                             general sense or trend, but may not pinpoint the exact
                                             target.


                Sentiment Analysis
                Sentiment analysis is a valuable technology, but fine-tuning the “degree
                of sentiment” can be a challenge. Consider how you would rate the
                following: “I love Nicole.” “I voted for Chad.” “Erin is gorgeous.”


                                             Predicting Future Events
                                             As Kate's case showed, social media data has real value
                                             in helping to predict future results.


                Data Veracity
                Who better represents America’s sentiment? Those who cast
                their votes by calling in or texting? Or those who express their
                views via social media?
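
To make the “degree of sentiment” challenge concrete, here is a minimal sketch that grades the three example comments from the Sentiment Analysis takeaway. The cue words and their weights are purely hypothetical; choosing them well is exactly the tuning problem the takeaway describes.

```python
# Hypothetical weights: how strongly each cue suggests the author will vote.
CUE_WEIGHTS = {"voted": 1.0, "love": 0.7, "gorgeous": 0.3}


def degree_of_sentiment(comment):
    """Return the strongest cue weight found in the comment, or 0.0 if none."""
    words = comment.lower().replace(".", " ").split()
    return max((CUE_WEIGHTS.get(w, 0.0) for w in words), default=0.0)


for text in ["I love Nicole", "I voted for Chad", "Erin is gorgeous"]:
    print(text, degree_of_sentiment(text))
```

Here an explicit statement of voting is rated above general affection, which in turn is rated above a compliment on appearance. A different analyst could reasonably pick a different ordering.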
Extracting Value from Social Media – 5 Tips
                 Data Trumps Conventional Wisdom
                 Think of Kate. Despite the overwhelming volume of
                 negative sentiment, her percent of positive sentiment still
                 dwarfed that of many contestants who lacked any drama.


                                              Timing is Critical
                                              Working with data as close to an event as possible is most
                                              valuable. Utilizing data in real-time can provide a
                                              competitive advantage.


                 Don’t be blind to the Noise Factor
                 There is a significant amount of non-essential noise in social
                 media data that needs to be cleansed. It’s not all fluff, but much
                 of it may not pertain to the question you are trying to answer.


                                              Not all Social Media Sentiment is Created Equal
                                              Not all data is needed or equal in weight. Is one tweet
                                              equal to one blog post? Is negative sentiment as
                                              important as positive sentiment?

                 Don’t Look at Data in a Vacuum
                 Context around the question you are trying to answer plays an
                 important role. Knowing to disregard negative sentiment because
                 votes are only cast for keeping contestants on the show is critical.
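
Two of the tips above, weighting sources differently and disregarding negative sentiment when votes can only keep a contestant on, can be sketched together. The source weights and sample mentions are hypothetical; the deck raises the weighting question without giving values.

```python
# Hypothetical source weights; unknown sources fall back to 1.0.
SOURCE_WEIGHTS = {"tweet": 1.0, "blog_post": 3.0}


def weighted_support(mentions):
    """Sum source weights over positive mentions only.

    mentions is an iterable of (contestant, source, sentiment). Negative and
    neutral mentions are skipped because votes are only cast to keep a
    contestant on the show.
    """
    support = {}
    for contestant, source, sentiment in mentions:
        if sentiment != "positive":
            continue
        weight = SOURCE_WEIGHTS.get(source, 1.0)
        support[contestant] = support.get(contestant, 0.0) + weight
    return support


# Made-up mentions, not data from the study.
mentions = [
    ("Kate", "tweet", "negative"),
    ("Kate", "tweet", "positive"),
    ("Kate", "blog_post", "positive"),
    ("Chad", "tweet", "positive"),
]
support = weighted_support(mentions)
print(support)
```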
Thanks for Viewing
@r_kawamura
