Human Big Data
Making sense of everything
Scarce < Data > Abundant
Hard Discs are Cheap
IBM 305 RAMAC
5 Meg Hard Drive, 1956
“...Should the current
technological and pricing
trends continue, we will see
2.5-inch 40-terabyte drives sell
for as little as US$40 by the
end of the decade.”
http://www.gizmag.com/hdd-storage-density/25004/
Unstructured
Data
not a problem
?
“It's a basic truth of the human condition, that
everybody lies. The only variable is about what...
…I’ve found that when you want to know the truth
about somebody, that someone is probably the last
person you should ask.” Dr Gregory House
Socially Desirable Answers
People will tell you
what you want to hear,
what they heard
and what they would like to hear…
…but rarely what they think
From attitudes to behaviours
Social Signals
“Let’s do coffee”
Not everyone (thing) lies
http://www.youtube.com/watch?feature=player_embedded&v=mkJ-Uy5dt5g
Language Tells
http://hedonometer.org/
Lose the attitude, watch your
behaviour and mind your language
Bryden et al. EPJ Data Science 2013 2:3 doi:10.1140/epjds15
http://www.epjdatascience.com/content/2/1/3
Looking for patterns can seriously
damage your health wealth
Edcrowle - http://www.flickr.com/photos/edcrowle/383301200/sizes/z/in/photostream/
Correlation is not causality
You are biased
Really biased
Perceptual / Attentional biases
observer-expectancy effect, anchoring and focusing,
confirmation bias, availability cascade, bandwagon
Attention
But you are better than that…
self-enhancement bias
Presentation is bias
"[Big Data] is sometimes seen as a cure-all, as computers were in the 1970s. Chris
Anderson…wrote in 2008 that the sheer volume of data would obviate the need for
theory, and even the scientific method….
"[T]hese views are badly mistaken. The numbers have no way of speaking for
themselves. We speak for them. We imbue them with meaning….[W]e may
construe them in self-serving ways that are detached from their objective reality.
"Data-driven predictions can succeed--and they can fail. It is when we deny our
role in the process that the odds of failure rise. Before we demand more of our
data, we need to demand more of ourselves….Unless we work actively to become
aware of the biases we introduce, the returns to additional information may be
minimal--or diminishing."
Nate Silver, The Signal and the Noise
Presentation is bias
Via “Data Visualization: a successful design process” Andy Kirk
Reification
Designing out bias
How good is that?
• Good.
• Very good.
• Extremely good.
Social Data
How effectively does it drive social change?
• Slightly effective
• Effective
• Highly effective
(84% of people in your job role say it is highly effective)
The iron cage
Stahlhartes Gehäuse
Max Weber
Game mechanics
points, quest, resource collection, progression, levels,
avatars, social graph, rewards, badges, leaderboard
Human
Big
Data
gather | understand | replay
Big Human Data


Editor's notes

  1. We are in the middle of a world-changing shift. When I started working in the business world, all of the data for running the business would fit into a single file in today’s spreadsheet applications. Data was scarce, highly sought after, pored over in intimate detail and highly valued. Today we generate huge amounts of data every single day, and one of the biggest challenges in business is dealing with information overload. The focus is less on gathering data, and more on curating and making sense of it. Technology to the rescue…
  2. IBM’s “portable” hard drive solution from the late 50s. You’ve probably seen this photo, or variations of it. That drive is actually less than 5 Meg: that’s probably less memory than is in your washing machine (at least if you’ve ever washed a USB memory stick by accident it is!). Today, though, storage is cheap. By the end of this decade, you will likely be able to affordably store the whole of today’s internet in the palm of your hand. That’s changing how we build computer systems, but it also changes how we deal with data.
  3. The other big, historical challenge with data was that we needed to have a good idea of what we were going to collect before we started collecting it. And if we wanted to change our minds halfway through, well, that was going to mean tears before bedtime and a lot of hard work. Today, technologies like NoSQL databases mean that we don’t have to worry about the structure of the data until we want to make sense of it. We can also collect different sorts of data in the same system, capturing more or less as our sources change. Platforms like Twitter and Facebook have led to the development of technologies that can process and analyze huge amounts of data in near-real time – restructuring and refactoring it as we go. It’s a revolution in our relationship with data.
  4. You have to love the idea of a fictional doctor telling us that we are all liars. Of course we aren’t. But we do bend the truth occasionally. Especially if we think we are being watched (or listened to…)
  5. Socially desirable responding is a major challenge with surveys and many other traditional data gathering methods. Certainly, experts can help to minimize its impact, and control for other issues like response set bias. It gets interesting when we turn those biases from issues into data. Technology allows us to record not just the answers, but the response times too. This is an altogether different sort of data.
  6. We can move from ‘expressed’ measures (attitudes) to observed measures (behaviours). Of course, not all ‘behavioural’ measures are actually measures. Foursquare check-ins, for example, are expressions – people don’t (usually) check in at every physical location that they visit. The choice of check-in locations is an expression of attitudes about themselves and the location brands. Social media, contrary to opinion, is not about transparency. It is about continual, partial transparency. We need to get smarter about understanding the data that we collect, and learn new techniques to control for the biases in it.
  7. Of course some data is more ‘objective’ – this is a lovely visualisation of the Autodesk organisation over time. In our early days of working with human data, we spent quite a lot of time building these sorts of visualisations. They have become cheaper and easier to produce, and they are certainly good discussion points. The bigger lesson, though, is that not all data matters, or at least that much of the data we see as important actually isn’t. I can predict more about the interactions of people in an organisation from the physical distance between their desks than I can from a hierarchical org chart. Objective information is good, but it is overvalued in business. Aggregate subjective information often tells us more. Not all opinions are meaningless!
  8. One of the most interesting things about social media is that it gives us more access than ever before to the raw language that people use. As software algorithms have become more advanced, and our understanding of language has improved, we can create software that can analyze, in aggregate, the emotional content of communications. The hedonometer is a great example – how happy is the Twitter-verse today? But language tells us much more…
  9. Shared vocabulary can predict social groupings and influence, in quite unexpected ways.
  10. But blindly looking for patterns is a dangerous sport. It hits many of the weak spots in our cognitive systems, and can lead us up all sorts of blind alleys.
  11. “Correlation is not causality” – get it printed on a t-shirt. Say it randomly in meetings. The assumption that unrelated events have causal links almost makes the business world go around. Literally. It is a much harder habit to break than you might think, for reasons I’ll come on to later. When we operate in the world of human data, it is an ever-present danger: misunderstanding how variables do (or don’t) relate. Camera tripods have cameras on them. Cameras take good pictures. Cameras don’t like getting wet. Andy here is on a camera tripod. He takes very good pictures. He hates getting wet. Andy is, of course, a photographer, not a camera. But ascertaining that from a few variables (rather than a few megabytes of data and a lifetime of learning) is a very non-trivial problem.
  12. We have a sea of cognitive biases that play into one another. We tend to fixate on the first thing that we see, becoming blind to other interpretations. We are biased towards spotting evidence that supports our hypothesis, and ignoring data that doesn’t. We value and believe things based on repetition more than reliability, and when you put that in a social context, we support what we believe that other people believe. It is a chain of events that leads to big mistakes, and big data is high-octane petrol, especially in the business context, where we value ‘objective’ data so highly. Numbers are not always objective; they are vulnerable to subjectivity.
  13. You will have seen this video online. Think through the consequences carefully. When told to diligently observe something, we completely miss the gorilla in the room, beating its chest. Apply that to big data. Our perceptual systems keep us safe from predators, and help us locate friends and relatives. They were not specified or tested for analysing terabytes of data on a computer screen…
  14. And, just to make it worse… We always overestimate our capabilities. When asked, on average, everyone is above average!
  15. There is no such thing as a neutral presentation of data. We always bring something of ourselves to the presentation, even if it is unconscious. Phenomenological approaches to psychological research understand, embrace and control for the biases of the researcher. Ignoring them, or even worse, denying their existence, simply increases their impact. Understand why (at an emotional level) you are measuring what you are measuring, and the story you tell when you present it. Humans communicate at the level of stories, not at the level of data, so tell stories, and understand stories. Each story is a potential narrative. Most data has multiple potential narratives. Without a narrative, an embracing context, data is meaningless, or at least meaning-less.
  16. By the way… Not all biased presentations are as simple and obvious as this example. But look at what is going on here! What do you learn? What can you tell about what is being said (and what is not being said)?
  17. The biggest challenge with Human Big Data is that it breaks the scientific model. Most people working in the data processing world come from a background that draws on the epistemology of natural science. We build a hypothesis, we construct experiments, we measure things. We gradually ‘discover’ the nature of the world around us. Of course human big data doesn’t work that way. Marketing people are paid to CHANGE WHAT PEOPLE BELIEVE. So, if we are measuring what people believe (attitudes) or how they behave (which is related to what they believe, says the marketing world!), we are changing the thing that we are measuring. At a higher level, we are using the learning from big human data to architect the social construct that people operate within. Marketing has never been static. What worked yesterday won’t always work today. Humans adapt and normalise. If you use behavioural economics in your pricing, eventually you change the behaviours. Why don’t you buy the cheapest or the most expensive wine today?
  18. When we gather and present human big data, we have to do our best to design out the biases. But we can turn this all on its head and use our biases, combined with big data, to change attitudes and outcomes.
  19. Nudge is common parlance today. Decision architectures generally play on age-old cognitive biases. When we add social data…
  20. We turn on the turbo button when we add social proof. The Facebook like button that you see on websites has faces on it for a reason. Instrumenting behaviours and playing back the data is powerful.
  21. But we have to be careful. It can be too powerful. Overly rationalizing the world, and using social forces to drive compliance, can lead to an icy and brittle world. We need to use these new tools with caution. This is not a single-move chess game, and eventually there is a tipping point at which contrarian approaches become the dominant strategy. If we measure emotional responses, and engineer the ultimate film script, and then every film studio follows it, suddenly it becomes bland. We have to apply our learning lightly, and with a bit of fun…
  22. Probably the fastest way to start a fist fight at a game developers conference is to describe “gamification” as psychology for dummies. Ok. Some parts of that statement might not be true. But what is true is that we can usefully borrow from the toolbox of gamification. In gaming, the players enter “the circle” of the game – temporarily adopting a perspective on how the world works. We can escape the game. When we can’t, it stops being a game. There is something else that gamification gives us: the construction of measures. Not everything that we want to measure in human big data has a metric that we can express. In the world of games, we create measures and play them back. Number of lives, energy level… Constructed measures can be used to turn the disadvantages of reification into a positive advantage. We create and combine the things that we can measure into new measures. Number of Twitter followers, number of Facebook likes. These are actually all just constructed measures, which are used to drive behaviours. The players play the game to earn the points they need to level up!
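
The warning in note 11 that correlation is not causality can be made concrete with a small simulation. What follows is only a minimal sketch under stated assumptions, not something taken from the talk: the variable names (`hidden`, `metric_a`, `metric_b`), the noise levels and the sample size are all invented for illustration. Two metrics are each driven by the same hidden factor and have no effect on one another, yet they correlate strongly until the hidden factor is controlled for.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def residuals(ys, zs):
    """What is left of ys after removing its best straight-line fit on zs."""
    n = len(zs)
    mz, my = sum(zs) / n, sum(ys) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for z, y in zip(zs, ys)]


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical hidden driver (an assumption for illustration only).
hidden = [rng.gauss(0, 1) for _ in range(n)]

# Two observed metrics: each is the hidden driver plus its own noise.
# Neither metric depends on the other.
metric_a = [h + rng.gauss(0, 0.5) for h in hidden]
metric_b = [h + rng.gauss(0, 0.5) for h in hidden]

raw = pearson(metric_a, metric_b)
controlled = pearson(residuals(metric_a, hidden), residuals(metric_b, hidden))

print(f"raw correlation between the two metrics: {raw:.2f}")
print(f"after controlling for the hidden driver: {controlled:.2f}")
```

With these assumed noise levels the raw correlation should sit near its theoretical value of 1 / (1 + 0.25) = 0.8, while the controlled figure should fall close to zero. In real human data the hidden driver is rarely known or measured, which is exactly why the habit is so hard to break.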