Visualizing Data Journalism
Ritvvij Parrikh,
Founder, www.pykih.com
Fifth Elephant, Delhi Run-up Event,
India Today Mediaplex, June 14, 2014
Pykih is a data visualization company. We build custom visual representations of
large data sets to make data actionable for readers.
We have satisfied customers in six countries.
Introduction
• Data Viz.

• Theory

• Case Study 1

• Case Study 2

• Summary

• Challenges in Data Journalism

• What we are doing about it for ourselves
Agenda
Data Visualization
Let’s explore the humble pie chart…
Party Percentage
E 38%
D 25%
C 20%
B 15%
A 2%
Break the whole into parts.
Data: One-dimensional
Visual encoding: Area
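A minimal sketch of what "break the whole into parts" means for a pie: each percentage becomes a slice whose sweep angle is its share of 360 degrees. The function name `slice_angles` is made up for illustration; the numbers are the ones in the table above.

```python
# Percentages from the slide's Party / Percentage table.
shares = {"E": 38, "D": 25, "C": 20, "B": 15, "A": 2}


def slice_angles(shares):
    """Return (start, end) angles in degrees for each category, in input order."""
    total = sum(shares.values())
    angles = {}
    start = 0.0
    for name, value in shares.items():
        sweep = 360.0 * value / total  # share of the full circle
        angles[name] = (start, start + sweep)
        start += sweep
    return angles


for party, (start, end) in slice_angles(shares).items():
    print(f"{party}: {start:6.1f} to {end:6.1f} degrees")
```

The same proportions could drive any other shape (rectangle, triangle, funnel); only the geometry changes, not the data.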
New Terms
• Dimension: Columns by which you group data.

• Facts: Numbers that you can count, sum, average, etc.

• Examples:
  • Seat count by party
  • Seat count by party and state

• Visual Encoding: Area, Position, Colour, Length, Thickness, etc.
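To make the two terms concrete, here is a small sketch in which `party` and `state` are dimensions and `seats` is the fact. The rows and the helper `sum_by` are hypothetical, not real election data.

```python
from collections import defaultdict

# Hypothetical rows: (party, state, seats).
# "party" and "state" are dimensions; "seats" is a fact.
rows = [
    ("A", "X", 2),
    ("A", "Y", 5),
    ("B", "X", 1),
    ("B", "Y", 7),
]


def sum_by(rows, key):
    """Sum the fact (seats) for each value of the chosen dimension(s)."""
    totals = defaultdict(int)
    for party, state, seats in rows:
        totals[key(party, state)] += seats
    return dict(totals)


# One dimension: seat count by party.
seats_by_party = sum_by(rows, lambda party, state: party)
# Two dimensions: seat count by party and state.
seats_by_party_and_state = sum_by(rows, lambda party, state: (party, state))

print(seats_by_party)
print(seats_by_party_and_state)
```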
One-dimensional Charts
PIE is a one-dimensional chart
One-dimensional Charts …
A pie could have been a random shape broken by percentage
One-dimensional Charts …
Pie
Amoeba
Percentage Rectangle
Donut
Percentage Triangle
Bubble
Election Donut
Funnel
Percentage Bar
Percentage Column
#1 - The same data can be Visualized in many (MANY!) different ways.
One-dimensional Charts …
What is wrong here?
Source: thehindu.com
Problems:
• Colour communicates no data
• 3D communicates no data
#2 - Your goal is to communicate data. Wrong use of visual encoding confuses.
One-dimensional Charts …
What is wrong here?
Source: firstpost.com
Problems:
• Colour
• Too many values. Too cluttered.
#3 - AREA encoding is useful for only a few values, after which it becomes unreadable.
One-dimensional Charts …
Solution to problem of restricted space? Create a custom chart.
New Data Set
One dimensional: Seat count by party
Grouped one dimensional: Seat count by party grouped by alliance
Grouped One-dimensional Charts
Group various bubbles by colours
Party Alliance Percentage
A NDA 38%
B NDA 25%
C NDA 20%
D UPA 15%
E Others 2%
#4 - You can always fit in an extra dimension (GROUP) in charts using colour.
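A rough sketch of lesson #4: the grouping dimension (alliance) is mapped to colour while the fact (percentage) still drives bubble size. The palette values and the helper `colour_by_group` are assumptions for illustration only.

```python
# Rows from the slide: (party, alliance, percentage).
rows = [
    ("A", "NDA", 38),
    ("B", "NDA", 25),
    ("C", "NDA", 20),
    ("D", "UPA", 15),
    ("E", "Others", 2),
]

# Hypothetical palette; any set of visually distinct colours would do.
PALETTE = ["#e67e22", "#2980b9", "#7f8c8d"]


def colour_by_group(rows, palette):
    """Assign one colour per alliance, in order of first appearance."""
    colours = {}
    for _party, alliance, _pct in rows:
        if alliance not in colours:
            colours[alliance] = palette[len(colours) % len(palette)]
    return colours


group_colours = colour_by_group(rows, PALETTE)
bubbles = [
    {"party": party, "size": pct, "colour": group_colours[alliance]}
    for party, alliance, pct in rows
]

for bubble in bubbles:
    print(bubble)
```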
New Data Set
One dimensional: Seat count by party
Grouped one dimensional: Seat count by party grouped by alliance
Two dimensional: Which party won in which constituency
Two-dimensional Charts
Plot two data points
Party Constituency
A Z
B Y
C X
D V
E W
Visual encoding: Position, Length
Two-dimensional Charts…
Connect the dots and you get a line chart.
Two-dimensional Charts…
Scatter, Line, Area, Bar, Column, Spider
All these charts require the same data.
#5 - Number of dimensions in data determines which chart to use.
New Data Set
One dimensional: Seat count by party
Grouped one dimensional: Seat count by party grouped by alliance
Two dimensional: Which party won in which constituency
Weighted two dimensional: Which party won in which constituency by what vote margin
Weighted Two-dimensional Charts
This is a 2d chart.
Weighted Two-dimensional Charts …
Let’s add weight to it, hence now we have three data points
X axis Y axis Weight
A Z 40
B Y 20
C X 1
D V 300
E W 60
Visual encoding: Position, Length, Area
Weighted Two-dimensional Charts …
Weighted Scatter, Circle Comparison
All these charts require the same data.
#6 - You can always fit in an extra fact (WEIGHT) in charts using size.
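One way to sketch lesson #6: when weight is encoded as area, the radius should grow with the square root of the weight, otherwise large values look far bigger than they are. `MAX_RADIUS` and `bubble_radius` are hypothetical names; the weights come from the table above.

```python
import math

# Rows from the slide: (x, y, weight).
points = [("A", "Z", 40), ("B", "Y", 20), ("C", "X", 1), ("D", "V", 300), ("E", "W", 60)]

MAX_RADIUS = 30.0  # hypothetical pixel radius for the largest bubble


def bubble_radius(weight, max_weight, max_radius=MAX_RADIUS):
    """Scale radius with sqrt(weight) so that bubble area is proportional to weight."""
    return max_radius * math.sqrt(weight / max_weight)


max_weight = max(weight for _, _, weight in points)
radii = {x: bubble_radius(weight, max_weight) for x, _, weight in points}

for name, radius in radii.items():
    print(f"{name}: radius {radius:.1f}")
```

With this scaling, E (weight 60) covers three times the area of B (weight 20), which matches the ratio in the data.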
New Data Set
One dimensional: Seat count by party
Grouped one dimensional: Seat count by party grouped by alliance
Two dimensional: Which party won in which constituency
Weighted two dimensional: Which party won in which constituency by what vote margin
Grouped weighted two dimensional: Which party won in which constituency by what vote margin grouped by alliance
Grouped Weighted Two-dimensional Charts
Grouped Weighted Scatter, Grouped Circle Comparison
Visual encoding: Position, Length, Area, Colour
Multi-series Two-dimensional Charts …
Range, Gantt, Multi-series Line
Group Column, Stack Column, Group Stack Column
Stack Area, Stack Percentage Area
Add more dimensions in creative ways.
Multi-series Two-dimensional Charts …
What is right and wrong here?
Source: livemint.com
Is the equities rally percolating into the broader market?
Good parts:
• Y axis starts from 97 instead of 0
Bad parts:
• The BSE Small-cap line is not visible, and that is the story.
#7 - The purpose of a line chart is to show trend. Focus on it.
Multi-series Two-dimensional Charts …
What is wrong here?
Source: livemint.com
Does IMF wear rose-tinted glasses?
Problems:
• Cannot find the IMF line.
#8 - Highlight the story for the user. Use color to highlight, not confuse.
New Data Set
All the data we encountered so far was relational (RDBMS-style), i.e. it could fit in a
spreadsheet (rows and columns).

Sometimes data is more complex. It can have “relationships”.

Types of relationships:
• Hierarchy / Tree
• Multi-level relationships
Tree Charts
{
  "name": "root",
  "children": [
    {
      "name": "A",
      "children": [
        {"name": "A1"},
        {"name": "A2"},
        {"name": "A3"},
        {"name": "A4"}
      ]
    }
  ]
}
Visual encoding: Position
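A short sketch of how such a tree can be walked before it is drawn. The dict mirrors the `name` / `children` structure above; the helpers `walk` and `leaf_count` are illustrative names, not part of any charting library.

```python
# The tree from the slide, written as a Python dict.
tree = {
    "name": "root",
    "children": [
        {"name": "A", "children": [
            {"name": "A1"}, {"name": "A2"}, {"name": "A3"}, {"name": "A4"},
        ]},
    ],
}


def walk(node, depth=0):
    """Yield (depth, name) for every node, parents before children."""
    yield depth, node["name"]
    for child in node.get("children", []):
        yield from walk(child, depth + 1)


def leaf_count(node):
    """Number of leaves under a node; a dendrogram needs one slot per leaf."""
    children = node.get("children", [])
    if not children:
        return 1
    return sum(leaf_count(child) for child in children)


for depth, name in walk(tree):
    print("  " * depth + name)
```

Depth gives the position along one axis and the leaf index gives the position along the other, which is why position alone is enough to encode a plain tree.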
Tree Charts
Dendrogram, Circular Dendrogram
Grouped Weighted Tree Charts
Packed Circle, Sunburst, Tree Rectangle, Tree Bar, Grouped Weighted Tree
Visual encoding: Position, Size, Colour
Grouped Multi-level Relationship Charts
{
  "nodes": [
    {"name": "A", "group": "G1"},
    {"name": "B", "group": "G2"},
    …
  ],
  "relations": [
    {"from": "A", "to": "B"},
    {"from": "A", "to": "C"},
    …
  ]
}
Visual encoding: Position
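A minimal sketch of turning the `nodes` / `relations` structure into an adjacency list, which is what most graph layouts consume. Node "C" and its group are made up to complete the example, and relations are treated as undirected, which is an assumption.

```python
from collections import defaultdict

# Hypothetical completion of the slide's structure; node "C" and its group are made up.
graph = {
    "nodes": [
        {"name": "A", "group": "G1"},
        {"name": "B", "group": "G2"},
        {"name": "C", "group": "G2"},
    ],
    "relations": [
        {"from": "A", "to": "B"},
        {"from": "A", "to": "C"},
    ],
}


def adjacency(graph):
    """Map every node name to the sorted list of its neighbours (undirected)."""
    neighbours = defaultdict(set)
    for relation in graph["relations"]:
        neighbours[relation["from"]].add(relation["to"])
        neighbours[relation["to"]].add(relation["from"])
    return {node["name"]: sorted(neighbours[node["name"]]) for node in graph["nodes"]}


print(adjacency(graph))
```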
Grouped Multi-level Relationship Charts
Graph
Collapsible Graph
Hive
#9 - Look for relationships across data sets.
Weighted Grouped Multi-level Relationship Charts
Sankey
Visual encoding: Position, Color, Size
Case: Mumbai Local Fare Chart
A fare exists for travel between stations “A” and “B”. Hence, it is a relationship chart.
Case: Mumbai Local Fare Chart
Matrix
Half Matrix
[
{"node1": "A", "node2": "B", "weight": 300},
{"node1": "A", "node2": "C", "weight": 900},
…
]
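The step from Matrix to Half Matrix can be sketched as follows, assuming the fare from A to B equals the fare from B to A (the slides do not state this explicitly). The helper names are hypothetical, and the elided entries are left out.

```python
# Edge list from the slide; the elided entries are left out.
fares = [
    {"node1": "A", "node2": "B", "weight": 300},
    {"node1": "A", "node2": "C", "weight": 900},
]


def full_matrix(edges):
    """Build a station-by-station matrix; cells with no known fare stay None."""
    stations = sorted({e["node1"] for e in edges} | {e["node2"] for e in edges})
    matrix = {a: {b: None for b in stations} for a in stations}
    for e in edges:
        matrix[e["node1"]][e["node2"]] = e["weight"]
        matrix[e["node2"]][e["node1"]] = e["weight"]  # assumes symmetric fares
    return stations, matrix


def half_matrix(stations, matrix):
    """Keep only cells above the diagonal; the rest is redundant if fares are symmetric."""
    return {
        (a, b): matrix[a][b]
        for i, a in enumerate(stations)
        for b in stations[i + 1:]
    }


stations, matrix = full_matrix(fares)
print(half_matrix(stations, matrix))
```

For n stations the full matrix has n × n cells while the half matrix has n × (n − 1) / 2, which is the space saving the design relies on.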
#10 - Look for limitations. They can help you improve design.
Weighted Two-level Relationship Charts …
Chord
Number of people travelling between various stations
• One dimensional charts
• Grouped one dimensional charts

• Two dimensional charts
• Weighted Two dimensional charts
• Grouped Two dimensional charts
• Grouped Weighted Two dimensional charts

• Multi-dimensional Charts

• Tree Charts
• Grouped Weighted Tree Charts

• Multi-level Relationships Charts
• Grouped Weighted Multi-level Relationships Charts

• Two-level Relationships Charts
• Grouped Weighted Two-level Relationships Charts
Taxonomy of Standard Data Visualizations
The same data can be visualized in many (MANY!)
ways. Without exploring the data, you will end up
visualizing all your data in pies, lines and bars.
Most Imp. Lesson
                            One Dimension  Two Dimension  Multi-Dimension  Relationship  Hierarchical  Geo Maps
Dimension: Time             N              Y              Y                N             N             N
Dimension: Group            Y              Y              Y                Y             Y             Y
Fact: Weight                N              Y              Y                Y             Y             Y
Group and Weight            N              Y              Y                Y             Y             N
Fact: Many values           May be         Y              Y                Y             Y             Y
Multiple levels / Zoomable  N              N              N                Y             Y             Y
Implications
List of Visual Encodings
Source: http://complexdiagrams.com/properties
Case Study #1:
Let’s apply what we learnt
IPL Score Card
ESPNCricInfo Score Card
Ball by ball Commentary
Per Batsman Statistics
Per Bowler Statistics
Fall of Wickets
Partnerships
Two innings
Pre-match: Toss, Playing 11, Location, Time
Post-match: Win, by how much, Man of the match
Second Innings: Current Run Rate, Required Run Rate, Target score
Overs: Most important data-point
1. Overs = Time
2. One over
   1. has_many balls
   2. has_one bowler
   3. has_many batsmen
3. Existence of batsmen across overs is partnerships
4. Partnerships and Fall of wickets are the same data set, viewed differently
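The relationships above can be sketched as a small data model. This is an illustration, not the model used in the live scorecard: the class and function names are invented, the sample data uses placeholder names, and a batsman who does not face a ball in an over is ignored for simplicity.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Ball:
    batsman: str
    runs: int


@dataclass
class Over:
    number: int
    bowler: str                                      # has_one bowler
    balls: List[Ball] = field(default_factory=list)  # has_many balls

    @property
    def batsmen(self) -> Set[str]:                   # has_many batsmen
        return {ball.batsman for ball in self.balls}


def batsman_spans(overs: List[Over]) -> dict:
    """Return {batsman: (first_over, last_over)}, i.e. the bars of a Gantt chart.

    Assumes overs are passed in playing order.
    """
    spans = {}
    for over in overs:
        for name in over.batsmen:
            first, _ = spans.get(name, (over.number, over.number))
            spans[name] = (first, over.number)
    return spans


# Hypothetical data; player and bowler names are placeholders.
overs = [
    Over(1, "Bowler1", [Ball("P", 1), Ball("Q", 4)]),
    Over(2, "Bowler2", [Ball("P", 0), Ball("R", 2)]),
]

print(batsman_spans(overs))
```

Two batsmen whose spans overlap form a partnership, and the over where a span ends is a fall of wicket, which is why both views come from the same data.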
Ball by ball Commentary
Partnerships
Combine the two
Weighted two-dimensional chart
Y-axis: Balls per over
X-axis: Overs + Bowlers
Gantt chart
Y-axis: Batsmen
X-axis: Overs + Bowlers
All other “zoomable” information is shown via interactions
Putting it all together
Let’s see it live
http://www.firstpost.com/cricket-live-score/IPL/1-jun-2014-kolkata-knight-riders-versus-kings-xi-punjab/2173/175977
Less reading. No scrolling. More awareness.
Case Study #2:
Let’s apply what we learnt
Election Counting Day
Election Counting Day
Data Set:
• India has 50+ regional parties and two national parties.
• During Election Counting Day (live), seats are either “Leading” or “Won”.

Data Properties / Relationships:
• Hierarchical relation between Alliance and Party
• Won is confirmed. Leading is transient.

What did readers want to know this election:
• How badly would UPA lose
• How big would the BJP victory be
• How big would the impact of AAP be
Hence, all other parties can be clubbed into “Others”.

Real-world facts to inspire design:
• BJP is a right-wing party
• AAP is left-most, followed by UPA
• The Sansad Hall is a semi-circle

Design cues drawn from the above: Group, Tree, Weight, Limitation, Shape, Placement.
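A rough sketch of the counting-day roll-up described above: seats are tallied per alliance, with “Won” and “Leading” kept apart because one is confirmed and the other is transient. The feed rows are hypothetical, not real results, and clubbing every party without an alliance into “Others” is one possible reading of the slide.

```python
from collections import defaultdict

# Hypothetical live feed rows: (party, alliance, status). Not real results.
seats = [
    ("P1", "NDA", "Won"),
    ("P1", "NDA", "Leading"),
    ("P2", "UPA", "Won"),
    ("P3", None, "Leading"),  # no alliance: clubbed into "Others"
    ("P4", None, "Won"),
]


def tally(seats):
    """Count Won and Leading seats per alliance, clubbing unaffiliated parties into 'Others'."""
    counts = defaultdict(lambda: {"Won": 0, "Leading": 0})
    for _party, alliance, status in seats:
        counts[alliance or "Others"][status] += 1
    return dict(counts)


for alliance, count in tally(seats).items():
    print(alliance, count)
```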
Choosing the right Grouped Weighted Tree Chart
Packed Circle, Sunburst, Tree Rectangle, Tree Bar, Grouped Weighted Tree
Visual encoding: Position, Size, Colour
Election Counting Day …
Sunburst
Sansad Chart
Focus on what is most important.
Alliance is more important than Party. We spent 200% more time reversing the hierarchy.
Let’s see it live
http://firstpost.com/election-results
Summary
1. Study properties and relationships of your Data Set
2. Use your visual encodings wisely
Challenges in Data Journalism
Data Collection: Govt. data, APIs, Scrape, Mine web, PDFs
What’s the story: Clean the data, Model the data, Investigate
Visualize: Design, Build
Story: Write
Roles: Journalist, Developer, Designer
Technology is an integral part of data journalism.
Steps in data journalism
Data Driven Stories: Day-to-day short stories derived from data.
Visualization App: Big apps to educate readers about large and important events, e.g. budget, election, etc.
Formats in data journalism
Format #1 - Data Driven Stories
Source: http://factchecker.in/data-are-crimes-against-scheduled-castes-on-an-upswing-in-india/
Badaun Case —> Find legit Data —> Analyse —> Plot —> Story
Format #2 - Visualization Apps
The same four steps apply: Data Collection, What’s the story, Visualize, Story.
Format: Visualization app: Journalist, Developer, Designer
Format: Data Driven Stories: Journalist
Implication
High Level
1. Quick access to appropriate data set
2. Quick analysis of this data
3. Consistently churn out neat charts, graphs and maps

Technical
1. Live Data Modelling
2. SEO
3. How to handle high traffic

From pykih perspective
1. How do you consistently build beautiful, real-time Visualizations?
Challenges
What we are doing about it
In-house tool called "Backstage"
#1 - Instead of waiting for data to be standardised, we want to make large-scale,
high-velocity, multi-format data extraction durable.

#2 - Instead of expecting data-users / journalists to have analytical skills, we are:
• simplifying exploration of large data sets
• automating extraction of metadata from data sets
• simplifying assisted data standardisation
• building tools for assisted analysis

#3 - Instead of expecting data-users / journalists to visualize data correctly, we are
attempting to automate metadata-driven visualization.

Other Experiments
• A data-driven blogging software

• Configuration Editor
Principles
—> Demo the worker
—> Demo the census dashboard
—> ISO example
—> Demo NLP based Date Standardiser
—> Story is in the outliers
Example: If data is ordinal, colour automatically leverages saturation;
if data is nominal (categorical), colours are distinct.
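A minimal sketch of that rule, assuming the metadata already tells us whether a column is ordinal or nominal. The function `palette`, the fixed blue hue and the saturation steps are illustrative choices, not how Backstage does it.

```python
import colorsys


def palette(kind, n):
    """Return n hex colours: one hue with rising saturation for ordinal data,
    evenly spaced distinct hues for nominal data."""
    colours = []
    for i in range(n):
        if kind == "ordinal":
            hue, sat = 0.6, (i + 1) / n  # fixed hue (assumed blue), increasing saturation
        elif kind == "nominal":
            hue, sat = i / n, 0.8        # evenly spaced hues
        else:
            raise ValueError(f"unknown kind: {kind}")
        r, g, b = colorsys.hsv_to_rgb(hue, sat, 0.9)
        colours.append("#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255)))
    return colours


print(palette("ordinal", 4))  # light to strong shades of one hue
print(palette("nominal", 4))  # four different hues
```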
Data Visualization company => Data and Visualization company

Effective data journalism leverages NoSQL, memory-based databases,
NLP, OLAP modelling, free-text search, statistics, etc.
Summary
We are at @pykih
Fun fact: The word pykih came to us
in a CAPTCHA. That’s the day we
decided that till we do good work it
does not matter what we are called.

Student profile product demonstration on grades, ability, well-being and mind...
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 

Visualizing Data Journalism (HasGeek Fifth Elephant)

  • 1. Visualizing Data Journalism Ritvvij Parrikh, Founder, www.pykih.com Fifth Elephant, Delhi Run-up Event, India Today Mediaplex, June 14, 2014
  • 2. Pykih is a data Visualization company. We build custom visual representations of large data sets to make data actionable for readers. We have satisfied customers in six countries. Introduction
  • 3. • Data Viz. • Theory • Case Study 1 • Case Study 2 • Summary • Challenges in Data Journalism • What we are doing about it for ourselves Agenda
  • 5. Let’s explore the humble pie chart… Party Percentage E 38% D 25% C 20% B 15% A 2% Break the whole into parts.
  • 6. Let’s explore the humble pie chart… Party Percentage E 38% D 25% C 20% B 15% A 2% Break the whole into parts. Data: One dimensional Visual Encoding: Area
  • 7. New Terms • Dimension: Columns by which you group data. • Facts: Numbers that you can count, sum, average, etc. • Examples: • Seat count by party • Seat count by party and state • Visual Encoding: Area, Position, Colour, Length, Thickness, etc.
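The terms on slide 7 can be sketched in a few lines of Python. This is only an illustration: the records, the state names and the helper `sum_fact` are assumptions made up for the example, not data from the talk. The dimensions are the columns used as grouping keys, and the fact is the seat count that gets summed.

```python
from collections import defaultdict

# Hypothetical records: two dimensions (party, state) and one fact (seats).
records = [
    {"party": "A", "state": "X", "seats": 3},
    {"party": "A", "state": "Y", "seats": 2},
    {"party": "B", "state": "X", "seats": 4},
    {"party": "B", "state": "Y", "seats": 1},
]


def sum_fact(rows, dimensions, fact):
    """Group rows by the given dimension columns and sum the fact column."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[fact]
    return dict(totals)


# "Seat count by party" uses one dimension.
seats_by_party = sum_fact(records, ["party"], "seats")
# "Seat count by party and state" uses two dimensions.
seats_by_party_and_state = sum_fact(records, ["party", "state"], "seats")
```

Adding a dimension to the grouping key is all that separates the two examples on the slide; the fact being summed stays the same.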
  • 8. One-dimensional Charts PIE is a one-dimensional chart
  • 9. One-dimensional Charts … A pie could have been a random shape broken by percentage
  • 10. One-dimensional Charts … Pie, Amoeba, Percentage Rectangle, Donut, Percentage Triangle, Bubble, Election Donut, Funnel, Percentage Bar, Percentage Column. #1 - The same data can be visualized in many (MANY!) different ways.
  • 11. One-dimensional Charts … Source: thehindu.com What is wrong here?
  • 12. One-dimensional Charts … What is wrong here? Problems: • Colour communicates no data • 3D communicates no data Source: thehindu.com
  • 13. One-dimensional Charts … Source: thehindu.com #2 - Your goal is to communicate data. Wrong use of visual encoding confuses. Problems: • Colour communicates no data • 3D communicates no data
  • 14. One-dimensional Charts … Source: firstpost.com What is wrong here?
  • 15. One-dimensional Charts … What is wrong here? Problems: • Colour • Too many values. Too cluttered. Source: firstpost.com
  • 16. One-dimensional Charts … Problems: • Colour • Too many values. Too cluttered. #3 - AREA encoding is useful for only a few values, after which it becomes unreadable. Source: firstpost.com
  • 17. One-dimensional Charts … Solution to problem of restricted space? Create a custom chart.
  • 18. New Data Set One dimensional: Seat count by party. Grouped one dimensional: Seat count by party grouped by alliance.
  • 19. Grouped One-dimensional Charts Party Alliance Percentage A NDA 38% B NDA 25% C NDA 20% D UPA 15% E Others 2%
  • 20. Grouped One-dimensional Charts Group various bubbles by colours Party Alliance Percentage A NDA 38% B NDA 25% C NDA 20% D UPA 15% E Others 2%
  • 21. Grouped One-dimensional Charts Group various bubbles by colours Party Alliance Percentage A NDA 38% B NDA 25% C NDA 20% D UPA 15% E Others 2% #4 - You can always fit in an extra dimension (GROUP) in charts using colour.
  • 22. New Data Set One dimensional: Seat count by party. Grouped one dimensional: Seat count by party grouped by alliance. Two dimensional: Which party won in which year.
  • 23. Two-dimensional Charts Plot two data points. Party / Constituency: A / Z, B / Y, C / X, D / V, E / W. Visual encoding: Position, Length
  • 24. Two-dimensional Charts… Connect the dots and you get a line chart.
  • 25. Two-dimensional Charts… Scatter, Line, Area, Bar, Column, Spider. All these charts require the same data. #5 - Number of dimensions in data determines which chart to use.
  • 26. New Data Set One dimensional: Seat count by party. Grouped one dimensional: Seat count by party grouped by alliance. Two dimensional: Which party won in which constituency. Weighted two dimensional: Which party won in which constituency by what vote margin.
  • 28. Weighted Two-dimensional Charts … Let’s add weight to it, hence now we have three data points. X axis / Y axis / Weight: A / Z / 40, B / Y / 20, C / X / 1, D / V / 300, E / W / 60. Visual encoding: Position, Length, Area
  • 29. Weighted Two-dimensional Charts … Weighted Scatter, Circle Comparison. All these charts require the same data. #6 - You can always fit in an extra fact (WEIGHT) in charts using size.
  • 30. New Data Set One dimensional: Seat count by party. Grouped one dimensional: Seat count by party grouped by alliance. Two dimensional: Which party won in which constituency. Weighted two dimensional: Which party won in which constituency by what vote margin. Grouped weighted two dimensional: Which party won in which constituency by what vote margin grouped by alliance.
  • 31. Grouped Weighted Two-dimensional Charts Grouped Weighted Scatter, Grouped Circle Comparison. Visual encoding: Position, Length, Area, Colour
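Encoding a weight as size has one common pitfall worth illustrating: readers compare circle areas, so the radius should grow with the square root of the weight. The sketch below uses the weights from the example table on slide 28; the function name `radius_for` and the `max_radius` parameter are assumptions for illustration, not part of any charting library.

```python
import math

# Weights from the slide's example table (rows A to E).
weights = {"A": 40, "B": 20, "C": 1, "D": 300, "E": 60}


def radius_for(weight, max_weight, max_radius=50.0):
    """Scale the radius with sqrt(weight) so circle AREA is proportional to weight."""
    return max_radius * math.sqrt(weight / max_weight)


max_w = max(weights.values())
radii = {name: radius_for(w, max_w) for name, w in weights.items()}

# A has twice the weight of B, so its circle should have twice the area.
area_ratio_a_to_b = (radii["A"] / radii["B"]) ** 2
```

Scaling the radius linearly instead would make D look far more than 300 times larger than C, which is the kind of misleading encoding the earlier slides warn about.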
  • 32. Multi-series Two-dimensional Charts … Range, Gantt, Multi-series Line, Group Column, Stack Column, Group Stack Column, Stack Area, Stack Percentage Area. Add more dimensions in creative ways.
  • 33. Multi-series Two-dimensional Charts … What is right and wrong here? Source: livemint.com Is the equities rally percolating into the broader market?
  • 34. Multi-series Two-dimensional Charts … What is right and wrong here? Source: livemint.com Is the equities rally percolating into the broader market? Bad parts: • The BSE Small-cap line is not visible, and that is the story.
  • 35. Multi-series Two-dimensional Charts … What is right and wrong here? Good parts: • Y axis starts from 97 instead of 0 Source: livemint.com Is the equities rally percolating into the broader market? Bad parts: • The BSE Small-cap line is not visible, and that is the story. #7 - Purpose of line chart is to show trend. Focus on it.
  • 36. Multi-series Two-dimensional Charts … What is wrong here? Source: livemint.com Does IMF wear rose-tinted glasses?
  • 37. Multi-series Two-dimensional Charts … What is wrong here? Source: livemint.com Problems: • Cannot find the IMF line. Does IMF wear rose-tinted glasses?
  • 38. Multi-series Two-dimensional Charts … What is wrong here? Source: livemint.com Does IMF wear rose-tinted glasses? Problems: • Cannot find the IMF line. #8 - Highlight the story for the user. Use color to highlight, not confuse.
  • 39. New Data Set All the data we encountered so far was RDBMS-style, i.e. it could fit in a spreadsheet (rows and columns). Sometimes data is more complex. It can have “relationships”. Types of relationships: • Hierarchy / Tree • Multi-level relationships
  • 40. Tree Charts { "name": "root", "children": [ { "name": "A", "children": [ {"name": "A1"}, {"name": "A2"}, {"name": "A3"}, {"name": "A4"} ] Visual encoding: Position
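To make the tree shape concrete, here is a minimal Python sketch that walks a structure with the same "name" / "children" keys as the JSON on slide 40. The tree itself is a small hypothetical one (the slide's JSON is cut off), and `walk` and `leaf_count` are illustrative helpers, not functions from any library.

```python
# Hypothetical tree using the same name/children keys as the slide's JSON.
tree = {
    "name": "root",
    "children": [
        {"name": "A", "children": [{"name": "A1"}, {"name": "A2"}]},
        {"name": "B"},
    ],
}


def walk(node, depth=0):
    """Yield (name, depth) for every node, parents before children."""
    yield node["name"], depth
    for child in node.get("children", []):
        yield from walk(child, depth + 1)


def leaf_count(node):
    """Count nodes without children; one possible size for a node in a tree chart."""
    children = node.get("children", [])
    if not children:
        return 1
    return sum(leaf_count(child) for child in children)


nodes = list(walk(tree))
```

Depth gives the ring or level a node sits on, and a per-node size such as the leaf count is what a weighted tree chart would map to area or length.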
  • 42. Grouped Weighted Tree Charts Packed Circle, Sunburst, Tree Rectangle, Tree Bar, Grouped Weighted Tree. Visual encoding: Position, Size, Colour
  • 43. Grouped Weighted Tree Charts Sunburst. Visual encoding: Position, Size, Colour
  • 44. Grouped Multi-level Relationship Charts { "nodes": [ {"name": "A", "group": "G1"}, {"name": "B", "group": "G2"}, … ], "relations": [ {"from": "A", "to": "B"}, {"from": "A", "to": "C"}, … ] Visual encoding: Position
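A relationship data set in the nodes / relations shape from slide 44 is usually turned into an adjacency structure before it is drawn. The sketch below is one way to do that in Python; the node "C" and its group are assumptions added to complete the example, and treating relations as undirected is also an assumption.

```python
# Hypothetical data in the nodes/relations shape shown on the slide.
graph = {
    "nodes": [
        {"name": "A", "group": "G1"},
        {"name": "B", "group": "G2"},
        {"name": "C", "group": "G2"},
    ],
    "relations": [
        {"from": "A", "to": "B"},
        {"from": "A", "to": "C"},
    ],
}


def adjacency(data):
    """Map each node name to the sorted list of its neighbours (undirected)."""
    neighbours = {node["name"]: set() for node in data["nodes"]}
    for rel in data["relations"]:
        neighbours[rel["from"]].add(rel["to"])
        neighbours[rel["to"]].add(rel["from"])
    return {name: sorted(adj) for name, adj in neighbours.items()}


def names_by_group(data):
    """Collect node names per group; the group is what colour would encode."""
    groups = {}
    for node in data["nodes"]:
        groups.setdefault(node["group"], []).append(node["name"])
    return groups


links = adjacency(graph)
groups = names_by_group(graph)
```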
  • 45. Grouped Multi-level Relationship Charts Graph Collapsible Graph Hive #9 - Look for relationships across data sets.
  • 46. Weighted Grouped Multi-level Relationship Charts Sankey. Visual encoding: Position, Color, Size
  • 47. Case: Mumbai Local Fare Chart A fare exists for travel between station "A" and "B". Hence, it is a relationship chart.
  • 48. Case: Mumbai Local Fare Chart Matrix Half Matrix [ {"node1": "A", "node2": "B", "weight": 300}, {"node1": "A", "node2": "C", "weight": 900}, … ]
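The difference between a matrix and a half matrix can be sketched in Python. The example assumes the fare is the same in both directions, which is what makes the half matrix sufficient; the B to C value of 600 is invented for illustration, and `half_matrix` and `fare` are hypothetical helper names.

```python
# Pair records in the node1/node2/weight shape from the slide.
# The B-C entry is a made-up value added for illustration.
fares = [
    {"node1": "A", "node2": "B", "weight": 300},
    {"node1": "A", "node2": "C", "weight": 900},
    {"node1": "B", "node2": "C", "weight": 600},
]


def half_matrix(pairs):
    """Store each unordered pair once, keyed by a frozenset of the two stations."""
    return {frozenset((p["node1"], p["node2"])): p["weight"] for p in pairs}


def fare(matrix, a, b):
    """Look up the weight between a and b in either direction."""
    return matrix[frozenset((a, b))]


matrix = half_matrix(fares)
stations = sorted({s for p in fares for s in (p["node1"], p["node2"])})

# Lower-triangle rows: each station is paired only with the stations before it.
rows = [[fare(matrix, a, b) for b in stations[:i]] for i, a in enumerate(stations)]
```

A full matrix would repeat every value twice and leave the diagonal empty, so the triangular layout shows the same information in roughly half the space.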
  • 49. Case: Mumbai Local Fare Chart #9 - Look for limitations. They can help you improve design.
  • 50. Weighted Two-level Relationship Charts … Chord. Number of people travelling between various stations
  • 51. • One dimensional charts • Grouped one dimensional charts • Two dimensional charts • Weighted Two dimensional charts • Grouped Two dimensional charts • Grouped Weighted Two dimensional charts • Multi-dimensional Charts • Tree Charts • Grouped Weighted Tree Charts • Multi-level Relationships Charts • Grouped Weighted Multi-level Relationships Charts • Two-level Relationships Charts • Grouped Weighted Two-level Relationships Charts Taxonomy of Standard Data Visualizations
  • 52. The same data can be visualized in many (MANY!) ways. Without exploring the data, you will end up visualizing all your data in pies, lines and bars. Most Imp. Lesson
  • 53. Implications
      Chart family:               One Dimension | Two Dimension | Multi-Dimension | Relationship | Hierarchical | Geo Maps
      Dimension: Time             N      | Y | Y | N | N | N
      Dimension: Group            Y      | Y | Y | Y | Y | Y
      Fact: Weight                N      | Y | Y | Y | Y | Y
      Group and Weight            N      | Y | Y | Y | Y | N
      Fact: Many values           May be | Y | Y | Y | Y | Y
      Multiple levels / Zoomable  N      | N | N | Y | Y | Y
  • 54. List of Visual Encodings Source: http://complexdiagrams.com/properties
  • 55. Case Study #1: Let’s apply what we learnt IPL Score Card
  • 57. Ball by ball Commentary. Per Batsman Statistics. Per Bowler Statistics. Fall of Wickets. Partnerships. Two innings. Pre-match: Toss, Playing 11, Location, Time. Post-match: Win, by how much, Man of the match. Second Innings: Current Run Rate, Required Run Rate, Target score
  • 58. Overs: Most important data-point 1. Overs = Time 2. One over: 1. has_many balls 2. has_one bowler 3. has_many batsmen 3. Existence of batsmen across overs is partnerships 4. Partnerships and Fall of wickets are two views of the same data set
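The has_many / has_one relationships on slide 58 can be sketched as a small data model. This is an assumed model for illustration only: the class names, field names and sample innings are hypothetical and are not taken from the scorecard built for the case study.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Ball:
    batsman: str
    runs: int
    wicket: bool = False


@dataclass
class Over:
    number: int                                       # Overs = Time
    bowler: str                                       # has_one bowler
    balls: List[Ball] = field(default_factory=list)   # has_many balls

    @property
    def batsmen(self) -> List[str]:
        # has_many batsmen, derived from the balls that were bowled
        return sorted({ball.batsman for ball in self.balls})


def runs_per_over(overs):
    """Total runs scored in each over, keyed by over number."""
    return {over.number: sum(ball.runs for ball in over.balls) for over in overs}


def fall_of_wickets(overs):
    """(over number, batsman) for each wicket, derived from the same ball data."""
    return [
        (over.number, ball.batsman)
        for over in overs
        for ball in over.balls
        if ball.wicket
    ]


# Hypothetical two-over innings.
innings = [
    Over(1, "Bowler X", [Ball("P", 1), Ball("Q", 4), Ball("P", 0, wicket=True)]),
    Over(2, "Bowler Y", [Ball("Q", 2), Ball("R", 6)]),
]
```

Because every view is derived from the same list of overs, the over number works as the shared time axis for the combined chart described on the later slides.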
  • 59. Ball by ball Commentary
  • 61. Combine the two Weighted two-dimensional chart: Y-axis: Balls per over, X-axis: Overs + Bowlers. Gantt chart: Y-axis: Batsmen, X-axis: Overs + Bowlers. All other “zoomable” information is shown via interactions
  • 62. Putting it all together
  • 63. Let’s see it live http://www.firstpost.com/cricket-live-score/IPL/1-jun-2014-kolkata-knight-riders-versus-kings-xi-punjab/2173/175977
  • 64. Less reading. No scrolling. More awareness.
  • 65. Case Study #2: Let’s apply what we learnt Election Counting Day
  • 66. Election Counting Day Data Set: • India has 50+ regional parties and two national parties. • During Election Counting Day (live), seats are either “Leading” or “Won” Data Properties / Relationships: • Hierarchical Relation between Alliance and Party • Won is confirmed. Leading is transient. What did readers want to know this Election: • How badly would UPA lose • How big the BJP victory would be • How big the impact of AAP would be Real world facts to inspire design: • BJP is a right wing party • AAP is leftmost, followed by UPA • The Sansad Hall is a semi-circle
  • 67. Election Counting Day Data Set: • India has 50+ regional parties and two national parties. • During Election Counting Day (live), seats are either “Leading” or “Won” Data Properties / Relationships: • Hierarchical Relation between Alliance and Party • Won is confirmed. Leading is transient. What did readers want to know this Election: • How badly would UPA lose • How big the BJP victory would be • How big the impact of AAP would be Real world facts to inspire design: • BJP is a right wing party • AAP is leftmost, followed by UPA • The Sansad Hall is a semi-circle -> Group; -> Tree; -> Weight; -> Limitation (hence, all other parties can be clubbed into “Other”); -> Shape; -> Placement
  • 68. Choosing the right Grouped Weighted Tree Chart Packed Circle, Sunburst, Tree Rectangle, Tree Bar, Grouped Weighted Tree. Visual encoding: Position, Size, Colour
  • 69. Election Counting Day … Sunburst
  • 71. Sansad Chart Focus on what is most imp. Alliance is more imp. than Party. We spent 200% more time reversing hierarchy
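The idea of putting the alliance above the party and laying the result out on a semi-circle can be sketched as follows. This is not the code behind the Sansad chart; it is a hedged illustration that reuses the party shares from the grouped one-dimensional example earlier in the deck, and the left-to-right alliance order and the helper names are assumptions.

```python
# Party shares from the grouped one-dimensional example earlier in the deck:
# (party, alliance, percentage).
parties = [
    ("A", "NDA", 38),
    ("B", "NDA", 25),
    ("C", "NDA", 20),
    ("D", "UPA", 15),
    ("E", "Others", 2),
]


def alliance_totals(rows):
    """Roll party shares up to the alliance level (alliance first, party second)."""
    totals = {}
    for _party, alliance, share in rows:
        totals[alliance] = totals.get(alliance, 0) + share
    return totals


def semicircle_spans(totals, order):
    """Give each alliance a (start, end) angle in degrees on a 180-degree arc."""
    whole = sum(totals.values())
    spans, start = {}, 0.0
    for name in order:
        end = start + 180.0 * totals[name] / whole
        spans[name] = (start, end)
        start = end
    return spans


totals = alliance_totals(parties)
# Assumed left-to-right placement, purely for illustration.
spans = semicircle_spans(totals, ["UPA", "Others", "NDA"])
```

Each party could then be given a sub-span inside its alliance's span in the same way, which keeps the alliance as the first thing a reader sees.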
  • 72. Let’s see it live http://firstpost.com/election-results
  • 73. Summary 1. Study properties and relationships of your Data Set 2. Use your visual encodings wisely
  • 74. Challenges in Data Journalism
  • 75. Data Collection What’s the story Visualize Story Journalist Developer Designer • Govt. data • APIs • Scrape • Mine web • PDFs • Clean the data • Model the data • Investigate • Design • Build Write Technology is an integral part of data journalism. Steps in data journalism
  • 76. Data Driven Stories: Day-to-day short stories derived from data. Visualization App: Big apps to educate readers about large and important events, e.g. budget, election, etc. Formats in data journalism
  • 77. Format #1 - Data Driven Stories Source: http://factchecker.in/data-are-crimes-against-scheduled-castes-on-an-upswing-in-india/ Badaun Case —> Find legit Data —> Analyse —> Plot —> Story
  • 78. Format #2 - Visualization Apps
  • 79. Data Collection What’s the story Visualize Story • Govt. data • APIs • Scrape • Mine web • PDFs • Clean the data • Model the data • Investigate • Design • Build Write Format: Visualization app Format: Data Driven Stories Journalist Developer Designer Journalist Implication
  • 80. High Level 1. Quick access to appropriate data set 2. Quick analysis of this data 3. Consistently churn out neat charts, graphs and maps Challenges
  • 81. High Level 1. Quick access to appropriate data set 2. Quick analysis of this data 3. Consistently churn out neat charts, graphs and maps Technical 1. Live Data Modelling 2. SEO 3. How to handle high traffic Challenges
  • 82. High Level 1. Quick access to appropriate data set 2. Quick analysis of this data 3. Consistently churn out neat charts, graphs and maps Technical 1. Live Data Modelling 2. SEO 3. How to handle high traffic From pykih perspective 1. How do you consistently build beautiful, real-time Visualizations? Challenges
  • 83. What we are doing about it In-house tool called "Backstage"
  • 84. #1 - Instead of waiting for data to be standardised, we want to make large-scale, high-velocity, multi-format data extraction durable. #2 - Instead of expecting data-users / journalists to have analytical skills, we are: • simplifying exploration of large data sets • automating extraction of metadata from data sets • simplifying assisted data standardisation • building tools for assisted analysis #3 - Instead of expecting data-users / journalists to visualize data correctly, we are attempting to automate metadata-driven visualization. Other Experiments • A data-driven blogging software • Configuration Editor Principles -> Demo the worker -> Demo the census dashboard -> ISO example -> Demo NLP based Date Standardiser -> Story is in the outliers Example: If data is ordinal then colour automatically leverages saturation, and if data is nominal then colour is distinct
  • 85. Data Visualization company => Data and Visualization company. Effective Data Journalism leverages: NoSQL, memory-based databases, NLP, OLAP modelling, Free Text Search, Statistics, etc. Summary
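The colour example on slide 84 hints at what metadata-driven visualization might look like. The sketch below is not the "Backstage" tool; it is a small assumed illustration in which ordered levels get one hue with rising saturation and unordered categories get evenly spaced, distinct hues. The function names, the default hue and the "ordinal" / "nominal" labels are all choices made for this example; only `colorsys.hsv_to_rgb` is a real standard-library call.

```python
import colorsys


def to_hex(rgb):
    """Convert an (r, g, b) tuple of floats in [0, 1] to a #rrggbb string."""
    return "#" + "".join(f"{round(channel * 255):02x}" for channel in rgb)


def ordinal_colours(levels, hue=0.6):
    """One hue with rising saturation, so the order is visible in the colour."""
    n = len(levels)
    return {
        level: to_hex(colorsys.hsv_to_rgb(hue, (i + 1) / n, 1.0))
        for i, level in enumerate(levels)
    }


def nominal_colours(categories):
    """Evenly spaced hues, so categories look distinct with no implied order."""
    n = len(categories)
    return {
        category: to_hex(colorsys.hsv_to_rgb(i / n, 0.8, 0.9))
        for i, category in enumerate(categories)
    }


def colours_for(values, kind):
    """Pick the colour scheme from a metadata label describing the column."""
    if kind == "ordinal":
        return ordinal_colours(values)
    if kind == "nominal":
        return nominal_colours(values)
    raise ValueError(f"unknown data kind: {kind!r}")


margin_colours = colours_for(["low", "medium", "high"], "ordinal")
alliance_colours = colours_for(["NDA", "UPA", "Others"], "nominal")
```

The point of the design is that the person supplying the data only declares what kind of column it is; the choice of visual encoding follows from that declaration.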
  • 86. We are at @pykih Fun fact: The word pykih came to us in a CAPTCHA. That’s the day we decided that till we do good work it does not matter what we are called.