2. What is Big Data Analytics?
Key Facts:
• Not defined by size
• Hadoop is the leading framework
• Hadoop is open-source
Common issues
• Requires time and money to access data
• Many sources of data
• Large volumes of data
Mike Lee + George Komoto
Thursday, October 31, 13
3. Platfora
Platfora transforms raw data in Hadoop into interactive, in-memory
business intelligence without the friction of IT or complexity of
existing approaches. A complete solution, it seamlessly connects data
to end-users. No separate data warehouse or ETL required.
The Challenge
Customers are getting stuck in the current stepped wizard approach.
The interface is not intuitive for non-database administrators.
The Solution
Design a graphical interface that permits creating multiple
connections in the same experience. The new workflow requires less
time to complete this task, and encourages more interactive
exploration and visualization of data.
12. Design Process
User Research
• Gain more insights from typical end users (i.e. Tableau users)
• Inquire about preferred tools and methodologies
• Understand pain points in current workflows
Design Iteration
• Present UI sketches to Platfora team for feedback
• Test wireframes with target end users
  • General Assembly back-end engineers
  • Business/data analysts within network
16. Task Analysis: Deconstruct & Revise
Select Data Sources (Linear Steps)
Edit Data
Connect Data Sources (Design Canvas)
17. Platfora Task Analysis
Created By: Mike Lee
Date Created: 17-OCT-2013
Last Revised: 28-OCT-2013
Tasks
1. Select Data: Choose source data for dataset
2. Parse Data: How to extract rows and columns
3. Manage Fields: Add computed fields and verify field info
4. Create References: Set up joins to dimension dataset
5. Define Key: Indicate column(s) that make up the unique key
6. Finish & Save
Key
Screen, Action, Display, Input, Decision
[Flowchart: The complete task flow for importing datasets, adding references, and preparing for Vizboards, with timestamps 00:00 and 02:58. One portion is marked as the task flow we are focused on for this project.]
Elements recoverable from the flowchart:
• Login, Home, Data Catalog: Click ‘Add Dataset’
• Select Data: Choose source data for dataset [trees_SF] or [species_SF], then click ‘Continue’
• Parse Data: Raw file contains header? (Yes: select checkbox), Wrangled / Raw, then click ‘Continue’
• Manage Fields: Select column for dataset join [species_ID], then click ‘Create References’ or ‘Define Key’
• Create References: Select target dataset from dropdown [species_SF], select foreign key from dropdown [species_ID]; Name reference? (Yes: enter reference name “Species” and click ‘Add’); “Species” appears in References tab
• Define Key: Select field(s) to include in key [id]
• Confirm? (Yes: success message popup), then click ‘Save & Exit’
• Data Catalog: Select target dataset to view details (My Datasets > trees_SF); view reference details in field
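To make the "Create References" step concrete, the following is a minimal sketch of what a reference amounts to: a join from a fact dataset to a dimension dataset through a foreign key. It is not Platfora's implementation. The function name `resolve_reference`, the sample rows, and the `tree_id` and `common_name` columns are hypothetical, and it assumes the dimension dataset is keyed on a field also called `species_ID`.

```python
def resolve_reference(fact_rows, dim_rows, foreign_key, dim_key, name):
    """Attach the matching dimension row to each fact row (a left join)."""
    # Index the dimension dataset by its unique key for fast lookup.
    dim_index = {row[dim_key]: row for row in dim_rows}
    joined = []
    for fact in fact_rows:
        # None when no dimension row matches the foreign key value.
        match = dim_index.get(fact[foreign_key])
        joined.append({**fact, name: match})
    return joined


# Hypothetical sample data named after the datasets in the task flow.
trees_sf = [
    {"tree_id": 1, "species_ID": 10},
    {"tree_id": 2, "species_ID": 99},
]
species_sf = [{"species_ID": 10, "common_name": "Coast Live Oak"}]

rows = resolve_reference(trees_sf, species_sf, "species_ID", "species_ID", "Species")
```

Each fact row keeps its own fields and gains a "Species" entry holding the referenced row, or None when the foreign key has no match.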
18. User Research
Method
We conducted 7 interviews with people similar to our personas who
are currently using data analytics tools.
Findings
Access to data is a problem. Requests to make data warehouse
changes can take weeks. Preparation involves many schema and data
processing tools. The tool most commonly shared among stakeholders
was the data model diagram.
Opportunities Identified
Design a way to visualize and interact with the full data model.
25. User Flow: Marybeth (Business Analyst)
1. Creates multiple links to a specific field from a different dataset that references additional data for that row
2. Manages/edits references
3. Previews the content and sees example data for each field
4. Identifies specific fields that can be used by another dataset to link rows back into that other dataset
26. User Flow: Marybeth (Business Analyst)
1. Creates multiple links to a specific field from a different dataset that
references additional data for that row
Marybeth starts with the fact dataset. The key is already
selected. She uses the flyout menu to view her options.
She knows she needs to connect the other datasets in
the company catalog.
Platfora automatically lists the datasets in the catalog. It
also knows to match data types with the existing key.
Here Marybeth can link one or both target datasets.
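One way to picture the type matching described above is a filter over the catalog that keeps only fields whose data type equals the type of the existing key. This is a sketch under stated assumptions, not Platfora's actual logic: the function `candidate_fields`, the catalog layout, the type names, and the `districts_SF` dataset are all hypothetical.

```python
def candidate_fields(catalog, key_type, exclude_dataset):
    """List (dataset, field) pairs whose data type matches the key's type."""
    matches = []
    for dataset, fields in catalog.items():
        if dataset == exclude_dataset:
            continue  # a dataset is not offered as its own link target here
        for field, dtype in fields.items():
            if dtype == key_type:
                matches.append((dataset, field))
    return matches


# Hypothetical catalog: dataset name -> {field name: data type}.
catalog = {
    "trees_SF": {"tree_id": "integer", "species_ID": "integer", "planted": "date"},
    "species_SF": {"species_ID": "integer", "common_name": "string"},
    "districts_SF": {"district_name": "string"},
}

options = candidate_fields(catalog, "integer", "trees_SF")
```

With an integer key on the fact dataset, only the integer field in `species_SF` is offered; `districts_SF` has no field of a compatible type and is left out.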
27. User Flow: Marybeth (Business Analyst)
2. Manages/edits references
Marybeth needs to make some changes to the reference
name. The flyout menu she used to create the references
also has a link to make the changes she needs.
Marybeth edits the names of both of the target datasets.
Unchecking the boxes will remove the relationship.
28. User Flow: Marybeth (Business Analyst)
3. Previews the content and sees example data for each field
To quickly verify the datasets, Marybeth uses the i button in the
upper right, which provides a quick preview. To make larger changes,
she would have to go back through the import/parse data steps.
29. User Flow: Marybeth (Business Analyst)
4. Identifies specific fields that can be used by another dataset to link rows
back into that other dataset
As with the original dataset she configured, once she has more
datasets she can create references from target datasets to other
datasets.
30. Wireframe Feedback
Key Findings
• Resolves the issue of jumping back and forth within the import
data wizard, but still does not give a clear view of
relationships
• Would require more development resources to create
because the components shown do not currently exist
• Showing the relationships between the datasets should help
users avoid many of the issues they currently experience
31. Design for Humans: Interactive Web UI
Managing relationships between datasets should be as intuitive
and visual as working with Platfora Vizboards.
Design Language
How do we communicate the relationships between fact and dimension
datasets?
Visual Interface
How do we translate action steps into intuitive interactions?
32. Design Language: Entity-Relationship
Graphical user interfaces in
database administration are not
new. Several schematics exist to
represent relationships and the
flow of data.
http://www.agiledata.org/essays/agileDataModeling.html
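As a rough illustration of how dataset relationships could be turned into a diagram, the sketch below renders a list of references as Graphviz DOT text, a plain-text graph description that Graphviz tools can draw. The function `to_dot` and the tuple layout are hypothetical, and this is only one possible notation, not the design language chosen in this project.

```python
def to_dot(references):
    """Render (source, field, target) references as Graphviz DOT text."""
    lines = ["digraph data_model {"]
    for source, field, target in references:
        # One labelled edge per reference, from fact to dimension dataset.
        lines.append(f'  "{source}" -> "{target}" [label="{field}"];')
    lines.append("}")
    return "\n".join(lines)


dot = to_dot([("trees_SF", "species_ID", "species_SF")])
print(dot)
```

Each reference becomes a directed edge labelled with the linking field, so the full data model can be viewed as one picture.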
34. Visual Idea: Mozilla Collusion
(Star Schema)
(Entity Relationship)
http://www.mozilla.org/en-US/collusion/demo/