1. webinar@softserveinc.com
Designing Big Data Systems Like a Pro
Smart Decisions: An Architecture Design Game
Humberto Cervantes, Serge Haziyev, Olha Hrytsay, Rick Kazman
September 2015
2. Presenters
Dr. Rick Kazman is a Professor at the University of Hawaii and a Principal Researcher at the Software
Engineering Institute of Carnegie Mellon University (SEI). His primary research interests are software
architecture, design and analysis tools, software visualization, and software engineering economics. Rick has
created several highly influential methods and tools for architecture analysis, including the ATAM
(Architecture Tradeoff Analysis Method).
Dr. Humberto Cervantes is a professor at Universidad Autónoma Metropolitana–Iztapalapa in Mexico City.
His primary research interests include software architecture design methods and their adoption in industrial
settings. Dr. Cervantes is also a consultant for software development companies in topics related to software
architecture. He holds the Software Architecture Professional and ATAM Evaluator certificates from the SEI.
Serge Haziyev is VP of Software Architecture at SoftServe. He has more than 17 years of experience in
enterprise-level solutions including Big Data, SaaS/Clouds, SOA and Carrier-grade telecommunication
services. He specializes in software architecture methodologies, architectural patterns and software
development practices for large and complex projects in multiple industry verticals, including healthcare.
Olha Hrytsay works as a BI/DW consultant at SoftServe, Inc., a leading global outsourced product and
application development company. Olha has more than seven years of experience in building business
intelligence, data warehousing, and big-data solutions for a number of global companies in the network
security, health care, and finance business domains. Her current activities at SoftServe include leading the BI
Center of Excellence as well as design and implementation of data warehousing, data visualization, and
analytics solutions.
5. Game Motivation
This game illustrates the essentials of
architecture design using an iterative method
such as ADD (Attribute-Driven Design).
You will be competing against other software
architects (or other teams) from rival
companies, so you need to make smart design
decisions or else your competitors will leave
you behind!
13. Game Rules
The game is played in rounds
which represent the iterations.
For each round the game
provides:
- Iteration goal (i.e. selected
drivers)
- Element to refine
ADD Step 2: Establish iteration goal by selecting drivers
ADD Step 3: Choose one or more elements of the system to refine
15. Iteration 1 Goal: Logically Structure The System
Drivers for the iteration:
- Ad-Hoc Analysis
- Real-time Analysis
- Unstructured data processing
- Scalability
- Cost Economy
Element to refine: Big Data System
16. Game Rules
You will make the design
decision of selecting design
concepts:
- Reference architectures*
- Technology families*
- Specific technologies
* In the game they are considered as a type
of pattern
ADD Step 4: Choose one or more design concepts that satisfy the selected
drivers
17. Game Rules: Design Concepts Cards
Each card shows:
- Name and type of design concept
- Influence on drivers
Card types: Technologies, Patterns
18. Iteration 1 Goal: Logically Structure The System
Select 1 Reference Architecture Card
Drivers for the iteration:
- Ad-Hoc Analysis
- Real-time Analysis
- Unstructured data processing
- Scalability
- Cost Economy
Alternatives:
• Extended Relational
• Pure Non-Relational
• Data Refinery
• Lambda Architecture
Element to refine: Big Data System
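The inputs to a round (iteration goal, drivers, element to refine, and candidate design concepts) can be captured in a small data structure. The following is a minimal sketch only: the `Round` class, its field names, and the `select` helper are hypothetical and not part of the game materials. The values are copied from the Iteration 1 slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Round:
    """One game round, i.e. one ADD iteration (hypothetical model)."""
    goal: str                 # ADD Step 2: iteration goal
    drivers: tuple            # ADD Step 2: selected drivers
    element_to_refine: str    # ADD Step 3: element to refine
    alternatives: tuple       # ADD Step 4: candidate design concepts


ITERATION_1 = Round(
    goal="Logically Structure The System",
    drivers=(
        "Ad-Hoc Analysis",
        "Real-time Analysis",
        "Unstructured data processing",
        "Scalability",
        "Cost Economy",
    ),
    element_to_refine="Big Data System",
    alternatives=(
        "Extended Relational",
        "Pure Non-Relational",
        "Data Refinery",
        "Lambda Architecture",
    ),
)


def select(game_round: Round, choice: str) -> str:
    """Return the chosen design concept, rejecting cards not offered this round."""
    if choice not in game_round.alternatives:
        raise ValueError(f"{choice!r} is not an alternative for this round")
    return choice
```

For example, `select(ITERATION_1, "Lambda Architecture")` returns the card name, while a name outside the four alternatives raises `ValueError`.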
20. Introduction
ADD Step 5: Instantiate elements, allocate responsibilities and define interfaces.
ADD Step 6: Sketch views and record design decisions
You will:
- Record the design decision
- Throw two dice to simulate how
well you instantiate your selected
design concepts
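When playing without physical dice, the throw can be simulated in a few lines. This is a sketch under stated assumptions: the slides do not say which dice are used or how the result changes the score, so the code assumes two standard six-sided dice and only reports the throw. The function name `throw_two_dice` is hypothetical.

```python
import random


def throw_two_dice(rng=None):
    """Simulate one throw of two dice.

    Assumption: standard six-sided dice. How the total maps to
    instantiation points is not stated in the slides, so no mapping
    is applied here.
    """
    rng = rng or random.Random()
    first = rng.randint(1, 6)   # randint is inclusive on both ends
    second = rng.randint(1, 6)
    return first, second, first + second


if __name__ == "__main__":
    first, second, total = throw_two_dice()
    print(f"Threw {first} and {second}: total {total}")
```

Passing a seeded `random.Random` instance makes a throw reproducible, which is useful when replaying a game.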
22. Introduction
We will review the
decisions together. The first
iteration will be reviewed
now but the rest will be
reviewed at the end.
ADD Step 7: Perform Analysis of Current Design and Review Iteration goal
and Design Objective
23. Iteration 1 Review
Driver points are listed in this order: Ad-Hoc Analysis, Real-time Analysis, Unstructured data processing, Scalability, Cost Economy.

Design decision | Driver points | Bonus points | Comments
Extended Relational | 3+2+2+2+1=10 | -4 | This reference architecture is less appropriate for this solution, mostly because of its cost and real-time analysis limitations.
Pure Non-Relational | 2+2.5+3+3+3=13.5 | | This reference architecture is closer to the goal than the others, except Lambda Architecture.
Lambda Architecture (Hybrid) | 2.5+3+3+3+3=14.5 | +2 | This is the most appropriate reference architecture for this solution! Of the provided reference architectures, Lambda Architecture promises the largest number of benefits, such as access to real-time and historical data at the same time.
Data Refinery (Hybrid) | 3+1+3+2+1=10 | -4 | This reference architecture is less appropriate for this solution, mostly because of its cost and real-time analysis limitations.
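The arithmetic in the Iteration 1 review can be checked with a short script. This is a sketch, not an official scoring tool: the driver points and bonus points are copied from the review, but treating the round score as driver total plus bonus is an assumption, as is the bonus of 0 for Pure Non-Relational (the review lists none). The names `REVIEW`, `driver_total`, `round_score`, and `best_choice` are hypothetical.

```python
# Driver order used in the review: Ad-Hoc Analysis, Real-time Analysis,
# Unstructured data processing, Scalability, Cost Economy.
REVIEW = {
    # design decision: (driver points, bonus points)
    "Extended Relational": ([3, 2, 2, 2, 1], -4),
    "Pure Non-Relational": ([2, 2.5, 3, 3, 3], 0),  # assumption: no bonus listed
    "Lambda Architecture": ([2.5, 3, 3, 3, 3], 2),
    "Data Refinery": ([3, 1, 3, 2, 1], -4),
}


def driver_total(points):
    """Sum the per-driver points for one design decision."""
    return sum(points)


def round_score(points, bonus):
    """Assumed round score: driver total plus bonus points."""
    return driver_total(points) + bonus


def best_choice(review):
    """Return the design decision with the highest assumed round score."""
    return max(review, key=lambda name: round_score(*review[name]))


if __name__ == "__main__":
    for name, (points, bonus) in REVIEW.items():
        print(f"{name}: drivers={driver_total(points)}, "
              f"with bonus={round_score(points, bonus)}")
    print("Best:", best_choice(REVIEW))
```

Under these assumptions the ranking matches the review comments: Lambda Architecture scores highest, followed by Pure Non-Relational, with Extended Relational and Data Refinery tied.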
26. Game Scenario: Big Data System
Web Servers
• Hundreds of servers
• Massive logs from multiple sources
24/7 Operations, Support Engineers, Developers: Real-time Dashboard (UC-1, UC-2)
• Real-time monitoring
• Full-text search
Management: Static Reports (UC-3)
• Historical static reports
• Available through BI corporate tool
Data Scientists/Analysts: Ad-Hoc Reports (UC-4)
• Raw and aggregated historical data
• Ad-hoc analysis
• Human-time queries
UC1 - Monitor online services
UC2 - Troubleshoot online service issues
UC3 - Provide management reports
UC4 - Provide ad-hoc data analytics