© ALTOROS Systems | CONFIDENTIAL
Andrei Yurkevich
Chief Technology Officer
andrei.yurkevich@altoros.com
• Hadoop/NoSQL performance engineering
• Cluster Automation & Server Templates on Joyent, AWS, SoftLayer, Rackspace,
CloudStack and OpenStack using Chef/Puppet, RightScale and SCALR
• 300+ employees globally (UK, USA, Denmark, Switzerland, Norway, Belarus,
Argentina)
Featured customers Partners
5^6 = 15,625 combinations
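The number 15,625 equals 5 raised to the 6th power. One reading, which is an assumption because the slide gives only the numbers, is six independent design decisions with five candidate options each. A minimal sketch of that arithmetic (the constant and function names are hypothetical):

```python
from itertools import product

# Assumption: six independent design decisions, five candidates for each.
OPTIONS_PER_DECISION = 5
DECISIONS = 6


def count_combinations(options: int, decisions: int) -> int:
    """Size of the search space when every decision is independent."""
    return options ** decisions


def enumerate_combinations(options: int, decisions: int):
    """Lazily yield every combination as a tuple of option indexes."""
    return product(range(options), repeat=decisions)


total = count_combinations(OPTIONS_PER_DECISION, DECISIONS)
print(f"{OPTIONS_PER_DECISION}^{DECISIONS} = {total:,} combinations")
```

Evaluating every combination by hand is impractical at this size, which is why the candidates need to be narrowed down against explicit requirements.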
No clear business goals
Large amounts of data from many sources
Architecture design
The variety of tools
Compatibility of technologies/platforms
Lack of professionals
All features in one release
Budget
Functional requirements (value)
• The amount of data added daily: 2.5 TB
• Data type: raw data, processed data
• Data storage time: raw data, min a week; processed data, min a year
• Response time for building reports based on a pre-set template: < 30 sec
• Response time for building reports for a custom period of time: < 6 hours
• Uptime: 99%
• Fault-tolerance: required
• Deployment cost per day: < $1,000
Non-functional requirements
• Infrastructure-independent architecture
• Scalability
• Open-source tools
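The requirements above translate into a few back-of-the-envelope figures. The sketch below is illustrative only: it assumes decimal units (1 TB = 1,000,000 MB) and an ingest rate spread evenly over the day, and the replication factor parameter is a hypothetical knob that the slides do not specify.

```python
SECONDS_PER_DAY = 24 * 60 * 60
HOURS_PER_YEAR = 365 * 24

DAILY_INGEST_TB = 2.5      # from the requirements
RAW_RETENTION_DAYS = 7     # "min a week"
UPTIME = 0.99              # 99%


def sustained_ingest_mb_per_s(tb_per_day: float) -> float:
    """Average write rate needed if data arrives evenly over 24 hours."""
    return tb_per_day * 1_000_000 / SECONDS_PER_DAY


def raw_storage_tb(tb_per_day: float, retention_days: int,
                   replication_factor: int = 1) -> float:
    """Raw-data footprint; replication_factor is an assumed parameter."""
    return tb_per_day * retention_days * replication_factor


def allowed_downtime_hours_per_year(uptime: float) -> float:
    """Downtime budget implied by an uptime target."""
    return (1 - uptime) * HOURS_PER_YEAR


print(f"ingest: {sustained_ingest_mb_per_s(DAILY_INGEST_TB):.1f} MB/s")
print(f"raw storage for a week: {raw_storage_tb(DAILY_INGEST_TB, RAW_RETENTION_DAYS)} TB")
print(f"downtime budget: {allowed_downtime_hours_per_year(UPTIME):.1f} h/year")
```

Real traffic is rarely uniform, so peak ingest would likely be higher than the average computed here.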
a good fit a normal fit a bad fit
Option 2 Option 1
Feature | Amazon AWS | Joyent | Rackspace
Types of contract | On Demand, Reserved, Spot | On Demand, Reserved | On Demand
Types of instances (classified by compute units) | General Purpose, Compute optimized, Memory optimized, Storage optimized | Standard, High Memory, High CPU, High Storage, High I/O | General Purpose
Storage options | EBS, S3 | Low-cost storage, Network storage based on ZFS | Cloud Block Storage, Cloud Files
Operating systems | Linux, Windows | SmartOS, Linux, Windows | Linux, Windows
Management console | AWS Console | Joyent SmartDataCenter | Cloud Control Panel
Cloud API | Command line interface; Java, .NET, Ruby SDK and API | Command line interface (CLI); Node.js SDK; REST API | REST API
Regions | America, Europe, Asia, Australia | North America, Europe | America, Europe, Asia, Australia
Estimated cost per month | $18,300 | $17,500 | $21,350
Score 1.5 3.5
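The monthly estimates can be set against the requirement of less than $1,000 per day. The sketch below assumes a 30-day month and assumes that the monthly estimate and the daily deployment budget measure the same thing; neither assumption is stated on the slides.

```python
DAILY_BUDGET_USD = 1000    # "Deployment cost per day: < $1,000"
DAYS_PER_MONTH = 30        # assumption for the conversion

# Estimated cost per month, as listed in the comparison.
MONTHLY_COST_USD = {
    "Amazon AWS": 18_300,
    "Joyent": 17_500,
    "Rackspace": 21_350,
}

# Convert each monthly estimate to an approximate daily figure.
daily_cost = {name: cost / DAYS_PER_MONTH for name, cost in MONTHLY_COST_USD.items()}
within_budget = {name: cost < DAILY_BUDGET_USD for name, cost in daily_cost.items()}
cheapest = min(daily_cost, key=daily_cost.get)

for name, cost in daily_cost.items():
    flag = "within budget" if within_budget[name] else "over budget"
    print(f"{name}: ~${cost:,.2f}/day ({flag})")
print(f"cheapest: {cheapest}")
```

Under these assumptions all three providers stay below the daily limit, so cost alone does not separate them; the other rows of the comparison do.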
Features | HBase | Cassandra | MongoDB | MySQL Cluster
License | Apache | Apache | AGPL | GPL
Protocol | HTTP/REST (also Thrift) | Thrift and custom binary CQL3 | Custom, binary (BSON) | JDBC, ODBC
Data model | Column family | Column family | JSON documents | Tables
Queries / query language | JRuby-based (JIRB) shell | Cassandra Query Language | JavaScript expressions | SQL
Partitioning strategy | Ordered partitioning | Random partitioning | Sharding by key | Partition by key
Replication between nodes | yes | yes | yes | yes
Replication between data centers | no | yes | no | yes
Capability to store 2.5 TB daily | yes | yes | yes | yes
Implementation experience | 1+ | 1+ | 2+ | 5+
Score | 2 | 3 | 2 | 5
Features | HBase | Cassandra | MongoDB | MySQL Cluster
License | Apache | Apache | AGPL | GPL
Protocol | HTTP/REST (also Thrift) | Thrift and custom binary CQL3 | Custom, binary (BSON) | JDBC, ODBC
Data model | Column family | Column family | JSON documents | Tables
Queries / query language | JRuby-based (JIRB) shell | Cassandra Query Language | JavaScript expressions | SQL
Partitioning strategy | Ordered partitioning | Random partitioning | Sharding by key | Partition by key
Replication between data centers | no | yes | no | yes
Capability to store 2.5 TB daily | yes | yes | yes | yes
Implementation experience | 1+ | 1+ | 2+ | 5+
Deployment cost per day | $450 | $400 | $500 | $1,500
Score | 2.5 | 4 | 2.5 | 0
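The slides do not state how the scores were computed. One observation that is consistent with the table is that a deployment cost of $1,500 per day exceeds the budget of less than $1,000 per day. A minimal sketch of that budget check (the function name is hypothetical):

```python
DAILY_BUDGET_USD = 1000    # "Deployment cost per day: < $1,000"

# Deployment cost per day, as listed in the comparison.
DEPLOYMENT_COST_PER_DAY = {
    "HBase": 450,
    "Cassandra": 400,
    "MongoDB": 500,
    "MySQL Cluster": 1500,
}


def split_by_budget(costs: dict, budget: int):
    """Return (within_budget, over_budget) lists of candidate names."""
    within = [name for name, cost in costs.items() if cost < budget]
    over = [name for name, cost in costs.items() if cost >= budget]
    return within, over


within, over = split_by_budget(DEPLOYMENT_COST_PER_DAY, DAILY_BUDGET_USD)
print("within budget:", within)
print("over budget:", over)
```

Treating the budget as a hard filter, rather than one weighted criterion among many, is a design choice made here for illustration.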
Feature | HBase | Cassandra | MongoDB
Replication between data centers | Asynchronous, needs testing | Replicas can span data centers with synchronous replication | Not supported
Cluster admin node | NameNode | Any node | mongos process
Implementation experience | 1+ | 1+ | 2+
Time spent on inserting 30 MB of data | 7 sec | 9 sec | 20 sec
Deployment cost per day | $450 | $400 | $500
Score | 2 | 2.5 | 0
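The insert timings can be turned into rough throughput figures. The slides do not describe the test setup (cluster size, client count, record size), so the numbers below are only a relative comparison of the three measurements, not a capacity estimate.

```python
PAYLOAD_MB = 30    # size of the test insert

# Time spent on inserting 30 MB of data, in seconds.
INSERT_SECONDS = {"HBase": 7, "Cassandra": 9, "MongoDB": 20}


def throughput_mb_per_s(payload_mb: float, seconds: float) -> float:
    """Average write throughput for a single timed insert run."""
    return payload_mb / seconds


RATES = {name: throughput_mb_per_s(PAYLOAD_MB, secs)
         for name, secs in INSERT_SECONDS.items()}
fastest = max(RATES, key=RATES.get)

for name, rate in RATES.items():
    print(f"{name}: ~{rate:.2f} MB/s")
print(f"fastest in this test: {fastest}")
```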
A requirement | The prototype features
Storing of 2.5 TB of daily raw data for a week | Capable
Storing of 1.5 TB of processed data for a year | Capable
Response time for building reports based on a pre-set template | ~25 sec
Response time of less than 6 hours for building a custom report | ~7 hours
Scalability | Good
Infrastructure independence | Yes
Using open-source tools | For all components
Fault-tolerance | Yes
Deployment cost per day < $1,000 | ~$600
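The measurable rows of the prototype results can be checked against their limits in a few lines. This is a sketch under the assumption that the approximate figures (~25 sec, ~7 hours, ~$600) are treated as exact values; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str
    limit: float       # upper bound taken from the requirements
    measured: float    # approximate figure reported for the prototype
    unit: str

    @property
    def passed(self) -> bool:
        # Every limit in the requirements is a strict "less than" bound.
        return self.measured < self.limit


CHECKS = [
    Check("pre-set template report", 30, 25, "sec"),
    Check("custom-period report", 6, 7, "hours"),
    Check("deployment cost per day", 1000, 600, "USD"),
]


def failed_checks(checks):
    """Names of the requirements the prototype does not meet."""
    return [c.name for c in checks if not c.passed]


for c in CHECKS:
    status = "ok" if c.passed else "MISSED"
    print(f"{c.name}: ~{c.measured} {c.unit} vs < {c.limit} {c.unit} -> {status}")
```

With these inputs only the custom-period report misses its target, at about 7 hours against a limit of 6.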
Properly visualize and test the functionality
Detect bottlenecks and change a technology/tool/database before it is implemented in the real system
Get a realistic picture of the final solution
Make sure you stick to the budget
Andrei Yurkevich
President/CTO
andrei.yurkevich@altoros.com

TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 

Último (20)

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 

Altoros Systems CTO discusses big data architecture design

  • 1. © ALTOROS Systems | CONFIDENTIAL Andrei Yurkevich Chief Technology Officer andrei.yurkevich@altoros.com
  • 2. Altoros Systems
      - Hadoop/NoSQL performance engineering
      - Cluster automation and server templates on Joyent, AWS, SoftLayer, Rackspace, CloudStack, and OpenStack using Chef/Puppet, RightScale, and SCALR
      - 300+ employees globally (UK, USA, Denmark, Switzerland, Norway, Belarus, Argentina)
      - Featured customers; Partners
  • 3. (no text)
  • 4. (no text)
  • 5. 5^6 combinations
  • 6. 5^6 combinations: 15,625
  • 7. (no text)
  • 8. Challenges
      - No clear business goals
      - Big amounts of data from many sources
      - Architecture design
      - The variety of tools
      - Compatibility of technologies/platforms
      - Lack of professionals
      - All features in one release
      - Budget
  • 9. (no text)
  • 10. Functional requirements (requirement: value)
      - Amount of data added daily: 2.5 TB
      - Data type: raw data; processed data
      - Data storage time: raw data, min. a week; processed data, min. a year
      - Response time: < 30 sec for building reports based on a pre-set template; < 6 hours for building reports for a custom period of time
      - Uptime: 99%
      - Fault tolerance: required
      - Deployment cost per day: < $1,000
    Non-functional requirements
      - Infrastructure-independent architecture
      - Scalability
      - Open-source tools
  • 11. Feature: Amazon AWS | Joyent | Rackspace
      - Types of contract: On Demand, Reserved, Spot | On Demand, Reserved | On Demand
      - Types of instances (classified by compute units): General Purpose, Compute optimized, Memory optimized, Storage optimized | Standard, High Memory, High CPU, High Storage, High I/O | General Purpose
      - Storage options: EBS, S3 | Low-cost storage, Network storage based on ZFS | Cloud Block Storage, Cloud Files
      - Operating systems: Linux, Windows | SmartOS, Linux, Windows | Linux, Windows
      - Management console: AWS Console | Joyent SmartDataCenter | Cloud Control Panel
      - Cloud API: Command line interface; Java, .NET, Ruby SDK and API | Command line interface (CLI), Node.js SDK, REST API | REST API
      - Regions: America, Europe, Asia, Australia | North America, Europe | America, Europe, Asia, Australia
      - Estimated cost per month: $18,300 | $17,500 | $21,350
  • 12. The same table as on slide 11, with these additions:
      - Legend: a good fit, a normal fit, a bad fit
      - Labels: Option 2, Option 1
      - Score: 1.5, 3.5
  • 13. Features: HBase | Cassandra | MongoDB | MySQL Cluster
      - License: Apache | Apache | AGPL | GPL
      - Protocol: HTTP/REST (also Thrift) | Thrift and custom binary CQL3 | Custom, binary (BSON) | JDBC, ODBC
      - Data model: Column family | Column family | JSON documents | Tables
      - Queries / query language: JRuby-based (JIRB) shell | Cassandra Query Language | JavaScript expressions | SQL
      - Partitioning strategy: Ordered partitioning | Random partitioning | Sharding by key | Partition by key
      - Replication between nodes: yes | yes | yes | yes
      - Replication between data centers: no | yes | no | yes
      - Capability to store 2.5 TB daily: yes | yes | yes | yes
      - Implementation experience: 1+ | 1+ | 2+ | 5+
      - Score: 2 | 3 | 2 | 5
      - Legend: a good fit, a normal fit, a bad fit
  • 14. Features: HBase | Cassandra | MongoDB | MySQL Cluster
      - License: Apache | Apache | AGPL | GPL
      - Protocol: HTTP/REST (also Thrift) | Thrift and custom binary CQL3 | Custom, binary (BSON) | JDBC, ODBC
      - Data model: Column family | Column family | JSON documents | Tables
      - Queries / query language: JRuby-based (JIRB) shell | Cassandra Query Language | JavaScript expressions | SQL
      - Partitioning strategy: Ordered partitioning | Random partitioning | Sharding by key | Partition by key
      - Replication between data centers: no | yes | no | yes
      - Capability to store 2.5 TB daily: yes | yes | yes | yes
      - Implementation experience: 1+ | 1+ | 2+ | 5+
      - Deployment cost per day: $450 | $400 | $500 | $1,500
      - Score: 2.5 | 4 | 2.5 | 0
      - Legend: a good fit, a normal fit, a bad fit
  • 15. (no text)
  • 16. Feature: HBase | Cassandra | MongoDB
      - Replication between data centers: Asynchronous, needs testing | Replicas can span data centers with synchronous replication | Not supported
      - Cluster admin node: NameNode | Any node | mongos process
      - Implementation experience: 1+ | 1+ | 2+
      - Time spent on inserting 30 MB of data: 7 sec | 9 sec | 20 sec
      - Deployment cost per day: $450 | $400 | $500
      - Score: 2 | 2.5 | 0
      - Legend: a good fit, a normal fit, a bad fit
  • 17. (no text)
  • 18. (no text)
  • 19. Requirement: the prototype's result
      - Storing 2.5 TB of daily raw data for a week: Capable
      - Storing 1.5 TB of processed data for a year: Capable
      - Response time for building reports based on a pre-set template: ~25 sec
      - Response time of less than 6 hours for building a custom report: ~7 hours
      - Scalability: Good
      - Infrastructure independence: Yes
      - Using open-source tools: For all components
      - Fault tolerance: Yes
      - Deployment cost per day < $1,000: ~$600
  • 20. What a prototype lets you do
      - Properly visualize and test the functionality
      - Detect bottlenecks and change a technology/tool/database before it is implemented in the real system
      - Get a real vision of the final solution
      - Make sure you stick to the budget
  • 21. Andrei Yurkevich, President/CTO, andrei.yurkevich@altoros.com
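
The arithmetic behind slides 10, 14, and 19 can be sketched in a few lines of Python. This is only an illustration: the numbers are copied from the slides, the prototype measurements are the approximate values from slide 19, and every variable name is made up for this sketch rather than taken from the presentation.

```python
# Figures copied from slides 10, 14, and 19; all names here are illustrative.
DAILY_RAW_TB = 2.5          # raw data added per day
RAW_RETENTION_DAYS = 7      # raw data must be kept for at least a week
BUDGET_PER_DAY_USD = 1000   # deployment cost per day must stay below this

# Raw data held at any time if exactly one week is retained.
raw_window_tb = DAILY_RAW_TB * RAW_RETENTION_DAYS

# Deployment cost per day of each candidate database (slide 14).
cost_per_day = {"HBase": 450, "Cassandra": 400, "MongoDB": 500, "MySQL Cluster": 1500}
within_budget = [name for name, cost in cost_per_day.items()
                 if cost < BUDGET_PER_DAY_USD]

# (requirement limit, approximate prototype measurement) from slides 10 and 19.
checks = {
    "template report (sec)": (30, 25),
    "custom report (hours)": (6, 7),
    "deployment cost per day (USD)": (1000, 600),
}
results = {name: measured < limit for name, (limit, measured) in checks.items()}

print(f"Raw data kept for a week: {raw_window_tb} TB")
print("Within budget:", ", ".join(within_budget))
for name, ok in results.items():
    print(f"{name}: {'meets' if ok else 'misses'} the requirement")
```

Run as is, the sketch shows that MySQL Cluster is the only candidate over the daily budget and that the custom-report response time is the one requirement the prototype misses.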

Editor's notes

  1. Volume, velocity, variety. Where to start?
  2. Everything seemed to go smoothly. However, there was one detail about MySQL Cluster: its architecture requires keeping all data in RAM, so we needed a cluster with 2.5 TB of RAM. The actual deployment cost was about $500 over the budget, so we had to start from scratch again.
  3. HBase was 2 seconds faster than Cassandra, but what about fault tolerance? HBase has an additional node that serves as a coordinator for the entire system; if it fails, the system fails. We could add a secondary management node, but then we might exceed the budget. Cassandra has a decentralized architecture: all nodes of its cluster have equal roles, and any node can serve as a coordinator. This makes the database extremely fault tolerant.
  4. Raw data is all the data that comes from sensors. Processed data is data that has been aggregated over each 10-minute interval; it is used for building reports.
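
The 10-minute aggregation described in note 4 could look roughly like the sketch below. The notes do not say which aggregate is computed or how sensor readings are structured, so the mean, the `(sensor_id, timestamp, value)` tuple layout, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

WINDOW_MINUTES = 10  # aggregation interval mentioned in the notes


def window_start(ts):
    """Round a timestamp down to the start of its 10-minute window."""
    return ts.replace(minute=ts.minute - ts.minute % WINDOW_MINUTES,
                      second=0, microsecond=0)


def aggregate(readings):
    """Group (sensor_id, timestamp, value) readings by sensor and window.

    Returns {(sensor_id, window_start): mean value}. Using the mean is an
    assumption; the source only says the data is "aggregated".
    """
    buckets = defaultdict(list)
    for sensor_id, ts, value in readings:
        buckets[(sensor_id, window_start(ts))].append(value)
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical raw readings from a single sensor.
raw = [
    ("s1", datetime(2014, 1, 1, 12, 3, 15), 10.0),
    ("s1", datetime(2014, 1, 1, 12, 9, 59), 20.0),
    ("s1", datetime(2014, 1, 1, 12, 10, 0), 30.0),
]
processed = aggregate(raw)
```

Here the first two readings fall into the 12:00 window and the third starts the 12:10 window, so three raw rows become two processed rows. Reports would then read the much smaller processed set instead of scanning raw data.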