Small Data Sets to Scale:
  Planning for the Evolution of Data

      Poornima Vijayashanker
     CEO & Founder BizeeBee
     poornima@bizeebee.com
           @poornima
       www.femgineer.com
AGENDA
I. Stealth Mode - “pre-data” phase
II. Launch
III. Compute Growth Rate
IV. Optimizations
V. Data Storage
Pre-Data
Stealth Mode - “pre-data” phase
Small initial data set
  Easy storage
  Storage solutions like Heroku, RackSpace
  Design features around it

Simplicity of Storage v. Complexity of Design
  e.g. Mint - 3 months of financial data, FB - social graph is limited to universities
0 to 100k to 1M
0 - 100k easiest schema design
  Single DB - with user & static data
  Single instance of app accessing the db

100k - 1M+ time to re-design db and app
  Break up databases - user & static
  Multiple instances of the app
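
A minimal sketch of the "break up databases" step, assuming SQLite in-memory databases as stand-ins for two separate servers. The table names and the DatabaseRouter class are hypothetical; the slides only distinguish "user" from "static" data.

```python
import sqlite3

# Hypothetical table names: anything listed here is treated as static reference data.
STATIC_TABLES = {"categories", "currencies"}


class DatabaseRouter:
    """Sends each table to one of two databases: static data or user data."""

    def __init__(self, user_dsn=":memory:", static_dsn=":memory:"):
        # In production these would point at two different database servers.
        self.user_db = sqlite3.connect(user_dsn)
        self.static_db = sqlite3.connect(static_dsn)

    def for_table(self, table):
        """Return the connection that owns the given table."""
        return self.static_db if table in STATIC_TABLES else self.user_db


router = DatabaseRouter()
router.for_table("categories").execute(
    "CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT)"
)
router.for_table("transactions").execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
```

Because every app instance goes through the same routing rule, several instances can share the two databases without knowing where a table physically lives.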
Growth Rate
What is your user growth rate?
  Basic unit e.g. Mint - transaction
  User generated content
  Size of unit e.g. FB - photo

Storage capacity v. Seek v. Size
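
One rough way to turn these questions into a number is to multiply users by units per user by size per unit, month by month. The sketch below assumes compound monthly user growth; the function name and all parameters are illustrative and do not come from the talk.

```python
def project_storage_bytes(users, monthly_growth, units_per_user_per_month,
                          bytes_per_unit, months):
    """Project cumulative storage, assuming compound monthly user growth.

    users: user count at the start of the first month
    monthly_growth: fractional growth per month (0.10 means 10%)
    units_per_user_per_month: basic units created per user (e.g. transactions)
    bytes_per_unit: average size of one unit (e.g. one photo)
    """
    total = 0.0
    for _ in range(months):
        total += users * units_per_user_per_month * bytes_per_unit
        users *= 1 + monthly_growth
    return total


# Example inputs (made up): 1,000 users growing 10% a month,
# 50 transactions per user per month at roughly 200 bytes each.
one_year = project_storage_bytes(1000, 0.10, 50, 200, 12)
```

Running the same projection with a photo-sized unit instead of a transaction-sized one shows quickly whether capacity or unit size will become the constraint first.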
Optimizations
Capacity - throw hardware at it
Seek - throw software at it
  Cache data

Size - design around it
  Limit usage size e.g. 4MB picture
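
The two software-side ideas on this slide, caching and a size limit, can be sketched as follows. This assumes an in-process cache via functools.lru_cache and reads "4MB" as 4 * 1024 * 1024 bytes; load_profile_from_db is a hypothetical stand-in for a slow database read.

```python
from functools import lru_cache

# Assumption: "4MB" interpreted as 4 MiB.
MAX_UPLOAD_BYTES = 4 * 1024 * 1024

db_reads = {"count": 0}  # counts how often the slow path is hit


def load_profile_from_db(user_id):
    """Hypothetical slow lookup; a real app would query the database here."""
    db_reads["count"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


@lru_cache(maxsize=1024)
def get_profile(user_id):
    """Serve repeated reads from memory instead of seeking on disk."""
    return load_profile_from_db(user_id)


def accept_upload(data: bytes) -> bool:
    """Design around size: reject anything above the limit."""
    return len(data) <= MAX_UPLOAD_BYTES


get_profile(1)
get_profile(1)  # second call is answered from the cache
```

A cache like this trades freshness for speed, so data that changes often needs an invalidation rule, for example calling get_profile.cache_clear() after a write.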
Optimizations Cont’d
Code Level
  Processes - Computation v. Retrieval
  DB Techniques - Index, De-Normalize

Data Level
  Partitioning: Siloed v. Interconnected
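
As an illustration of two of these techniques, the sketch below adds an index to a SQLite table and routes users to partitions with a modulo rule. The table, index name, and shard_for function are assumptions made for the example. Modulo routing suits siloed data, where each user's rows stand alone; interconnected data such as a social graph usually needs a more careful scheme.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)")

# Index the column used for lookups so reads avoid a full table scan.
conn.execute("CREATE INDEX idx_tx_user ON transactions (user_id)")

# Ask SQLite how it would run the query; the last column describes the plan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT amount FROM transactions WHERE user_id = ?", (42,)
).fetchall()


def shard_for(user_id, shard_count):
    """Siloed partitioning: all of one user's rows live on a single shard."""
    return user_id % shard_count


# Hypothetical setup: four partitions, each its own database.
shards = [sqlite3.connect(":memory:") for _ in range(4)]
home = shards[shard_for(7, len(shards))]
```

Note that changing the shard count with a plain modulo rule moves most users to a different shard, so the count is best chosen with growth in mind.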
Data Storage
Single User’s Data v. Aggregated Data
  Single user’s data v. data aggregated across users
  e.g. Mint - Spending Trends
  Scheme to compute, store, and retrieve aggregated data
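
One possible compute, store, and retrieve scheme is a summary table that is rebuilt periodically from the per-user rows. The sketch below uses SQLite; the spending_trends table, its columns, and the sample rows are made up for illustration and are not taken from Mint.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE transactions (user_id INTEGER, category TEXT, amount REAL);
CREATE TABLE spending_trends (
    category TEXT PRIMARY KEY, user_count INTEGER, avg_amount REAL
);
""")
db.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, "groceries", 40.0), (2, "groceries", 60.0), (1, "travel", 300.0)],
)


def refresh_trends(conn):
    """Compute aggregates across all users and store them in one transaction."""
    with conn:
        conn.execute("DELETE FROM spending_trends")
        conn.execute("""
            INSERT INTO spending_trends
            SELECT category, COUNT(DISTINCT user_id), AVG(amount)
            FROM transactions GROUP BY category
        """)


def get_trend(conn, category):
    """Retrieve a precomputed aggregate without touching per-user rows."""
    return conn.execute(
        "SELECT user_count, avg_amount FROM spending_trends WHERE category = ?",
        (category,),
    ).fetchone()


refresh_trends(db)
```

Reads against the summary table stay cheap as the transactions table grows; the cost moves into the refresh job, which can run on a schedule or against a separate warehouse copy of the data.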
Conclusion
  Start small - provide enough value to user
  Monitor & project growth rate of data
  Break data apart
  Simple optimizations - indexing, de-normalizing, caching
  Large data sets - warehousing, partitioning db
  Hiring designer & engineer for BizeeBee :)

P. Vijayashanker, "From Small Datasets to Scale: Planning for the Evolution of Data," Social Developer Summit
