An outside view…
 Prof. Eric A. Brewer
 UC Berkeley
 Intel Research (until July)

Thought about it…
 Most of my wish list hasn’t changed much
   SIGMOD 97 keynote about search
   CIDR 2003 keynote about new areas that don’t fit DBMS well
 So, some review, some new stuff




Proposal: Layered Database
 Pros:
   Enable new database-like things
   Faster innovation for components
   Many parallel experiments (like Linux)
   Should be public domain ideally
 Cons:
   Hard to ensure global properties
      But those that care will get them…
 Closest is Berkeley DB (?)

Example: Search Engines
 No use of database technology
 Things that would have been helpful:
   High availability and replication
   Atomic version vectors
   Tools for new declarative languages
   Join machinery
 Not needed:
   Complex locks, Query Optimization
   Transactions, Redo, Undo
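
The "version vectors" item on the Search Engines slide can be made concrete with a small sketch. The slides do not define the structure or say what "atomic" covers, so everything below is an assumption for illustration: replica names are plain strings, and the function names `increment`, `merge`, and `compare` are hypothetical. The sketch shows only how two replicas' histories are compared and merged, not how an update is applied atomically.

```python
from typing import Dict

# Assumed representation: replica id -> number of updates seen from that replica.
VersionVector = Dict[str, int]


def increment(vv: VersionVector, replica: str) -> VersionVector:
    """Return a copy of vv with this replica's counter advanced by one."""
    out = dict(vv)
    out[replica] = out.get(replica, 0) + 1
    return out


def merge(a: VersionVector, b: VersionVector) -> VersionVector:
    """Element-wise maximum: the smallest vector that dominates both inputs."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


def compare(a: VersionVector, b: VersionVector) -> str:
    """Classify a relative to b as 'equal', 'before', 'after', or 'concurrent'."""
    replicas = a.keys() | b.keys()
    a_le_b = all(a.get(r, 0) <= b.get(r, 0) for r in replicas)
    b_le_a = all(b.get(r, 0) <= a.get(r, 0) for r in replicas)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"
```

Two replicas that each update the same starting version end up "concurrent", which is the case a replicated index has to detect and reconcile, for example by calling `merge` after resolving the conflict.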




Example: Scientific Computing
 Uses databases, but not a good fit
   Data often stored in files
   Most operators are outside the DBMS
   Database is an expensive replicated file system (in/out but no joins)
 Things that layered system might provide:
   Multi-version storage system
   New operators
   Tools for new declarative languages

Other Misfits
 Bioinformatics:
   Wrong operators
   Need error propagation
   Versioning, read mostly
 App Servers:
   Session state, session migration
   App server will be a small database
 Workflow




So what happened?
 Accepted: one size does not fit all…
 Couldn’t get much traction on layered database
 Built our own from scratch
   Stasis, Rusty Sears
   Open source, could be something special
 But big picture largely unchanged
   Too hard to explore the fun spaces
   But layering DID happen!
      But whole database is now just transactional storage

Directions I’d like to see…
 Integrated notion of statistics
   Store the noise (rather than clean it)
   Create cleaner views
   Core probabilistic queries
 Move away from update-in-place
   Many inputs are sacred (e.g. science)
   Transactional versioning
   Provenance & annotation
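
One way to picture "move away from update-in-place" together with versioning and provenance is an append-only store in which every write adds a new version and records where it came from. This is a minimal sketch under stated assumptions, not anything described in the talk and not the Stasis API: the class `VersionedStore`, its methods, and the `source` and `note` fields are hypothetical names, and the store is in-memory and not transactional.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Version:
    value: Any
    source: str      # provenance: where this value came from (assumed field)
    note: str = ""   # free-form annotation (assumed field)


class VersionedStore:
    """Append-only store: a write adds a version, nothing is overwritten."""

    def __init__(self) -> None:
        self._history: Dict[str, List[Version]] = {}

    def put(self, key: str, value: Any, source: str, note: str = "") -> int:
        """Append a new version for key and return its version number."""
        versions = self._history.setdefault(key, [])
        versions.append(Version(value, source, note))
        return len(versions) - 1

    def get(self, key: str, version: Optional[int] = None) -> Version:
        """Return the latest version, or a specific earlier one."""
        versions = self._history[key]
        return versions[-1] if version is None else versions[version]

    def history(self, key: str) -> List[Version]:
        """Return every version ever written for key, oldest first."""
        return list(self._history[key])
```

With this shape the original ("sacred") input stays readable after a cleaned value is written on top of it, and the cleaned value carries a record of what produced it.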




Directions (2)
 Better integration into PL
 BASE semantics (not just ACID)
 Repeated automatic extraction
   Web crawlers do this
   Much of MapReduce workload
   Need to integrate with versioning, provenance, statistics
   Import is a continuous process, not an event

Many Core
 Hard to get any performance benefit for I/O bound applications
 Main memory DB??
   Limited by off-chip bandwidth
   Need dataflow optimizations on/off chip




Backup

1) Layering enables competition
 Examples from OS community:
   X86, SPEC benchmarks, Virtual machines
   SCSI disks, RAID, NAS
   Routers, Firewalls, Proxies
 Some layers commodities (raw disks)
 Some layers innovative (replication)
 Always have unexpected uses




2) Many more experiments
 Centralized planning tries very few things
 Layering enables many more bets
   Also enables VC funding
     Ex: IP layer, ASICs => networking startups
   Enables niche markets (lower cost of entry)
     Easier path for XML, bio, spatial, …
   Most bets fail, but some succeed

3) Reduces Time to Market
 Lower cost of entry
 More important:
   Just good enough!
   Few global properties in early versions
     The web, search engines, even e-commerce
     P2P
     WebMethods
   Global properties added over time!
 Ugly but fast wins the race…




Claims
 If you can’t control, then enable
   This is the lesson from OS work for CIDR
   Unix, TCP enabled the web
     Neither attempted to control usage
   HTTP in turn enabled P2P
 DB research suffers from “Albatross 9i”
   Artifact hides the enabling technology
   CIDR exists for this reason

Conclusions
 Can’t control (or predict) the future… better to enable broad innovation
 Control
   Make global properties tractable
   But limits innovation
 A public domain layered database:
   Would enable more innovation
   Allow a broader range of properties




Rate of Innovation
 Claim: layering increases innovation

 1) Enables competition
 2) Many more experiments
 3) Reduces time to market





Claremont Report on Database Research: In Depth Talk (Eric A. Brewer)
