(R)Evolution in Database Systems
• RDBMS - The origins

• Concepts, Architecture and Principles

• Golden Age - Way of life

• Changing Times - New Problems, New Needs

• Attack on the citadel - Revisiting the norms

• Ignited Minds - Working towards NoSQL Solutions

• Way Ahead - It is cloudy out there
• Girish Narasimha Raghavan

• Over 15 years of experience building distributed, large-scale,
  and highly available enterprise systems.

• Current interests include building SAC (Social, Big Data
  Analytics, and Cloud) solutions.

• Likes to write about and discuss technologies and their
  application to real-world problems.
   • http://randomtechthought.blogspot.com
• Data abounds in the world. It always has and always will.
   • Record keeping is as old as the human race.
   • There is a constant quest to improve how records are stored, accessed, and analyzed.

• The early machines had serious shortcomings.
   • Only a very limited amount of program code and data could be stored in memory.
   • Electromagnetic data storage was feasible only at extremely high cost.

• Storing data was an issue
   • Organizations had to store data related to administration, research, and operations.
   • Data was stored in proprietary formats; database systems did not exist.
   • Systems were plagued by data integrity issues.
   • Application logic for accessing stored data was non-standard.
• First attempt: file-based systems
   • Data sets were growing and accumulating.
   • Data had to be managed at a detailed transaction level.
   • Computing systems started to be used for critical business needs.
   • Data inconsistency and redundancy became problems.

• Enter database systems
   • Attempts to standardize the processes and rules for storing and accessing data.
   • Intention to reuse, resell, and redeploy solutions across organizations (with significant customizations).
   • Attempt to proactively manage data integrity and quality.
• Database systems and concepts evolve
   • Hierarchical DBMS
      • Information is represented using parent/child relationships.
      • A tree is the primary data structure.

   • Network DBMS
      • Relationships are represented in the form of a network.
      • A graph is the primary data structure.

• Challenges galore
   • Hardware dependency - Software was strongly dependent on the underlying hardware.
   • Modeling challenges - Representing data under a common structure.
   • Integration issues - Integrating across dependent packages was a nightmare.
   • Introducing new functionality and updates - Solution providers struggled with this across customized software deployments.
Father of the Relational Database Model

Edgar F. Codd

A British computer scientist who made significant contributions to the theory of
relational databases while working for IBM.
• Landmark paper by Codd - "A Relational Model of Data for Large Shared Data Banks"
   • Independence of data from the hardware and storage implementation.
   • Automatic navigation to the data set through a high-level, nonprocedural language for data access.
   • Concept of keys (primary, secondary).
   • A theoretical proposal, with no practical design or implementation.

• Codd's 12 rules for a relational database management system
   • http://cims.clayton.edu/booth/ITDB%204201/Codd%20PDF.pdf
[Diagram: Application 1, Application 2, Application 3, Reporting Solutions, and Future Applications grouped around a central Database Management System (DBMS), which fronts the Databases (Data Storage).]
• Data Definition
   • For describing data and the data structures for handling the data.

• Data Manipulation
   • For describing the operations associated with the data, such as storage, query, and change.

• Data Security and Integrity
   • For ensuring secure and controlled access to the storage and manipulation of data.
   • For ensuring the correctness, consistency, and reliability of the stored data.

• Data Recovery and Concurrency
   • For providing and enforcing recovery and concurrency controls.

• Data Dictionary
   • For providing information about the data stored.
   • For liaising between the conceptual and physical storage.

• Performance
   • For ensuring that all the above operations are performed efficiently and effectively.
External/User
How the user accesses and sees the data
[Tables, Views]

Conceptual/Logical
How data is organized logically
[Table Spaces]

Physical/Internal
How data is stored internally
[Data Files]
• Relation (Table) - A set of tuples that have the same attributes.

• Tuple (Row) - A tuple usually represents an object and information about that object.

• Attribute (Column) - Represents a particular characteristic of that object.
• Domain - The set of permitted values for a given attribute; the values of an
  attribute are drawn from its domain.
• Constraints - Constraints make it possible to further restrict the domain of an
  attribute. They bind the attribute to a set of rules.
• Primary Key - An attribute (or set of attributes) that uniquely identifies each
  tuple within a relation.
• Foreign Key - An attribute used to cross-reference tables: it refers to a key
  (typically the primary key) of another table.
• Cardinality - Expresses the number of instances of one entity that can be
  associated with an instance of another entity via a relationship.
• Index - A mechanism for providing quicker access to data. Indices can be
  created on any combination of attributes of a relation.
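
The concepts above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a recommended schema: the department and employee tables, their columns, and the sample rows are hypothetical. It shows a primary key, a foreign key, a CHECK constraint that narrows a domain, and an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is enabled

# Hypothetical schema used only for illustration.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,                             -- primary key: identifies each row
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    salary  INTEGER NOT NULL CHECK (salary > 0),             -- constraint narrowing the domain
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)  -- foreign key
);
CREATE INDEX idx_employee_dept ON employee(dept_id);         -- index for faster lookups
""")

conn.execute("INSERT INTO department VALUES (1, 'Research')")
conn.execute("INSERT INTO employee VALUES (100, 'Asha', 5000, 1)")
conn.commit()


def try_insert(sql):
    """Run one INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        conn.commit()
        return "ok"
    except sqlite3.IntegrityError as exc:
        conn.rollback()
        return f"rejected: {exc}"


results = {
    "duplicate primary key": try_insert("INSERT INTO employee VALUES (100, 'Ravi', 4000, 1)"),
    "unknown department": try_insert("INSERT INTO employee VALUES (101, 'Ravi', 4000, 99)"),
    "negative salary": try_insert("INSERT INTO employee VALUES (102, 'Ravi', -5, 1)"),
    "valid row": try_insert("INSERT INTO employee VALUES (103, 'Ravi', 4000, 1)"),
}
for case, outcome in results.items():
    print(f"{case}: {outcome}")
```

The first three inserts are rejected by the database itself rather than by application code, which is the point of declaring keys and constraints in the schema.
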
• Based on the perception that the real world can be modeled around
  base objects (entities) and the relationships among them.

• Modeling of data in a top-down fashion
   • Conceptual Model - The highest-level and least granular model. It
     defines the master reference data entities that are commonly used
     in the problem space.

   • Logical Model - Generally builds on the conceptual model by adding
     more granular details such as operational and transactional data
     entities.

   • Physical Model - Specifies relational database objects such as
     tables, indexes (for example unique key indexes), and constraints.

• The models can be visualized through what are commonly known
  as ER diagrams.
• Normalization is the process of organizing the attributes and tables of a
  relational database to minimize redundancy and dependency.
• Objectives (as specified by Codd)
   • To free the collection of relations from undesirable insertion, update,
     and deletion dependencies.
   • To reduce the need for restructuring the collection of relations as new
     types of data are introduced, and thus increase the life span of
     application programs.
   • To make the relational model more informative to users.
   • To make the collection of relations neutral to the query statistics, where
     these statistics are liable to change as time goes by.
• Normal Forms (NF)
   • 1NF - Every attribute contains atomic values only.
   • 2NF - 1NF + every non-key attribute is fully dependent on the whole
     primary key (no partial dependencies).
   • 3NF - 2NF + every non-key attribute is non-transitively dependent on
     the primary key.
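
A small sketch of what removing a transitive dependency looks like, using plain Python data structures. The order and customer fields here are invented for illustration. In the flat rows, customer_city depends on customer_id rather than on the key order_id, so the same city is stored repeatedly.

```python
# Denormalized rows: customer_city depends on customer_id, not on the key order_id
# (a transitive dependency, so this relation is not in 3NF).
orders_flat = [
    {"order_id": 1, "customer_id": "C1", "customer_city": "Pune", "amount": 250},
    {"order_id": 2, "customer_id": "C1", "customer_city": "Pune", "amount": 100},
    {"order_id": 3, "customer_id": "C2", "customer_city": "Delhi", "amount": 75},
]


def normalize(rows):
    """Split the flat rows into a customers relation and an orders relation."""
    customers = {}
    orders = []
    for row in rows:
        # Each customer's city is now stored exactly once, keyed by customer_id.
        customers[row["customer_id"]] = {"city": row["customer_city"]}
        orders.append(
            {"order_id": row["order_id"], "customer_id": row["customer_id"], "amount": row["amount"]}
        )
    return customers, orders


customers, orders = normalize(orders_flat)

# A single update now changes the city for every order of that customer.
customers["C1"]["city"] = "Mumbai"


def city_of_order(order_id):
    """Look up an order's city by following the customer_id reference (a join)."""
    order = next(o for o in orders if o["order_id"] == order_id)
    return customers[order["customer_id"]]["city"]


print(city_of_order(1), city_of_order(2), city_of_order(3))
```

In the flat form the same change would have required touching every row for customer C1, and missing one would leave the data inconsistent (an update anomaly).
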
• ACID: properties that guarantee that database transactions are processed
  reliably.
   • A single logical operation (possibly involving multiple steps) is called a transaction.
• Properties
   • Atomicity - "All or nothing": if one part of the transaction fails, the entire
     transaction fails.
   • Consistency - Any data written to the database must be valid according
     to all defined rules and constraints.
   • Isolation - Even under concurrent execution, the system ends up in the
     same state it would reach if the transactions were executed serially.
   • Durability - Once a transaction has been committed, its results are
     stored permanently, irrespective of errors and crashes that occur
     after the commit.
• In an RDBMS, ACID properties are implemented using techniques
  such as locking and multiversion concurrency control.
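
Atomicity can be sketched with sqlite3 from the Python standard library. The account table, the names, and the amounts are hypothetical; the transfer deliberately credits before debiting so that a failed debit has something to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule: no overdrafts
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move money between accounts as one transaction; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


first = transfer("alice", "bob", 30)    # succeeds
second = transfer("alice", "bob", 500)  # debit violates the CHECK; the credit is rolled back too
print(first, second, balances())
```

The failed transfer leaves no trace: the credit to bob that had already been applied inside the transaction is undone along with the rejected debit.
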
• RDBMS-based solutions are generally the first choice for
  database storage and access needs.

• RDBMS solutions are now mature and predictable.

• An army of skilled specialists exists for using,
  managing, and maintaining RDBMS-based systems.

• RDBMS has spawned an ecosystem of products that
  makes choosing an RDBMS a no-brainer.
• Ensures consistent behavior
   • With the table structure as the base, an RDBMS provides a consistent mechanism for
     storing and accessing different data sets.
• Removes redundancies
   • Normal forms remove redundancies in the data, thereby addressing
     the errors that can arise from inconsistency in the stored data.
• Avoids errors
   • Ensures data integrity and quality through consistent storage, enforced
     constraints and relationships, and the ability to check data as it is entered.
• Facilitates easy analysis
   • With SQL queries as the foundation, analyzing different data sets is seamless.
     Given the history of RDBMS, users also have a vast repository of tools for
     performing analysis.
• Ensures robust maintenance and management
   • Database administrators are provided with tools that enable them to easily
     maintain, test, repair, and back up the databases housed in the system.
• Is secure
   • Offers a good level of security and access control. All or part of the data can be
     securely shared across multiple users (or applications) based on the privileges
     granted to them.
• Rise of social networks during the early 2000s
   • The World Wide Web acts as the foundation.

• Shift in communication patterns
   • Sharing of personal information, and use of that information
   • Everyone turned into a publisher

• Increased focus on personalization
   • Recommendations, ratings, preferences, and personalized
     interfaces
• Big Data flood
   • More data is being generated now than was generated in all of
     human history before
   • Need to store and process unstructured or semi-structured data at
     volumes and frequencies not previously anticipated or
     encountered
Ref: http://www.go-gulf.com/blog/60-seconds
• Accessible by users across the globe
   • Geography is irrelevant
   • Facebook, Google, Yahoo, Twitter, etc. have users across the world

• Highly networked and distributed systems
   • Systems are accessed and connected over the Internet

• Need to be highly scalable
   • Should be able to handle additional load without redesign
   • Amazon sees a manifold increase in traffic to the site during the holiday season

• Expected to be highly available
   • Systems must always be available for access and operations
   • Google will incur a huge loss of revenue and credibility if the site goes down

• Handle large data sets hitting the systems with high frequency
   • The data needs to be stored and processed very quickly
   • The number of likes and comments on Facebook has exceeded 2.7 billion per day
• Brewer's CAP Theorem
   • You can get only two out of the following three
      • Consistency - Every read sees the most recent write; all nodes
        present the same data at the same time. (This differs from the
        C in ACID.)
      • Availability - The system must always be available for operations.
      • Partition Tolerance - The system must keep working when some
        nodes cannot reach each other.

• RDBMS were essentially designed for CA
   • Latency (response time) is an unfortunate tradeoff for
     consistency
   • Partition tolerance becomes essential in distributed
     systems
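
The tradeoff can be made concrete with a toy two-replica store. This is a deliberately simplified sketch, not a model of any real product: the class names and the "CP"/"AP" mode flag are invented for illustration. During a partition, a CP store refuses writes to stay consistent, while an AP store accepts them and lets the replicas diverge.

```python
class Replica:
    def __init__(self):
        self.data = {}


class TwoNodeStore:
    """Toy store with two replicas; `mode` is "CP" or "AP" (hypothetical flag)."""

    def __init__(self, mode):
        self.mode = mode
        self.nodes = [Replica(), Replica()]
        self.partitioned = False  # True when the replicas cannot reach each other

    def write(self, node, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency over availability: refuse rather than diverge.
                raise RuntimeError("unavailable: cannot reach the other replica")
            # Availability over consistency: accept locally, replicas now differ.
            self.nodes[node].data[key] = value
            return
        for replica in self.nodes:  # healthy network: replicate synchronously
            replica.data[key] = value

    def read(self, node, key):
        return self.nodes[node].data.get(key)


cp = TwoNodeStore("CP")
cp.write(0, "x", 1)
cp.partitioned = True
try:
    cp.write(0, "x", 2)
    cp_result = "accepted"
except RuntimeError:
    cp_result = "refused"
cp_views = (cp.read(0, "x"), cp.read(1, "x"))

ap = TwoNodeStore("AP")
ap.write(0, "x", 1)
ap.partitioned = True
ap.write(0, "x", 2)
ap_views = (ap.read(0, "x"), ap.read(1, "x"))

print(cp_result, cp_views, ap_views)
```

The CP store stays consistent but turns the client away; the AP store keeps serving but returns different answers depending on which replica is asked.
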
• Beyond a point you cannot afford to scale up storage
   • It becomes very expensive to keep scaling up.

• Is strict consistency really so important?
   • Ensuring consistency slows the system
   • Google found that moving from a 10-result page loading in 0.4 seconds to
     a 30-result page loading in 0.9 seconds decreased traffic and ad revenues
     by 20% (Linden 2006)
• Redundancy can be managed
   • Joins across normalized database tables are less efficient than reading
     pre-joined (denormalized) data from a single store
• Not all data is relational
   • Fitting every kind of data under the rigid schema structure of an RDBMS is
     a challenge
   • Remodeling data read from an RDBMS back into its original shape (say a tree,
     graph, or key-value structure) puts significant stress on computing resources.
   • Attributes (columns) are restricted by domain to storing similar data.
   • Managing semi-structured and unstructured data such as documents becomes a
     challenge.
• CRUD (Create, Read, Update, Delete) is crude
   • Updates and deletes should never be allowed as they destroy
     information.

• Logical and physical separation of concerns ignored
   • The relational model is a logical model
   • Database products implemented the relational model at the physical
     level as a set of B-tree files with multiple indexes.
   • This induces artificial overhead in managing the database.

• It is built over spinning disks
   • Traditional RDBMS implementations assume that the data comes from
     disk
   • A legacy of an era when memory was expensive.
   • Memory-based systems will be faster

• Databases are big and slow
   • Fundamentally not designed for big data sets
   • Long queries get slower with more data
• BASE: core tenets
   • Basically Available
      • The system seems to work all the time
   • Soft State
      • It doesn't have to be consistent all the time
   • Eventual Consistency
      • It becomes consistent eventually (at some later time)

• Significance
   • BASE is diametrically opposed to ACID.
      • ACID is pessimistic and forces consistency at the end of every operation
      • BASE is optimistic and accepts that database consistency will be in a
        state of flux.
   • Availability is achieved by supporting partial failures
     without total system failure
      • It is acceptable for the system to stay available for 80% of users while
        a failure is limited to the other 20%.
   • Users should understand the implications of eventual consistency
      • It factors in a probability of data loss. Safety of the data is the tradeoff
      • Need to understand how eventual "eventual" is
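
A toy illustration of soft state and eventual consistency, under simplifying assumptions: one primary replica acknowledges writes immediately, and a hypothetical anti_entropy() step later copies them to the followers. Real systems replicate continuously and must also resolve conflicting writes, which this sketch ignores.

```python
from collections import deque


class EventuallyConsistentStore:
    """Writes are acknowledged by the primary and replicated to followers later."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]  # replica 0 is the primary
        self.pending = deque()  # changes not yet applied to the followers

    def write(self, key, value):
        self.replicas[0][key] = value  # acknowledged as soon as the primary has it
        self.pending.append((key, value))

    def read(self, replica, key):
        return self.replicas[replica].get(key)

    def anti_entropy(self):
        """Apply all pending changes, in order, to every follower."""
        while self.pending:
            key, value = self.pending.popleft()
            for follower in self.replicas[1:]:
                follower[key] = value


store = EventuallyConsistentStore()
store.write("likes", 10)

stale = store.read(2, "likes")          # follower has not seen the write yet
store.anti_entropy()
converged = store.read(2, "likes")      # after replication every replica agrees

print(stale, converged)
```

The gap between the write and the call to anti_entropy() is the window in which readers can see stale data, which is what "how eventual is eventual" asks about.
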
• NoSQL - Not Only SQL
   • It is not SQL and it is not relational

• Essential feature set
   • Elastic scaling - Rely on scale-out rather than scale-up
   • Big Data - Handle high volume, high velocity, and high variability
   • Commoditized manageability - Reduce dependence on highly skilled
     DBAs and lower administration costs
   • Economics - Build over commodity hardware
   • Flexible data model - Remove data-model-based restrictions.

• Applicability
   • Performance and real-time behavior matter more than consistency
   • High scalability is needed
   • Large data sets must be stored and retrieved
   • A relational model is not required
• Key Value
   • The idea is to use a hash table in which a unique key points to a
     particular item of data. Simplest to implement.
   • It is inefficient when you are only interested in querying or updating part
     of a value
• Column Store
   • Created to store and process very large amounts of data distributed over
     many machines
   • There are still keys, but each points to multiple columns.
   • The columns are arranged by column family.

• Document
   • The model is basically versioned documents that are collections of other
     key-value collections.
   • The semi-structured documents are stored in formats like JSON.
   • Nested values can be associated with each key.
   • Document databases support querying on document contents more
     efficiently than plain key-value stores.

• Graph
   • A flexible graph model is used which, again, can scale across multiple
     machines
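
The four data models can be sketched with plain Python structures. This shows shapes only, not any product's API; the user records, field names, and helper functions are hypothetical.

```python
import json

# Key-value: an opaque value addressed only by its key.
key_value = {"user:42": json.dumps({"name": "Asha", "city": "Pune"})}

# Column family: row key -> column family -> column -> value.
column_family = {
    "user:42": {
        "profile": {"name": "Asha", "city": "Pune"},
        "activity": {"last_login": "2012-06-01"},
    }
}

# Document: nested, self-describing records that can be queried by field.
documents = [
    {"_id": 42, "name": "Asha", "address": {"city": "Pune"}, "tags": ["admin"]},
    {"_id": 43, "name": "Ravi", "address": {"city": "Delhi"}, "tags": []},
]

# Graph: nodes connected by labelled edges.
edges = [("Asha", "follows", "Ravi"), ("Ravi", "follows", "Meera")]


def kv_city(key):
    """Key-value stores return the whole value; the client decodes it to read one part."""
    return json.loads(key_value[key])["city"]


def cf_column(row_key, family, column):
    """Column-family stores can address a single column within a row."""
    return column_family[row_key][family][column]


def docs_in_city(city):
    """Document stores can filter on a nested field."""
    return [d["name"] for d in documents if d["address"]["city"] == city]


def followed_by(person):
    """Graph stores answer relationship questions by walking edges."""
    return [dst for src, rel, dst in edges if src == person and rel == "follows"]


print(kv_city("user:42"), cf_column("user:42", "activity", "last_login"))
print(docs_in_city("Delhi"), followed_by("Asha"))
```

Note how kv_city has to fetch and decode the entire value to read one field, which is the inefficiency called out for key-value stores above.
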
Access Interfaces
   REST/HTTP | Thrift | Map Reduce | Language-Specific API

Logical Data Model
   Key Value | Column Family Store | Document | Graph

Support and Distribution
   CAP Support | Multi Data Center Support | Dynamic Provisioning | Proactive Monitoring

Data Persistence
   Memory Based | Disk Based | Combination of Memory and Disk
NoSQL

Key Value        Column Store     Document         Graph
MemCached        SimpleDB         CouchDB          Neo4J
Redis            BigTable         MongoDB          InfoGrid
SimpleDB         HBase            Lotus Domino     FlockDB
Tokyo Cabinet    Cassandra        Riak             InfiniteGraph
Dynamo           HyperTable
Voldemort        Azure TS
• It is not mature
   • RDBMS is mature, stable, and functionally rich.
   • Most NoSQL alternatives are in pre-production versions with many key
     features yet to be implemented.
• Support
   • Most NoSQL systems are open source projects.
   • Support is mostly offered by startup companies, whose reach and
     credibility are not on par with RDBMS vendors.
• Analytics
   • NoSQL databases offer few facilities for ad hoc query and analysis.
   • Even a simple query requires significant programming expertise.
   • At present, commonly used BI tools do not provide credible
     connectivity to NoSQL.
• Administration and maintenance
   • The desired goal of zero maintenance is far away.
   • In reality, significant effort is required to maintain the systems.
• Expertise
   • Currently there is very limited awareness and knowledge
• Scalability
   • Master/Slave - One master, many slaves
      • Write to the master; read from any of the slaves
   • Partitioning - Group and localize related functions across nodes
      • Partition vertically (by function) or horizontally (by key)
   • Caching - Memory-based cache in front of the database
      • Addresses scaling issues due to read and write loads

• High Availability
   • Clustering - Group of systems responsible for a service
      • Build redundancy into a cluster to eliminate single points of failure
   • Mirroring and Replication - Maintain a hot standby
      • Handles planned or unplanned downtime
   • Recovery Solutions - Dependable data backup, restore, and
     recovery procedures
      • Combine process with tools
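
Horizontal partitioning by key can be sketched as follows. This is a minimal in-memory illustration: each "shard" is a dict standing in for a separate database node, and the shard count, key format, and helper names are assumptions.

```python
import hashlib

SHARD_COUNT = 4  # hypothetical number of database nodes
shards = [dict() for _ in range(SHARD_COUNT)]  # each dict stands in for one node


def shard_for(key):
    """Map a key to a shard with a stable hash, so every client routes the same way."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % SHARD_COUNT


def put(key, row):
    shards[shard_for(key)][key] = row


def get(key):
    return shards[shard_for(key)].get(key)


for user_id in range(1000):
    put(f"user:{user_id}", {"id": user_id})

sizes = [len(s) for s in shards]
print(sizes, get("user:7"))
```

A plain modulo scheme like this reshuffles most keys when SHARD_COUNT changes; schemes such as consistent hashing exist to reduce that movement, at the cost of more machinery.
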
• Performance
   • Be open to denormalization - and accelerate reads
      • Allow redundancy and duplicates to reduce joins
   • Optimize your costly queries - Analyze and optimize the expensive
     queries
      • Use a mix of design strategy, indices, and analysis from query optimization tools
   • Invest in better hardware - storage and memory
      • It is not a bad bet: storage and memory costs have dropped significantly

• Rigid Schemas - Not all data is relational
   • Even the most schema-less model has some schema
      • The world revolves around structures
   • If a key-value kind of store is needed, you can do the same in any
     RDBMS
      • The RDBMS provides the added advantage of structured access and queries
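
The last point can be sketched with sqlite3: a two-column table behaves like a key-value store, while SQL remains available for queries across keys. The table name kv, the column names, and the session keys are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)")


def put(key, value):
    """Insert the key, or overwrite its value if it already exists."""
    conn.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))
    conn.commit()


def get(key):
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else None


def keys_with_prefix(prefix):
    """Structured access that a bare key-value store would not offer directly."""
    rows = conn.execute(
        "SELECT k FROM kv WHERE substr(k, 1, ?) = ? ORDER BY k", (len(prefix), prefix)
    )
    return [k for (k,) in rows]


put("session:1", "alice")
put("session:2", "bob")
put("config:theme", "dark")
put("session:1", "carol")  # overwrites the earlier value

print(get("session:1"), get("missing"), keys_with_prefix("session:"))
```

This shows only that the access pattern is possible; whether it matches a dedicated key-value store's throughput or scale-out behavior is a separate question that depends on the workload.
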
• Systems will eventually gravitate towards one of these three
   • Fast, agile, highly scalable data stores
   • Handlers of complex transactional semantics
   • Analytical processors and facilitators

• The world is never binary
   • It is never either this or that.
   • Why fight over technicalities?

• Drive decisions based on use cases
   • Choose a model based on the use cases and scenarios
   • Research and understand what your application needs
   • Stay away from replacing "hard work" with "rhetoric"
   • Be open to experimentation
• http://www.guug.de/lokal/muenchen/2007-05-14/rdbmsc.pdf
• http://ansonalex.com/infographics/twitter-usage-statistics-2012-infographic/
• http://www.mountainman.com.au/software/history/it1.html
• http://www.slideshare.net/renguzi/codd
• http://cims.clayton.edu/booth/ITDB%204201/Codd%20PDF.pdf
• http://www.scribd.com/doc/19381895/RDBMS-Concepts
• http://www.gitta.info/DBSysConcept/en/text/DBSysConcept.pdf
• http://en.wikipedia.org/wiki/Relational_database
• http://en.wikipedia.org/wiki/ACID
• http://blogs.hbr.org/now-new-next/2009/05/the-social-data-revolution.html
• http://www.go-gulf.com/blog/60-seconds
• http://en.wikipedia.org/wiki/CAP_theorem
• http://highscalability.com/drop-acid-and-think-about-data
• http://queue.acm.org/detail.cfm?id=1394128
• http://www.bailis.org/blog/safety-and-liveness-eventual-consistency-is-not-safe/
• http://www.techrepublic.com/blog/10things/10-things-you-should-know-about-nosql-databases/1772
• http://rebelic.nl/engineering/the-four-categories-of-nosql-databases/
• http://www.slideshare.net/ksankar/nosql-4559402
• http://www.thevirtualcircle.com/2008/11/10/6-reasons-why-relational-database-will-be-superseded/
• http://www.slideshare.net/sbtourist/scale-your-database-and-be-happy
• Note:
  Many images used in the deck were found through Google image search. Although I have not been able to
  credit the sources of all the images individually, I extend my sincere thanks to the owners of the images for making
  them available on the net.

More Related Content

What's hot

Db trends final
Db trends   finalDb trends   final
Db trends finalCraig Mullins
Β 
[Www.pkbulk.blogspot.com]dbms01
[Www.pkbulk.blogspot.com]dbms01[Www.pkbulk.blogspot.com]dbms01
[Www.pkbulk.blogspot.com]dbms01AnusAhmad
Β 
Basha_ETL_Developer
Basha_ETL_DeveloperBasha_ETL_Developer
Basha_ETL_Developerbasha shaik
Β 
Heterogeneous Data - Published
Heterogeneous Data - PublishedHeterogeneous Data - Published
Heterogeneous Data - PublishedPaul Steffensen
Β 
NoSQL Now! NoSQL Architecture Patterns
NoSQL Now! NoSQL Architecture PatternsNoSQL Now! NoSQL Architecture Patterns
NoSQL Now! NoSQL Architecture PatternsDATAVERSITY
Β 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQLDr-Dipali Meher
Β 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture OverviewChristopher Foot
Β 
PharmMD ETL Developer Job Description
PharmMD ETL Developer Job DescriptionPharmMD ETL Developer Job Description
PharmMD ETL Developer Job Descriptionbrittanydalton
Β 
Sql no sql
Sql no sqlSql no sql
Sql no sqlDave Stokes
Β 
Resume_Ratna Rao updated
Resume_Ratna Rao updatedResume_Ratna Rao updated
Resume_Ratna Rao updatedRatna Rao yamani
Β 
ETL Developer Resume
ETL Developer ResumeETL Developer Resume
ETL Developer ResumeTeferi Tamiru
Β 
ETL_Developer_Resume_Shipra_7_02_17
ETL_Developer_Resume_Shipra_7_02_17ETL_Developer_Resume_Shipra_7_02_17
ETL_Developer_Resume_Shipra_7_02_17Shipra Jaiswal
Β 
Data Warehouse - What you know about etl process is wrong
Data Warehouse - What you know about etl process is wrongData Warehouse - What you know about etl process is wrong
Data Warehouse - What you know about etl process is wrongMassimo Cenci
Β 
To SQL or NoSQL, that is the question
To SQL or NoSQL, that is the questionTo SQL or NoSQL, that is the question
To SQL or NoSQL, that is the questionKrishnakumar S
Β 
Polyglot Persistence
Polyglot Persistence Polyglot Persistence
Polyglot Persistence Dr-Dipali Meher
Β 
Basha_ETL_Developer
Basha_ETL_DeveloperBasha_ETL_Developer
Basha_ETL_Developerbasha shaik
Β 

What's hot (20)

Db trends final
Db trends   finalDb trends   final
Db trends final
Β 
[Www.pkbulk.blogspot.com]dbms01
[Www.pkbulk.blogspot.com]dbms01[Www.pkbulk.blogspot.com]dbms01
[Www.pkbulk.blogspot.com]dbms01
Β 
Basha_ETL_Developer
Basha_ETL_DeveloperBasha_ETL_Developer
Basha_ETL_Developer
Β 
Heterogeneous Data - Published
Heterogeneous Data - PublishedHeterogeneous Data - Published
Heterogeneous Data - Published
Β 
NoSQL Now! NoSQL Architecture Patterns
NoSQL Now! NoSQL Architecture PatternsNoSQL Now! NoSQL Architecture Patterns
NoSQL Now! NoSQL Architecture Patterns
Β 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
Β 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture Overview
Β 
PharmMD ETL Developer Job Description
PharmMD ETL Developer Job DescriptionPharmMD ETL Developer Job Description
PharmMD ETL Developer Job Description
Β 
Ramachandran_ETL Developer
Ramachandran_ETL DeveloperRamachandran_ETL Developer
Ramachandran_ETL Developer
Β 
Sql no sql
Sql no sqlSql no sql
Sql no sql
Β 
Resume_Ratna Rao updated
Resume_Ratna Rao updatedResume_Ratna Rao updated
Resume_Ratna Rao updated
Β 
Nosql
NosqlNosql
Nosql
Β 
ETL Developer Resume
ETL Developer ResumeETL Developer Resume
ETL Developer Resume
Β 
ETL_Developer_Resume_Shipra_7_02_17
ETL_Developer_Resume_Shipra_7_02_17ETL_Developer_Resume_Shipra_7_02_17
ETL_Developer_Resume_Shipra_7_02_17
Β 
Data Warehouse - What you know about etl process is wrong
Data Warehouse - What you know about etl process is wrongData Warehouse - What you know about etl process is wrong
Data Warehouse - What you know about etl process is wrong
Β 
Etl techniques
Etl techniquesEtl techniques
Etl techniques
Β 
To SQL or NoSQL, that is the question
To SQL or NoSQL, that is the questionTo SQL or NoSQL, that is the question
To SQL or NoSQL, that is the question
Β 
My C.V
My C.VMy C.V
My C.V
Β 
Polyglot Persistence
Polyglot Persistence Polyglot Persistence
Polyglot Persistence
Β 
Basha_ETL_Developer
Basha_ETL_DeveloperBasha_ETL_Developer
Basha_ETL_Developer
Β 

Viewers also liked

Rdbms
RdbmsRdbms
Rdbmsrdbms
Β 
Rdbms
RdbmsRdbms
Rdbmstech4us
Β 
Introduction to RDBMS
Introduction to RDBMSIntroduction to RDBMS
Introduction to RDBMSSarmad Ali
Β 
Relational database management system (rdbms) i
Relational database management system (rdbms) iRelational database management system (rdbms) i
Relational database management system (rdbms) iRavinder Kamboj
Β 
3. Relational Models in DBMS
3. Relational Models in DBMS3. Relational Models in DBMS
3. Relational Models in DBMSkoolkampus
Β 
Amazon SimpleDB
Amazon SimpleDBAmazon SimpleDB
Amazon SimpleDBSean Collins
Β 
Historical Evolution of RDBMS
Historical Evolution of RDBMSHistorical Evolution of RDBMS
Historical Evolution of RDBMSShailesh Pachori
Β 
Procedures/functions of rdbms
Procedures/functions of rdbmsProcedures/functions of rdbms
Procedures/functions of rdbmsjain.pralabh
Β 
Difference between RDBMS & DBMS
Difference between RDBMS & DBMSDifference between RDBMS & DBMS
Difference between RDBMS & DBMSRisha Bagchi
Β 
Life and work of E.F. (Ted) Codd | Turing100@Persistent
Life and work of E.F. (Ted) Codd | Turing100@PersistentLife and work of E.F. (Ted) Codd | Turing100@Persistent
Life and work of E.F. (Ted) Codd | Turing100@PersistentPersistent Systems Ltd.
Β 
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Webscale PostgreSQL - JSONB and Horizontal Scaling StrategiesWebscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Webscale PostgreSQL - JSONB and Horizontal Scaling StrategiesJonathan Katz
Β 
library management system in SQL
library management system in SQLlibrary management system in SQL
library management system in SQLfarouq umar
Β 
Database : Relational Data Model
Database : Relational Data ModelDatabase : Relational Data Model
Database : Relational Data ModelSmriti Jain
Β 
Database Management Systems (DBMS)
Database Management Systems (DBMS)Database Management Systems (DBMS)
Database Management Systems (DBMS)Dimara Hakim
Β 
Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)
Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)
Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)Beat Signer
Β 
Corporate etiquette ppt by rahul kapoliya
Corporate etiquette ppt by rahul kapoliyaCorporate etiquette ppt by rahul kapoliya
Corporate etiquette ppt by rahul kapoliyarahul kapoliya
Β 
Dbms and rdbms ppt
Dbms and rdbms pptDbms and rdbms ppt
Dbms and rdbms pptrahul kapoliya
Β 

Viewers also liked (20)

Rdbms
RdbmsRdbms
Rdbms
Β 
Rdbms
RdbmsRdbms
Rdbms
Β 
Introduction to RDBMS
Introduction to RDBMSIntroduction to RDBMS
Introduction to RDBMS
Β 
Relational database management system (rdbms) i
Relational database management system (rdbms) iRelational database management system (rdbms) i
Relational database management system (rdbms) i
Β 
RDBMS.ppt
RDBMS.pptRDBMS.ppt
RDBMS.ppt
Β 
3. Relational Models in DBMS
3. Relational Models in DBMS3. Relational Models in DBMS
3. Relational Models in DBMS
Β 
Relational Database Management System
Relational Database Management SystemRelational Database Management System
Relational Database Management System
Β 
Amazon SimpleDB
Amazon SimpleDBAmazon SimpleDB
Amazon SimpleDB
Β 
Historical Evolution of RDBMS
Historical Evolution of RDBMSHistorical Evolution of RDBMS
Historical Evolution of RDBMS
Β 
Procedures/functions of rdbms
Procedures/functions of rdbmsProcedures/functions of rdbms
Procedures/functions of rdbms
Β 
Difference between RDBMS & DBMS
Difference between RDBMS & DBMSDifference between RDBMS & DBMS
Difference between RDBMS & DBMS
Β 
Life and work of E.F. (Ted) Codd | Turing100@Persistent
Life and work of E.F. (Ted) Codd | Turing100@PersistentLife and work of E.F. (Ted) Codd | Turing100@Persistent
Life and work of E.F. (Ted) Codd | Turing100@Persistent
Β 
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Webscale PostgreSQL - JSONB and Horizontal Scaling StrategiesWebscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Β 
library management system in SQL
library management system in SQLlibrary management system in SQL
library management system in SQL
Β 
RDBMS
RDBMSRDBMS
RDBMS
Β 
Database : Relational Data Model
Database : Relational Data ModelDatabase : Relational Data Model
Database : Relational Data Model
Β 
Database Management Systems (DBMS)
Database Management Systems (DBMS)Database Management Systems (DBMS)
Database Management Systems (DBMS)
Β 
Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)
Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)
Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)
Β 
Corporate etiquette ppt by rahul kapoliya
Corporate etiquette ppt by rahul kapoliyaCorporate etiquette ppt by rahul kapoliya
Corporate etiquette ppt by rahul kapoliya
Β 
Dbms and rdbms ppt
Dbms and rdbms pptDbms and rdbms ppt
Dbms and rdbms ppt
Β 

RDBMS to NoSQL. An overview.

  • 2.
      • RDBMS - The origins
      • Concepts, Architecture and Principles
      • Golden Age - Way of life
      • Changing Times - New Problems, New Needs
      • Attack on the citadel - Revisiting the norms
      • Ignited Minds - Working towards NoSQL Solutions
      • Way Ahead - It is Cloudy out there
  • 3. Girish Narasimha Raghavan
      • Over 15 years of experience building distributed, large-scale and highly available enterprise systems.
      • Current interests include building SAC (Social, Big Data Analytics, and Cloud) solutions.
      • Likes to write about and discuss technologies and their applications to real-world problems.
          • http://randomtechthought.blogspot.com
  • 4.
  • 5.
      • In the world data abounds. Always has and always will.
          • Record keeping is as old as the human race.
          • Constant quest to improve storing, accessing, and analyzing records
      • The early machines had serious shortcomings.
          • Only a very limited amount of program code and data could be stored in memory.
          • Electromagnetic data storage was feasible only at an extremely high cost.
      • Storing data was an issue
          • Organizations had to store data related to Administration, Research, Operations.
          • Data stored in proprietary formats - Database Systems did not exist
          • Plagued by data integrity issues
          • Non-standard application logic for accessing stored data
  • 6.
      • First attempt: File-based systems
          • Data sets were growing and accumulating.
          • Data had to be managed at a detailed transaction level.
          • Computing systems started to be used for critical business needs.
          • Data inconsistency and redundancy.
      • Enter Database Systems
          • Attempts to standardize the processes and rules to store and access data.
          • Intention to reuse, resell and redeploy solutions across organizations (with significant customizations).
          • Attempt to proactively manage data integrity and quality.
  • 7.
      • Database systems and concepts evolve
          • Hierarchical DBMS
              • Information represented using parent/child relationships
              • Tree structure is the primary data structure.
          • Network DBMS
              • Relationships are represented in the form of a network.
              • Graph is the primary data structure.
      • Challenges galore
          • Hardware dependency - Software strongly dependent on the underlying hardware.
          • Modeling challenges - Representing data under a common structure.
          • Integration issues - Integrating across dependent packages was a nightmare.
          • Introducing new functionality and updates - Solution providers struggled to roll these out across customized software deployments.
  • 8. Father of the Relational Database model: Edgar F. Codd. A British computer scientist who made significant contributions to the theory of relational databases while working for IBM.
  • 9.
      • Landmark paper by Codd - "A Relational Model of Data for Large Shared Data Banks".
          • Independence of data from the hardware and storage implementation.
          • Automatic navigation to the data set through a high-level nonprocedural language for data access.
          • Concept of keys (primary, secondary).
          • Theoretical proposal, no practical design or implementation.
      • Codd's 12 rules for Relational Database Management Systems
          • http://cims.clayton.edu/booth/ITDB%204201/Codd%20PDF.pdf
  • 10.
  • 11. [Diagram labels: Application 1, Application 2, Application 3, Reporting Solutions, Future Applications; Database Management Systems (DBMS); Databases; Data Storage]
  • 12.
      • Data Definition
          • For describing data and the data structures for handling the data
      • Data Manipulation
          • For describing the operations associated with the data, such as storage, query and change.
      • Data Security and Integrity
          • For ensuring secure and controlled access to storage and manipulation of data.
          • For ensuring correctness, consistency and reliability of the data stored.
      • Data Recovery and Concurrency
          • For providing and enforcing recovery and concurrency controls.
      • Data Dictionary
          • For providing information about the data stored.
          • For liaising between the conceptual and physical storage.
      • Performance
          • For ensuring all the above-mentioned operations are performed efficiently and effectively
  • 13.
      • External/User - How the user accesses and sees the data [Tables, Views]
      • Conceptual/Logical - How data is organized logically [Table Spaces]
      • Physical/Internal - How data is stored internally [Data Files]
  • 14.
      • Relation (Table) - Set of tuples that have the same attributes.
      • Tuple (Row) - A tuple usually represents an object and information about that object.
      • Attribute (Column) - Represents a particular characteristic of that object
      • Domain - A domain describes the set of permitted values for a given attribute. It is the set from which the values of an attribute are drawn.
      • Constraints - Constraints make it possible to further restrict the domain of an attribute. Constraints help in binding the attribute to a set of rules.
      • Primary Key - A primary key is a (set of) attribute(s) that uniquely identifies a tuple within a relation.
      • Foreign Key - The foreign key can be used to cross-reference tables.
      • Cardinality - Expresses the number of instances of an entity to which another entity can be associated via a relation
      • Index - An index is a mechanism for providing quicker access to data. Indices can be created on any combination of attributes of a relation.
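The terms above can be tried out with a minimal sketch, assuming Python's built-in sqlite3 module and two hypothetical tables (department, employee) that are not part of the deck. It shows a primary key, a foreign key, a CHECK constraint narrowing a domain, and an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema, for illustration only.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,              -- primary key: identifies one tuple
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    salary  INTEGER CHECK (salary >= 0),      -- constraint narrowing the domain
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)  -- foreign key
);
CREATE INDEX idx_employee_dept ON employee(dept_id);          -- index for faster lookups
""")

conn.execute("INSERT INTO department VALUES (1, 'Engineering')")
conn.execute("INSERT INTO employee VALUES (10, 'Asha', 5000, 1)")


def try_insert(row):
    """Insert an employee tuple; report whether the constraints accepted it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


def employees_in(dept_name):
    """Follow the foreign key with a join to list employees of a department."""
    rows = conn.execute(
        "SELECT e.name FROM employee e "
        "JOIN department d ON d.dept_id = e.dept_id "
        "WHERE d.name = ? ORDER BY e.name",
        (dept_name,),
    )
    return [r[0] for r in rows]
```

A duplicate emp_id, a negative salary, or a dept_id with no matching department should each come back as "rejected", while a valid row is accepted.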
  • 15.
      • Based on the perception that the real world can be modeled around base objects (entities) and the relationships among them.
      • Modeling of data in a top-down fashion
          • Conceptual Model - The highest-level and least granular model; it defines the master reference data entities commonly used in the problem space.
          • Logical Model - Builds on the conceptual model by adding more granular details such as operational and transactional data entities.
          • Physical Model - Specifies relational database objects such as tables, indexes (for example unique key indexes) and constraints.
      • The models can be visualized through what are commonly known as ER diagrams.
  • 16.
      • Process for organizing the attributes and tables of a relational database to minimize redundancy and dependency.
      • Objectives (as specified by Codd)
          • To free the collection of relations from undesirable insertion, update and deletion dependencies.
          • To reduce the need for restructuring the collection of relations as new types of data are introduced, and thus increase the life span of application programs.
          • To make the relational model more informative to users.
          • To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.
      • Normal Forms (NF)
          • 1NF - The relation contains atomic values only
          • 2NF - 1NF + every non-key attribute is fully dependent on the primary key
          • 3NF - 2NF + every non-key attribute is non-transitively dependent on the primary key
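A small illustration of the update dependency that normalization removes, as a sketch in plain Python with made-up order data (the names and fields are assumptions, not from the deck). In the flat rows the city depends on the customer rather than on the order key, which is the kind of transitive dependency 3NF eliminates.

```python
# Hypothetical denormalized rows: the customer's city repeats on every order.
orders_flat = [
    {"order_id": 1, "customer": "Asha", "city": "Pune", "item": "Disk"},
    {"order_id": 2, "customer": "Asha", "city": "Pune", "item": "RAM"},
    {"order_id": 3, "customer": "Ravi", "city": "Delhi", "item": "CPU"},
]


def normalize(rows):
    """Split the flat rows into a customer relation and an order relation."""
    customers = {}  # keyed by customer: one place to store the city
    orders = []     # each order refers to its customer by key only
    for row in rows:
        customers[row["customer"]] = {"city": row["city"]}
        orders.append(
            {"order_id": row["order_id"], "customer": row["customer"], "item": row["item"]}
        )
    return customers, orders


customers, orders = normalize(orders_flat)

# One update in one place; no risk of two orders disagreeing about the city.
customers["Asha"]["city"] = "Mumbai"


def city_of_order(order_id):
    """Look up the city by following the order's reference to its customer."""
    order = next(o for o in orders if o["order_id"] == order_id)
    return customers[order["customer"]]["city"]
```

In the flat form the same change would have to touch every order row for that customer, and missing one would leave the data inconsistent.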
  • 17.
      • Properties that guarantee that database transactions are processed reliably.
      • A single logical operation (involving multiple steps) is called a transaction.
      • Properties
          • Atomicity - "All or Nothing" - If one part of the transaction fails, the entire transaction fails.
          • Consistency - Any data written to the database must be valid according to all defined rules and constraints.
          • Isolation - Even during concurrent execution, the system ends up in the same state it would reach if the transactions were executed serially.
          • Durability - Once a transaction has been committed, the results are stored permanently irrespective of errors and crashes that occur after the commit.
      • In an RDBMS, ACID properties are implemented using techniques such as locking and multi-versioning
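Atomicity and consistency can be sketched with Python's built-in sqlite3 module. The account table and the transfer helper below are hypothetical; the point is only that a failing step undoes the whole transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move money in one transaction: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was rolled back too.
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))
```

The credit is deliberately issued first, so an overdrawn transfer fails only at the second step and the rollback of the first step is visible.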
  • 18.
  • 19.
  • 20.
      • RDBMS-based solutions are generally the first choice for database storage/access needs
      • RDBMS solutions are now mature and predictable.
      • An army of skilled specialists exists for using, managing and maintaining RDBMS-based systems
      • RDBMS has spawned an ecosystem of products that makes choosing an RDBMS a no-brainer
  • 21.
      • Ensures consistent behavior
          • With the table structure as the base, an RDBMS provides a consistent mechanism for storing and accessing different data sets.
      • Removes redundancies
          • Through normal forms, redundancies in the data are removed, thereby addressing the errors that can arise from inconsistency of the data stored
      • Avoids errors
          • Ensures data integrity and quality through consistent storage, enforced constraints and relationships, and the ability to check data as it is entered
      • Facilitates easy analysis
          • With SQL-based queries as the foundation, analyzing different data sets is seamless. Given the history of RDBMS, users also have a vast repository of tools to perform analysis.
      • Ensures robust maintenance and management
          • Database administrators are provided with tools that enable them to easily maintain, test, repair and back up the databases housed in the system.
      • Is secure
          • Offers a good level of security and access control. The whole or part of the data can be securely shared across multiple users (applications) based on the privileges granted to them.
  • 22.
  • 23.
      • Rise of social networks during the early 2000s
          • World Wide Web acts as the foundation
          • Shift in communication patterns
          • Sharing of personal information and usage of the same
          • Everyone turned into a publisher
      • Increased focus on personalization
          • Recommendations, ratings, preferences and personalized interfaces
      • Big Data flood
          • More data is being generated now than was generated in all of human history until now
          • Need to store and process unstructured or semi-structured data at volumes not previously anticipated and at frequencies not previously encountered
  • 25.
      • Accessible by users across the globe
          • Geography is irrelevant
          • Facebook, Google, Yahoo, Twitter, etc. have users across the world
      • Highly networked and distributed systems
          • Systems are accessed and connected over the Internet
      • Need to be highly scalable
          • Should be able to handle additional load without redesign
          • Amazon sees a manifold increase in traffic to the site during the holiday seasons
      • Expected to be highly available
          • Systems should always be available for access and operations
          • Google will incur a huge revenue and credibility loss if the site goes down
      • Handle large data sets hitting the systems with high frequency
          • The data needs to be stored and processed very quickly
          • Number of likes and comments on Facebook has exceeded 2.7 billion per day
  • 26.
  • 27.
      • Brewer's CAP Theorem
          • You can get only two out of the following three
              • Consistency - Every node sees the same data at the same time (a different notion from the C in ACID).
              • Availability - Needs to be available for operations always
              • Partition Tolerance - Needs to work when some nodes are not accessible.
      • RDBMS were essentially designed for CA
      • Latency (response time) is an unfortunate tradeoff for consistency
      • Partition tolerance becomes essential in distributed systems
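The tradeoff can be made concrete with a toy two-replica model in Python. This is a deliberately simplified sketch, not how any real database is built: the class names and the "CP"/"AP" mode switch are assumptions made for illustration.

```python
class Replica:
    def __init__(self):
        self.data = {}


class TwoNodeStore:
    """Toy store with two replicas and a switchable network partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.nodes = [Replica(), Replica()]
        self.partitioned = False  # True: the replicas cannot reach each other

    def write(self, node, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Keep consistency by giving up availability.
                raise RuntimeError("unavailable: cannot reach the other replica")
            # AP: stay available, accept the write locally, replicas diverge.
            self.nodes[node].data[key] = value
            return
        for replica in self.nodes:  # healthy network: update both replicas
            replica.data[key] = value

    def read(self, node, key):
        return self.nodes[node].data.get(key)
```

During a partition the CP store refuses the write and both replicas still agree, while the AP store accepts it and the two replicas return different values until they are reconciled.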
  • 28.
      • Beyond a point you cannot afford to scale up storage
          • It becomes very expensive to keep scaling up.
      • Is strict consistency really so important?
          • Ensuring consistency slows the system
          • Google found that moving from a 10-result page loading in 0.4 seconds to a 30-result page loading in 0.9 seconds decreased traffic and ad revenues by 20% (Linden 2006)
      • Redundancy can be managed
          • Joins across normalized database tables are less efficient than reading directly from a data store
      • Not all data is relational
          • Fitting every kind of data under the rigid schema structure of an RDBMS is a challenge
          • Remodeling data read from an RDBMS back into its original model (say tree, graph, key value) puts significant stress on computing resources.
          • Attributes (columns) are restricted by domain to store similar data.
          • Managing semi-structured and unstructured data such as documents becomes a challenge.
  • 29.
      • CRUD (Create, Read, Update, Delete) is crude
          • Updates and deletes should never be allowed as they destroy information.
      • Logical and physical separation of concerns ignored
          • The relational model is a logical model
          • Database products implemented the relational model at the physical level as a set of B-tree files with multiple indexes.
          • Induces artificial overhead in managing the database.
      • It is over spinning disks
          • All RDBMS implementations assume that the data is coming from disks
          • Legacy of an era when memory was expensive.
          • Memory-based systems will be faster
      • Databases are big and slow
          • Fundamentally not designed for big data sets
          • Long queries get slower with more data
  • 30.
  • 31.
      • Core Tenets
          • Basically Available
              • System seems to work all the time
          • Soft State
              • It doesn't have to be consistent all the time
          • Eventual Consistency
              • Becomes consistent eventually (at some later time)
      • Significance
          • BASE is diametrically opposed to ACID.
              • ACID is pessimistic and forces consistency at the end of every operation
              • BASE is optimistic and accepts that the database consistency will be in a state of flux.
          • Availability is achieved by supporting partial failures without total system failure
              • It is OK for the system to be available for 80% of users and limit failure to 20% of the users.
          • Users should understand the implications of Eventual Consistency
              • Factors in a probability of data loss. Safety of the data is the tradeoff
              • Need to understand how eventual is Eventual
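"How eventual is eventual" can be seen in a minimal Python sketch of asynchronous replication. The store below is hypothetical: writes are acknowledged by a primary copy and reach the replica only when a replication step runs.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy primary/replica pair with a replication log applied later."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # updates acknowledged but not yet on the replica

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))  # soft state: replica is now behind

    def read_replica(self, key):
        return self.replica.get(key)  # may return stale or missing data

    def replicate(self, max_items=None):
        """Apply up to max_items pending updates; return how many were applied."""
        count = len(self.pending) if max_items is None else min(max_items, len(self.pending))
        for _ in range(count):
            key, value = self.pending.popleft()
            self.replica[key] = value
        return count
```

Between a write and the replication step that carries it, a read from the replica sees older data; once the log drains, both copies agree. If the primary were lost while the log is non-empty, those updates would be lost with it, which is the safety tradeoff noted above.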
  • 32.
      • NoSQL - Not Only SQL
          • It is not SQL and it is not Relational
      • Essential Feature Set
          • Elastic Scaling - Rely on scale out rather than scale up
          • Big Data - Handle high volume, high velocity, high variability
          • Commoditize Manageability - Reduce dependence on highly skilled DBAs and lower administration costs
          • Economics - Build over commodity hardware
          • Flexible data model - Remove data-model-based restrictions.
      • Applicability
          • Performance and real-time nature over consistency
          • High scalability
          • Store and retrieve large data sets
          • Does not require a relational model
  • 33.
      • Key Value
          • The idea is to use a hash table where there is a unique key and a pointer to a particular item of data. Simplest to implement.
          • It is inefficient when you are only interested in querying or updating part of a value
      • Column Store
          • Created to store and process very large amounts of data distributed over many machines
          • There are still keys, but they point to multiple columns.
          • The columns are arranged by column family.
      • Document
          • The model is basically versioned documents that are collections of other key-value collections.
          • The semi-structured documents are stored in formats like JSON.
          • Nested values can be associated with each key
          • Document databases support querying more efficiently.
      • Graph
          • A flexible graph model is used which, again, can scale across multiple machines
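The four models can be compared side by side with plain Python structures. This is only a sketch of the shapes of the data, using made-up user records and helper names; real products add persistence, distribution and query engines on top.

```python
import json

# Key-value: a unique key pointing at an opaque value.
kv_store = {"user:42": json.dumps({"name": "Asha", "city": "Pune"})}


def kv_update_city(store, key, city):
    """Changing one field means reading, decoding and rewriting the whole value."""
    value = json.loads(store[key])
    value["city"] = city
    store[key] = json.dumps(value)


# Column store: row key -> column family -> columns.
column_store = {
    "user:42": {"profile": {"name": "Asha", "city": "Pune"}, "stats": {"logins": "17"}},
}

# Document: nested, semi-structured values that the store can look inside.
document_store = {
    "user:42": {"name": "Asha", "address": {"city": "Pune"}, "tags": ["admin"]},
    "user:7": {"name": "Ravi", "address": {"city": "Delhi"}},
}


def find_documents(store, path, expected):
    """Return keys of documents whose nested field at `path` equals `expected`."""
    matches = []
    for key, doc in store.items():
        value = doc
        for part in path:
            value = value.get(part) if isinstance(value, dict) else None
        if value == expected:
            matches.append(key)
    return matches


# Graph: nodes plus labeled edges between them.
graph = {
    "nodes": {"user:42": {"name": "Asha"}, "user:7": {"name": "Ravi"}},
    "edges": [("user:42", "FOLLOWS", "user:7")],
}


def neighbours(g, node, label):
    """Follow outgoing edges with the given label from a node."""
    return [dst for src, lbl, dst in g["edges"] if src == node and lbl == label]
```

The key-value update shows why partial updates are awkward there, while the document lookup shows what "querying more efficiently" can mean when the store understands the structure of the value.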
  • 34.
      • Access Interfaces: Language Specific API, REST/HTTP, Thrift, Map Reduce
      • Logical Data Model: Key Value, Column Family Store, Document, Graph
      • Support and Distribution: Multi Data Center Support, Dynamic Provisioning, CAP Support, Proactive Monitoring
      • Data Persistence: Memory Based, Disk Based, Combination of Memory and Disk
  • 35. NoSQL
      • Key Value: MemCached, Redis, SimpleDB, Tokyo Cabinet, Dynamo, Voldemort
      • Column Store: SimpleDB, BigTable, HBase, Cassandra, HyperTable, Azure TS
      • Document: CouchDB, MongoDB, Lotus Domino, Riak
      • Graph: Neo4J, InfoGrid, FlockDB, InfiniteGraph
  • 36.
  • 37.
      • It is not mature
          • RDBMS is mature, stable and functionally rich.
          • Most NoSQL alternatives are in pre-production versions with many key features yet to be implemented.
      • Support
          • Most NoSQL systems are open source projects.
          • Support is mostly offered by startup companies, with reach and credibility not on par with RDBMS vendors.
      • Analytics
          • NoSQL databases offer few facilities for ad-hoc query and analysis.
          • Even a simple query requires significant programming expertise.
          • At present, commonly used BI tools do not provide credible connectivity to NoSQL.
      • Administration and Maintenance
          • The desired goal of zero maintenance is far away.
          • In reality, significant effort is required to maintain the systems.
      • Expertise
          • Currently very limited awareness and knowledge
  • 38.
      • Scalability
          • Master Slave - One master, many slaves
              • Write to the master; read from any of the slaves
          • Partitioning - Group and localize related functions across nodes
              • Partition vertically (by functions) or horizontally (by keys)
          • Caching - Memory-based cache in front of the database
              • Address scaling issues due to read and write loads
      • High Availability
          • Clustering - Group of systems responsible for a service
              • Build redundancy into a cluster to eliminate single points of failure
          • Mirroring and Replication - Maintain a hot standby
              • Handle planned or unplanned downtimes
          • Recovery Solutions - Dependable data backup, restore, and recovery procedures
              • Combine process with tools
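Horizontal partitioning by key and a cache in front of the store can be sketched in a few lines of Python. The class names are hypothetical, and the in-memory dictionaries stand in for separate database nodes.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard with a stable hash (same key, same shard, every run)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Horizontal partitioning: each dict stands in for one database node."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


class ReadThroughCache:
    """Memory cache in front of a backend; only misses reach the backend."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}
        self.backend_reads = 0

    def get(self, key):
        if key not in self.cache:
            self.backend_reads += 1
            self.cache[key] = self.backend.get(key)
        return self.cache[key]
```

A simple modulo scheme like this reshuffles most keys when the number of shards changes, and the cache has no invalidation, so a real deployment would need to address both.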
  • 39.
      • Performance
          • Be open to denormalization - And accelerate reads
              • Allow redundancy and duplicates to reduce joins
          • Optimize your costly queries - Analyze and optimize the expensive queries
              • Use a mix of design strategy, indices, and analysis from query optimization tools
          • Invest in better hardware - Storage and memory
              • It is not a bad bet - Storage and memory costs have dropped significantly
      • Rigid Schemas - Not all data is relational
          • Even the most schema-less model has some schema
              • The world revolves around structures
          • If a key-value kind of store is needed, you can do the same in any RDBMS
              • The RDBMS provides the added advantage of structured access and queries
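The last point, a key-value store inside an RDBMS, can be sketched with Python's built-in sqlite3 module. The kv table and the helper functions are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)")


def put(key, value):
    """Insert the pair, or overwrite the value if the key already exists."""
    with conn:
        conn.execute("INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value))


def get(key):
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None


def keys_with_prefix(prefix):
    """Structured access for free: a query a bare hash table cannot answer directly.

    Assumes the prefix contains no LIKE wildcards (% or _).
    """
    rows = conn.execute(
        "SELECT key FROM kv WHERE key LIKE ? ORDER BY key", (prefix + "%",)
    )
    return [r[0] for r in rows]
```

The put/get pair behaves like a hash table, while the prefix query shows the structured access that comes along with the relational engine.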
  • 40.
      • Systems will eventually gravitate towards one of these three
          • Fast, agile, highly scalable data stores
          • Handlers of complex transactional semantics
          • Analytical processors and facilitators
      • The world is never binary
          • It is never either this or that.
          • Why fight over technicalities
      • Drive decisions based on use cases
          • Choose a model based on the use cases and scenarios
          • Research and understand what your application needs
          • Stay away from substituting "Hard work" with "Rhetoric"
          • Be open to experimentation
  • 41.
  • 42.
      • http://www.guug.de/lokal/muenchen/2007-05-14/rdbmsc.pdf
      • http://ansonalex.com/infographics/twitter-usage-statistics-2012-infographic/
      • http://www.mountainman.com.au/software/history/it1.html
      • http://www.slideshare.net/renguzi/codd
      • http://cims.clayton.edu/booth/ITDB%204201/Codd%20PDF.pdf
      • http://www.scribd.com/doc/19381895/RDBMS-Concepts
      • http://www.gitta.info/DBSysConcept/en/text/DBSysConcept.pdf
      • http://en.wikipedia.org/wiki/Relational_database
      • http://en.wikipedia.org/wiki/ACID
      • http://blogs.hbr.org/now-new-next/2009/05/the-social-data-revolution.html
      • http://www.go-gulf.com/blog/60-seconds
      • http://en.wikipedia.org/wiki/CAP_theorem
      • http://highscalability.com/drop-acid-and-think-about-data
      • http://queue.acm.org/detail.cfm?id=1394128
      • http://www.bailis.org/blog/safety-and-liveness-eventual-consistency-is-not-safe/
      • http://www.techrepublic.com/blog/10things/10-things-you-should-know-about-nosql-databases/1772
      • http://rebelic.nl/engineering/the-four-categories-of-nosql-databases/
      • http://www.slideshare.net/ksankar/nosql-4559402
      • http://www.thevirtualcircle.com/2008/11/10/6-reasons-why-relational-database-will-be-superseded/
      • http://www.slideshare.net/sbtourist/scale-your-database-and-be-happy
      • Note: Many images used in the deck are the result of a Google image search. Although I have not been able to credit the sources of all the images individually, I extend my sincere thanks to the owners of the images for making them available on the net.