This is the supporting slide deck for a presentation given at the Irish Hadoop Users Group on 7th November 2016.
If you want further information please use the contact details on slide 26.
Summary
It used to be that we didn’t have much of a choice: you used whatever mature RDBMS your ‘tribe’ was comfortable with, be it Oracle, DB2 or SQL Server. Over the last decade that has changed completely, and you now have around 100 database-like technologies to choose from. Pretty much every possible architectural approach is now available to you — but how do you select the right one?
We're moving from an era where information is human-generated and moves at human speeds to one where sources of data like IoT and M2M will simply swamp legacy technologies, both with increased volumes of data and with much shorter timescales in which you can extract value from it. It used to be that selecting software involved checking boxes in feature matrices. But if speed is such a fundamental requirement, we're going to have to look at things differently, as every extra feature 'adds slowness'. As a consequence there is no one 'silver bullet' solution that can be deployed everywhere — technologies such as Kafka, Apache Spark, Storm and even VoltDB are all very good in specific scenarios, but cannot replace a legacy DB.
In this presentation David will explain how to go about categorizing and understanding the new database and persistence technologies that now exist.