Big Data (volume) and real-time information processing (velocity) are two important aspects of Big Data systems. At first sight, these two aspects seem to be incompatible. Are traditional software architectures still the right choice? Do we need new, revolutionary architectures to tackle the requirements of Big Data.
This presentation discusses the idea of the so-called lambda architecture for Big Data, which acts on the assumption of a bisection of the data-processing: in a batch-phase a temporally bounded, large dataset is processed either through traditional ETL or MapReduce. In parallel, a real-time, online stream processing is constantly calculating the values of the new data coming in during the batch phase. The combination of the two results, batch and online processing is giving the constantly up-to-date view.