4. The Big Insights Engine
One-click, machine-learning-enabled insights
– Interoperability with data sources
– Ability to process varied data types
– Ability to rapidly perform statistical analysis and choose a winner
– Automatic data visualizations of key drivers
– Identify anomalies and trends
– Ability to feed the data out to other systems which can act on triggers
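As a rough illustration of the "identify anomalies" bullet, the sketch below flags values that sit far from the mean of a series. The z-score rule, the `find_anomalies` name, the threshold of 2.0, and the sample sales figures are all assumptions made for illustration; the slides do not say which method the engine uses.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold.

    Hypothetical helper: a plain z-score test, not the engine's actual method.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up daily sales figures with one obvious spike.
daily_sales = [100, 102, 98, 101, 99, 103, 250, 100]
anomalies = find_anomalies(daily_sales)
print(anomalies)
```

A flagged point like this is the kind of trigger that could then be fed out to another system to act on.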
6. Big Data: Industry Opportunity
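[Figure: chart of the big data opportunity by industry. Axis labels: Prod. Improvement (High), Maturity (High). Industries shown: Retail, Financial Services, Utilities, Telecomm, Healthcare, Government & Education, Manufacturing. Dollar figures shown: $300B, $160B, $200B, $80B, $100B, $100B, $50B.]
Source: Big Data: The Next Frontier, McKinsey, 2011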
9. SQL vs NoSQL Databases
Traditional Databases
• Difficult to scale
• Transaction overhead
– Inefficient joins
– ACID causes latency
• Not optimal at handling diverse data types
• Easy integration with existing BI tools

NoSQL Databases
• Easy to scale using cheap hardware
• Distributed parallel processing of jobs
• Schema independent, enabling semi-structured data storage
• Batch oriented – not ideal for real-time analytics
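The schema contrast on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: SQLite stands in for a traditional database, a list of JSON strings stands in for a schema-independent document store (it is not the API of any real NoSQL product), and the `customers` table and its fields are made up.

```python
import json
import sqlite3

# Relational side: the schema is fixed up front and writes run in transactions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
with conn:  # commits on success, rolls back on error
    conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Ada"))

# A record with an extra field is rejected because the column does not exist.
try:
    with conn:
        conn.execute(
            "INSERT INTO customers (id, name, twitter) VALUES (?, ?, ?)",
            (2, "Bob", "@bob"),
        )
    relational_accepted_extra_field = True
except sqlite3.OperationalError:
    relational_accepted_extra_field = False

row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

# Document side (stand-in): each record carries its own structure.
documents = [
    json.dumps({"id": 1, "name": "Ada"}),
    json.dumps({"id": 2, "name": "Bob", "twitter": "@bob"}),
]
fields_seen = sorted({key for doc in documents for key in json.loads(doc)})

print(relational_accepted_extra_field, row_count, fields_seen)
```

The relational table would need a schema change before it could hold the second record, while the document list accepts it as is; the cost is that nothing checks the documents for consistency.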
Editor's notes
The amount of data transmitted, created, etc. is 3 times the amount of data stored.
Real-time execution of a model is very prevalent, but real-time statistical analysis to come up with the model is still under research.
Any given organization employs a number of different data sources, ranging anywhere from Vertica to MySQL to Oracle to Hadoop. To get real value from this data, the system needs to talk to these data sources in real time. Yes, data warehousing projects are implemented, but those take a long time, and by the time they are ready the data has already changed.
Patterns to look for in video: take store monitoring cameras for shoplifting. $13B is lost annually to shoplifting, so the video system should be able to identify that someone has picked up an item and, instead of putting it in the basket, is putting it in a coat pocket. The system then has to be trained for false positives.
US healthcare: $300B. Manufacturing: 50 percent decrease in product development costs.
Datameer is building a one-click visualization engine that hooks up to all kinds of data; however, they don't have the ability to automate analytics yet. Palantir is to data analytics what IDEO is to innovation.