Read this presentation deck by big data expert and Infochimps CEO, Jim Kaskade. During the presentation, he explains where to start and how to effectively manage, protect and leverage the growing amounts of data in your enterprise. See the video presentation here: http://bigdata.infochimps.com/making-sense-of-big-data
2. #MakeSenseBD
“Information is powerful.
But it is how we use it that will define us.”
10/15/2012 Infochimps Confidential 2
3. #MakeSenseBD
First
What is Big Data?
“data sets so large and complex that it becomes
difficult to process using on-hand database
management tools.”
5. #MakeSenseBD
It’s All About The Data
DIGITAL CONTENT
OPERATIONAL DATA
WEB LOGS
SOCIAL MEDIA
FILES
SMART GRIDS
TRANSACTIONAL DATA
AD IMPRESSIONS
R&D DATA
6. #MakeSenseBD
Problem
“Little Data For Business Users”
8. #MakeSenseBD
Problem
One Size Does Not Fit All
[Figure: database landscape arranged along two axes, Non-Relational vs. Relational and Analytic vs. Operational, with groupings labeled NoSQL (Key Value, Document, Big Tables, Graph), NewSQL and ‘Data as a Service’.]
Products named in the figure: Teradata, IBM InfoSphere, Aster, Netezza, HP Vertica, Infobright, Hadoop, Hadapt, ParAccel, Horton, Calpont, EMC, SAP Hana, Oracle, Cloudera, VectorWise, Greenplum, SAP Sybase IQ, Times-Ten, MapR, Zettaset, Spark, IBM DB2, SQLSrvr, JustOneDB, InterSystems, Progress, MarkLogic, MySQL, Ingress, PostgreSQL, Objectivity, McObject, Lotus Notes, Sybase ASE, EnterpriseDB, Versant, CouchDB, MongoDB, HandlerSocket, Amazon RDS, Couchbase, RavenDB, Akiban, Cloudant, SQL Azure, App Engine, MySQL Cluster, Database.com, SimpleDB, Clustrix, Xeround, FathomDB, Drizzle, Riak, GenieDB, Redis, Cassandra, SchoonerSQL, ScaleBase, ScalArc, Membrain, Tokutek, NimbusDB, FlockDB, CodeFutures, Voldemort, HyperTable, InfiniteGraph, Continuent, VoltDB, BerkeleyDB, HBase, Neo4j, Translattice, AllegroGraph.
9. #MakeSenseBD
Problem
Complexity of A New Data Architecture
[Figure: architecture diagram spanning structured and semi-structured data.]
Data sources shown: Online Click Data, Online BI Data, CRM Data, Operational POS Data, Cust Srvc Call Logs, Social Data.
Platform components shown: IT (ETL), Teradata Data Warehouse, Data Mart, virtual data marts (Virt DM x3), Real-Time Data Streaming, Hadoop, NoSQL Warehouse, In-Memory Analytics Platform, Sandbox (x4).
Delivery components shown: SQL BI Server, BI App Server, BI Server, Customer Application.
Users shown: Departmental BI User (reports), Analytics User, Bus User (Reports).
15. #MakeSenseBD
Next
Hadoop + NoSQL technologies =
the ability to process large and complex
data sets without the challenges
associated with legacy, and at a fraction
of the price.
17. #MakeSenseBD
Big Data Warehouse
[Figure: Hadoop cluster serving analytic requests.]
Analytic functions shown: Search, Recommend, Rank, Score, Next-Best-Action.
Flow shown: Analytic Request in, Answer out.
Master: Name Node, Job Tracker.
Ethernet Interconnect.
Slaves (x3, with more implied): Task Tracker, Data Node.
Input: Semi-Structured Data.
18. #MakeSenseBD
[Figure: quadrant chart. Vertical axis: Batch to Real Time. Horizontal axis: Large Enterprise to Small Enterprise.]
Labels shown: Traditional Operational Application Ecosystem; Deployment in Public/Private Cloud; Analytic Appliances; Toolset Integration; Traditional Decision Support; Hardened.
20. #MakeSenseBD
[Figure: data sources feeding data stores and analytics.]
Sources shown: Images, Docs, Text, Web Logs, Social, Sensors, GPS.
Business Transactions & Interactions: Web, Mobile, CRM, ERP, SCM…
Data stores shown: SQL, NoSQL, NewSQL, EDW, MPP.
Business Intelligence & Analytics: Dashboards, Reports, Visualization…
21. #MakeSenseBD
Use Case
Hedge Fund
How do I predict whether companies will
make their quarterly earnings forecast?
24. #MakeSenseBD
[Figure: inputs combined into a Quarterly Revenue Prediction.]
Inputs shown: Cars In Lot, News Text, Web Pricing, Social Sentiment, Weather Sensors, Local Employment.
25. #MakeSenseBD
Use Case
Media Company
How do I merge my traditional media
sources with new media sources to
provide improved and instant insights to
my customers?
26. #MakeSenseBD
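[Figure: media listening architecture combining new and traditional media sources.]
New Media sources: Gnip Powertrack, Gnip EDC, Moreover Metabase.
Traditional Media sources: TV Transcription, Radio Transcription, Print Transcription.
Components shown: In-Motion Data Delivery Service, NoSQL, APIs, Sentiment, Listening Application.
Users shown: Data Scientist, App Developer, Business Users, IT Staff.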
27. #MakeSenseBD
Use Case
Retail Company
How do I increase online revenue?
28. #MakeSenseBD
[Figure: mock-up of a dynamically populated, personalized email.]
Inputs labeled: Current Approved Offers; Known & Unknown Customers & Online/Offline Behavior; Existing & Approved Product Content.
Sample offer text in the mock-up: Family, Kids, Baby, Denim, Khakis, Hoodies, Threadless, Color, Hue, Weekend, Sunday, Spring, Welcome, Exclusive, Million $ Q, with discounts of 10%, 15%, 25%, 30%, 40% and 60%.
29. #MakeSenseBD
[Figure: retail personalization pipeline.]
Inputs shown: Current Campaign Offers, Online Click Data, Online BI Data, Past CRM Data, POS Data, Cust Srvc Call Logs, Social Data, Product Content.
Processing shown: Traditional Analytics, Hadoop Cluster, Graph Analytics, Data Model.
Outputs shown: Targeted Offers & Products, Personalized Email Campaign, Measure Performance.
32. #MakeSenseBD
I’m Ready
So How Do I Start?
…without spending a *$#&-load of
money before proving ROI?
33. #MakeSenseBD
Deployment Options
On-Premise
Public Cloud
Provider Trusted
Data Center Provider
34. #MakeSenseBD
[Figure: deployment options compared on $Cost, Security Risk and Time To Market.]
You Manage:
• Private Big Data Cloud (You Manage)
• Virtual Private Big Data Cloud (You Manage)
• Public Big Data Cloud (You Manage)
Someone Else Manages:
• Virtual Private Big Data Cloud (Managed Service)
• Public Big Data Cloud (Managed Service)
36. #MakeSenseBD
Infochimps
Enterprise Customers
• Managed Big Data Services
• Elastic & Secure Private &
Public Clouds
• Across a Global Network of App
BI
Analytics Sys
BI
Trusted Data Center Data
Lang
Data Intelligence Data
Delivery
Delivery Network
Providers Hadoop NoSQL
Infra
• With Batch & Real-time Delivery
Analytic Framework Global Network Of
• Supporting Structured & Data Center Infrastructure Providers
Unstructured Data
37. #MakeSenseBD
Data
Intelligence Network
Cloud-based
Data PaaS
Virtual Private & Public Cloud
Data Tier4 Lights Out Data Centers
Marketplace OpenStack & VSphere
Managed Services
Big Data PaaS
Public Cloud
15,000 Data Sets Amazon & Rackspace
Managed Services
38. #MakeSenseBD
Elastic Big Data PaaS
Deployment From Laptop to Cloud (Public & Private) Amazon, Rackspace, OpenStack & VSphere
Ironfan
39. #MakeSenseBD
Big Data Managed Service Offerings
Community:
• Access to Infochimps Big Data Platform via open source
• Deploy Anywhere
• Try It Under Your Control
Public Cloud:
• Pre-integrated, pre-tested Big Data stack
• Quickly deploy in Amazon Cloud, Rackspace Cloud
• Fully Managed Service
Virtual Private Cloud:
• Pre-integrated, pre-tested Big Data stack
• Deployed in a trusted lights-out data center network
• High SLA Managed Service
Private Cloud:
• Pre-integrated, pre-tested Big Data stack
• Deployed in your Data Center - Open Stack or Vsphere
• Customized Managed Service
41. #MakeSenseBD
#1 Big Data Platform For The Cloud
#MakeSenseBD
www.infochimps.com/demo
1-855-DATA-FUN (1-855-328-2386)
Editor's notes
Title slide: "Making Sense of Big Data" (I like the elephant on the motorcycle as the background image here, along with the descriptor, "We provide a suite of big data services in the cloud, used by enterprise customers who want to quickly unlock the value of their data").
Slide 1: "Information is Powerful, but it's how we use it..." Set the stage: we are here today to learn how to leverage Big Data to derive value and achieve insights.
Slide 2: "What is Big Data?" The message here is to start at the beginning and define it for those in the audience who might be unclear (we know there are many people who are). Use the first slide from your CloudCon deck here.
Slide 3: State of the world. Data is increasing exponentially and will only continue to do so, and therefore requires infrastructure and management in order to provide useful insights. Use slide 2 from your CloudCon deck; it has a nice image of volume (which is one of the tenets of big data).
Slide 4: Why is this occurring? Here the message is new types of data, batch vs. real-time. Everyone is "listening" now and measuring more activity, actions and conversations than ever before. Use the CloudCon slide that builds vertically from batch to real time and horizontally from large enterprise to small enterprise.
Slide 5: Problem: "Little Data for Business Users" slide from CloudCon. The message here is that, due to the influx and types of data, the actual users are too far removed from it and therefore blind to how to draw insights from it. Walk through the build, as it explains really well how data moves throughout an organization and where the roadblocks are for getting insight to execs to act upon.
Slide 6: Use the #thisreallysucks slide here to drive home the current state of being.
Slide 7: "Big Data for Business Users" slide. This is the end state of being for executives looking to use data to improve operational efficiency and competitive advantage.
Slide 8: Use the build slide here to show how we bring the data to the app developer and therefore reduce the friction for executives.
Slide 9: Use the #thisisreallygood slide here to enhance the point that this is the way data and info should flow.
Slide 10: How do you achieve this state?
Slide 11: Introducing Infochimps. Use the "Good to Great" (#2) slide from your 451 deck to give a brief overview of who we are. We cut our teeth on big data, having built the largest data marketplace, where we leveraged the latest technology (Hadoop, etc.) to manage big data. We realized that others must be running into the same issues as IC and decided to externalize our platform to help companies implement their own Big Data infrastructure.
Slide 12: Big Data Cloud Platform (the solution to the Big Data problem). Use slide #7 from the 451 deck. Walk through the platform and the components, allowing attendees to see that we offer an end-to-end, cloud-based solution. Call out the value of our 4 pillars here: Fast, Simple, Flexible, Enterprise-ready.
Slide 13: Deployment options slide. This is where we talk about IC being offered as a managed service and the value it affords. NOTE: Not sure if we want to communicate the Data Intelligence Network since we have not publicly or formally announced it.
Slide 14: IC in action: Infomart use case (challenge, IC solution, result).
Slide 15: One more use case if time (Koupon)?
Slide 16: Close: Infochimps, the #1 Big Data Platform for the Cloud. Include sales contact number at bottom of the slide along with web address.
Avinash Kaushik gave a talk at Strata 2012 in Santa Clara in March, and quoted a Kenyan farmer. If you listen to all the hype of Big Data, it solves for the first problem. If you listen to all the vendors, there is a lot of emphasis on the first part (perhaps Infochimps included), and very little on the second. I think that’s because we don’t exactly know how to truly empower the organization to interact directly with any/all data available. It’s too expensive, risky, complex.
40%+ YoY growth, with 2012 alone generating 2.4 zettabytes.
http://jameskaskade.com/?p=2040
http://www.emc.com/collateral/demos/microsites/emc-digital-universe-2011/index.htm
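To make the growth figure concrete, here is a minimal sketch of what compounding at 40% per year from a 2.4 ZB base would look like. The constant 40% rate and the five-year horizon are assumptions chosen for illustration (the note only says "40%+"), and `project_volume` is a hypothetical helper, not something from the deck.

```python
def project_volume(base_zb: float, annual_growth: float, years: int) -> list[float]:
    """Return yearly volumes, compounding annual_growth on base_zb each year."""
    return [base_zb * (1 + annual_growth) ** n for n in range(years + 1)]


# Assumption: growth stays at exactly 40% per year (the note says "40%+").
volumes = project_volume(base_zb=2.4, annual_growth=0.40, years=5)

for offset, zb in enumerate(volumes):
    print(f"{2012 + offset}: {zb:.1f} ZB")
```

Under these assumptions the yearly volume roughly doubles every two years, which is the point of the slide: the infrastructure has to scale with it.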
Discussions with O’Reilly Media, Teradata, Aster Data, Yahoo!, eBay, and Facebook. The issue is not just the fact that unstructured data is exploding, but the number of sources and types of data as well, all fed from the explosion of devices used by people to interact with each other, products, and services.
We have a problem today WITH our data infrastructure: our ability to glean insights. I think all of you know what I’m referring to. It’s the fact that we’re operating on less than 15% of the corporate data available to us, even with the ENTERPRISE DATA WAREHOUSE, the EDW, which is supposedly storing a COMPLETE, SINGLE VIEW OF THE TRUTH. We’re still giving our business users a tiny bit, a little bit of data.
http://blogs.the451group.com/opensource/2011/04/15/nosql-newsql-and-beyond-the-answer-to-sprained-relational-databases/
• NoSQL databases designed to meet scalability requirements of distributed architectures and/or schema-less data management requirements, including big tables, key value stores, document database and graph databases
• NewSQL databases designed to meet scalability requirements of distributed architectures or to improve performance such that horizontal scalability is no longer a necessity, including new MySQL storage engines, transparent sharding technologies, software and hardware appliances, and completely new databases
• Data grid/cache products designed to store data in memory to increase application and database performance, covering a spectrum of data management capabilities from non-persistent data caching to persistent caching, replication, and distributed data and compute grid functionality
http://en.wikipedia.org/wiki/Database
The first generation of database systems were navigational;[2] applications typically accessed data by following pointers from one record to another.
The two main data models at this time were the hierarchical model, epitomized by IBM's IMS system, and the Codasyl model (Network model), implemented in a number of products such as IDMS.
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
New SQL: the “new relational databases” that retain SQL & ACID compliance
• Scalable with distributed architectures, or
• Performance improved such that horizontal scalability is no longer a necessity
SchoonerSQL: http://www.schoonerinfotech.com/
Tokutek: http://www.tokutek.com/
Continuent: http://www.continuent.com/
Translattice: http://www.translattice.com/
ScaleBase: http://www.scalebase.com/
CodeFutures: http://www.codefutures.com/database-products/
VoltDB: http://voltdb.com/
HandlerSocket: https://github.com/ahiguti/HandlerSocket-Plugin-for-MySQL
Akiban: http://www.akiban.com/
MySQL Cluster: http://www.mysql.com/products/cluster/
Clustrix: http://www.clustrix.com/
Drizzle: http://www.drizzle.org/
GenieDB: http://www.geniedb.com/
ScalArc: http://scalarc.com/
NimbusDB: http://nimbusdb.com/NimbusDB/NimbusDb.html
http://www.nrf-arts.org/content/unifiedpos
Step 1: Integrate into CRM (email)
Step 2: Integrate into Web
Step 3: Integrate into POS (UnifiedPOS)
The Business User
Being the CEO of Infochimps, I felt compelled to share a little “chimpy” research with you. The “Infinite Monkey Theorem” is a METAPHOR that directly relates to Big Data, that I think you’ll appreciate. So what is the “Infinite Monkey Theorem”? The following definition is a variant of the original theorem; let me read it to you. This theorem has been traced back to Aristotle's “On Generation and Corruption”, where he makes deductions about the unexperienced and unobservable based on real experiences and real observations.
AMP: Access Module Processor
PE: Parsing Engine
BYNET: Banyan Cross-bar Switch YNET (Y Network)
Store:
• The Parsing Engine dispatches a request to insert a row.
• The BYNET ensures that the row gets to the appropriate AMP via the hashing algorithm.
• The AMP stores the row on its associated disk.
• Each AMP can have multiple physical disks associated with it.
Retrieve:
• The Parsing Engine dispatches a request to retrieve one or more rows.
• The BYNET ensures that the appropriate AMP(s) are activated.
• The AMPs locate and retrieve the desired rows in parallel, and will sort, aggregate or format if needed.
• The BYNET returns the retrieved rows to the Parsing Engine.
• The Parsing Engine returns the row(s) to the requesting client application.
Teradata’s shared-nothing architecture allows for highly scalable data volumes.
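The store and retrieve steps above can be sketched as a toy hash-distributed store. This is an illustrative simplification, not Teradata's actual hashing scheme: `amp_for_key`, `ToyCluster`, the use of MD5 and the modulo step are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def amp_for_key(primary_index_value: str, num_amps: int) -> int:
    """Map a row's key to an AMP number with a stable hash (illustrative only)."""
    digest = hashlib.md5(primary_index_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


class ToyCluster:
    """Hypothetical stand-in for PE + BYNET + AMPs; each AMP owns its own rows."""

    def __init__(self, num_amps: int) -> None:
        self.num_amps = num_amps
        self.amps = defaultdict(dict)  # AMP number -> {key: row}

    def insert(self, key: str, row: dict) -> int:
        # "BYNET" step: route the row to the AMP chosen by the hash.
        amp = amp_for_key(key, self.num_amps)
        self.amps[amp][key] = row
        return amp

    def retrieve(self, key: str):
        # The same hash finds the owning AMP, so only that AMP is consulted.
        amp = amp_for_key(key, self.num_amps)
        return self.amps[amp].get(key)


cluster = ToyCluster(num_amps=4)
owner = cluster.insert("cust-1", {"name": "Example Customer"})
print(owner, cluster.retrieve("cust-1"))
```

Because each key hashes to exactly one AMP and no AMP shares storage with another, adding AMPs spreads both data and work, which is the shared-nothing property the note refers to.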
3-node Hadoop system:
$8K/node
$10K switch
$4K/node Hadoop distro
$24K + $10K x 25% x 3 maintenance = $43K
$4K x 3 x 3 = $36K
Total =

There are three essential elements of an analytic platform:
• Strong support for analytic database query. A variety of query styles: at a minimum, SQL, MDX or graph.
• Strong support for analytic processes other than queries. Typically these would be in the areas of mathematics (statistics, predictive analytics, data mining, linear algebra, optimization, graph theory, etc.) and/or data transformation (e.g. sessionization, entity extraction).
• Strong integration between the first two.
The point is that an analytic platform is something on which you can build a range of powerful analytic applications. Some specifics of what to look for in an analytic platform may be found in the link above.
http://www.dbms2.com/2011/02/24/analytic-platforms/
http://www.dbms2.com/2011/01/18/architectural-options-for-analytic-database-management-systems/

Enterprise data warehouse (full or partial)
• Kinds of data likely to be included: All, but especially operational
• Likely use styles: All
• Canonical example: Central EDW for a big enterprise
• Stresses: Concurrency, reliability, workload management
• Classical EDWs are Teradata, DB2, Exadata, and maybe Microsoft SQL Server

Traditional data mart
• Kinds of data likely to be included: All
• Likely use styles: Business intelligence, budgeting/consolidation, investigative
• Examples: Reporting servers, planning/consolidation servers, anything MOLAP, etc.
• Stresses: Performance, concurrency, TCO
• Columnar DBMS might have more attractive performance and TCO (Total Cost of Ownership); the same goes for Netezza. Some of them, e.g. Sybase IQ and Vertica, have excellent track records in concurrent usage as well.

Investigative data mart, agile
• Kinds of data likely to be included: All, especially customer-centric
• Likely use styles: Investigative
• Canonical example: A few analysts getting a few TB to examine
• Stresses: Ease of setup/load, ease of admin, price/performance
• Infobright is often cost-effective among columnar analytic DBMS.

Investigative data mart, big
• Kinds of data likely to be included: All, especially customer-centric, logs, financial trade, scientific
• Likely use styles: Investigative
• Canonical example: Single-subject 20 TB – 20 PB relational database
• Stresses: Performance, scale-out, analytic functionality
• Performance and scalability are major challenges, usually best addressed by MPP (Massively Parallel Processing) systems, such as Netezza, Vertica, Aster Data, ParAccel, Teradata, or Greenplum.

Bit bucket (Hadoop)
• Kinds of data likely to be included: Logs, other technical/external
• Likely use styles: Staging/ETL, investigative
• Canonical example: Log files in a Hadoop cluster
• Stresses: TCO, scale-out, transform/big-query performance, ETL functionality

Archival data store
• Kinds of data likely to be included: Operational, CDR (call detail record), security log
• Likely use styles: Archival, reporting (for compliance), possibly also investigative
• Examples: Any long-term detailed historical store
• Stresses: TCO, compression, scale-out, performance (if multi-use)
• Perhaps only Rainstor truly embraces the archival positioning

Outsourced data mart
• Kinds of data likely to be included: All
• Likely use styles: Traditional BI, investigative analytics, staging/ETL
• Examples: Advertising tracking, SaaS CRM
• Stresses: Performance, TCO, reliability, concurrency
• Oracle shops = Vertica gets the nod in a number of these cases

Operational analytic(s) server
• Kinds of data likely to be included: Customer-centric, log, financial trade
• Likely use styles: Advanced operational analytics
• Examples, lower latency: Web or call-center personalization, anti-fraud
• Examples, higher latency: Customer profiling, Basel 3 risk analysis
• Stresses: Performance, reliability, analytic functionality, perhaps concurrency
http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-1/
The way this is performed is by taking data sources like images and storing them in Hadoop, then using Big Data tools like MapReduce to perform sophisticated analysis on those aggregated data sets. Why is this concept so disruptive? Things like a fraction of the price, and no structured data model (aka no star schema), yet the ability to run sophisticated queries and algorithms against all your detailed data.
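For readers unfamiliar with MapReduce, the following is a minimal in-memory sketch of its three phases (map, shuffle, reduce) applied to web logs, one of the data types listed earlier in the deck. It runs in a single process and does not use Hadoop; the log format, the sample lines and all function names are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's values into one result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def page_mapper(log_line):
    # Assumed log format: "<user> <path>"; emit one hit per requested path.
    _user, path = log_line.split()
    yield path, 1


def count_reducer(_key, values):
    return sum(values)


# Hypothetical sample data, invented for this example.
logs = ["alice /home", "bob /pricing", "alice /pricing", "carol /home", "bob /home"]
hits = reduce_phase(shuffle(map_phase(logs, page_mapper)), count_reducer)
print(hits)
```

Note that no schema is declared up front: the mapper interprets each raw line at read time, which is the "no star schema" property the note highlights. In a real cluster the map and reduce calls would run in parallel across data nodes.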
The current image shows a Walmart in Wichita, Kansas. Analysts count cars in Wal-Mart parking lots to measure overall customer traffic and understand its growth versus the competition. For example, Wal-Mart's growth was determined to come mostly from areas of high unemployment. This type of analysis is being performed in Amazon’s EC2…
The current image shows a Target in the Moraine Point Plaza located in Gardiner, North
Analysts comparing satellite parking lot data with regional unemployment trends found Target's growth tended to come in areas of lower-than-average unemployment. Again, these processes are being performed in Amazon EC2. This is interesting, but how do we process the data further to help derive more relevant insights?
http://www.cnbc.com/id/38738810/Spying_For_Profits_The_Satellite_Image_Indicator
The previous examples of Walmart and Target involved using a regression algorithm which was executed against the satellite data + other data to produce a quarterly revenue prediction which BEAT all previous models.
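As a rough illustration of the idea, the sketch below fits an ordinary least-squares regression of quarterly revenue on two inputs named on the slide (cars in lot, local employment conditions). The notes do not say which regression algorithm the analysts used, so plain OLS is an assumption, and every number here is synthetic, generated from a made-up formula rather than taken from Walmart, Target or any real data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(features, targets):
    """Return [intercept, coef_1, ...] from the normal equations X'X b = X'y."""
    rows = [[1.0] + list(f) for f in features]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, feature_row):
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], feature_row))


# Synthetic features: (cars counted in lots, thousands; local unemployment, %).
features = [(1.0, 5.0), (1.2, 6.5), (0.9, 4.0), (1.5, 7.0), (1.1, 8.0)]
# Synthetic revenue ($M) from an invented formula, so the fit can be checked.
targets = [10.0 + 50.0 * cars - 2.0 * unemp for cars, unemp in features]

coeffs = fit_ols(features, targets)
forecast = predict(coeffs, (1.3, 6.0))
print([round(c, 3) for c in coeffs], round(forecast, 2))
```

Because the synthetic targets are exactly linear in the features, the fit recovers the invented coefficients; with real satellite counts the data would be noisy, and the interesting question is whether the added signals improve out-of-sample error over earlier models.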
Slide 1: Company Overview. The best way to give an overview of your company is to state concisely your core value proposition: What unique benefit will you provide to what set of customers to address what particular need? Then you can add three or four additional dot points to clarify your target markets, your unique technology/solution, and your status (launch date, current customers, revenue rate, pipeline, funding needed). Key objective: Flesh out the foundation you established at the beginning. At this point, no one should have any question about what it is that your company does, or plans to do. The only questions that should remain are the details of how you are going to do it. Another key objective you should have achieved by this point in your presentation is to make sure that if there are some compelling brand names associated with your company (customers, partners, investors, advisors), your audience knows about them. Feel free to drop names early and often, starting with your first email introduction to the investor. Brand name relationships build your credibility, but do not overstate them if they are tenuous.

Use cases:

Runa
Automated real-time online offers: monitors and analyzes shopper behavior on the web, and then makes each shopper a personalized offer.
Infochimps helps Runa configure and manage their entire production system, including Hadoop, HBase, messaging, monitoring, and more (using Ironfan – Robert Berger).

SpringSense
Intelligent enterprise document search.
SpringSense uses Infochimps to scale its award-winning technology to process the full Wikipedia corpus, over 4 million articles, for rapid meaning-based search (using Ironfan).

Black Locus
Competitive pricing analytics platform for enterprises.
Ingesting millions of product pricing data points from the web, analyzing historical and current data, presenting analytic results in real-time.

Koupon Media
Mobile coupon platform.
For every user who enters into the mobile coupon system, more demographic information is needed to help target the right coupon to the right customer and in real-time.

BlueCava
Behavioral target marketing platform: joins customers across any/all devices and augments with demographic / behavioral data for targeted advertising.
For every user who enters into the mobile coupon system, more demographic information is needed to help target the right coupon to the right customer and in real-time.
A new Attribution data product (using Hadoop) which determines correlations between customer purchases / conversions and advertising impressions and website behavior.

InfoMart
Largest media company in Canada, transforming its business from print to digital; focus is on engaging and better understanding their audiences.
Social media listening platform which consists of both real-time social feed search / analytics / reporting for InfoMart and their customers, plus historic analysis / trending research.
Slide 3: Solution. What specifically are you offering to whom? Software, hardware, services, a combination? Use common terms to state concretely what you have, or what you do, that solves the problem you’ve identified. Avoid acronyms and don’t try to use these precious few words to create and trademark a bunch of terms that won’t mean anything to most people, and don’t use this as an opportunity to showcase your insider status and facility with the idiomatic lingo of the industry. If you can demonstrate your solution (briefly) in a meeting, this is the place to do it.
Slide 3.1: Delivering the Solution. You might need an extra slide to show how your solution fits in the value chain or ecosystem of your target market. Do you complement commonly used technologies, or do you displace them? Do you change the way certain business processes get executed, or do you just do them the same way, but faster, better and cheaper? Do you disrupt the current value chain, or do you fit into established channels? Who exactly is the buyer, and is that person different than the user?
Slide 7: Go to Market Strategy. The single most compelling slide in any pitch is a pipeline of customers and strategic partners that have already expressed some interest in your solution, if they haven’t already joined your beta program. Too often this slide is, instead, a bland laundry list of standard sales and marketing tactics. You should focus on articulating the non-obvious, potentially disruptive elements of your strategy. Even better, frame your comments in terms of the critical hurdles you need to get over, and how you are going to jump them. If you don’t have a pipeline, and there is nothing unique or innovative about your strategy, then drop this slide and make the elements of your sales model clear in the discussion of your business model (next slide).
1. Which best describes your position in your organization?
a. Executive (VP, SVP, C-level)
b. Business User (Marketing, Product, etc.)
c. Analytics Team (Data Scientist, Analyst, etc.)
d. IT User (App Dev, DevOps, Project Manager, etc.)
e. Other
2. Do you have a current or upcoming Big Data project?
a. Yes
b. No
c. Not Sure
3. Which deployment option do you prefer?
a. Public Cloud
b. Private Cloud
c. Virtual Private Cloud
d. No Cloud
e. Not Sure