Showcasing Data Science Lab functionality
Welcome from Kognitio
www.kognitio.com
Today’s Web Seminar - Presenters, Format & Agenda
Host
Michael Hiskey, Vice President, Marketing & Business Development
Keynote Presenters
Dr. Sharon Kirkham, Data Scientist, Kognitio Analytics Center of Excellence
Agenda
• Big Data and Complexity – the need for Data Scientists
Question Break #1
• Data Manipulation – functional demonstration
Question Break #2
• Product forecasting with parallel R – practical demonstration
Question Break #3
Kognitio
Kognitio is focused on providing the premier high-performance analytical platform to power business insight around the world
• Kognitio invented the in-memory analytical platform, first taking it to market in 1989
• Privately held
• Labs in the UK - HQ in New York, NY
The Data Science Lab
• Data Scientists & Staff
• Mathematical Algorithms
• MPP Computing
• BIG DATA
What do business users want to do?
• Find patterns
• Track lifetime journeys
• Predict behavior
• Forecast scenarios
• Allocate scarce resources
• Model value
• Characterize groups
• Visualize discovery
• Respond, trigger, manage, promote
I’m a data scientist! Are you?
Entry level skills and development - aspiration
• Machine Learning
• Graduates
I’m a data scientist! Are you?
• Business Expertise
• Machine Learning
• Interpretation skills
= Insight
• Graduates
• Need guidance
• Data Scientist
Supporting the data scientist
Typical process – traditionally…
• Database
Typical process – direct data preparation
• Database, SQL processing
Typical process – produces analytical data set
• Database, SQL processing, Data Set
Typical process – run analytics from server
• Database, SQL processing, Data Set, ???
Typical process – data samples often used
• Database, SQL processing, Data Set, ???, Data Samples
• Process run iteratively = slow
Typical process – modelling process is honed
• Database, SQL processing, Data Set, ???, Data Samples
• Process run iteratively = slow
Typical process – model is complete
• Database, Data Set, ???
Typical process – score full data (Ouch!)
• Database, Data Set, ???
• Full data to score
Supporting the data scientist
Push processes to DB – still produce analytical data set
• Analytical Platform, SQL processing, Data Set
Push processes to DB – translate specific processes
• Analytical Platform, SQL processing, Data Set, ???, Translation
Push processes to DB – results passed back
• Analytical Platform, SQL processing, Data Set, ???, Translation, Result Data Set
Push processes to DB – modelling process is honed
• Analytical Platform, SQL processing, Data Set, ???, Translation, Result Data Set
Push processes to DB – model scoring done in DB
• Analytical Platform, SQL processing, Data Set, ???, Result Data Set
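The "translate specific processes" and "model scoring done in DB" steps can be illustrated with a minimal sketch. It assumes a simple linear model; the slides do not say which models or which translation tool were used. SQLite stands in for the analytical platform, and the table, column, and helper names are all hypothetical.

```python
import sqlite3


def linear_model_to_sql(intercept, coefficients):
    """Render a fitted linear model as a SQL expression (hypothetical helper).

    Column names are assumed to be trusted identifiers, not user input.
    """
    terms = [repr(float(intercept))]
    for column, weight in coefficients.items():
        terms.append(f"({float(weight)!r} * {column})")
    return " + ".join(terms)


def score_in_database(conn, table, key, intercept, coefficients):
    """Score every row inside the database instead of exporting the data."""
    expression = linear_model_to_sql(intercept, coefficients)
    query = f"SELECT {key}, {expression} AS score FROM {table} ORDER BY {key}"
    return conn.execute(query).fetchall()


def demo():
    # An in-memory SQLite table stands in for the full data set to score.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, visits REAL, spend REAL)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, 2.0, 10.0), (2, 4.0, 0.0)],
    )
    # Illustrative coefficients only; a real model would be fitted elsewhere.
    return score_in_database(
        conn, "customers", "id", 1.0, {"visits": 0.5, "spend": 0.25}
    )
```

The point of the design is that only the model's coefficients travel to the database; the full data never has to leave it to be scored.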
Supporting the data scientist
But we always want more! Complex data structure
• Analytical Platform, Data Set, ???, Result Data Set
• SQL cannot handle data complexity. How do I integrate into my model?
But we always want more! Non-standard processes
• Database, SQL processing, Data Set, ???, Data Samples
• Back where we started
Supporting the data scientist
Bring Analytics to data – still produce analytical data set
• SQL processing, SQL processing
Bring Analytics to data – can use other code for data prep
• SQL processing, Kognitio scripting
• Code executed using MPP
• Data held in memory. Fast access to CPUs
Bring Analytics to data – run analytics natively in Kognitio
• SQL processing, Kognitio scripting
• Code executed using MPP
• Data held in memory. Fast access to CPUs
• One platform, flexible working from data prep through analytical process
New! Kognitio version 8:
Enabling and extending the Analytical Platform
• External Tables
• External Functions
• Not Only SQL
• Hadoop Connector
• Other Connectors
• Kognitio Storage as an External table
General Availability: June 2013
External Scripting – Data Transformation
Examples where SQL is typically complex and extensive (e.g. using Perl):
• Assembly: converting structured data into XML format, e.g. furnishing personalised content
• Disassembly: converting XML into structured data
• Parsing: extracting complex information from URLs; pulling words from large text fields, e.g. sentiment analysis
• Transposition: converting row-based information into columns for data mining, e.g. supporting classification or segmentation
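The four transformation types on this slide can be sketched in a few lines of script. The slide mentions Perl; this sketch uses Python's standard library instead, and the function names, record layout, and sample URL are illustrative assumptions, not part of the Kognitio product.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlsplit


def assemble(record, tag="customer"):
    """Assembly: turn one structured record (a dict) into an XML string."""
    root = ET.Element(tag)
    for field, value in record.items():
        ET.SubElement(root, field).text = str(value)
    return ET.tostring(root, encoding="unicode")


def disassemble(xml_text):
    """Disassembly: turn a flat XML element back into a dict of strings."""
    return {child.tag: child.text for child in ET.fromstring(xml_text)}


def parse_url(url):
    """Parsing: pull host, path, and query parameters out of a URL."""
    parts = urlsplit(url)
    params = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return {"host": parts.netloc, "path": parts.path, "params": params}


def transpose(rows):
    """Transposition: pivot (key, attribute, value) rows into one dict per key."""
    wide = {}
    for key, attribute, value in rows:
        wide.setdefault(key, {})[attribute] = value
    return wide
```

For example, `transpose([(1, "colour", "red"), (1, "size", "M")])` collects both attributes of key `1` into a single column-style record, which is the shape a classification or segmentation step usually expects.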
Data Manipulation
Small Demo
Product Forecasting – with parallel R
Forecasting Requirements
Forecast Inputs
R running in an MPP environment
• Persistence Layer
• Analytical Platform Layer
Kognitio platform specification (this is old kit)
• 16 servers
• 462GB Kognitio RAM
• 128 cores
• 2.9 billion rows of EPOS
• 184-day time series for 12K products
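Collating EPOS rows into per-product daily time series is a group-and-sum step. Taking 12K as roughly 12,000 products, full 184-day series would come to about 184 × 12,000 = 2,208,000 daily values. The sketch below shows that step in plain Python under stated assumptions: the row layout `(product_id, sale_date, quantity)` and the function name are hypothetical, and on the platform this work would be done in parallel rather than in a single loop.

```python
from collections import defaultdict
from datetime import date


def collate_time_series(epos_rows, start, days):
    """Sum quantities per product per day over a fixed window.

    epos_rows: iterable of (product_id, sale_date, quantity) tuples.
    Returns {product_id: [quantity for day 0, day 1, ...]}.
    """
    series = defaultdict(lambda: [0] * days)
    for product_id, sale_date, quantity in epos_rows:
        offset = (sale_date - start).days
        # Rows outside the window are ignored rather than raising an error.
        if 0 <= offset < days:
            series[product_id][offset] += quantity
    return dict(series)


def demo():
    rows = [
        ("A", date(2013, 1, 1), 2),
        ("A", date(2013, 1, 1), 3),
        ("A", date(2013, 1, 3), 1),
        ("B", date(2013, 1, 2), 4),
        ("B", date(2012, 12, 31), 9),  # before the window, so dropped
    ]
    return collate_time_series(rows, start=date(2013, 1, 1), days=3)
```

Days with no sales stay at zero, so every product ends up with a series of the same length, which keeps the downstream forecasting step uniform.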
R running in an MPP environment
• Persistence Layer
• Analytical Platform Layer
• 1 output table in RAM
• 128 parallel instances of R
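The pattern on this slide is "one forecast per product, many products at once". If the roughly 12,000 products were spread evenly, each of the 128 instances would handle about 12,000 / 128 ≈ 94 of them. The sketch below imitates that partition-and-apply pattern. It rests on several assumptions: a thread pool stands in for the platform's parallel R instances, a simple moving average stands in for the forecasting method (the slides do not name one), and every history is non-empty.

```python
from concurrent.futures import ThreadPoolExecutor


def moving_average_forecast(history, window=7):
    """Forecast the next value as the mean of the last `window` observations.

    A deliberately simple stand-in; assumes `history` is non-empty.
    """
    recent = history[-window:]
    return sum(recent) / len(recent)


def forecast_all(series_by_product, workers=4):
    """Apply the same forecast function to each product's series independently."""
    products = sorted(series_by_product)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with `products`.
        forecasts = pool.map(
            moving_average_forecast,
            (series_by_product[product] for product in products),
        )
        return dict(zip(products, forecasts))
```

Because each product's series is forecast independently, the work divides cleanly across workers and the results can be gathered into a single output table.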
R running in an MPP environment
• Persistence Layer
• Analytical Platform Layer
• Application & Client Layer: Excel, All BI Tools
• 13 views of different analytical output
• Result set contained # rows
• 12K forecasts and stats calculated in # seconds
• 2.9B EPOS items collated into time series in # seconds
Product Forecasting using parallel R
Demo
Thank you for your participation today
• More information on today’s topic can be found at:
  – kognitio.com/mpp_r
  – kognitio.com/product-forecasting
• FREE TO USE – perpetual license
  – www.kognitio.com/free
  – Contact us for the pre-release version 8
• Analyst White Papers
  – EMA Comparative Analysis
  – In-memory database platforms
  – www.kognitio.com/emacompinmem
• Today’s slides (and more): www.slideshare.net/Kognitio
Connect
• www.kognitio.com
• twitter.com/kognitio
• linkedin.com/companies/kognitio
• tinyurl.com/kognitio
• youtube.com/kognitio
NA: +1 855 KOGNITIO
EMEA: +44 1344 300 770
