SlideShare una empresa de Scribd logo
1 de 15
Descargar para leer sin conexión
Extending the Reach of R
to the Enterprise

Lou Bajuk-Yorgan
Sr. Dir., Product Management
TIBCO Spotfire

1
© Copyright 2000-2014 TIBCO Software Inc.
Extending the Reach of R to the Enterprise
• TIBCO, S+, and embracing R in Spotfire
• Challenges of R for Enterprise applications

• TIBCO Enterprise Runtime for R (TERR)
• Benefits for organizations (and individuals) who use R

• Examples of TERR integration and performance
• Learn more and try it yourself

-2
Our Journey to TERR
•

John Chambers developed the S language at Bell Labs
– Starting in the mid 70’s

•

Insightful (Statsci) founded to commercial S as S+ in 1987
– The “plus”: statistical libraries, documentation, and support
– Later focus on commercial users, ease of use, server integration

•

R: development begun by Ross Ihaka and Robert Gentleman at University of
Auckland in mid 90’s

•

Insightful acquired by TIBCO in 2008
– Spotfire (for Data Discovery and Visualization) acquired in 2007

•

Focus shifted to applying Predictive Analytics in Spotfire
– Step 1: Embrace R

-3
Predictive Analytics with Spotfire
Easily provide targeted, relevant predictive analytics to business users to
improve decision making
•

Ensure compliance & proper usage

•

Share best practices and consistent workflows

•

Get the answer & do “What If?” analyses when needed

•

Leverage investments in R, S+, SAS, MATLAB, …

Powerful Predictive Analytics tools for Spotfire analysts
•

Integrated into Spotfire workflows

•

Easily create, evaluate, and share Predictive Models

•

Add Forecasts with a single click

Benefits of Predictive Analytics to a spectrum of users
•

Increase confidence & effectiveness in decision-making
–
–
–

Reduce uncertainty
Discover meaningful patterns, important data
Maximize ROI

•

Anticipate and react to emerging trends

•

Reduce/manage risk
–

•

Scenario planning, forecasts, fraud detection

Forecast specific behavior, preemptively act on it
–

Increase upsell, decrease churn
Embracing R
•

Spotfire Statistics Server
–

Integration of R & S+ into Spotfire
applications
•

–

Later added SAS® & MATLAB®

Leverage the interactive visualizations
of Spotfire

•

Contribute to the R community

•

Well received—but our Enterprise
customers need more
–
–

-5

R provides tremendous benefits to
statisticians
But large enterprises are often
challenged to leverage that value
Enterprise Challenges for R
•

Core R engine struggles with Big Data
–
–

•

R was not built for enterprise usage and integration
–
–

•

Built as an academic tool for research and teaching
Software vendors attempting to use R in ways it was never intended

GPL great for statisticians, but limits enterprise innovation and investment
–
–

•

Customers don’t use R, or reimplement R code in specialized libraries or other languages
Lose agility & consistency, delay time to production, lose opportunities

Viral open source licensing risks commercial IP
Large vendors avoid tight integration due to open source concerns

Free to acquire, but costly to maintain
–
–

Version incompatibilities, variable quality in packages
Lack of enterprise-level technical support

6
© Copyright 2000-2013 TIBCO Software Inc.
TIBCO Enterprise Runtime for R (TERR)
•

Unique, enterprise-quality implementation of the R language
–

Fundamentally different
•
•
•

–

TIBCO IP: Not open source/GPL
•
•
•

–

Independent implementation
Licensable for embedding and redistribution by partners
Enables implementation of transparent big data handling

Broad compatibility with R functions and 1400+ CRAN packages
•

•

New architecture, developed from the ground up
Based on our long history and expertise with S+
Faster, more robust and more memory-efficient than R

Ongoing effort to broaden our coverage of R

Extends the Reach of R to the Enterprise
–
–
–

Develop in R, deploy on TERR
Rapidly iterate prototyping to production without recoding/retesting—more rapidly respond to
changing business conditions
Easily integrate R-language analytics consistently across organization—into grids, BI
applications, event-driven analytics, etc.
© Copyright 2000-2013 TIBCO Software Inc.
Leveraging TERR
TERR in Spotfire

TERR in
Statistics Services

Embeddable
TERR Engine

Ad hoc tools and interactive applications powered by advanced analytics
• Spotfire Analytics platform: interactive visualization & data discovery,
easily build and share applications, broad data access, etc.

Distributed analytics
• Managed pools of engines
• Load balancing, queuing, failover, parallelization, etc.
• High level APIs for loose integration, data i/o (C#, Java)
• Central management of analytics, R packages
Custom (tight) integration, batch, existing grids, etc.
• Faster than R, more robust, better memory management, fully
supported
• Low level APIs for tight integration
• Integrated into TIBCO products: CEP, Cloud Compute, …
© Copyright 2000-2013 TIBCO Software Inc.
Providing Value for individuals who use R
•

Not seeking to displace R from statistician’s
desktops
–

•

Contribute to the R community
–
–

•

As we port from S+ or develop for TERR
• Supports “Develop in Open Source R, Deploy
on TERR”
• E.g., splusTimeSeries, splusTimeDate, sjdbc

TERR Developer Edition
–
–
–

-9

Sponsor useR conferences, contribute to R
Foundation
Contribute bug reports and propose fixes to R core

Contribute packages to CRAN
–

•

Enterprise platform for the deployment and
integration of your work—without having to rewrite
it!

Full version of TERR engine for testing code prior to
deployment
• Compatible with RStudio & ESS Emacs
Free for non-production use
Supported through Community site
Example 1: TERR vs. R Raw Performance
One specific example
• Non-optimal, non-vectorized, real-world R script
• For loop with row by row processing
for (i in seq(1,length=nrow(df))) {
…process each customer record…
}

Results
• TERR is ~35x faster for 50K rows, 150x faster for 500K rows
• No code modification required
We are looking for more real-world performance tests!
• On average 2-10x faster than R in microtests
Example 2: Spotfire Forecast Tool
•

Forecast Tool
– Easily add Forecasts to
Visualizations by right click menu
– Advanced users can tune settings
– Uses embedded TERR engine

•

Benefits
– Extend the power of Predictive
Analytics for ad hoc analysis to all
Spotfire users
– Easy entry point to Spotfire
Predictive Analytics
TERR integration with TIBCO StreamBase
•

Event-Driven analysis in TIBCO Spotfire
Event Analytics
–

•

Apply predictive models in real-time
decision making
–
–
–
–

•

Process monitoring, analysis, and
optimization

Best marketing offer
Customer churn
Predictive Maintenance
Yield optimization

Rapidly develop and iterate models in
production
–

Respond to changing opportunities
and threats

12
TIBCO Cloud Compute Grid
•

High performance computing on the cloud
– Available on TIBCO Cloud Marketplace
– TERR, Java and .NET computations

•

Robust DataSynapse GridServer architecture
– Used by Wall Street to manage 10K’s nodes
– Java, .NET, and REST APIs (JSON)

•

Perfect for pure computational work
– Vastly easier to use for applications like Monte Carlo
simulations than Map-Reduce
– Run complex statistical models multiple orders of
magnitude faster than open source R on a single
computer
– Unparalleled scalability without upfront capital
investment

•

Easy to get started
– Uses your Amazon EC2 account
Demos

• TERR in Spotfire
– Fraud Detection Application
– Data Functions: using the R language in Spotfire
– Forecast Tool
Learn more and Try it yourself
•

TERR Community at TIBCOmmunity.com
–
–
–
–

•

TERR Developer Edition
–
–

•

Full version of TERR engine for testing code prior to deployment
Supported through TIBCOmmunity, download via tap.tibco.com

TIBCO Cloud Compute Grid
–

•

Resources, FAQs, Forums
Details of R coverage
Product documentation & download
More info at spotfire.tibco.com/terr

https://marketplace.cloud.tibco.com

We want your feedback and input!
–
–
–

Real world performance tests
Package & R coverage prioritization
Via TERR Community, or contact me lbajuk@tibco.com or @loubajuk

Más contenido relacionado

La actualidad más candente

Stream Scaling in Pravega
Stream Scaling in PravegaStream Scaling in Pravega
Stream Scaling in Pravega
DataWorks Summit
 
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Databricks
 
Highly configurable and extensible data processing framework at PubMatic
Highly configurable and extensible data processing framework at PubMaticHighly configurable and extensible data processing framework at PubMatic
Highly configurable and extensible data processing framework at PubMatic
DataWorks Summit
 

La actualidad más candente (20)

The case of vehicle networking financial services accomplished by China Mobile
The case of vehicle networking financial services accomplished by China MobileThe case of vehicle networking financial services accomplished by China Mobile
The case of vehicle networking financial services accomplished by China Mobile
 
Stream Scaling in Pravega
Stream Scaling in PravegaStream Scaling in Pravega
Stream Scaling in Pravega
 
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
 
Using Hadoop for Cognitive Analytics
Using Hadoop for Cognitive AnalyticsUsing Hadoop for Cognitive Analytics
Using Hadoop for Cognitive Analytics
 
Democratizing data science Using spark, hive and druid
Democratizing data science Using spark, hive and druidDemocratizing data science Using spark, hive and druid
Democratizing data science Using spark, hive and druid
 
End to End Supply Chain Control Tower
End to End Supply Chain Control TowerEnd to End Supply Chain Control Tower
End to End Supply Chain Control Tower
 
O2’s Financial Data Hub: going beyond IFRS compliance to support digital tran...
O2’s Financial Data Hub: going beyond IFRS compliance to support digital tran...O2’s Financial Data Hub: going beyond IFRS compliance to support digital tran...
O2’s Financial Data Hub: going beyond IFRS compliance to support digital tran...
 
The Keys to Digital Transformation
The Keys to Digital TransformationThe Keys to Digital Transformation
The Keys to Digital Transformation
 
Highly configurable and extensible data processing framework at PubMatic
Highly configurable and extensible data processing framework at PubMaticHighly configurable and extensible data processing framework at PubMatic
Highly configurable and extensible data processing framework at PubMatic
 
Next generation Polyglot Architectures using Neo4j by Stefan Kolmar
Next generation Polyglot Architectures using Neo4j by Stefan KolmarNext generation Polyglot Architectures using Neo4j by Stefan Kolmar
Next generation Polyglot Architectures using Neo4j by Stefan Kolmar
 
Turning an idea into a Data-Driven Production System: An Energy Load Forecas...
 Turning an idea into a Data-Driven Production System: An Energy Load Forecas... Turning an idea into a Data-Driven Production System: An Energy Load Forecas...
Turning an idea into a Data-Driven Production System: An Energy Load Forecas...
 
San Antonio’s electric utility making big data analytics the business of the ...
San Antonio’s electric utility making big data analytics the business of the ...San Antonio’s electric utility making big data analytics the business of the ...
San Antonio’s electric utility making big data analytics the business of the ...
 
Munich Re: Driving a Big Data Transformation
Munich Re: Driving a Big Data TransformationMunich Re: Driving a Big Data Transformation
Munich Re: Driving a Big Data Transformation
 
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
 
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
 
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
 
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...
 
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -
 
Insight Platforms Accelerate Digital Transformation
Insight Platforms Accelerate Digital TransformationInsight Platforms Accelerate Digital Transformation
Insight Platforms Accelerate Digital Transformation
 
The Single Most Important Formula for Business Success
The Single Most Important Formula for Business SuccessThe Single Most Important Formula for Business Success
The Single Most Important Formula for Business Success
 

Destacado

Spotfire Integration & Dynamic Output creation
Spotfire Integration & Dynamic Output creationSpotfire Integration & Dynamic Output creation
Spotfire Integration & Dynamic Output creation
Ambareesh Kulkarni
 
Adopting Tibco Spotfire in Bio-Informatics
Adopting Tibco Spotfire in Bio-InformaticsAdopting Tibco Spotfire in Bio-Informatics
Adopting Tibco Spotfire in Bio-Informatics
Herwig Van Marck
 
Advanced Use of Properties and Scripts in TIBCO Spotfire
Advanced Use of Properties and Scripts in TIBCO SpotfireAdvanced Use of Properties and Scripts in TIBCO Spotfire
Advanced Use of Properties and Scripts in TIBCO Spotfire
Herwig Van Marck
 
Building a Perfect Strategic Partnership in 5 Stages
Building a Perfect Strategic Partnership in 5 StagesBuilding a Perfect Strategic Partnership in 5 Stages
Building a Perfect Strategic Partnership in 5 Stages
Ogilvy
 

Destacado (15)

Spotfire Integration & Dynamic Output creation
Spotfire Integration & Dynamic Output creationSpotfire Integration & Dynamic Output creation
Spotfire Integration & Dynamic Output creation
 
Using the R Language in BI and Real Time Applications (useR 2015)
Using the R Language in BI and Real Time Applications (useR 2015)Using the R Language in BI and Real Time Applications (useR 2015)
Using the R Language in BI and Real Time Applications (useR 2015)
 
Adopting Tibco Spotfire in Bio-Informatics
Adopting Tibco Spotfire in Bio-InformaticsAdopting Tibco Spotfire in Bio-Informatics
Adopting Tibco Spotfire in Bio-Informatics
 
Real time applications using the R Language
Real time applications using the R LanguageReal time applications using the R Language
Real time applications using the R Language
 
100 days of Spotfire - Tips & Tricks
100 days of Spotfire - Tips & Tricks100 days of Spotfire - Tips & Tricks
100 days of Spotfire - Tips & Tricks
 
TIBCO Spotfire: Data Science in the Enterprise
TIBCO Spotfire: Data Science in the EnterpriseTIBCO Spotfire: Data Science in the Enterprise
TIBCO Spotfire: Data Science in the Enterprise
 
Deploying R in BI and Real time Applications
Deploying R in BI and Real time ApplicationsDeploying R in BI and Real time Applications
Deploying R in BI and Real time Applications
 
Advantages of Spotfire
Advantages of SpotfireAdvantages of Spotfire
Advantages of Spotfire
 
Webinar: SAP BW Dinosaur to Agile Analytics Powerhouse
Webinar: SAP BW Dinosaur to Agile Analytics PowerhouseWebinar: SAP BW Dinosaur to Agile Analytics Powerhouse
Webinar: SAP BW Dinosaur to Agile Analytics Powerhouse
 
TIBCO Spotfire
TIBCO SpotfireTIBCO Spotfire
TIBCO Spotfire
 
Advanced Use of Properties and Scripts in TIBCO Spotfire
Advanced Use of Properties and Scripts in TIBCO SpotfireAdvanced Use of Properties and Scripts in TIBCO Spotfire
Advanced Use of Properties and Scripts in TIBCO Spotfire
 
Building a Perfect Strategic Partnership in 5 Stages
Building a Perfect Strategic Partnership in 5 StagesBuilding a Perfect Strategic Partnership in 5 Stages
Building a Perfect Strategic Partnership in 5 Stages
 
Fundamental JavaScript [UTC, March 2014]
Fundamental JavaScript [UTC, March 2014]Fundamental JavaScript [UTC, March 2014]
Fundamental JavaScript [UTC, March 2014]
 
Analysis and visualization of microarray experiment data integrating Pipeline...
Analysis and visualization of microarray experiment data integrating Pipeline...Analysis and visualization of microarray experiment data integrating Pipeline...
Analysis and visualization of microarray experiment data integrating Pipeline...
 
JavaScript Programming
JavaScript ProgrammingJavaScript Programming
JavaScript Programming
 

Similar a Extending the Reach of R to the Enterprise with TERR and Spotfire

RubiOne: Apache Spark as the Backbone of a Retail Analytics Development Envir...
RubiOne: Apache Spark as the Backbone of a Retail Analytics Development Envir...RubiOne: Apache Spark as the Backbone of a Retail Analytics Development Envir...
RubiOne: Apache Spark as the Backbone of a Retail Analytics Development Envir...
Databricks
 
Ipsita_Informatica_9Year
Ipsita_Informatica_9YearIpsita_Informatica_9Year
Ipsita_Informatica_9Year
ipsita mohanty
 
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
Revolution Analytics
 
Akram_Resume_ETL_Informatica
Akram_Resume_ETL_InformaticaAkram_Resume_ETL_Informatica
Akram_Resume_ETL_Informatica
Akram Bhuyan
 

Similar a Extending the Reach of R to the Enterprise with TERR and Spotfire (20)

TERR in BI and Real Time applications
TERR in BI and Real Time applicationsTERR in BI and Real Time applications
TERR in BI and Real Time applications
 
Extending the R language to BI and Real-time Applications JSM 2015
Extending the R language to BI and Real-time Applications JSM 2015Extending the R language to BI and Real-time Applications JSM 2015
Extending the R language to BI and Real-time Applications JSM 2015
 
Extend the Reach of R to the Enterprise (for useR! 2013)
Extend the Reach of R to the Enterprise (for useR! 2013)Extend the Reach of R to the Enterprise (for useR! 2013)
Extend the Reach of R to the Enterprise (for useR! 2013)
 
Applying the R Language to BI and Real Time Applications
Applying the R Language to BI and Real Time ApplicationsApplying the R Language to BI and Real Time Applications
Applying the R Language to BI and Real Time Applications
 
Applying R in BI and Real Time applications EARL London 2015
Applying R in BI and Real Time applications EARL London 2015Applying R in BI and Real Time applications EARL London 2015
Applying R in BI and Real Time applications EARL London 2015
 
R in BI and Streaming Applications for useR 2016
R in BI and Streaming Applications for useR 2016R in BI and Streaming Applications for useR 2016
R in BI and Streaming Applications for useR 2016
 
Big Data Day LA 2016/ Big Data Track - Apply R in Enterprise Applications, Lo...
Big Data Day LA 2016/ Big Data Track - Apply R in Enterprise Applications, Lo...Big Data Day LA 2016/ Big Data Track - Apply R in Enterprise Applications, Lo...
Big Data Day LA 2016/ Big Data Track - Apply R in Enterprise Applications, Lo...
 
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
 
TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...
TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...
TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...
 
RubiOne: Apache Spark as the Backbone of a Retail Analytics Development Envir...
RubiOne: Apache Spark as the Backbone of a Retail Analytics Development Envir...RubiOne: Apache Spark as the Backbone of a Retail Analytics Development Envir...
RubiOne: Apache Spark as the Backbone of a Retail Analytics Development Envir...
 
Ipsita_Informatica_9Year
Ipsita_Informatica_9YearIpsita_Informatica_9Year
Ipsita_Informatica_9Year
 
Company Profile - NPC with TIBCO Spotfire solution
Company Profile - NPC with TIBCO Spotfire solution  Company Profile - NPC with TIBCO Spotfire solution
Company Profile - NPC with TIBCO Spotfire solution
 
Validation
ValidationValidation
Validation
 
Are you ready for the transformation
Are you ready for the transformationAre you ready for the transformation
Are you ready for the transformation
 
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
 
VASU_VALLABHUNI_INFOSYS
VASU_VALLABHUNI_INFOSYSVASU_VALLABHUNI_INFOSYS
VASU_VALLABHUNI_INFOSYS
 
What Does Artificial Intelligence Have to Do with IT Operations?
What Does Artificial Intelligence Have to Do with IT Operations?What Does Artificial Intelligence Have to Do with IT Operations?
What Does Artificial Intelligence Have to Do with IT Operations?
 
Akram_Resume_ETL_Informatica
Akram_Resume_ETL_InformaticaAkram_Resume_ETL_Informatica
Akram_Resume_ETL_Informatica
 
OC Big Data Monthly Meetup #6 - Session 1 - IBM
OC Big Data Monthly Meetup #6 - Session 1 - IBMOC Big Data Monthly Meetup #6 - Session 1 - IBM
OC Big Data Monthly Meetup #6 - Session 1 - IBM
 
SD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBMSD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBM
 

Más de Lou Bajuk

Sannella use r2013-terr-memory-management
Sannella use r2013-terr-memory-managementSannella use r2013-terr-memory-management
Sannella use r2013-terr-memory-management
Lou Bajuk
 

Más de Lou Bajuk (13)

R Consortium update for EARL Boston Oct 2017
R Consortium update for EARL Boston Oct 2017R Consortium update for EARL Boston Oct 2017
R Consortium update for EARL Boston Oct 2017
 
Reusing and Managing R models in an Enterprise
Reusing and Managing  R models in an EnterpriseReusing and Managing  R models in an Enterprise
Reusing and Managing R models in an Enterprise
 
R consortium update EARL London Sept 2017
R consortium update EARL London Sept 2017R consortium update EARL London Sept 2017
R consortium update EARL London Sept 2017
 
Making Data Science accessible to a wider audience
Making Data Science accessible to a wider audienceMaking Data Science accessible to a wider audience
Making Data Science accessible to a wider audience
 
R Consortium Update for EARL June 2017
R Consortium Update for EARL June 2017R Consortium Update for EARL June 2017
R Consortium Update for EARL June 2017
 
Streaming analytics overview for R
Streaming analytics overview for RStreaming analytics overview for R
Streaming analytics overview for R
 
Tibco streaming analytics overview and roadmap
Tibco streaming analytics overview and roadmapTibco streaming analytics overview and roadmap
Tibco streaming analytics overview and roadmap
 
Embracing data science for smarter analytics apps
Embracing data science for smarter analytics appsEmbracing data science for smarter analytics apps
Embracing data science for smarter analytics apps
 
EARL Sept 2016 R consortium
EARL Sept 2016 R consortiumEARL Sept 2016 R consortium
EARL Sept 2016 R consortium
 
The Importance of an Analytics Platform
The Importance of an Analytics PlatformThe Importance of an Analytics Platform
The Importance of an Analytics Platform
 
Software Testing and the R language
Software Testing and the R languageSoftware Testing and the R language
Software Testing and the R language
 
The Compatibility Challenge:Examining R and Developing TERR
The Compatibility Challenge:Examining R and Developing TERRThe Compatibility Challenge:Examining R and Developing TERR
The Compatibility Challenge:Examining R and Developing TERR
 
Sannella use r2013-terr-memory-management
Sannella use r2013-terr-memory-managementSannella use r2013-terr-memory-management
Sannella use r2013-terr-memory-management
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Último (20)

Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 

Extending the Reach of R to the Enterprise with TERR and Spotfire

  • 1. Extending the Reach of R to the Enterprise Lou Bajuk-Yorgan Sr. Dir., Product Management TIBCO Spotfire 1 © Copyright 2000-2014 TIBCO Software Inc.
  • 2. Extending the Reach of R to the Enterprise • TIBCO, S+, and embracing R in Spotfire • Challenges of R for Enterprise applications • TIBCO Enterprise Runtime for R (TERR) • Benefits for organizations (and individuals) who use R • Examples of TERR integration and performance • Learn more and try it yourself -2
  • 3. Our Journey to TERR • John Chambers developed the S language at Bell Labs – Starting in the mid 70’s • Insightful (Statsci) founded to commercial S as S+ in 1987 – The “plus”: statistical libraries, documentation, and support – Later focus on commercial users, ease of use, server integration • R: development begun by Ross Ihaka and Robert Gentleman at University of Auckland in mid 90’s • Insightful acquired by TIBCO in 2008 – Spotfire (for Data Discovery and Visualization) acquired in 2007 • Focus shifted to applying Predictive Analytics in Spotfire – Step 1: Embrace R -3
  • 4. Predictive Analytics with Spotfire Easily provide targeted, relevant predictive analytics to business users to improve decision making • Ensure compliance & proper usage • Share best practices and consistent workflows • Get the answer & do “What If?” analyses when needed • Leverage investments in R, S+, SAS, MATLAB, … Powerful Predictive Analytics tools for Spotfire analysts • Integrated into Spotfire workflows • Easily create, evaluate, and share Predictive Models • Add Forecasts with a single click Benefits of Predictive Analytics to a spectrum of users • Increase confidence & effectiveness in decision-making – – – Reduce uncertainty Discover meaningful patterns, important data Maximize ROI • Anticipate and react to emerging trends • Reduce/manage risk – • Scenario planning, forecasts, fraud detection Forecast specific behavior, preemptively act on it – Increase upsell, decrease churn
  • 5. Embracing R • Spotfire Statistics Server – Integration of R & S+ into Spotfire applications • – Later added SAS® & MATLAB® Leverage the interactive visualizations of Spotfire • Contribute to the R community • Well received—but our Enterprise customers need more – – -5 R provides tremendous benefits to statisticians But large enterprises are often challenged to leverage that value
  • 6. Enterprise Challenges for R • Core R engine struggles with Big Data – – • R was not built for enterprise usage and integration – – • Built as an academic tool for research and teaching Software vendors attempting to use R in ways it was never intended GPL great for statisticians, but limits enterprise innovation and investment – – • Customers don’t use R, or reimplement R code in specialized libraries or other languages Lose agility & consistency, delay time to production, lose opportunities Viral open source licensing risks commercial IP Large vendors avoid tight integration due to open source concerns Free to acquire, but costly to maintain – – Version incompatibilities, variable quality in packages Lack of enterprise-level technical support 6 © Copyright 2000-2013 TIBCO Software Inc.
  • 7. TIBCO Enterprise Runtime for R (TERR) • Unique, enterprise-quality implementation of the R language – Fundamentally different • • • – TIBCO IP: Not open source/GPL • • • – Independent implementation Licensable for embedding and redistribution by partners Enables implementation of transparent big data handling Broad compatibility with R functions and 1400+ CRAN packages • • New architecture, developed from the ground up Based on our long history and expertise with S+ Faster, more robust and more memory-efficient than R Ongoing effort to broaden our coverage of R Extends the Reach of R to the Enterprise – – – Develop in R, deploy on TERR Rapidly iterate prototyping to production without recoding/retesting—more rapidly respond to changing business conditions Easily integrate R-language analytics consistently across organization—into grids, BI applications, event-driven analytics, etc. © Copyright 2000-2013 TIBCO Software Inc.
  • 8. Leveraging TERR TERR in Spotfire TERR in Statistics Services Embeddable TERR Engine Ad hoc tools and interactive applications powered by advanced analytics • Spotfire Analytics platform: interactive visualization & data discovery, easily build and share applications, broad data access, etc. Distributed analytics • Managed pools of engines • Load balancing, queuing, failover, parallelization, etc. • High level APIs for loose integration, data i/o (C#, Java) • Central management of analytics, R packages Custom (tight) integration, batch, existing grids, etc. • Faster than R, more robust, better memory management, fully supported • Low level APIs for tight integration • Integrated into TIBCO products: CEP, Cloud Compute, … © Copyright 2000-2013 TIBCO Software Inc.
  • 9. Providing Value for individuals who use R • Not seeking to displace R from statistician’s desktops – • Contribute to the R community – – • As we port from S+ or develop for TERR • Supports “Develop in Open Source R, Deploy on TERR” • E.g., splusTimeSeries, splusTimeDate, sjdbc TERR Developer Edition – – – -9 Sponsor useR conferences, contribute to R Foundation Contribute bug reports and propose fixes to R core Contribute packages to CRAN – • Enterprise platform for the deployment and integration of your work—without having to rewrite it! Full version of TERR engine for testing code prior to deployment • Compatible with RStudio & ESS Emacs Free for non-production use Supported through Community site
  • 10. Example 1: TERR vs. R Raw Performance One specific example • Non-optimal, non-vectorized, real-world R script • For loop with row by row processing for (i in seq(1,length=nrow(df))) { …process each customer record… } Results • TERR is ~35x faster for 50K rows, 150x faster for 500K rows • No code modification required We are looking for more real-world performance tests! • On average 2-10x faster than R in microtests
  • 11. Example 2: Spotfire Forecast Tool • Forecast Tool – Easily add Forecasts to Visualizations by right click menu – Advanced users can tune settings – Uses embedded TERR engine • Benefits – Extend the power of Predictive Analytics for ad hoc analysis to all Spotfire users – Easy entry point to Spotfire Predictive Analytics
  • 12. TERR integration with TIBCO StreamBase • Event-Driven analysis in TIBCO Spotfire Event Analytics – • Apply predictive models in real-time decision making – – – – • Process monitoring, analysis, and optimization Best marketing offer Customer churn Predictive Maintenance Yield optimization Rapidly develop and iterate models in production – Respond to changing opportunities and threats 12
  • 13. TIBCO Cloud Compute Grid • High performance computing on the cloud – Available on TIBCO Cloud Marketplace – TERR, Java and .NET computations • Robust DataSynapse GridServer architecture – Used by Wall Street to manage 10K’s nodes – Java, .NET, and REST APIs (JSON) • Perfect for pure computational work – Vastly easier to use for applications like Monte Carlo simulations than Map-Reduce – Run complex statistical models multiple orders of magnitude faster than open source R on a single computer – Unparalleled scalability without upfront capital investment • Easy to get started – Uses your Amazon EC2 account
  • 14. Demos • TERR in Spotfire – Fraud Detection Application – Data Functions: using the R language in Spotfire – Forecast Tool
  • 15. Learn more and Try it yourself • TERR Community at TIBCOmmunity.com – – – – • TERR Developer Edition – – • Full version of TERR engine for testing code prior to deployment Supported through TIBCOmmunity, download via tap.tibco.com TIBCO Cloud Compute Grid – • Resources, FAQs, Forums Details of R coverage Product documentation & download More info at spotfire.tibco.com/terr https://marketplace.cloud.tibco.com We want your feedback and input! – – – Real world performance tests Package & R coverage prioritization Via TERR Community, or contact me lbajuk@tibco.com or @loubajuk