Prototypes are typically re-implemented in another language due to compatibility issues with R in the enterprise, but TIBCO Enterprise Runtime for R (TERR) allows the language to be run on several platforms. Enterprise-level scalability has been brought to the R language, enabling rapid iteration without the need to recode, re-implement and test. This presentation will delve further into these topics, highlighting specific use cases and the true value that can be gained from utilizing R. The session will be followed by a lively, open Q&A discussion.
2. Big Data Analytic Challenges for Enterprises
• More and more data…and the expectation to do
something valuable with it
• How do you get Value from Big Data?
• Deeper insights into Big Data—Visualization
and Advanced Analytics/Machine Learning
• Smarter Decisions—Provide insights to wider
community beyond Data Scientists
• Faster Decisions—both human and
automated
• Agile response to evolving opportunities and
threats
• Answers (and the questions to ask) change
rapidly
3. Multiple paths from Data to Decision and Action
Advanced
Analytics/Machine
Learning help
deliver smarter,
faster decisions
4. •
• Build predictive models on data in Spark and Hadoop (~10
models)
• Mllib
• Build predictive models in Spark (~20 models)
• In-database algorithms
• Vertica, SQL Server, Teradata Aster, Oracle, etc. (varies, ~10-20
models)
• Python
• Larger set of analytic methods…not designed for Big Data
(A few of) Many Options for Machine Learning on Big Data
6. • Agile
• Easy prototyping of new
models and analysis
• Deeper insights
• Huge array of analytic
methods available
• The “best” method to solve a
given problem is likely
available
• Performance
• Not designed for real time or Big Data
applications
• Broader usage
• Hard for non-Data Scientist to use
directly
• Challenging to integrate into enterprise
applications
• Performance, commercial support and
Intellectual Property concerns
• Compromises which impact Agility
• Recode in a new, less agile, more
analytically limited environment
• Rewrite, use specialized R packages to
solve one problem better
R can help…R can help address Big Data Analytic Challenges…
…but it has it’s own challenges
7. What would the ideal solution look like?
• A single environment that would allow you to prototype in R, and
deploy to production in R
• Without recoding, without delay, without compromises
• Enable agile response to changing opportunities and threats
Requires
• Analytic flexibility, power and breadth of R
• High performance, scalable, robust platform
• Easy to embed in Business Intelligence, Real time and custom applications
• Fully supported for mission critical applications
• Allows R users to continue to work in their preferred development environments (e.g.,
RStudio)
8. TIBCO Enterprise Runtime for R (TERR)
• Unique, enterprise-grade engine for the R language,
built from the ground up by TIBCO
• Based on TIBCO’s long history and expertise with S+
• Better performance and memory management than open source R
• Designed for R language compatibility
• Wide range of built-in analytic methods
• Compatible with thousands of CRAN packages (dplyr, data.table,
etc.)
• Designed for commercial embeddability
• TIBCO licensed & supported product
• Not GPL, not a repackaging of the Open source R engine
• TERR extends the reach of R in the enterprise
• Develop code in open source R
• Deploy on a commercially-supported and robust platform
• Without the delay and cost of rewriting your code
• Embed in Data Discovery, BI and real time applications
• Leverage existing R ecosystem in Hadoop, Spark, in-db analytics.
10. Embed R in Business Intelligence and Visualization tools
• Spotfire: Data Discovery and Visualization platform for Business Users and Analysts
• Separate analytics platform, independent of TERR/R
• Easily enhance Spotfire analyses and applications with R language scripts
• Extend the impact of the Data Scientist/R by making their analytic insights available to a wider audience
Write R code directly in Spotfire;
TERR executes locally or on server
Manage TERR analytics locally or
in Server to reuse across
community
Deploy TERR-powered
applications to the web
12. Advanced Analytic Applications in Spotfire
Customer Churn:
• Retain your most profitable customers
• Increase upsell, decrease churn
Fraud Detection:
• Reduce losses due to fraudulent transactions
Supply Chain Optimization:
• Anticipate peaks and lulls
• Optimize distribution centers
HR Planning:
• Predict employee attrition and optimize retention
13. Spotfire, TERR & Hadoop
• Putting it all together
• Drive complex, advanced analytic workflows in
Hadoop from intuitive Spotfire applications
• Initiate, parameterize and visualize Map Reduce
operations from within Spotfire templates via TERR
• Spotfire Customer application received
Cloudera Strata Data Impact Award for
Advanced Analytics
• Multinational manufacturing client
• Use data visualization and modeling tools co-
developed with TIBCO Spotfire, analysts for the
client are building mathematical models that can
predict when a manufacturing stoppage might occur.
14. • Real-time advanced analytics
• Apply predictive model in response to some triggering event
Sensors on industrial equipment trending negative; customer walks into your store or purchases online,
etc.
• Trigger the right decision in response
Extend a mobile offer to a customer; stop a fraudulent transaction in process; alert the equipment
operator or shut down the equipment
Embed R in real-time Data Streams
16. • Port Congestion Detection
• Real time system triggers TERR
• Analyzes port congestion
• Recommends reduction of speed if
no berths available
• Maritime Abnormality Detection
• Based on Automatic Identification
System info, TERR calculates
likelihood of deviation from normal
sailing routes
• Alerts carrier & operator
Transportation and Logistics Optimization
17. Use TERR in your familiar tools
RStudio IDE
• Free, open source IDE widely used by the
R Community
• Fully compatible with TERR Developer
Edition
KNIME
• Free, open source workflow tool for data
management and analysis
• TERR fully compatible with KNIME
Interactive R Statistics Integration nodes
18. TERR is R for the Enterprise
• Develop code in open source R, deploy on commercially-supported,
and robust platforms
• Without recoding, without compromises
• Save time & money, quickly respond to new threats and opportunities
• Tightly & efficiently embed R language functionality
• Extend the power of R to a wider audience, more applications
19. • TERR Community at community.tibco.com
• Resources, Documentation, R compatibility, FAQs, Forums
• Predictive Analytics overview and resources
• Free TERR Developer Edition
• Full version of TERR engine for testing code prior to deployment
• Supported through TIBCO Community, download via tap.tibco.com
• Spotfire Free Trial: http://spotfire.tibco.com/trial
• Spotfire and Big Data http://spotfire.tibco.com/solutions/technology/big-data
• R Consortium Founding Member www.r-consortium.org
Learn more and Try it yourself
Notas del editor
In any sort of analytics problem, people are attempting to get from the Data to make some decision and take some Action based on that decision.
Depending on the tools used, the depth and complexity of the analysis, and the skill of the user, the Decision Maker will need to apply some judgment to make the decision. The deeper the analysis, the more predictive or prescriptive it is, the less of the interpretation and mental heavy lifting the end user will need to do
The R language can be a great asset here, of course, but there are obstacles to applying it
Graphic from KDNuggets poll http://www.kdnuggets.com/2015/05/r-vs-python-data-science.html
Chart from http://www.r-bloggers.com/a-segmented-model-of-cran-package-growth/
… What is TERR?
Since the demo wraps up with the idea of deploying the model to real time systems, it is a good segue
Supply Chain Optimization: simulate production and shipping scenarios to anticipate peaks and lulls
HR Retention: Predict employee attrition and optimize retention
Advanced Analytics -- A TIBCO Spotfire advanced analytics solution purpose-built for a client
TIBCO built extensions of its Spotfire product for one client to support production line event analysis, improving product quality and supply chain efficiency. The client’s global manufacturing sites are peppered with sensors collecting programmable logical controller (PLC) data in sub second intervals. This data is fed into Hadoop, and with data visualization and modeling tools co-developed with TIBCO Spotfire, analysts are building mathematical models that predict when a manufacturing line stoppage might occur. In evaluating the predictive accuracy of the models used in this environment, the client saw the potential to reduce defects by 50% through improved control settings and changes made to the engineering process, and its analysts can focus on problem solving rather than coding and software “gymnastics”. Providing this structured analysis method will ultimately enable data-driven change of the manufacturing process yielding millions of dollars in cost savings benefit.
Example use case: real-time correlations for action
Automated manufacturing yield analysis.
Analyze manufacturing data in Spotfire
Deploy model
Compare live data to models of good behavior
When actual manufacturing usage breaks the model, Spotfire used to understand why