An introduction to Microsoft R Services
Microsoft R Open and Microsoft R Server
498 – Show and Tell Gregg Barrett
Introduction
This presentation will briefly cover the following:
- Why consider MRO and R Server
- R Server
- MRO
- Microsoft R Services/R Server Platform
- DistributedR
- RevoScaleR/ScaleR
- ConnectR
- DevelopR
- DeployR
- Resources
- References
Why consider MRO and R Server
- You keep the option of working with standard R and gain the added benefits of
Microsoft R Open (MRO) and R Server
- Performance
- MRO is FREE
- R Server is FREE for students, at least, through DreamSpark
[Figure: IT Market Clock for Programming Languages, 2015 (Gartner, 2015)]
Definition: Originally released in 1993, R is a mature, domain-specific and open-sourced language for statistical
analysis workloads.
Trend Analysis: Gartner client inquiry levels for R remain light and range from exploratory to best-practice
adopter themes; however, like MATLAB, the number of inquiries has increased substantially in recent years.
External data sources reflect a growth in R usage across the industry as well. We expect inquiry levels to
increase consistently through 2017.
Time to Next Market Phase: 2 to 5 years
Business Impact: The significant impact of "big data" analytics and real-time data analysis is driving demand for
languages such as R and MATLAB beyond previous entrenched market niches and into increasingly mainstream
programming workloads. In particular, adopters are turning to R as a free alternative to platforms such as SAS
and SPSS.
User Advice: Consider R as a free and open-source solution for workloads that require advanced statistical
computing or data mining capabilities with minimal coding and optimal maintenance costs over more general-
purpose languages.
Sample Vendors: Microsoft, Oracle, TIBCO Software, IBM, Wolfram Research (Gartner, 2015)
[Figure: multithreaded performance with Microsoft R Open (Microsoft, 2016)]
R Server
- Revolution R Enterprise (RRE) was developed by Revolution Analytics
- RRE is intended to offer a fast, cost-effective, enterprise-class big data
analytics platform
- Revolution Analytics was acquired by Microsoft
- RRE is now Microsoft R Server
- R Server is free for students and can be obtained through DreamSpark
- Log on or create a profile at DreamSpark using your university credentials:
https://www.dreamspark.com/Product/Product.aspx?productid=105
MRO
- RRE uses an R engine called Revolution R Open
- The Revolution R Open engine is now called Microsoft R Open (MRO)
- MRO is intended to be an enhanced distribution of open source R from Microsoft
Corporation. Specifically, Microsoft R Open leverages high-performance,
multi-threaded math libraries to deliver performance boosts. This means that functions
in R that use, for example, matrix multiplication, will run faster out of the box.
- Just like R, Microsoft R Open is open source and free
- You can download MRO here:
https://mran.revolutionanalytics.com/download/
- MRO is intended to support a variety of big data statistics, predictive modelling,
and machine learning capabilities
- At the time of this writing the latest version of MRO is version 3.2.5
- It is important to note that R Server uses a different version of MRO
- At the time of this writing the latest version of MRO for R Server is version 3.2.2
- MRO for R Server can be found here:
https://mran.revolutionanalytics.com/download/mro-for-mrs/
- MRO for R Server is a prerequisite for R Server
- After downloading and installing MRO, whether it is the version for R Server or
not, download and install MKL
- MKL is the Intel Math Kernel Library
- Important: Install Microsoft R Open before MKL
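The performance claim for matrix operations can be checked informally with a small timing sketch. It uses only base R, so it runs on either standard R or MRO; the matrix size `n` is an arbitrary choice made here for illustration, and the `RevoUtilsMath` thread query is assumed to be available only on an MRO installation with MKL, which is why it is guarded.

```r
# Time a dense matrix multiplication, an operation that R delegates to its BLAS library.
set.seed(42)                           # reproducible random inputs
n <- 500                               # illustrative size; increase for a clearer signal
a <- matrix(rnorm(n * n), nrow = n)
b <- matrix(rnorm(n * n), nrow = n)

timing <- system.time(prod_ab <- a %*% b)
elapsed <- timing[["elapsed"]]
cat("Elapsed seconds for a", n, "x", n, "multiply:", elapsed, "\n")

# On MRO with MKL installed, RevoUtilsMath can report the number of MKL threads.
# The guard keeps the script runnable on plain R, where the package is absent.
if (requireNamespace("RevoUtilsMath", quietly = TRUE)) {
  cat("MKL threads:", RevoUtilsMath::getMKLthreads(), "\n")
}
```

Running the same script under standard R and under MRO with MKL on the same machine gives a rough comparison; actual timings depend on hardware and on the number of cores.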
Microsoft R Services/R Server Platform
Note: Following the Microsoft acquisition, product names are changing and the “Revo”
designation is being dropped, which makes the naming a little harder to follow.
Microsoft R Services is positioned as R for the Enterprise.
The feature set provided by the Microsoft R Services software can be categorized as follows:
- Microsoft R Open: High performance math libraries installed on top of a stable version of
Open Source R
- DistributedR: Parallel and distributed computing framework for Big Data Analytics
- RevoScaleR/ScaleR: High-performance, scalable, parallelized and distributable Big Data
Analytics in R
- ConnectR: Data connections for Big Data Analytics
- DevelopR: An integrated development environment (IDE) for R on Windows
- DeployR: A web services software development kit for integrating R with third party products
(including business intelligence, data visualization, rules engines, etc.)
DistributedR
DistributedR allows you to run the same R script on multiple platforms; you can create
a model in one environment such as a workstation and then deploy it on a different
environment such as an on-site Microsoft SQL Server, a Teradata platform, or a
Hadoop cluster in the cloud. You just need to specify the information about where
these computations should be performed and what data should be analyzed.
For information on supported computing environments, look for the “compute
contexts” in the RevoScaleR package.
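As a minimal sketch of what specifying "where" and "what" can look like, the script below sets a local compute context and summarizes a sample file. It assumes the RevoScaleR package and its bundled AirlineDemoSmall.xdf sample data are installed; the guard and the `have_revoscaler` name are additions for illustration, and the SQL Server connection string is a placeholder, not a working value.

```r
# RevoScaleR ships with R Server, not with standard R, so check before using it.
have_revoscaler <- requireNamespace("RevoScaleR", quietly = TRUE)

if (have_revoscaler) {
  library(RevoScaleR)

  # Where the computation runs: sequentially on the local machine.
  rxSetComputeContext(RxLocalSeq())

  # What data is analyzed: a sample .xdf file bundled with RevoScaleR.
  airline <- file.path(rxGetOption("sampleDataDir"), "AirlineDemoSmall.xdf")
  print(rxSummary(~ ArrDelay, data = airline))

  # Moving to another platform means changing the compute context (and data
  # source), not the analysis call. Placeholder only, not run here:
  # rxSetComputeContext(RxInSqlServer(connectionString = "<your connection string>"))
} else {
  message("RevoScaleR is not installed; skipping the compute-context sketch.")
}
```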
RevoScaleR
The RevoScaleR/ScaleR package provides efficient, scalable computational power and
allows for the development of ready-to-deploy suites of data processing and analytics
with R.
To learn more, look for the RevoScaleR “rx” analysis and data manipulation functions
and “rxExec” for HPC functionality. If you are computing decision trees, also check out
the included RevoTreeView package that allows you to interactively visualize your
decision trees.
Or open the package help from the R prompt: ?RevoScaleR
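For a feel of the "rx" analysis functions, here is a minimal sketch that fits a linear model on the sample airline data. It assumes RevoScaleR and its bundled AirlineDemoSmall.xdf file are installed; the variable names `ArrDelay` and `DayOfWeek` come from that sample file, and the guard is an addition so the script does nothing on plain R.

```r
# Guard: the rx functions are only available where RevoScaleR is installed.
have_revoscaler <- requireNamespace("RevoScaleR", quietly = TRUE)

if (have_revoscaler) {
  library(RevoScaleR)

  # Sample data set bundled with the package.
  airline <- file.path(rxGetOption("sampleDataDir"), "AirlineDemoSmall.xdf")

  # Inspect the variables without loading the file into memory.
  print(rxGetInfo(airline, getVarInfo = TRUE))

  # Fit a linear model of arrival delay on day of week, reading the file in chunks.
  fit <- rxLinMod(ArrDelay ~ DayOfWeek, data = airline)
  print(summary(fit))
} else {
  message("RevoScaleR is not installed; skipping the rxLinMod sketch.")
}
```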
ConnectR
The RevoScaleR package provides a way for you to connect with the data you may have stored in
a variety of formats: SAS, SPSS, Teradata, ODBC, delimited and fixed format text, and Hadoop
Distributed File System (HDFS) text files. You have a choice of:
1. keeping the data as is and analyzing it directly with RevoScaleR analysis functions,
2. extracting the data you want to analyze and storing it in the efficient and higher
performance .xdf file format provided with the RevoScaleR package, or
3. bringing some or all of your data into memory as an R data frame to use with any R analysis
function.
To learn more, look for data sources in the RevoScaleR package.
Note: The RevoScaleR package is included with every distribution of RRE/R Server, and is
automatically loaded into memory when you start the program. So all of the “rx” functions
mentioned are at your fingertips.
You can get information on them by using the ? at the command line, for example: ?rxLinMod
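The three data-handling choices described for RevoScaleR data sources can be sketched as follows for a delimited text file. This assumes RevoScaleR and its bundled AirlineDemoSmall.csv sample file are installed; the `missingValueString = "M"` setting reflects how that sample file is understood to mark missing values, and the temporary .xdf path is an illustrative choice.

```r
# Guard: data source functions are only available where RevoScaleR is installed.
have_revoscaler <- requireNamespace("RevoScaleR", quietly = TRUE)

if (have_revoscaler) {
  library(RevoScaleR)
  csv_path <- file.path(rxGetOption("sampleDataDir"), "AirlineDemoSmall.csv")

  # Choice 1: keep the data as is and analyze the text file directly.
  text_source <- RxTextData(csv_path, missingValueString = "M")
  print(rxSummary(~ ArrDelay, data = text_source))

  # Choice 2: extract the data into the .xdf format, then analyze that file.
  xdf_path <- tempfile(fileext = ".xdf")
  rxImport(inData = text_source, outFile = xdf_path, overwrite = TRUE)
  print(rxSummary(~ ArrDelay, data = xdf_path))

  # Choice 3: bring the data into memory as an ordinary R data frame.
  airline_df <- rxImport(inData = text_source)
  print(head(airline_df))
} else {
  message("RevoScaleR is not installed; skipping the data source sketch.")
}
```

Choice 3 suits data that fits in memory; the .xdf route in choice 2 is the one intended for files too large for a data frame.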
DevelopR
Microsoft R Services provides a tool for the R developer to efficiently create sets of R
scripts—the R Productivity Environment (RPE).
Working on a Windows workstation with the RPE, the R developer has a full-featured
Visual Studio-like integrated development environment for R, including an
indispensable visual debugger for R. The RPE has a customizable workspace, including
an enhanced Script Editor, an Object Browser, a Solution Explorer, and an R Command
Console.
DeployR
The optional DeployR package provides the tools for integrating R with third-party
products; it is a full-featured web services software development kit for R which
allows programmers to use Java, JavaScript or .NET to integrate the R analysis output
with a third-party package.
There are now Accelerators for DeployR, which are starter kits for integrating with tools
including:
- Microsoft Excel
- Tableau
- Jaspersoft
- QlikView
R Server User Interface
Resources
R Services 2016 Getting Started Guide:
https://packages.revolutionanalytics.com/doc/8.0.0/win/MicrosoftRServices_Getting_Started.pdf
Webinar “Using Microsoft R Server to Address Scalability Issues in R”:
https://channel9.msdn.com/blogs/Cloud-and-Enterprise-Premium/Using-Microsoft-R-Server-to-Address-Scalability-Issues-in-R
Task Views are guides on CRAN that group sets of R packages and functions by type of
analysis, fields, or methodologies. You can browse and find packages organized by task
view:
https://mran.microsoft.com/taskview/
Software available to NU students:
http://www.it.northwestern.edu/software/
https://northwestern.onthehub.com/WebStore/Welcome.aspx
https://www.dreamspark.com/Student/Software-Catalog.aspx
References
Gartner. (2015). IT Market Clock for Programming Languages, 2015. [Diagram]. Retrieved from
https://www.gartner.com/doc/3145117/it-market-clock-programming-languages
Gartner. (2015). IT Market Clock for Programming Languages, 2015. [pdf]. Retrieved from
https://www.gartner.com/doc/3145117/it-market-clock-programming-languages
Microsoft. (2016). The Benefits of Multithreaded Performance with Microsoft R Open. [webpage]. Retrieved from
https://mran.microsoft.com/documents/rro/multithread/
Microsoft. (2016). R Services 2016 Getting Started Guide. [pdf]. Retrieved from
https://packages.revolutionanalytics.com/doc/8.0.0/win/MicrosoftRServices_Getting_Started.pdf

Más contenido relacionado

La actualidad más candente

Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedIs Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Revolution Analytics
 
What's New in Revolution R Enterprise 6.2
What's New in Revolution R Enterprise 6.2What's New in Revolution R Enterprise 6.2
What's New in Revolution R Enterprise 6.2
Revolution Analytics
 
High Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and HadoopHigh Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and Hadoop
Revolution Analytics
 

La actualidad más candente (20)

R at Microsoft
R at MicrosoftR at Microsoft
R at Microsoft
 
Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics? Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics?
 
Predicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per SecondPredicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per Second
 
R and Data Science
R and Data ScienceR and Data Science
R and Data Science
 
High Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and HadoopHigh Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and Hadoop
 
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedIs Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
 
Model Building with RevoScaleR: Using R and Hadoop for Statistical Computation
Model Building with RevoScaleR: Using R and Hadoop for Statistical ComputationModel Building with RevoScaleR: Using R and Hadoop for Statistical Computation
Model Building with RevoScaleR: Using R and Hadoop for Statistical Computation
 
What's New in Revolution R Enterprise 6.2
What's New in Revolution R Enterprise 6.2What's New in Revolution R Enterprise 6.2
What's New in Revolution R Enterprise 6.2
 
High Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and HadoopHigh Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and Hadoop
 
High Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and HadoopHigh Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and Hadoop
 
Basics of Digital Design and Verilog
Basics of Digital Design and VerilogBasics of Digital Design and Verilog
Basics of Digital Design and Verilog
 
Revolution R Enterprise - Portland R User Group, November 2013
Revolution R Enterprise - Portland R User Group, November 2013Revolution R Enterprise - Portland R User Group, November 2013
Revolution R Enterprise - Portland R User Group, November 2013
 
The R Ecosystem
The R EcosystemThe R Ecosystem
The R Ecosystem
 
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...
 
Data Science At Zillow
Data Science At ZillowData Science At Zillow
Data Science At Zillow
 
R at Microsoft
R at MicrosoftR at Microsoft
R at Microsoft
 
New Advances in High Performance Analytics with R: 'Big Data' Decision Trees ...
New Advances in High Performance Analytics with R: 'Big Data' Decision Trees ...New Advances in High Performance Analytics with R: 'Big Data' Decision Trees ...
New Advances in High Performance Analytics with R: 'Big Data' Decision Trees ...
 
Reproducible Data Science with R
Reproducible Data Science with RReproducible Data Science with R
Reproducible Data Science with R
 
In-Database Analytics Deep Dive with Teradata and Revolution
In-Database Analytics Deep Dive with Teradata and RevolutionIn-Database Analytics Deep Dive with Teradata and Revolution
In-Database Analytics Deep Dive with Teradata and Revolution
 
Open source analytics
Open source analyticsOpen source analytics
Open source analytics
 

Similar a Introduction to Microsoft R Services

Michal Marušan: Scalable R
Michal Marušan: Scalable RMichal Marušan: Scalable R
Michal Marušan: Scalable R
GapData Institute
 
Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...
Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...
Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...
Revolution Analytics
 
Bluegranite AA Webinar FINAL 28JUN16
Bluegranite AA Webinar FINAL 28JUN16Bluegranite AA Webinar FINAL 28JUN16
Bluegranite AA Webinar FINAL 28JUN16
Andy Lathrop
 
05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Ana...
05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Ana...05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Ana...
05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Ana...
Revolution Analytics
 

Similar a Introduction to Microsoft R Services (20)

Michal Marušan: Scalable R
Michal Marušan: Scalable RMichal Marušan: Scalable R
Michal Marušan: Scalable R
 
R as supporting tool for analytics and simulation
R as supporting tool for analytics and simulationR as supporting tool for analytics and simulation
R as supporting tool for analytics and simulation
 
microsoft r server for distributed computing
microsoft r server for distributed computingmicrosoft r server for distributed computing
microsoft r server for distributed computing
 
Advanced analytics with R and SQL
Advanced analytics with R and SQLAdvanced analytics with R and SQL
Advanced analytics with R and SQL
 
Analytics Beyond RAM Capacity using R
Analytics Beyond RAM Capacity using RAnalytics Beyond RAM Capacity using R
Analytics Beyond RAM Capacity using R
 
Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...
Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...
Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...
 
Big Data Analytics with R
Big Data Analytics with RBig Data Analytics with R
Big Data Analytics with R
 
Machine learning services with SQL Server 2017
Machine learning services with SQL Server 2017Machine learning services with SQL Server 2017
Machine learning services with SQL Server 2017
 
TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics...
TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics...TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics...
TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics...
 
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
 
Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...
Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...
Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...
 
Microsoft and Revolution Analytics -- what's the add-value? 20150629
Microsoft and Revolution Analytics -- what's the add-value? 20150629Microsoft and Revolution Analytics -- what's the add-value? 20150629
Microsoft and Revolution Analytics -- what's the add-value? 20150629
 
Software Archaeology with RDz and RAA
Software Archaeology with RDz and RAASoftware Archaeology with RDz and RAA
Software Archaeology with RDz and RAA
 
Introduction to R and R Studio
Introduction to R and R StudioIntroduction to R and R Studio
Introduction to R and R Studio
 
Using R services with Machine Learning
Using R services with Machine LearningUsing R services with Machine Learning
Using R services with Machine Learning
 
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
 
Bluegranite AA Webinar FINAL 28JUN16
Bluegranite AA Webinar FINAL 28JUN16Bluegranite AA Webinar FINAL 28JUN16
Bluegranite AA Webinar FINAL 28JUN16
 
Red Hat Summit 2017 - Intro to SQL Server on RHEL and Open Shift
Red Hat Summit 2017 - Intro to SQL Server on RHEL and Open ShiftRed Hat Summit 2017 - Intro to SQL Server on RHEL and Open Shift
Red Hat Summit 2017 - Intro to SQL Server on RHEL and Open Shift
 
Revolution R: 100% R and more
Revolution R: 100% R and moreRevolution R: 100% R and more
Revolution R: 100% R and more
 
05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Ana...
05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Ana...05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Ana...
05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Ana...
 

Más de Gregg Barrett

Más de Gregg Barrett (20)

Cirrus: Africa's AI initiative, Proposal 2018
Cirrus: Africa's AI initiative, Proposal 2018Cirrus: Africa's AI initiative, Proposal 2018
Cirrus: Africa's AI initiative, Proposal 2018
 
Cirrus: Africa's AI initiative
Cirrus: Africa's AI initiativeCirrus: Africa's AI initiative
Cirrus: Africa's AI initiative
 
Applied machine learning: Insurance
Applied machine learning: InsuranceApplied machine learning: Insurance
Applied machine learning: Insurance
 
Road and Track Vehicle - Project Document
Road and Track Vehicle - Project DocumentRoad and Track Vehicle - Project Document
Road and Track Vehicle - Project Document
 
Modelling the expected loss of bodily injury claims using gradient boosting
Modelling the expected loss of bodily injury claims using gradient boostingModelling the expected loss of bodily injury claims using gradient boosting
Modelling the expected loss of bodily injury claims using gradient boosting
 
Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?
 
Revenue Generation Ideas for Tesla Motors
Revenue Generation Ideas for Tesla MotorsRevenue Generation Ideas for Tesla Motors
Revenue Generation Ideas for Tesla Motors
 
Data science unit introduction
Data science unit introductionData science unit introduction
Data science unit introduction
 
Social networking brings power
Social networking brings powerSocial networking brings power
Social networking brings power
 
Procurement can be exciting
Procurement can be excitingProcurement can be exciting
Procurement can be exciting
 
Machine Learning Approaches to Brewing Beer
Machine Learning Approaches to Brewing BeerMachine Learning Approaches to Brewing Beer
Machine Learning Approaches to Brewing Beer
 
A note to Data Science and Machine Learning managers
A note to Data Science and Machine Learning managersA note to Data Science and Machine Learning managers
A note to Data Science and Machine Learning managers
 
Quick Introduction: To run a SQL query on the Chicago Employee Data, using Cl...
Quick Introduction: To run a SQL query on the Chicago Employee Data, using Cl...Quick Introduction: To run a SQL query on the Chicago Employee Data, using Cl...
Quick Introduction: To run a SQL query on the Chicago Employee Data, using Cl...
 
Efficient equity portfolios using mean variance optimisation in R
Efficient equity portfolios using mean variance optimisation in REfficient equity portfolios using mean variance optimisation in R
Efficient equity portfolios using mean variance optimisation in R
 
Hadoop Overview
Hadoop OverviewHadoop Overview
Hadoop Overview
 
Variable selection for classification and regression using R
Variable selection for classification and regression using RVariable selection for classification and regression using R
Variable selection for classification and regression using R
 
Diabetes data - model assessment using R
Diabetes data - model assessment using RDiabetes data - model assessment using R
Diabetes data - model assessment using R
 
Insurance metrics overview
Insurance metrics overviewInsurance metrics overview
Insurance metrics overview
 
Review of mit sloan management review case study on analytics at Intermountain
Review of mit sloan management review case study on analytics at IntermountainReview of mit sloan management review case study on analytics at Intermountain
Review of mit sloan management review case study on analytics at Intermountain
 
Example: movielens data with mahout
Example: movielens data with mahoutExample: movielens data with mahout
Example: movielens data with mahout
 

Último

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Último (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

Introduction to Microsoft R Services

  • 1. An introduction to Microsoft R Services Microsoft R Open and Microsoft R Server 498 – Show and Tell Gregg Barrett
  • 2. Introduction This presentation will briefly cover the following: - Why consider MRO and R Server - R Server - MRO - Microsoft R Services/R Server Platform - DistributedR - RevoScaleR/ScaleR - ConnectR - DevelopR - DeployR - Resources - References
  • 3. Why consider MRO and R Server - You get the optionality of working with R and the added benefits of Microsoft R Open (MRO) and R Server - Performance - MRO is FREE - R Server is FREE – well for students at least through DreamSpark
  • 4. Why consider MRO and R Server (Gartner, 2015)
  • 5. Definition: Originally released in 1993, R is a mature, domain-specific and open-sourced language for statistical analysis workloads. Trend Analysis: Gartner client inquiry levels for R remain light and range from exploratory to best-practice adopter themes; however, like MATLAB, the number of inquiries has increased substantially in recent years. External data sources reflect a growth in R usage across the industry as well. We expect inquiry levels to increase consistently through 2017. Time to Next Market Phase: 2 to 5 years Business Impact: The significant impact of "big data" analytics and real-time data analysis is driving demand for languages such as R and MATLAB beyond previous entrenched market niches and into increasingly mainstream programming workloads. In particular, adopters are turning to R as a free alternative to platforms such as SAS and SPSS. User Advice: Consider R as a free and open-source solution for workloads that require advanced statistical computing or data mining capabilities with minimal coding and optimal maintenance costs over more general- purpose languages. Sample Vendors: Microsoft, Oracle, TIBCO Software, IBM, Wolfram Research (Gartner, 2015) Why consider MRO and R Server
  • 6. Why consider MRO and R Server (Microsoft, 2016)
  • 7. R Server - Revolution R Enterprise (RRE) was developed by Revolution Analytics - RRE is intended to offer a fast, cost effective enterprise-class big data analytics platform - Revolution Analytics was acquired by Microsoft - RRE is now Microsoft R Server - R Server is free for students and can be obtained through DreamSpark - Logon or create a profile at DreamSpark using your university credentials: https://www.dreamspark.com/Product/Product.aspx?productid=105
  • 8. - RRE uses an R engine called Revolution R Open - The Revolution R Open engine is now called Microsoft R Open (MRO) - MRO is intended to be an enhanced distribution of open source R from Microsoft Corporation. Specifically Microsoft R Open leverages high-performance, multi- threaded math libraries to deliver performance boosts. This means that functions in R that use, for example, matrix multiplication, will run faster out of the box. - Just like R, Microsoft R Open is open source and free - You can download MRO here: https://mran.revolutionanalytics.com/download/ - MRO is intended to support a variety of big data statistics, predictive modelling, and machine learning capabilities - At the time of this writing the latest version of MRO is version 3.2.5 MRO
  • 9. - It is important to note that R Server uses a different version of MRO - At the time of this writing the latest version of MRO for R Server is version 3.2.2 - MRO for R Server can be found here: https://mran.revolutionanalytics.com/download/mro-for-mrs/ - MRO for R Server is a prerequisite for R Server - After downloading and installing MRO whether it be the version for R Server or not, download and install MKL - MKL is the Intel Math Kernel Library - Important: Install Microsoft R Open first before MKL MRO
  • 10. Microsoft R Services/R Server Platform Note: There are name changes due to the Microsoft acquisition with the “Revo” designation/reference falling away – making things a little more challenging.
  • 11. Microsoft R Services/R Server Platform
    Microsoft R Services is positioned as R for the Enterprise. The feature set provided by the Microsoft R Services software can be categorized as follows:
    - Microsoft R Open: High performance math libraries installed on top of a stable version of open source R
    - DistributedR: Parallel and distributed computing framework for Big Data Analytics
    - RevoScaleR/ScaleR: High performance, scalable, parallelized and distributable Big Data Analytics in R
    - ConnectR: Data connections for Big Data Analytics
    - DevelopR: An integrated development environment (IDE) for R on Windows
    - DeployR: A web services software development kit for integrating R with third party products (including business intelligence, data visualization, rules engines, etc.)
  • 12. DistributedR
    DistributedR allows you to run the same R script on multiple platforms: you can create a model in one environment, such as a workstation, and then deploy it in a different environment, such as an on-site Microsoft SQL Server, a Teradata platform, or a Hadoop cluster in the cloud. You just need to specify where the computations should be performed and what data should be analyzed. For information on supported computing environments, look for the “compute contexts” in the RevoScaleR package.
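Specifying where computations run might look like the following. This is a rough sketch that assumes the RevoScaleR package from Microsoft R Server is installed; the whole block is guarded so it does nothing harmful in plain R. The SQL Server connection string is a placeholder for illustration and is left commented out, and the variable `ran_revoscaler` is a hypothetical name used only here.

```r
# Switch compute contexts with RevoScaleR, if it is available.
if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  library(RevoScaleR)

  # Run rx* analyses in the local R session.
  rxSetComputeContext(RxLocalSeq())

  # To send the same rx* calls to SQL Server, only the context changes.
  # The connection string below is a placeholder, not a working server.
  # sql_cc <- RxInSqlServer(
  #   connectionString = "Driver=SQL Server;Server=<server>;Database=<db>;Trusted_Connection=True"
  # )
  # rxSetComputeContext(sql_cc)

  print(rxGetComputeContext())    # show the currently active context
  ran_revoscaler <- TRUE
} else {
  message("RevoScaleR is not installed; it ships with Microsoft R Server.")
  ran_revoscaler <- FALSE
}
```

The analysis code itself stays the same across contexts; only the object passed to `rxSetComputeContext` differs.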
  • 13. RevoScaleR
    The RevoScaleR/ScaleR package provides efficient, scalable computational power and allows for the development of ready-to-deploy suites of data processing and analytics with R. To learn more, look for the RevoScaleR “rx” analysis and data manipulation functions and “rxExec” for HPC functionality. If you are computing decision trees, also check out the included RevoTreeView package, which allows you to interactively visualize your decision trees. To open the package help, run the following at the R console: ?RevoScaleR
  • 14. ConnectR
    The RevoScaleR package provides a way for you to connect with the data you may have stored in a variety of formats: SAS, SPSS, Teradata, ODBC, delimited and fixed format text, and Hadoop Distributed File System (HDFS) text files. You have a choice of:
    1. keeping the data as is and analyzing it directly with RevoScaleR analysis functions,
    2. extracting the data you want to analyze and storing it in the efficient and higher performance .xdf file format provided with the RevoScaleR package, or
    3. bringing some or all of your data into memory as an R data frame to use with any R analysis function.
    To learn more, look for data sources in the RevoScaleR package.
    Note: The RevoScaleR package is included with every distribution of RRE/R Server, and is automatically loaded into memory when you start the program. So all of the “rx” functions mentioned are at your fingertips. You can get information on them by using the ? at the command line, for example: ?rxLinMod
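The three choices listed on this slide can be sketched as follows. This is an illustration under assumptions rather than a definitive workflow: it uses the built-in `mtcars` data written to a temporary CSV file as a stand-in for real data, the file paths and variable names are hypothetical, and the RevoScaleR calls run only if that package (shipped with R Server) is installed.

```r
# Write a small delimited text file to act as the stored data.
csv_path <- tempfile(fileext = ".csv")
xdf_path <- tempfile(fileext = ".xdf")
write.csv(mtcars, csv_path, row.names = FALSE)

if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  library(RevoScaleR)

  # Choice 1: keep the data as is and analyze the text file directly.
  text_source <- RxTextData(file = csv_path)
  print(rxSummary(~ mpg, data = text_source))

  # Choice 2: extract the data into an .xdf file, then analyze that.
  xdf_source <- rxImport(inData = csv_path, outFile = xdf_path,
                         overwrite = TRUE)
  print(rxLinMod(mpg ~ wt, data = xdf_source))

  # Choice 3: bring the data into memory as an ordinary data frame
  # (rxImport returns a data frame when no outFile is given).
  cars_df <- rxImport(inData = csv_path)
  print(lm(mpg ~ wt, data = cars_df))
} else {
  message("RevoScaleR is not installed; it ships with Microsoft R Server.")
}
```

Choice 3 is the only one that requires the data to fit in memory, which is why the .xdf route is suggested for larger data sets.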
  • 15. DevelopR
    Microsoft R Services provides a tool for the R developer to efficiently create sets of R scripts: the R Productivity Environment (RPE). Working on a Windows workstation with the RPE, the R developer has a full-featured, Visual Studio-like integrated development environment for R, including an indispensable visual debugger for R. The RPE has a customizable workspace, including an enhanced Script Editor, an Object Browser, a Solution Explorer, and an R Command Console.
  • 16. DeployR
    The optional DeployR package provides the tools for integrating R with third party products. It is a full-featured web services software development kit for R that allows programmers to use Java, JavaScript or .NET to integrate the R analysis output with a third party package. There are now Accelerators for DeployR, which are starter kits for integrating with tools including:
    - Microsoft Excel
    - Tableau
    - Jaspersoft
    - QlikView
  • 17. R Server User Interface
  • 18. Resources
    - R Services 2016 Getting Started Guide: https://packages.revolutionanalytics.com/doc/8.0.0/win/MicrosoftRServices_Getting_Started.pdf
    - Webinar “Using Microsoft R Server to Address Scalability Issues in R”: https://channel9.msdn.com/blogs/Cloud-and-Enterprise-Premium/Using-Microsoft-R-Server-to-Address-Scalability-Issues-in-R
    - Task Views are guides on CRAN that group sets of R packages and functions by type of analysis, field, or methodology. You can browse and find packages organized by task view: https://mran.microsoft.com/taskview/
  • 19. Resources
    Software available to NU students:
    - http://www.it.northwestern.edu/software/
    - https://northwestern.onthehub.com/WebStore/Welcome.aspx
    - https://www.dreamspark.com/Student/Software-Catalog.aspx
  • 20. References
    - Gartner. (2015). IT Market Clock for Programming Languages, 2015. [Diagram]. Retrieved from https://www.gartner.com/doc/3145117/it-market-clock-programming-languages
    - Gartner. (2015). IT Market Clock for Programming Languages, 2015. [pdf]. Retrieved from https://www.gartner.com/doc/3145117/it-market-clock-programming-languages
    - Microsoft. (2016). The Benefits of Multithreaded Performance with Microsoft R Open. [webpage]. Retrieved from https://mran.microsoft.com/documents/rro/multithread/
    - Microsoft. (2016). R Services 2016 Getting Started Guide. [pdf]. Retrieved from https://packages.revolutionanalytics.com/doc/8.0.0/win/MicrosoftRServices_Getting_Started.pdf