1. An introduction to Microsoft R Services
Microsoft R Open and Microsoft R Server
498 – Show and Tell Gregg Barrett
2. Introduction
This presentation will briefly cover the following:
- Why consider MRO and R Server
- R Server
- MRO
- Microsoft R Services/R Server Platform
- DistributedR
- RevoScaleR/ScaleR
- ConnectR
- DevelopR
- DeployR
- Resources
- References
3. Why consider MRO and R Server
- You keep the option of working with R and gain the added benefits of
Microsoft R Open (MRO) and R Server
- Performance
- MRO is FREE
- R Server is FREE, at least for students, through DreamSpark
5. Why consider MRO and R Server
Definition: Originally released in 1993, R is a mature, domain-specific and open-sourced language for statistical
analysis workloads.
Trend Analysis: Gartner client inquiry levels for R remain light and range from exploratory to best-practice
adopter themes; however, like MATLAB, the number of inquiries has increased substantially in recent years.
External data sources reflect a growth in R usage across the industry as well. We expect inquiry levels to
increase consistently through 2017.
Time to Next Market Phase: 2 to 5 years
Business Impact: The significant impact of "big data" analytics and real-time data analysis is driving demand for
languages such as R and MATLAB beyond previous entrenched market niches and into increasingly mainstream
programming workloads. In particular, adopters are turning to R as a free alternative to platforms such as SAS
and SPSS.
User Advice: Consider R as a free and open-source solution for workloads that require advanced statistical
computing or data mining capabilities with minimal coding and optimal maintenance costs over more
general-purpose languages.
Sample Vendors: Microsoft, Oracle, TIBCO Software, IBM, Wolfram Research (Gartner, 2015)
7. R Server
- Revolution R Enterprise (RRE) was developed by Revolution Analytics
- RRE is intended to offer a fast, cost-effective, enterprise-class big data
analytics platform
- Revolution Analytics was acquired by Microsoft
- RRE is now Microsoft R Server
- R Server is free for students and can be obtained through DreamSpark
- Log on or create a profile at DreamSpark using your university credentials:
https://www.dreamspark.com/Product/Product.aspx?productid=105
8. MRO
- RRE uses an R engine called Revolution R Open
- The Revolution R Open engine is now called Microsoft R Open (MRO)
- MRO is intended to be an enhanced distribution of open source R from Microsoft
Corporation. Specifically, Microsoft R Open leverages high-performance,
multi-threaded math libraries to deliver performance boosts. This means that functions
in R that use, for example, matrix multiplication, will run faster out of the box.
- Just like R, Microsoft R Open is open source and free
- You can download MRO here:
https://mran.revolutionanalytics.com/download/
- MRO is intended to support a variety of big data statistics, predictive modelling,
and machine learning capabilities
- At the time of this writing the latest version of MRO is version 3.2.5
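One rough way to check the matrix multiplication claim on your own machine is to time the same operation under open source R and under MRO. The sketch below uses only base R; the matrix size and the variable names are illustrative assumptions, not taken from the slides, and the elapsed time will vary by hardware and by the math library in use.

```r
# Time a dense matrix multiplication. With a multi-threaded math library
# (such as MRO with MKL) this is the kind of operation expected to speed up.
set.seed(42)
n <- 500                                  # illustrative size, adjust as needed
a <- matrix(rnorm(n * n), nrow = n)
b <- matrix(rnorm(n * n), nrow = n)

timing <- system.time(product <- a %*% b)
print(timing[["elapsed"]])                # seconds; compare across R builds

# Sanity check: the result has the expected shape
stopifnot(all(dim(product) == c(n, n)))
```

Running the same script in both distributions and comparing the elapsed times gives a first impression; a single run is not a rigorous benchmark.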
9. MRO
- It is important to note that R Server uses a different version of MRO
- At the time of this writing the latest version of MRO for R Server is version 3.2.2
- MRO for R Server can be found here:
https://mran.revolutionanalytics.com/download/mro-for-mrs/
- MRO for R Server is a prerequisite for R Server
- After downloading and installing MRO (either the standard version or the version
for R Server), download and install MKL
- MKL is the Intel Math Kernel Library
- Important: Install Microsoft R Open before MKL
10. Microsoft R Services/R Server Platform
Note: Product names are changing following the Microsoft acquisition, and the “Revo” designation is being dropped,
which makes things a little more challenging to follow.
11. Microsoft R Services/R Server Platform
Microsoft R Services is positioned as R for the Enterprise.
The feature set provided by the Microsoft R Services software can be categorized as follows:
- Microsoft R Open: High performance math libraries installed on top of a stable version of
Open Source R
- DistributedR: Parallel and distributed computing framework for Big Data Analytics
- RevoScaleR/ScaleR: High performance, scalable, parallelized and distributable Big Data
Analytics in R
- ConnectR: Data connections for Big Data Analytics
- DevelopR: An integrated development environment (IDE) for R on Windows
- DeployR: A web services software development kit for integrating R with third party products
(including business intelligence, data visualization, rules engines, etc.)
12. DistributedR
DistributedR allows you to run the same R script on multiple platforms; you can create
a model in one environment such as a workstation and then deploy it on a different
environment such as an on-site Microsoft SQL Server, a Teradata platform, or a
Hadoop cluster in the cloud. You just need to specify the information about where
these computations should be performed and what data should be analyzed.
For information on supported computing environments, look for the “compute
contexts” in the RevoScaleR package.
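As a minimal sketch of the compute context idea, the code below keeps the modelling call in one place and changes only where it runs. It assumes the RevoScaleR functions rxSetComputeContext, RxInSqlServer and rxLinMod as documented for R Server; the helper fit_model, the formula, and the connection string placeholder are hypothetical. So that the snippet can also be tried on plain R, it falls back to lm when RevoScaleR is not installed.

```r
# Hypothetical helper: the modelling code stays the same wherever it runs.
fit_model <- function(data) {
  if (requireNamespace("RevoScaleR", quietly = TRUE)) {
    RevoScaleR::rxLinMod(mpg ~ wt + hp, data = data)
  } else {
    lm(mpg ~ wt + hp, data = data)        # plain-R stand-in, not RevoScaleR
  }
}

if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  # Compute on the local machine.
  RevoScaleR::rxSetComputeContext("local")
  # To move the computation elsewhere, a different context would be set and
  # the data argument would point at a data source in that environment, e.g.:
  #   cc <- RevoScaleR::RxInSqlServer(connectionString = "<placeholder>")
  #   RevoScaleR::rxSetComputeContext(cc)
}

model <- fit_model(mtcars)
print(model$coefficients)
```

The point of the design is that only the compute context and the data source change between a workstation and a server; the analysis call itself does not.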
13. RevoScaleR
The RevoScaleR/ScaleR package provides efficient, scalable computational power and
allows for the development of ready-to-deploy suites of data processing and analytics
with R.
To learn more, look for the RevoScaleR “rx” analysis and data manipulation functions
and “rxExec” for HPC functionality. If you are computing decision trees, also check out
the included RevoTreeView package that allows you to interactively visualize your
decision trees.
Or run the following at the R command line: ?RevoScaleR
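To give a feel for the “rxExec” style of HPC functionality, here is a small sketch that runs the same function several times as independent tasks. It assumes rxExec accepts a function and a timesToRun argument and returns a list of results; the task simulate_mean is hypothetical. On plain R without RevoScaleR the snippet falls back to lapply, which is only a sequential stand-in.

```r
# Hypothetical task: the mean of a simulated sample.
simulate_mean <- function(n = 1000) mean(rnorm(n))

set.seed(1)
if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  # Run the task four times under the current compute context.
  results <- RevoScaleR::rxExec(simulate_mean, timesToRun = 4)
} else {
  # Sequential stand-in so the sketch runs without RevoScaleR.
  results <- lapply(1:4, function(i) simulate_mean())
}
str(results)
```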
14. ConnectR
The RevoScaleR package provides a way for you to connect with the data you may have stored in
a variety of formats: SAS, SPSS, Teradata, ODBC, delimited and fixed format text, and Hadoop
Distributed File System (HDFS) text files. You have a choice of:
1. keeping the data as is and analyzing it directly with RevoScaleR analysis functions,
2. extracting the data you want to analyze and storing it in the efficient and higher
performance .xdf file format provided with the RevoScaleR package, or
3. bringing some or all of your data into memory as an R data frame to use with any R analysis
function.
To learn more, look for data sources in the RevoScaleR package.
Note: The RevoScaleR package is included with every distribution of RRE/R Server, and is
automatically loaded into memory when you start the program. So all of the “rx” functions
mentioned are at your fingertips.
You can get information on them by using the ? at the command line, for example: ?rxLinMod
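The three choices above can be sketched roughly as follows for a delimited text file. This assumes the RevoScaleR functions RxTextData, rxSummary, rxImport and rxDataStep as documented for R Server; the file paths and the use of the built-in mtcars data are illustrative only. Without RevoScaleR the snippet falls back to read.csv, which corresponds only to the third choice.

```r
# Write a small CSV file to stand in for data stored on disk.
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)

if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  text_source <- RevoScaleR::RxTextData(csv_path)

  # 1. Keep the data as is and analyze it directly.
  print(RevoScaleR::rxSummary(~ mpg, data = text_source))

  # 2. Extract the data into the .xdf file format.
  xdf_path <- tempfile(fileext = ".xdf")
  xdf_source <- RevoScaleR::rxImport(inData = text_source,
                                     outFile = xdf_path, overwrite = TRUE)

  # 3. Bring the data into memory as an R data frame.
  cars_df <- RevoScaleR::rxDataStep(inData = xdf_source)
} else {
  # Plain-R stand-in for choice 3 only.
  cars_df <- read.csv(csv_path)
}
print(dim(cars_df))
```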
15. DevelopR
Microsoft R Services provides a tool that helps the R developer create sets of R
scripts efficiently: the R Productivity Environment (RPE).
Working on a Windows workstation with the RPE, the R developer has a full-featured
Visual Studio-like integrated development environment for R, including an
indispensable visual debugger for R. The RPE has a customizable workspace, including
an enhanced Script Editor, an Object Browser, a Solution Explorer, and an R Command
Console.
16. DeployR
The optional DeployR package provides the tools for integrating R with third party
products; it is a full-featured web services software development kit for R which allows
programmers to use Java, JavaScript or .NET to integrate the R analysis output with a
third party package.
There are now Accelerators for DeployR which are starter kits for integrating with tools
including:
- Microsoft Excel
- Tableau
- Jaspersoft
- QlikView
18. Resources
R Services 2016 Getting Started Guide:
https://packages.revolutionanalytics.com/doc/8.0.0/win/MicrosoftRServices_Getting_Started.pdf
Webinar “Using Microsoft R Server to Address Scalability Issues in R”:
https://channel9.msdn.com/blogs/Cloud-and-Enterprise-Premium/Using-Microsoft-R-Server-to-Address-Scalability-Issues-in-R
Task Views are guides on CRAN that group sets of R packages and functions by type of
analysis, fields, or methodologies. You can browse and find packages organized by task
view:
https://mran.microsoft.com/taskview/
19. Resources
Software available to NU students:
http://www.it.northwestern.edu/software/
https://northwestern.onthehub.com/WebStore/Welcome.aspx
https://www.dreamspark.com/Student/Software-Catalog.aspx
20. References
Gartner. (2015). IT Market Clock for Programming Languages, 2015. [Diagram]. Retrieved from Gartner. (2015).
IT Market Clock for Programming Languages, 2015. [pdf].
https://www.gartner.com/doc/3145117/it-market-clock-programming-languages
Gartner. (2015). IT Market Clock for Programming Languages, 2015. [pdf]. Retrieved from
https://www.gartner.com/doc/3145117/it-market-clock-programming-languages
Microsoft. (2016). The Benefits of Multithreaded Performance with Microsoft R Open. [webpage]. Retrieved from
https://mran.microsoft.com/documents/rro/multithread/
Microsoft. (2016). R Services 2016 Getting Started Guide. [pdf]. Retrieved from
https://packages.revolutionanalytics.com/doc/8.0.0/win/MicrosoftRServices_Getting_Started.pdf