3. F# R TYPE PROVIDER
F# vs. R
What are Type Providers?
The R Provider
Challenges
Type Provider Growing Pains
Was it worth it?
4. F# VS R
F#                               | R
Functional + OO                  | Functional-ish + Crazy-OO
Compiled                         | Interpreted
Statically typed                 | Dynamically typed
OK for Exploratory Analysis      | Strong for Exploratory Analysis
Well-suited for building systems | Unsuitable for building systems
Weak math/stats libraries        | Strong stats libraries
Basic visualization tools        | Rich visualization tools
Good for data acquisition        | Poor for data acquisition
Good for data processing         | Decent for data processing
Scalable                         | Not particularly scalable
5. MIXING THEM
WHY?
Functionality from .NET libraries
Data acquisition & transformation in F#
Stats/graphics functionality from R libraries
Build robust systems
HOW?
RDotNet provides .NET OO wrapper around R.DLL (in-process)
RCOM provides cross-process access to R session via DCOM
Rserve provides client-server access to an R session over sockets
6. F# TYPE PROVIDERS
A mechanism for “dynamically” providing types to the IDE and compiler
Provided at compile/edit time, based on:
Static parameters (in code)
Access to external resources (database, WSDL, OData)
Downstream code is then statically typed
Good Intellisense experience
Code fragments are generated at compile-time and injected
“Schema” baked into client code
Addresses a significant issue that drives people to dynamic languages
7. TYPE PROVIDERS VS CODE GEN
Generally equivalent, except…
Some problems don't scale with codegen (e.g. Freebase provider)
Simpler process-wise (no additional tool to know/run)
Uniform mechanism for access
Writing a type provider is somewhat simpler and less error-prone
8. THE R PROVIDER
Type Providers can be used for inter-language interop / meta-programming
The “external resource”/schema in this case is the R environment
Make R packages available as .NET namespaces
Make R functions & values available as .NET members
Uses RDotNet
Lightweight, in-process
Results kept in R environment unless explicitly marshaled back
Objects can be explicitly saved and loaded into a real R session if desired.
Available at
http://github.com/BlueMountainCapital/FSharpRProvider
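The points above can be sketched as a short F# script. This is an illustration rather than the presenter's demo code. It assumes R is installed locally, that the RProvider NuGet package can be restored, and that the package layout matches current RProvider releases (R's base package opened as a namespace).

```fsharp
// Assumption: R is installed and RProvider can be restored from NuGet.
#r "nuget: RProvider"

open RProvider
open RProvider.``base``   // R's "base" package, exposed as a .NET namespace

// R.c is R's c() function; the result is a SymbolicExpression
// that stays inside the R engine.
let xs = R.c(1.0, 2.0, 3.0)

// Chain another R call; still nothing is copied back to .NET.
let logs = R.log(xs)

// Explicitly marshal the result back into a .NET array.
let values = logs.GetValue<float[]>()
printfn "%A" values
```

Nothing crosses from R into .NET until GetValue is called, which is what keeps the in-process approach lightweight.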
9. CHALLENGES
How do we bridge dynamic <-> static typing?
Dynamic typing basically just has one static type – Any/Obj/…
.NET-based statically typed languages still have a dynamic type system
Dynamic languages have lightweight syntax for dynamic method dispatch
But R eschews dotted notation for method dispatch
Do the obvious thing – use the “one static type”
All arguments are of type object
Results are of type RDotNet.SymbolicExpression, which keeps the result inside the R engine
Arguments can be native .NET types or SymbolicExpression
10. ARGUMENT PASSING CONVENTIONS
R has named and positional passing styles
R has … argument (params/varargs)
R allows arguments to have default values
A function will be invoked even if no value is supplied and there is no default
Make all arguments optional
These map pretty well onto F# named/optional arguments
Need to expose functions as static members.
Always exposed as RProvider.packagename.R.functionname
Exceptions:
In R, … argument can come before named arguments.
In R, … arguments can be passed using an identifying name.
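The mapping onto F# named/optional arguments can be sketched without R at all. The type below is a hypothetical, hand-written stand-in for the kind of static member the provider exposes (FakeR and Rep are made-up names, not part of RProvider). Every parameter is optional and typed as object, and the member simply reports which arguments the caller supplied.

```fsharp
// Hypothetical stand-in for a provided R function; not RProvider's API.
type FakeR =
    // Every argument is optional and typed as obj, mirroring the slide.
    static member Rep(?x: obj, ?times: obj, ?each: obj) : string =
        [ "x", x; "times", times; "each", each ]
        // Keep only the arguments the caller actually supplied.
        |> List.choose (fun (name, value) ->
            value |> Option.map (fun v -> sprintf "%s=%O" name v))
        |> String.concat ", "
        |> sprintf "rep(%s)"

// Positional style: arguments bind in declaration order.
let positional = FakeR.Rep("a", 3)

// Named style: any subset, in any order, with the rest left unset.
let named = FakeR.Rep(x = "b", each = 2)

printfn "%s" positional   // rep(x=a, times=3)
printfn "%s" named        // rep(x=b, each=2)
```

Optional arguments are only available on members in F#, not on let-bound functions, which is why the functions have to be exposed as static members.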
11. ARGUMENT CONVERSION
Obvious basic type conversions are built in:
Seq<double> -> numeric vector
Double -> numeric vector
Etc.
Lists can be constructed using R.list()
How do we support implicit conversion of bespoke classes?
E.g. we have our own .NET DataFrame type, should convert to R data.frame
Avoid forking the Open Source project
Support plug-ins via Managed Extensibility Framework
Plug-ins can use the type provider to call R functions, or talk to REngine directly
12. RESULT CONVERSION
Results always come back as SymbolicExpression
RDotNet wrapper around the R C datatype SEXPREC
RProvider adds a Value property as extension
Returns the default .NET representation of the SymbolicExpression
Obvious default conversions are built-in – can add/override using MEF plug-in
We also add GetValue<'ResType> : unit -> 'ResType
Allows caller to specify the type they want
Supports things like NumericVector->double when vector is length 1
Can also augment/override using MEF plug-in
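The caller-chosen conversion can be modelled in a few lines. This is a toy model only: ToyRValue and getValue are hypothetical names standing in for RDotNet.SymbolicExpression and the provider's GetValue extension, and the real conversion rules (including MEF plug-ins) are richer than this.

```fsharp
// Toy stand-in for an R value; the real provider works on
// RDotNet.SymbolicExpression.
type ToyRValue =
    | NumericVector of float[]
    | CharacterVector of string[]

// Hypothetical analogue of GetValue<'ResType>: the caller picks the
// target type and the conversion is chosen from it.
let getValue<'ResType> (value: ToyRValue) : 'ResType =
    let target = typeof<'ResType>
    match value with
    | NumericVector xs when target = typeof<float[]> -> unbox<'ResType> (box xs)
    // A length-1 vector may be read as a scalar.
    | NumericVector [| x |] when target = typeof<float> -> unbox<'ResType> (box x)
    | CharacterVector ss when target = typeof<string[]> -> unbox<'ResType> (box ss)
    | CharacterVector [| s |] when target = typeof<string> -> unbox<'ResType> (box s)
    | _ -> invalidOp (sprintf "No conversion from %A to %s" value target.Name)

let single : float = getValue (NumericVector [| 42.0 |])
let whole : float[] = getValue (NumericVector [| 1.0; 2.0 |])
```

Asking for a scalar from a longer vector raises an exception in this sketch. That behaviour is a choice made for the illustration, not a statement about the provider.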
13. TYPE PROVIDER GROWING PAINS
Type providers are an awesome idea
Current implementation has some kinks:
Cannot compile Type Provider while binary is in use by VS (VS keeps it locked)
If Type Providers are dependent on other assemblies they may not get resolved
Accessing slow external resources can slow down your machine
Buggy type providers can crash the IDE or compiler
Builds may fail because of machine configuration:
E.g. you don't have R
You don't have the same packages installed in R
Best to put external resources or schema files in source control
14. WAS IT WORTH IT?
Having integrated, slightly-type-safe access to R from F# Interactive is extremely powerful.
This problem can be solved using code generation
Using the type provider is much more “fluid” – no process
Issues from previous slide detract from that somewhat
Type providers have lots of interesting applications outside data access:
COM interop
WinRT interop
Intra-language meta-programming (if static parameters were more flexible)
Editor's notes
Demo: RStudio
1+2
x = 3
xs = c(1,2,3) (both are vectors)
xs + x
xs > 1
f = function(x) x + 1
Look at f
Look at c
df = data.frame(A=c(1,2,3), B=c(4,5,6))
class(df)
unclass(df)
print(df)
print
print.data.frame
Show RDotNet sample program.
Examples: SQL provider, XML provider, Regex provider, CSV provider, OData provider, file system provider
Demo of F# type provider.
#load script that provides getStockPrices as CSV (save files on disk)
Call getStockPrices on MSFT and show results in visualizer
Call R.log |> R.diff on result and show result
let data = [ for t in tickers -> t, getStockPrices t 255 |> R.log |> R.diff ]
Call R.data_frame on the result
Show pairs plot
Pick a pair and plot against each other
Build lm.