30. dplyr
• Package by Hadley Wickham
• plyr specialised for data frames: faster, and works with remote data stores
• Great design and syntax
• Great performance thanks to C++
31. arrange
Example: arrange by year within each player
arrange(Batting, playerID, yearID)
Unit: milliseconds
   expr       min        lq   median        uq       max neval
     df 186.64016 188.48495 190.8989 192.42140 195.36592    10
     dt 349.25496 352.12806 357.4358 403.45465 405.30055    10
    cpp  12.20485  13.85538  14.0081  16.72979  23.95173    10
   base 181.68259 182.58014 184.6904 186.33794 189.70377    10
 dt_raw 166.94213 170.15704 170.6418 220.89911 223.42155    10
32. filter
Find the year for which each player played the most games
filter(Batting, G == max(G))
Unit: milliseconds
 expr       min        lq    median        uq      max neval
   df 371.96066 375.98652 380.92300 389.78870 430.2898    10
   dt  47.37897  49.39681  51.23722  52.79181  95.8757    10
  cpp  34.63382  35.27462  36.48151  38.30672 106.2422    10
 base 141.81983 144.87670 147.36940 148.67299 173.8763    10
33. summarise
Compute the average number of at bats for each player
summarise(x, ab = mean(AB))
Unit: microseconds
   expr        min         lq     median         uq        max neval
     df 470726.569 475168.481 495500.076 498223.152 502601.494    10
     dt  23002.422  23923.691  25888.191  28517.318  28683.864    10
    cpp    756.265    820.921    838.529    864.624    950.079    10
   base 253189.624 259167.496 263124.650 273097.845 326663.243    10
 dt_raw  22462.560  23469.528  24438.422  25718.549  28385.158    10
34. Vector Visitor
Traversing an R vector of any type with the same interface
class VectorVisitor {
public:
    virtual ~VectorVisitor(){}

    /** hash the element of the visited vector at index i */
    virtual size_t hash(int i) const = 0 ;

    /** are the elements at indices i and j equal */
    virtual bool equal(int i, int j) const = 0 ;

    /** creates a new vector, of the same type as the visited vector, by
     *  copying elements at the given indices
     */
    virtual SEXP subset( const Rcpp::IntegerVector& index ) const = 0 ;
} ;
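To make the interface concrete, here is a minimal sketch in plain C++, assuming std::vector stands in for R vectors. ColumnVisitor, ColumnVisitorImpl and count_distinct are hypothetical names for illustration, not dplyr classes, and subset() is left out to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the slide's interface (not dplyr code).
class ColumnVisitor {
public:
    virtual ~ColumnVisitor() {}
    virtual std::size_t hash(int i) const = 0;
    virtual bool equal(int i, int j) const = 0;
};

// One template covers every element type that std::hash supports.
template <typename T>
class ColumnVisitorImpl : public ColumnVisitor {
public:
    explicit ColumnVisitorImpl(const std::vector<T>& data) : data_(data) {}

    std::size_t hash(int i) const override {
        assert(i >= 0 && static_cast<std::size_t>(i) < data_.size());
        return std::hash<T>()(data_[i]);
    }

    bool equal(int i, int j) const override {
        return data_[i] == data_[j];
    }

private:
    const std::vector<T>& data_;  // the visited column must outlive the visitor
};

// Counts distinct values through the type-erased interface only.
// Quadratic on purpose: the point is the interface, not the algorithm.
inline int count_distinct(const ColumnVisitor& v, int n) {
    std::vector<int> firsts;  // index of the first occurrence of each value
    for (int i = 0; i < n; ++i) {
        bool seen = false;
        for (int j : firsts) {
            if (v.hash(i) == v.hash(j) && v.equal(i, j)) { seen = true; break; }
        }
        if (!seen) firsts.push_back(i);
    }
    return static_cast<int>(firsts.size());
}
```

The caller never needs to know the element type: count_distinct works the same on a column of integers or of strings.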
35. Vector Visitor
inline VectorVisitor* visitor( SEXP vec ){
    switch( TYPEOF(vec) ){
    case INTSXP:
        if( Rf_inherits( vec, "factor" ) )
            return new FactorVisitor( vec ) ;
        return new VectorVisitorImpl<INTSXP>( vec ) ;
    case REALSXP:
        if( Rf_inherits( vec, "Date" ) )
            return new DateVisitor( vec ) ;
        if( Rf_inherits( vec, "POSIXct" ) )
            return new POSIXctVisitor( vec ) ;
        return new VectorVisitorImpl<REALSXP>( vec ) ;
    case LGLSXP: return new VectorVisitorImpl<LGLSXP>( vec ) ;
    case STRSXP: return new VectorVisitorImpl<STRSXP>( vec ) ;
    default: break ;
    }
    // should not happen
    return 0 ;
}
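The same dispatch pattern can be sketched without R headers. This is a variation, not dplyr's code: Column, ColumnType, Visitor and make_visitor are hypothetical names, the is_factor flag plays the role of Rf_inherits(vec, "factor"), and std::unique_ptr replaces the raw pointer so the caller does not have to delete the visitor.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical type tag, standing in for TYPEOF(vec).
enum class ColumnType { Integer, Real, Logical, String, List };

struct Column {
    ColumnType type;
    bool is_factor;  // stands in for Rf_inherits(vec, "factor")
};

class Visitor {
public:
    virtual ~Visitor() {}
    virtual std::string name() const = 0;
};

// Generic visitor for a plain storage type.
class PlainVisitor : public Visitor {
public:
    explicit PlainVisitor(const std::string& n) : name_(n) {}
    std::string name() const override { return name_; }
private:
    std::string name_;
};

// Specialised visitor: same storage type as integer, different behaviour.
class FactorLikeVisitor : public Visitor {
public:
    std::string name() const override { return "factor"; }
};

// Dispatch first on storage type, then on the class attribute.
inline std::unique_ptr<Visitor> make_visitor(const Column& col) {
    switch (col.type) {
    case ColumnType::Integer:
        if (col.is_factor)
            return std::unique_ptr<Visitor>(new FactorLikeVisitor());
        return std::unique_ptr<Visitor>(new PlainVisitor("integer"));
    case ColumnType::Real:
        return std::unique_ptr<Visitor>(new PlainVisitor("real"));
    case ColumnType::Logical:
        return std::unique_ptr<Visitor>(new PlainVisitor("logical"));
    case ColumnType::String:
        return std::unique_ptr<Visitor>(new PlainVisitor("string"));
    default:
        break;
    }
    return nullptr;  // unsupported type: the caller must check
}
```

Returning a null pointer for unsupported types mirrors the slide's "should not happen" branch. Throwing an exception would be another reasonable choice.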
36. Chunked evaluation
ir <- group_by(iris, Species)
summarise(ir,
  Sepal.Length = mean(Sepal.Length)
)
• R expression to evaluate: mean(Sepal.Length)
• Sepal.Length ∊ iris
• dplyr knows mean.
• Fast and memory-efficient algorithm
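One way to picture "dplyr knows mean": the mean is computed directly in C++, one chunk of row indices per group, without calling back into R. A minimal sketch, assuming each group is just a list of row indices. chunk_mean and grouped_mean are hypothetical names, not dplyr functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mean of one chunk of `column`, selected by row indices.
// The chunk is read in place; no per-group copy of the data is made.
inline double chunk_mean(const std::vector<double>& column,
                         const std::vector<int>& rows) {
    assert(!rows.empty());
    double total = 0.0;
    for (int r : rows) total += column[r];
    return total / rows.size();
}

// One result per group, in group order.
inline std::vector<double> grouped_mean(
        const std::vector<double>& column,
        const std::vector<std::vector<int>>& groups) {
    std::vector<double> out;
    out.reserve(groups.size());
    for (const auto& rows : groups) out.push_back(chunk_mean(column, rows));
    return out;
}
```

Because only indices are passed around, memory use stays proportional to the number of groups rather than to the size of the data.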
37. Hybrid evaluation
myfun <- function(x) x+x
ir <- group_by(iris, Species)
summarise(ir,
  xxx = mean(Sepal.Length) + min(Sepal.Width) - myfun(Sepal.Length)
)
For the setosa group:
#1: fast evaluation of mean(Sepal.Length).
  5.006 + min(Sepal.Width) - myfun(Sepal.Length)
#2: fast evaluation of min(Sepal.Width).
  5.006 + 2.3 - myfun(Sepal.Length)
#3: fast evaluation of 5.006 + 2.3.
  7.306 - myfun(Sepal.Length)
#4: R evaluation of 7.306 - myfun(Sepal.Length).
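The stepwise reduction can be imitated with a toy expression tree: known calls are replaced by constants, constant arithmetic is folded, and whatever remains is left for the general evaluator. This is only a sketch under simplifying assumptions (one numeric column, only + and -). Expr, node, eval_known, fold and show are hypothetical names, not dplyr internals.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <numeric>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical mini expression tree; none of these names come from dplyr.
struct Expr {
    enum Kind { Constant, Call, Binary };
    Kind kind;
    double value;                    // used by Constant
    std::string name;                // function name (Call) or operator (Binary)
    std::shared_ptr<Expr> lhs, rhs;  // used by Binary
};
typedef std::shared_ptr<Expr> ExprPtr;

inline ExprPtr node(Expr::Kind kind, double value, const std::string& name,
                    ExprPtr lhs = ExprPtr(), ExprPtr rhs = ExprPtr()) {
    ExprPtr e = std::make_shared<Expr>();
    e->kind = kind; e->value = value; e->name = name;
    e->lhs = lhs; e->rhs = rhs;
    return e;
}

// Functions the fast path knows; anything else is left for the slow path.
inline bool eval_known(const std::string& fn,
                       const std::vector<double>& chunk, double& out) {
    assert(!chunk.empty());
    if (fn == "mean") {
        out = std::accumulate(chunk.begin(), chunk.end(), 0.0) / chunk.size();
        return true;
    }
    if (fn == "min") {
        out = *std::min_element(chunk.begin(), chunk.end());
        return true;
    }
    return false;
}

// Replace known calls by constants and fold constant + and -, bottom-up.
inline ExprPtr fold(const ExprPtr& e, const std::vector<double>& chunk) {
    if (e->kind == Expr::Call) {
        double v = 0.0;
        return eval_known(e->name, chunk, v) ? node(Expr::Constant, v, "") : e;
    }
    if (e->kind == Expr::Binary) {
        ExprPtr l = fold(e->lhs, chunk), r = fold(e->rhs, chunk);
        if (l->kind == Expr::Constant && r->kind == Expr::Constant) {
            double v = (e->name == "+") ? l->value + r->value
                                        : l->value - r->value;
            return node(Expr::Constant, v, "");
        }
        return node(Expr::Binary, 0.0, e->name, l, r);
    }
    return e;
}

// Prints what is left for the general evaluator.
inline std::string show(const ExprPtr& e) {
    if (e->kind == Expr::Constant) {
        std::ostringstream os;
        os << e->value;
        return os.str();
    }
    if (e->kind == Expr::Call) return e->name + "(x)";
    return show(e->lhs) + " " + e->name + " " + show(e->rhs);
}
```

After folding, only the call to the unknown function is left in the tree, which corresponds to step #4 on the slide.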
38. Hybrid Evaluation
• mean, min, max, sum, sd, var, n, +, -, /, *, <, >, <=, >=, &&, ||
• Packages can register their own hybrid evaluation handler.
• See the hybrid-evaluation vignette.
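Handler registration can be pictured as a lookup table from function name to a C++ implementation. The sketch below is an assumption about the general shape only: Handler, registry, register_handler and find_handler are hypothetical names, and the real dplyr API is described in the vignette.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical handler type: reduces one chunk of a column to a single value.
typedef std::function<double(const std::vector<double>&)> Handler;

// Single shared table, created on first use.
inline std::map<std::string, Handler>& registry() {
    static std::map<std::string, Handler> handlers;
    return handlers;
}

// Called by a package that wants its function handled on the fast path.
inline void register_handler(const std::string& name, Handler h) {
    registry()[name] = h;
}

// Returns the handler, or nullptr so the caller can fall back to R evaluation.
inline const Handler* find_handler(const std::string& name) {
    std::map<std::string, Handler>::const_iterator it = registry().find(name);
    return it == registry().end() ? nullptr : &it->second;
}
```

A null result from find_handler is the signal to fall back to ordinary R evaluation for that call.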
41. C++11 : lambda
Lambda: a function defined where it is used. Similar to the apply functions in R.
// [[Rcpp::export]]
NumericVector foo( NumericVector v ){
    NumericVector res = sapply( v,
        [](double x){ return x*x; }
    ) ;
    return res ;
}
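The same idea works in standard C++ without Rcpp. A minimal sketch, assuming std::vector in place of NumericVector and std::transform in place of sapply. squares and count_above are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Squares every element; the lambda is defined right where it is used.
inline std::vector<double> squares(const std::vector<double>& v) {
    std::vector<double> res(v.size());
    std::transform(v.begin(), v.end(), res.begin(),
                   [](double x) { return x * x; });
    return res;
}

// A lambda can also capture local variables, here `threshold` by value.
inline long count_above(const std::vector<double>& v, double threshold) {
    return std::count_if(v.begin(), v.end(),
                         [threshold](double x) { return x > threshold; });
}
```

The capture list ([threshold]) is what lets the lambda see local variables, much as an anonymous function passed to sapply can see variables of its enclosing environment in R.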
42. C++11 : for each loop
C++98, C++03
std::vector<double> v ;
for( int i=0; i<v.size(); i++){
    double d = v[i] ;
    // do something with d
}
C++11
for( double d: v){
    // do stuff with d
}
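A small compilable sketch of the C++11 form. The function names are illustrative. Note the difference between iterating by value (a copy of each element) and by reference (the element itself).

```cpp
#include <cassert>
#include <vector>

// By value: d is a copy of each element, so the vector is only read.
inline double total(const std::vector<double>& v) {
    double sum = 0.0;
    for (double d : v) sum += d;
    return sum;
}

// By reference: d is the element itself, so the vector is modified.
inline void double_in_place(std::vector<double>& v) {
    for (double& d : v) d *= 2.0;
}
```

The range-based loop has no index variable, so the index-related slips that are easy to make in the C++98 form cannot happen.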
43. C++11 : init list
C++98, C++03
NumericVector x = NumericVector::create( 1, 2 ) ;
C++11
NumericVector x = {1, 2} ;
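Brace initialization is a general C++11 feature, not something specific to Rcpp11. A sketch with standard containers, where sum_of and make_two are illustrative names:

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// A function can accept a braced list directly via std::initializer_list.
inline double sum_of(std::initializer_list<double> values) {
    double s = 0.0;
    for (double v : values) s += v;
    return s;
}

// Standard containers are built from a braced list the same way.
inline std::vector<double> make_two() {
    std::vector<double> x = {1, 2};
    return x;
}
```

A class that wants to support this syntax provides a constructor taking std::initializer_list, which is presumably how a vector class can accept {1, 2}.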
44. Other changes
• Move semantics: used under the hood in Rcpp11, so less memory is used.
• Variadic templates: less code bloat.
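Both features can be shown in a few lines of standard C++. This sketch illustrates the language features only and makes no claim about how Rcpp11 uses them internally. sum_all and take are illustrative names.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Variadic template: one definition handles any number of arguments,
// instead of one hand-written overload per argument count.
inline double sum_all() { return 0.0; }

template <typename T, typename... Rest>
double sum_all(T first, Rest... rest) {
    return first + sum_all(rest...);
}

// Move semantics: the new vector takes over the source's buffer
// instead of copying the elements.
inline std::vector<double> take(std::vector<double>&& src) {
    std::vector<double> dst(std::move(src));
    return dst;
}
```

After take(std::move(a)), the returned vector owns the buffer that a used to own, so no element is copied.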
45. Rcpp11 article
• I’m writing an article about C++11
• It explains the merits of C++11
• What’s next: C++14, C++17
• The goal is to make C++11 welcome on CRAN
• https://github.com/romainfrancois/cpp11_article