Apache Mahout is changing radically. Here is a report on what is coming, notably including an R-like domain-specific language that can use multiple computational engines such as Spark.
Ted: Is “Revolution” a better word? Want to imply exciting change but not dissension.
Talk track:
Apache Mahout is an open-source project with international contributors and a vibrant community of users and developers. A new version – 0.8 – was recently released.
Mahout is a library of scalable algorithms used for clustering, classification, and recommendation. Mahout also includes a low-level math library that is flexible and scalable and makes certain operations very easy to carry out.
Talk track: First let’s make a quick comparison of the three main areas of Mahout machine learning…
Ted: I included this as an intro slide to set up the content, but I think we should save the details for each following slide.
TED: NO Idea???
The first four columns represent the ingredients (our features) and the last column (the rating) is the target variable for our regression. Linear regression assumes that the target variable y is generated by the linear combination of the feature matrix X with the parameter vector β plus the noise ε, summarized in the formula y = Xβ + ε. Our goal is to find an estimate of the parameter vector β that explains the data well.
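As a sketch of the idea, here is a minimal least-squares fit in numpy. The data here is synthetic and purely illustrative (the matrix sizes, the parameter values, and the noise level are all assumptions, not numbers from the talk); it just shows the y = Xβ + ε setup and the recovery of an estimate of β.

```python
import numpy as np

# Hypothetical example: 4 feature columns (the "ingredients") and a rating target.
rng = np.random.default_rng(0)
X = rng.random((50, 4))                  # feature matrix: 50 samples, 4 features
beta_true = np.array([1.0, 2.0, -0.5, 0.3])  # "true" parameters, unknown in practice
eps = 0.01 * rng.standard_normal(50)     # small noise term
y = X @ beta_true + eps                  # y = X beta + eps

# Ordinary least-squares estimate of beta
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # should be close to beta_true
```

With low noise and more samples than features, the estimate lands very close to the generating parameters, which is exactly the "explains the data well" criterion in the slide.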
TED: consider using the word “interesting” instead of “anomalous”… people may think you are talking about anomaly detection…
TED: Likely this can be skipped
Notes to trainer: a grid is a lot of work to draw; represent it with the math instead.
A is the history matrix (rows are users, columns are items).
h is the vector of items for one (new, current) user.
Ah finds users who do the same things as in h.
A transpose times (Ah) then gives you the items — that computes what those similar users do.
The two products have the same shapes in the matrix multiplications and many of the same properties. In practice there are sometimes weights and so on. Had they been exactly the same, we could just move the parentheses.
Our recommender does the item-centric version
General relationships in the data don’t change fast (what is related to what; nothing happens overnight to change the fact that Mozart is related to Haydn).
What does change fast is what the user did in the last five minutes.
In the first case, we have to compute Ah first. The input to that computation (h) is only available now, in real time, so nothing can be computed ahead of time.
The second case (A transpose A) only involves things that change slowly, so we can pre-compute it. That makes it possible to do this work offline. This is significant because we move a lot of the computation for all users into an overnight process; each real-time recommendation then involves only a small part, only one big matrix multiply in real time. Result: you get a fast response for the recommendations.
The second form runs on one machine for one user (the real-time part).
Talk track: Here are documents for two different artists with indicator IDs that are part of the recommendation model.
When recommendations are needed, the website uses recent visitor behavior to query against the indicators in these documents.