Skytree focuses on production-grade machine learning using algorithms that reduce computational complexity from quadratic and cubic to linear or near-linear (O(N log N)). This allows machine learning to be applied to large datasets. Skytree's products include Skytree Adviser for desktop machine learning and Skytree Server for enterprise machine learning applications such as prediction, detection, finding trends and patterns, and identifying outliers. The company was founded by experts in machine learning, algorithms, and distributed systems.
2. SKYTREE’S FOCUS
PRODUCTION GRADE MACHINE LEARNING
Machine learning: the modern science of finding patterns and making predictions from data.
aka: multivariate statistics, data mining, pattern recognition, or advanced/predictive analytics.
3. Machine Learning Use Cases
Predict categories and classes: Classification
Predict values and numbers: Regression
Grouping and segmentation: Clustering
Detection and characterization: Density Estimation
Visualization and reduction: Dimension Reduction
Find similar items: Multidimensional Querying
Example Skytree Algorithms: Random Decision Forests, Gradient Boosting Machines, Nearest
Neighbor, Kernel Density Estimation, K-means, Linear Regression, Support Vector Machine,
2-point Correlation, Decision Tree, Singular Value Decomposition, Range Search, Logistic Regression
Recommendations, Predictions, Outlier Detection
4. What are the current options for ML for Big Data?
1. Just use a subset of the data!
– e.g. just take the first 1,000 rows. Result to expect: Capture only the broadest patterns. → Lower accuracy.
2. Just use a simple ML method!
– e.g. use logistic regression instead of nonlinear SVM. Result to expect: Entire types of patterns cannot be found. → Lower accuracy.
3. Just use simple parallelism/MapReduce!
– i.e. replace all the for-loops with parallel ones. Result to expect: Only the simplest of ML methods (not O(N^2)/O(N^3)) can be significantly sped up this way. → See #2.
4. Just throw it in the cloud!
– i.e. somehow use the large compute power of the cloud. Result to expect: The cost of sending it to the cloud is even greater than the compute cost. → See #1. See also #3.
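A rough back-of-the-envelope sketch of why option 3 runs out of steam for superlinear methods. It assumes an idealized model that is not from the slides: perfect linear speedup across cores and work proportional to N raised to a fixed exponent. The function name `cores_to_hold_runtime` is hypothetical.

```python
def cores_to_hold_runtime(n_old, n_new, cores_old, exponent):
    """Cores needed to keep wall-clock time constant as the data grows.

    Idealized assumptions: work scales as N**exponent and parallel
    speedup is perfectly linear in the number of cores.
    """
    return cores_old * (n_new / n_old) ** exponent


# Growing the data 10x on a 16-core machine:
linear = cores_to_hold_runtime(1_000_000, 10_000_000, 16, 1)     # O(N) method
quadratic = cores_to_hold_runtime(1_000_000, 10_000_000, 16, 2)  # O(N^2) method
cubic = cores_to_hold_runtime(1_000_000, 10_000_000, 16, 3)      # O(N^3) method
print(linear, quadratic, cubic)  # 160.0 1600.0 16000.0
```

Even under these generous assumptions, a 10x increase in data calls for 100x the cores for an O(N^2) method and 1,000x for an O(N^3) method, which is why parallelism alone mostly helps the simpler methods.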
5. Skytree’s Unique Differentiation: Fundamental Technology Breakthrough
Complexity of State-of-the-Art Machine Learning methods:
1. Querying: all-nearest-neighbors O(N^2)
2. Density estimation: kernel density estimation O(N^2), kernel conditional density est. O(N^3)
3. Classification: logistic regression, decision tree, neural nets, nearest-neighbor classifier O(N^2), kernel discriminant O(N^2), support vector machine O(N^3)
4. Regression: linear regression, LASSO, kernel regression O(N^2), regression tree, Gaussian process regression O(N^3)
5. Dimension reduction: PCA, non-negative matrix factorization, kernel PCA O(N^3), maximum variance unfolding O(N^3); Gaussian graphical models, discrete graphical models
6. Clustering: k-means, mean-shift O(N^2), hierarchical clustering O(N^3)
7. Testing and matching: MST O(N^3), bipartite cross-matching O(N^3), n-point correlation 2-sample testing O(N^n), n = 2, 3, 4, …
► Unfortunately O(N^2), O(N^3) are computationally prohibitive for big data
Skytree has invented a way to reduce the complexity of the above methods from O(N^2) and O(N^3) to O(N) or O(N log N).
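To make the kind of reduction described above concrete, here is a minimal toy sketch for all-nearest-neighbors restricted to 1-D points. It is not Skytree's algorithm, which the slides do not describe. It only illustrates how an algorithmic idea (here, sorting) can replace an O(N^2) all-pairs scan with an O(N log N) computation. The function names `all_nn_brute` and `all_nn_sorted` are hypothetical.

```python
def all_nn_brute(xs):
    """O(N^2): compare every point with every other point."""
    result = []
    for i, x in enumerate(xs):
        best_j, best_d = None, float("inf")
        for j, y in enumerate(xs):
            if i != j and abs(x - y) < best_d:
                best_j, best_d = j, abs(x - y)
        result.append(best_j)
    return result


def all_nn_sorted(xs):
    """O(N log N): in 1-D, a point's nearest neighbor is adjacent in sorted order."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [None] * len(xs)
    for pos, i in enumerate(order):
        best_j, best_d = None, float("inf")
        # Only the left and right neighbors in sorted order need checking.
        for nb in (pos - 1, pos + 1):
            if 0 <= nb < len(order):
                j = order[nb]
                d = abs(xs[i] - xs[j])
                if d < best_d:
                    best_j, best_d = j, d
        result[i] = best_j
    return result


points = [5.0, 1.0, 4.0, 9.0]
print(all_nn_brute(points))   # [2, 2, 0, 0]
print(all_nn_sorted(points))  # [2, 2, 0, 0]
```

Both functions return, for each point, the index of its nearest neighbor. The sorting trick only works in one dimension. Higher-dimensional data generally needs spatial data structures such as trees, which this sketch does not attempt.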
7. How Does Skytree Do This?
Deep knowledge of algorithms: Drawing from the latest from academia
Smart programming: Efficient ways to compute O(N^2) and O(N^3) methods
Distributed systems: Take advantage of parallel computing speed
8. Team
EXECUTIVE TEAM
Martin Hack, CEO & Co-Founder: Sun, GreenBorder (Google)
Alexander Gray, PhD, CTO & Co-Founder: Leading Light for Large-Scale, Fast Algorithms
Paul Salazar, VP Sales: RedHat, Greenplum
Leland Wilkinson, PhD, VP Data Visualization: Creator of SYSTAT (SPSS/IBM)
Tim Marsland, PhD, VP Engineering: Sun Fellow, CTO Software, Apple, Oracle
BOARD OF DIRECTORS
Rick Lewis, USVP
Noah Doyle, Javelin Venture Partners
David Toth, Founder and CEO NetRatings (Nielsen)
TECH ADVISORY BOARD
Prof. Michael Jordan, UC Berkeley: machine learning ‘godfather’
Prof. David Patterson, UC Berkeley: systems (inventor RISC, RAID)
Prof. Pat Hanrahan, Stanford: data visualization (Tableau, Pixar)
Prof. James Demmel, UC Berkeley: high-performance computing
INVESTORS
USVP, Javelin Venture Partners, Scott McNealy, UPS
9. Product Overview
Skytree Adviser for Desktop: Data Science for Everyone
Skytree Server for Enterprises: Enterprise Machine Learning
Advanced Analytics:
• Predict Categories/Classes
• Detect Anomalies
• Find Trends
• Predict Values/Numbers
• Identify Patterns
• Find Outliers
10. Thank you for learning about Skytree
Read more at www.skytree.net
• We’re hiring: check out our careers page.
• Download Skytree Adviser for Free.
• Pick up a T-Shirt.