
Organization/Workplace
TCS
Location
United States
Occupation
Data Scientist
Sector
Technology / Software / Internet
About
I lead the development and deployment of scalable models, with expertise in both real-time and big data architecture.
Apache: Spark, Hadoop, Pig, Hive, and Oozie.
Python: scikit-learn, pandas, NumPy, and Luigi.
R: PivotalR, MADlib, Time Series Analysis with X12-ARIMA.
Modeling: MLlib, H2O, yhat, Sense.
Machine Learning: Random Forests, Clustering, Association Rules, and Logistic Regression.
Software Development: Streaming, Distributed Systems, REST APIs.
Visualization: Matplotlib, ggplot2, Seaborn, and D3.
Database: Hive, Postgres, SQL.
I build data science pipelines and frameworks (see my presentations below).