Personal Information
Organization / Workplace
San Francisco Bay Area, United States
Occupation
Data analytics developer
Industry
Technology / Software / Internet
Website
http://xinhstechblog.blogspot.com
About
Developer and team lead with 9+ years of experience in analytics, big data, and data science. At Samsung SDS, I worked on data science projects. As a developer, I used Spark and Scala for data munging, exploration, machine learning, and data pipelines. As a scrum master, I facilitated the Agile process in a team of developers and data scientists.
From 2010 to 2012, at LLNL, a research and development lab, I worked on a text-processing data pipeline for a document search application, as well as analytics with Hadoop, Pig, HBase, and Solr.
From 2005 to 2009, I worked on Web search at Yahoo!, implementing distributed applications to analyze Web data, consisting of many billions of web pages, in the ...
Tags
dataframe
big data
spark
ops
data pipeline
dcos
production
scala
data science
data munging
analytics
spark sql
Presentations (3)
Recommendations (5)
The Future of Real-Time in Spark
Reynold Xin • 8 years ago
Dato Keynote
Turi, Inc. • 8 years ago
Introducing DataFrames in Spark for Large Scale Data Science
Databricks • 9 years ago
Scalding: Twitter's Scala DSL for Hadoop/Cascading
johnynek • 11 years ago
Data Workflows for Machine Learning - Seattle DAML
Paco Nathan • 10 years ago