(Susan) Xinh Huynh

2 Followers

Developer and team lead with 9+ years of experience in analytics, big data, and data science. At Samsung SDS, I worked on data science projects. As a developer, I used Spark and Scala for data munging, exploration, machine learning, and data pipelines. As a scrum master, I facilitated the Agile process in a team of developers and data scientists. From 2010 to 2012, at LLNL, a research and development lab, I worked on a text processing data pipeline for a document search application, as well as analytics with Hadoop, Pig, HBase, and Solr. From 2005 to 2009, I worked on Web search at Yahoo!, implementing distributed applications to analyze Web data, consisting of many billions of web pages, in the ...

dataframe, big data, spark, ops, data pipeline, dcos, production, scala, data science, data munging, analytics, spark sql


Presentations

Introduction to Spark SQL training workshop

Spark DataFrames for Data Munging

Big Data on DC/OS

Recommendations

The Future of Real-Time in Spark

Dato Keynote

Introducing DataFrames in Spark for Large Scale Data Science

Scalding: Twitter's Scala DSL for Hadoop/Cascading

Data Workflows for Machine Learning - Seattle DAML