Data science
1. The Colorful World of Data Science
Sreejith C
Data Scientist
Calpine Labs
UVJ Technologies
Kochi
2. Overview
- Presentation:
Introduction to Data Science
- Demonstration:
Loan Prediction Problem
- Exploratory data analysis in Python
- Data Munging in Python
- Building a Predictive Model in Python
Logistic Regression
Decision Tree
Random Forest
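A minimal sketch of how the three model families listed above might be compared in Python. It assumes scikit-learn is installed and uses a synthetic dataset from make_classification as a stand-in for a binary approve/reject problem; the data, hyperparameters, and 5-fold setup are illustrative assumptions, not the configuration used in the demonstration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a binary classification dataset (assumption:
# this is NOT the loan data from the talk).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)

# The three model families named in the overview, with illustrative settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean accuracy over 5 cross-validation folds for each model.
scores = {}
for name, model in models.items():
    fold_scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    scores[name] = fold_scores.mean()

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Cross-validation is used here rather than a single train/test split so that the comparison does not hinge on one lucky partition of the data.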
4. The Science of
- Discovering what we don’t know from data
- Obtaining predictive, actionable insight from data
- Creating Data Products that have business impact now
- Communicating relevant business stories from data
- Building confidence in decisions that drive business value
5. “Data science is clearly a blend of the hackers’ arts,
statistics and machine learning...
and the expertise in mathematics and the domain of
the data for the analysis to be interpretable...
It requires creative decisions and open-mindedness in
a scientific context.”
Hilary Mason and Chris Wiggins
Hilary Mason is an American data scientist and the founder of technology startup Fast Forward Labs, as well as Data Scientist in Residence at Accel Partners. She was the Chief Scientist at bitly.
Christopher H. Wiggins is an associate professor of applied mathematics at Columbia University, the first Chief Data Scientist at The New York Times, and co-founder and co-organizer of hackNY (hackNY.org).
8. “We realized that as our organizations grew, we both had to figure
out what to call the people on our teams.
Business analyst and Data analyst seemed too limiting.
The focus of our teams was to work on data applications that would
have an immediate and massive impact on the business.
The term that seemed to fit best was data scientist:
those who use both data and science to create something new.”
DJ Patil
Chief Data Scientist of the United States Office of Science and Technology Policy. Patil is credited with coining the term "data scientist".
11. “... on any given day, a team member could author a multistage
processing pipeline in Python,
design a hypothesis test, perform a regression analysis over data
samples with R,
design and implement an algorithm for some data-intensive product
or service in Hadoop,
communicate the results of our analyses to other members of the
organization.”
Jeff Hammerbacher
Data scientist, chief scientist, and cofounder at Cloudera. Along with DJ Patil, Jeff Hammerbacher is credited with coining the term "data scientist".
19. Putting the pieces together .....
SIMPLE (Students' Innovations in Morphology Phonology and
Language Engineering) groups
CLEAR (Computational Linguistics in Engineering And
Research) magazine
- Blog / Write about your experience
- Build sample projects
- Share ideas
20. Puzzle
A huntsman can hit a target with a probability of 0.8.
He sees a flock of birds (150 birds) atop a banyan tree.
He takes aim and fires 5 consecutive shots.
Question: How many birds remain on the tree?
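One purely arithmetic reading of the puzzle can be sketched as follows. It assumes the five shots are independent, each hit removes exactly one bird, and the rest of the flock stays put after each shot; none of these assumptions is stated in the puzzle, and the last one in particular may be exactly what the question is probing.

```python
from math import comb

# Values taken from the puzzle statement.
p_hit = 0.8   # probability that a single shot hits
shots = 5     # number of shots fired
birds = 150   # birds on the tree before the first shot


def prob_hits(k, n=shots, p=p_hit):
    """Binomial probability of exactly k hits in n independent shots."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of each possible number of hits, 0 through 5.
distribution = {k: prob_hits(k) for k in range(shots + 1)}

# Expected hits under the independence assumption, and the expected number
# of birds left IF the other birds do not fly away (a strong assumption).
expected_hits = shots * p_hit
expected_remaining = birds - expected_hits

for k, prob in distribution.items():
    print(f"P({k} hits) = {prob:.5f}")
print(f"Expected hits: {expected_hits}")
print(f"Expected birds remaining (if none fly away): {expected_remaining}")
```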
22. Loan Prediction Problem
The challenge is to predict the approval status of a loan
(Approved / Rejected).
Link:
https://github.com/sreejithc321/ML_Regression/tree/master/loan_prediction
Demonstration
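The flow of such a demonstration (explore, clean, then fit a model) might look like the sketch below. It assumes pandas and scikit-learn are installed. The eight records and the column names (ApplicantIncome, LoanAmount, Credit_History, Loan_Status) are hypothetical and chosen only for illustration; they are not taken from the linked repository.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy records; column names and values are illustrative only.
loans = pd.DataFrame({
    "ApplicantIncome": [5000, 3000, 4000, 2500, 6000, 1800, 7200, 3300],
    "LoanAmount": [130.0, None, 120.0, 100.0, 150.0, 90.0, None, 110.0],
    "Credit_History": [1.0, 1.0, 0.0, None, 1.0, 0.0, 1.0, 0.0],
    "Loan_Status": ["Y", "Y", "N", "Y", "Y", "N", "Y", "N"],
})

# Exploratory step: count missing values per column before cleaning.
missing_before = loans.isnull().sum()
print(missing_before)

# Data munging: fill the numeric gap with the median and the
# credit-history flag with its most frequent value.
loans["LoanAmount"] = loans["LoanAmount"].fillna(loans["LoanAmount"].median())
loans["Credit_History"] = loans["Credit_History"].fillna(
    loans["Credit_History"].mode()[0]
)

# Encode the target: approved ("Y") becomes 1, rejected ("N") becomes 0.
y = loans["Loan_Status"].map({"Y": 1, "N": 0})
X = loans[["ApplicantIncome", "LoanAmount", "Credit_History"]]

# Fit a shallow decision tree and score it on the training rows.
model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
train_accuracy = model.score(X, y)
print(f"Training accuracy: {train_accuracy:.2f}")
```

The toy data is deliberately contrived, so the training accuracy says nothing about how a model would generalize; on real data a held-out split or cross-validation would be needed.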