Consumer Expenditure
Setu Chokshi
14th July 2017
Objective
• Propose one way of using the data employing one of the following methods: regression, classification or clustering. Execute your proposal and discuss your methodology, justify your algorithm/feature selection and share insights from the model.
• Dataset: Consumer Expenditure Survey for 1996-2000 (12k rows, 220 columns)
A typical American family
This infographic summarizes the consumer demographics in the expenditure data. It provides a good macro-level overview of the dataset and of what can be expected from it.
About Chart
2.3 vehicles per family
77% own a home
2.8 members per family
1.5 earning members per family
How much do they earn?
Description
For every dollar earned by the family members, about 78 cents are used to pay the various expenses that support and maintain the family. 20 cents are used to pay various taxes, including Social Security.
Income: $53,147
Expenses: $40,679
Taxes: $9,962
Maintenance
About 60% of the expenses go towards non-discretionary items such as rent and food.
Entertainment
The remaining 40% is used for discretionary items such as alcohol, entertainment and travel.
Where does the money go?
[Bar chart: spend by category (Rent, Alcohol, Tobacco, Entertainment, Clothes, Utilities, Transport, Food); bar labels $765, $1,489, $1,956, $2,806, $3,921, $5,821, $10,687]
Analytics
Potential questions the data can answer
Who are these people?
What are their demographics? Should we customize the product for this diversity?
Targeting specific groups
Why should we target certain demographics? Why would they buy the product from you?
Potential reach
Where should they grow the business next?
Is it necessary for them to get your product? If so, how frequently?
What motivates them to buy?
How much elasticity do they have in purchasing the product? Would they be OK with price increases, or would this product be a battle over prices?
1. Other macroeconomic indicators can also be calculated from this data, but since our focus is on a CE goods company, we exclude them.
Steps for the analysis
Step 04
Step 03
Step 02
Step 01
Initial Analysis
After eliminating lag variables, a pairwise correlation analysis was performed to identify key variables.
Calculations
Calculated savings using the residual and net worth methods to identify the elasticity of each demographic.
Understanding the data
K-Means to identify clusters within the groups. Decision trees and ridge regression to understand the expenses.
Validation
Examined the clusters and the data patterns to get additional insights.
Presentation
Prepared the results in a simple form for presentation to the consumer goods executive team.
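The Calculations step names two savings measures without defining them. The sketch below is one possible reading, not the deck's actual formulas: it assumes residual savings are income minus taxes minus expenses, and net worth savings are the change in net worth over the period. All column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical household records; column names are illustrative, not the survey's.
households = pd.DataFrame({
    "income": [53000.0, 19000.0, 70000.0],
    "taxes": [9900.0, 2000.0, 15000.0],
    "expenses": [40000.0, 16000.0, 48000.0],
    "net_worth_start": [100000.0, 20000.0, 250000.0],
    "net_worth_end": [104000.0, 20000.0, 262000.0],
})

# Assumed residual method: whatever is left after taxes and expenses.
households["residual_savings"] = (
    households["income"] - households["taxes"] - households["expenses"]
)

# Assumed net worth method: change in net worth over the period.
households["net_worth_savings"] = (
    households["net_worth_end"] - households["net_worth_start"]
)

print(households[["residual_savings", "net_worth_savings"]])
```

Keeping both measures side by side makes it easy to spot households where the two methods disagree, which may point to data quality issues.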
Demographics (using clustering)
Rich / Super Rich: 3.7%
Single earner: 25.7%
Singles: 25.0%
Working spouse: 33.1%
Widows: 12.5%

Cluster profiles (age, gender, family size, education):
59 years old, mostly female, 1 member, high school
46 years old, mostly female, 1 to 2 members, some degree
45 years old, mostly male, 3 to 6 members, college, no degree
55 years old, mostly male, 2 to 4 members, college educated
47 years old, Fe(Male), 3 to 5 members, bachelor's degree
These segments were derived using the K-Means clustering algorithm. Each cluster was named after the features that most clearly separate it from the others. I also included the calculated residual savings and net worth savings as clustering features. The outliers were kept in a separate cluster, named super rich or the 0.01 percenters. Additional cluster-level information can be found in the slide notes for this page.
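As a rough illustration of the clustering step, here is a minimal K-Means sketch with scikit-learn. It runs on synthetic data with made-up columns (age, family size, income). Standardizing the features first is an assumption of this sketch, not something the deck states.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the survey: columns are age, family size, income.
group_a = rng.normal(loc=[59, 1, 19000], scale=[5, 0.3, 3000], size=(100, 3))
group_b = rng.normal(loc=[45, 4, 70000], scale=[5, 1.0, 8000], size=(100, 3))
X = np.vstack([group_a, group_b])

# Standardize so income does not dominate the Euclidean distance.
X_scaled = StandardScaler().fit_transform(X)

# n_clusters=2 matches the two synthetic groups; the deck reports more segments.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# Per-cluster means in original units help name each cluster
# by the features that separate it from the rest.
for k in range(2):
    print(k, X[labels == k].mean(axis=0).round(1))
```

Comparing each cluster's means against the global means is one simple way to find the separating features used for naming.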
Elasticity (expense / income) [4]

                      Widows   Singles   Working Spouse   Single Earner
Income [5]            $19      $42       $70              $42
Clothes [2]           38%      22%       17%              25%
Alcohol / Tobacco     10%      7%        7%               9%
Entertainment         2%       1%        2%               1%
Residual savings [1]  $18      $35       $49              $35
Net worth savings     $0       $5        $42              $13

1. The residual savings are a bit inflated due to some outlier data points that fall on the cluster boundary. I did not get time to clean these up.
2. For food, I should have included the food away from home and working expenses. A potential link to elasticity could have helped further.
3. The (super) rich spend about 7 to 11% on clothes, 2 to 4% on alcohol/tobacco and 1% on entertainment.
4. I would also carry out the elasticity analysis over the lag variables to determine the sensitivity towards price (data not used).
5. All income values are in 10,000s.
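The slide title defines elasticity as expense divided by income. A minimal sketch of that ratio per cluster is shown below, using pandas on made-up rows. The column names and values are hypothetical and do not reproduce the figures in the table.

```python
import pandas as pd

# Hypothetical rows: one per household, with a cluster label from the segmentation.
df = pd.DataFrame({
    "cluster": ["Widows", "Widows", "Singles", "Singles"],
    "income": [20000.0, 30000.0, 40000.0, 60000.0],
    "clothes": [6000.0, 9000.0, 8000.0, 12000.0],
    "entertainment": [500.0, 500.0, 1000.0, 2000.0],
})

# Total income and total spend per cluster.
totals = df.groupby("cluster")[["income", "clothes", "entertainment"]].sum()

# Share of income spent on each category, as a percentage per cluster.
shares = totals[["clothes", "entertainment"]].div(totals["income"], axis=0) * 100
print(shares.round(1))
```

This version divides cluster totals rather than averaging household-level ratios. That keeps a few very low-income households from dominating the result, though either choice could be defended.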
Appendix
Pairwise Correlation Analysis (sklearn)
[Figure: two panels, Unsorted and Sorted]
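A pairwise correlation analysis of this kind could be sketched as follows. The slide credits sklearn, but this sketch uses pandas' `DataFrame.corr` as a stand-in, since the deck's exact code is not shown. The data is synthetic and the column names are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in columns; names are illustrative, not the survey's.
income = rng.normal(50000, 15000, n)
df = pd.DataFrame({
    "income": income,
    "expenses": 0.75 * income + rng.normal(0, 4000, n),
    "age": rng.normal(48, 12, n),
    "vehicles": rng.poisson(2, n).astype(float),
})

corr = df.corr()  # Pearson correlation between every pair of columns

# Keep each pair once (upper triangle, diagonal excluded).
cols = corr.columns
i, j = np.triu_indices(len(cols), k=1)
pairs = pd.Series(
    corr.to_numpy()[i, j],
    index=[f"{cols[a]} ~ {cols[b]}" for a, b in zip(i, j)],
)

# Sort by absolute strength so the strongest relationships come first.
pairs = pairs.reindex(pairs.abs().sort_values(ascending=False).index)
print(pairs)
```

Sorting by absolute value surfaces strong negative correlations as well as positive ones.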
t-SNE for cluster analysis (sklearn)
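A t-SNE projection for inspecting cluster structure can be sketched with scikit-learn's `TSNE`. The data here is synthetic, and the perplexity and PCA initialization are assumptions of this sketch rather than settings taken from the deck.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Two synthetic groups in five dimensions, standing in for clustered households.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(60, 5)),
    rng.normal(6.0, 1.0, size=(60, 5)),
])
X_scaled = StandardScaler().fit_transform(X)

# Project to 2-D; perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=20, init="pca", random_state=0)
embedding = tsne.fit_transform(X_scaled)
print(embedding.shape)
```

Plotting `embedding[:, 0]` against `embedding[:, 1]`, colored by cluster label, gives a quick visual check of whether the clusters separate.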
Clothing spend (decision trees)
Gradient boosted
Tried this approach to see whether building multiple decision trees changes the variable importance for clothing spend.
Simple decision tree
A quick look at the variable importance from a single decision tree. These line up with the variables found via correlation analysis.
Variable importance: Income 17%, Residual savings 14%, Education 5%, Vehicles 5%, Hours worked 4%
Variable importance: Income 68%, Renter 9%, Residual savings 6%, West US 5%, Education 4%
1. Explained variance is 0.35 for the decision tree vs 0.48 for the gradient boosted trees.
2. RMSE is 5323 for the decision tree vs 4765 for the gradient boosted trees.
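The comparison above can be sketched with scikit-learn as follows. This runs on synthetic data with hypothetical feature names, and the hyperparameters are assumptions, so the numbers it prints are not comparable to the ones reported on the slide.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import explained_variance_score, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 1000
feature_names = ["income", "education", "vehicles"]  # hypothetical names

# Synthetic stand-in data: clothing spend driven mostly by income, plus noise.
X = np.column_stack([
    rng.normal(50000, 15000, n),
    rng.integers(1, 6, n),
    rng.integers(0, 4, n),
])
y = 0.04 * X[:, 0] + 100 * X[:, 1] + rng.normal(0, 300, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "decision tree": DecisionTreeRegressor(max_depth=4, random_state=0),
    "gradient boosted": GradientBoostingRegressor(random_state=0),
}
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        # Both metrics are computed on the held-out test split.
        "explained_variance": explained_variance_score(y_test, pred),
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        "importance": dict(zip(feature_names, model.feature_importances_)),
    }
    print(name, results[name])
```

Evaluating both models on the same held-out split keeps the explained variance and RMSE figures directly comparable.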
Food_Away Analysis using Ridge Regression
See the reference Excel sheet.
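Since the referenced sheet is not included here, the following is only a sketch of what a ridge regression on food-away-from-home spend might look like. The features, the target and `alpha=1.0` are all assumptions, and the data is synthetic.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 400

# Hypothetical features; wages is deliberately almost collinear with income.
income = rng.normal(50000, 15000, n)
wages = income + rng.normal(0, 2000, n)
family_size = rng.integers(1, 6, n).astype(float)
X = np.column_stack([income, wages, family_size])

# Synthetic target standing in for food-away-from-home spend.
y = 0.03 * income + 200 * family_size + rng.normal(0, 300, n)

# Scale first so the L2 penalty treats all features on the same footing.
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model.fit(X, y)

coefs = model.named_steps["ridge"].coef_
print(dict(zip(["income", "wages", "family_size"], coefs.round(1))))
print("R^2 on training data:", round(model.score(X, y), 3))
```

Ridge is a reasonable fit for survey data with many correlated columns, because the penalty shrinks and stabilizes coefficients that ordinary least squares would estimate erratically.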

Analysis on the US Consumer Expenditure


Editor's notes

  1. Cluster-level observations from the K-Means clustering:

  Non-working widows:
  • 40.88% of the cluster has 2 for marital (against 7.66% globally)
  • 83.82% of the cluster has \N for emptype (against 24.41% globally)
  • 83.82% of the cluster has \N for empstat (against 24.50% globally)

  Rich:
  • wages_calc is on average 245% greater: mean of 190k against 55002 globally
  • expenses is on average 246% greater: mean of 180k against 53148 globally
  • residual_savings is on average 172% greater: mean of 110k against 40679 globally

  Singles:
  • 33.53% of the cluster has 5 for marital (against 10.52% globally)
  • 46.29% of the cluster has 3 for marital (against 15.34% globally)
  • 97.24% of the cluster has 0 for married (against 36.23% globally)

  Working spouses:
  • 36.28% of the cluster has 1 for working_part_spouse (against 15.40% globally)
  • 45.41% of the cluster has 40 for hrswkd_spouse (against 19.98% globally)
  • 98.90% of the cluster has 1 for working_spouse (against 44.15% globally)

  Single earner:
  • 67.08% of the cluster has 0 for wkswkd_spouse (against 17.98% globally)
  • 67.46% of the cluster has 0 for hrswkd_spouse (against 18.09% globally)
  • 52.79% of the cluster has \N for empstat (against 24.50% globally)

  Super rich:
  • net_worth_savings is on average 1531% greater: mean of 400k against 24731 globally
  • expenses is on average 564% greater: mean of 350k against 53148 globally
  • wages_calc is on average 559% greater: mean of 360k against 55002 globally