TEXT
MINING
Team 4
Syed Aqib Ali
Syeda Ramsha Habib Gilani
Lateefah Omoyosola Yusuf
Rochelle Star Velasquez
TABLE OF CONTENTS
1. What is Text Mining?
2. Introduction
3. Main Models Used
4. Key Contributions
5. Marketing and Non-marketing Applications
6. Limitations
7. Avenues for future research
8. Key Takeaways
WHAT IS TEXT MINING?
Text mining is the process of extracting high-quality,
meaningful information and patterns from text.
Text analysis involves information retrieval, lexical
analysis to study word frequency distributions, pattern
recognition, information extraction, data mining
techniques including link and association analysis,
visualization, and predictive analytics.
INTRODUCTION
● A research study applying Text Mining and
Machine Learning tools.
● The authors find that loan applicants' choice
of words reveals insights into their intentions,
circumstances, and personality.
● This information is powerful in predicting
loan repayment, going beyond typical
financial and demographic factors.
Setting and Data
1. Potential borrowers on Prosper, an online peer-to-peer lending platform, submit a loan request for a specific
amount and state the maximum interest rate they are willing to pay.
2. The requested loan amount ranges between $1,000 and
$25,000 in the data.
3. Prosper verifies all financial information, including the potential
borrower’s credit score.
Textual, Financial, and Demographic Variables
1. Textual variables:
a. The number of characters in the title and the text box.
b. The percentage of words with six or more letters.
c. SMOG: Measures writing quality as the number of years of formal
education needed to easily understand the text on first reading.
d. Count of spelling mistakes.
e. Bigrams: Two-word combinations (these help capture context and patterns).
2. Financial variables:
a. Loan amount, borrower’s credit grade, debt-to-income ratio.
3. Demographic variables:
a. Gender, age, location, race.
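A few of the textual variables above can be sketched in Python. This is a minimal illustration, not the authors' code: the function names, the regex tokenizer, and the sample request are assumptions. The SMOG function uses the standard published formula and takes the polysyllable and sentence counts as inputs, because the slides do not say how syllables were counted.

```python
import math
import re


def textual_features(title: str, body: str) -> dict:
    """Compute simple textual variables for one loan request (illustrative)."""
    text = f"{title} {body}"
    # Lowercased word tokens; this simple regex tokenizer is an assumption.
    words = re.findall(r"[a-z']+", text.lower())
    long_words = [w for w in words if len(w) >= 6]
    return {
        # Number of characters in the title and the text box.
        "n_chars": len(title) + len(body),
        # Percentage of words with six or more letters.
        "pct_long_words": 100 * len(long_words) / len(words) if words else 0.0,
        # Bigrams: consecutive two-word combinations.
        "bigrams": list(zip(words, words[1:])),
    }


def smog_grade(polysyllable_count: int, sentence_count: int) -> float:
    """Standard SMOG formula: years of education needed to read the text."""
    return 1.0430 * math.sqrt(polysyllable_count * 30 / sentence_count) + 3.1291


# Hypothetical request, made up for illustration.
features = textual_features(
    "Roof repair", "I need to borrow money to fix our leaking roof."
)
```

Spelling mistakes are left out here because counting them needs an external dictionary.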
PROCESS OF
TEXT MINING
Process 01
The tm package in R was used to select
distinct words in each loan application.
Process 02
Porter’s stemming algorithm was used to collapse
variations of words into one, e.g., “borrower,”
“borrowed,” “borrowing,” and “borrowers”
become “borrow” (3.5M words → 30,920 unique
words and 1,052 bigrams).
Process 03
The PyEnchant 1.6.6 package in Python was
used to count spelling mistakes in the
loan applications. Misspelled words can
potentially serve as a proxy for
characteristics correlated with lower
income.
Process 04
The authors used term frequency-inverse
document frequency (tf-idf) to weigh how often
a word is used in a loan request against how
often it is used across all loan requests,
accounting for the length of the request.
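The tf-idf weighting can be sketched in plain Python. This is a minimal sketch under assumptions: it uses one common variant (term count divided by request length, times the log of the inverse document frequency), which may differ from the exact weighting in the study. The function name and the stemmed toy requests are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(documents: list[list[str]]) -> list[dict[str, float]]:
    """Return one {word: tf-idf weight} mapping per tokenized document."""
    n_docs = len(documents)
    # Document frequency: number of requests that contain each word.
    doc_freq = Counter(word for doc in documents for word in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            # Term frequency is normalized by request length, then
            # down-weighted for words that appear in many requests.
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


# Hypothetical stemmed loan requests, made up for illustration.
requests = [
    ["borrow", "roof", "repair"],
    ["borrow", "pay", "bill"],
    ["borrow", "promis", "pay"],
]
weights = tf_idf(requests)
```

A word that appears in every request, such as "borrow" in this toy corpus, gets a weight of zero, while rarer words get positive weights.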
MAIN MODELS USED
MODEL 1 - Predictive model
Aim:
To evaluate whether the text used by borrowers in their loan application predicts
their loan default.
Machine Learning Methods:
Ensemble stacking approach
1. Train each model on the calibration data (two logistic regressions and three tree-based methods).
2. Build a weighting model to combine the models calibrated in the first step.
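The two stacking steps can be sketched as follows. This is a simplified illustration, not the authors' models. Two per-category default-rate models stand in for the logistic regressions and tree-based methods, and the weighting model is a single blending weight chosen by grid search on held-out data. All field names and data rows are hypothetical.

```python
import math


def train_rate_model(rows, key):
    """Step 1 stand-in: learn the default rate per category of `key`."""
    totals, defaults = {}, {}
    for row in rows:
        k = row[key]
        totals[k] = totals.get(k, 0) + 1
        defaults[k] = defaults.get(k, 0) + row["default"]
    overall = sum(defaults.values()) / sum(totals.values())
    rates = {k: defaults[k] / totals[k] for k in totals}
    # Unseen categories fall back to the overall default rate.
    return lambda row: rates.get(row[key], overall)


def log_loss(rows, predict):
    """Average negative log-likelihood, with probabilities clipped."""
    eps = 1e-6
    total = 0.0
    for row in rows:
        p = min(max(predict(row), eps), 1 - eps)
        y = row["default"]
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(rows)


def fit_stack(base_models, holdout):
    """Step 2: pick the blending weight that minimizes held-out log loss."""
    m0, m1 = base_models
    best_w = min(
        (i / 20 for i in range(21)),
        key=lambda w: log_loss(holdout, lambda r: w * m0(r) + (1 - w) * m1(r)),
    )
    return best_w, (lambda r: best_w * m0(r) + (1 - best_w) * m1(r))


# Hypothetical calibration and held-out rows, made up for illustration.
calibration = [
    {"grade": "A", "flag": 0, "default": 0},
    {"grade": "A", "flag": 1, "default": 0},
    {"grade": "A", "flag": 0, "default": 0},
    {"grade": "C", "flag": 0, "default": 0},
    {"grade": "C", "flag": 1, "default": 1},
    {"grade": "C", "flag": 1, "default": 1},
]
holdout = [
    {"grade": "A", "flag": 1, "default": 0},
    {"grade": "C", "flag": 0, "default": 1},
    {"grade": "A", "flag": 1, "default": 1},
    {"grade": "C", "flag": 1, "default": 1},
]

grade_model = train_rate_model(calibration, "grade")
text_model = train_rate_model(calibration, "flag")
weight, stacked = fit_stack((grade_model, text_model), holdout)
```

The weight is fitted on data the base models did not see, so the combination reflects out-of-sample performance. Because the grid includes weights 0 and 1, the blend is never worse on the held-out rows than either base model alone.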
Result
Source: Netzer, O., Lemaire, A., & Herzenstein, M. (2019). When words sweat: Identifying signals for loan default in the text of loan applications. Journal of Marketing Research,
56(6), 960-980.
MODEL 2 - Words and writing styles of defaulted loan requests
Aim:
Learn which words, writing styles, and general ideas conveyed by the text are more
likely to be associated with loan default.
Machine Learning Methods:
1) Machine learning tools
Naive Bayes
L1-regularized binary logistic model
2) Standard econometric tools
Logistic regression on the topics extracted from
a latent Dirichlet allocation (LDA) analysis
and on the sub-dictionaries of the Linguistic
Inquiry and Word Count (LIWC) dictionary.
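To show how a naive Bayes model can link individual words to default, here is a minimal sketch that computes a Laplace-smoothed log-likelihood ratio per word. The function name, tokens, and labels are invented for illustration and are not results from the study.

```python
import math
from collections import Counter


def word_log_odds(docs, labels):
    """Per-word log P(word | default) - log P(word | repaid), smoothed."""
    counts = {0: Counter(), 1: Counter()}
    for tokens, label in zip(docs, labels):
        counts[label].update(tokens)
    vocab = set(counts[0]) | set(counts[1])
    totals = {c: sum(counts[c].values()) for c in counts}

    def log_prob(word, c):
        # Laplace (add-one) smoothing avoids log(0) for unseen words.
        return math.log((counts[c][word] + 1) / (totals[c] + len(vocab)))

    return {w: log_prob(w, 1) - log_prob(w, 0) for w in vocab}


# Hypothetical tokenized requests; label 1 = defaulted, 0 = repaid.
docs = [
    ["promise", "god", "help"],
    ["bills", "time", "repair"],
    ["promise", "help", "pay"],
    ["repair", "roof", "bills"],
]
labels = [1, 0, 1, 0]
odds = word_log_odds(docs, labels)
```

A positive value means the word is relatively more frequent in the defaulted class of this toy data, and a negative value means it is more frequent in the repaid class.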
Result
Source: Netzer, O., Lemaire, A., & Herzenstein, M. (2019). When words sweat: Identifying signals for loan default in the text of loan applications. Journal of
Marketing Research, 56(6), 960-980.
MODEL 3 - Potential Borrower’s Personality
Aim:
Further exploration of potential traits and states of borrowers.
Machine Learning Methods:
Applying the LIWC dictionaries.
Results:
Defaulting loan requests are written in a manner consistent
with the writing styles of extroverts and liars.
KEY CONTRIBUTIONS
Analyzing applications
Borrower 1: “I am a hard working person, married for 25 years, and have
two wonderful boys. Please let me explain why I need help. I would use
the $2,000 loan to fix our roof. Thank you, God bless you, and I promise to
pay you back.”
Borrower 2: “While the past year in our new place has been more than
great, the roof is now leaking and I need to borrow $2,000 to cover the
cost of the repair. I pay all bills (e.g., car loans, cable, utilities) on time.”
Which borrower is more likely to default?
KEY CONTRIBUTIONS
Textual information
in the loan request
significantly helps
predict loan default.
Source: Netzer, O., Lemaire, A., & Herzenstein, M. (2019). When words sweat: Identifying signals for loan default in the text of loan applications. Journal of
Marketing Research, 56(6), 960-980.
KEY CONTRIBUTIONS
Words indicative of
loan repayment.
Source: Netzer, O., Lemaire, A., & Herzenstein, M. (2019). When words sweat: Identifying signals for loan default in the text of loan applications. Journal of
Marketing Research, 56(6), 960-980.
KEY CONTRIBUTIONS
Defaulted loan requests mimic the
writing styles of extroverts and liars.
KEY CONTRIBUTIONS
Evidence that people with different
educational backgrounds and
economic situations use words
differently.
KEY CONTRIBUTIONS
Evidence that text can supplement
traditional measures and replace
some aspects of them.
KEY CONTRIBUTIONS
Help lenders avoid defaulting borrowers
and help borrowers better express
themselves when requesting a loan.
MARKETING AND
NON-MARKETING
APPLICATIONS
MARKETING APPLICATIONS
• Sentiment analysis
• Brand monitoring
• Customer feedback analysis
• Churn prediction
• Predictive analysis
• Market research
• Personalized marketing
• Social media analytics
NON-MARKETING APPLICATIONS
• Psychological profiling
• Fraud detection
• Credit risk assessment
• Customer service
LIMITATIONS
1. Text data may not be available for all loan
applications, as some borrowers may not
provide any text or may provide incomplete
or inaccurate information.
2. Text data may be subject to
interpretation and bias, as different lenders
may interpret the same text differently
based on their own biases and assumptions.
3. The use of text data to predict loan
default raises ethical and legal concerns.
FURTHER RESEARCH
● Whether the predictive ability of text
analysis regarding future behavior extends
to other behaviors and industries.
● Extension of results to other types of
communication, e.g., phone calls
and online chats.
● How word usage can change
over time.
● Explore the role of emotions and
mental states in financial behaviors.
● Investigate the impact of different
writing styles on loan default.
● Apply the findings to other
loan types and platforms.
● Develop more accurate and
efficient text-mining and machine
learning tools for analyzing loan
applications.
KEY TAKEAWAYS
● Text mining and machine learning tools can be
employed to predict psychographics, including
the likelihood of future loan defaults.
● The LIWC dictionaries associated with
extroversion and deception are significantly
correlated with default.
● There may be variables that are affected by
both the observable text and unobservable
personality traits.
Thank you
for your
attention!

Text Mining - Advanced Customer Analytics

  • 1. TEXT MINING Team 4 Syed Aqib Ali Syeda Ramsha Habib Gilani Lateefah Omoyosola Yusuf Rochelle Star Velasquez
  • 2. TABLE OF CONTENT 1. What is Text Mining? 2. Introduction 3. Main Models Used 4. Key Contributions 5. Marketing and Non-marketing Applications 6. Limitations 7. Avenues for future research 8. Key Takeaways
  • 3. WHAT IS TEXT MINING?
  • 4.
  • 5. WHAT IS TEXT MINING? Text mining is a process of deriving/extracting high quality meaningful information and patterns. Text analysis involves information retrieval, analysis to study word frequency distributions, pattern recognition, information extraction, data mining techniques including link and association analysis, visualization, and predictive analytics.
  • 7. INTRODUCTION ● A research study applying Text Mining and Machine Learning tools. ● The authors find that loan applicants' choice of words reveals insights into their intentions, circumstances, and personality. ● This information is powerful in predicting loan repayment, going beyond typical financial and demographic factors.
  • 8. Setting and Data 1. Potential borrowers submit a loan request for a specific amount at a specific maximum interest rate (the highest rate they are willing to pay). 2. The requested loan amount must fall within a set range (between $1,000 and $25,000 in the data). 3. Prosper verifies all financial information, including the potential borrower’s credit score.
  • 9. Textual, Financial, and Demographic Variables 1. Textual variables: a. The number of characters in the title and the text box. b. The percentage of words with six or more letters. c. SMOG: Measures writing quality by mapping the text to the number of years of formal education needed to understand it easily on first reading. d. Count of spelling mistakes. e. Bigrams: Two-word combinations (which help capture context and patterns). 2. Financial variables: a. Loan amount, borrower’s credit grade, debt-to-income ratio. 3. Demographic variables: a. Gender, age, location, race.
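Several of the textual variables listed above can be computed with a few lines of Python. The following is a minimal sketch, not the study's code: the function names, the simple tokenizer, and the vowel-group syllable heuristic are all assumptions made for illustration. The SMOG constants are those of the standard published formula. Spelling-mistake counts are left out because they need a dictionary (the study used PyEnchant for that).

```python
import math
import re

def words(text):
    """Lowercase alphabetic tokens (assumption: digits and punctuation are dropped)."""
    return re.findall(r"[a-z]+", text.lower())

def pct_long_words(text, min_len=6):
    """Percentage of words with at least `min_len` letters."""
    toks = words(text)
    if not toks:
        return 0.0
    n_long = sum(1 for t in toks if len(t) >= min_len)
    return 100.0 * n_long / len(toks)

def bigrams(text):
    """Consecutive two-word combinations."""
    toks = words(text)
    return list(zip(toks, toks[1:]))

def count_syllables(word):
    """Rough heuristic (assumption): one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word)))

def smog_grade(text):
    """Standard SMOG formula, fed by the heuristic syllable counter above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    polysyllables = sum(1 for t in words(text) if count_syllables(t) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30.0 / len(sentences)) + 3.1291

def text_features(text):
    """Collect the features into one record per loan request."""
    return {
        "n_chars": len(text),
        "pct_long_words": pct_long_words(text),
        "smog": smog_grade(text),
        "bigrams": bigrams(text),
    }
```

Calling `text_features(request_text)` on each loan request would give one feature record per application. A real pipeline would likely replace the syllable heuristic with a pronunciation dictionary.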
  • 10. PROCESS OF TEXT MINING Process 01: The tm package in R was used to select the distinct words in each loan application. Process 02: Porter’s stemming algorithm was used to collapse variations of a word into one stem, e.g., “borrower,” “borrowed,” “borrowing,” and “borrowers” become “borrow” (3.5M words → 30,920 unique words and 1,052 bigrams). Process 03: The PyEnchant 1.6.6 package in Python was used to count spelling mistakes in the loan applications. This allows the authors to identify misspelled words, which may serve as a proxy for characteristics correlated with lower income. Process 04: The authors used term frequency-inverse document frequency (tf-idf) to weigh how often a word is used in a loan request against how often it is used across all loan requests, taking the length of the request into account.
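The tf-idf weighting described above can be illustrated with a short Python sketch. This is one common textbook variant (term count divided by document length, times the log of the inverse document frequency). The study's exact weighting may differ, and the function name `tf_idf` is an assumption.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc)
        weights.append({
            # Term frequency is normalized by the length of the request.
            term: (count / length) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

Under this variant, a word that appears in every loan request receives a weight of zero, while a word that is frequent in one request and rare elsewhere receives a high weight.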
  • 12. MODEL 1 - Predictive model Aim: To evaluate whether the text borrowers use in their loan applications predicts loan default. Machine Learning Methods: Ensemble stacking approach 1. Train each model on the calibration data (2 logistic regressions and 3 tree-based methods). 2. Build a weighting model to combine the models calibrated in the first step.
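The two stacking steps above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the base models here are one small logistic regression and one hand-written decision stump (standing in for the five models in the study), the weighting model is a logistic regression on the base models' predicted probabilities, and all names and data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent logistic regression; returns [bias, w1, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict_logistic(w, x):
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def stack(base_models, X, y):
    """Step 2: fit a weighting model on the base models' predicted probabilities."""
    meta_X = [[model(x) for model in base_models] for x in X]
    return fit_logistic(meta_X, y)

def predict_stacked(base_models, meta_w, x):
    return predict_logistic(meta_w, [model(x) for model in base_models])

# Toy calibration data (invented): one feature, label 1 = default.
X = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
y = [0, 0, 0, 1, 1, 1]

# Step 1: train the base models on the calibration data.
w_base = fit_logistic(X, y)
base_models = [
    lambda x: predict_logistic(w_base, x),   # logistic regression
    lambda x: 0.9 if x[0] > 0.5 else 0.1,    # hypothetical decision stump
]

# Step 2: learn how to weight the base models.
meta_w = stack(base_models, X, y)
```

In practice the weighting model is usually fitted on held-out predictions rather than on the same data used to train the base models, to avoid rewarding base models that overfit. The sketch skips that split for brevity.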
  • 13. Result Source: Netzer, O., Lemaire, A., & Herzenstein, M. (2019). When words sweat: Identifying signals for loan default in the text of loan applications. Journal of Marketing Research, 56(6), 960-980.
  • 14. Result Source: Netzer, O., Lemaire, A., & Herzenstein, M. (2019). When words sweat: Identifying signals for loan default in the text of loan applications. Journal of Marketing Research, 56(6), 960-980.
  • 15. MODEL 2 - Words and writing styles of defaulting loan requests Aim: Learn which words, writing styles, and general ideas conveyed by the text are more likely to be associated with defaulting loan requests. Methods: 1) Machine learning tools: naive Bayes and an L1-regularized binary logistic model. 2) Standard econometric tools: logistic regression on the topics extracted from a latent Dirichlet allocation (LDA) analysis and on the sub-dictionaries of the Linguistic Inquiry and Word Count (LIWC) dictionary.
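To make the naive Bayes idea concrete, the sketch below scores each word by how much more likely it is under the "default" class than under the "repaid" class, using a multinomial model with add-one smoothing. The function name and the token lists are invented for illustration. They are not data or results from the study.

```python
import math
from collections import Counter

def word_log_odds(default_docs, repaid_docs, alpha=1.0):
    """log P(word | default) - log P(word | repaid), with add-alpha smoothing."""
    d_counts = Counter(w for doc in default_docs for w in doc)
    r_counts = Counter(w for doc in repaid_docs for w in doc)
    vocab = set(d_counts) | set(r_counts)
    d_total = sum(d_counts.values()) + alpha * len(vocab)
    r_total = sum(r_counts.values()) + alpha * len(vocab)
    return {
        w: math.log((d_counts[w] + alpha) / d_total)
           - math.log((r_counts[w] + alpha) / r_total)
        for w in vocab
    }

# Invented toy tokens, for illustration only.
default_docs = [["god", "bless", "promise"], ["promise", "help"]]
repaid_docs = [["bill", "time", "pay"], ["pay", "repair"]]

scores = word_log_odds(default_docs, repaid_docs)
# Words sorted from most default-associated to most repayment-associated.
ranked = sorted(scores, key=scores.get, reverse=True)
```

A positive score marks a word that is relatively more frequent in defaulting requests in the toy data, and a negative score marks one more frequent in repaid requests.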
  • 16. Result Source: Netzer, O., Lemaire, A., & Herzenstein, M. (2019). When words sweat: Identifying signals for loan default in the text of loan applications. Journal of Marketing Research, 56(6), 960-980.
  • 17. MODEL 3 - Potential Borrower’s Personality Aim: Further exploration of potential traits and states of borrowers. Machine Learning Methods: Applying the LIWC dictionaries. Results: Defaulting loan requests are written in a manner consistent with the writing styles of extroverts and liars.
  • 19. Analyzing applications Borrower 1: “I am a hard working person, married for 25 years, and have two wonderful boys. Please let me explain why I need help. I would use the $2,000 loan to fix our roof. Thank you, God bless you, and I promise to pay you back.” Borrower 2: “While the past year in our new place has been more than great, the roof is now leaking and I need to borrow $2,000 to cover the cost of the repair. I pay all bills (e.g., car loans, cable, utilities) on time.” Which borrower is more likely to default?
  • 20. KEY CONTRIBUTIONS Textual information on the loan significantly helps predict loan default. Source: Netzer, O., Lemaire, A., & Herzenstein, M. (2019). When words sweat: Identifying signals for loan default in the text of loan applications. Journal of Marketing Research, 56(6), 960-980.
  • 21. KEY CONTRIBUTIONS Words indicative of loan repayment. Source: Netzer, O., Lemaire, A., & Herzenstein, M. (2019). When words sweat: Identifying signals for loan default in the text of loan applications. Journal of Marketing Research, 56(6), 960-980.
  • 22. KEY CONTRIBUTIONS Defaulting loan requests mimic the writing styles of extroverts and liars.
  • 23. KEY CONTRIBUTIONS Evidence that people with different educational backgrounds and economic situations use words differently.
  • 24. KEY CONTRIBUTIONS Evidence that textual information can supplement traditional measures and replace some aspects of them.
  • 25. KEY CONTRIBUTIONS Help lenders avoid defaulting borrowers and help borrowers better express themselves when requesting a loan.
  • 27. MARKETING APPLICATIONS • Sentiment analysis • Brand monitoring • Customer feedback analysis • Churn prediction • Predictive analysis • Market research • Personalized marketing • Social media analytics
  • 28. NON-MARKETING APPLICATIONS • Psychological profiling • Fraud detection • Credit risk assessment • Customer service
  • 30. LIMITATIONS 1. Text data may not be available for all loan applications, as some borrowers may not provide any text or may provide incomplete or inaccurate information. 2. Text data may be subject to interpretation and bias, as different lenders may interpret the same text differently based on their own biases and assumptions. 3. The use of text data to predict loan default raises ethical and legal concerns.
  • 32. FURTHER RESEARCH ● Whether the predictive ability of text analysis regarding future behavior extends to other behaviors and industries. ● Extension of the results to other types of communication, e.g., phone calls and online chats. ● How word usage can change over time.
  • 33. FURTHER RESEARCH ● Exploring the role of emotions and mental states in financial behaviors. ● Investigate the impact of different writing styles on loan default. ● Application of the findings to other loan types and platforms. ● Develop more accurate and efficient text-mining and machine learning tools for analyzing loan applications.
  • 35. KEY TAKEAWAYS ● Text mining and machine learning tools can be employed to predict psychographics, including the likelihood of future loan defaults.
  • 36. KEY TAKEAWAYS ● The LIWC dictionaries associated with extroversion and deception are significantly correlated with default.
  • 37. KEY TAKEAWAYS ● There may be variables that are affected by both the observable text and unobservable personality traits.