Agenda
Problem Summary
• Confusion about precise definition of analytics
• Benefit of ‘practical’ definitions
• Issues with the conventional ‘practical’ model of analytics
Model Details
• Data source: ‘analytics’ job adverts
• Topic modeling & Latent Dirichlet Allocation
• Model build & data pre-processing
Implications
• Model analysis
• An alternative definition of analytics
• Implications for OR/MS
Analytics is …
… delivering the right
decision support to the right
people at the right time.
Laursen & Thorlund, 2010, p XII
… the scientific process of
transforming data into insight
for making better decisions
INFORMS
… [the] technologies, systems,
practices, & applications to analyze
critical business data so as to gain
new insights
Lim et al, 2012
… the extensive use of data, statistical
& quantitative analysis, explanatory &
predictive models, & fact-based
management to drive
decisions & actions.
Davenport & Harris, 2007, p 7
… an outgrowth of what is known as
business intelligence […] Today’s
expansive, global enterprises generate a
deluge of data that is impossible for a
human to make sense of.
Varshney & Mojsilovic, 2011
Analytics with a capital "A" is an
umbrella term that represents
our industry at a macro level,
and analytics with a small "a"
refers to technology used to
analyze data.
Eckerson, 2011
… information-intensive concepts
and methods to improve business
decision making.
Chiang et al, 2012
… is the process of obtaining
an optimal and realistic
decision based on existing data
Hamel, 2011
… data analysis that changes the
behavior of the organization
Hackathorn, 2010
… the science of analysis
Wikipedia
… the method of logical
analysis
Merriam-Webster
… the brains to cloud
computing’s brawn
Croll, 2011
… the process of transforming data,
from a variety of sources and of a
variety of types, into insights that
support, improve and/or automate
business decisions, using
technological, quantitative and
presentation techniques
Mortenson et al, 2013
… a group of approaches, organizational
procedures and tools used in combination
with one another to gain information,
analyze that information, and predict
outcomes of problem solutions
Trkman et al, 2010
… the use of data, information
technology, statistical analysis, quantitative
methods, and mathematical or computer-based
models to help managers gain improved insight
about their business operations and make
better, fact-based decisions
Evans, 2012
• Many contrasting and often contradictory definitions
• Particularly difficult to distinguish analytics from
business intelligence or similar fields
• Does it matter?
  ◦ Potential confusion
  ◦ As analytics is multi-disciplinary it is important that a common language can be established
  ◦ Important so that the growing job market can be met with the appropriate training
What is Analytics?
Analytics: Practical Definition
Source: Blackett, 2012
Advantages
• Focuses on application &
generation of value
• Demonstrates the
disciplines informing
analytics
Issues
• Some methods suggest
different purposes
• The implied progression towards prescriptive analytics as the most advanced stage may not always hold
Job Adverts
• Analyse “analytics” job adverts, following the tradition of ‘ASP’ studies (e.g. Liberatore and Luo, 2012)
• Instead of studying a small pool of jobs, we access adverts through the LinkedIn API
  ◦ Over 250k jobs online
  ◦ 77% of all jobs are posted on LinkedIn (Dougherty, 2012)
• Scripted using Python & stored in MongoDB
  ◦ OAuth, SimpleJSON, & PyMongo
• Need to reduce and generalise results from >6,800 adverts with >50,000 unique words
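To illustrate why a vocabulary of this size needs reducing, the following is a minimal sketch of counting unique words across a set of adverts. The three advert strings and the function names are made up for illustration; the real corpus and the LinkedIn / MongoDB retrieval code are not reproduced here.

```python
import re
from collections import Counter

# Hypothetical stand-ins for advert descriptions (not real data).
adverts = [
    "Analytics manager required to lead reporting projects",
    "Lead analyst: reporting, forecasting and analytics",
    "Consultant with analytics and SAP experience required",
]


def tokenize(text):
    """Lower-case the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_frequencies(docs):
    """Count, for each unique word, the number of adverts it appears in."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return doc_freq


doc_freq = document_frequencies(adverts)
print(len(adverts), "adverts,", len(doc_freq), "unique words")
print(doc_freq.most_common(3))
```

Even three short adverts produce more unique words than documents, which is the situation the topic model is meant to summarise.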
Topic Models
• Topic models assume documents to be a collection of
latent topics. The topics determine which words are used
• Probabilistic models that determine the topics by analysis
of the co-occurrence of the words used
• The most common are Probabilistic Latent Semantic
Indexing (pLSI) and Latent Dirichlet Allocation (LDA)
Latent Dirichlet Allocation (LDA)
• Basic conception is that a collection of documents has
three layers and contains:
[Plate diagram]
Documents (plate M)
Words (plate N)
Words: W
Topics: Z
Topic Distribution: θ
Alpha Parameter: α
Beta Parameter: β
Adapted from Blei et al, 2003
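The layers in the diagram can be read as a generative story: each document draws a topic distribution θ from a Dirichlet prior with parameter α, each word position draws a topic Z from θ, and the word W is then drawn from that topic's word distribution in β. The sketch below follows that story using only the standard library; the vocabulary, α and β values are toy assumptions, not values from the study.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalised gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(alpha, beta, vocab, n_words, rng):
    """Generate one document following the LDA generative story."""
    theta = sample_dirichlet(alpha, rng)  # topic distribution for this document
    topics, words = [], []
    for _ in range(n_words):
        z = rng.choices(range(len(alpha)), weights=theta)[0]  # topic for this word
        w = rng.choices(vocab, weights=beta[z])[0]  # word drawn from topic z
        topics.append(z)
        words.append(w)
    return theta, topics, words


# Toy, made-up parameters: 2 topics over a 4-word vocabulary.
vocab = ["data", "reporting", "strategy", "oracle"]
alpha = [0.5, 0.5]
beta = [
    [0.4, 0.4, 0.2, 0.0],  # topic 0 never emits "oracle"
    [0.3, 0.0, 0.1, 0.6],  # topic 1 never emits "reporting"
]

rng = random.Random(0)
theta, topics, words = generate_document(alpha, beta, vocab, n_words=8, rng=rng)
print(theta, topics, words)
```

Fitting the model runs this story in reverse: only the words are observed, and θ, Z and the topic-word distributions are inferred.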
Latent Dirichlet Allocation - Process
• Model is built by:
1. Estimating topics as a product of the observed words
2. Using these to estimate document topic proportions
3. Evaluating the corpus based on the distributions suggested in (1) & (2)
4. Using (3) to improve the topic estimates in (1)
5. Iterating until the best fit is found
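The iterative loop above can be sketched with collapsed Gibbs sampling, which is one common way of fitting LDA. The slides do not say which estimator was used, so this is an illustrative substitute rather than the authors' method. The toy corpus, hyperparameter values and function name are assumptions.

```python
import random


def gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs are lists of integer word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]  # words in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # word w assigned to topic k
    topic_total = [0] * n_topics  # all words assigned to topic k
    assignments = []

    # Start from a random topic for every word occurrence.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Weight = (how much doc d uses topic t) x (how much topic t uses word w).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed estimates of document-topic proportions and topic-word distributions.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + vocab_size * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return theta, phi


# Toy corpus: word ids 0-1 and 2-3 tend to appear in different documents.
docs = [[0, 1, 0, 1, 0], [2, 3, 2, 3, 3], [0, 1, 1, 0, 2]]
theta, phi = gibbs_lda(docs, n_topics=2, vocab_size=4)
print(theta)
print(phi)
```

Each sweep re-estimates the topic of every word from the current topic-word and document-topic counts, which mirrors steps 1 to 4; the outer loop is step 5, here run for a fixed number of iterations rather than to a convergence test.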
Latent Dirichlet Allocation - Assumptions
• Bag-of-words / exchangeability
• The number of topics (K) is known and pre-determined
  ◦ Cross-validation to identify K with the lowest perplexity
• Topic independence
  ◦ As α is a parameter of a Dirichlet prior, each topic is assumed to be independent and not correlated
  ◦ In this research correlation between topics has to be assumed
  ◦ Alternative is the correlated topic model (Blei & Lafferty, 2007), which uses a logistic normal rather than a Dirichlet distribution
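Choosing K by perplexity can be sketched as follows. Perplexity is the exponential of the negative average per-word log-likelihood on held-out documents, so lower is better. The function names and the perplexity values per K are made up for illustration; in practice the log-likelihoods would come from a model fitted on the training folds.

```python
import math


def perplexity(doc_log_likelihoods, doc_lengths):
    """exp(-(total held-out log-likelihood) / (total held-out words))."""
    return math.exp(-sum(doc_log_likelihoods) / sum(doc_lengths))


def choose_k(perplexity_by_k):
    """Return the candidate K with the lowest held-out perplexity."""
    return min(perplexity_by_k, key=perplexity_by_k.get)


# Sanity check: a model that spreads probability uniformly over the
# vocabulary has perplexity equal to the vocabulary size.
vocab_size = 50
doc_lengths = [10, 20]
uniform_ll = [n * math.log(1 / vocab_size) for n in doc_lengths]
print(perplexity(uniform_ll, doc_lengths))

# Made-up cross-validation results, for illustration only.
made_up_perplexities = {5: 812.4, 10: 640.9, 15: 668.3}
print(choose_k(made_up_perplexities))
```

The uniform case gives a useful upper reference point: a fitted topic model should score well below the vocabulary size on held-out text.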
Data Pre-Processing & Model Build
• Strip HTML / XML
• Remove stop words, numbers and punctuation
• Remove words < 3 characters
• Remove most and least frequent words
  ◦ Python: HTMLParser, gensim and string
  ◦ R: tm and topicmodels
• To stem or not to stem?
  ◦ "the job involves managing analytics projects"
  ◦ "the job involves the management of analytical projects"
  ◦ "has experience running projects using management science and analytics"
  ◦ "managing a team of scientists analysing the experience of runners"
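The cleaning steps above can be sketched with the standard library alone (in Python 3 the HTML parser lives in `html.parser`). The stop-word list, the frequency thresholds and the function names are illustrative assumptions, and the `<p>` tags and "5 years" are added to the slide's example sentences only to exercise the markup and number removal.

```python
import re
from collections import Counter
from html.parser import HTMLParser

STOP_WORDS = {"the", "and", "for", "with", "has", "using"}  # tiny illustrative list


class _TextExtractor(HTMLParser):
    """Collect only the text that appears between tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(raw):
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    return " ".join(parser.parts)


def tokenize(raw, min_len=3):
    """Strip markup, then drop numbers, punctuation, short words and stop words."""
    words = re.findall(r"[a-z]+", strip_markup(raw).lower())
    return [w for w in words if len(w) >= min_len and w not in STOP_WORDS]


def filter_by_frequency(docs, min_docs=2, max_share=0.9):
    """Drop words found in fewer than min_docs documents or in more than max_share of them."""
    doc_freq = Counter(w for doc in docs for w in set(doc))
    limit = max_share * len(docs)
    return [[w for w in doc if min_docs <= doc_freq[w] <= limit] for doc in docs]


raw_adverts = [
    "<p>The job involves managing analytics projects</p>",
    "<p>The job involves the management of analytical projects</p>",
    "<p>Has 5 years experience running projects using management science and analytics</p>",
]
tokens = [tokenize(a) for a in raw_adverts]
filtered = filter_by_frequency(tokens)
print(tokens[0])
print(filtered)
```

Without stemming, "managing" and "management", and "analytics" and "analytical", stay separate terms, so the first two adverts look less alike than they are. Stemming would merge them, at the risk of also merging unrelated uses such as those in the last two example sentences on the slide.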
Topic Results
• 30 topics identified
• All topics are created equally but some are more topical
than others
[Bar chart: Most Likely Topic per Document as % of Corpus; axis from 0% to 45%]
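The quantity plotted here, the share of documents for which each topic is the single most likely one, can be computed from a fitted model's document-topic probabilities. A minimal sketch follows; the probability matrix is made up and the function name is an assumption.

```python
from collections import Counter


def most_likely_topic_shares(doc_topic_probs):
    """Return {topic: share of documents whose highest-probability topic it is}."""
    top_topics = [max(range(len(p)), key=p.__getitem__) for p in doc_topic_probs]
    counts = Counter(top_topics)
    return {k: counts[k] / len(doc_topic_probs) for k in sorted(counts)}


# Made-up document-topic probabilities: 4 documents, 3 topics.
doc_topic_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
    [0.5, 0.1, 0.4],
]
shares = most_likely_topic_shares(doc_topic_probs)
for topic, share in shares.items():
    print(f"Topic {topic}: {share:.0%}")
```

Topics that are never the top choice for any document do not appear in the result, which is why a model with many topics can still show only a handful of dominant ones.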
Most Likely Terms in Topics
• Analysis of the 3rd, 4th & 5th most likely topics
Topic 3 (4th): Digital & Web (8%)
other, media, across, working, understanding, analysis, social, projects, responsible, required, ensure, within, design, key, performance, digital, company, manager, products, their, lead, tools, role, services
Topic 13 (3rd): Consultancy (17%)
working, market, develop, project, software, process, media, reporting, key, through, requirements, solutions, manager, excellent, your, strategy, multiple, more, service, opportunity, manage, well, opportunities, clients
Topic 9 (5th): Technical (7%)
risk, systems, design, solutions, services, other, tools, technical, teams, related, provide, required, position, degree, such, operations, global, skills, project, opportunity, clients, service, excellent, products
Most Likely Terms in Topics (cont.)
• Analysis of the top two most likely topics
Topic 20 (1st): Strategic (41%)
reporting, analysis, media, required, strategy, related, strategic, manager, company, degree, risk, online, products, across, drive, must, manage, responsible, well, financial, planning, industry, lead, software
Topic 21 (2nd): Computing (20%)
services, solutions, technology, clients, digital, consulting, your, more, implementation, management, oracle, technical, capabilities, design, provide, advisory, strategy, integration, technologies, sap, career, enterprise, solution, architecture
Model Analysis
• Main five topics:
  ◦ Technical
  ◦ Digital/Web
  ◦ Consultancy
  ◦ Computing
  ◦ Strategic
• ‘Digital/Web’ is a specialism within analytics (also ‘Financial’)
• ‘Technical’ & ‘Consultancy’ are specific job types or environments
  ◦ However, some technical (‘hard’) skills & some consulting-type (‘soft’) skills are likely to be required in all analytics jobs
• ‘Computing’ & ‘Strategic’?
The Analytics of Computing?
[Diagram: Discovery Analytics]
Axes: Basic Analytics Capability to Advanced Analytics Capability; Soft to Hard
• Data Warehouses
• Big Data Architecture
• Stock Market Analysis
• Algorithmic Trading
• Fraud Investigation
• Automatic Fraud Detection
• Customer Segmentation
• Propensity Modeling
• Clickstream Analysis
• Behavioural Targeting
• Qualitative Text Analysis
• Natural Language Processing
• Reports & Dashboards
• Advanced Visualisation
The Analytics of Strategy?
[Diagram: Decision Analytics]
Axes: Basic Analytics Capability to Advanced Analytics Capability; Soft to Hard
• Trial & Error
• Experimentation
• Optimisation Simulation
• Basic Forecasting
• ARIMA Time Series
• Performance Metrics
• Data Envelopment Analysis
• A/B Testing
• Multivariate Testing
• Business Analysis
• Business Process Optimisation
• Requirements Gathering
• Problem Structuring
An Alternative Definition of Analytics
• Descriptive Analytics: Statistical and data modeling techniques designed to describe past events and answer “what happened?”
• Predictive Analytics: Data mining and machine learning techniques used to predict future events and answer “what will happen next?”
• Prescriptive Analytics: OR/MS, advanced statistical and mathematical models used to prescribe future actions and answer “what should we do next?”
An Alternative Definition of Analytics
Technological | Strategic
Lower Risk Decisions | Higher Risk Decisions
Discovery Analytics | Decision Analytics
Advanced Discovery Analytics
Reporting & alerts
Market research
Information systems
Basic historical analysis
Performance metrics
Stakeholder consultation
Advanced visualisation
Real time insights
Automated decisions
Advanced Decision Analytics
Advanced modelling
Problem structuring
Decision analysis
Advanced
Summary & Implications for OR/MS
• Implemented a correlated topic model on 6,873 job adverts
• An alternative practical definition of analytics has been suggested: discovery and decision analytics
  ◦ Maintains the focus on business value, application & the disciplines that inform analytics
  ◦ However, removes the contradictions in the previous model
• OR/MS has an obvious role in advanced decision analytics,
both in hard and soft applications
• Further exploration (and/or promotion) of the role of
OR/MS in advanced discovery analytics
Contact Details and Questions
Email: m.j.mortenson@lboro.ac.uk
Website: www.whatisanalytics.co.uk
Mobile: 07833 XXXXXX
LinkedIn: http://www.linkedin.com/profile/view?id=114000243&trk=tab_pro
(or search Michael Mortenson)
Presentation title: A Topic Model of Analytics Job Adverts (Operational Research Society Annual Conference, Sept 2013)

VidaXL dropshipping via API with DroFx.pptx
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 

A Topic Model of Analytics Job Adverts (Operational Research Society Annual Conference, Sept 2013)

  • 1.
  • 2. Agenda
      • Problem Summary
          - Confusion about precise definition of analytics
          - Benefit of ‘practical’ definitions
          - Issues with the conventional ‘practical’ model of analytics
      • Model Details
          - Data source: ‘analytics’ job adverts
          - Topic modeling & Latent Dirichlet Allocation
          - Model build & data pre-processing
      • Implications
          - Model analysis
          - An alternative definition of analytics
          - Implications for OR/MS
  • 3. Analytics is …
      • … delivering the right decision support to the right people at the right time. (Laursen & Thorlund, 2010, p XII)
      • … the scientific process of transforming data into insight for making better decisions (INFORMS)
      • … [the] technologies, systems, practices, & applications to analyze critical business data so as to gain new insights (Lim et al, 2012)
      • … the extensive use of data, statistical & quantitative analysis, explanatory & predictive models, & fact-based management to drive decisions & actions. (Davenport & Harris, 2007, p 7)
      • … an outgrowth of what is known as business intelligence […] Today’s expansive, global enterprises generate a deluge of data that is impossible for a human to make sense of. (Varshney & Mojsilovic, 2011)
      • Analytics with a capital "A" is an umbrella term that represents our industry at a macro level, and analytics with a small "a" refers to technology used to analyze data. (Eckerson, 2011)
      • … information-intensive concepts and methods to improve business decision making. (Chiang et al, 2012)
      • … is the process of obtaining an optimal and realistic decision based on existing data (Hamel, 2011)
      • … data analysis that changes the behavior of the organization (Hackathorn, 2010)
      • … the science of analysis (Wikipedia)
      • … the method of logical analysis (Merriam-Webster)
      • … the brains to cloud computing’s brawn (Croll, 2011)
      • … the process of transforming data, from a variety of sources and of a variety of types, into insights that support, improve and/or automate business decisions, using technological, quantitative and presentation techniques (Mortenson et al, 2013)
      • … a group of approaches, organizational procedures and tools used in combination with one another to gain information, analyze that information, and predict outcomes of problem solutions (Trkman et al, 2010)
      • … the use of data, information technology, statistical analysis, quantitative methods, and mathematical or computer-based models to help managers gain improved insight about their business operations and make better, fact-based decisions (Evans, 2012)
    What is Analytics?
      • Many contrasting and often contradictory definitions
      • Particularly difficult to distinguish analytics from business intelligence or similar fields
      • Does it matter?
          - Potential confusion
          - As analytics is multi-disciplinary it is important that a common language can be established
          - Important so that the growing job market can be met with the appropriate training
  • 4. Analytics: Practical Definition
      [Figure: the conventional ‘practical’ model of analytics. Source: Blackett, 2012]
      • Advantages
          - Focuses on application & generation of value
          - Demonstrates the disciplines informing analytics
      • Issues
          - Some methods suggest different purposes
          - The suggestion that progressing to prescriptive analytics means becoming more advanced may not always hold
  • 5. Job Adverts
      • Analyse “analytics” job adverts, following the tradition of ‘ASP’ studies (e.g. Liberatore and Luo, 2012)
      • Instead of studying a smaller pool of jobs, we access adverts through the LinkedIn API
          - Over 250k jobs online
          - 77% of all jobs are posted on LinkedIn (Dougherty, 2012)
      • Scripted using Python & stored in MongoDB
          - OAuth, SimpleJSON, & PyMongo
      • Need to reduce and generalise results from >6,800 adverts with >50,000 unique words.
  • 6. Topic Models
      • Topic models assume documents to be a collection of latent topics. The topics determine which words are used
      • Probabilistic models that determine the topics by analysis of the co-occurrence of the words used
      • The most common are Probabilistic Latent Semantic Indexing (pLSI) and Latent Dirichlet Allocation (LDA)
  • 7. Latent Dirichlet Allocation (LDA)
      • Basic conception is that a collection of documents has three layers and contains:
      [Plate diagram, adapted from Blei et al, 2003: Documents (plate M); Words (plate N); Words W; Topics Z; Topic Distribution θ; Alpha Parameter α; Beta Parameter β]
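The plate diagram on this slide corresponds to the generative process of the basic LDA model in Blei et al. (2003), which can be written out as below. The indices d (document) and n (word position), and the symbol K for the number of topics, are notation added here for readability and do not appear on the slide.

```latex
% For each document d = 1, ..., M:
\theta_d \sim \mathrm{Dirichlet}(\alpha)
% For each word position n = 1, ..., N_d in document d:
z_{d,n} \mid \theta_d \sim \mathrm{Multinomial}(\theta_d), \qquad
w_{d,n} \mid z_{d,n}, \beta \sim \mathrm{Multinomial}(\beta_{z_{d,n}})

% Joint distribution for a single document with N words
% (\beta is a K \times V matrix of per-topic word probabilities):
p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta)
  = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta)
```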
  • 8. Latent Dirichlet Allocation - Process
      • Model is built by:
          1. Estimating topics as a product of the observed words
          2. Using these to estimate document topic proportions
          3. Evaluating the corpus based on the distributions suggested in (1) & (2)
          4. Using (3) to improve the topic estimates in (1)
          5. Repeating until the best fit is found
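To make the estimate-evaluate-improve loop concrete, here is a minimal sketch of one common way to fit LDA, a collapsed Gibbs sampler. This is a swapped-in technique chosen because it fits in a few lines of plain Python; it is not necessarily the estimation procedure used for the talk (the slides mention the R TopicModels package). The function name, the default hyperparameters, and the name `eta` for the prior on topic-word distributions are all assumptions made for illustration.

```python
import random


def fit_lda_gibbs(docs, num_topics, alpha=0.1, eta=0.01, iterations=200, seed=0):
    """Fit LDA to tokenised documents with a collapsed Gibbs sampler (sketch).

    docs is a list of token lists. Returns (vocab, theta, phi) where theta[d][k]
    is the share of topic k in document d and phi[k][v] is the probability of
    vocabulary word v under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # words in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # times word v is assigned to topic k
    topic_total = [0] * num_topics                              # total words assigned to topic k
    assignments = []

    # Start from a random topic assignment for every word.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    # Repeatedly resample each word's topic given all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][n]
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + eta)
                    / (topic_total[k] + vocab_size * eta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Turn the final counts into smoothed probability estimates.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + eta) / (topic_total[k] + vocab_size * eta) for c in topic_word[k]]
        for k in range(num_topics)
    ]
    return vocab, theta, phi
```

Each pass uses the current topic estimates to reassign words and then updates the estimates from those assignments, which mirrors steps 1 to 5 in spirit. On a corpus the size of the one in this talk a library implementation would be the practical choice.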
  • 9. Latent Dirichlet Allocation - Assumptions
      • Bag-of-words / exchangeability
      • The number of topics (K) is known and pre-determined
          - Cross-validation to identify K with the lowest perplexity
      • Topic independence
          - As α is a parameter of a Dirichlet prior, each topic is assumed to be independent and not correlated
          - In this research correlation between topics has to be assumed.
          - Alternative is the correlated topic model (Blei & Lafferty, 2007), which uses a logistic normal rather than a Dirichlet distribution
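Perplexity is the exponential of the negative average log-likelihood per held-out word, so lower values indicate a better fit. The sketch below computes it for held-out documents and then picks the K with the lowest score. It assumes the document-topic proportions (`theta`) for the held-out documents have already been inferred by whatever fitting routine is in use; the function names and the example perplexity values are hypothetical.

```python
import math


def perplexity(docs, theta, phi, word_id):
    """Perplexity of tokenised held-out docs under a fitted topic model.

    theta[d][k]: proportion of topic k in document d (assumed already inferred).
    phi[k][v]:   probability of vocabulary word v under topic k.
    word_id:     mapping from token to its column index in phi.
    """
    log_likelihood = 0.0
    n_words = 0
    for d, doc in enumerate(docs):
        for w in doc:
            v = word_id[w]
            # Probability of the word is a mixture over the document's topics.
            p = sum(theta[d][k] * phi[k][v] for k in range(len(phi)))
            log_likelihood += math.log(p)
            n_words += 1
    return math.exp(-log_likelihood / n_words)


def pick_k(perplexity_by_k):
    """Return the number of topics with the lowest cross-validated perplexity."""
    return min(perplexity_by_k, key=perplexity_by_k.get)


# Illustrative values only, not results from the talk.
best_k = pick_k({10: 900.0, 30: 650.0, 50: 700.0})
```

A model that spreads probability uniformly over a vocabulary of V words has perplexity V, which gives a useful baseline when reading the scores.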
  • 10. Data Pre-Processing & Model Build
      • Strip HTML / XML
      • Remove stop words, numbers and punctuation
      • Remove words < 3 characters
      • Remove most and least frequent words
          - Python: HTMLParser, GenSim and String
          - R: TM and TopicModels
      • To stem or not to stem?
          - "the job involves managing analytics projects"
          - "the job involves the management of analytical projects"
          - "has experience running projects using management science and analytics"
          - "managing a team of scientists analysing the experience of runners"
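The cleaning steps on this slide can be sketched with the Python standard library alone. This is an illustration rather than the script used for the talk: the stop-word list is a tiny placeholder, and the thresholds `min_docs` and `max_share` for dropping rare and common words are assumed values.

```python
import string
from collections import Counter
from html.parser import HTMLParser

# Placeholder list for illustration; a real run would use a full stop-word list.
STOP_WORDS = {"the", "and", "for", "with", "that", "this", "are", "has"}

# Map every ASCII punctuation mark and digit to a space.
_DROP = str.maketrans({c: " " for c in string.punctuation + string.digits})


class _TextExtractor(HTMLParser):
    """Collects the text between tags and discards the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_markup(raw):
    """Return the text content of an HTML / XML fragment."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()  # flush any text still buffered by the parser
    return " ".join(parser.chunks)


def tokenize(raw):
    """Strip markup, numbers and punctuation; drop stop words and words < 3 characters."""
    text = strip_markup(raw).lower().translate(_DROP)
    return [t for t in text.split() if len(t) >= 3 and t not in STOP_WORDS]


def prune_by_document_frequency(token_lists, min_docs=2, max_share=0.9):
    """Remove words found in fewer than min_docs documents or in more than max_share of them."""
    n_docs = len(token_lists)
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    keep = {
        t for t, df in doc_freq.items()
        if df >= min_docs and df / n_docs <= max_share
    }
    return [[t for t in tokens if t in keep] for tokens in token_lists]
```

Stemming is deliberately left out of the sketch. As the four example sentences on the slide suggest, collapsing "managing", "management" and "manager", or "analytics" and "analysing", can merge words that signal quite different kinds of job.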
  • 11. Topic Results
      • 30 topics identified
      • All topics are created equally but some are more topical than others
      [Chart: Most Likely Topic per Document as % of Corpus; axis from 0% to 45%]
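The quantity plotted on this slide can be derived from the fitted document-topic proportions: assign each document to its single most probable topic, then count how many documents each topic wins. A minimal sketch, assuming `theta` is a list of per-document topic proportions (the name and the example values are hypothetical):

```python
from collections import Counter


def most_likely_topic_shares(theta):
    """Percentage of documents for which each topic is the most likely one.

    theta[d][k] is the proportion of topic k in document d. Ties go to the
    lowest topic index. Topics that are never the top topic are omitted.
    """
    top_topics = [max(range(len(row)), key=row.__getitem__) for row in theta]
    counts = Counter(top_topics)
    return {k: 100.0 * c / len(theta) for k, c in counts.most_common()}


# Illustrative values only, not results from the talk.
shares = most_likely_topic_shares([
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
    [0.5, 0.25, 0.25],
])
```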
  • 12. Most Likely Terms in Topics
      • Analysis of the 3rd, 4th & 5th most likely topics
          - Topic 13 (3rd), Consultancy (17%): working market develop project software process media reporting key through requirements solutions manager excellent your strategy multiple more service opportunity manage well opportunities clients
          - Topic 3 (4th), Digital & Web (8%): other media across working understanding analysis social projects responsible required ensure within design key performance digital company manager products their lead tools role services
          - Topic 9 (5th), Technical (7%): risk systems design solutions services other tools technical teams related provide required position degree such operations global skills project opportunity clients service excellent products
  • 13. Most Likely Terms in Topics (cont.)
      • Analysis of the top two most likely topics
          - Topic 20 (1st), Strategic (41%): reporting analysis media required strategy related strategic manager company degree risk online products across drive must manage responsible well financial planning industry lead software
          - Topic 21 (2nd), Computing (20%): services solutions technology clients digital consulting your more implementation management oracle technical capabilities design provide advisory strategy integration technologies sap career enterprise solution architecture
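Term lists like these are typically read off the fitted topic-word distribution by sorting each topic's vocabulary by probability. A minimal sketch, assuming one row of per-word probabilities per topic; the function name and the example numbers are hypothetical:

```python
def top_terms(topic_word_probs, vocab, n=10):
    """Return the n most probable terms for one topic, most likely first.

    topic_word_probs[v] is the probability of vocab[v] under the topic.
    """
    ranked = sorted(zip(vocab, topic_word_probs), key=lambda pair: pair[1], reverse=True)
    return [term for term, _ in ranked[:n]]


# Illustrative values only, not results from the talk.
example = top_terms([0.2, 0.5, 0.3], ["risk", "media", "clients"], n=2)
```

The labels attached to each topic (Strategic, Computing and so on) are the presenter's interpretation of such lists; the model itself produces only the ranked terms.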
  • 14. Model Analysis
      • Main five topics:
          - Technical
          - Digital/Web
          - Consultancy
          - Computing
          - Strategic
      • ‘Digital/Web’ is a specialism within analytics (also ‘Financial’)
      • ‘Technical’ & ‘Consultancy’ are specific job types or environments
          - However, some technical (‘hard’) skills & some consulting-type (‘soft’) skills are likely to be required in all analytics jobs
      • ‘Computing’ & ‘Strategic’?
  • 15. The Analytics of Computing?
      [Diagram: Discovery Analytics. Axes: Basic Analytics Capability to Advanced Analytics Capability; Soft to Hard]
      • Data Warehouses → Big Data Architecture
      • Stock Market Analysis → Algorithmic Trading
      • Fraud Investigation → Automatic Fraud Detection
      • Customer Segmentation → Propensity Modeling
      • Clickstream Analysis → Behavioural Targeting
      • Qualitative Text Analysis → Natural Language Processing
      • Reports & Dashboards → Advanced Visualisation
  • 16. The Analytics of Strategy?
      [Diagram: Decision Analytics. Axes: Basic Analytics Capability to Advanced Analytics Capability; Soft to Hard]
      • Trial & Error → Experimentation
      • Optimisation → Simulation
      • Basic Forecasting → ARIMA Time Series
      • Performance Metrics → Data Envelopment Analysis
      • A/B Testing → Multivariate Testing
      • Business Analysis → Business Process Optimisation
      • Requirements Gathering → Problem Structuring
  • 17. An Alternative Definition of Analytics
      • Descriptive Analytics: Statistical and data modeling techniques designed to describe past events and answer “what happened?”
      • Predictive Analytics: Data mining and machine learning techniques used to predict future events and answer “what will happen next?”
      • Prescriptive Analytics: OR/MS, advanced statistical and mathematical models used to prescribe future actions and answer “what should we do next?”
  • 18. An Alternative Definition of Analytics
      [Matrix. Axes: Technological to Strategic; Lower Risk Decisions to Higher Risk Decisions]
      • Discovery Analytics: Reporting & alerts; Market research; Information systems
      • Decision Analytics: Basic historical analysis; Performance metrics; Stakeholder consultation
      • Advanced Discovery Analytics: Advanced visualisation; Real time insights; Automated decisions
      • Advanced Decision Analytics: Advanced modelling; Problem structuring; Decision analysis
  • 19. Summary & Implications for OR/MS
      • Implemented a correlated topic model on 6,873 job adverts
      • An alternative practical definition of analytics has been suggested: discovery and decision analytics
          - Maintains the focus on business value, application & the disciplines that inform analytics
          - However, removes the contradictions in the previous model
      • OR/MS has an obvious role in advanced decision analytics, both in hard and soft applications
      • Further exploration (and/or promotion) of the role of OR/MS in advanced discovery analytics
  • 20. Contact Details and Questions
      • Email: m.j.mortenson@lboro.ac.uk
      • Website: www.whatisanalytics.co.uk
      • Mobile: 07833 XXXXXX
      • LinkedIn: http://www.linkedin.com/profile/view?id=114000243&trk=tab_pro (or search Michael Mortenson)