Social Media Analysis to Monitor
Cannabis Trends
Presenter: Raminta Daniulaityte, Ph.D.
CITAR & Kno.e.sis,
Wright State University
Boonshoft School of Medicine
T32 Substance Abuse Seminar
(Public Health Seminar at Columbia University)
February 23, 2017
© Wright State University
Center for Interventions,
Treatment, and Addiction
Research (CITAR)
Ohio Center of Excellence in
Knowledge-enabled
Computing (Kno.e.sis)
Research Team
NIH/NIDA R01 DA039454
Trending: Social media analysis to monitor cannabis and synthetic cannabinoid use
Principal Investigators:
Raminta Daniulaityte, Ph.D., Center for Interventions, Treatment, and
Addiction Research (CITAR), Wright State University
Boonshoft School of Medicine
Amit Sheth, Ph.D., Ohio Center of Excellence in Knowledge-Enabled
Computing (Kno.e.sis), Wright State University
Co-Investigators:
Robert Carlson, Ph.D. (CITAR)
Silvia Martins, M.D., Ph.D. (Columbia U)
Ramzi Nahhas, Ph.D. (Comm. Health, WSU)
Edward Boyer, M.D., Ph.D. (U Mass)
Krishnaprasad Thirunarayan, Ph.D. (Kno.e.sis)
Research Staff:
Francois R. Lamy, Ph.D. (CITAR, Postdoc)
G. Alan Smith (Kno.e.sis, Software Engineer)
Sanjaya Wijeratne (Kno.e.sis, Ph.D. student)
Farahnaz Golroo (Kno.e.sis, Ph.D. student)
No Conflicts of Interest to declare
Project Aims
• Aim 1: Develop a comprehensive software platform, eDrugTrends, for
semi-automated processing and visualization of spatio-temporal and social
network dimensions of social media data (Twitter and Web forums) on
cannabis and synthetic cannabinoid use.
• Aim 2: Deploy eDrugTrends to identify and compare trends in knowledge,
attitudes, and behaviors related to cannabis and synthetic cannabinoid use
across U.S. regions with different cannabis legalization policies using Twitter
and Web forum data.
• Types of data sources:
o Twitter (brief content, but over 500 million tweets/day, geo-info)
o Web forums such as Bluelight, drugs-forum, Reddit (detailed discussions of drug
use practices)
o Web survey on Bluelight
Presentation Objectives
• Overview of the technical capabilities of the eDrugTrends platform
to process Twitter data
• How data is collected
• Geo-location identification
• Keyword selection and monitoring
• Tweet content processing
• Exploration of recently collected and processed data on
marijuana concentrates
• Integrating geographic and content analysis features to
explore cannabis-related tweeting activity
Twitter Data Collection
• Tweets are collected using Twitter’s streaming Application Programming
Interface (API) that provides free access to 1% of all tweets.
• Publicly available tweets only.
• The system automatically filters out non-English language tweets.
• The current system started data collection in March 2015; close to 90 million
tweets have been collected.
eDrugTrends Dashboard Showing Incoming Tweets and Trending Topics
What does “up to 1%” mean?
• Free access to 1% of all tweets
o It can be thought of as a “bucket” that can fit up to 1% of all tweets.
o Assuming 400 million tweets are generated per day, 1% would constitute
about 4 million tweets per day.
o Still, it is possible to miss some of the tweets due to sudden volume spikes.
• With a reasonably limited number of keywords, all or
most relevant tweets can be collected.
• Our system collects an average of about 150,000 tweets
per day, which is below the allowable limit.
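The arithmetic behind the "bucket" can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the figures stated on the slide (400 million tweets per day, a 1% cap, about 150,000 collected tweets per day); the variable names are illustrative and not part of eDrugTrends.

```python
# Back-of-the-envelope check of the streaming "bucket" described above.
DAILY_TWEETS = 400_000_000   # daily volume assumed on the slide
STREAM_CAP_PERCENT = 1       # free streaming access: up to 1% of all tweets
COLLECTED_PER_DAY = 150_000  # average daily collection reported on the slide

# Integer arithmetic keeps the cap exact.
daily_cap = DAILY_TWEETS * STREAM_CAP_PERCENT // 100
share_of_cap = COLLECTED_PER_DAY / daily_cap

print(f"daily cap: {daily_cap:,} tweets")
print(f"share of cap used: {share_of_cap:.2%}")
```

Under these assumptions the keyword stream uses only a small fraction of the cap, which is why most relevant tweets can be expected to be captured outside of sudden volume spikes.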
Extraction of Geo-Location Information
• Tweets may contain GPS coordinates (via a mobile phone
that supports the feature).
• Users may indicate their geo information in their user profiles:
WHERE THE WEED AT
DAYTON, OH
SAN DIEGO
Pittsburgh, PA
wonderland
Earth
• eDrugTrends geo-locates close to 30% of tweets for state-level and
county-level information.
• Some earlier studies reported 1-3% of tweets with geo-location identification.
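Resolving free-text profile locations like the examples above can be sketched as a lookup. This is a minimal illustration, not the eDrugTrends geo-coding method: the tiny lookup tables and the function name `state_from_profile` are hypothetical, and a real system would need a full gazetteer plus handling of GPS coordinates.

```python
import re

# Illustrative subsets only (assumption): a real gazetteer would cover
# all states, cities, and counties.
STATE_ABBREVIATIONS = {"OH": "Ohio", "PA": "Pennsylvania", "CA": "California"}
KNOWN_CITIES = {"san diego": "California"}  # hypothetical city lookup


def state_from_profile(location):
    """Return a state name for a free-text profile location, or None."""
    text = location.strip()
    # Pattern "City, ST", such as "DAYTON, OH" or "Pittsburgh, PA".
    match = re.fullmatch(r".+,\s*([A-Za-z]{2})", text)
    if match:
        state = STATE_ABBREVIATIONS.get(match.group(1).upper())
        if state:
            return state
    # Fall back to a bare city name; anything else stays unresolved.
    return KNOWN_CITIES.get(text.lower())


profiles = ["WHERE THE WEED AT", "DAYTON, OH", "SAN DIEGO",
            "Pittsburgh, PA", "wonderland", "Earth"]
for loc in profiles:
    print(f"{loc!r} -> {state_from_profile(loc)}")
```

Entries such as "wonderland" or "Earth" resolve to nothing, which is one reason only a share of tweets can be geo-located.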
Adjusted Measures of Tweeting Activity
• To compare regional trends, we can’t work with raw numbers.
• eDrugTrends started running a parallel data collection system
to obtain a general sample of tweets (denominator data).
• General sample data are collected using another API stream;
no keywords are used; data are processed to identify
geographic information.
• The “general sample” is then used to calculate the
state-tweet-volume-adjusted state proportion of tweets
o (or county-tweet-volume-adjusted county proportion of tweets)
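One plausible reading of the adjustment is a simple ratio: keyword tweets geo-located to a state divided by general-sample tweets geo-located to the same state. The slides do not give the exact formula, so the sketch below is an assumption, and the state names and counts are invented purely for illustration.

```python
def adjusted_proportions(keyword_counts, general_counts):
    """Keyword tweets per 100 general-sample tweets, per state.

    Assumed formula (not confirmed by the slides):
    100 * keyword tweets in state / general-sample tweets in state.
    """
    result = {}
    for state, n_keyword in keyword_counts.items():
        n_general = general_counts.get(state, 0)
        if n_general > 0:  # skip states with no denominator data
            result[state] = 100.0 * n_keyword / n_general
    return result


# Hypothetical counts, for illustration only.
keyword_counts = {"State A": 3000, "State B": 600}
general_counts = {"State A": 200000, "State B": 20000}

adjusted = adjusted_proportions(keyword_counts, general_counts)
for state, value in adjusted.items():
    print(f"{state}: raw={keyword_counts[state]}, adjusted={value:.1f}")
```

In this made-up example State A has five times the raw count of State B, yet State B has the higher adjusted value, which is the kind of reversal that makes raw numbers unsuitable for regional comparison.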
RAW Numbers and ADJUSTED State Proportions of
Cannabis-Related Tweets
(March-September, 2016)
Raw numbers
Adjusted proportions
Twitter Data Collection: Keywords
• Keywords/slang terms are used to collect relevant tweets:
o Cannabis: weed, marijuana, spliff, ganja, kush, sativa, indica, chronic, blunt,
hydro, skunk, reefer, joint, etc.
o Marijuana concentrates: dabs, shatter, budder, wax, BHO, butane honey oil,
hash oil, etc.
o Edibles: weed cookies, space cake, pot cookie, pot brownie, mj brownie,
medibles, etc.
o Synthetic cannabinoids: spice, K2, CHMINACA, AB-FUBINACA, synthetic weed,
smoking blend, noid, black mamba, etc.
• Inclusion of slang terms improves sensitivity (recall) in data
collection
Keyword challenges
• Issues with “precision”: risk of getting “noisy” or “irrelevant” data.
• Ways to improve precision of collected data:
o Ambiguous terms are combined with additional keywords indicating usage (e.g., smoke
blunt, smoke budder)
o “Black list” words are used to exclude irrelevant tweets (e.g., pumpkin spice latte, Emily
Blunt).
o Machine learning and other advanced information processing techniques are needed.
• On-going monitoring is needed:
o New types of products or slang terms emerge. For example, “rosin”—new type of marijuana
concentrate produced using solvent-less method.
o New uses/meanings of words may affect the accuracy of collected data. (e.g., “dabs”)
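The two precision rules above (pairing ambiguous terms with usage words, and excluding "black list" phrases) can be sketched as a simple filter. The word lists below are small illustrative subsets, and the split between unambiguous and ambiguous terms is an assumption made for this example, not the project's actual keyword configuration.

```python
import re

# Illustrative subsets (assumption), drawn from terms named on the slides.
UNAMBIGUOUS_TERMS = {"weed", "marijuana", "spliff", "ganja", "kush"}
AMBIGUOUS_TERMS = {"blunt", "budder", "spice"}
USAGE_WORDS = {"smoke", "smoking", "smoked"}
BLACKLIST_PHRASES = ("pumpkin spice latte", "emily blunt")


def is_candidate(tweet):
    """Return True if the tweet passes the keyword rules sketched above."""
    text = tweet.lower()
    # Rule 1: black-listed phrases exclude the tweet outright.
    if any(phrase in text for phrase in BLACKLIST_PHRASES):
        return False
    tokens = set(re.findall(r"[a-z0-9']+", text))
    # Rule 2: an unambiguous term is enough on its own.
    if tokens & UNAMBIGUOUS_TERMS:
        return True
    # Rule 3: an ambiguous term needs an accompanying usage word.
    return bool(tokens & AMBIGUOUS_TERMS) and bool(tokens & USAGE_WORDS)


examples = [
    "Kickin back wit my spliff",
    "about to smoke a blunt",
    "to be blunt, I disagree",
    "I need a pumpkin spice latte",
    "Emily Blunt was great in that movie",
]
for tweet in examples:
    print(f"{is_candidate(tweet)!s:5}  {tweet}")
```

Rules like these are cheap but brittle, which is why the slide also points to machine learning and ongoing monitoring of new slang.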
Data Processing: Automated Tweets Classification
• Using manually annotated training
data sets, machine learning
classifiers were developed to
automatically classify tweets
• Classification by the source/type
of communication (personal, media,
retail)
o Machine learning classifier (SVM)
achieved F score = 0.81.
• Classification by sentiment
(positive, negative, neutral)
o Sentiment classification is applied
to personal communications only
o Machine learning classifier (SVM)
achieved F score = 0.71.
Kickin back wit my spliff
Late night dabs
Medical marvel: the uses of cannabis
continue to grow
http://t.co/djtKPunxW9
$10 #Cannabis #Edibles 12 Varieties 1
Package 10MG #THC total
http://t.co/9w3xrFUnAe
Positive:
Marijuana works wonders on the soul
Strongest shatter I've ever smoked
Negative:
I’m not much of a fan when it comes to
edibles
hate when people think i smoke weed
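For readers less familiar with the F score reported for the classifiers above, it is the harmonic mean of precision and recall. The sketch below computes it for a single class from confusion counts; the counts are hypothetical, and the slides do not say how per-class scores were averaged for the three-way classifiers.

```python
def f_score(true_pos, false_pos, false_neg):
    """F1 score: harmonic mean of precision and recall for one class."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts for one class, for illustration only.
example_f = f_score(true_pos=80, false_pos=20, false_neg=10)
print(f"precision=0.80, recall={80 / 90:.2f}, F={example_f:.2f}")
```

Because the harmonic mean is pulled toward the lower of the two values, a classifier cannot reach a high F score by maximizing recall at the expense of precision, or the reverse.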
Exploring Twitter Data on
Marijuana Concentrates
Initial report about marijuana concentrate
related tweeting: “Time for dabs”
2014 data
• Data collected over a 2-month period at the end of 2014.
• 27,018 tweets with identifiable state-level geo-location
• Although over 10 keywords were used (shatter, concentrates, butane
hash oil, etc.), the keyword “dabs” produced over 99% of the total sample.
Dabs on Dabs on Dabs
Time for dabs
I just need a cute girl to take
dabs with me and get stoned
together
“Time for dabs”: Analyzing Twitter data on marijuana concentrates across the U.S.
Daniulaityte R., Nahhas R.W., Wijeratne S., Carlson R.G., Lamy F.R., Martins S.S., Boyer E.W., (...), Sheth A.
(2015) Drug and Alcohol Dependence, 155, pp. 307-311.
2015: Increases in Marijuana Concentrate-Related
Tweeting Activity? Oops! Not So Fast…
[Figure: Marijuana Concentrates US Tweeting Activity, Jun-Nov, 2015. Weekly counts of tweets and unique users; y-axis from 0 to 14,000, x-axis by week from Jun 8th to Nov 23rd.]
Issues with Collected Data
Drug vs. Dance
Cam Newton cheers on Kevin Hart in a bench press challenge…then Dabs
Tell me why my mom DABS so well? https://t.co/7LZjdqBkQr
Cam celebrates, Cam dabs, Cam does Cam thing
Development of Machine Learning Classifier to
Extract Relevant Tweets
• A machine learning (ML) classifier was developed using 1,000 manually
labeled tweets
• Excellent results:
o ML classifier (NB) achieved F Measure=0.9; Kappa Statistic=0.8
• The dabs ML classifier was plugged into the system.
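To make the drug-versus-dance problem concrete, here is a toy Naive Bayes (NB) text classifier in plain Python. It is only a sketch: the training set is seven tweets quoted on the slides rather than the 1,000 labeled tweets, the class `TinyNaiveBayes` and its bag-of-words features are assumptions for illustration, and it is not the classifier deployed in eDrugTrends.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter()            # label -> number of tweets
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.label_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocabulary.add(token)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            denominator = total_words + len(self.vocabulary)
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token] + 1
                score += math.log(count / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Training tweets quoted on the slides; far too few for real use.
train = [
    ("Late night dabs", "drug"),
    ("Time for dabs", "drug"),
    ("Dabs on Dabs on Dabs", "drug"),
    ("I just need a cute girl to take dabs with me and get stoned together", "drug"),
    ("Cam Newton cheers on Kevin Hart in a bench press challenge…then Dabs", "dance"),
    ("Tell me why my mom DABS so well?", "dance"),
    ("Cam celebrates, Cam dabs, Cam does Cam thing", "dance"),
]
model = TinyNaiveBayes().fit([t for t, _ in train], [y for _, y in train])

# Two made-up test tweets (hypothetical, not from the collected data).
print(model.predict("Cam dabs after the touchdown"))
print(model.predict("Late night dabs with me"))
```

The classifier relies entirely on the words surrounding "dabs", which is why labeled examples from both senses of the word are needed.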
Marijuana Concentrate Related Tweeting Over Time
[Maps: End of 2014 and Start of 2017]
• Similar geographic
patterns remained
• 96% were personal
communication tweets
(2017 data)
• Decrease in variability
across states:
Emerging Product: Rosin Tech
• Rosin technique is a solventless method to produce marijuana
concentrates
• Involves use of pressure and heat (e.g., hair straightener or rosin
tech press) to produce concentrates
• Occurrences of “rosin” mentions in the eDrugTrends stream (March 2015 to
September 2016), before the “rosin” keyword was added
Rosin dabs: Preliminary data
• Keyword “rosin” (excluding violin, brass, bow); Time period: December 6,
2016 to February 22, 2017; 3,471 tweets collected (with identifiable state-level
geo-location)
YOOOO JUST PRESSED FOR THE FIRST TIME AND IT WAS LIFE
CHANGING 🙏🙏🙏🙏🔥😩 flower rosin is the new fav
The future is bright for #Rosin. #Marijuana #Cannabis
Nice chunk of rosin to start this morning off
2017 goal....buy a house & rosin press.
Marijuana rosin, and increasingly common extract:
https://t.co/tXZNErOPta
Rosin Tech Hash Is perfect for the people in non medical marijuana
states where it's hard to come across quality BHO to dab.
Adjusted Proportions of Rosin-Related Tweets
(Preliminary data, Dec. 6, 2016-Feb. 22, 2017)
84% - personal communication tweets
8% - media related
8% - retail related
Great Variability:
Mean: 1.96; Variance: 2.5
Exploring Cannabis-Related Tweeting
Activity: Combining Content and
Geographic Analysis Features
Cannabis Data, March–May, 2016
• Between March and May of 2016, the eDrugTrends
platform collected 13,233,837 cannabis-related tweets.
• About 30% (N=3,948,402) of those tweets had
identifiable state-level geo-location information.
• These U.S.-based tweets were posted by 965,610
unique users.
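The "about 30%" figure follows directly from the two counts on this slide. The short check below uses only the numbers stated above; the tweets-per-user value is a derived average and was not reported on the slide.

```python
total_tweets = 13_233_837  # cannabis-related tweets, March-May 2016
geolocated = 3_948_402     # tweets with identifiable state-level location
unique_users = 965_610     # users who posted the geo-located tweets

geo_share = geolocated / total_tweets
tweets_per_user = geolocated / unique_users

print(f"geo-located share: {geo_share:.1%}")
print(f"average geo-located tweets per user: {tweets_per_user:.2f}")
```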
Content Classification and Analysis
• Tweet content was automatically classified by:
A. source (personal communication, media, retail)
B. sentiment (positive, negative, neutral).
• States were grouped by cannabis legalization policies into
“recreational,” “medical, less restrictive,” “medical, more
restrictive,” and “illegal.”
• Permutation tests were performed to analyze differences
among four groups in:
A. Adjusted state proportions of all tweets,
B. personal communications only,
C. positive to negative sentiment ratios.
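A permutation test of this kind can be sketched in a few lines. The slides do not state which test statistic was used, so the between-group spread of means below is an assumption, and the per-state values are invented solely to show the mechanics.

```python
import random


def between_group_spread(groups):
    """Sum over groups of size * (group mean - grand mean) squared."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    return sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)


def permutation_test(groups, n_permutations=5000, seed=0):
    """Estimate a p-value by repeatedly shuffling group membership."""
    rng = random.Random(seed)
    observed = between_group_spread(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        # Small tolerance guards against floating-point ties.
        if between_group_spread(shuffled) >= observed - 1e-12:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical adjusted state proportions for the four policy groups.
groups = [
    [3.1, 3.3, 2.9, 3.2],  # recreational
    [2.4, 2.6, 2.5, 2.3],  # medical, less restrictive
    [1.9, 2.1, 2.0, 1.8],  # medical, more restrictive
    [1.4, 1.2, 1.5, 1.3],  # illegal
]
p_value = permutation_test(groups)
print(f"estimated p-value: {p_value:.4f}")
```

Permutation tests make no distributional assumptions, which suits a small number of states per policy group.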
Classification of States by Legal Status
Adjusted state proportions of cannabis-related tweets
[Map: adjusted tweet rate per state, with legend bins >3.0%, 2.5%-3.0%, 2.0%-2.49%, 1.5%-1.9%, and 1.0%-1.49%; overlays mark states where medical marijuana is legal and where recreational marijuana is legal.]
Tweet Content Classification Results
Source/Type of communication
• 76.2% were personal communications,
• 21.1% media
• 2.7% retail-related
Sentiment
• About 71% of personal communication tweets expressed
positive sentiment towards cannabis,
• 16% negative sentiment,
• 13% were neutral.
Results of Permutation Test
Mapping Positive to Negative Sentiment Tweet Ratios
Conclusion
• Social media data present exciting new opportunities
for timely, sensitive and flexible approaches to
epidemiological surveillance of drug use practices
and trends.
• Continued research is needed to establish
methodological standards and practices to reduce
the “noise” and increase reliability and validity of
social media data.
• Social media monitoring can be of particular value
for tracking cannabis-related trends in the context of
rapid policy changes.
Keep up with our research/publications:
@ project page:
http://wiki.knoesis.org/index.php/EDrugTrends
or Google: eDrugTrends
or Twitter: @eDrugTrends
Thank you!
Center for Interventions,
Treatment, and Addiction
Research (CITAR)
https://medicine.wright.edu/citar
Ohio Center of Excellence in
Knowledge-enabled
Computing (Kno.e.sis)
http://knoesis.org
Sponsored by:
Grant No. 5R01DA039454-02
Trending: Social media analysis to monitor cannabis and synthetic cannabinoid use.
Any opinions, findings, conclusions or recommendations expressed in this material are those of the investigator(s)
and do not necessarily reflect the views of the National Institutes of Health.
System Architecture
eDrugTrends is an extension of the Twitris™ system developed at Kno.e.sis: http://twitris.knoesis.org
Social media analysis to monitor trends in cannabis and synthetic cannabinoid use

  • 1. Social Media Analysis to Monitor Cannabis Trends Presenter: Raminta Daniulaityte, Ph.D. CITAR & Kno.e.sis, Wright State University Boonshoft School of Medicine T32 Substance Abuse Seminar (Public Health Seminar at Columbia University) February 23, 2017 © Wright State University Center for Interventions, Treatment, and Addiction Research (CITAR) Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis)
  • 2. Research Team NIH/NIDA R01 DA039454 Trending: Social media analysis to monitor cannabis and synthetic cannabinoid use. Principal Investigators: Raminta Daniulaityte, Ph.D., Center for Interventions, Treatment, and Addiction Research (CITAR), Wright State University Boonshoft School of Medicine; Amit Sheth, Ph.D., Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Wright State University. Co-Investigators: Robert Carlson, Ph.D. (CITAR); Silvia Martins, M.D., Ph.D. (Columbia U); Ramzi Nahhas, Ph.D. (Comm. Health, WSU); Edward Boyer, M.D., Ph.D. (U Mass); Krishnaprasad Thirunarayan, Ph.D. (Kno.e.sis). Research Staff: Francois R. Lamy, PhD (CITAR, Postdoc); G. Alan Smith (Kno.e.sis, Software Engineer); Sanjaya Wijeratne (Kno.e.sis, Ph.D. student); Farahnaz Golroo (Kno.e.sis, Ph.D. student). No Conflicts of Interest to declare
  • 3. Project Aims • Aim 1: Develop a comprehensive software platform, eDrugTrends, for semi-automated processing and visualization of spatio-temporal, and social network dimensions of social media data (Twitter and Web forums) on cannabis and synthetic cannabinoid use. • Aim 2: Deploy eDrugTrends to identify and compare trends in knowledge, attitudes, and behaviors related to cannabis and synthetic cannabinoid use across U.S. regions with different cannabis legalization policies using Twitter and Web forum data. • Types of data sources: o Twitter (brief content, but over 500 million tweets/day, geo-info) o Web forums such as Bluelight, drugs-forum, Reddit (detailed discussions of drug use practices) o Web survey on Bluelight
  • 4. Presentation Objectives • Overview of the technical capabilities of eDrugTrends platform to process Twitter data • How data is collected • Geo-location identification • Keyword selection and monitoring • Tweet content processing • Exploration of recently collected and processed data on marijuana concentrates • Integrating geographic and content analysis features to explore cannabis-related tweeting activity
  • 5. Twitter Data Collection • Tweets are collected using Twitter’s streaming Application Programming Interface (API), which provides free access to up to 1% of all tweets. • Only publicly available tweets are collected. • The system automatically filters out non-English tweets. • The current system started data collection in March 2015; close to 90 million tweets have been collected. [Figure: eDrugTrends dashboard showing incoming tweets and trending topics]
  • 6. What does “up to 1%” mean? • Free access to 1% of all tweets o It can be thought of as a “bucket” that can fit up to 1% of all tweets. o Assuming 400 million tweets are generated per day, 1% would be about 4 million tweets per day. o Still, it is possible to miss some tweets due to sudden volume spikes. • With a reasonably limited number of keywords, all or most relevant tweets can be collected. • Our system collects an average of about 150,000 tweets per day, which is below the allowable limit.
  • 7. Extraction of Geo-Location Information • Tweets may contain GPS coordinates (via a mobile phone that supports the feature). • Users may indicate their geo information in their user profiles, e.g., “WHERE THE WEED AT”, “DAYTON, OH”, “SAN DIEGO”, “Pittsburgh, PA”, “wonderland”, “Earth”. • eDrugTrends geo-locates close to 30% of tweets for state-level and county-level information. • Some earlier studies reported 1-3% of tweets with geo-location identification.
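
The profile-location examples on this slide can be turned into a rough illustration of state-level matching. This is a minimal sketch only: the slides do not describe how eDrugTrends resolves locations, and the lookup tables and the function `state_from_profile_location` are hypothetical names introduced here.

```python
import re

# Hypothetical, deliberately tiny lookup tables; a real gazetteer would be far larger.
STATE_ABBREVIATIONS = {"OH": "Ohio", "PA": "Pennsylvania", "CA": "California"}
CITY_TO_STATE = {"san diego": "CA", "dayton": "OH", "pittsburgh": "PA"}


def state_from_profile_location(location):
    """Return a two-letter state code for a free-text profile location, or None."""
    # Case 1: the text ends with ", XX" where XX is a known state abbreviation.
    match = re.search(r",\s*([A-Za-z]{2})\s*$", location)
    if match and match.group(1).upper() in STATE_ABBREVIATIONS:
        return match.group(1).upper()
    # Case 2: the whole text is a known city name.
    return CITY_TO_STATE.get(location.strip().lower())


for text in ["WHERE THE WEED AT", "DAYTON, OH", "SAN DIEGO",
             "Pittsburgh, PA", "wonderland", "Earth"]:
    print(repr(text), "->", state_from_profile_location(text))
```

Locations such as “wonderland” or “Earth” resolve to nothing, which is one reason only a fraction of tweets can be geo-located.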
  • 8. Adjusted Measures of Tweeting Activity • Raw tweet counts cannot be used to compare regional trends, because overall tweet volume differs across regions. • eDrugTrends started running a parallel data collection system to obtain a general sample of tweets (denominator data). • General sample data are collected using another API stream; no keywords are used; data are processed to identify geographic information. • The “general sample” is then used to calculate the state-tweet-volume-adjusted state proportion of tweets o (or the county-tweet-volume-adjusted county proportion of tweets)
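
One way to read the adjustment is sketched below. The formula is an assumption, not taken from the slides: each state's keyword tweet count is divided by its general-sample count, and the resulting rates are rescaled to sum to 100 across states. That reading is at least consistent with the mean of 1.96 reported on slide 21, which is roughly 100 divided by 51. The function name and the toy counts are hypothetical.

```python
from collections import Counter


def adjusted_state_percentages(keyword_states, general_states):
    """Rescale per-state keyword/general tweet ratios so they sum to 100.

    keyword_states: state codes of geo-located keyword (e.g. cannabis) tweets.
    general_states: state codes of geo-located general-sample tweets.
    """
    keyword_counts = Counter(keyword_states)
    general_counts = Counter(general_states)
    # Rate of keyword tweets per general-sample tweet in each state.
    rates = {
        state: keyword_counts.get(state, 0) / total
        for state, total in general_counts.items()
    }
    rate_sum = sum(rates.values())
    if rate_sum == 0:
        return {state: 0.0 for state in rates}
    return {state: 100.0 * rate / rate_sum for state, rate in rates.items()}


# Toy data (made up): CA has the most raw keyword tweets, CO the highest rate.
keyword_states = ["OH"] * 30 + ["CA"] * 120 + ["CO"] * 50
general_states = ["OH"] * 1000 + ["CA"] * 6000 + ["CO"] * 500
adjusted = adjusted_state_percentages(keyword_states, general_states)
print(adjusted)
```

In the toy data California leads on raw counts but ranks last after adjustment, which is the kind of reversal the raw-versus-adjusted comparison is meant to expose.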
  • 9. Raw Numbers and Adjusted State Proportions of Cannabis-Related Tweets (March-September, 2016) [Figure: two map panels, “Raw numbers” and “Adjusted proportions”]
  • 10. Twitter Data Collection: Keywords • Keywords/slang terms are used to collect relevant tweets: o Cannabis: weed, marijuana, spliff, ganja, kush, sativa, indica, chronic, blunt, hydro, skunk, reefer, joint, etc. o Marijuana concentrates: dabs, shatter, budder, wax, BHO, butane honey oil, hash oil, etc. o Edibles: weed cookies, space cake, pot cookie, pot brownie, mj brownie, medibles, etc. o Synthetic cannabinoids: spice, K2, CHMINACA, AB-FUBINACA, synthetic weed, smoking blend, noid, black mamba, etc. • Inclusion of slang terms improves sensitivity (recall) in data collection.
  • 11. Keyword challenges • Issues with “precision”: risk of getting “noisy” or “irrelevant” data. • Ways to improve precision of collected data: o Ambiguous terms are combined with additional keywords indicating usage (e.g., smoke blunt, smoke budder). o “Black list” words are used to exclude irrelevant tweets (e.g., pumpkin spice latte, Emily Blunt). o Machine learning and other advanced information processing techniques are needed. • On-going monitoring is needed: o New types of products or slang terms emerge. For example, “rosin” is a new type of marijuana concentrate produced using a solventless method. o New uses/meanings of words may affect the accuracy of collected data (e.g., “dabs”).
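
The two rule-based precision measures above (usage context for ambiguous terms, and a black list) can be sketched as a simple filter. The term lists are small illustrative subsets chosen here, and `is_relevant` is a hypothetical helper, not the eDrugTrends implementation.

```python
import re

# Illustrative subsets only; the project's real keyword lists are much longer.
KEYWORDS = {"weed", "marijuana", "ganja", "kush", "spliff"}   # treated as unambiguous
AMBIGUOUS = {"blunt", "budder", "spice"}                      # need usage context
USAGE_TERMS = {"smoke", "smoking", "smoked"}
BLACKLIST_PHRASES = ("pumpkin spice", "emily blunt")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def is_relevant(text):
    """Return True if a tweet passes the keyword, context, and black-list rules."""
    lowered = text.lower()
    # Black-list phrases exclude the tweet outright.
    if any(phrase in lowered for phrase in BLACKLIST_PHRASES):
        return False
    tokens = set(tokenize(text))
    # Unambiguous keywords are enough on their own.
    if tokens & KEYWORDS:
        return True
    # Ambiguous terms count only when a usage term also appears.
    return bool(tokens & AMBIGUOUS) and bool(tokens & USAGE_TERMS)


print(is_relevant("Time to smoke a blunt"))
print(is_relevant("Emily Blunt was great in that movie"))
```

Checking the black list first trades some recall for precision: a relevant tweet that also mentions a black-listed phrase is dropped.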
  • 12. Data Processing: Automated Tweet Classification • Using manually annotated training data sets, machine learning classifiers were developed to automatically classify tweets. • Classification by the source/type of communication (personal, media, retail) o Machine learning classifier (SVM) achieved F score = 0.81. • Classification by sentiment (positive, negative, neutral) o Sentiment classification is applied to personal communications only. o Machine learning classifier (SVM) achieved F score = 0.71. Example tweets: “Kickin back wit my spliff”; “Late night dabs”; “Medical marvel: the uses of cannabis continue to grow http://t.co/djtKPunxW9”; “$10 #Cannabis #Edibles 12 Varieties 1 Package 10MG #THC total http://t.co/9w3xrFUnAe”. Positive: “Marijuana works wonders on the soul”; “Strongest shatter I've ever smoked”. Negative: “I’m not much of a fan when it comes to edibles”; “hate when people think i smoke weed”.
  • 13. Exploring Twitter Data on Marijuana Concentrates
  • 14. Initial report about marijuana concentrate related tweeting: “Time for dabs” 2014 data • Data collected over a 2-month period at the end of 2014. • 27,018 tweets with identifiable state-level geo-location. • Although over 10 keywords were used (shatter, concentrates, butane hash oil, etc.), the keyword “dabs” produced over 99% of the total sample. Example tweets: “Dabs on Dabs on Dabs”; “Time for dabs”; “I just need a cute girl to take dabs with me and get stoned together”. Reference: Daniulaityte R., Nahhas R.W., Wijeratne S., Carlson R.G., Lamy F.R., Martins S.S., Boyer E.W., (...), Sheth A. (2015). “Time for dabs”: Analyzing Twitter data on marijuana concentrates across the U.S. Drug and Alcohol Dependence, 155, pp. 307-311.
  • 15. 2015: Increases in Marijuana Concentrate-Related Tweeting Activity? Oops! Not So Fast… [Chart: Marijuana Concentrates US Tweeting Activity, Jun-Nov 2015; weekly series for tweets and unique users from Jun 8th to Nov 23rd; y-axis from 0 to 14,000]
  • 16. Issues with Collected Data: Drug vs. Dance. Example tweets: “Cam Newton cheers on Kevin Hart in a bench press challenge…then Dabs”; “Tell me why my mom DABS so well? https://t.co/7LZjdqBkQr”; “Cam celebrates, Cam dabs, Cam does Cam thing”
  • 17. Development of Machine Learning Classifier to Extract Relevant Tweets • A machine learning (ML) classifier was developed using 1,000 manually labeled tweets. • Excellent results: the ML classifier (NB) achieved F measure = 0.9; Kappa statistic = 0.8. • The dabs ML classifier was plugged into the system.
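
For readers unfamiliar with the two reported metrics, the sketch below computes an F measure and Cohen's kappa from binary confusion-matrix counts. The counts are hypothetical, picked only because they happen to give values near those on the slide; the study's actual confusion matrix and averaging scheme are not given in the slides.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cohens_kappa(tp, fp, fn, tn):
    """Agreement between predicted and true labels, corrected for chance."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Chance agreement from the marginal totals of predictions and true labels.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    if expected == 1:
        return 0.0
    return (observed - expected) / (1 - expected)


# Hypothetical counts for a drug-vs-dance "dabs" classifier on 100 test tweets.
tp, fp, fn, tn = 45, 5, 5, 45
print(round(f_measure(tp, fp, fn), 3), round(cohens_kappa(tp, fp, fn, tn), 3))
```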
  • 18. Marijuana Concentrate Related Tweeting Over Time [Figure: two map panels, “End of 2014” and “Start of 2017”] • Similar geographic patterns remained • 96% were personal communication tweets (2017 data) • Decrease in variability across states
  • 19. Emerging Product: Rosin Tech • The rosin technique is a solventless method to produce marijuana concentrates. • It involves the use of pressure and heat (e.g., a hair straightener or rosin tech press) to produce concentrates. • [Chart: occurrences of “rosin” mentions in the eDrugTrends stream (March 2015 to September 2016), before the “rosin” keyword was added]
  • 20. Rosin dabs: Preliminary data • Keyword “Rosin” (exclude violin, brass, bow); time period: December 6, 2016 to February 22, 2017; 3,471 tweets collected (with identifiable state-level geo-location). Example tweets: “YOOOO JUST PRESSED FOR THE FIRST TIME AND IT WAS LIFE CHANGING 🙏🙏🙏🙏🔥😩”; “flower rosin is the new fav”; “The future is bright for #Rosin. #Marijuana #Cannabis”; “Nice chunk of rosin to start this morning off”; “2017 goal....buy a house & rosin press.”; “Marijuana rosin, and increasingly common extract: https://t.co/tXZNErOPta”; “Rosin Tech Hash Is perfect for the people in non medical marijuana states where it's hard to come across quality BHO to dab.”
  • 21. Adjusted Proportions of Rosin-Related Tweets (Preliminary data, Dec. 6, 2016-Feb. 22, 2017) • 84% personal communication tweets • 8% media related • 8% retail related • Great variability: mean 1.96; variance 2.5
  • 22. Exploring Cannabis-Related Tweeting Activity: Combining Content and Geographic Analysis Features
  • 23. Cannabis Data, March–May, 2016 • Between March and May of 2016, the eDrugTrends platform collected 13,233,837 cannabis-related tweets. • About 30% (N=3,948,402) of those tweets had identifiable state-level geo-location information. • These U.S.-based tweets were posted by 965,610 unique users.
  • 24. Content Classification and Analysis • Tweet content was automatically classified by: A. source (personal communication, media, retail); B. sentiment (positive, negative, neutral). • States were grouped by cannabis legalization policies into “recreational,” “medical, less restrictive,” “medical, more restrictive,” and “illegal.” • Permutation tests were performed to analyze differences among the four groups in: A. adjusted state proportions of all tweets; B. adjusted state proportions of personal communications only; C. positive to negative sentiment ratios.
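
A permutation test across the four policy groups could look like the sketch below. The slides do not say which test statistic was used; the between-group sum of squares is an assumption made here, and the per-state values are made up for illustration.

```python
import random
from statistics import mean


def between_group_ss(groups):
    """Between-group sum of squares: how far group means sit from the grand mean."""
    grand = mean(value for group in groups for value in group)
    return sum(len(group) * (mean(group) - grand) ** 2 for group in groups)


def permutation_test(groups, n_permutations=5000, seed=0):
    """Estimate a p-value by reshuffling state values across the policy groups."""
    rng = random.Random(seed)
    sizes = [len(group) for group in groups]
    pooled = [value for group in groups for value in group]
    observed = between_group_ss(groups)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        # Count permutations at least as extreme as the observed grouping.
        if between_group_ss(shuffled) >= observed - 1e-12:
            hits += 1
    # The +1 terms count the observed arrangement itself.
    return (hits + 1) / (n_permutations + 1)


# Made-up adjusted proportions for states in four policy groups.
recreational = [3.1, 3.4, 2.9, 3.2]
medical_less = [2.4, 2.2, 2.6, 2.5]
medical_more = [1.9, 2.0, 1.8, 2.1]
illegal = [1.4, 1.5, 1.2, 1.6]
p_value = permutation_test([recreational, medical_less, medical_more, illegal])
print(p_value)
```

The same function could be applied to each of the three outcomes listed on the slide by swapping in the corresponding per-state values.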
  • 25. Classification of States by Legal Status
  • 26. Adjusted state proportions of cannabis related tweets [Map legend: adjusted tweet rate per state in bins >3.0%, 2.5%-3.0%, 2.0%-2.49%, 1.5%-1.9%, 1.0%-1.49%; overlays for “Medical Marijuana Legal” and “Recreational Marijuana Legal”]
  • 27. Tweet Content Classification Results. Source/Type of communication: • 76.2% were personal communications • 21.1% media • 2.7% retail-related. Sentiment: • About 71% of personal communication tweets expressed positive sentiment towards cannabis • 16% expressed negative sentiment • 13% were neutral.
  • 29. Mapping Positive to Negative Sentiment Tweet Ratios
  • 30. Conclusion • Social media data present exciting new opportunities for timely, sensitive and flexible approaches to epidemiological surveillance of drug use practices and trends. • Continued research is needed to establish methodological standards and practices to reduce the “noise” and increase reliability and validity of social media data. • Social media monitoring can be of particular value for tracking cannabis-related trends in the context of rapid policy changes.
  • 31. Keep up with our research/publications: @ project page: http://wiki.knoesis.org/index.php/EDrugTrends or Google: eDrugTrends or Twitter: @eDrugTrends Thank you! Center for Interventions, Treatment, and Addiction Research (CITAR) https://medicine.wright.edu/citar Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) http://knoesis.org Sponsored by: Grant No. 5R01DA039454-02 Trending: Social media analysis to monitor cannabis and synthetic cannabinoid use. Any opinions, findings, conclusions or recommendations expressed in this material are those of the investigator(s) and do not necessarily reflect the views of the National Institutes of Health.
  • 32. System architecture: eDrugTrends is an extension of the Twitris™ system developed at Kno.e.sis: http://twitris.knoesis.org © Wright State University