Random forests with random projections of the 
output space for high dimensional multi-label 
classification 
Arnaud Joly, Pierre Geurts, Louis Wehenkel
Multi-label classification tasks 
Many supervised learning applications in text, biology or image
processing involve samples that are associated with sets of labels.
Input X: an image of 800 × 600 pixels. Output Y: a set of labels such as
driver, mountain, road, car, tree, rock, line, human, ...
If each label corresponds to a Wikipedia article, then we have
around 4 million labels.
Random forest 
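Randomized trees are built on a bootstrap copy of the input-output
pairs $((x_i, y_i) \in (\mathcal{X} \times \mathcal{Y}))_{i=1}^n$ by recursively maximizing the
reduction of impurity, here the variance Var. At each node, the
best split is selected among k randomly selected features.
Example: a node $S$ is split on feature $X_k$ at threshold $t_k$ into $S_L$
(12 of its 20 samples) and $S_R$ (the other 8), with
$\mathrm{Var}(S) = 0.24$, $\mathrm{Var}(S_L) = 0.014$, $\mathrm{Var}(S_R) = 0.1875$:
$$\Delta\mathrm{Var}(S) = \mathrm{Var}(S) - \frac{12}{20}\,\mathrm{Var}(S_L) - \frac{8}{20}\,\mathrm{Var}(S_R) \approx 0.16$$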
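The split score on this slide can be checked numerically. The following is a minimal sketch; the function name `variance_reduction` and its argument layout are illustrative choices, not taken from the authors' code.

```python
def variance_reduction(var_parent, n_parent, children):
    """Impurity reduction of a split.

    children is a list of (n_child, var_child) pairs; each child variance
    is weighted by the fraction of the parent's samples it receives.
    """
    weighted = sum(n_child / n_parent * var_child for n_child, var_child in children)
    return var_parent - weighted


# Values from the slide: 20 samples split into 12 (left) and 8 (right).
gain = variance_reduction(0.24, 20, [(12, 0.014), (8, 0.1875)])
print(round(gain, 4))  # 0.1566, which the slide rounds to 0.16
```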
The multi-output single tree algorithm requires the computation of
the sum of the variances over the label space at each tree node and
for each candidate split. When Y is very high dimensional, this
computation constitutes the main bottleneck of the random tree ensemble.
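To make the cost concrete, here is a small sketch of the multi-output impurity as a sum of per-label variances. The helper name `label_variance_sum` is hypothetical; the point is only that one evaluation touches all n × d label entries, and that this is repeated for every node and candidate split.

```python
def label_variance_sum(Y):
    """Sum over the d labels of the empirical variance of each label.

    Y is a list of n label vectors of length d. The two nested loops
    visit every one of the n * d entries, so the cost grows linearly in d.
    """
    n = len(Y)
    d = len(Y[0])
    total = 0.0
    for j in range(d):
        mean_j = sum(y[j] for y in Y) / n
        total += sum((y[j] - mean_j) ** 2 for y in Y) / n
    return total


# Toy label matrix: 4 samples, 3 labels (the third label is constant).
Y_toy = [[1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 1, 1]]
impurity = label_variance_sum(Y_toy)
print(impurity)
```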
Multi-output regression trees in randomly projected output 
space 
We propose to approximate the computation of the variance by 
using random projection of the output space. 
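Theorem
Given $\epsilon > 0$, a sample $(y_i)_{i=1}^n$ of n points $y \in \mathbb{R}^d$, and a projection
matrix $\Phi \in \mathbb{R}^{m \times d}$ such that for all pairs of points the
Johnson-Lindenstrauss lemma holds, we have also
$$(1 - \epsilon)\,\mathrm{Var}\big((y_i)_{i=1}^n\big) \le \mathrm{Var}\big((\Phi y_i)_{i=1}^n\big) \le (1 + \epsilon)\,\mathrm{Var}\big((y_i)_{i=1}^n\big).$$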
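The theorem follows from the fact that the summed variance can be written in terms of pairwise squared distances, which are the quantities the Johnson-Lindenstrauss lemma preserves. The sketch below checks both points numerically on synthetic data. The sizes, the seed, and the choice of a Gaussian matrix with N(0, 1/m) entries are assumptions made for illustration.

```python
import random


def total_variance(points):
    """Sum over dimensions of the empirical variance."""
    n = len(points)
    total = 0.0
    for column in zip(*points):
        mean = sum(column) / n
        total += sum((v - mean) ** 2 for v in column) / n
    return total


def pairwise_variance(points):
    """The same quantity as (1 / (2 n^2)) * sum over all pairs of ||p - q||^2."""
    n = len(points)
    acc = 0.0
    for p in points:
        for q in points:
            acc += sum((a - b) ** 2 for a, b in zip(p, q))
    return acc / (2 * n * n)


def gaussian_projection(m, d, rng):
    """m x d matrix with i.i.d. N(0, 1/m) entries."""
    return [[rng.gauss(0.0, 1.0) / m ** 0.5 for _ in range(d)] for _ in range(m)]


def project(points, phi):
    """Apply phi to every point."""
    return [[sum(w * v for w, v in zip(row, y)) for row in phi] for y in points]


rng = random.Random(0)
n, d, m = 30, 100, 50  # illustrative sizes
Y = [[1.0 if rng.random() < 0.1 else 0.0 for _ in range(d)] for _ in range(n)]
phi = gaussian_projection(m, d, rng)

var_original = total_variance(Y)
var_projected = total_variance(project(Y, phi))
ratio = var_projected / var_original
print(ratio)  # expected to be close to 1
```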
Multi-output regression trees in randomly projected output 
space 
1. Randomly project the output space: $(y_i)_{i=1}^n \mapsto (\Phi y_i)_{i=1}^n$
2. Grow the tree on the projected output space $(x_i, \Phi y_i)_{i=1}^n$
3. Label leaves using the original outputs $(y_i)_{i=1}^n$
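The three steps can be sketched on the smallest possible tree, a single split (a stump). This is an illustration under stated assumptions, not the authors' implementation: the function `fit_projected_stump`, the exhaustive threshold search, and the toy data are all hypothetical, and a real tree would apply step 2 recursively.

```python
import random


def total_variance(outputs):
    """Sum over output dimensions of the empirical variance (0 for an empty set)."""
    n = len(outputs)
    if n == 0:
        return 0.0
    total = 0.0
    for column in zip(*outputs):
        mean = sum(column) / n
        total += sum((v - mean) ** 2 for v in column) / n
    return total


def fit_projected_stump(X, Y, m, rng):
    d = len(Y[0])
    # Step 1: randomly project the outputs, z_i = Phi y_i, with a Gaussian Phi (m x d).
    phi = [[rng.gauss(0.0, 1.0) / m ** 0.5 for _ in range(d)] for _ in range(m)]
    Z = [[sum(w * v for w, v in zip(row, y)) for row in phi] for y in Y]

    # Step 2: pick the split that maximizes the variance reduction measured on Z.
    n = len(X)
    parent = total_variance(Z)
    best_gain, best_k, best_t = 0.0, None, None
    for k in range(len(X[0])):
        for t in sorted({x[k] for x in X})[:-1]:
            left = [z for x, z in zip(X, Z) if x[k] <= t]
            right = [z for x, z in zip(X, Z) if x[k] > t]
            gain = (parent
                    - len(left) / n * total_variance(left)
                    - len(right) / n * total_variance(right))
            if gain > best_gain:
                best_gain, best_k, best_t = gain, k, t
    if best_k is None:
        raise ValueError("no split reduces the projected variance")

    # Step 3: label the leaves with the mean of the ORIGINAL outputs.
    def mean_vector(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    left_y = [y for x, y in zip(X, Y) if x[best_k] <= best_t]
    right_y = [y for x, y in zip(X, Y) if x[best_k] > best_t]
    return best_k, best_t, mean_vector(left_y), mean_vector(right_y)


# Toy data: feature 0 determines the label set, feature 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.7, 2.0], [0.8, 6.0], [0.9, 3.0]]
Y = [[1, 0, 1, 0]] * 3 + [[0, 1, 0, 1]] * 3
k, t, left_leaf, right_leaf = fit_projected_stump(X, Y, m=2, rng=random.Random(0))
print(k, t, left_leaf, right_leaf)
```

Even with only m = 2 projected dimensions, the split is chosen on the projected outputs while the leaf predictions remain full d-dimensional label frequencies.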
Ensemble of randomized trees 
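Shared subspace: every tree is grown on $(x_i, \Phi y_i)_{i=1}^n$, using one common projection $\Phi$.
Individual subspace: each tree is grown on its own projection, e.g. $(x_i, \Phi_1 y_i)_{i=1}^n$, $(x_i, \Phi_2 y_i)_{i=1}^n$, ...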
Bias-variance analysis 
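Averaging over the learning set LS, algorithm randomization $\epsilon^t$ and
output subspace randomization $\Phi$, the square error Err of t multi
output tree models can be decomposed into:
Single shared subspace (Algo 1)
$$E_{LS,\Phi,\epsilon^t}\{\mathrm{Err}(f_1(x; LS, \Phi, \epsilon^t))\} = \sigma_R^2(x) + B^2(x) + V_{LS}(x) + \frac{V_{Algo}(x)}{t} + V_{Proj}(x).$$
Individual subspace (Algo 2)
$$E_{LS,\Phi^t,\epsilon^t}\{\mathrm{Err}(f_2(x; LS, \Phi^t, \epsilon^t))\} = \underbrace{\sigma_R^2(x)}_{\text{residual error}} + \underbrace{B^2(x)}_{\text{bias}} + \underbrace{V_{LS}(x) + \frac{V_{Algo}(x) + V_{Proj}(x)}{t}}_{\text{variance}}.$$
Individual subspace should always be preferred to single shared
subspace: the two decompositions differ only in that $V_{Proj}(x)$ is
divided by t in Algo 2.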
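A quick numeric illustration of the two variance terms follows. The component values are hypothetical and chosen only to show the effect of the ensemble size t; they are not measurements from the slides.

```python
def variance_shared(v_ls, v_algo, v_proj, t):
    """Variance term for a single shared subspace (Algo 1)."""
    return v_ls + v_algo / t + v_proj


def variance_individual(v_ls, v_algo, v_proj, t):
    """Variance term for individual subspaces (Algo 2)."""
    return v_ls + (v_algo + v_proj) / t


# Hypothetical variance components, for illustration only.
v_ls, v_algo, v_proj = 0.25, 0.5, 0.125
for t in (1, 10, 100):
    shared = variance_shared(v_ls, v_algo, v_proj, t)
    individual = variance_individual(v_ls, v_algo, v_proj, t)
    print(t, shared, individual)
```

The two terms coincide for t = 1, and the gap grows to almost the full projection variance as t increases.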
Label ranking average precision to assess performance 
$$\mathrm{LRAP}(\hat{f}) = \frac{1}{|TS|} \sum_{i \in TS} \frac{1}{|y_i|} \sum_{j \in \{k : y_{ik} = 1\}} \frac{|\mathcal{L}^i_j(y_i)|}{|\mathcal{L}^i_j(1_d)|},$$
$$\mathcal{L}^i_j(q) = \{k : q_k = 1 \text{ and } \hat{f}(x_i)_k \ge \hat{f}(x_i)_j\},$$
where $\hat{f}(x_i)_j$ is the probability (or the score) associated with the label
j by the learnt model $\hat{f}$ applied to $x_i$, and $1_d$ is a d-dimensional row
vector of ones.
The score is higher when true labels receive a higher probability (score)
than false labels.
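The formula translates directly into code. The sketch below is a plain transcription of the definition on this slide, assuming every sample has at least one true label (otherwise the factor 1/|y_i| is undefined). The function name `lrap` and the toy inputs are illustrative.

```python
def lrap(Y_true, Y_score):
    """Label ranking average precision, following the slide's formula."""
    total = 0.0
    for y, f in zip(Y_true, Y_score):
        true_labels = [j for j, v in enumerate(y) if v == 1]
        if not true_labels:
            raise ValueError("each sample needs at least one true label")
        sample_score = 0.0
        for j in true_labels:
            # |L_j(y)|: true labels scored at least as high as label j.
            true_above = sum(1 for k in true_labels if f[k] >= f[j])
            # |L_j(1_d)|: all labels scored at least as high as label j.
            all_above = sum(1 for k in range(len(f)) if f[k] >= f[j])
            sample_score += true_above / all_above
        total += sample_score / len(true_labels)
    return total / len(Y_true)


Y_true = [[1, 0, 0], [0, 0, 1]]
Y_score = [[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]]
score = lrap(Y_true, Y_score)
print(score)  # (1/2 + 1/3) / 2, i.e. about 0.4167
```

In the first sample the true label is ranked second (ratio 1/2). In the second it is ranked last of three (ratio 1/3).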
Decision tree performance converges with m = 200 
Gaussian random output projections 
[Figure: LRAP (y-axis, 0.180 to 0.205) versus the number of projections m (x-axis, 0 to 1000). Curves: Decision tree; Decision tree on Gaussian subspace (Algo 2). Delicious dataset (983 labels).]
Faster convergence with ensemble of randomized trees 
[Figure: LRAP (y-axis, 0.366 to 0.378) versus the number of projections m (x-axis, 0 to 1000). Curves: Random forest; Random forest on Gaussian subspace (Algo 2); Random forest on fix-Gaussian subspace (Algo 1). Delicious dataset (983 labels, $k = \sqrt{p}$, $t = 100$, $n_{\min} = 1$).]
Randomly projecting the output space reduces computing time
from 3458 seconds (no projection) to 311 seconds (m = 25,
individual subspace), roughly an 11-fold speedup, without accuracy
degradation.
Systematic analysis on 24 datasets 
Increasing m leads to convergence in LRAP 
[Figure: for each dataset, the ratio LRAP(random forest on Gaussian output subspace) / LRAP(random forest), on an axis from 0.8 to 1.2, for m = 1, m = log(d) and m = d. Settings: $k = \sqrt{p}$, $t = 100$, $n_{\min} = 1$, averaged over 10 repetitions.]
Datasets: EUR-Lex (subj.), bibtex, tmc2017, reuters, emotions, Expression-GO, WIPO, yeast, medical, Yeast-GO, mediamill, enron, SCOP-GO, EUR-Lex (dir.), genbase, scene, drug-interaction, EUR-Lex (desc.), protein-interaction, diatoms, corel5k, CAL500, delicious, bookmarks.
Output randomization could be more effective than input 
randomization 
[Figure: LRAP (y-axis, 0.360 to 0.400) versus the number of randomly selected features k (x-axis, 0 to 600). Curves: Random forest; Random forest on Gaussian subspace (Algo 2) with m = 1, m = 7 and m = 14. Drug-interaction dataset (1554 labels, $t = 100$, $n_{\min} = 1$).]
Alternative random output subspace 
[Figure: LRAP (y-axis, 0.33 to 0.39) versus the number of projections m (x-axis, 0 to 150). Curves: Random forest; Random forest on random output subspace (Algo 2); Random forest on sparse Rademacher ($s = \sqrt{d}$) subspace (Algo 2); Random forest on sparse Rademacher ($s = 3$) subspace (Algo 2). Delicious dataset (981 labels, $k = \sqrt{p}$, $t = 100$, $n_{\min} = 1$).]
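The slide does not define the alternative projections, so the sketch below relies on the usual definitions as an assumption. A sparse Rademacher matrix has entries sqrt(s/m) times +1, 0 or -1 with probabilities 1/(2s), 1 - 1/s and 1/(2s). A random output subspace simply keeps m randomly chosen outputs. The function names are hypothetical, and the unscaled subsampling matrix is an illustrative choice.

```python
import math
import random


def sparse_rademacher_projection(m, d, s, rng):
    """m x d matrix with entries sqrt(s/m) * (+1, 0, -1),
    drawn with probabilities 1/(2s), 1 - 1/s, 1/(2s)."""
    scale = math.sqrt(s / m)

    def entry():
        u = rng.random()
        if u < 1.0 / (2.0 * s):
            return scale
        if u < 1.0 / s:
            return -scale
        return 0.0

    return [[entry() for _ in range(d)] for _ in range(m)]


def output_subsampling_projection(m, d, rng):
    """Each row selects one of m distinct, randomly chosen outputs (unscaled)."""
    chosen = rng.sample(range(d), m)
    return [[1.0 if j == c else 0.0 for j in range(d)] for c in chosen]


rng = random.Random(0)
m, d = 20, 300  # illustrative sizes
phi_s3 = sparse_rademacher_projection(m, d, s=3, rng=rng)
phi_sqrt = sparse_rademacher_projection(m, d, s=math.sqrt(d), rng=rng)
phi_sub = output_subsampling_projection(m, d, rng)


def nonzero_fraction(phi):
    """Share of nonzero entries, to compare against the expected 1/s."""
    return sum(1 for row in phi for v in row if v != 0.0) / (len(phi) * len(phi[0]))


print(nonzero_fraction(phi_s3), nonzero_fraction(phi_sqrt))
```

With s = 3 about one third of the entries are nonzero, and with s = sqrt(d) only about 1/sqrt(d) of them are, which is what makes the projection cheap to apply.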
Conclusions
- Lower computing time, without affecting accuracy.
- Optimizing input and output randomization could improve
  prediction performance.
Future work
Efficient technique to adjust random output space parameters so as
to reach the best accuracy and computing time trade-off.
Source code is available at
github.com/arjoly/random-output-trees.
