VC-dimension,
very fast tutorial
O. Teytaud
f = a function in F
For example:
● F = set of linear functions on R^d
● F = set of neural networks with N neurons in two layers on R^d
● F = set of polynomial functions of degree k on R^d
We want to pick a good f in F. E.g., for some distribution on (x, y):
● f minimizing the expectation of (f(x)-y)^2
● f minimizing the negative log-likelihood of the data, -log P(f(x)=y)
We have some set F of functions; we want to find f in F such that Lf = E(something) is minimal.
Noisy optimization boils down to fitting data:
● y = 1 with probability A + B||x-x*||^2, 0 otherwise
● Or maybe E(y|x) = A + B||x-x*||^2
● We want to find f(x) = A + B||x-f*||^2 minimizing E(f(x)-y)^2
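Loss function: Lf(x, y), for instance (f(x)-y)^2
We want E Lf small.
All we know is the empirical error: (1/n) sum_{i=1..n} Lf(x_i, y_i)
General case, with probability >= 1-eta:
E Lf <= empirical error + a term depending on V, n and eta,
with V the VC-dimension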
Remarks:
● Assumes that the data are independently drawn
● Distribution-free bounds
● The bound scales as 1/n if the empirical error is very small, 1/sqrt(n) otherwise
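
The exact formula on the original slide did not survive extraction. As an illustration only, the sketch below uses one commonly quoted form of Vapnik's bound for losses bounded in [0, 1]; that it matches the slide is an assumption. The name `vc_bound` and its arguments are hypothetical.

```python
import math

def vc_bound(emp_err, n, V, eta):
    """Upper bound on the expected loss, holding with probability >= 1 - eta.

    Assumed form (Vapnik, losses in [0, 1]), not necessarily the slide's:
        eps   = 4 * (V * (ln(2n/V) + 1) - ln(eta/4)) / n
        bound = emp_err + (eps / 2) * (1 + sqrt(1 + 4 * emp_err / eps))
    """
    if not (0 < V <= n and 0 < eta < 1):
        raise ValueError("need 0 < V <= n and 0 < eta < 1")
    eps = 4.0 * (V * (math.log(2.0 * n / V) + 1.0) - math.log(eta / 4.0)) / n
    return emp_err + 0.5 * eps * (1.0 + math.sqrt(1.0 + 4.0 * emp_err / eps))

# Gap between the bound and the empirical error, for growing sample sizes.
for n in (10**3, 10**4, 10**5, 10**6):
    gap_zero = vc_bound(0.0, n, V=10, eta=0.05)          # empirical error = 0
    gap_noisy = vc_bound(0.1, n, V=10, eta=0.05) - 0.1   # empirical error = 0.1
    print(n, round(gap_zero, 5), round(gap_noisy, 5))
```

With zero empirical error the gap equals eps, which shrinks like 1/n up to a logarithmic factor; with a non-zero empirical error the gap is dominated by sqrt(emp_err * eps), which shrinks like 1/sqrt(n).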
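OK, but what is the VC-dimension?
First define the shattering coefficient S(n) = the maximum, over all sets of n points, of the number of distinct ways functions in F can split the points into two classes.
The VC-dimension is the largest n (possibly infinite) such that S(n) = 2^n.
Explanation? It is >= 7 if:
● For at least one set S of 7 points,
● for each of the 2^7 subsets A of S, some f in F is >= c on A and < c on the complement of A in S.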
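
To make shattering concrete, here is a minimal brute-force sketch for a toy class that is not on the slides: 1-D threshold classifiers x -> (x >= c). The function names are hypothetical. It counts the distinct labelings of a point set and checks whether all 2^n of them are reached.

```python
def threshold_labelings(points):
    """Distinct labelings of `points` produced by classifiers x -> (x >= c)."""
    pts = sorted(points)
    # One threshold below all points, one between each consecutive pair,
    # one above all points: together they cover every behaviour of x >= c.
    candidates = [pts[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(pts, pts[1:])]
    candidates.append(pts[-1] + 1.0)
    return {tuple(x >= c for x in points) for c in candidates}

def is_shattered(points):
    """True if all 2^n labelings of the n points are realized."""
    return len(threshold_labelings(points)) == 2 ** len(points)

print(is_shattered([0.0]))        # one point: both labelings are reachable
print(is_shattered([0.0, 1.0]))   # two points: only 3 of the 4 labelings
```

For this toy class, n distinct points get n + 1 labelings, so one point is shattered but no pair is: the VC-dimension is 1.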
In many sufficiently smooth cases, the VC-dimension is the number of parameters, but not always...
Polynomial of degree k over R^d:
Linear combinations of V functions
have VC-dimension at most V
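
The slide's formula for polynomials is missing from the extracted text. One consequence of the rule above can still be sketched: a polynomial of degree at most k on R^d is a linear combination of C(d+k, k) monomials, so that count is an upper bound on the VC-dimension. The helper name `n_monomials` is hypothetical.

```python
from math import comb

def n_monomials(d, k):
    """Number of monomials of degree <= k in d variables: C(d + k, k)."""
    return comb(d + k, k)

# Linear functions with a constant term (k = 1) have d + 1 basis functions.
print(n_monomials(10, 1))
# Degree 2 in 2 variables: 1, x, y, x^2, xy, y^2.
print(n_monomials(2, 2))
# The count grows quickly with the degree.
print(n_monomials(10, 3))
```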
Remarks:
● If the distribution on X is known, there are better bounds
● VC bounds are nearly optimal, up to huge constants
● For distribution-dependent rates, there exist faster results
● For distribution-dependent rates, there are more general results
(even with infinite VC-dimension); see
  ● Donsker classes
  ● Glivenko-Cantelli classes (convergence, no rate)
  ● Covering numbers
  ● Fat-shattering dimension
Structural risk minimization:
we choose f minimizing the VC bound on E Lf.
In particular, we can have several families F and minimize
this bound over all of them.
Principle = penalization of families with high VC-dimension:
● we want a small empirical error
● we want a small VC-dimension (complexity)
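
A minimal sketch of this selection rule, assuming a Vapnik-style bound for losses in [0, 1] as the quantity to minimize (the slide's own formula is not in the extracted text). The function names, the candidate families, and their empirical errors and VC-dimensions are all made up for illustration.

```python
import math

def penalized_risk(emp_err, n, V, eta):
    """Assumed bound: empirical error plus a penalty growing with V."""
    eps = 4.0 * (V * (math.log(2.0 * n / V) + 1.0) - math.log(eta / 4.0)) / n
    return emp_err + 0.5 * eps * (1.0 + math.sqrt(1.0 + 4.0 * emp_err / eps))

def select_family(candidates, n, eta=0.05):
    """candidates: iterable of (name, empirical_error, vc_dimension)."""
    scored = {name: penalized_risk(err, n, V, eta) for name, err, V in candidates}
    return min(scored, key=scored.get), scored

# Hypothetical numbers: richer families fit better but pay a larger penalty.
candidates = [
    ("linear", 0.30, 3),
    ("quadratic", 0.10, 6),
    ("degree-10", 0.09, 66),
]
best, scores = select_family(candidates, n=5000)
print(best)
for name, value in scores.items():
    print(name, round(value, 3))
```

In this made-up example the richest family has the smallest empirical error, but its penalty outweighs the gain, so the intermediate family is selected.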
Overfitting = choosing a function which is
empirically good but generalizes poorly.
VC-dimension is about avoiding overfitting.
Structural risk minimization =
minimizing the VC bound.
However, in everyday life, people use
cross-validation: choose the family of
functions such that
● learning on half of the examples
● testing on the other half
performs well.
But: the VC-dimension is convenient for proving
useful theorems.
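
The half/half procedure can be sketched as follows. The data are synthetic (y = 2x plus Gaussian noise, an assumption made for the example), and the three families and all function names are hypothetical. The nearest-neighbour family memorizes the training half, so it is empirically perfect there but generalizes worse than the linear fit.

```python
import random

def fit_constant(xs, ys):
    mean = sum(ys) / len(ys)
    return lambda x: mean

def fit_linear(xs, ys):
    # Ordinary least squares in one dimension, closed form.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

def fit_nearest_neighbour(xs, ys):
    # Memorizes the training set: zero error on the examples it has seen.
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(f, xs, ys):
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def holdout_select(families, xs, ys):
    """Learn on the first half, test on the second half, keep the best family."""
    half = len(xs) // 2
    errors = {
        name: mse(fit(xs[:half], ys[:half]), xs[half:], ys[half:])
        for name, fit in families.items()
    }
    return min(errors, key=errors.get), errors

rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(400)]
ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
families = {
    "constant": fit_constant,
    "linear": fit_linear,
    "nearest neighbour": fit_nearest_neighbour,
}
best, errors = holdout_select(families, xs, ys)
train_err_nn = mse(fit_nearest_neighbour(xs[:200], ys[:200]), xs[:200], ys[:200])
print(best, errors, train_err_nn)
```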
Vapnik's books: centered on Vapnik's work, but good books
Devroye-Györfi-Lugosi = very good book,
mainly on the binary case
Vidyasagar = uses covering numbers, good book
Feedback:
- I am trying to promote, within the team, the idea of talks for a wide audience
- do you find that interesting?
