Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning From Simulation (ETRA 2014)

Jia-Bin Huang, Qin Cai, Zicheng Liu, Narendra Ahuja, and Zhengyou Zhang

Towards Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning From Simulation

Proceedings of ACM Symposium on Eye Tracking Research & Applications (ETRA), 2014

ETRA 2014 Best Paper Award

  1. Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning from Simulation. Jia-Bin Huang¹, Qin Cai², Zicheng Liu², Narendra Ahuja¹, and Zhengyou Zhang²
  2. Why? • Multimodal natural interaction • Gaze + touch, gesture, speech. If I were an iron man…
  3. Why? • Understanding user attention and intention
  4. Why? • Understanding interaction among people. Before Sunrise (1995)
  5. Sclera, limbus, pupil, iris, glint, cornea (like a spherical mirror). Mike @ Monster University
  6. Geometric Model of an Eye
  7. Gaze Estimation using Pupil Center and Corneal Reflections: interpolation-based, cross-ratio based, model-based
  8. Model-based Gaze Estimation • Detailed geometric modeling of the light sources, cornea, and camera [Guestrin and Eizenman, 2006] • Pros: accurate (reported performance < 1°); 3D gaze direction; head pose invariant • Cons: needs careful hardware calibration. Figure from [Guestrin and Eizenman, 2006]
  9. Interpolation-based Gaze Estimation • Learn a polynomial regression from subject-dependent calibration • Directly map from normalized image features to the 2D Point of Regard (PoR) [Cerrolaza et al., 2008] • Pros: simple to implement; no need for hardware calibration • Cons: head pose sensitive
  10. Cross-Ratio based Gaze Estimation • Gaze estimation by exploiting the invariance of a plane projectivity [Yoo et al. 2002] • Pros: simple to implement; no need for hardware calibration; head pose invariant • Cons: large subject-dependent bias occurs because of simplifying assumptions. Figure from [Coutinho and Morimoto 2012]
  11. The Basic Form of Cross-Ratio Method (image, cornea, display)
  12. Two Sources of Errors [Kang et al. 2008] • Angular deviation between the visual axis and the optical axis • The virtual image of the pupil center is not coplanar with the corneal reflections
  13. Improve Accuracy for Stationary Head • CR [Yoo-2002]: no correction • CR-Multi [Yoo-2005]: scale correction • CR-DV [Coutinho-2006]: scale and translation correction • CR-HOM [Kang-2007]: homography correction • CR-HOMN [Hansen-2010]: homography correction + residual interpolation
  14. Improve Robustness for Head Movements • CR [Yoo-2002]: no adaptation • CR-DD [Coutinho and Morimoto 2010]: adapts to eye depth variations • PL-CR [Coutinho and Morimoto 2012]: adapts to eye movements, assuming 1) weak perspective and 2) fixed eye parameters
  15. Accuracy of Gaze Prediction for Stationary Head vs. Robustness to Head Movement • No adaptation: CR [Yoo-2002] (no correction), CR-Multi [Yoo-2005] (scale correction), CR-DV [Coutinho-2006] (scale and translation correction), CR-HOM [Kang-2007] (homography correction), CR-HOMN [Hansen-2010] (homography correction + residual interpolation) • CR-DD [Coutinho-2010]: adapts to eye depth variations only • PL-CR [Coutinho-2012]: adapts to eye movements, assuming 1) weak perspective and 2) fixed eye parameters • This paper: adapts to eye movements with no assumptions on 1) weak perspective or 2) fixed eye parameters
  16. How? The Main Idea • Build upon the homography normalization method [Hansen et al. 2010] • Improve accuracy and robustness simultaneously by introducing the Adaptive Homography Mapping
  17. Adaptive Homography Mapping • Two types of predictor variables • The first captures head movement relative to the calibration position: the affine transformation between the glint quadrilaterals • The second captures gaze direction for the spatially varying mapping: the pupil center position in the normalized space • The mapping is a polynomial regression of degree two in these variables, with a parameter vector fit during training
  18. Training Adaptive Homography Mapping • Exploit a large amount of simulated data • The set of sampled head positions in 3D • The set of calibration target indices in the screen space • Objective function
  19. Minimizing the Objective Function • Minimize an algebraic error at each sampled head position • Use the solution from the algebraic error minimization as initialization • Minimize the re-projection errors using the Levenberg-Marquardt algorithm
  20. Visualize the Training Process • Eye gaze prediction results using the bias-correcting homography computed at the calibration position
  21. RMSE Comparisons Using Different Training Models • Differences are small with linear regression • The linear model is not sufficiently complex • Compensation using both predictor variables achieves the lowest errors
  22. Linear Regression
  23. Linear Regression • Adding the normalized pupil center corrects spatially varying errors
  24. Quadratic Regression
  25. Quadratic Regression
  26. Experimental Results – Synthetic data • Setup • Screen size 400 mm × 300 mm • Four IR lights • Camera with 13 mm focal length, placed slightly below the screen border (FoV ≈ 31 degrees) • Calibration position and eye parameters • Eye parameters from [Guestrin and Eizenman, 2006]
  27. Stationary Head: varying corneal radius
  28. Stationary Head: varying pupil-corneal distance
  29. Stationary Head: varying (horizontal) angle between optical/visual axis
  30. Stationary Head: varying (vertical) angle between optical/visual axis
  31. Head Movements Parallel to the Screen
  32. Head Movement along Depth Variation
  33. Tested at Another Head Position
  34. Noise Sensitivity Analysis
  35. Effect of Sensor Resolution (at calibration): focal length = 13 mm; focal length = 35 mm
  36. Effect of Sensor Resolution (at new position): focal length = 13 mm; focal length = 35 mm
  37. Real Data Evaluation – Programmable Hardware Setup • Off-axis IR light sources • Stereo camera (we use only one in this work) • On-axis ring light
  38. Real Data Evaluation – Feature Detection • Detecting glints and pupil center
  39. Averaged Gaze Estimation Error at calibration position
  40. Averaged Gaze Estimation Error: calibrated at 600 mm from screen; calibrated at 500 mm from screen
  41. Conclusions • A learning-based approach for simultaneously compensating (1) spatially varying errors and (2) errors induced by head movements • Generalizes previous work on compensating head movements using glint geometric transformation [Cerrolaza et al. 2012] [Coutinho and Morimoto 2012] • Leveraging simulated data avoids tedious data collection
  42. Future Work • Consider subject-dependent parameters in the learning and inference of the adaptive homography • Integrate binocular information; please see the poster by Zhengyou Zhang and Qin Cai, "Improving Cross-Ratio-Based Eye Tracking Techniques by Leveraging the Binocular Fixation Constraint" • Extensive user study using a physical setup
  43. Comments or questions? Jia-Bin Huang jbhuang1@Illinois.edu, Narendra Ahuja n-ahuja@Illinois.edu, Zhengyou Zhang zhang@microsoft.com, Qin Cai qincai@microsoft.com, Zicheng Liu zliu@microsoft.com
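
The slides build on the homography normalization method [Hansen et al. 2010] and mention minimizing an algebraic error before Levenberg-Marquardt refinement. A minimal sketch of that baseline pipeline follows, not the adaptive mapping introduced in the talk. It assumes four glints ordered consistently with the unit-square corners, uses a standard direct linear transform (DLT) as the algebraic-error step, and omits the refinement step. All function and variable names are hypothetical and do not come from the slides.

```python
import numpy as np

# Corners of the normalized space; glints must be given in the same order.
UNIT_SQUARE = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)


def fit_homography(src, dst):
    """Estimate a 3x3 H with dst ~ H @ src by minimizing algebraic error (DLT)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The right singular vector of the smallest singular value solves A h = 0.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]  # assumes h[2, 2] is nonzero (non-degenerate setup)


def apply_homography(h, pts):
    """Apply H to one or more 2D points and return an (N, 2) array."""
    pts = np.atleast_2d(np.asarray(pts, dtype=float))
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


def normalize_pupil(glints, pupil):
    """Map the pupil center into the space where the glints form a unit square."""
    h_norm = fit_homography(glints, UNIT_SQUARE)
    return apply_homography(h_norm, pupil)[0]


def calibrate_bias_homography(glints_per_target, pupils, targets):
    """Fit the bias-correcting homography from normalized pupils to screen targets."""
    normalized = [normalize_pupil(g, p) for g, p in zip(glints_per_target, pupils)]
    return fit_homography(normalized, targets)


def predict_gaze(h_bias, glints, pupil):
    """Predict the on-screen point of regard for one frame."""
    return apply_homography(h_bias, normalize_pupil(glints, pupil))[0]
```

In use, `calibrate_bias_homography` would be called once with the glints, pupil centers, and known target positions collected during calibration (at least four targets), and `predict_gaze` would then be called per frame. The adaptive mapping described on the slides would go further by letting the correction vary with head movement and gaze direction, which this sketch does not attempt.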
