Anna Choromanska - Data-driven Challenges in AI: Scale, Information Selection, and Safety

Published in: Technology

  1. Data-driven challenges in AI: scale, information selection, and safety. Anna Choromanska, New York University, ECE Department, Tandon School of Engineering. Talk dedicated to my son, Marcin Tadeusz. Outline: Introduction; Data systems @ scale; Information selection; Safety; Conclusions.
  4. Characteristics of modern data: size, multi-modality, safety.
     • Size: the amount of available digital data is doubling every two years; by 2020 the amount of data we create and copy annually will reach 44 zettabytes (EMC Digital Universe study).
     • Multi-modality: the data comes from multiple modalities such as LiDARs (point clouds), cameras (images), natural language (text, speech), ...
     • Safety: data can be safe, or else anomalous, corrupted, or adversarial.
  7. Challenges driven by modern data: data-driven challenges in AI.
     • Scale: how to build AI systems @ scale?
     • Information selection: how to process data effectively, that is, choose the relevant data modalities or portions and avoid wasteful computations?
     • Safety: how to verify and trust the data?
  10. eXtreme classification problem. Problem setting:
     • multi-class classification: each data point is assigned one label
     • multi-label classification: each data point is assigned a subset of labels
     Applications: search engines, targeted advertising, aggregation of online news stories and their categorization, ...
  12. eXtreme classification problem. Goal: a good predictor whose training and testing time is logarithmic in the number of classes. Most multi-class algorithms run in O(k) time, where k is the number of classes; the lower bound is Ω(log k).
  14. Tree-based classifier (figure): h is the hypothesis inducing the split, x is a data point.
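To illustrate why a tree-based classifier can predict in time logarithmic in the number of classes, here is a minimal sketch. It assumes a binary tree whose internal nodes hold a linear hypothesis h(x) = w · x + b; the names `Node`, `predict`, and `build_balanced`, and the toy one-feature data, are illustrative assumptions and do not come from the talk.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    w: Optional[List[float]] = None   # weights of the hypothesis h (internal nodes only)
    b: float = 0.0                    # bias of the hypothesis h
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None       # set only at leaves


def h(node: Node, x: List[float]) -> float:
    """Assumed linear hypothesis inducing the split: h(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(node.w, x)) + node.b


def predict(root: Node, x: List[float]):
    """Route x from the root to a leaf; return (label, hypotheses evaluated)."""
    node, evaluated = root, 0
    while node.label is None:
        node = node.right if h(node, x) > 0 else node.left
        evaluated += 1
    return node.label, evaluated


def build_balanced(labels: List[int]) -> Node:
    """Toy balanced tree over sorted integer labels, splitting on feature 0."""
    if len(labels) == 1:
        return Node(label=labels[0])
    mid = len(labels) // 2
    # Threshold halfway between the two halves: x[0] above it goes right.
    threshold = (labels[mid - 1] + labels[mid]) / 2
    return Node(w=[1.0], b=-threshold,
                left=build_balanced(labels[:mid]),
                right=build_balanced(labels[mid:]))


tree = build_balanced(list(range(1024)))      # k = 1024 classes
label, evaluated = predict(tree, [700.0])
print(label, evaluated)                       # 10 hypotheses instead of 1024 scores
```

With k = 1024 classes and a balanced tree, a prediction evaluates log2(1024) = 10 hypotheses, whereas a one-against-all predictor would score all 1024 classes.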
  17. Pure and balanced split. Design a per-node objective function that favors:
     • balanced splits ⇒ efficient tree
     • pure splits ⇒ small classification error
  23. Objective function.

     $$J := \underbrace{\sum_{j=1}^{M}\sum_{l=j+1}^{M} \left|P_j - P_l\right|}_{\text{balancing term}} \;-\; \lambda_1 \underbrace{\sum_{y=1}^{K}\sum_{j=1}^{M}\sum_{l=j+1}^{M} \pi_i \left|P_j^y - P_l^y\right|}_{\text{class integrity term}} \;+\; \lambda_2 \underbrace{\left(\sum_{j=1}^{M} P_j - 1\right)}_{\text{multi-way penalty}}$$

     Slide annotation: purity term ∈ [−λ1, λ2].

     J ⇒ splitting criterion (objective function): given a set of n examples, each with one label (multi-class) or a subset (multi-label) of k labels, find a partitioner h that minimizes J.
     • Decreasing J leads to more pure and more balanced splits ⇒ efficient trees with logarithmic depth
     • Decreasing J leads to the reduction of the tree error ⇒ small-error trees
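The behaviour of the objective J can be checked numerically on a toy split. The sketch below is an assumption-laden reading of the slide, not the authors' implementation: it takes P_j as the average probability of routing an example to child j, P_j^y as the same average restricted to examples carrying label y, and the weight π as the empirical frequency of label y. The function name `objective_J` and its arguments are hypothetical.

```python
from itertools import combinations


def objective_J(route, labels, num_labels, lam1=1.0, lam2=1.0):
    """Evaluate J for one node.

    route[n][j]: assumed probability that example n is sent to child j.
    labels[n]:   set of labels attached to example n.
    """
    n, M = len(route), len(route[0])
    pairs = list(combinations(range(M), 2))          # all (j, l) with j < l

    # P_j: average probability of reaching child j.
    P = [sum(r[j] for r in route) / n for j in range(M)]
    balancing = sum(abs(P[j] - P[l]) for j, l in pairs)

    integrity = 0.0
    for y in range(num_labels):
        members = [r for r, ys in zip(route, labels) if y in ys]
        if not members:
            continue                                 # label absent at this node
        pi_y = len(members) / n                      # assumed: empirical label frequency
        Py = [sum(r[j] for r in members) / len(members) for j in range(M)]
        integrity += pi_y * sum(abs(Py[j] - Py[l]) for j, l in pairs)

    penalty = sum(P) - 1.0                           # multi-way penalty
    return balancing - lam1 * integrity + lam2 * penalty


labels = [{0}, {0}, {1}, {1}]
pure = objective_J([[1, 0], [1, 0], [0, 1], [0, 1]], labels, num_labels=2)
mixed = objective_J([[1, 0], [0, 1], [1, 0], [0, 1]], labels, num_labels=2)
print(pure, mixed)
```

Both toy splits are balanced, but only the first keeps each label in a single child, so it attains the lower value of J. This matches the claim that decreasing J favors pure and balanced splits.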
  28. Objective properties:
     • J extends to trees of arbitrary arity.
     • J can be easily optimized with SGD.
     • J leads to an algorithm for tree construction and training that runs online.
     • The approach accommodates classification as well as density estimation problems.
     • J can be used to learn both the label partitioning and the data representation simultaneously!
  31. Deep eXtreme classification. Deep representation learning: computation in the last layer can blow up...
  33. Experiments: classification.

     Table: Precisions P@1, P@3, and P@5 (%) and nDCG scores N@1, N@3, and N@5 (%) obtained by OAA, LPSR, FastXML, PFastreXML, and LdSM (d, M), where d is the depth of the tree and M its arity. Data set: Delicious-200k (N = 197k, D = 783k, K = 205k).

     Algorithm     P@1    P@3    P@5    N@1    N@3    N@5
     LPSR          18.59  15.43  14.07  18.59  16.17  15.13
     FastXML       43.07  38.66  36.19  43.07  39.70  37.83
     PFastreXML    41.72  37.83  35.58  41.72  38.76  37.08
     LdSM (35,2)   43.40  39.80  37.75  43.40  40.66  39.11

     Table: Prediction time [ms] per example for FastXML, PFastreXML, and LdSM on the AmazonCat, Wiki10, and Delicious-200k data sets.

     Data set         FastXML  PFastreXML  LdSM
     AmazonCat        1.21     1.34        0.49
     Wiki10           3.00     NA          1.21
     Delicious-200k   1.28     7.40        1.30
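For readers unfamiliar with the reported metrics, the sketch below computes P@k and nDCG@k for a single example with binary relevance, using the definitions common in extreme multi-label classification. The function names and the toy scores are illustrative assumptions; the numbers in the tables above were not produced by this code.

```python
import math


def top_k(scores, k):
    """Indices of the k highest-scored labels, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def precision_at_k(scores, relevant, k):
    """Fraction of the k highest-scored labels that are relevant."""
    return sum(1 for i in top_k(scores, k) if i in relevant) / k


def ndcg_at_k(scores, relevant, k):
    """DCG of the top-k ranking divided by the best achievable DCG."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, i in enumerate(top_k(scores, k)) if i in relevant)
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0


scores = [0.9, 0.1, 0.8, 0.3, 0.7]   # hypothetical predicted score per label
relevant = {0, 4}                    # hypothetical ground-truth label set

p1 = precision_at_k(scores, relevant, 1)
p3 = precision_at_k(scores, relevant, 3)
n1 = ndcg_at_k(scores, relevant, 1)
n3 = ndcg_at_k(scores, relevant, 3)
print(p1, p3, n1, n3)
```

At k = 1 both metrics reduce to whether the top-ranked label is relevant, which is why the P@1 and N@1 columns in the first table coincide.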
  41. Sensor selection problem for autonomous driving.
     Problem setting:
     • autonomous car equipped with multiple sensors
     • end-to-end training framework
     • steering command: the only available supervision
     Goal:
     • avoid a fast increase of computational complexity with the number of sensing devices
     • activate feature extractors for relevant inputs only
     • avoid overfitting to the simplest and most informative input
     • guarantee real-time operation
     • allow both discrete and continuous data selection
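The goal of activating feature extractors for relevant inputs only can be sketched as a discrete gate that picks one sensor and runs only that sensor's extractor. Everything below is a hypothetical illustration: the hand-written `toy_gate`, the placeholder extractors, and the sensor readings are assumptions, whereas the network in the talk learns its gating end to end from the steering command.

```python
calls = []  # records which extractors actually ran


def make_extractor(name):
    """Placeholder for an expensive per-sensor feature extractor."""
    def extractor(raw):
        calls.append(name)
        return [sum(raw) / len(raw)]     # placeholder feature: the mean reading
    return extractor


def toy_gate(inputs):
    """Hypothetical cheap gate: score each sensor by the spread of its signal."""
    return {name: max(raw) - min(raw) for name, raw in inputs.items()}


def select_and_run(inputs, gate, extractors):
    """Run only the feature extractor of the sensor with the highest gate score."""
    scores = gate(inputs)
    chosen = max(scores, key=scores.get)
    return chosen, extractors[chosen](inputs[chosen])


extractors = {name: make_extractor(name) for name in ("lidar", "camera")}
inputs = {"lidar": [0.2, 0.9, 0.4], "camera": [0.5, 0.5, 0.6]}

chosen, features = select_and_run(inputs, toy_gate, extractors)
print(chosen, features, calls)
```

Because the gate is much cheaper than any extractor, the cost of a forward pass stays close to that of a single-sensor network even as more sensors are added.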
  42. Hardware. Figure: the block diagram of the autonomous platform.
     • Traxxas X-Maxx remote control truck (RC car, scale 1/6)
     • DrivePX2 for computations
     • three SEKONIX AR0231 GMSL cameras facing the front of the platform and covering non-overlapping views; each camera has a 60-degree horizontal field of view
     • Velodyne VLP-16 LiDAR with 16 lasers covering a 30-degree vertical FOV and a 360-degree horizontal FOV
  43. Approach: multi-modality and mixed policy. Figure: the architecture of the reconfigurable network.
  44. Approach: multi-modality and mixed policy. Figure: different stages of training.
  45. Experiments: multi-modality and mixed policy.

     Table: Computational complexity comparison of different networks.

     Network name                                          FLOPs
     LiDAR only                                            26.17M
     LiDAR with gating                                     14.11M
     Single camera                                         25.38M
     Three cameras                                         76.01M
     Three cameras and LiDAR                               102.49M
     Three cameras and LiDAR with gating                   90.08M
     Multi-modal Experts Network (chosen sensor: LiDAR)    17.28M
     Multi-modal Experts Network (chosen sensor: camera)   29.61M
  46. Experiments: multi-modality and mixed policy.
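To make the complexity comparison concrete, the short calculation below expresses the FLOPs of the gated and expert networks relative to the full "three cameras and LiDAR" network. The figures are copied from the table in the talk; the choice of baseline and the dictionary keys are assumptions made for illustration.

```python
# FLOPs in millions, copied from the complexity table in the talk.
flops_m = {
    "Three cameras and LiDAR": 102.49,
    "Three cameras and LiDAR with gating": 90.08,
    "Multi-modal Experts Network (LiDAR chosen)": 17.28,
    "Multi-modal Experts Network (camera chosen)": 29.61,
}

baseline = flops_m["Three cameras and LiDAR"]

# A factor above 1 means that many times fewer FLOPs than the baseline.
reduction = {name: baseline / value
             for name, value in flops_m.items()
             if name != "Three cameras and LiDAR"}

for name, factor in reduction.items():
    print(f"{name}: {factor:.2f}x fewer FLOPs than the full network")
```

Under these numbers the experts network needs roughly 3.5 to 6 times fewer FLOPs than processing all sensors, depending on which sensor is chosen.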
  49. Problem of safety in autonomous driving.
     Problem setting: autonomous car instrumented with cameras and LiDAR and controlled by an end-to-end learning system.
     Goal:
     • develop an online monitoring framework for continuous real-time safety in learning-based control systems
     • monitor the validity of mappings from sensor inputs to actuator commands
  50. CEBGAN for safety in autonomous driving. Figure: conditional energy-based generative adversarial network (CEBGAN) framework for the controller-focused anomaly detection (CFAM).
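One way an energy-based monitor could be used online is to flag a (sensor input, command) pair as anomalous when its energy exceeds a threshold calibrated on safe data. The sketch below illustrates only that thresholding idea. The quantile rule, the function names, and `toy_energy` (a stand-in for a trained energy-based discriminator) are assumptions and are not taken from the CEBGAN framework itself.

```python
def calibrate_threshold(safe_energies, quantile=0.99):
    """Assumed rule: energy below which roughly `quantile` of safe samples fall."""
    ordered = sorted(safe_energies)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def monitor(pairs, energy_fn, threshold):
    """Flag each (sensor_input, command) pair whose energy exceeds the threshold."""
    return [energy_fn(s, c) > threshold for s, c in pairs]


def toy_energy(sensor_input, command):
    """Hypothetical stand-in for a trained energy function.

    Here a valid command is assumed to equal the mean of the sensor input,
    and the energy is the squared deviation from it.
    """
    expected = sum(sensor_input) / len(sensor_input)
    return (command - expected) ** 2


# Safe data: commands deviate from the expected value by at most 0.02.
safe_pairs = [([0.1 * i, 0.1 * i + 0.2], 0.1 * i + 0.1 + 0.01 * (i % 3))
              for i in range(50)]
threshold = calibrate_threshold([toy_energy(s, c) for s, c in safe_pairs])

flags = monitor([([0.0, 0.2], 0.1),    # command consistent with the input
                 ([0.0, 0.2], 0.9)],   # command far from the expected mapping
                toy_energy, threshold)
print(threshold, flags)
```

In a real system the energy would come from the trained discriminator, and the threshold would trade false alarms against missed anomalies.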
  51. Experiments. Figure: safe operation of the autonomous platform. Figure: anomalous operation of the autonomous platform.
  52. Experiments.
  56. Summary and future directions.
     Discussed approaches:
     • scale: using decision trees to scale AI systems to large data sizes
     • information selection: using reconfigurable networks to select relevant data
     • safety: using GANs to monitor the system's safety
     Future directions:
     • logarithmic space framework, modern recommendation systems, other applications
     • scaling information selection algorithms to a large number of inputs
     • ambiguous scenarios, increasing the system's robustness
     • practical sample complexity bounds
  58. Research group. Many thanks to the NVIDIA Autonomous Driving Team in New Jersey!!!
  59. NYU Tandon ECE Seminar Series on Modern AI. Doors are open to everybody!!!
     Past speakers: Yann LeCun, Yoshua Bengio, Stefano Soatto, Vladimir Vapnik, David Blei, Richard J. Roberts, Anima Anandkumar, Martial Hebert, Tony Jebara
     Future confirmed speakers: Manuela Veloso, Eric Kandel, Francis Bach, Raia Hadsell, Léon Bottou, Michael Kearns, Nicolò Cesa-Bianchi
