Data Driven Development of Autonomous Driving at BMW

"The development of autonomous driving cars requires the handling of huge amounts of data produced by test vehicles and solving a number of critical challenges specific to the automotive industry.

In this talk we will describe these challenges and how we, at BMW, are overcoming them by adapting and reinventing existing big data solutions for our end-to-end data journey for autonomous driving. Our journey involves ingesting data produced by a variety of sensors into a dedicated Hadoop cluster, decoding the data, conducting quality control, processing and storing the data on the clusters, making it searchable, analyzing it and exposing it to the engineers working on algorithm development.

In the first part of the talk we will present a general overview of the challenges we faced and the lessons we learned from them. In the second part we will deep dive into the most interesting technical issues. These include: dealing with automotive formats and standards that are not designed for distributed processing; defragmentation of sensory data; assuring the quality of the data coming from complex car hardware and software components; efficient data search across petabytes of data; and reprocessing the computing components running in the car inside the data center, which typically requires high performance computing."

Speakers:
Felix Reuthlinger, Data Engineer for Autonomous Driving, BMW Group
Dogukan Sonmez, Senior Software Engineer, BMW Group

  1. BMW at DataWorks Summit 2018, Berlin, 18.04.2018: DATA DRIVEN DEVELOPMENT OF AUTONOMOUS DRIVING AT BMW
  2. ABOUT THE SPEAKERS. Felix Reuthlinger: Data Engineer for AD; joined BMW in 2015; before joining AD, Big Data Architect at BMW central IT; focus: data center and data flow architecture for AD; co-founder and member of munich-datageeks (http://munich-datageeks.de/); strong in: Spark, Scala. Dogukan Sonmez: Software Engineer for AD; joined BMW in 2017; prior to BMW worked on various big data and machine learning projects at SAP, Siemens and Sony; focus: data and simulation for AD; strong in: distributed systems and software craftsmanship; hobbies: building wooden furniture, painting, IoT. Data Driven Development of Autonomous Driving at BMW | DataWorks Summit Berlin | April 2018
  3. AGENDA. Why autonomous driving requires data. How we get data, process data, serve data, and ensure data quality.
  4. WHY AUTONOMOUS DRIVING REQUIRES DATA
  5. AUTONOMOUS DRIVING LEVELS (0 to 5). Level names: no support, assistance, partly automated, highly automated, fully automated, autonomous. Driver's task: driver has full control; driver controls steering and checks forward motion; driver checks forward and sideward motion; driver is ready to take control at any time; driver only required for certain parts of the track. Vehicle's task: vehicle controls forward motion; vehicle controls forward and sideward motion; vehicle requests driver to take over control based on situations; vehicle does not request driver to take over control; no driver required. Driver state labels: hands on; hands temp. off; eyes temp. off; hands off; eyes off; hands off; mind off; passenger. Vehicle labels: G11 / G30; iNEXT; iNEXT pilot series; tbd. Annotations: transition of responsibility from human to machine; technological quantum leap; technological 'moonshot'. *Source: SAE (Society of Automotive Engineers) International, Levels of Automation.
  6. ADAS* SYSTEM SETUP (*Autonomous Driving Assistance Systems): 23 sensors, BMW 5 Series. Sensors: full range radar, night vision, side view camera, side range radar, surround view camera, ultrasonic, stereo front camera, rear view camera. Functions: steering and lane control assistant incl. lane change assistant, surround view, active cruise control, speed limit assist, emergency steering assist, wrong way assist, crossroad assist.
  7. DATA DRIVEN DEVELOPMENT FOR AD @ BMW. Diagram labels: test drives; several TB/h, up to 500 PB/a; data ingest to data center; organize; structure; ML data sets; ML experiments / training; KPI report; deployment of trained algorithms; phase out / balance datasets; combinatorial boost of scenarios; synthetic data; focus of this talk.
  8. We have hundreds of PB of data to crunch … Have a lot of squirrels do it? Probably not …
  9. DATA JOURNEY OVERVIEW. Logger → file copy / ingest instance → Hadoop file(s) → InputFormat, defragmentation, decoding → meta store (example rows: speed 25 km/h, weather sunny; speed 30 km/h, weather sunny) → analytics, functions, learning, … Example user request: I want to work on data from a sunny drive in June, …
  10. HOW WE GET DATA
  11. FILE FORMAT STANDARD. MDF4 (Measurement Data Format, version 4) → https://www.asam.net/standards/detail/mdf/ Standard in the automotive industry (by the ASAM organization, https://www.asam.net/ ). Organized in binary blocks. MDF4 has multiple usage types: sorted / unsorted content; for recording (hardware loggers) or calculated data; for data exchange and long-term storage. BMW AG is one of the standard authors.
  12. FILE FORMAT – HOW WE USE IT. Logger centric: main use case → hardware logger in the car. Very high data bandwidth → write down data quickly (FIFO). Our MDF4 files: unsorted content; multiple small blocks for metadata; one continuous big block for storing record payload data. Figure captions: example generated with our custom implementation of Mdf4Writer; example hardware logger in the car.
  13. FILE FORMAT. File layout: header (ID block) → this is MDF4 of version X; MDF blocks, each with a block header (block description, size), links link[0], link[1], …, link[n], and a data section (fields); a data block with a block header (block description, size) and a data section (records / payloads). The data block has dynamic record size and no indexing → this causes the file to be not splittable. Substructures, like structs, contain metadata down to the data block. We use only 1 data block here; it covers 99.99% of the total volume. Example file: header: car 1, drive 1, file set 1, file #1, …; payload 0E 12 1A …
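
Slide 13 states that dynamic record sizes without an index make a file non-splittable. A minimal sketch of why, assuming a hypothetical layout in which each record is a 4-byte length prefix followed by its payload (this is an illustration, not the actual MDF4 record layout): the start of record k is only known after records 0 to k-1 have been read, so a reader cannot begin in the middle of the file.

```python
import io
import struct

# Hypothetical record layout for illustration only (NOT the MDF4 layout):
# a 4-byte little-endian payload length followed by the payload bytes.
LENGTH_PREFIX = struct.Struct("<I")


def write_records(payloads):
    """Serialize payloads back to back, without any index of record offsets."""
    buf = io.BytesIO()
    for payload in payloads:
        buf.write(LENGTH_PREFIX.pack(len(payload)))
        buf.write(payload)
    return buf.getvalue()


def read_records(data):
    """Yield (offset, payload) by scanning sequentially from the first byte.

    Each offset is discovered only after the previous record's length
    has been read, which is why an arbitrary split point is unusable.
    """
    stream = io.BytesIO(data)
    while True:
        offset = stream.tell()
        prefix = stream.read(LENGTH_PREFIX.size)
        if len(prefix) < LENGTH_PREFIX.size:
            return
        (length,) = LENGTH_PREFIX.unpack(prefix)
        yield offset, stream.read(length)


data = write_records([b"\x0e\x12\x1a", b"\x87", b"\x1b\xaa"])
records = list(read_records(data))
```

With fixed-size records the offset of record k would simply be k times the record size, and any byte range could be processed independently.
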
  14. DATA COLLECTION FLEET: 40 vehicles in 2017, BMW 7 Series
  15. DATA LOGGING IN THE CAR. Logger config: car 1, drive 1, … The logger writes FIFO to SSD, into file set 1 and file set 2, and rolls over to the next file at 2 GB (ca. 5 s of data). Example byte streams: 0E 1A 87 … 12 1B AA … 00 01 2A … Resulting files: header: car 1, drive 1, file set 1, file #1, … (0E 12 1A); header: car 1, drive 1, file set 1, file #2, … (87 1B AA); header: car 1, drive 1, file set 2, file #1, … (00 01 2A); header: car 1, drive 1, file set 2, file #2, … (04 23 0A).
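
As a rough back-of-the-envelope check of the figures on slide 15, the roll-over size and duration imply a write rate and a file count per hour. The sketch below assumes decimal units (1 TB = 1000 GB), a constant rate, and that the 2 GB / 5 s figure applies to a single file stream; none of these are stated on the slide.

```python
FILE_SIZE_GB = 2.0        # roll-over size from slide 15
SECONDS_PER_FILE = 5.0    # "ca. 5 s" from slide 15

# Assumption: constant write rate and decimal units (1 TB = 1000 GB).
rate_gb_per_s = FILE_SIZE_GB / SECONDS_PER_FILE
rate_tb_per_h = rate_gb_per_s * 3600 / 1000
files_per_hour = 3600 / SECONDS_PER_FILE

print(f"{rate_gb_per_s:.2f} GB/s, {rate_tb_per_h:.2f} TB/h, {files_per_hour:.0f} files/h")
```

Under these assumptions one stream produces about 0.4 GB/s, roughly 1.44 TB per hour, in about 720 files per hour, which is in line with the "several TB/h" and "potentially thousands of files per driving session" mentioned elsewhere in the deck.
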
  16. HOW WE PROCESS DATA
  17. DATA PROCESSING. Steps: read → defragment → decode → store. MDF4 files in Hadoop / HDFS (e.g. car 1, drive 1: file set 1, file #1; file set 1, file #2; file set 2, file #1) are read with Hadoop / Spark through the InputFormat into RDDs / DFs, and the decoded results are stored in Hadoop / HDFS (e.g. speed 25 km/h, weather sunny; speed 30 km/h, weather sunny). Drive metadata and merged header information go to the meta store (Hadoop / HBase). Note: we parallelize by scaling out over multiple driving sessions.
  18. DEEP DIVE ABOUT REDUCING I/O. Continuous data collection requires continuous processing. Challenges: potentially thousands of files per driving session; MDF4 uses dynamic record length, so there is no clear split; seeks inside the file; defragmentation = grouping transformation. Goal: reduce network I/O.
  19. MDF4 INPUT FORMAT
  20. CUSTOM INPUT FORMAT IMPLEMENTATION. 2 GB file size and dfs.blocksize=2G → 1 file = 1 input split. Diagram: Mdf4InputFormat hands each InputSplit to an Mdf4Reader in an executor / partition; the resulting RDD / DF contains Mdf4Record = metadata + payload (binary).
  21. DEFRAGMENTATION
  22. DATA REPRESENTATION IN THE CAR BUS. An image from the camera is sent as a UDP datagram (Ethernet, IPv4, UDP, SomeIP), continued in several Ethernet / IPv4 fragments, followed by the next UDP datagram (Ethernet, IPv4, UDP, SomeIP), …
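
The fragmentation shown on slide 22 can be illustrated with a small sketch. It is simplified on purpose: the `Fragment` type, its field names and the byte-based offsets are assumptions for illustration (real IPv4 fragment offsets are counted in 8-byte units, and the talk does not show BMW's implementation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    message_id: int  # hypothetical ID shared by all fragments of one datagram
    offset: int      # byte offset of this fragment within the datagram
    more: bool       # True if further fragments follow
    data: bytes


def fragment(message_id, payload, max_size):
    """Split one payload into fragments of at most max_size bytes."""
    chunks = [payload[i:i + max_size] for i in range(0, len(payload), max_size)]
    return [
        Fragment(message_id, i * max_size, i < len(chunks) - 1, chunk)
        for i, chunk in enumerate(chunks)
    ]


def reassemble(fragments):
    """Return the original payload, or None if any fragment is missing."""
    ordered = sorted(fragments, key=lambda f: f.offset)
    if not ordered or ordered[-1].more:
        return None  # nothing received, or the final fragment is missing
    expected_offset = 0
    parts = []
    for frag in ordered:
        if frag.offset != expected_offset:
            return None  # gap: a fragment in the middle is missing
        parts.append(frag.data)
        expected_offset += len(frag.data)
    return b"".join(parts)


image = bytes(range(10))
frags = fragment(1, image, max_size=4)
```

Reassembly only succeeds when every fragment is present, which is the completeness requirement that the later defragmentation slides build on.
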
  23. DATA STRUCTURE FRAGMENTATION. ~98% of the data structures (e.g. an image from the camera) are within one MDF4 file. In ~2% of the cases, data overlaps over multiple files.
  24. WHY NOT USE WHAT IS ALREADY AVAILABLE. Reduce-by-key / group-by-key will shuffle most / all fragments. The function applied on grouping still has a huge result volume (partial image). Defragmentation requires completeness; incomplete, partially defragmented results might again require a shuffle. Example that works for aggregation: the (key, value) pairs (A, 12), (A, 23), (B, 54), (A, 47), (B, 24) reduce to the sums A = 82 and B = 78. Example with fragments: three values for key A reduce to one value for A, but this will not get us a result: what if something is missing?
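
The contrast on slide 24 can be reproduced in a few lines. This is a plain Python stand-in for a reduce-by-key, not Spark code; the fragment bytes are made up for illustration.

```python
def reduce_by_key(pairs, combine):
    """Combine all values that share a key, in arrival order."""
    result = {}
    for key, value in pairs:
        result[key] = combine(result[key], value) if key in result else value
    return result


# Aggregation: the values from slide 24 reduce to small, meaningful sums.
sums = reduce_by_key(
    [("A", 12), ("A", 23), ("B", 54), ("A", 47), ("B", 24)],
    lambda a, b: a + b,
)

# Fragments: concatenation also "reduces", but the result is as large as
# the payload itself, and nothing reveals that the middle fragment of A
# (hypothetical bytes) never arrived.
fragments = [("A", b"\x0e"), ("A", b"\x1a")]
joined = reduce_by_key(fragments, lambda a, b: a + b)
```

The sums are correct whatever subset arrives first, whereas the concatenated bytes are only useful once completeness has been checked separately.
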
  25. DEFRAGMENTATION PROCESS (example: completeness = 4 fragments). Steps: start from the RDD with fragments; create the local-complete RDD; reduce the local-complete RDD; create the local-incomplete RDD from the remaining fragments; reduce the local-incomplete RDD. Finally, union RDD #3 and RDD #5 and discard the remaining incomplete fragments. [Diagram: fragments #1 to #4 of each message distributed over executors and partitions, shown for RDD #1 to RDD #5.]
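
One possible reading of the process on slide 25, sketched in plain Python rather than Spark: messages whose fragments all sit in one partition are assembled locally, and only the leftover fragments are grouped across partitions. The function names, the `(message_id, index, data)` tuple shape and the fixed `completeness` count are assumptions for illustration, not the implementation from the talk.

```python
from collections import defaultdict


def group_by_message(fragments):
    """Group (message_id, index, data) tuples by message_id."""
    groups = defaultdict(list)
    for message_id, index, data in fragments:
        groups[message_id].append((index, data))
    return groups


def assemble(parts):
    """Concatenate fragment payloads in index order."""
    return b"".join(data for _, data in sorted(parts))


def defragment(partitions, completeness):
    """Return (complete messages, number of fragments that needed a shuffle)."""
    complete = {}
    remaining = []
    # Phase 1: within each partition, assemble messages that are locally complete.
    for partition in partitions:
        for message_id, parts in group_by_message(partition).items():
            if len(parts) == completeness:
                complete[message_id] = assemble(parts)
            else:
                remaining.extend((message_id, i, d) for i, d in parts)
    # Phase 2: only leftover fragments are grouped across partitions
    # (this stands in for the shuffle); still-incomplete messages are discarded.
    for message_id, parts in group_by_message(remaining).items():
        if len(parts) == completeness:
            complete[message_id] = assemble(parts)
    return complete, len(remaining)


partitions = [
    # message 1 is complete locally; message 2 starts here
    [(1, 0, b"a"), (1, 1, b"b"), (1, 2, b"c"), (1, 3, b"d"), (2, 0, b"e"), (2, 1, b"f")],
    # message 2 continues here; message 3 lost three of its fragments
    [(2, 2, b"g"), (2, 3, b"h"), (3, 0, b"x")],
]
messages, shuffled = defragment(partitions, completeness=4)
```

In this toy input only 5 of 9 fragments take part in the cross-partition step; with ~98% of data structures inside one file (slide 23), the share that needs shuffling would be much smaller.
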
  26. SHUFFLE RESULTS: EXAMPLE. This example result shows limited shuffling.
  27. HOW WE SERVE DATA
  28. THE DATA. Example rows (speed, weather, environment): 85 km/h, rainy, highway; 30 km/h, sunny, urban. V1
  29. LIDAR (Light Detection and Ranging). Good for generating a precise 3D map. Not reliable during bad weather conditions.
  30. RADAR. Long and short range in the car. Good for detecting moving objects. Reliable during bad weather conditions.
  31. IMAGE. RCCC format, compressed or uncompressed. Good for object recognition (traffic lights, street signs, lane lines). [Figure: RCCC image.]
  32. WHO ARE THE DATA USERS. Machine learning engineer; software engineer; algorithm developer; robotics engineer; applied scientist.
  33. WHICH DATA USERS ARE INTERESTED IN ("wanted" posters). Parquet or ORC: drive on a highway on a rainy day. JPEG: camera images. Rosbag: sensor data (IMU, GPS). DF or DS: lidar and radar data. HDF5: urban drive with traffic lights.
  34. WHAT OUR USERS DO WITH THAT. Building driving strategy; signal processing, sensor fusion; sensor validation; simulation.
  35. OUR PHILOSOPHY FOR DATA PROVISIONING. Evangelize data driven development. Big data trainings. Onboarding new users to use our cluster. Abstract away data cluster complexity but also allow users to develop on top of it.
  36. DATA ACCESS CHALLENGES. Scalable way of accessing big data. Continuously changing data structure makes it harder to work with data. Variety and complexity of data and their formats. Data centers across the world and data shipping (in case privacy is not affected).
  37. DATA ACCESS. Data in Hadoop / HDFS and the meta store (e.g. speed 25 km/h, weather sunny; speed 60 km/h, weather sunny) is accessed through a data search API on Hadoop / Spark / … that returns an RDD / DF. Example query: select (speed, front_camera_image) where (weather=sunny and speed > 50). Example result with columns speed, weather, front_camera_image: 60 km/h, sunny; 55 km/h, sunny.
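
The example query on slide 37 can be mimicked with a tiny in-memory stand-in. The `search` function, the column names and the image placeholders are hypothetical; the slides do not show the real data search API, which returns an RDD / DF rather than a list.

```python
# Hypothetical metadata rows; image values are placeholders, not real data.
rows = [
    {"speed": 25, "weather": "sunny", "front_camera_image": "img_001"},
    {"speed": 60, "weather": "sunny", "front_camera_image": "img_002"},
    {"speed": 55, "weather": "sunny", "front_camera_image": "img_003"},
    {"speed": 85, "weather": "rainy", "front_camera_image": "img_004"},
]


def search(rows, columns, predicate):
    """Keep rows that match the predicate and project the requested columns."""
    return [{c: row[c] for c in columns} for row in rows if predicate(row)]


# select (speed, front_camera_image) where (weather=sunny and speed > 50)
result = search(
    rows,
    columns=("speed", "front_camera_image"),
    predicate=lambda r: r["weather"] == "sunny" and r["speed"] > 50,
)
```

The point of such an API, as the surrounding slides put it, is that users describe which drives they want and do not need to know where or how the files are stored.
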
  38. HOW WE ENSURE DATA QUALITY
  39. WHY DATA QUALITY IS IMPORTANT. We don't want to waste time and resources on unnecessary test drives. We don't want to store data that users cannot use. We don't want to provide bad data.
  40. IT'S ALL ABOUT … the GOOD, the BAD, the UGLY
  41. WHAT COULD POSSIBLY GO WRONG. Logger; image frame drops; calibration errors; configuration errors; corrupted sensor data.
  42. WHICH DATA IS INTERESTING TO USERS. Highway / urban drives; drives at night; rainy day drives; drives through crossroads.
  43. ENSURING DATA QUALITY. Centralized data quality framework. Built on top of Spark. Kafka for inter-application communication.
  44. CUSTOM INPUT DISCRETIZED STREAM IMPLEMENTATION. CustomInputDStream (based on InputDStream). Creates a new RDD once new data is available. Uses the streaming scheduler to run continuously. Triggered once a new message is sent.
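
The behaviour described on slide 44 (produce a batch only when new data has been announced) can be sketched without Spark. The class and method names below are hypothetical; the `compute` method only loosely mirrors the idea of a stream that yields either a new batch or nothing on each scheduler tick, and the file paths are placeholders.

```python
import queue


class NotificationBatcher:
    """Collects 'new data available' messages and turns them into batches."""

    def __init__(self):
        self._pending = queue.Queue()

    def notify(self, path):
        """Called when a message announces new data (e.g. a newly ingested file)."""
        self._pending.put(path)

    def compute(self):
        """Called on every scheduler tick; returns a batch, or None if nothing is new."""
        batch = []
        while True:
            try:
                batch.append(self._pending.get_nowait())
            except queue.Empty:
                break
        return batch or None


batcher = NotificationBatcher()
first = batcher.compute()              # nothing announced yet
batcher.notify("drive_1/file_1")       # placeholder paths
batcher.notify("drive_1/file_2")
second = batcher.compute()             # both announcements in one batch
third = batcher.compute()              # queue drained again
```

Returning nothing on idle ticks keeps the scheduler running continuously without creating empty work, which matches the "creates a new RDD once new data is available" bullet.
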
  45. DATA QUALITY FRAMEWORK. [Diagram labels: HDFS; Hadoop / HDFS; example files: header: car 1, drive 1, file set 1, file #1, … (0E 12 1A); header: car 1, drive 1, file set 2, file #1, … (00 01 2A).]
  46. DATA DRIVEN DEVELOPMENT FOR AD @ BMW. Diagram labels: test drives; several TB/h, up to 500 PB/a; data ingest to data center; organize; structure; ML data sets; ML experiments / training; KPI report; deployment of trained algorithms; phase out / balance datasets; combinatorial boost of scenarios; synthetic data; focus of this talk.
  47. WE ARE HIRING. The BMW AD organization is growing! Visit our booth :) We are also at Strata London in May. Autonomous Driving Campus. We got PB data!
  48. EXCITING TIMES AHEAD – THANK YOU FOR YOUR INTEREST.
