The Importance of Open Innovation in the AI Era

Keynote speech, Open Innovation Network (i-CON) 2019
The Importance of Open Innovation in the AI Era
Invited by the Department of Small Business in Korea

  1. The Importance of Open Innovation in the AI Era. 2019 i-CON (Open Innovation Network) Conference, Dec 3 2019. Jongwook Woo, PhD, jwoo5@calstatela.edu, HiPIC, Big Data AI Center (BigDAI), California State University Los Angeles
  2. Big Data Artificial Intelligence Center (BigDAI), Jongwook Woo, CalStateLA. Contents: Myself; Introduction to the Open Innovation Network; New Technology: Big Data and Deep Learning; The Open Innovation Network
  3. Myself. Experience: Professor at California State University Los Angeles since 2002. PhD in Computer Science and Engineering, USC, 2001
  4. Myself: S/W Development Lead. http://www.mobygames.com/game/windows/matrix-online/credits
  5. Collaboration with HDP, CDH, Oracle, and Amazon using Hadoop Big Data. https://www.cloudera.com/more/customers/csula.html
  6. Contents: Myself; Introduction to the Open Innovation Network; New Technology: Big Data and Deep Learning; The Open Innovation Network
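  7. Open Innovation Network
     • i-CON (innovation - Communication Open Network): free communication among diverse innovation actors such as industry, academia, and research institutes
       – It means an open network that achieves innovation through that communication
       – Through connection and cooperation among large companies, small and medium businesses, researchers, and finance, the "Open Innovation Network i-CON" was launched to serve as a hub for innovation activities such as discovering and planning technology development projects, commercialization, and investment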
  8. Contents: Myself; Introduction to the Open Innovation Network; New Technology: Big Data and Deep Learning; The Open Innovation Network
  9. New Technology: Big Data. What is Big Data: data or systems?
     • Large-scale data? Many people see only the data point of view (3 Vs, 5 Vs)
     • Systems? Yes. Big Data [1]: new computer systems to store and compute large-scale data
       – How to store and process large-scale data, not only computing power
       – Parallel distributed computing systems => data-intensive supercomputer
       – The data set does not need to be terabytes or petabytes; the system is linearly scalable
  10. Data Handling: Traditional Way
  11. Data Handling: Traditional Way Becomes Too Expensive
  12. Data Handling: Another Way, Not Expensive. From the 2017 Korean blockbuster movie "The Fortress" (남한산성)
  13. Data Handling: Another Way, Not Expensive. http://blog.naver.com/PostView.nhn?blogId=dosims&logNo=221127053677 1409 (9th year of King Taejong): Choe Hae-san (崔海山), whose father was Choe Mu-seon (崔茂宣). Source: "Joseon's Secret Weapon: Chongtonggi Hwacha (銃筒機火車)", written by 도심
  14. Super Computer vs Big Data vs Cloud
     • Traditional supercomputer: a cluster for compute and a separate cluster for storage (parallel file systems: Lustre, PVFS, GPFS)
     • Big Data (Hadoop, Spark, Distributed Deep Learning): one cluster for both compute and storage (distributed file systems: HDFS, GFS)
     • However, cloud computing adopts the separated architecture, with a high-speed network and object storage
  15. Big Data: Linearly Scalable
     • Some people question whether a system that handles a 1-3 GB data set is Big Data
     • On a Big Data platform, you add more servers as the data grows
       – It is linearly scalable once built
       – Ideally, n times more computing power
     • Diagram: data size < 3 GB; add n servers; data size > 200 TB
  16. Big Data is Great for Small Business
     • Your business data is the value, and it is Big Data: customer data, operational data
     • You have your own specific data; a big company does not have the specific data you have
     • Potential
       – Your customer data: smart marketing and sales, advertisement
       – Your operational data: efficient operation, for example Smart*: Smart Factory, Smart City
  17. Big Data: Data Analysis & Visualization. Sentiment Map of AlphaGo (positive, negative)
  18. Businesses popular within 5 miles of CalStateLA, USC, UCLA
  19. Big Data Analysis and Prediction Flow
     • Data Collection: batch API (Yelp, Google); streaming (Twitter, Apache NiFi, Kafka, StreamSets, Storm); open data (government)
     • Data Storage: HDFS, S3, Object Storage, NoSQL DB (Couchbase), …
     • Data Filtering: Hive, Pig
     • Data Analysis and Science: Hive, Pig, Spark, Deep Learning, BI tools (Qlik, Tableau, …)
     • Data Visualization: Qlik, Excel PowerMap, Tableau, Looker, …
     • Stages: Big Data Engineering; Big Data Analysis; Big Data Science, Deep Learning; Data Visualization
  20. Azure ML Studio and Spark ML Result Comparison. Big Data Science: Ad Click Fraud Prediction, 7 GB data

     | Model (platform) | AUC | Precision | Recall | TP | FP | TN | FN | Run time |
     |---|---|---|---|---|---|---|---|---|
     | Two-Class Decision Jungle (AzureML) | 0.905 | 1.0 | 0.001 | 35 | 0 | 52,306 | 406,605 | 2 hrs |
     | Two-Class Decision Forest (AzureML) | 0.997 | 0.992 | 0.902 | 47,199 | 377 | 406,228 | 5,142 | 2-3 hrs |
     | Decision Tree Classifier (Databricks) | 0.815 | 0.822 | 0.633 | 86,683 | 18,727 | 7,112,961 | 50,074 | 22 mins |
     | Random Forest Classifier (Databricks) | 0.746 | 0.878 | 0.495 | 67,726 | 9,408 | 7,122,280 | 69,031 | 50 mins |
     | Decision Tree Classifier (Balanced Sample Data, Oracle) | 0.896 | 0.935 | 0.807 | 111,187 | 7,712 | 545,302 | 26,604 | 24 sec |
     | Random Forest Classifier (Balanced Sample Data, Oracle) | 0.893 | 0.934 | 0.800 | 110,220 | 7,791 | 545,223 | 27,571 | 2 mins |
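The precision and recall columns on this slide follow from the TP, FP, and FN counts. The sketch below recomputes them for three of the rows, assuming the standard definitions precision = TP / (TP + FP) and recall = TP / (TP + FN). The function names and the dictionary layout are illustrative choices, not taken from the talk.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# (TP, FP, TN, FN) copied from the slide for three of the six models.
confusion = {
    "Two-Class Decision Forest (AzureML)": (47_199, 377, 406_228, 5_142),
    "Decision Tree Classifier (Databricks)": (86_683, 18_727, 7_112_961, 50_074),
    "Decision Tree Classifier (Balanced Sample Data, Oracle)": (111_187, 7_712, 545_302, 26_604),
}

for name, (tp, fp, tn, fn) in confusion.items():
    print(f"{name}: precision={precision(tp, fp):.3f}, recall={recall(tp, fn):.3f}")
```

Recomputing the metrics this way is a quick consistency check when results from different platforms are compared side by side.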
  21. Big Data Science: Transaction Data Fraud Detection
     • Data set: https://www.kaggle.com/ntnu-testimon/paysim1
       – Size: 470 MB (=> 718 MB)
       – 6,362,620 records
       – Not that large compared to multi-GB data sets
     • 3 models in a Spark cluster with different combinations of parameters
     • Results table (Model | Area under ROC | Precision | Recall): DecisionTreeClassifier; RandomForestClassifier 0.909573; LogisticRegression
     • Time taken: 1 hour with 3 Spark clusters
     • In theory, with linear scalability: about 6 minutes with 30 Spark clusters
  22. Experimental Results in AWS. Execution times: 3 nodes, 40-70 min; 11 nodes, 10-20 min
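The linear-scalability claim can be checked with a little arithmetic. The sketch below assumes ideal linear scaling (runtime inversely proportional to node count) and, as a further assumption, pairs the lower and upper ends of the reported ranges (40 min with 10 min, 70 min with 20 min). The function names are hypothetical and chosen only for this illustration.

```python
def ideal_runtime(base_minutes: float, base_nodes: int, new_nodes: int) -> float:
    """Runtime expected on new_nodes if scaling were perfectly linear."""
    return base_minutes * base_nodes / new_nodes


def speedup(t_small_cluster: float, t_large_cluster: float) -> float:
    """Observed speedup when moving from the small to the large cluster."""
    return t_small_cluster / t_large_cluster


def scaling_efficiency(t_small: float, t_large: float,
                       small_nodes: int, large_nodes: int) -> float:
    """Observed speedup divided by the ideal speedup (1.0 means perfectly linear)."""
    return speedup(t_small, t_large) / (large_nodes / small_nodes)


# Ideal estimate: a job that takes 60 minutes on 3 nodes, moved to 30 nodes.
print(f"ideal runtime on 30 nodes: {ideal_runtime(60, 3, 30):.1f} min")

# Reported AWS runs on 3 and 11 nodes (pairing of range ends is an assumption).
for t3, t11 in [(40, 10), (70, 20)]:
    print(f"{t3} -> {t11} min: speedup {speedup(t3, t11):.2f}x, "
          f"efficiency {scaling_efficiency(t3, t11, 3, 11):.2f}")
```

An efficiency near 1.0 indicates that the measured runs are close to the ideal 11/3 speedup.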
  23. Data Scale Driving the Deep Learning Process: Deep Learning and Massive Data [3]. "Machine Learning Yearning", Andrew Ng, 2016
  24. Another Gap between Deep Learning and Big Data Communities [6]: deep learning experts on one side; Big Data engineers, scientists, analysts, etc. on the other; "The Chasm" between them
  25. Leveraging Big Data Cluster: an existing Big Data cluster with a massive data set, without using Big Data
     • Single GPU server for Deep Learning?
     • Single server for Python and R traditional Machine Learning?
     • Data migration is too slow, and a single server fails
  26. Leveraging Big Data Cluster: the existing Big Data cluster
     • Big Data Engineering, Big Data Analysis, Big Data Science
     • Distributed Deep Learning: integrate Deep Learning into the cluster
     • No data migration is needed, and it can leverage parallel computing and the existing large-scale data
  27. Big Data Prediction with DDL. DDL: Distributed Deep Learning. TensorFlow distributed training and inference in a Spark cluster
  28. Spark ML and DDL [2-5]: Deep Learning in a Spark cluster. Diagram labels: Distributed Deep Learning (DDL), DDL lib, Deep Learning in Spark
  29. AWS Review Dataset: Predictive Analysis
     • Prediction of rating, an important measure for purchasing and selling
       – Spark ML: ALS (Alternating Least Squares) algorithm
       – DDL (Distributed Deep Learning): Neural Collaborative Filtering (NCF)
     • Dataset: https://s3.amazonaws.com/amazon-reviews-pds/tsv/index.txt
       – Products reviewed between 2005 and 2015 are analyzed
       – Total product reviews: 9.57 million
       – File size: 5.26 GB
  30. Summary: Performance
  31. Summary: Mean Absolute Error
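Mean Absolute Error (MAE) is the average absolute difference between predicted and actual ratings. A minimal sketch follows; the sample ratings are made up for illustration and are not results from the talk.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |actual - predicted| over all rating pairs."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical star ratings and model predictions (illustrative values only).
actual_ratings = [5, 3, 4, 1]
predicted_ratings = [4.5, 3.0, 2.0, 2.0]

print(f"MAE = {mean_absolute_error(actual_ratings, predicted_ratings):.3f}")
```

A lower MAE means the predicted ratings are closer to the real ones on average, which makes it a convenient single number for comparing a model such as ALS against NCF.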
  32. Contents: Myself; Introduction to the Open Innovation Network; New Technology: Big Data and Deep Learning; The Open Innovation Network
  33. Small Business: how to adopt the new technology. Training for new technology; Collaboration; Government
  34. Training
     • Technology emerges every moment
     • IT companies, not universities, lead the industry
     • How to catch up? Training
     • Companies with new technology always deliver training
       – Big Data: Cloudera, Hortonworks
       – AI Deep Learning (traditional concepts): Stanford, UC Berkeley, edX, IBM, H2O
  35. Training (Cont'd): training by companies
     • 3-4 days/week
       – $2,500 - $3,000
       – Practical: theory plus hands-on exercises
     • Instructors are paid well
     • Employers send their engineers to learn the new technology in a few weeks
  36. Trained but No Experience, with Bad Management, in Korea. Sang-Ryung Battle: from the 2017 Korean blockbuster movie "The Fortress" (남한산성)
  37. Trained Well, with Experience and Good Management, in Japan. Battle of Nagashino, 1575, Japan
  38. Trained but No Experience, with Bad Management, in Korea (Cont'd). Sang-Ryung Battle: from the 2017 Korean blockbuster movie "The Fortress" (남한산성)
  39. Collaboration
     • Multiple entities
       – IT company: expert in Big Data and Deep Learning
       – Business: domain expert
     • For example, Big Data in a Smart Factory
       – Questions for IT: What does the manufacturer want to analyze and predict? What data does the manufacturer generate, and how? What are tags? How are sensors attached to motors? What is SCADA?
       – Questions for OT: What are Big Data and Big Data Science? How are Machine Learning and Deep Learning implemented?
  40. Collaboration: for example, Los Angeles
     • Weekly coffee time or informal meetings among people with diverse backgrounds
       – Professors, computer scientists, investors, journalists, government officials, and digital artists from universities and companies in any area
       – Just chat, find opportunities to work together, and find solutions
  41. Government: Open Data in Los Angeles
     • Open Data: https://data.lacity.org/
     • Location-based open data: http://geohub.lacity.org/
     • Data Science Federation, City of Los Angeles: https://dsf.lacity.org/
       – A partnership between the City of Los Angeles and Los Angeles area colleges and universities to tackle tough city problems
       – Works on tough city problems that will make a difference in many areas, and expands on early work in data science and data-driven decision making for city government
     • Jongwook Woo, BigDAI Center
       – Dalyapraz Dauletbak, Jongwook Woo, "Traffic Data Analysis and Prediction using Big Data", APIC-IST 2019, pp. 127-133, ISSN 2093-0542 (Outstanding Paper Award received)
       – LA Zoo Data Analysis and Prediction
  42. City of Los Angeles: DSF
  43. Jams and other traffic incidents reported by users, Dec 2017 – Jan 2018 (Dalyapraz Dauletbak, Jongwook Woo at BigDAI Center, CalStateLA)
  44. Government (Cont'd)
     • Korea
       – Support and project announcements for small and medium businesses
       – Low-cost office rental
     • US small businesses
  45. https://www.visualcapitalist.com/ranked-the-20-easiest-countries-for-doing-business
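  46. Government (Cont'd)
     • Small business intellectual property rights: technology development and revenue models in relationships with large companies
       – Free POCs
       – Free R&D solution proposals
       – Long lawsuits over misappropriated technology
     • Small businesses
       – Small domestic market
       – Lack of private-led investment: private investment other than government support ranks lowest? Insufficient use of technology evaluation experts. Labor costs (higher in Korea than expected)
       – Risk of entrepreneurs falling into bad credit
       – Excessive quasi-tax burden on companies (mandatory employer share of the four major social insurances)
       – Lack of employment flexibility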
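  47. Government (Cont'd)
     • US Fair Trade Commission: large companies are broken up as soon as a commission investigation begins
       – Microsoft: breakup of the monopoly created by bundling MS Office and Windows
       – AT&T
     • Korea: the Ministry of SMEs and Startups and the Fair Trade Commission?
     • Small businesses need intellectual property protection
       – A shield against interference or predation by large companies
       – Support small business growth through fair competition and coexistence with large companies
     • Small business difficulties need to be resolved
       – Create an environment where market entry and exit are free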
  48. Summary
     • Definition of Open Innovation
     • Need for training in technologies such as Big Data and AI
     • The way forward for Open Innovation collaboration
     • Need for a big-picture government role in small business survival
  49. Questions?
  50. References
     1. "Rating Prediction using Deep Learning and Spark", Monika Mishra, Mingoo Kang, Jongwook Woo, The 11th International Conference on Internet (ICONI 2019), Dec 15-18, 2019, Hanoi, Vietnam
     2. "BigDL: Bringing Ease of Use of Deep Learning for Apache Spark", Jason Dai, Radhika Rangarajan, Databricks, Spark Summit 2017
     3. "Building Deep Learning Applications for Big Data: An Introduction to Analytics Zoo: Distributed TensorFlow, Keras and BigDL on Apache Spark", Jennie Wang, Guoqiong Song, CVPR 2018, June 18-22, 2018, Salt Lake City, Utah
     4. "Building Deep Learning Applications for Big Data: An Introduction to Analytics Zoo: Distributed TensorFlow, Keras and BigDL on Apache Spark", Jason Dai, AAAI 2019 Tutorial Forum, Thirty-Third Conference on Artificial Intelligence, January 27-28, 2019, Honolulu, Hawaii, USA
     5. "User-based real-time product recommendations leveraging deep learning using Analytics Zoo on Apache Spark and BigDL", Luyang Wang, Guoqiong Song, Jing (Nicole) Kong, Maneesha Bhalla, Strata Data Conference 2019, March 25-28, 2019, San Francisco, CA
     6. "Leveraging NLP and Deep Learning for Document Recommendation in the Cloud", Guoqiong Song, Spark + AI Summit 2019