Preparing Data and Developing Machine Learning Models with Amazon EMR and SageMaker

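Amazon SageMaker is an integrated platform for machine learning projects. Among its features, Amazon SageMaker Studio provides an integrated development environment for machine learning in which you can perform every step from preparing data to building, training, and deploying models. Amazon EMR is a big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and ML applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto. This session uses demos to show how the two services integrate so that data scientists and ML engineers can easily use distributed big data frameworks in their ML workflows.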

  1. AWS FOR DATA WEBINAR – SAGEMAKER WITH EMR © 2022, Amazon Web Services, Inc. or its affiliates. Preparing Data and Developing Machine Learning Models with Amazon EMR and SageMaker. 강성문, Sr. AIML Special Solutions Architect, AWS
  2. Agenda: SageMaker vs EMR; large-scale data preparation and machine learning model development with EMR and SageMaker (▪ Demo 1: environment setup ▪ Demo 2: machine learning model development); summary
  3. How do Amazon SageMaker and EMR differ?
  4. Amazon EMR (Elastic MapReduce)
  5. Amazon SageMaker
     SageMaker Studio: integrated development environment (IDE) for ML
     PREPARE
     • SageMaker Ground Truth: label training data for machine learning
     • SageMaker Data Wrangler: aggregate and prepare data for machine learning
     • SageMaker Processing: built-in Python, BYO R/Spark
     • SageMaker Feature Store: store, update, retrieve, and share features
     • SageMaker Clarify: detect bias and understand model predictions
     BUILD
     • SageMaker Studio notebooks: Jupyter notebooks with elastic compute and sharing
     • Built-in and bring-your-own algorithms: dozens of optimized algorithms or bring your own
     • Local mode: test and prototype on your local machine
     • SageMaker Autopilot: automatically create machine learning models with full visibility
     • SageMaker JumpStart: pre-built solutions for common use cases
     TRAIN & TUNE
     • One-click training: distributed infrastructure management
     • SageMaker Experiments: capture, organize, and compare every step
     • Automatic model tuning: hyperparameter optimization
     • Distributed training libraries: training for large datasets and models
     • SageMaker Debugger: debug and profile training runs
     • Managed spot training: reduce training cost by up to 90%
     DEPLOY & MANAGE
     • Fully managed deployment: fully managed, ultra-low latency, high throughput
     • Kubernetes & Kubeflow integration: simplify Kubernetes-based machine learning
     • Multi-model endpoints: reduce cost by hosting multiple models per instance
     • SageMaker Model Monitor: maintain accuracy of deployed models
     • SageMaker Edge Manager: manage and monitor models on edge devices
     • SageMaker Pipelines: workflow orchestration and automation
     Not a comprehensive list. Visit aws.amazon.com/sagemaker for the latest information.
  6. Machine learning cycle (diagram): Business Problem, ML problem framing, Data collection, Data integration, Data preparation and cleaning, Data visualization and analysis, Feature engineering, Model training and parameter tuning, Model evaluation, Monitoring and debugging, Model deployment, Predictions, with a YES/NO decision branch
  7. Build and train models using SageMaker (the machine learning cycle diagram from slide 6)
  8. Manage data on AWS (the machine learning cycle diagram from slide 6)
  9. Example Scenario: a request to preprocess large-scale data; model development using the preprocessing results
  10. Large-scale data preparation and machine learning model development with EMR and SageMaker
  11. Target system architecture (diagram with components 1 and 2)
  12. Component 1 – SageMaker Studio notebooks
  13. Component 1 – SageMaker Studio notebooks
  14. Component 2 – AWS Service Catalog: user's custom product list (VMs, containers, services) ✓ Compliance with company policy ✓ One-click deployment ✓ Automated resource tagging ✓ Budget management. Diagram labels: AWS Service Catalog, User, admin, Bitnami Certified App: WordPress
  15. Component 2 – AWS Service Catalog: Constraint (security, governance, deployment control); Product (IT services and resources); Products list (browse the list of permitted products); Portfolio (a collection of products); Provisioned products (create and launch services and resources). Diagram labels: AWS Service Catalog Administrator, AWS Service Catalog End User, JSON, YML, or Terraform
  16. Demo 1 [for platform engineers]: set up an environment in which EMR clusters can be created and connected to from SageMaker Studio
  17. (No slide text)
  18. Target system architecture (diagram with components 1, 2, and 3)
  19. Component 3 – Apache Livy and SparkMagic: https://livy.apache.org/
  20. Component 3 – Apache Livy and SparkMagic: https://github.com/jupyter-incubator/sparkmagic
  21. Component 3 – Apache Livy and SparkMagic
  22. Demo 2 [for data scientists]: connect to EMR from SageMaker Studio, then prepare data and develop a machine learning model
  23. (No slide text)
  24. Summary
  25. Build and train models using SageMaker (the machine learning cycle diagram from slide 6)
  26. Target system architecture (diagram with components 1, 2, and 3)
  27. Other ways to use Spark with SageMaker
      • SageMaker Processing (diagram labels: Data, data preprocessing script, SageMaker Spark Framework)
      • SageMaker Spark Library: SageMakerEstimator, KMeansSageMakerEstimator, PCASageMakerEstimator, XGBoostSageMakerEstimator, SageMakerModel, …
      • EMR with SageMaker Pipeline
  28. References
      • SageMaker Studio EMR Integration example code - https://github.com/aws-samples/sagemaker-studio-emr
      • SageMaker Studio integration with EMR Workshop - https://catalog.workshops.aws/sagemaker-studio-emr/en-US
      • Train an ML Model using Apache Spark in EMR and deploy in SageMaker - https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-python-sdk/sparkml_serving_emr_mleap_abalone/sparkml_serving_emr_mleap_abalone.ipynb
      • Create and manage Amazon EMR clusters from SageMaker Studio to run interactive Spark and ML workloads - https://aws.amazon.com/ko/blogs/machine-learning/part-1-create-and-manage-amazon-emr-clusters-from-sagemaker-studio-to-run-interactive-spark-and-ml-workloads/
      • Prepare data at scale with SageMaker Studio notebooks - https://docs.aws.amazon.com/sagemaker/latest/dg/studio-notebooks-emr-cluster.html
      • Connect SageMaker Studio Notebooks in a VPC to External Resources - https://docs.aws.amazon.com/sagemaker/latest/dg/studio-notebooks-and-internet-access.html
      • Apache Livy - https://livy.apache.org/
      • Spark Magic - https://github.com/jupyter-incubator/sparkmagic
      • Use Apache Spark with Amazon SageMaker - https://docs.aws.amazon.com/sagemaker/latest/dg/apache-spark.html
      • Amazon SageMaker Processing (with Spark) - https://sagemaker.readthedocs.io/en/stable/amazon_sagemaker_processing.html#amazon-sagemaker-processing
      • Train an ML Model using Apache Spark in EMR and deploy in SageMaker - https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-python-sdk/sparkml_serving_emr_mleap_abalone/sparkml_serving_emr_mleap_abalone.html
      • SageMaker Pipeline Step (with EMR) - https://docs.aws.amazon.com/sagemaker/latest/dg/build-and-manage-steps.html
  29. Thank you! 강성문, kseongmo@amazon.com
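
The Apache Livy and SparkMagic slides name the component that links a Studio notebook to Spark on EMR but carry no transcribed detail. As a rough sketch of what that link looks like on the wire, the snippet below builds the two Livy REST requests a client typically issues: one to start an interactive session and one to submit code to it. The host name is a hypothetical placeholder, port 8998 is Livy's default, and the helper function names are illustrative, not part of any library. Nothing is sent over the network here, and this is not the demo's actual code.

```python
import json
from urllib import request


def build_session_request(livy_url: str, kind: str = "pyspark") -> request.Request:
    """Build (but do not send) the POST /sessions request that starts a Livy session."""
    body = json.dumps({"kind": kind}).encode("utf-8")
    return request.Request(
        url=f"{livy_url.rstrip('/')}/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_statement_request(livy_url: str, session_id: int, code: str) -> request.Request:
    """Build (but do not send) the POST /sessions/{id}/statements request that runs code."""
    body = json.dumps({"code": code}).encode("utf-8")
    return request.Request(
        url=f"{livy_url.rstrip('/')}/sessions/{session_id}/statements",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical EMR primary-node address; replace with a real Livy endpoint.
    livy = "http://emr-primary.example.internal:8998"
    for req in (
        build_session_request(livy),
        build_statement_request(livy, 0, "spark.range(10).count()"),
    ):
        # Print what would be sent instead of calling request.urlopen(req).
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

In a notebook, SparkMagic issues calls of this kind on the user's behalf, so a data scientist normally works with notebook magics rather than raw HTTP requests.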
