
Where ml ai_heavy


  1. @WhereML: a Serverless AI-Powered Location-Guessing Twitter Bot Built with Amazon SageMaker and AWS Lambda. Randall Hunt (@jrhunt), Some Guy From Los Angeles. Based on LocationNet work by Jaeyoung Choi and Kevin Li. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. About Me
       • Technical Evangelist at AWS
       • I build some demos: https://github.com/ranman
       • I write some blogs: https://aws.amazon.com/blogs/aws/author/randhunt/
       • Formerly of SpaceX, NASA, MongoDB
       • I like Python
       • I dislike JavaScript
       • I look ridiculous in my badge photo
  3. Example: Try it! Tweet to @WhereML with a picture. Hold your cell phone camera to the screen.
  4. Architecture (diagram): Amazon API Gateway, AWS Lambda Function, Amazon SageMaker Inference Endpoint, Model Artifacts, and Inference code stored in Amazon ECR.
  5. Solving Some of the Hardest Problems in Computer Science: Learning, Language, Perception, Problem Solving, Reasoning. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  6. The (60-year) Rise of Artificial Intelligence
  7. ML @ AWS: Our mission. Put machine learning in the hands of every developer and data scientist.
  8. Customers Running ML on AWS Today
  9. Reviewing the ML Process
  13. The Machine Learning Process (flowchart). Stages shown: Business Problem, ML problem framing, Data Collection, Data Integration, Data Preparation & Cleaning, Data Visualization & Analysis, Feature Engineering, Model Training & Parameter Tuning, Model Evaluation, "Are Business Goals met?" (Yes / No), Model Deployment, Monitoring & Debugging, Predictions. Feedback paths: Data Augmentation, Feature Augmentation, Re-training.
  14. Discovery: The Analysts (same Machine Learning Process flowchart)
       • Help formulate the right questions
       • Domain Knowledge
  15. Integration: The Data Architecture (same Machine Learning Process flowchart)
       • Build the data platform:
       • Amazon S3
       • AWS Glue
       • Amazon Athena
       • Amazon EMR
       • Amazon Redshift Spectrum
  16. Why We Built Amazon SageMaker: The Model Training Undifferentiated Heavy Lifting (covers Data Visualization & Analysis, Feature Engineering, Model Training & Parameter Tuning, Model Evaluation)
       • Set up and manage Notebook Environments
       • Set up and manage Training Clusters
       • Write Data Connectors
       • Scale ML algorithms to large datasets
       • Distribute ML training algorithms to multiple machines
       • Secure Model artifacts
  17. Why We Built Amazon SageMaker: The Model Deployment Undifferentiated Heavy Lifting (covers Model Deployment, Monitoring & Debugging, Predictions)
       • Set up and manage Model Inference Clusters
       • Manage and Scale Model Inference APIs
       • Monitor and Debug Model Predictions
       • Model versioning and performance tracking
       • Automate promotion of new Model versions to production (A/B testing)
  18. Amazon SageMaker: A fully managed service that enables data scientists and developers to quickly and easily build machine-learning-based models into production smart applications.
  19. Amazon SageMaker components: (1) Notebook Instances, (2) Algorithms, (3) ML Training Service, (4) ML Hosting Service
  20. (1) Notebook Instances: Zero Setup for Exploratory Data Analysis. Authoring & Notebooks, ETL, access to AWS Database services, access to S3 Data Lake. "Just add data":
       • Recommendations/Personalization
       • Fraud Detection
       • Forecasting
       • Image Classification
       • Churn Prediction
       • Marketing Email/Campaign Targeting
       • Log processing and anomaly detection
       • Speech to Text
       • More…
  21. (2) Algorithms. Amazon SageMaker: 10x better algorithms
       • Streaming datasets, for cheaper training
       • Train faster, in a single pass
       • Greater reliability on extremely large datasets
       • Choice of several ML algorithms
  22. Cost vs. Time (chart). Cost axis: $ to $$$$. Time axis: Minutes, Hours, Days, Weeks, Months. Series: Single Machine; Distributed, with Strong Machines.
  23. Infinitely Scalable ML Algorithms
  24. (2) Algorithms. Amazon SageMaker: 10x better algorithms. Four sources of training code:
       • Amazon provided Algorithms: Matrix Factorization, Regression, Principal Component Analysis, K-Means Clustering, Gradient Boosted Trees, and more!
       • Bring Your Own Script (SageMaker builds the Container)
       • SageMaker Estimators in Apache Spark
       • Bring Your Own Algorithm (You build the Container)
  25. (3) ML Training Service: Managed Distributed Training with Flexibility. Fully managed and secured; runs on CPU or GPU, with hyperparameter optimization (HPO). The service fetches training data, saves Model Artifacts, and saves the Inference Image to Amazon ECR. Training code comes from:
       • Amazon provided Algorithms: Matrix Factorization, Regression, Principal Component Analysis, K-Means Clustering, Gradient Boosted Trees, and more!
       • Bring Your Own Script (SageMaker builds the Container)
       • SageMaker Estimators in Apache Spark
       • Bring Your Own Algorithm (You build the Container)
  26. (4) ML Hosting Service: Easy Model Deployment to Amazon SageMaker. Versions of the same inference code are saved in inference containers in Amazon ECR. Prod is the primary one; 50% of the traffic must be served there!
  27. Create a Model (ModelName: prod) from Model Artifacts and an Inference Image.
  28. Create versions of a Model.
  29. Create weighted ProductionVariants (weights shown: 30, 50, 10, 10). The ProductionVariant shown:
       • InstanceType: c3.4xlarge
       • InitialInstanceCount: 3
       • ModelName: prod
       • VariantName: primary
       • InitialVariantWeight: 50
  30. Create an EndpointConfiguration from one or many ProductionVariant(s).
  31. Create an Endpoint (the Inference Endpoint) from one EndpointConfiguration.
  32. One-Click! Amazon Provided Algorithms.
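The Model, EndpointConfiguration, and Endpoint steps on the hosting slides can be sketched with the three corresponding SageMaker API calls. This is a minimal sketch, not the speaker's code: the variant names other than `primary`, the model names other than `prod`, and the `ml.c5.xlarge` instance type are assumptions (the slide shows `c3.4xlarge`; SageMaker hosting instance types carry an `ml.` prefix). The `client` argument is expected to be a `boto3.client("sagemaker")`, passed in so the request shapes can be checked without AWS access.

```python
def production_variant(name, model_name, weight,
                       instance_type="ml.c5.xlarge", count=3):
    """Build one ProductionVariant entry for create_endpoint_config."""
    return {
        "VariantName": name,
        "ModelName": model_name,
        "InstanceType": instance_type,      # assumption, see lead-in
        "InitialInstanceCount": count,
        "InitialVariantWeight": weight,
    }


def traffic_shares(variants):
    """Each variant receives weight / sum(weights) of the traffic."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total
            for v in variants}


def deploy(client, models, role_arn, variants, endpoint_name):
    """models maps a model name to (inference image URI, model artifact S3 URL)."""
    for model_name, (image_uri, model_data_url) in models.items():
        client.create_model(
            ModelName=model_name,
            PrimaryContainer={"Image": image_uri,
                              "ModelDataUrl": model_data_url},
            ExecutionRoleArn=role_arn,
        )
    config_name = endpoint_name + "-config"  # hypothetical naming scheme
    client.create_endpoint_config(EndpointConfigName=config_name,
                                  ProductionVariants=variants)
    client.create_endpoint(EndpointName=endpoint_name,
                           EndpointConfigName=config_name)


# Weights from the slide: 50 for the primary variant, then 30, 10, 10.
VARIANTS = [
    production_variant("primary", "prod", 50),
    production_variant("candidate-a", "candidate-a", 30),  # hypothetical
    production_variant("candidate-b", "candidate-b", 10),  # hypothetical
    production_variant("candidate-c", "candidate-c", 10),  # hypothetical
]
```

With these weights, `traffic_shares(VARIANTS)["primary"]` is 0.5, which matches the slide's requirement that prod serve 50% of the traffic.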
  33. (4) ML Hosting Service: Easy Model Deployment to Amazon SageMaker
       • Auto-Scaling Inference APIs
       • A/B Testing (more to come)
       • Low Latency & High Throughput
       • Bring Your Own Model
       • Python SDK
  34. Building the Model
  35. Credits
       • This model, LocationNet, was built by Jaeyoung Choi of the International Computer Science Institute and Kevin Li of the University of California, Berkeley
       • Supported by the AWS Cloud Credits for Research Program
       • Based on work on PlaNet by Weyand et al.
       Ref: arxiv.org/abs/1602.05314 : PlaNet - Photo Geolocation with Convolutional Neural Networks
  36. LocationNet Model Approach
       • Model trained and built with Apache MXNet
       • Trained with 33.9 million geo-tagged images from the AWS Multimedia Commons Dataset for 12 epochs over 9 days using a single p2.16xlarge
       • Uses Google's S2 Spherical Geometry library to subdivide the earth into 15,527 multi-scale geographic cells, which serve as classes for the data
       • Built on the ResNet-101 architecture
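The idea of multi-scale cells as classes can be illustrated with a small adaptive partition: a cell is split while it holds more photos than a threshold, so dense areas end up with many small cells and sparse areas with a few large ones (the scheme described in the PlaNet paper cited above). This is only a sketch under loud assumptions: it uses a plain latitude/longitude quadtree as a stand-in for the S2 library, and the names `Cell`, `partition`, `max_photos`, and `max_depth` are hypothetical, not taken from LocationNet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lng):
        # Half-open box, so a point belongs to exactly one sibling cell.
        return self.south <= lat < self.north and self.west <= lng < self.east

    def children(self):
        mid_lat = (self.south + self.north) / 2
        mid_lng = (self.west + self.east) / 2
        return [
            Cell(self.south, self.west, mid_lat, mid_lng),
            Cell(self.south, mid_lng, mid_lat, self.east),
            Cell(mid_lat, self.west, self.north, mid_lng),
            Cell(mid_lat, mid_lng, self.north, self.east),
        ]


WORLD = Cell(-90.0, -180.0, 90.0, 180.0)


def partition(points, cell=WORLD, max_photos=2, max_depth=12):
    """Return the leaf cells covering `points`; empty cells are dropped."""
    inside = [p for p in points if cell.contains(*p)]
    if not inside:
        return []
    if len(inside) <= max_photos or max_depth == 0:
        return [cell]
    leaves = []
    for child in cell.children():
        leaves.extend(partition(inside, child, max_photos, max_depth - 1))
    return leaves


# Three photos around Los Angeles and one in Sydney (illustrative coordinates).
photos = [(34.05, -118.24), (34.06, -118.25), (34.10, -118.30), (-33.87, 151.21)]
cells = partition(photos)
class_of = {cell: idx for idx, cell in enumerate(cells)}  # cell -> class label
```

Each leaf cell becomes one class label, so the classifier's output is a probability distribution over cells rather than a regression on coordinates.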
  37. Model Architecture
  38. Example S2 Multi-scale partitioning. Ref: arxiv.org/abs/1602.05314 : PlaNet - Photo Geolocation with Convolutional Neural Networks
  40. Pros / Cons of this approach
       Pros:
       • Surprisingly precise in cities with larger numbers of partitions
       • Fast inference (<100 ms on t2.large)
       • Excellent performance for unique objects / landmarks
       • Small model (<300 MB) can be deployed anywhere
       Cons:
       • Surprisingly inaccurate in locales with fewer partitions
       • Poor performance for common objects / terrain
  41. AWS Infrastructure for WhereML
  42. Architecture (Amazon API Gateway, AWS Lambda Function, SageMaker Inference Endpoint)
       1. Twitter Webhook calls the API Gateway endpoint
       2. API Gateway invokes the Lambda function with the payload from Twitter
       3. Lambda function calls out to the SageMaker Inference Endpoint with the URL of the image
       4. Inference endpoint downloads the image and classifies it with LocationNet
       5. Lambda posts the results back to Twitter
  43. Amazon API Gateway
  44. AWS Lambda Function
       • Proxy invocation from API Gateway sends the entire request to the Lambda
       • The AWS Lambda Function:
           1. Parses the incoming request
           2. Verifies it is from Twitter and verifies message integrity
           3. Parses the tweet
           4. Sends the media URL in the tweet to the SageMaker endpoint
           5. Uses the Twitter API to respond to the original tweet
       • Billed per GB-second; 400,000 GB-seconds per month in the perpetual free tier
       • Any scale of requests, but the SageMaker endpoint is limited to 10,000 TPS
       • Python 🐍
  45. Lambda Function: Verify Request

        def lambda_handler(event, context):
            # WEBHOOK_PATH, sign_crc and verify_request are defined elsewhere in the module
            # deal with bad requests
            if event.get('path') != WEBHOOK_PATH:
                return {'statusCode': 404, 'body': ''}
            # deal with subscription calls (the CRC challenge arrives as a GET)
            if event.get('httpMethod') == 'GET':
                # API Gateway passes null, not {}, when there is no query string
                crc = (event.get('queryStringParameters') or {}).get('crc_token')
                if not crc:
                    return {'statusCode': 401, 'body': 'bad crc'}
                return {'statusCode': 200, 'body': sign_crc(crc)}
            # deal with bad crc
            if not verify_request(event, context):
                return {'statusCode': 400, 'body': 'bad crc'}
  46. Lambda Function: Verify Request Utilities

        import hmac
        import json
        from base64 import b64decode, b64encode
        from hashlib import sha256

        def sign_crc(crc):
            # answer the CRC challenge with an HMAC-SHA256 of the token
            h = hmac.new(
                bytes(CONSUMER_SECRET, 'ascii'), bytes(crc, 'ascii'),
                digestmod=sha256)
            return json.dumps({
                "response_token": "sha256=" + b64encode(h.digest()).decode()
            })

        def verify_request(event, context):
            # recompute the HMAC of the raw body and compare it to the header
            crc = event['headers']['X-Twitter-Webhooks-Signature']
            h = hmac.new(
                bytes(CONSUMER_SECRET, 'ascii'),
                bytes(event['body'], 'utf-8'),
                digestmod=sha256)
            crc = b64decode(crc[7:])  # strip out the first 7 characters ("sha256=")
            return hmac.compare_digest(h.digest(), crc)
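The signing scheme can be exercised locally without Twitter or API Gateway. The sketch below restates the two checks in compact form and runs them against hand-built events; the secret value and the event contents are placeholders invented for the example, and only the header name and the "sha256=" prefix come from the slides.

```python
import hmac
import json
from base64 import b64decode, b64encode
from hashlib import sha256

CONSUMER_SECRET = "not-a-real-secret"  # hypothetical placeholder
PREFIX = "sha256="


def _digest(message: str) -> bytes:
    """HMAC-SHA256 of the message, keyed with the consumer secret."""
    return hmac.new(CONSUMER_SECRET.encode("ascii"),
                    message.encode("utf-8"), sha256).digest()


def sign(message: str) -> str:
    return PREFIX + b64encode(_digest(message)).decode()


def verify_request(event: dict) -> bool:
    header = event["headers"]["X-Twitter-Webhooks-Signature"]
    claimed = b64decode(header[len(PREFIX):])
    # compare_digest runs in constant time, which avoids timing leaks
    return hmac.compare_digest(_digest(event["body"]), claimed)


body = json.dumps({"tweet_create_events": []})
good_event = {
    "headers": {"X-Twitter-Webhooks-Signature": sign(body)},
    "body": body,
}
# Same signature, different body: verification must fail.
tampered_event = dict(good_event, body=body + " ")

print(verify_request(good_event))      # True
print(verify_request(tampered_event))  # False
```

Signing the raw request body, rather than the parsed JSON, matters here: any re-serialization would change the bytes and invalidate the signature.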
  47. Lambda Function: SageMaker and Twitter

        def lambda_handler(event, context):
            # (request verification from the Verify Request slide runs first)
            # we're good! load that event up
            twitter_events = json.loads(event['body'])
            for tweet in twitter_events.get('tweet_create_events', []):
                if validate_record(tweet):
                    # `media` is the tweet's image URL; the slide does not show its extraction
                    body = json.dumps({'url': media, 'max_predictions': MAX_PREDICTIONS})
                    # read the streaming response body, then parse the JSON it holds
                    results = json.loads(
                        sagemaker.invoke_endpoint(EndpointName=ENDPOINT_NAME, Body=body)['Body'].read()
                    )
                    status = build_tweet(results)
                    twitter_api.PostUpdate(
                        "📍 \n" + status[0],
                        media=status[1],
                        in_reply_to_status_id=tweet['id_str'],
                        auto_populate_reply_metadata=True
                    )
  48. SageMaker Architecture (Amazon SageMaker Inference Endpoint, Model Artifacts, Inference code in Amazon ECR)
       • Docker container stored in ECR
       • Autoscaling Endpoint spins up containers as needed, automatically fetches model artifacts from S3, and puts them in /opt/ml/model
       • Flask app responds to /ping and /invocations
  49. SageMaker Inference Code: Load Model

        from collections import namedtuple

        import mxnet as mx
        import numpy as np

        # load the checkpoint saved after epoch 12
        sym, arg_params, aux_params = mx.model.load_checkpoint(MODEL_NAME, 12)
        mod = mx.mod.Module(symbol=sym, context=mx.cpu())
        # one 3-channel 224x224 image per batch, inference only
        mod.bind([('data', (1, 3, 224, 224))], for_training=False)
        mod.set_params(arg_params, aux_params, allow_missing=True)
        Batch = namedtuple('Batch', ['data'])
        grids = load_grids()
  50. SageMaker Inference Code: Predict

        def predict(img, max_predictions):
            mod.forward(Batch(img), is_train=False)
            prob = mod.get_outputs()[0].asnumpy()[0]
            pred = np.argsort(prob)[::-1]  # class indices, most probable first
            result = []
            for i in range(max_predictions):
                pred_loc = grids[int(pred[i])]
                # float() keeps the probability JSON-serializable
                result.append((pred_loc, float(prob[pred[i]])))
            return result

        def download_and_predict(url, max_predictions=3):
            img = preprocess_image(download_image(url))
            return predict(img, max_predictions)
  51. SageMaker Inference Code: Flask App

        from flask import request, jsonify, Flask
        import predict
        app = Flask("WhereML")

        # SageMaker health check
        @app.route("/ping")
        def ping():
            return "", 200

        # SageMaker forwards inference requests here
        @app.route("/invocations", methods=["POST"])
        def invoke():
            data = request.get_json(force=True)
            # honor max_predictions when the caller sends it
            return jsonify(predict.download_and_predict(
                data['url'], data.get('max_predictions', 3)))

        if __name__ == "__main__":
            # matches EXPOSE 8080 and the python app.py entrypoint in the Dockerfile
            app.run(host="0.0.0.0", port=8080)
  52. SageMaker Inference Code: Dockerfile

        FROM mxnet/python:latest
        WORKDIR /app
        RUN pip install -U flask scikit-image numpy reverse_geocoder boto3
        COPY *.py /app/
        COPY grids.txt /app/
        ENTRYPOINT ["python", "app.py"]
        EXPOSE 8080

  53. Working with Twitter
  54. Twitter API
       • Two ways to respond to mentions: User Streams API (deprecated) and Account Activity API (beta)
       • Account Activity Webhooks allow a fully "serverless" approach
       • User Streams API requires a running container/instance to poll for updates
       • User Streams API is going away in June 2018… even though its replacement is not GA…
  55. Registering Webhook

        from requests_oauthlib import OAuth1Session

        # `keys` holds the OAuth 1 credentials for the bot account
        twitter = OAuth1Session(**keys)
        base_url = "https://api.twitter.com/1.1/all/env-beta/"
        params = {'url': "https://mywebsite.com/twitter/whereml"}
        webhook_id = twitter.post(base_url + "webhooks.json", params).json()['id']
        # pass webhook ID in prod, not needed in beta
        twitter.post(base_url + "subscriptions.json")
  56. SageMaker Example End-to-End Architecture (diagram)
       • Build: SageMaker Notebooks, Training Algorithm, Code Commit, Code Pipeline
       • Train: SageMaker Training, dataset in Amazon S3, Amazon ECR
       • Deploy: SageMaker Hosting, AWS Lambda, API Gateway (inference requests), static website hosted on S3, web assets on Amazon CloudFront
  57. Easy to get started! Tons of tools. Machine Learning is FUN.
  58. Built live on twitch.tv/aws
  59. Thank you! randhunt@amazon.com
