The Azure Cognitive Services on Spark: Clusters with Embedded Intelligent Services

We present the Azure Cognitive Services on Spark, a simple, easy-to-use extension of the SparkML library to all Azure Cognitive Services. This integration allows Spark users to embed cloud intelligence directly into their Spark computations, enabling a new generation of intelligent applications on Spark. Furthermore, we show that with our new containerized Cognitive Services, one can embed cloud intelligence directly into the Spark cluster for ultra-low-latency, on-prem, and offline applications. We show how, using our integration, one can compose these cognitive services with other services, SQL computations, and deep networks to create sophisticated, intelligent, heterogeneous applications. Moreover, we show how to redeploy these compositions as RESTful services with Spark Serving. We will also explore the architecture of these contributions, which build on HTTP on Spark, a novel integration of Spark with the widely used Hypertext Transfer Protocol (HTTP). This library can integrate into the Spark ecosystem any framework that is capable of communicating through HTTP. Finally, we demonstrate how to use these services to create a large class of intelligent applications such as custom search engines, real-time facial recognition systems, and unsupervised object detectors.
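
The idea of treating any HTTP-capable service as a row-wise transformation can be illustrated without Spark. The following is a minimal sketch in plain Python, not the MMLSpark API: `ToyService`, `http_transform`, and `demo` are hypothetical names, the service is a local stand-in that returns the length of the text it receives, and a list of dictionaries stands in for a dataframe.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class ToyService(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a web service: replies with the text length."""

    def do_POST(self):
        size = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(size))
        body = json.dumps({"length": len(payload["text"])}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def http_transform(rows, url, input_col, output_col):
    """POST each row's input column as JSON and attach the parsed JSON reply."""
    # Ignore any system proxy settings, since the service here is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    out = []
    for row in rows:
        request = urllib.request.Request(
            url,
            data=json.dumps({"text": row[input_col]}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with opener.open(request) as response:
            parsed = json.loads(response.read())
        out.append({**row, output_col: parsed})
    return out


def demo():
    """Start the toy service on a free local port and transform two rows."""
    server = HTTPServer(("127.0.0.1", 0), ToyService)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        rows = [{"text": "cat"}, {"text": "antelope"}]
        return http_transform(rows, url, "text", "results")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Each input row keeps its original columns and gains one output column holding the parsed response, which is the shape that lets such a step be chained with other transformations.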

  1. Mark Hamilton, Microsoft, marhamil@microsoft.com; Anand Raman, Microsoft, aram@microsoft.com. The Azure Cognitive Services on Spark: Clusters with Embedded Intelligent Services. #UnifiedAnalytics #SparkAISummit
  2. Overview • The Cognitive Services on Spark – Basic Usage – Fluent Design • HTTP on Spark – Architecture and Principles • Clusters with Embedded Services – Kubernetes, Databricks • Examples – GANs + the Metropolitan Museum of Art
  3. Motivation • Azure Cognitive Services provide high-quality pre-built intelligent services • No need for time-intensive model training or deployment • Can quickly create intelligent applications • Leverage Microsoft Research and Azure ML • http://www.seeingai.com
  4. Cognitive Services by category • Vision: Object, scene, and activity detection; Face recognition and identification; Celebrity and landmark recognition; Emotion recognition; Text and handwriting recognition (OCR); Customizable image recognition; Video metadata, audio, and keyframe extraction and analysis; Explicit or offensive content moderation • Speech: Speech transcription (speech-to-text); Custom speech models for unique vocabularies or complex environments; Text-to-speech; Custom Voice; Real-time speech translation; Customizable speech transcription and translation; Speaker identification and verification • Language: Language detection; Named entity recognition; Key phrase extraction; Text sentiment analysis; Multilingual and contextual spell checking; Explicit or offensive text content moderation; PII detection for text moderation; Text translation; Customizable text translation; Contextual language understanding • Knowledge: Q&A extraction from unstructured text; Knowledge base creation from collections of Q&As; Semantic matching for knowledge bases; Customizable content personalization learning • Search: Ad-free web, news, image, and video search results; Trends for video, news; Image identification, classification and knowledge extraction; Identification of similar images and products; Named entity recognition and classification; Knowledge acquisition for named entities; Search query autosuggest; Ad-free custom search engine creation
  5. Azure Cognitive Services on Spark • Easy-to-use integration between Spark and the Azure Cognitive Services • Composable and pipelinable with all other SparkML models! • Python, Scala, R (Beta) • Example: val df = new TextSentiment().setTextCol("text").setOutputCol("sentiment").transform(inputs)
  6. http://www.seeingai.com
  7. Fluent API for Advanced Orchestration • Any parameter can be set with a dataframe column or with a single value • Get results for multiple search terms: new BingImageSearch().setQueryCol("queries"), where the queries column holds Cat, Dog, Antelope, Car, Bob Ross
  8. Fluent API for Advanced Orchestration • Any parameter can be set with a dataframe column or with a single value • Get the first N pages of Bing for a specific term: new BingImageSearch().setQuery("cats").setOffsetCol("offsets"), where the offsets column holds 0, 100, 200, 300, 400
  9. Fluent API for Advanced Orchestration • Any parameter can be set with a dataframe column or with a single value • Get the first 200 results for many terms using several different accounts: new BingImageSearch().setQueryCol("queries").setOffsetCol("offsets").setKeyCol("keys") • Input rows as (offsets, queries, keys): (0, Cat, 17…), (100, Cat, 17…), (0, Tree, 3e…), (100, Tree, 4q…), (0, Car, G1…)
  10. High Performance Capabilities OOTB • Asynchronous Parallelism (P) • Automatic Batching (B) • Automatic Retries – Exponential Back-offs (EBO) – Backpressure (BP) • Benchmark (10 nodes, 20k requests, service limited to 1k req/min), listed as features: time in seconds, number of errors • None: 30.8 s, 18993 errors • EBO+BP: 1163.0 s, 0 errors • EBO+BP+B: 57.1 s, 0 errors • EBO+BP+B+P: 49.7 s, 0 errors
  11. HTTP on Spark • Full Integration between HTTP Protocol and Spark SQL • Spark as a Microservice Orchestrator • Spark + X • Example: df = SimpleHTTPTransformer().setInputParser(JSONInputParser()).setOutputParser(JSONOutputParser().setDataType(schema)).setOutputCol("results").setUrl(…)
  12. [Diagram: HTTP on Spark. Labels: Spark Worker, Partition, Client, Web Service, Local Service, HTTP Requests and Responses]
  13. Cognitive Service Containers (Now in Public Preview) • No app changes & compatible with full Cognitive Services feature set • Support for 6 key AI capabilities: Key Phrase Extraction, Language Detection, Sentiment Analysis, Face & Emotion Detection, OCR / Text Recognition, Language Understanding • Run & manage locally, try for free • Connect to Billing service for report back, unified billing with on-cloud and off-cloud transactions • Additional capabilities coming soon (e.g. Speech)
  14. Clusters with Embedded Services • Deploy cognitive services directly onto cluster worker nodes • Bring the compute to the data • Use low-latency in-machine networking • [Diagram labels: Spark Worker, Spark Scala Process, PySpark, Local Cognitive Service, PySpark Protocol, HTTP]
  15. Azure Kubernetes Service + Helm • Works on any k8s cluster • Helm: Package Manager for Kubernetes • Commands: helm repo add mmlspark https://dbanda.github.io/charts, then helm install mmlspark/spark --set localTextApi=true • [Diagram labels: Kubernetes (AKS, ACS, GKE, On-Prem etc), K8s worker, Spark Worker, Cognitive Service Container, HTTP on Spark, Spark Serving Load Balancer, Spark Serving Hotpath, Spark Readers, Cloud Cognitive Services, Storage or other Databases, Users / Apps. Entry points: Jupyter, Zeppelin, Livy, or Spark Submit. Traffic: REST Requests to Deployed Models; Submit Jobs, Run Notebooks, Manage Cluster, etc] • Dalitso Banda, dbanda@microsoft.com, Microsoft AI Development Acceleration Program
  16. Creating a Visual Search Engine for the Metropolitan Museum of Art • https://gen.studio
  17. Intelligent Image Annotation • The MET released 400k images under Open Access • Pipe images through the Computer Vision API to annotate them for searching • [Figure: query images, their Describe Image output ("A picture containing a person", "A picture containing a glass, cup", "A fish swimming underwater"), and their deep feature nearest neighbors]
  18. Reverse Image Search Architecture • [Diagram labels: Query Image, ResNet Featurizer, Deep Features, Fast Nearest Neighbor Lookup, Closest Match, MMLSpark, SparkML LSH or Annoy; filters from Zeiler + Fergus 2013]
  19. Example Nearest Neighbors • [Figure: query images and their nearest neighbors]
  20. Spark x Azure Search • Azure Search Sink for Spark • Allows for pushing thousands of documents per second into Azure Search instances • Built on HTTP on Spark • Use it to create search APIs on top of a Spark DataFrame
  21. Microsoft Machine Learning for Apache Spark v0.16 • Microsoft's Open Source Contributions to Apache Spark • www.aka.ms/spark • Azure/mmlspark • Components: Cognitive Services, Spark Serving, Model Interpretability, LightGBM Gradient Boosting, Deep Networks with CNTK, HTTP on Spark
  22. Conclusions • Can now embed Cognitive Services into Spark Workflows • Can harness Spark Cluster for Microservices • Get started now with interactive examples! • Help us advance Spark: www.aka.ms/spark, Azure/mmlspark • Contact: marhamil@microsoft.com, mmlspark-support@microsoft.com
  23. Thanks To • Sudarshan Raghunathan • Ilya Matiach • Microsoft NERD Garage Team + MIT Externship Program • Microsoft Development Acceleration Team: – Dalitso Banda, Casey Hong, Karthik Rajendran, Manon Knoertzer, Tayo Amuneke, Alejandro Buendia • Pablo Castro, Chris Hoder, Ryan Gaspar, Henrik Neilsen, Joseph Sirosh, Andrew Schonhoffer, Daniel Ciborowski, Markus Cosowicz • Azure CAT, AzureML, and Azure Search Teams
  24. Backup Slides
  25. [Diagram labels: Noise Vector, Generator, Generated Image, Training Data, Discriminator, Real or Generated?]
  26. [Diagram labels: Learned Noise Vector, Generator, Generated Image, Target Image, Pretrained ResNet 50] • Objective: $\mathit{Loss}_{pixel} + \mathit{Loss}_{semantic} \times \lambda$
  27. Code Space Interpolation • [Diagram labels: Inverted Noise Vector 1, Inverted Noise Vector 2, $G^{-1}$, $G$]
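
The last slide shows two images mapped to inverted noise vectors by $G^{-1}$, with the generator $G$ applied to points between them. The in-between step can be sketched as follows, assuming plain linear interpolation (the slides do not say which scheme was used). The function name `interpolate` and the 3-dimensional vectors are made up for illustration, and the generator itself is out of scope here.

```python
def interpolate(z_start, z_end, steps):
    """Return `steps` latent vectors evenly spaced from z_start to z_end."""
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same length")
    if steps < 2:
        raise ValueError("need at least two steps (the two endpoints)")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at z_start, 1.0 at z_end
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


# Hypothetical 3-dimensional codes; real GAN codes are much longer.
z1 = [0.0, 1.0, -2.0]
z2 = [4.0, 1.0, 2.0]
path = interpolate(z1, z2, steps=5)
```

Each vector in `path` would then be passed through the generator to render one frame of the transition between the two images.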
