Quoc Le, Software Engineer, Google at MLconf SF


Title: Deep Learning for Language Understanding
Abstract: Many current language understanding algorithms rely on expert knowledge to engineer models and features. In this talk, I will discuss how to use Deep Learning to understand texts without much prior knowledge. In particular, our algorithms will learn the vector representations of words. These vector representations can be used to solve word analogy or translate unknown words between languages. Our algorithms also learn vector representations of sentences and documents. These vector representations preserve the semantics of sentences and documents and therefore can be used for machine translation, text classification, information retrieval and sentiment analysis.


Sequence Learning for
Language Understanding
Presenter: Quoc V. Le
Google
Thanks: Andrew Dai, Jeff Dean, Matthieu Devin, Geoff
Hinton, Thang Luong, Rajat Monga, Ilya Sutskever, Oriol
Vinyals

Sequence Learning
Typical success of Machine Learning: mapping a fixed-length input to
a scalar value:
- Image recognition (Pixels -> “cat”)
- Speech recognition (Waveforms -> the utterance of “cat”)
Many language understanding problems require mapping from
sequences to sequences:
- Machine Translation (“I love music” -> “J’aime la musique”)
Quoc V. Le

How does Machine Translation work?
Use a dictionary to translate one word at a time
Use a model to reorder the words so that the sentence looks
reasonable.
Lots of rules:
- Phrases instead of words (“New York” should not be translated
as “New” + “York”)
- The meaning of a word depends on its context
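
The dictionary approach and the phrase rule above can be sketched in a few lines of Python. This is only an illustration: the word and phrase tables (WORDS, PHRASES) and both function names are invented for this example and do not come from the talk. The reordering model mentioned on the slide is left out.

```python
# Toy dictionaries, invented for illustration only.
WORDS = {"i": "je", "love": "aime", "music": "musique",
         "new": "nouveau", "york": "york"}
PHRASES = {("new", "york"): "New York"}


def translate_word_by_word(sentence):
    """Look up each word on its own; unknown words pass through unchanged."""
    return " ".join(WORDS.get(w, w) for w in sentence.lower().split())


def translate_with_phrases(sentence):
    """Try a two-word phrase lookup first, then fall back to single words."""
    tokens = sentence.lower().split()
    out, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in PHRASES:
            out.append(PHRASES[pair])
            i += 2
        else:
            out.append(WORDS.get(tokens[i], tokens[i]))
            i += 1
    return " ".join(out)


print(translate_word_by_word("I love New York"))
print(translate_with_phrases("I love New York"))
```

The word-by-word version translates “New” on its own, while the phrase table keeps “New York” intact. Every such case needs another hand-written rule, which is the maintenance burden the slide points at.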

Sequence Learning
Ideas:
- Use a Recurrent Neural Net encoder to map an input sequence
to a vector
- Use a Recurrent Neural Net decoder to map the vector to
another sequence
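
A minimal sketch of the encoder/decoder idea, under several assumptions that are not from the talk: a plain tanh recurrent step, a tiny made-up vocabulary, a hidden size of 8, and random, untrained weights. It shows only the data flow (sequence -> vector -> sequence), so the decoded tokens are meaningless.

```python
import math
import random

random.seed(0)
VOCAB = ["<EOS>", "A", "B", "C", "W", "X", "Y", "Z"]  # made-up vocabulary
V, H = len(VOCAB), 8  # vocabulary size and hidden size (arbitrary choices)


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(w_in, w_h, token_id, h):
    # h_new = tanh(w_in[:, token] + w_h @ h); a one-hot input selects a column.
    rec = matvec(w_h, h)
    return [math.tanh(w_in[i][token_id] + rec[i]) for i in range(H)]


# Separate (untrained) parameters for the encoder and the decoder.
enc_in, enc_h = rand_matrix(H, V), rand_matrix(H, H)
dec_in, dec_h = rand_matrix(H, V), rand_matrix(H, H)
w_out = rand_matrix(V, H)


def encode(token_ids):
    """Read the whole input sequence and return the final hidden state."""
    h = [0.0] * H
    for t in token_ids:
        h = rnn_step(enc_in, enc_h, t, h)
    return h


def decode(h, max_len=10):
    """Emit tokens one at a time, starting from the encoder's vector."""
    out, prev = [], VOCAB.index("<EOS>")  # <EOS> doubles as the start token
    for _ in range(max_len):
        h = rnn_step(dec_in, dec_h, prev, h)
        scores = matvec(w_out, h)
        prev = max(range(V), key=scores.__getitem__)  # greedy choice
        if VOCAB[prev] == "<EOS>":
            break
        out.append(VOCAB[prev])
    return out


source = [VOCAB.index(s) for s in ["A", "B", "C", "<EOS>"]]
vector = encode(source)
print(len(vector), decode(vector))
```

Whatever the length of the input, the encoder hands the decoder a vector of the same fixed size, which is what lets the two sequences differ in length.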

Sequence Learning
[Figure: example network that maps ABC -> WXYZ. Inputs: A B C <EOS> W X Y Z. Outputs: W X Y Z <EOS>.]
At test time, feed each output token back into the decoder as its next input.
For a better output sequence, generate many candidates and feed each
candidate to the decoder to keep a beam of possible sequences.
Use “beam search” to find the top sequences.
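
Beam search as described above can be sketched as follows. The function next_token_logprobs is a hypothetical stand-in for the decoder: it reads next-token probabilities from a small hand-made table, invented for this example, instead of running a neural network.

```python
import math


def next_token_logprobs(prefix):
    # Hypothetical stand-in for the decoder: log-probabilities of the next
    # token given the tokens generated so far. The numbers are made up.
    table = {
        (): {"W": 0.6, "X": 0.4},
        ("W",): {"X": 0.5, "<EOS>": 0.5},
        ("X",): {"Y": 0.9, "<EOS>": 0.1},
        ("W", "X"): {"<EOS>": 1.0},
        ("X", "Y"): {"<EOS>": 1.0},
    }
    probs = table.get(tuple(prefix), {"<EOS>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


def beam_search(beam_size=2, max_len=5):
    """Return finished sequences as (log-probability, tokens), best first."""
    beam = [(0.0, [])]  # unfinished hypotheses
    finished = []
    for _ in range(max_len):
        # Extend every hypothesis in the beam by every possible next token.
        candidates = []
        for score, tokens in beam:
            for tok, lp in next_token_logprobs(tokens).items():
                candidates.append((score + lp, tokens + [tok]))
        # Keep only the best few; set aside the ones that have ended.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = []
        for score, tokens in candidates[:beam_size]:
            if tokens[-1] == "<EOS>":
                finished.append((score, tokens))
            else:
                beam.append((score, tokens))
        if not beam:
            break
    return sorted(finished, key=lambda c: c[0], reverse=True)


for score, tokens in beam_search(beam_size=2):
    print(round(math.exp(score), 2), " ".join(tokens))
```

With beam_size=1 the search is greedy: it commits to “W” because that is the most likely first token. With a beam of 2 it also keeps “X” alive and finds “X Y <EOS>”, which has a higher overall probability in this toy table.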

A machine translation experiment
WMT’2014 (small in comparison to Google’s data):
- State of the art (a combination of many methods, took 20 years to
develop): 37
- Our method (took 3 person-years): 37
Important achievement because it’s a new way to represent input
texts and output texts. Potential breakthrough in many other areas
of language understanding.


Contact: Quoc V. Le (qvl@google.com),
Ilya Sutskever (ilyasu@google.com),
Oriol Vinyals (vinyals@google.com),
Minh-Thang Luong (lmthang@cs.stanford.edu)
Papers:
- Sequence to Sequence Learning with Neural Networks
- Addressing the Rare Word Problem in Neural Machine Translation
Upcoming NIPS paper

Más de MLconf

Understanding Human Impact: Social and Equity Assessments for AI Technologies Social and Equity Impact Assessments have broad applications but can be a useful tool to explore and mitigate for Machine Learning fairness issues and can be applied to product specific questions as a way to generate insights and learnings about users, as well as impacts on society broadly as a result of the deployment of new and emerging technologies. In this presentation, my goal is to advocate for and highlight the need to consult community and external stakeholder engagement to develop a new knowledge base and understanding of the human and social consequences of algorithmic decision making and to introduce principles, methods and process for these types of impact assessments.

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...

MLconf

The Brain’s Guide to Dealing with Context in Language Understanding Like the visual cortex, the regions of the brain involved in understanding language represent information hierarchically. But whereas the visual cortex organizes things into a spatial hierarchy, the language regions encode information into a hierarchy of timescale. This organization is key to our uniquely human ability to integrate semantic information across narratives. More and more, deep learning-based approaches to natural language understanding embrace models that incorporate contextual information at varying timescales. This has not only led to state-of-the art performance on many difficult natural language tasks, but also to breakthroughs in our understanding of brain activity. In this talk, we will discuss the important connection between language understanding and context at different timescales. We will explore how different deep learning architectures capture timescales in language and how closely their encodings mimic the brain. Along the way, we will uncover some surprising discoveries about what depth does and doesn’t buy you in deep recurrent neural networks. And we’ll describe a new, more flexible way to think about these architectures and ease design space exploration. Finally, we’ll discuss some of the exciting applications made possible by these breakthroughs.

Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding

MLconf

Applying Computer Vision to Reduce Contamination in the Recycling Stream With China’s recent refusal of most foreign recyclables, North American waste haulers are scrambling to figure out how to make on-shore recycling cost-effective in order to continue providing recycling services. Recyclables that were once being shipped to China for manual sorting are now primarily being redirected to landfills or incinerators. Without a solution, a nearly $5 billion annual recycling market could come to a halt. Purity in the recycling stream is key to this effort as contaminants in the stream can increase the cost of operations, damage equipment and reduce the ability to create pure commodities suitable for creating recycled goods. This market disruption as a result of China’s new regulations, however, provides us the chance to re-examine and improve our current disposal & collection habits with modern monitoring & artificial intelligence technology. Using images from our in-dumpster cameras, Compology has developed an ML-based process that helps identify, measure and alert for contaminants in recycling containers before they are picked-up, helping keep the recycling stream clean. Our convolutional neural network flags potential instances of contamination inside a dumpster, enabling garbage haulers to know which containers have the wrong type of material inside. This allows them to provide targeted, timely education, and when appropriate, assess fines, to improve recycling compliance at the businesses and residences they serve, helping keep recycling services financially viable. In this presentation, we will walk through our ML-based contamination measurement and scoring process by showing how Waste Management, a national waste hauler, has experienced 57% contamination reduction in nearly 2,000 containers over six months, This progress shows significant strides towards financially viable recycling services.

Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...

MLconf

Quantum Computing: a Treasure Hunt, not a Gold Rush Quantum computers promise a significant step up in computational power over conventional computers, but also suffer a number of counterintuitive limitations --- both in their computational model and in leading lab implementations. In this talk, we review how quantum computers compete with conventional computers and how conventional computers try to hold their ground. Then we outline what stands in the way of successful quantum ML applications.

Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush

MLconf

Data Labeling as Religious Experience One of the most common places to deploy a production machine learning systems is as a replacement for a legacy rules-based system that is having a hard time keeping up with new edge cases and requirements. I'll be walking through the process and tooling we used to help us design, train, and deploy a model to replace a set of static rules we had for handling invite spam at Slack, talk about what we learned, and discuss some problems to solve in order to make these migrations easier for everyone.

Josh Wills - Data Labeling as Religious Experience

MLconf

Project GaitNet: Ushering in the ImageNet moment for human Gait kinematics The emergence of the upright human bipedal gait can be traced back 4 to 2.8 million years ago, to the now extinct hominin Australopithecus afarensis. Fine grained analysis of gait using the modern MEMS sensors found on all smartphones not just reveals a lot about the person’s orthopedic and neuromuscular health status, but also has enough idiosyncratic clues that it can be harnessed as a passive biometric. While there were many siloed attempts made by the machine learning community to model Bipedal Gait sensor data, these were done with small datasets oft collected in restricted academic environs. In this talk, we will introduce the ImageNet moment for human gait analysis by presenting 'Project GaitNet', the largest ever planet-sized motion sensor based human bipedal gait dataset ever curated. We’ll also present the associated state-of-the-art results in classifying humans harnessing novel deep neural architectures and the related success stories we have enjoyed in transfer-learning into disparate domains of human kinematics analysis.

Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...

MLconf

Machine Learning Methods in Detecting Alzheimer’s Disease from Speech and Language Alzheimer's disease affects millions of people worldwide, and it is important to predict the disease as early and as accurate as possible. In this talk, I will discuss development of novel ML models that help classifying healthy people from those who develop Alzheimer's, using short samples of human speech. As an input to the model, features of different modalities are extracted from speech audio samples and transcriptions: (1) syntactic measures, such as e.g. production rules extracted from syntactic parse trees, (2) lexical measures, such as e.g. features of lexical richness and complexity and lexical norms, and (3) acoustic measures, such as e.g. standard Mel-frequency cepstral coefficients. I will present the ML model that detects cognitive impairment by reaching agreement among modalities. The resulting model is able to achieve state of the art performance in both supervised and semi-supervised manner, using manual transcripts of human speech. Additionally, I will discuss potential limitations of any fully-automated speech-based Alzheimer's disease detection model, focusing mostly on the analysis of the impact of a not-so-accurate automatic speech recognition (ASR) on the classification performance. To illustrate this, I will present the experiments with controlled amounts of artificially generated ASR errors and explain how the deletion errors affect Alzheimer's detection performance the most, due to their impact on the features of syntactic and lexical complexity.

Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...

MLconf

Optimized Image Classification on the Cheap In this talk, we anchor on building an image classifier trained on the Stanford Cars dataset to evaluate two approaches to transfer learning -fine tuning and feature extraction- and the impact of hyperparameter optimization on these techniques. Once we define the most performant transfer learning technique for Stanford Cars, we will double the size of the dataset through image augmentation to boost the classifier’s performance. We will use Bayesian optimization to learn the hyperparameters associated with image transformations using the downstream image classifier’s performance as the guide. In conjunction with model performance, we will also focus on the features of these augmented images and the downstream implications for our image classifier. To both maximize model performance on a budget and explore the impact of optimization on these methods, we apply a particularly efficient implementation of Bayesian optimization to each of these architectures in this comparison. Our goal is to draw on a rigorous set of experimental results that can help us answer the question: how can resource-constrained teams make trade-offs between efficiency and effectiveness using pre-trained models?

Meghana Ravikumar - Optimized Image Classification on the Cheap

MLconf

The Importance of Modeling Data Collection Data sets used in machine learning are often collected in a systematically biased way - certain data points are more likely to be collected than others. We call this "observation bias". For example, in health care, we are more likely to see lab tests when the patient is feeling unwell than otherwise. Failing to account for observation bias can, of course, result in poor predictions on new data. By contrast, properly accounting for this bias allows us to make better use of the data we do have. In this presentation, we discuss practical and theoretical approaches to dealing with observation bias. When the nature of the bias is known, there are simple adjustments we can make to nonparametric function estimation techniques, such as Gaussian Process models. We also discuss the scenario where the data collection model is unknown. In this case, there are steps we can take to estimate it from observed data. Finally, we demonstrate that having a small subset of data points that are known to be collected at random - that is, in an unbiased way - can vastly improve our ability to account for observation bias in the rest of the data set. My hope is that attendees of this presentation will be aware of the perils of observation bias in their own work, and be equipped with tools to address it.

Noam Finkelstein - The Importance of Modeling Data Collection

MLconf

The Uncanny Valley of ML Every so often, the conundrum of the Uncanny Valley re-emerges as advanced technologies evolve from clearly experimental products to refined accepted technologies. We have seen its effects in robotics, computer graphics, and page load times. The debate of how to handle the new technology detracts from its benefits. When machine learning is added to human decision systems a similar effect can be measured in increased response time and decreased accuracy. These systems include radiology, judicial assignments, bus schedules, housing prices, power grids and a growing variety of applications. Unfortunately, the Uncanny Valley of ML can be hard to detect in these systems and can lead to degraded system performance when ML is introduced, at great expense. Here, we'll introduce key design principles for introducing ML into human decision systems to navigate around the Uncanny Valley and avoid its pitfalls.

June Andrews - The Uncanny Valley of ML

MLconf

Deep Learning Architectures for Semantic Relation Detection Tasks Recognizing and distinguishing specific semantic relations from other types of semantic relations is an essential part of language understanding systems. Identifying expressions with similar and contrasting meanings is valuable for NLP systems which go beyond recognizing semantic relatedness and require to identify specific semantic relations. In this talk, I will first present novel techniques for creating labelled datasets required for training deep learning models for classifying semantic relations between phrases. I will further present various neural network architectures that integrate morphological features into integrated path-based and distributional relation detection algorithms and demonstrate that this model outperforms state-of-the-art models in distinguishing semantic relations and is capable of efficiently handling multi-word expressions.

Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks

MLconf

Building an Incrementally Trained, Local Taste Aware, Global Deep Learned Recommender System Model At Netflix, our main goal is to maximize our members’ enjoyment of the selected show by minimizing the amount of time it takes for them to find it. We try to achieve this goal by personalizing almost all aspects of our product, from what shows to recommend, to how to present these shows and construct their home pages, to what images to select per show, among many other things. Everything is a recommendation for us, and as an applied Machine Learning group, we spend our time building models for personalization that will eventually increase the joy and satisfaction of our members. In this talk we will primarily focus our attention on a) making a global deep learned recommender model that is aware of regional tastes and popularity and b) adapting this model to changing taste preferences as well as dynamic catalog availability. We will first go through some standard recommender system models that use Matrix Factorization and Topic Models and then compare and contrast them with more powerful, higher-capacity deep learning based models such as sequence models that use recurrent neural networks. We will show what it entails to build a global model that is aware of regional taste preferences and catalog availability. We will show how models built on the simple Maximum Likelihood principle fail to do that. We will then describe one solution that we have employed to enable the global deep learned models to focus their attention on capturing regional taste preferences and the changing catalog. In the latter half of the talk, we will discuss how we do incremental learning of deep learned recommender system models. Why do we need to do that? Everything changes with time. Users’ tastes change with time. What’s available on Netflix and what’s popular also change over time. Therefore, updating or improving recommendation systems over time is necessary to bring more joy to users. In addition to how we apply incremental learning, we will discuss some of the challenges we face involving large-scale data preparation, infrastructure setup for incremental model training, and pipeline scheduling. Incremental training enables us to serve fresher models trained on fresher and larger amounts of data. This helps our recommender system adapt quickly to catalog and taste changes and improves overall performance.

Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...

MLconf

Vito Ostuni - The Voice: New Challenges in a Zero UI World The adoption of voice-enabled devices has seen explosive growth in the last few years, and music consumption is among the most popular use cases. Music personalization and recommendation play a major role at Pandora in providing a delightful daily listening experience for millions of users. In turn, providing the same perfectly tailored listening experience through these novel voice interfaces brings interesting new challenges and exciting opportunities. In this talk we will describe how we apply personalization and recommendation techniques in three common voice scenarios, which can be defined in terms of request types: known-item, thematic, and broad open-ended. We will describe how we use deep learning slot filling techniques and query classification to interpret the user intent and identify the main concepts in the query. We will also present the differences and challenges regarding evaluation of voice-powered recommendation systems. Since pure voice interfaces do not contain visual UI elements, relevance labels need to be inferred through implicit actions such as play time, query reformulations, or other types of session-level information. Another difference is that while the typical recommendation task corresponds to recommending a ranked list of items, a voice play request translates into a single item play action. Thus, some considerations about closed feedback loops need to be made. In summary, improving the quality of voice interactions in music services is a relatively new challenge, and many exciting opportunities for breakthroughs still remain. There are many new aspects of recommendation system interfaces to address to bring a delightful and effortless experience to voice users. We will share a few open challenges to solve for the future.

Vito Ostuni - The Voice: New Challenges in a Zero UI World

MLconf

Anna Choromanska - Data-driven Challenges in AI: Scale, Information Selection...

MLconf

Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...

MLconf

Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...

MLconf

Neel Sundaresan - Teaching a machine to code

MLconf

Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...

MLconf

Soumith Chintala - Increasing the Impact of AI Through Better Software

MLconf

Roy Lowrance - Predicting Bond Prices: Regime Changes

MLconf

More from MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...

Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding

Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...

Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush

Josh Wills - Data Labeling as Religious Experience

Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...

Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...

Meghana Ravikumar - Optimized Image Classification on the Cheap

Latest

"I see eyes in my soup": How Delivery Hero implemented the safety system for ...

Zilliz

The microservices honeymoon is over. When starting a new project or revamping a legacy monolith, teams have started looking for alternatives to microservices. The Modular Monolith, or 'Modulith', is an architecture that reaps the benefits of (vertical) functional decoupling without the high costs associated with separate deployments. This talk will delve into the advantages and challenges of this progressive architecture, beginning with exploring the concept of a 'module', its internal structure, public API, and inter-module communication patterns. Supported by spring-modulith, the talk provides practical guidance on addressing the main challenges of a Modulith architecture: finding and guarding module boundaries, data decoupling, and integration module-testing. You should not miss this talk if you are a software architect or tech lead seeking practical, scalable solutions. About the author: With two decades of experience, Victor is a Java Champion working as a trainer for top companies in Europe. Five thousand developers in 120 companies have attended his workshops, so every week he gets to debate the challenges that various projects struggle with. In return, Victor summarizes key points from these workshops in conference talks and online meetups for the European Software Crafters, the world’s largest developer community around architecture, refactoring, and testing. Discover how Victor can help you on victorrentea.ro : company training catalog, consultancy and YouTube playlists.

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024

Victor Rentea

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood

Juan lago vázquez

ICT role in 21st century education and its challenges

rafiqahmad00786416

CNIC Information System with Pakdata Cf In Pakistan

danishmna97

Architecting Cloud Native Applications

WSO2

Exploring Multimodal Embeddings with Milvus

Zilliz

Discover the innovative features and strategic vision that keep WSO2 an industry leader. Explore the exciting 2024 roadmap of WSO2 API management, showcasing innovations, unified APIM/APK control plane, natural language API interaction, and cloud native agility. Discover how open source solutions, microservices architecture, and cloud native technologies unlock seamless API management in today's dynamic landscapes. Leave with a clear blueprint to revolutionize your API journey and achieve industry success!

WSO2's API Vision: Unifying Control, Empowering Developers

WSO2

Introduction to use of FHIR Documents in ABDM

Kumar Satyam

In this keynote, Asanka Abeysinghe, CTO,WSO2 will explore the shift towards platformless technology ecosystems and their importance in driving digital adaptability and innovation. We will discuss strategies for leveraging decentralized architectures and integrating diverse technologies, with a focus on building resilient, flexible, and future-ready IT infrastructures. We will also highlight WSO2's roadmap, emphasizing our commitment to supporting this transformative journey with our evolving product suite.

Platformless Horizons for Digital Adaptability

WSO2

Scaling API-first – The story of a global engineering organization Ian Reasor, Senior Computer Scientist - Adobe Radu Cotescu, Senior Computer Scientist - Adobe Apidays New York 2024: The API Economy in the AI Era (April 30 & May 1, 2024) ------ Check out our conferences at https://www.apidays.global/ Do you want to sponsor or talk at one of our conferences? https://apidays.typeform.com/to/ILJeAaV8 Learn more on APIscene, the global media made by the community for the community: https://www.apiscene.io Explore the API ecosystem with the API Landscape: https://apilandscape.apiscene.io/

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe

apidays

Tracing the root cause of a performance issue requires a lot of patience, experience, and focus. It’s so hard that we sometimes attempt to guess by trying out tentative fixes, but that usually results in frustration, messy code, and a considerable waste of time and money. This talk explains how to correctly zoom in on a performance bottleneck using three levels of profiling: distributed tracing, metrics, and method profiling. After we learn to read the JVM profiler output as a flame graph, we explore a series of bottlenecks typical for backend systems, like connection/thread pool starvation, invisible aspects, blocking code, hot CPU methods, lock contention, and Virtual Thread pinning, and we learn to trace them even if they occur in library code you are not familiar with. Attend this talk and prepare for the performance issues that will eventually hit any successful system. About the author: With two decades of experience, Victor is a Java Champion working as a trainer for top companies in Europe. Five thousand developers in 120 companies have attended his workshops, so every week he gets to debate the challenges that various projects struggle with. In return, Victor summarizes key points from these workshops in conference talks and online meetups for the European Software Crafters, the world’s largest developer community around architecture, refactoring, and testing. Discover how Victor can help you on victorrentea.ro : company training catalog, consultancy and YouTube playlists.

Finding Java's Hidden Performance Traps @ DevoxxUK 2024

Victor Rentea

Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving

Edi Saputra

AWS Community Day CPH - Three problems of Terraform

Andrey Devyatkin

💥 You’re lucky! We’ve found two different (lead) developers that are willing to share their valuable lessons learned about using UiPath Document Understanding! Based on recent implementations in appealing use cases at Partou and SPIE. Don’t expect fancy videos or slide decks, but real and practical experiences that will help you with your own implementations. 📕 Topics that will be addressed: • Training the ML-model by humans: do or don't? • Rule-based versus AI extractors • Tips for finding use cases • How to start 👨‍🏫👨‍💻 Speakers: o Dion Morskieft, RPA Product Owner @Partou o Jack Klein-Schiphorst, Automation Developer @Tacstone Technology

DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam

UiPathCommunity

Retrieval augmented generation (RAG) is the most popular style of large language model application to emerge from 2023. The most basic style of RAG works by vectorizing your data and injecting it into a vector database like Milvus for retrieval to augment the text output generated by an LLM. This is just the beginning. One of the ways that we can extend RAG, and extend AI, is through multilingual use cases. Typical RAG is done in English using embedding models that are trained in English. In this talk, we’ll explore how RAG could work in languages other than English. We’ll explore French, Chinese, and Polish.

Introduction to Multilingual Retrieval Augmented Generation (RAG)

Zilliz

Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows. We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases. This video focuses on the deployment of external web forms using Jotform for Bonterra Impact Management. This solution can be customized to your organization’s needs and deployed to support the common use cases below: - Intake and consent - Assessments - Surveys - Applications - Program registration Interested in deploying web form automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...

Jeffrey Haguewood

Strategies for Landing an Oracle DBA Job as a Fresher

Remote DBA Services

Quoc Le, Software Engineer, Google at MLconf SF

1. Sequence Learning for Language Understanding Presenter: Quoc V. Le Google Thanks: Andrew Dai, Jeff Dean, Matthieu Devin, Geoff Hinton, Thang Luong, Rajat Monga, Ilya Sutskever, Oriol Vinyals

2. Sequence Learning Typical success of Machine Learning: Mapping fixed-length input to a scalar value: - Image recognition (Pixels -> “cat”) - Speech recognition (Waveforms -> the utterance of “cat”) Many language understanding problems require mapping from sequences to sequences: - Machine Translation (“I love music” -> “J’aime la musique”) Quoc V. Le

4. How does Machine Translation work? Use a dictionary to translate one word at a time. Use a model to reorder the words so that the sentence looks reasonable. Lots of rules: - Phrases instead of words (“New York” should not be translated as “New” + “York”) - The meaning of words depends on context

5. Ideas: Sequence Learning - Use a Recurrent Neural Net encoder to map an input sequence to a vector - Use a Recurrent Neural Net decoder to map the vector to another sequence Quoc V. Le
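The encoder/decoder idea on this slide can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation from the talk: it uses a simple tanh recurrent cell rather than whatever cell the speakers trained, the weights are random and untrained (so the emitted tokens are arbitrary), and every name (`rnn_step`, `encode`, `decode`, the toy vocabularies, the hidden size of 8) is hypothetical.

```python
import math
import random

def make_matrix(rows, cols, rng):
    # Random, untrained weights: the output below is arbitrary by design.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def rnn_step(w_in, w_rec, x, h):
    # One recurrent step: new state = tanh(W_in x + W_rec h).
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]

def encode(tokens, vocab, w_in, w_rec, hidden_size):
    # Encoder: read the input sequence and keep only the final state vector.
    h = [0.0] * hidden_size
    for tok in tokens:
        h = rnn_step(w_in, w_rec, one_hot(vocab.index(tok), len(vocab)), h)
    return h

def decode(h, vocab, w_in, w_rec, w_out, max_len=10):
    # Decoder: start from the encoder's vector and emit one token per step,
    # feeding each emitted token back in as the next input.
    out, prev = [], "<EOS>"
    for _ in range(max_len):
        h = rnn_step(w_in, w_rec, one_hot(vocab.index(prev), len(vocab)), h)
        scores = matvec(w_out, h)
        prev = vocab[scores.index(max(scores))]
        if prev == "<EOS>":
            break
        out.append(prev)
    return out

rng = random.Random(0)
src_vocab = ["A", "B", "C", "<EOS>"]
tgt_vocab = ["W", "X", "Y", "Z", "<EOS>"]
hidden = 8
enc_in, enc_rec = make_matrix(hidden, len(src_vocab), rng), make_matrix(hidden, hidden, rng)
dec_in, dec_rec = make_matrix(hidden, len(tgt_vocab), rng), make_matrix(hidden, hidden, rng)
dec_out = make_matrix(len(tgt_vocab), hidden, rng)

thought = encode(["A", "B", "C", "<EOS>"], src_vocab, enc_in, enc_rec, hidden)
translation = decode(thought, tgt_vocab, dec_in, dec_rec, dec_out)
```

The point of the sketch is the data flow: the whole input sequence is compressed into the single vector `thought`, and the decoder sees nothing else of the source sentence.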

6. Sequence Learning Example network that maps ABC -> WXYZ. Inputs: A B C <EOS> W X Y Z. Outputs: W X Y Z <EOS>. At test time, feed the output back into the decoder as the input. For a better output sequence, generate many candidates and feed each candidate to the decoder to obtain a beam of possible sequences. Use “beam search” to find the top sequences.
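Beam search as described on this slide can be sketched as follows. This is an illustrative version only: the decoder is replaced by a hand-written probability table (`TOY_TABLE`, invented for this example) so that the difference between greedy decoding and a wider beam is visible, and the function and parameter names are assumptions rather than anything from the talk.

```python
import math

def beam_search(next_log_probs, beam_width, max_len, eos="<EOS>"):
    # Each hypothesis is (total log-probability, list of tokens so far).
    beams = [(0.0, [])]
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            # Ask the model for the next-token distribution given the prefix.
            for tok, lp in next_log_probs(seq).items():
                candidates.append((score + lp, seq + [tok]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_width]:
            if seq[-1] == eos:
                finished.append((score, seq))
            else:
                beams.append((score, seq))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    finished.sort(key=lambda c: c[0], reverse=True)
    return finished

# Hypothetical stand-in for a trained decoder: the first token "W" looks
# best locally, but the sequence starting with "X" is more probable overall.
TOY_TABLE = {
    (): {"W": 0.6, "X": 0.4},
    ("W",): {"X": 0.5, "Y": 0.5},
    ("X",): {"Y": 0.9, "Z": 0.1},
}

def toy_next_log_probs(seq):
    probs = TOY_TABLE.get(tuple(seq), {"<EOS>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}

greedy = beam_search(toy_next_log_probs, beam_width=1, max_len=5)
wider = beam_search(toy_next_log_probs, beam_width=2, max_len=5)
```

With a beam width of 1 the search commits to "W" and ends with a sequence of probability 0.3, while a beam width of 2 keeps "X" alive and recovers "X Y" with probability 0.36.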

8. A machine translation experiment WMT’2014 (small in comparison to Google’s data): - State of the art (a combination of many methods, took 20 years to develop): 37 - Our method (took 3 person-years): 37 An important achievement because it is a new way to represent input texts and output texts. A potential breakthrough in many other areas of language understanding.

12. Contact: Quoc V. Le (qvl@google.com), Ilya Sutskever (ilyasu@google.com), Oriol Vinyals (vinyals@google.com), Minh-Thang Luong (lmthang@cs.stanford.edu). Papers: “Sequence to Sequence Learning with Neural Networks”; “Addressing the Rare Word Problem in Neural Machine Translation”; upcoming NIPS paper.
