Convolutional Neural Networks for
Natural Language Processing
Adriaan Schakel
November 26, 2015
Google Trends
Query: convolutional neural networks
arXiv
http://export.arxiv.org/api/query?search_query=abs:convolutional+AND+neural+AND+network&start=0&max_results=10000
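A minimal sketch of how the query URL above could be assembled programmatically. The helper name `build_query_url` is hypothetical; only the endpoint and the three parameters come from the slide. The sketch builds the string and does not contact the server.

```python
from urllib.parse import urlencode

# Endpoint copied from the slide.
BASE_URL = "http://export.arxiv.org/api/query"


def build_query_url(terms, start=0, max_results=10000):
    """Build an arXiv API URL that searches abstracts for all given terms.

    Hypothetical helper for illustration; not part of any arXiv client library.
    """
    search_query = "abs:" + " AND ".join(terms)
    params = {
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
    }
    # safe=":" keeps the "abs:" field prefix readable; spaces are encoded as "+".
    return BASE_URL + "?" + urlencode(params, safe=":")


url = build_query_url(["convolutional", "neural", "network"])
print(url)
```

The resulting string can be passed to any HTTP client; the response format is not shown on the slide and is not assumed here.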
ILSVRC2012
Challenge: identify main objects present in images (from 1000 object categories)
Training data: 1.2 million labelled images
October 13, 2012: results released
Winner: Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton (University of Toronto)
Score: top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry
ILSVRC2012
Krizhevsky, Sutskever, and Hinton (2012)
ILSVRC2012
AlexNet:
deep ConvNet trained on raw RGB pixel values
60 million parameters and 650,000 neurons
5 convolutional layers (some followed by max-pooling layers) and
3 globally-connected layers
a final 1000-way softmax
trained on two NVIDIA GPUs for about a week
use of dropout in the globally-connected layers
DNNResearch
Radical Change
Letter from Yann LeCun to editor of CVPR 2012:
Getting papers about feature learning accepted at vision
conference has always been a struggle, and I’ve had more than
my share of bad reviews over the years. I was very sure that
this paper was going to get good reviews because:
it uses no hand-crafted features (it’s all learned all the way
through. Incredibly, this was seen as a negative point by
the reviewers!);
it beats all published results on 3 standard datasets for
scene parsing;
it’s an order of magnitude faster than the competing
methods.
If that is not enough to get good reviews, I just don’t know
what is. So, I’m giving up on submitting to computer vision
conferences altogether. (...) Submitting our papers is just a
waste of everyone’s time (and incredibly demoralizing to my lab
members).
Revolution?
History:
1980: introduction of ConvNets by Fukushima
late 1980s: further development by LeCun and
collaborators @ Bell Labs
late 1990s: LeNet-5 was reading about 20% of the checks written in the U.S.
Breakthrough due to:
persistence of academic researchers
improved algorithms
increase in computing power
increase in amount of data
dissemination of knowledge
http://www.elitestreetsmagazine.com/magazine/2008/jan-mar/art.php
Neural Networks
1943 McCulloch and Pitts proposed first artificial neuron:
computes weighted sum of its binary input signals, $x_i \in \{0, 1\}$:
$y = \theta\left(\sum_{i=1}^{n} w_i x_i - u\right)$
1957 Rosenblatt developed a learning algorithm: the perceptron
(for linearly separable data only)
A. K. Jain, J. Mao, and K. M. Mohiuddin, IEEE Computer, 1996
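A minimal sketch of the two ideas on this slide: a threshold neuron that applies a step function to a weighted sum of binary inputs, and a perceptron-style learning rule. The function names, the learning rate `eta`, and the AND-gate data set are illustrative assumptions, not taken from the slides.

```python
def step(z):
    """Unit step function: 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0


def neuron(weights, threshold, inputs):
    """Threshold unit: step(sum_i w_i * x_i - u)."""
    return step(sum(w * x for w, x in zip(weights, inputs)) - threshold)


def train_perceptron(samples, n_inputs, eta=1.0, max_epochs=100):
    """Perceptron rule: adjust weights and threshold after every mistake."""
    weights, threshold = [0.0] * n_inputs, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - neuron(weights, threshold, inputs)
            if error != 0:
                mistakes += 1
                weights = [w + eta * error * x for w, x in zip(weights, inputs)]
                threshold -= eta * error
        if mistakes == 0:  # every sample classified correctly
            break
    return weights, threshold


# Illustrative, linearly separable data: the logical AND of two binary inputs.
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, threshold = train_perceptron(AND_GATE, n_inputs=2)
predictions = [neuron(weights, threshold, x) for x, _ in AND_GATE]
print(predictions)
```

On data that is not linearly separable (XOR, for example) the loop would simply run until `max_epochs` without reaching zero mistakes.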
Perceptron
The New York Times July 7, 1958:
Feed-Forward Neural Networks
neurons arranged in layers
neurons propagate signals only forward
input of jth neuron in layer l:
$x^{l}_{j} = \sum_{i} w^{l}_{ji} \, y^{l-1}_{i}$
output:
$y^{l}_{j} = h\left(x^{l}_{j}\right)$
A. K. Jain, J. Mao, and K. M. Mohiuddin, IEEE Computer, 1996; commons.wikimedia.org
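The layer-by-layer propagation on this slide can be sketched in a few lines. The choice of `tanh` for the activation h and the example weights are assumptions made for illustration only.

```python
import math


def layer_forward(weights, prev_outputs, h=math.tanh):
    """One layer: x_j = sum_i w_ji * y_i, followed by y_j = h(x_j)."""
    inputs = [
        sum(w_ji * y_i for w_ji, y_i in zip(row, prev_outputs))
        for row in weights
    ]
    return [h(x_j) for x_j in inputs]


def forward(network, signal):
    """Propagate a signal forward through all layers, one after another."""
    for weights in network:
        signal = layer_forward(weights, signal)
    return signal


# Hypothetical weights: row j of each matrix holds the weights w_ji of neuron j.
network = [
    [[0.5, -0.5], [1.0, 1.0], [0.0, 2.0]],  # layer 1: 2 inputs -> 3 neurons
    [[1.0, -1.0, 0.5]],                     # layer 2: 3 inputs -> 1 neuron
]
output = forward(network, [1.0, 0.0])
print(output)
```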
Backpropagation
Paul Werbos (1974):
1. initialize weights to small random values
2. choose input pattern
3. propagate signal forward through network
4. determine error (E) and propagate it backwards through network to
assign credit/blame to each unit
5. update weights by means of gradient descent:
$\Delta w_{ji} = -\eta \, \frac{\partial E}{\partial w_{ji}}$
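The five steps can be sketched for a network with one hidden layer. Everything specific here is an assumption chosen to keep the example small: sigmoid units, squared error E = ½(y − t)², a constant bias input, the learning rate, and the OR data set.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w_hidden, w_out, x):
    """Forward pass; a constant 1.0 is appended to serve as a bias input."""
    x_b = list(x) + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x_b))) for row in w_hidden]
    h_b = hidden + [1.0]
    y = sigmoid(sum(w * v for w, v in zip(w_out, h_b)))
    return x_b, h_b, y


def total_error(w_hidden, w_out, data):
    return sum(0.5 * (forward(w_hidden, w_out, x)[2] - t) ** 2 for x, t in data)


def init_weights(n_in, n_hidden, seed=0):
    """Step 1: small random initial weights."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
    return w_hidden, w_out


def train(w_hidden, w_out, data, eta=0.5, epochs=2000):
    w_hidden = [row[:] for row in w_hidden]  # work on copies
    w_out = w_out[:]
    n_hidden = len(w_hidden)
    for _ in range(epochs):
        for x, t in data:                              # step 2: choose a pattern
            x_b, h_b, y = forward(w_hidden, w_out, x)  # step 3: forward pass
            # Step 4: error signal at the output, propagated back to hidden units.
            delta_out = (y - t) * y * (1.0 - y)
            delta_hidden = [delta_out * w_out[j] * h_b[j] * (1.0 - h_b[j])
                            for j in range(n_hidden)]
            # Step 5: gradient descent, w <- w - eta * dE/dw.
            w_out = [w - eta * delta_out * h for w, h in zip(w_out, h_b)]
            for j in range(n_hidden):
                w_hidden[j] = [w - eta * delta_hidden[j] * v
                               for w, v in zip(w_hidden[j], x_b)]
    return w_hidden, w_out


OR_GATE = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w_hidden0, w_out0 = init_weights(n_in=2, n_hidden=2)
initial_error = total_error(w_hidden0, w_out0, OR_GATE)
w_hidden1, w_out1 = train(w_hidden0, w_out0, OR_GATE)
final_error = total_error(w_hidden1, w_out1, OR_GATE)
print(initial_error, final_error)
```

Comparing `initial_error` with `final_error` gives a quick check that the weight updates reduce the error on the training patterns.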
ConvNets
Feed-forward nets w/:
local receptive field
shared weights
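Both properties can be seen in a one-dimensional sketch: each output value depends only on a small window of the input (local receptive field), and the same kernel is reused at every position (shared weights). The signal and kernel below are made up for illustration, and the operation is written without flipping the kernel, as is common in ConvNet implementations.

```python
def conv1d(signal, kernel):
    """Slide one shared kernel over the signal (fully overlapping positions only)."""
    width = len(kernel)
    return [
        sum(k * s for k, s in zip(kernel, signal[start:start + width]))
        for start in range(len(signal) - width + 1)
    ]


signal = [0, 0, 1, 1, 1, 0, 0]
edge_detector = [-1, 0, 1]  # hypothetical kernel: responds to rising and falling edges
feature_map = conv1d(signal, edge_detector)
print(feature_map)
```

The kernel has three parameters regardless of the signal length, which is what keeps the parameter count of convolutional layers low.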
Applications:
character recognition
face recognition
medical diagnosis
self-driving cars
object recognition
(e.g. birds)
Race to bring Deep Learning to the Masses
Major players:
Google
Facebook
Baidu
Microsoft
Nvidia
Apple
Amazon
LeCun @ Facebook
http://www.popsci.com/facebook-ai
Fooling ConvNets
fix trained network
carry out backprop using wrong class label
update input pixels:
Goodfellow, Shlens, and Szegedy, ICLR 2015
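The update formula from this slide did not survive extraction, so the following is only a sketch of the recipe in the bullet points, under loud assumptions: a tiny logistic model stands in for the trained network, its parameters are invented, and the pixel update is a sign-of-gradient step of size `epsilon` that lowers the loss for the wrong label.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, pixels):
    """Probability of class 1 under a fixed logistic model."""
    return sigmoid(sum(w * p for w, p in zip(weights, pixels)) + bias)


def sign(g):
    return (g > 0) - (g < 0)


def fooling_step(weights, bias, pixels, wrong_label, epsilon):
    """Shift every pixel by epsilon so that the loss for wrong_label decreases."""
    p = predict(weights, bias, pixels)
    # Gradient of the cross-entropy loss with respect to the input: (p - label) * w.
    grad = [(p - wrong_label) * w for w in weights]
    return [x - epsilon * sign(g) for x, g in zip(pixels, grad)]


# Hypothetical "trained" parameters; they stay fixed, only the input changes.
weights, bias = [2.0, -1.0, 0.5], 0.0
pixels = [0.6, 0.2, 0.4]

before = predict(weights, bias, pixels)
adversarial = fooling_step(weights, bias, pixels, wrong_label=0, epsilon=0.5)
after = predict(weights, bias, adversarial)
print(before, after)
```

The sketch does not clip pixel values to a valid range; a real image pipeline would normally do so, and would use a much smaller `epsilon`.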
Dreaming ConvNets
fix trained network
initialize input by average image of some class
carry out backprop using that class’s label
update input pixels:
Simonyan, Vedaldi, and Zisserman, arXiv:1312.6034
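The update formula from this slide is missing from the extracted text. As a rough sketch of the recipe, assume a linear class score stands in for the trained network and add an L2 penalty on the image so that gradient ascent stays bounded; the penalty weight `l2`, the step size, and the starting image are all illustrative choices.

```python
def class_score(weights, image):
    """Stand-in for the network's score for one class (linear, for illustration)."""
    return sum(w * x for w, x in zip(weights, image))


def objective(weights, image, l2):
    """Class score minus an L2 penalty on the image."""
    return class_score(weights, image) - l2 * sum(x * x for x in image)


def dream(weights, start_image, step_size=0.1, l2=0.5, steps=50):
    """Gradient ascent on the input pixels; the model weights stay fixed."""
    image = list(start_image)
    for _ in range(steps):
        # Gradient of the objective with respect to each pixel.
        grad = [w - 2.0 * l2 * x for w, x in zip(weights, image)]
        image = [x + step_size * g for x, g in zip(image, grad)]
    return image


weights = [1.0, -2.0, 0.5]      # hypothetical fixed class weights
average_image = [0.2, 0.2, 0.2]  # hypothetical average image of the class
dreamed = dream(weights, average_image)
print(dreamed)
```

For a linear score the result simply approaches a scaled copy of the class weights; with a deep network the same loop is what produces class-specific imagery.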
ConvNets for NLP Tasks
2008:
Case study: sentiment analysis (classification)
Rationale: key phrases that are indicative of class membership can appear anywhere in a document
Applications
almost every image posted by Mrs
Merkel’s office of her in meetings
and summits has attracted
comments in Russian criticising
her and her policies.
Staff in Mrs Merkel’s office have
been deleting comments but some
remain despite the purge.
FAZ 07.06.2015:
Merkel's social media team, whose headcount is not disclosed, was hopelessly overwhelmed.
Pre-trained Word Vectors
Word embeddings:
dense vectors (w/ dimension d of order 100)
derived from word co-occurrences: a word is characterized by the
company it keeps (Firth, 1957)
GloVe [Pennington, Socher, and Manning (2014)]:
log-bilinear regression model
learns word vectors, such that:
$\log X_{ij} = w_i^{T} \tilde{w}_j + b_i + \tilde{b}_j$
$X_{ij}$: the number of times word j occurs in the context of word i
$w_i \in \mathbb{R}^d$: word vectors
$\tilde{w}_j \in \mathbb{R}^d$: context word vectors
Word2vec (skip-gram algorithm) [Mikolov et al. (2013)]:
shallow feed-forward neural network
learns word vectors, such that:
$\frac{X_{ij}}{\sum_{k} X_{ik}} = \frac{e^{w_i^{T} \tilde{w}_j}}{\sum_{k} e^{w_i^{T} \tilde{w}_k}}$
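A small numeric sketch of the two model equations on this slide: the GloVe expression for the log co-occurrence count and the skip-gram softmax for the co-occurrence ratio. The two-dimensional vectors and the bias values are made up; real embeddings have dimension of order 100 and are learned, not hand-picked.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def glove_log_count(w_i, w_ctx_j, b_i, b_ctx_j):
    """GloVe model value for log(X_ij): dot product plus the two biases."""
    return dot(w_i, w_ctx_j) + b_i + b_ctx_j


def skipgram_probs(w_i, context_vectors):
    """Softmax over all context words: model value for X_ij / sum_k X_ik."""
    scores = [math.exp(dot(w_i, w_c)) for w_c in context_vectors]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical toy vectors for one word and three context words.
word = [0.5, -0.2]
contexts = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]

log_count = glove_log_count(word, contexts[0], b_i=0.1, b_ctx_j=0.2)
probs = skipgram_probs(word, contexts)
print(log_count, probs)
```

Both models score a word/context pair by the same dot product; they differ in what that score is fitted to.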
Pre-trained Word Vectors
Kim (2014): Sentence classification
Hyperparameters:
filters of width (region size) 3, 4, and 5
100 feature maps each
max-pooling layer
penultimate layer: 300 units
Datasets (average sentence length ∼ 20):
movie reviews w/ one sentence per review (pos/neg?)
electronic product reviews (pos/neg?)
TREC question dataset: is the question about a person, a location, numeric information, etc.? (6 categories)
arXiv:1408.5882
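A sketch of how the penultimate layer arises from the hyperparameters on this slide: each filter spans a few consecutive word vectors, its responses are max-pooled over the whole sentence, and the pooled values of all filters are concatenated. To stay readable the example uses 2 feature maps per width instead of 100, random numbers in place of pre-trained vectors and learned filters, a ReLU nonlinearity, and no bias terms; all of these are assumptions of the sketch.

```python
import random


def conv_max_pool(sentence, filt):
    """Apply one filter spanning len(filt) words, then max-pool over positions."""
    width = len(filt)
    activations = []
    for start in range(len(sentence) - width + 1):
        window = sentence[start:start + width]
        z = sum(f * x
                for f_row, x_row in zip(filt, window)
                for f, x in zip(f_row, x_row))
        activations.append(max(0.0, z))  # ReLU
    return max(activations)


def sentence_features(sentence, filters_by_width):
    """Concatenate the pooled outputs of all filters of all widths."""
    features = []
    for width in sorted(filters_by_width):
        for filt in filters_by_width[width]:
            features.append(conv_max_pool(sentence, filt))
    return features


rng = random.Random(0)
dim, length, maps_per_width = 4, 7, 2  # toy sizes, much smaller than in the paper

# Random stand-ins for the word vectors of a 7-word sentence.
sentence = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(length)]
filters_by_width = {
    width: [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
            for _ in range(maps_per_width)]
    for width in (3, 4, 5)
}

features = sentence_features(sentence, filters_by_width)
print(len(features))  # 3 widths x 2 feature maps = 6
```

With 100 feature maps per width the same concatenation yields the 300 units listed for the penultimate layer, independent of sentence length.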
One-Hot Vectors
Johnson and Zhang (2015): Classification of larger pieces of text
(average size ∼ 200)
$\text{aardvark}: \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \text{zwieback}: \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}$
Hyperparameters:
filter width (region size) 3
stack words in region
1000 feature maps
max-pooling
penultimate layer: 1000 units
Performance:
IMDB (|V| = 30k): error rate 8.74%
Amazon Elec: error rate 7.74%
arXiv:1412.1058
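A sketch of the input representation described on this slide: every word becomes a one-hot vector over the vocabulary, and the vectors of the words in a region of width 3 are stacked (concatenated) before the convolution is applied. The four-word vocabulary and the helper names are invented for the example.

```python
def one_hot(index, size):
    """Vector of zeros with a single 1 at the given index."""
    vec = [0] * size
    vec[index] = 1
    return vec


def region_vectors(tokens, vocab, width=3):
    """Concatenate the one-hot vectors of each run of `width` consecutive words."""
    size = len(vocab)
    encoded = [one_hot(vocab[t], size) for t in tokens]
    return [
        [v for vec in encoded[start:start + width] for v in vec]
        for start in range(len(encoded) - width + 1)
    ]


# Hypothetical toy vocabulary; the IMDB setup on the slide uses 30k words.
vocab = {"aardvark": 0, "eats": 1, "my": 2, "zwieback": 3}
regions = region_vectors(["aardvark", "eats", "my", "zwieback"], vocab)
print(regions[0])
```

Each region vector has width × |V| entries, so with a 30k vocabulary the input to a filter is 90k-dimensional but contains only three non-zero values.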
Character Input
Zhang, Zhao, and LeCun (2015): Large datasets
Hyperparameters:
alphabet of size 70
6 convolutional layers (some followed by max-pooling layers) and
3 fully-connected layers
filter width (region size) 7 or 3
1024 feature maps
Performance (test error in %):
Model          AG     Sogou  DBP.   Yelp P.  Yelp F.  Yah. A.  Amz. F.  Amz. P.
BoW            11.19  7.15   3.39   7.76     42.01    31.11    45.36    9.60
BoW TFIDF      10.36  6.55   2.63   6.34     40.14    28.96    44.74    9.00
ngrams         7.96   2.92   1.37   4.36     43.74    31.53    45.73    7.98
ngrams TFIDF   7.64   2.81   1.31   4.56     45.20    31.49    47.56    8.46
ConvNet        12.82  4.88   1.73   5.89     39.62    29.55    41.31    5.51
arXiv:1509.01626
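A sketch of character-level input: each character is mapped to a one-hot vector over a fixed alphabet, so a text becomes a sequence of such frames. The alphabet below is a shortened, hypothetical stand-in for the 70 symbols mentioned on the slide; lower-casing the text and mapping unknown characters to all-zero vectors are assumptions of this sketch.

```python
# Hypothetical alphabet, shorter than the 70 symbols used in the cited work.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 .,!?"


def quantize(text, alphabet=ALPHABET):
    """Turn a string into a list of one-hot frames, one frame per character."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    frames = []
    for ch in text.lower():
        frame = [0] * len(alphabet)
        if ch in index:  # characters outside the alphabet stay all-zero
            frame[index[ch]] = 1
        frames.append(frame)
    return frames


frames = quantize("Hi#")
print([sum(frame) for frame in frames])
```

The resulting frames play the role that word vectors play in the word-level models, with the filter width counted in characters instead of words.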
Outlook
Convenient and powerful libraries:
Theano (Lasagne, Keras) developed at LISA, University of Montreal
Torch primarily developed by Ronan Collobert (now @ Facebook), used within Facebook, Google, Twitter, and NYU
TensorFlow by Google
The new iPhone 6S shows great GPU performance. So, expect (more) deep learning coming to your phone.
Embedded devices like Nvidia’s TX1, a tiny supercomputer w/ 256 CUDA cores and 4GB memory, for driver-assistance systems and the like.
http://technonewschannel.com/tips-trick/5-hidden-features-of-android-camera-which-you-should-know/

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 

Último (20)

MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

#4 Convolutional Neural Networks for Natural Language Processing

  • 1. Convolutional Neural Networks for Natural Language Processing Adriaan Schakel November 26, 2015
  • 4. ILSVRC2012. Challenge: identify the main objects present in images (from 1000 object categories). Training data: 1.2 million labelled images. October 13, 2012: results released. Winner: Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton (University of Toronto). Score: top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
  • 6. ILSVRC2012 AlexNet: deep ConvNet trained on raw RGB pixel values; 60 million parameters and 650,000 neurons; 5 convolutional layers (some followed by max-pooling layers) and 3 globally-connected layers; a final 1000-way softmax; trained on two NVIDIA GPUs for about a week; use of dropout in the globally-connected layers
  • 8. Radical Change Letter from Yann LeCun to editor of CVPR 2012: Getting papers about feature learning accepted at vision conference has always been a struggle, and I’ve had more than my share of bad reviews over the years. I was very sure that this paper was going to get good reviews because: it uses no hand-crafted features (it’s all learned all the way through. Incredibly, this was seen as a negative point by the reviewers!); it beats all published results on 3 standard datasets for scene parsing; it’s an order of magnitude faster than the competing methods. If that is not enough to get good reviews, I just don’t know what is. So, I’m giving up on submitting to computer vision conferences altogether. (. . . ) Submitting our papers is just a waste of everyone’s time (and incredibly demoralizing to my lab members).
  • 9. Revolution? History: 1980: introduction of ConvNets by Fukushima; late 1980s: further development by LeCun and collaborators @ Bell Labs; late 1990s: LeNet-5 was reading about 20% of written checks in the U.S. Breakthrough due to: persistence of academic researchers; improved algorithms; increase in computing power; increase in amount of data; dissemination of knowledge. http://www.elitestreetsmagazine.com/magazine/2008/jan-mar/art.php
  • 10. Neural Networks. 1943: McCulloch and Pitts proposed the first artificial neuron, which computes a weighted sum of its binary input signals $x_i \in \{0, 1\}$: $y = \theta\left(\sum_{i=1}^{n} w_i x_i - u\right)$. 1957: Rosenblatt developed a learning algorithm, the perceptron (for linearly separable data only). AK Jain, J Mao, KM Mohiuddin - IEEE computer, 1996
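The neuron formula and the perceptron rule on this slide can be illustrated with a small sketch. It assumes that $\theta$ is the unit step function and $u$ a threshold, and it uses the textbook perceptron update (add or subtract the input on each mistake); the function names and the AND data set are made up for illustration and do not come from the slides.

```python
def step(z):
    """Unit step function theta: 1 if z >= 0, else 0 (assumed convention)."""
    return 1 if z >= 0 else 0

def neuron(weights, threshold, x):
    """Threshold unit: y = theta(sum_i w_i x_i - u)."""
    return step(sum(w * xi for w, xi in zip(weights, x)) - threshold)

def train_perceptron(samples, n_inputs, lr=1.0, max_epochs=100):
    """Perceptron rule: adjust weights and threshold after every mistake."""
    weights = [0.0] * n_inputs
    threshold = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:
            error = target - neuron(weights, threshold, x)
            if error != 0:
                mistakes += 1
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                threshold -= lr * error
        if mistakes == 0:      # every sample classified correctly
            break
    return weights, threshold

# Logical AND is linearly separable, so the rule converges on it.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, u = train_perceptron(AND, n_inputs=2)
predictions = [neuron(w, u, x) for x, _ in AND]
```

On data that is not linearly separable (XOR, for instance) the loop would simply run until `max_epochs` without reaching zero mistakes.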
  • 11. Perceptron The New York Times July 7, 1958:
  • 12. Feed-Forward Neural Networks: neurons arranged in layers; neurons propagate signals only forward; input of the $j$th neuron in layer $l$: $x^l_j = \sum_i w^l_{ji} y^{l-1}_i$; output: $y^l_j = h(x^l_j)$. AK Jain, J Mao, KM Mohiuddin - IEEE computer, 1996; commons.wikimedia.org
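A forward pass following the two formulas on this slide might look like the sketch below. The choice of `tanh` for the activation $h$, the absence of bias terms, and the toy weights are assumptions made for illustration.

```python
import math

def forward(layers, inputs, h=math.tanh):
    """Propagate a signal forward and return the output of every layer."""
    outputs = [list(inputs)]
    for weights in layers:              # weights[j][i] plays the role of w^l_ji
        prev = outputs[-1]              # y^{l-1}
        x = [sum(w_ji * y_i for w_ji, y_i in zip(row, prev)) for row in weights]
        outputs.append([h(x_j) for x_j in x])   # y^l_j = h(x^l_j)
    return outputs

layers = [
    [[0.5, -0.5], [1.0, 1.0], [0.0, 2.0]],   # 2 inputs -> 3 hidden neurons
    [[1.0, -1.0, 0.5]],                       # 3 hidden -> 1 output neuron
]
activations = forward(layers, [1.0, 0.0])
```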
  • 13. Backpropagation. Paul Werbos (1974): 1. initialize weights to small random values; 2. choose input pattern; 3. propagate signal forward through network; 4. determine error ($E$) and propagate it backwards through network to assign credit/blame to each unit; 5. update weights by means of gradient descent: $\Delta w_{ji} = -\eta \, \partial E / \partial w_{ji}$
  • 14. ConvNets. Feed-forward nets w/: local receptive field; shared weights. Applications: character recognition; face recognition; medical diagnosis; self-driving cars; object recognition (e.g. birds)
  • 15. Race to bring Deep Learning to the Masses. Major players: Google, Facebook, Baidu, Microsoft, Nvidia, Apple, Amazon. LeCun @ Facebook http://www.popsci.com/facebook-ai
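The five backpropagation steps can be sketched for a tiny network with one `tanh` hidden layer and a linear output. The architecture, the squared-error loss $E = \frac{1}{2}(\text{output} - \text{target})^2$, the learning rate, and the single training pattern are all assumptions chosen to keep the example short.

```python
import math
import random

def forward(W1, W2, x):
    """Step 3: propagate the signal forward."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    output = sum(w * hj for w, hj in zip(W2, hidden))
    return hidden, output

def backprop_step(W1, W2, x, target, eta=0.1):
    """Steps 4-5: propagate the error backwards, then update the weights."""
    hidden, output = forward(W1, W2, x)
    delta_out = output - target                      # dE/d(output)
    # Blame assigned to each hidden unit, using tanh'(a) = 1 - tanh(a)^2.
    delta_hidden = [delta_out * w * (1.0 - hj ** 2) for w, hj in zip(W2, hidden)]
    new_W2 = [w - eta * delta_out * hj for w, hj in zip(W2, hidden)]
    new_W1 = [[w - eta * dj * xi for w, xi in zip(row, x)]
              for row, dj in zip(W1, delta_hidden)]
    return new_W1, new_W2, 0.5 * delta_out ** 2      # error before the update

random.seed(0)
# Step 1: small random initial weights.
W1 = [[random.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(3)]
W2 = [random.uniform(-0.1, 0.1) for _ in range(3)]
x, target = [1.0, -1.0], 0.5                         # Step 2: one input pattern
errors = []
for _ in range(200):
    W1, W2, error = backprop_step(W1, W2, x, target)
    errors.append(error)
```

Here `errors` records $E$ before each update, so comparing its first and last entries shows whether gradient descent reduced the error on this pattern.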
  • 16. Fooling ConvNets: fix trained network; carry out backprop using wrong class label; update input pixels. Goodfellow, Shlens, and Szegedy, ICLR 2015
  • 17. Dreaming ConvNets: fix trained network; initialize input by average image of some class; carry out backprop using that class’ label; update input pixels. Simonyan, Vedaldi, and Zisserman, arXiv:1312.6034
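The update formula did not survive in this transcript, so the sketch below substitutes a sign-of-gradient step in the spirit of Goodfellow, Shlens, and Szegedy. A fixed logistic-regression model stands in for the trained ConvNet; its weights, the input, and `epsilon` are made-up values.

```python
import math

def predict(weights, bias, x):
    """Probability of class 1 under a fixed logistic-regression 'network'."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def fool(weights, bias, x, wrong_label, epsilon):
    """Shift every input value by epsilon in the direction favouring wrong_label."""
    p = predict(weights, bias, x)
    # Gradient of the cross-entropy loss for wrong_label with respect to x.
    grad_x = [(p - wrong_label) * w for w in weights]
    sign = lambda g: (g > 0) - (g < 0)
    # Descend the loss of the wrong label; the weights are left untouched.
    return [xi - epsilon * sign(g) for xi, g in zip(x, grad_x)]

weights, bias = [2.0, -3.0, 1.0, 0.5], 0.0     # frozen, hypothetical parameters
x = [0.2, 0.4, 0.1, 0.3]                       # input scored as class 0
x_adv = fool(weights, bias, x, wrong_label=1, epsilon=0.25)
p_before, p_after = predict(weights, bias, x), predict(weights, bias, x_adv)
```

Only the input changes: each component moves by at most `epsilon`, yet the predicted class flips from 0 to 1 in this toy setting.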
  • 18. ConvNets for NLP Tasks 2008: Case study: sentiment analysis (classification) Rationale: key phrases, that are indicative of class membership, can appear anywhere in a document
  • 19. Applications: almost every image posted by Mrs Merkel’s office of her in meetings and summits has attracted comments in Russian criticising her and her policies. Staff in Mrs Merkel’s office have been deleting comments but some remain despite the purge. FAZ 07.06.2015: Merkel’s social media team, whose headcount is not disclosed, was hopelessly overwhelmed.
  • 20. Pre-trained Word Vectors. Word embeddings: dense vectors (w/ dimension $d$ of order 100) derived from word co-occurrences: a word is characterized by the company it keeps (Firth, 1957). GloVe [Pennington, Socher, and Manning (2014)]: log-bilinear regression model learns word vectors, such that: $\log(X_{ij}) = w_i^T \tilde{w}_j + b_i + \tilde{b}_j$, with $X_{ij}$ the number of times word $j$ occurs in the context of word $i$, $w_i \in \mathbb{R}^d$ word vectors, and $\tilde{w}_j \in \mathbb{R}^d$ context word vectors. Word2vec (skip-gram algorithm) [Mikolov et al. (2013)]: shallow feed-forward neural network learns word vectors, such that: $\frac{X_{ij}}{\sum_j X_{ij}} = \frac{e^{w_i^T \tilde{w}_j}}{\sum_j e^{w_i^T \tilde{w}_j}}$
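To make the two relations concrete, the sketch below builds co-occurrence counts $X_{ij}$ from a toy sentence and evaluates both sides. The window size, the function names, and the corpus are assumptions; GloVe's weighting function and the training loops of both models are left out.

```python
import math
from collections import defaultdict

def cooccurrence(tokens, window=1):
    """X[i][j]: number of times word j occurs within `window` words of word i."""
    X = defaultdict(lambda: defaultdict(int))
    for pos, word in enumerate(tokens):
        lo, hi = max(0, pos - window), min(len(tokens), pos + window + 1)
        for ctx in range(lo, hi):
            if ctx != pos:
                X[word][tokens[ctx]] += 1
    return X

def empirical_prob(X, i, j):
    """Left-hand side of the skip-gram relation: X_ij / sum_j X_ij."""
    return X[i][j] / sum(X[i].values())

def softmax_prob(w, w_ctx, i, j):
    """Right-hand side: exp(w_i . w~_j) / sum_j exp(w_i . w~_j)."""
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    scores = {k: math.exp(dot(w[i], v)) for k, v in w_ctx.items()}
    return scores[j] / sum(scores.values())

def glove_residual(X, w, w_ctx, b, b_ctx, i, j):
    """w_i . w~_j + b_i + b~_j - log X_ij, which GloVe drives towards zero."""
    dot = sum(p * q for p, q in zip(w[i], w_ctx[j]))
    return dot + b[i] + b_ctx[j] - math.log(X[i][j])

tokens = "the cat sat on the mat".split()
X = cooccurrence(tokens)
```

For instance, `empirical_prob(X, "the", "cat")` is the fraction of context slots of "the" that are filled by "cat"; training would adjust the vectors until `softmax_prob` approximates it.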
  • 21. Pre-trained Word Vectors. Kim (2014): Sentence classification. Hyperparameters: filters of width (region size) 3, 4, and 5; 100 feature maps each; max-pooling layer; penultimate layer: 300 units. Datasets (average sentence length ∼ 20): movie reviews w/ one sentence per review (pos/neg?); electronic product reviews (pos/neg?); TREC question dataset: is the question about a person, a location, numeric information, etc.? (6 categories). arXiv:1408.5882
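A single feature map of such a model can be sketched as one filter sliding over windows of word vectors, followed by max-pooling over the whole sentence. The ReLU nonlinearity, the 2-dimensional toy vectors, and the filter values are assumptions for illustration; the slide lists 100 feature maps for each of the widths 3, 4, and 5.

```python
def conv_max_pool(sentence, filt, bias=0.0):
    """Slide one filter over word windows, apply ReLU, keep the maximum."""
    width = len(filt)                       # region size, e.g. 3, 4 or 5
    feature_map = []
    for start in range(len(sentence) - width + 1):
        window = sentence[start:start + width]
        z = bias + sum(f * v
                       for f_row, v_row in zip(filt, window)
                       for f, v in zip(f_row, v_row))
        feature_map.append(max(0.0, z))     # ReLU nonlinearity (assumed)
    return max(feature_map)                 # max-pooling over all positions

# Toy 2-dimensional "word vectors" for a 5-word sentence (made-up numbers).
sentence = [[0.1, 0.0], [0.9, 0.8], [1.0, 0.7], [0.8, 0.9], [0.0, 0.2]]
filt = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]   # one filter of width 3
feature = conv_max_pool(sentence, filt)
```

Because only the maximum is kept, the resulting feature does not depend on where in the sentence the strongest window occurs. The sketch assumes the sentence is at least as long as the filter.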
  • 22. One-Hot Vectors. Johnson and Zhang (2015): Classification of larger pieces of text (average size ∼ 200). aardvark: $(1, 0, \ldots, 0)^T$, zwieback: $(0, \ldots, 0, 1)^T$. Hyperparameters: filter width (region size) 3; stack words in region; 1000 feature maps; max-pooling; penultimate layer: 1000 units. Performance: IMDB ($|V| = 30\text{k}$): error rate 8.74%; Amazon Elec: error rate 7.74%. arXiv:1412.1058
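"Stack words in region" can be read as concatenating the one-hot vectors of the words in each region, which is what the sketch below does. The four-word vocabulary and the helper names are hypothetical.

```python
def one_hot(index, size):
    """Vector of zeros with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec

def region_vectors(tokens, vocab, width=3):
    """Concatenate the one-hot vectors of each run of `width` consecutive words."""
    index = {word: i for i, word in enumerate(vocab)}
    regions = []
    for start in range(len(tokens) - width + 1):
        stacked = []
        for word in tokens[start:start + width]:
            stacked.extend(one_hot(index[word], len(vocab)))
        regions.append(stacked)
    return regions

vocab = ["aardvark", "eats", "the", "zwieback"]
regions = region_vectors(["the", "aardvark", "eats", "zwieback"], vocab)
```

Each region vector has `width * len(vocab)` entries, so with a 30k-word vocabulary and region size 3 the input to a filter would be 90k-dimensional but contain only three ones.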
  • 23. Character Input. Zhang, Zhao, and LeCun (2015): Large datasets. Hyperparameters: alphabet of size 70; 6 convolutional layers (all followed by max-pooling layers) and 3 fully-connected layers; filter width (region size) 7 or 3; 1024 feature maps. arXiv:1509.01626. Performance (test error, %):
      Model          AG     Sogou  DBP.  Yelp P.  Yelp F.  Yah. A.  Amz. F.  Amz. P.
      BoW            11.19  7.15   3.39  7.76     42.01    31.11    45.36    9.60
      BoW TFIDF      10.36  6.55   2.63  6.34     40.14    28.96    44.74    9.00
      ngrams         7.96   2.92   1.37  4.36     43.74    31.53    45.73    7.98
      ngrams TFIDF   7.64   2.81   1.31  4.56     45.20    31.49    47.56    8.46
      ConvNet        12.82  4.88   1.73  5.89     39.62    29.55    41.31    5.51
  • 24. Outlook. Convenient and powerful libraries: Theano (Lasagne, Keras), developed at LISA, University of Montreal; Torch, primarily developed by Ronan Collobert (now @ Facebook), used within Facebook, Google, Twitter, and NYU; TensorFlow, by Google. The new iPhone 6S shows great GPU performance. So, expect (more) deep learning coming to your phone. Embedded devices like Nvidia’s TX1, a tiny supercomputer w/ 256 CUDA cores and 4GB memory, for driver-assistance systems and the like. http://technonewschannel.com/tips-trick/5-hidden-features-of-android-camera-which-you-should-know/
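Character-level input can be sketched as a sequence of one-hot rows over a 70-symbol alphabet. The alphabet below is a stand-in assembled from Python's `string` constants and is not claimed to match the paper's exact list; lower-casing, the fixed length, and the all-zero rows for padding and unknown characters are likewise assumptions.

```python
import string

# Stand-in alphabet of 70 symbols (an assumption, not the paper's exact list).
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation + " \n"
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

def quantize(text, length):
    """Encode text as `length` one-hot rows; unknown characters and padding stay all-zero."""
    rows = []
    for ch in text.lower()[:length]:
        row = [0] * len(ALPHABET)
        if ch in CHAR_INDEX:
            row[CHAR_INDEX[ch]] = 1
        rows.append(row)
    # Pad short texts with all-zero rows up to the fixed length.
    rows.extend([0] * len(ALPHABET) for _ in range(length - len(rows)))
    return rows

encoded = quantize("ConvNets!", length=12)
```

The resulting `length` × 70 matrix plays the same role for characters that the sentence matrix of word vectors plays in the word-level models.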