This document provides an overview of recurrent neural networks and their applications. It discusses how RNNs can remember previous inputs through feedback loops and internal states. Long short-term memory networks are presented as an improvement over standard RNNs in dealing with long-term dependencies. The document also introduces word embeddings to map words to vectors, and transformers which provide an alternative to RNNs using self-attention. Code examples of RNNs in TensorFlow 2.0 are also shown.
9. Things you will learn today
• Handling of sequences in Neural Networks
• RNNs in Natural Language Processing tasks
• Basics of TF2
• Show me the code!!
ATTENTION: Some changes ahead!
10. Things you will not learn today
• What is a…
• Neural network
• Autoencoder
• …
• What is TensorFlow
• How to code in Python
• Star Trek lore
12. Language is a process of free creation; its laws and principles are fixed; but the manner in which the principles of generation are used is free and infinitely varied. Even the interpretation and use of words involves a process of free creation.
Noam Chomsky
15. Dynamic Structures
• Consider time as a variable
• Use energy functions to describe the information inside the network
• For the first time, the network consumes its own generated information → feedback loop
16. Recurrent Neural Networks
• Dynamic properties → feedback loops
• Short term memory
• Adaptive behaviour
• Differential equation system
• Applications
• Signal processing
• Time series forecasting
17. How can we remember things?
Use an internal state
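As a rough illustration of that idea, here is a minimal NumPy sketch (not code from the talk; all sizes and weights are made-up assumptions) of a vanilla RNN step whose internal state is fed back at every time step:
import numpy as np

# Minimal sketch: a vanilla RNN keeps an internal state h and feeds it back
# at every time step. Sizes (8 inputs, 16 hidden units) are illustrative.
rng = np.random.default_rng(0)
W_xh = rng.uniform(-0.1, 0.1, (8, 16))   # input  -> hidden weights
W_hh = rng.uniform(-0.1, 0.1, (16, 16))  # hidden -> hidden weights (the feedback loop)
b_h = np.zeros(16)

def rnn_step(x_t, h_prev):
    # The new state mixes the current input with the previous state.
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

h = np.zeros(16)                      # the internal state starts empty
sequence = rng.normal(size=(5, 8))    # 5 time steps of 8 features each
for x_t in sequence:
    h = rnn_step(x_t, h)              # h carries a memory of earlier inputs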
20. Far, far ahead in the future
The problem of long-term dependencies
21. LSTM to the rescue!
• LSTMs are explicitly designed to avoid the long-term dependency problem
• Remembering is the default behavior
• How? → 3-in-1 gating operation: Forget, Update, Output (sketch below)
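To make the 3-in-1 operation concrete, here is a rough NumPy sketch of a single LSTM step (not code from the talk; the sizes and random weights are illustrative assumptions): the forget gate decides what to erase from the cell state, the update gate what new information to write, and the output gate what to expose.
import numpy as np

# Sketch of the "3-in-1" LSTM step. Sizes (8 inputs, 16 hidden units) and
# random weights are illustrative assumptions.
rng = np.random.default_rng(0)
n_in, n_h = 8, 16
W_f, W_i, W_c, W_o = [rng.uniform(-0.1, 0.1, (n_in + n_h, n_h)) for _ in range(4)]
b_f = b_i = b_c = b_o = np.zeros(n_h)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(z @ W_f + b_f)                   # Forget: what to drop from the cell state
    i = sigmoid(z @ W_i + b_i)                   # Update: what new information to write
    c = f * c_prev + i * np.tanh(z @ W_c + b_c)  # remembering is the default path
    o = sigmoid(z @ W_o + b_o)                   # Output: what to expose as the new state
    h = o * np.tanh(c)
    return h, c

h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_h), np.zeros(n_h))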
25. Word embeddings
• Words or phrases are mapped to vectors of real numbers
• Relationships of words
• Their representation is learned from how the words are used in context (sketch below)
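A minimal sketch, assuming the tf.keras Embedding layer and a made-up vocabulary and vector size, of how words (as integer ids) are mapped to dense vectors that the network learns during training:
import tensorflow as tf

# Sketch: an Embedding layer maps integer word ids to dense real-valued
# vectors. Vocabulary size (10,000) and vector size (50) are assumptions.
embed = tf.keras.layers.Embedding(input_dim=10000, output_dim=50)
word_ids = tf.constant([[12, 7, 431, 2]])  # one sentence encoded as word ids
vectors = embed(word_ids)                  # shape (1, 4, 50): one 50-d vector per word
Because these vectors are trained together with the rest of the model, words used in similar contexts end up with similar representations.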
30. Some code about it
import tensorflow as tf
import numpy as np

# TF 1.x style: first build a static graph of a dense layer with ReLU...
b = tf.Variable(tf.zeros((100,)))
W = tf.Variable(tf.random_uniform((784, 100), -1, 1))
x = tf.placeholder(tf.float32, (None, 784))
h_i = tf.nn.relu(tf.matmul(x, W) + b)

# ...then execute it inside a session, feeding data through the placeholder
sess = tf.Session()
sess.run(tf.global_variables_initializer())
sess.run(h_i, {x: np.random.random((64, 784))})
31. TF 2.0: Eager Execution
import tensorflow as tf
import numpy as np

# Eager execution: the same dense layer runs immediately, with no graph,
# session or placeholders
b = np.zeros((100,))
W = np.random.uniform(-1, 1, (784, 100))
x = np.random.random((64, 784))
h_i = tf.nn.relu(tf.matmul(x, W) + b)
32. TF2.0 – They did it!!!
• Eager execution as default
• No need to call tf.enable_eager_execution()
• Blocks or functions can still be executed in graph mode (see the tf.function sketch below)
• Fewer conventions; a more object-oriented and Pythonic design
• variable_scope removed
• And yes, Keras is built in
• Clean-up of libraries and tf.contrib
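A minimal sketch of those two points, eager by default plus opting a single function back into graph mode with tf.function; the layer sizes simply mirror the earlier example:
import tensorflow as tf

# Eager execution is the default in TF 2.0; tf.function traces one function
# into a graph.
W = tf.Variable(tf.random.uniform((784, 100), -1, 1))
b = tf.Variable(tf.zeros((100,)))

@tf.function                  # traced and executed as a graph
def dense_relu(x):
    return tf.nn.relu(tf.matmul(x, W) + b)

h = dense_relu(tf.random.uniform((64, 784)))  # called like a normal Python function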
34. So LSTMs are here to stay, right?
Ermmm…
• Computation cannot be parallelized within a training example (each step depends on the previous state)
• Large memory requirements limit parallelization across training examples
36. Transformer
Jun 2017
• Google Research:
• "Attention is all you need" [arXiv:1706.03762]
• Vs. RNNs:
• Order-of-magnitude improvement in training time
• Vs. convolutional models:
• Complexity grows with distance/length in convolutional models
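To show what replaces recurrence, here is a minimal sketch of the scaled dot-product self-attention at the core of "Attention is all you need" (shapes are illustrative assumptions):
import tensorflow as tf

# Scaled dot-product self-attention: every token attends to every other
# token via matrix multiplications. Shapes (10 tokens, 64-dimensional
# queries/keys/values) are assumptions.
def scaled_dot_product_attention(Q, K, V):
    d_k = tf.cast(tf.shape(K)[-1], tf.float32)
    scores = tf.matmul(Q, K, transpose_b=True) / tf.sqrt(d_k)  # token-to-token affinities
    weights = tf.nn.softmax(scores, axis=-1)                   # attention distribution
    return tf.matmul(weights, V)                               # weighted mix of values

Q = K = V = tf.random.normal((10, 64))
out = scaled_dot_product_attention(Q, K, V)
Unlike an RNN, there is no step-by-step dependence on a previous state, so the whole sequence can be processed in parallel during training.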
44. Reality
No labelled language data
Learn by extracting relationships from raw text
Embedding structures are similar across languages
Language-agnostic models → Yes!!
Cross-lingual embeddings
45. Beyond words…
How about machines that make machines?
[Diagram: four encoder-decoder combinations across English (en) and Spanish (es): en→en, en→es, es→es, es→en]
47. Huge models
• GPT-2 (OpenAI project)
• Training on larger datasets: books, web pages…
• Improves reading comprehension, translation, summarization and QA
The bigger they are, the harder they fall
49. Help me!!
Decision Making Support
HR and Hiring Processes
Test Grading
Law, Regulation and Compliance
Contract Analysis
50. Help me!!
Harder, Better, Faster, Stronger
Large and Multiple Documents
Multi-hop Reasoning
Contextualized Information in Dialogues