2018/10/21 Batch_normalization slides
http://127.0.0.1:8000/Batch_normalization.slides.html?print-pdf#/ 1/49
Batch normalization and its happy friends
杜岳華
Outline
Batch normalization (https://arxiv.org/abs/1502.03167)
Layer normalization (https://arxiv.org/abs/1607.06450)
Recurrent batch normalization (https://arxiv.org/abs/1603.09025)
Group normalization (https://arxiv.org/abs/1803.08494)
How does batch normalization help optimization? (https://arxiv.org/abs/1805.11604)
Effect
Improves accuracy
Faster learning
More stable training
Batch normalization [Google]
Problem: the distribution of each layer's input changes during training.
Solution: fix the distribution of inputs into a subnetwork.
Effect: enables higher learning rates and improves training efficiency.
Internal covariate shift (ICS)
Batch Normalization—What the hey? (https://gab41.lab41.org/batch-normalization-what-the-hey-d480039a9e3b)
Assumptions
h_1 = F_1(x; Θ_1)
⋮
h_i = F_i(h_{i−1}; Θ_i)
⋮
y = F_k(h_{k−1}; Θ_k)
Assumptions
h_1 = F_1(x; W_1, b_1) = f(W_1 x + b_1)
⋮
h_i = F_i(h_{i−1}; W_i, b_i) = f(W_i h_{i−1} + b_i)
⋮
y = F_k(h_{k−1}; W_k, b_k) = f(W_k h_{k−1} + b_k)
Batch normalization
Ideally, computing μ and σ over the whole training set would be best. In practice they are estimated per mini-batch of size m:

μ = (1/m) Σ_{i=1}^{m} x_i
σ² = (1/m) Σ_{i=1}^{m} (x_i − μ)²
x̂_i ← (x_i − μ) / √(σ² + ε)
y_i ← BN_{γ,β}(x̂_i) = γ x̂_i + β

γ and β are network parameters.
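The transform above can be sketched in a few lines of NumPy (a minimal sketch; the function name and the eps default are our own choices):

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Normalize a mini-batch x of shape (m, features), then scale and shift."""
    mu = x.mean(axis=0)                    # per-feature mini-batch mean
    var = x.var(axis=0)                    # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize
    return gamma * x_hat + beta            # learnable scale gamma and shift beta

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(64, 4))
y = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
# With gamma = 1 and beta = 0, each feature comes out roughly zero-mean, unit-variance.
```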
Batch normalization
ε is a constant added to the mini-batch variance for numerical stability
Each mini-batch produces estimates of the mean and variance
BatchNorm can be added before or after the activation function
Batch normalization
[Figure: BN placed after the activation: Wx+b → f → x_i → normalize to x̂_i → scale with γ, β → y_i → next Wx+b]
Batch normalization
Ensure the output statistics of a layer are fixed.
[Figure: BN placed before the activation: Wx+b → x_i → normalize and scale with γ, β → y_i → f → next Wx+b]
Testing
Ideal solution: compute μ and σ over the whole training set.
Practical solution: compute a moving average of μ and σ over mini-batches during training.
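The practical solution can be sketched as follows. The momentum value and the class name are our own choices; the slide only says "moving average":

```python
import numpy as np

class RunningStats:
    """Track exponential moving averages of mu and sigma^2 during training,
    for use at test time instead of per-batch statistics."""
    def __init__(self, num_features, momentum=0.9):
        self.momentum = momentum
        self.mean = np.zeros(num_features)
        self.var = np.ones(num_features)

    def update(self, batch):
        m = self.momentum
        self.mean = m * self.mean + (1 - m) * batch.mean(axis=0)
        self.var = m * self.var + (1 - m) * batch.var(axis=0)

    def normalize(self, x, gamma, beta, eps=1e-5):
        # Test time: use the running statistics, not the batch's own.
        return gamma * (x - self.mean) / np.sqrt(self.var + eps) + beta

rng = np.random.default_rng(1)
stats = RunningStats(3)
for _ in range(500):                              # simulate training batches
    stats.update(rng.normal(5.0, 2.0, size=(32, 3)))
test_x = rng.normal(5.0, 2.0, size=(8, 3))
out = stats.normalize(test_x, gamma=np.ones(3), beta=np.zeros(3))
```

After enough batches the running mean and variance approach the data's true statistics, so inference no longer depends on the composition of any particular batch.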
Pros and Cons
Advantages
Allows higher learning rates
Reduces the need for dropout and other regularization
Removes local response normalization
Acts as a regularizer
Disadvantages
Extra computation
Loses its effect with small batch sizes
Layer normalization [University of Toronto, G. Hinton]
Problem: BatchNorm depends on the batch, and it is not obvious how to apply it to RNNs.
RNNs process variable-length sequences
Hard to apply to online learning
Solution: normalize across the layer instead of the batch, and place it before the non-linearity.
Assumptions
Feed-forward neural network: the l-th hidden layer
h^{l+1} = f(W^l h^l + b^l)
Standard RNN: the t-th hidden state
h^{t+1} = f(W_h h^t + W_x x^t + b)
Layer normalization
Compute the layer normalization statistics over all the hidden units in the same layer.
The i-th hidden unit:

h_i^{l+1} = f(w_i^{lT} h^l + b_i^l)
⇒ a_i^l = w_i^{lT} h^l,  h_i^{l+1} = f(a_i^l + b_i^l)

μ^l = (1/H) Σ_{i=1}^{H} a_i^l
σ^l = √( (1/H) Σ_{i=1}^{H} (a_i^l − μ^l)² )

where H is the number of hidden units in the layer.
Layer normalization
All the hidden units in a layer share the same normalization terms
Different training cases have different normalization terms

LN_{g,b}(a^t) = g ⊙ (a^t − μ^t) / σ^t + b

g and b are network parameters.
⊙: Hadamard product (element-wise multiplication)
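A minimal sketch of the LN transform (the eps guard is our addition for numerical safety; the slide's formula divides by σ directly):

```python
import numpy as np

def layer_norm(a, g, b, eps=1e-5):
    """Normalize each training case over its hidden units (last axis),
    then apply the elementwise gain g and bias b."""
    mu = a.mean(axis=-1, keepdims=True)      # one mean per training case
    sigma = a.std(axis=-1, keepdims=True)    # one std per training case
    return g * (a - mu) / (sigma + eps) + b  # Hadamard product with g

a = np.array([[1.0, 2.0, 3.0, 4.0],
              [10.0, 10.0, 10.0, 10.0]])     # second case is constant
out = layer_norm(a, g=np.ones(4), b=np.zeros(4))
```

Note the contrast with batch norm: the two cases are normalized independently of each other, so no batch statistics (and no batch) are needed.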
Layer normalization
a^t = W_h h^{t−1} + W_x x^t
h^t = f(LN_{g,b}(a^t))
Some analyses
Compare the invariance between batch, weight and layer normalization
Geometry of parameter space during training
Makes learning more stable
Pros and Cons
Advantages
Faster convergence
Reduces the vanishing gradient problem
Disadvantages
Not well suited to CNNs
Recurrent batch normalization
Problem: batch normalization had only limited use in stacked RNNs
Solution: apply batch normalization to the hidden-to-hidden transition
Assumptions
input: x_{t−1}
hidden state: h_{t−1}, c_t
output: h_t

f_t = sigm(W_h h_{t−1} + W_x x_{t−1} + b)
i_t = sigm(W_h h_{t−1} + W_x x_{t−1} + b)
o_t = sigm(W_h h_{t−1} + W_x x_{t−1} + b)
g_t = tanh(W_h h_{t−1} + W_x x_{t−1} + b)

c_t = f_t ⊙ c_{t−1} + i_t ⊙ g_t
h_t = o_t ⊙ tanh(c_t)

Illustrated LSTM (https://brohrer.mcknote.com/zh-Hant/how_machine_learning_works/how_rnns_lstm_work.html)
Assumptions
[Figure: an unrolled LSTM; each unit takes x_t, c_{t−1} and h_{t−1}, computes gates F_t, I_t, O_t through σ, σ, tanh, σ blocks, and outputs c_t and h_t.]
Wiki (https://en.wikipedia.org/wiki/Recurrent_neural_network)
Recurrent batch normalization
BN_{γ,β}(x) = γ ⊙ (x − μ) / √(σ² + ε) + β

f_t = sigm(BN_{γ_h,β_h}(W_h h_{t−1}) + BN_{γ_x,β_x}(W_x x_{t−1}) + b)
i_t = sigm(BN_{γ_h,β_h}(W_h h_{t−1}) + BN_{γ_x,β_x}(W_x x_{t−1}) + b)
o_t = sigm(BN_{γ_h,β_h}(W_h h_{t−1}) + BN_{γ_x,β_x}(W_x x_{t−1}) + b)
g_t = tanh(BN_{γ_h,β_h}(W_h h_{t−1}) + BN_{γ_x,β_x}(W_x x_{t−1}) + b)

c_t = f_t ⊙ c_{t−1} + i_t ⊙ g_t
h_t = o_t ⊙ tanh(BN_{γ_c,β_c}(c_t))
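One such step can be sketched as below. This is a simplified sketch, not the paper's implementation: it packs the four gates into one weight matrix of our own choosing, and computes BN statistics from the current mini-batch only, whereas the paper keeps separate statistics per time step.

```python
import numpy as np

def bn(z, gamma, beta, eps=1e-5):
    """Mini-batch BN_{gamma,beta} over the batch axis, as on the previous slide."""
    mu = z.mean(axis=0)
    var = z.var(axis=0)
    return gamma * (z - mu) / np.sqrt(var + eps) + beta

def sigm(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step_bn(x, h_prev, c_prev, Wh, Wx, b, gh, bh, gx, bx, gc, bc):
    H = h_prev.shape[1]
    # BN is applied separately to the recurrent and input pre-activations,
    # each with its own (gamma, beta), before they are summed.
    pre = bn(h_prev @ Wh, gh, bh) + bn(x @ Wx, gx, bx) + b  # (batch, 4H)
    f = sigm(pre[:, :H])           # forget gate
    i = sigm(pre[:, H:2 * H])      # input gate
    o = sigm(pre[:, 2 * H:3 * H])  # output gate
    g = np.tanh(pre[:, 3 * H:])    # candidate cell
    c = f * c_prev + i * g
    h = o * np.tanh(bn(c, gc, bc))  # BN on the cell state as well
    return h, c

rng = np.random.default_rng(2)
B, D, H = 16, 5, 8
x = rng.normal(size=(B, D))
h0 = np.zeros((B, H)); c0 = np.zeros((B, H))
Wh = 0.1 * rng.normal(size=(H, 4 * H))
Wx = 0.1 * rng.normal(size=(D, 4 * H))
h1, c1 = lstm_step_bn(x, h0, c0, Wh, Wx, np.zeros(4 * H),
                      np.ones(4 * H), np.zeros(4 * H),
                      np.ones(4 * H), np.zeros(4 * H),
                      np.ones(H), np.zeros(H))
```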
Group normalization [Facebook AI Research]
Problem: BatchNorm's error increases rapidly when the batch size decreases
Computer vision tasks often require small batches, constrained by memory consumption
Solution: divide the channels into groups and compute within each group the mean and variance for normalization
[Figure: error (%) vs. batch size per worker (32, 16, 8, 4, 2) for Batch Norm and Group Norm.]
Group normalization
[Figure: the four normalization schemes on an (N, C, H, W) tensor. Batch Norm normalizes over (N, H, W) per channel; Layer Norm over (C, H, W) per sample; Instance Norm over (H, W) per sample and channel; Group Norm over (H, W) and a group of channels per sample.]
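The group scheme in the rightmost panel can be sketched as follows (function and parameter names are our own):

```python
import numpy as np

def group_norm(x, num_groups, gamma, beta, eps=1e-5):
    """GroupNorm sketch for NCHW tensors: split the C channels into groups
    and normalize over (channels-in-group, H, W) per sample, so the result
    does not depend on the batch size N."""
    N, C, H, W = x.shape
    g = x.reshape(N, num_groups, C // num_groups, H, W)
    mu = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    x_hat = ((g - mu) / np.sqrt(var + eps)).reshape(N, C, H, W)
    # Per-channel affine parameters, as in the other normalization schemes.
    return gamma.reshape(1, C, 1, 1) * x_hat + beta.reshape(1, C, 1, 1)

rng = np.random.default_rng(3)
x = rng.normal(2.0, 3.0, size=(2, 8, 4, 4))
y = group_norm(x, num_groups=4, gamma=np.ones(8), beta=np.zeros(8))
# Statistics are computed per (sample, group), never across the batch,
# which is why the method is insensitive to batch size.
```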
Group normalization
The conclusions are much the same as before, so I won't go through them.
How does batch normalization help optimization? [MIT]
No, it is not about internal covariate shift!
It makes the optimization landscape significantly smoother.
Investigate the connection between ICS and BatchNorm
VGG on CIFAR-10, with and without BatchNorm
Dramatic improvement both in terms of optimization and generalization
Difference in distribution stability
[Figure: training and test accuracy (%) over 15k steps, Standard vs. Standard + BatchNorm at LR = 0.1 and LR = 0.5, plus activation distributions at layers #3 and #11.]
Questions
1. Is the effectiveness of BatchNorm indeed related to internal covariate shift?
2. Is BatchNorm's stabilization of layer input distributions even effective in reducing ICS?
Does BatchNorm's performance stem from controlling ICS?
Train the network with random noise injected after BatchNorm layers.
The noise perturbs each activation of each sample in the batch, using i.i.d. noise with non-zero mean and non-unit variance.
The noise distribution changes at each time step.
[Figure: training accuracy over 15k steps for Standard, Standard + BatchNorm, and Standard + "noisy" BatchNorm, plus activation distributions at layers #2, #9 and #13.]
Is BatchNorm reducing ICS?
Is there a broader notion of ICS that has such a direct link to training performance?
Attempt to capture ICS from a perspective more tied to the underlying optimization phenomenon.
Measure the difference between the gradients of each layer before and after updates to all the previous layers.
Is BatchNorm reducing ICS?
Def. internal covariate shift (ICS) as

‖G_{t,i} − G′_{t,i}‖_2

G_{t,i} = ∇L(W_1^{(t)}, …, W_i^{(t)}, …, W_k^{(t)}; x^{(t)}, y^{(t)})
G′_{t,i} = ∇L(W_1^{(t+1)}, …, W_i^{(t+1)}, …, W_k^{(t+1)}; x^{(t)}, y^{(t)})

G_{t,i} corresponds to the gradient of the layer parameters
G′_{t,i} is the same gradient after all the previous layers have been updated
Reflects the change in the optimization landscape of W_i caused by the changes of its input
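The definition can be made concrete on a toy two-layer linear network. Everything below (shapes, the squared loss, the learning rate) is an illustrative choice of ours, not the paper's setup; following the slide's prose, only the earlier layer is updated before the later layer's gradient is re-measured on the same batch.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=(3, 32))          # 3 input features, 32 samples
y = rng.normal(size=(2, 32))          # 2 targets
W1 = 0.5 * rng.normal(size=(4, 3))    # earlier layer
W2 = 0.5 * rng.normal(size=(2, 4))    # later layer

def grad_W2(W1, W2):
    # Gradient of the mean squared loss w.r.t. the later layer's weights.
    h = W1 @ x
    e = W2 @ h - y
    return e @ h.T / x.shape[1]

def grad_W1(W1, W2):
    # Gradient w.r.t. the earlier layer's weights.
    h = W1 @ x
    e = W2 @ h - y
    return W2.T @ e @ x.T / x.shape[1]

lr = 0.1
G = grad_W2(W1, W2)                   # G_{t,i}: gradient before the update
W1_after = W1 - lr * grad_W1(W1, W2)  # update the earlier layer only
G_prime = grad_W2(W1_after, W2)       # G'_{t,i}: same batch, after the update
ics = np.linalg.norm(G - G_prime)     # the ||G_{t,i} - G'_{t,i}||_2 measure
```

A nonzero value means the earlier layer's update shifted the landscape that the later layer's gradient was computed on; the following slides measure this quantity with and without BatchNorm.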
Is BatchNorm reducing ICS?
[Figure (a), VGG: training accuracy at LR = 0.1 and LR = 0.01, Standard vs. Standard + BatchNorm, together with the ℓ2-difference and cosine angle between G and G′ at layers #5 and #10 over 15k steps.]
Is BatchNorm reducing ICS?
[Figure (b), deep linear network: training loss at LR = 1e−06 and LR = 1e−07, Standard vs. Standard + BatchNorm, together with the ℓ2-difference and cosine angle between G and G′ at layers #9 and #17 over 10k steps.]
Is BatchNorm reducing ICS?
Models with BatchNorm have similar, or even worse, ICS
G_{t,i} and G′_{t,i} are almost uncorrelated
Controlling the distributions of layer inputs might not even reduce ICS
Why does BatchNorm work?
Is there a more fundamental phenomenon at play here?
It reparametrizes the underlying optimization problem to make its landscape significantly smoother.
Landscape smoothness
The loss changes at a smaller rate, and the magnitudes of the gradients are smaller too.
[Figure: (a) loss landscape, (b) "effective" β-smoothness, (c) gradient predictiveness, over 15k steps, Standard vs. Standard + BatchNorm.]
Lipschitzness
Lipschitz continuity
A function f : X → Y is Lipschitz continuous
⇔ ∃K ≥ 0, ∀x_1, x_2 ∈ X, |f(x_1) − f(x_2)| ≤ K|x_1 − x_2|
K is a Lipschitz constant; the smallest such K is the (best) Lipschitz constant.
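The definition is easy to check numerically. Here we spot-check it for the logistic sigmoid, whose derivative never exceeds 1/4, so K = 0.25 is a valid (indeed the best) Lipschitz constant; this is a sanity check on random pairs, not a proof:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Verify |f(x1) - f(x2)| <= K |x1 - x2| on many random pairs.
rng = np.random.default_rng(5)
x1 = rng.uniform(-10, 10, size=10_000)
x2 = rng.uniform(-10, 10, size=10_000)
K = 0.25
# Tiny slack absorbs floating-point rounding when x1 and x2 are very close.
holds = np.abs(sigmoid(x1) - sigmoid(x2)) <= K * np.abs(x1 - x2) + 1e-12
```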
β-smoothness
A function f is β-smooth
⇔ ∇f is β-Lipschitz
⇔ ‖∇f(x_1) − ∇f(x_2)‖ ≤ β‖x_1 − x_2‖
The optimization landscape
[Figure: (a) loss landscape, (b) "effective" β-smoothness, (c) gradient predictiveness, over 15k steps, Standard vs. Standard + BatchNorm.]
The optimization landscape
Improves the Lipschitzness of the loss function
BatchNorm's reparametrization leads to gradients of the loss being more Lipschitz too
The loss exhibits a significantly better "effective" β-smoothness
Makes the gradients more reliable and predictive
Theoretical analysis
Skipped here.
Is BatchNorm the best (only?) way to smoothen the landscape?
Is this smoothening effect a unique feature of BatchNorm?
Study schemes that fix the first-order moment of the activations, as BatchNorm does, and normalize them by the average of their L_p norm: the L1, L2 and L∞ norms.
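Such an L_p scheme can be sketched as below. The exact scaling the paper uses is not reproduced here; centering plus dividing by a batch-average L_p magnitude is our reading of the slide's description.

```python
import numpy as np

def lp_normalize(x, p, eps=1e-5):
    """Center each unit over the mini-batch (fixing the first-order moment),
    then divide by the mini-batch average L_p magnitude of the activations
    instead of their standard deviation."""
    centered = x - x.mean(axis=0)
    if np.isinf(p):
        scale = np.abs(centered).max(axis=0)  # L_inf: per-unit max magnitude
    else:
        scale = (np.abs(centered) ** p).mean(axis=0) ** (1.0 / p)
    return centered / (scale + eps)

rng = np.random.default_rng(6)
x = rng.normal(4.0, 3.0, size=(128, 5))
y1, y2, yinf = (lp_normalize(x, p) for p in (1, 2, np.inf))
# p = 2 recovers ordinary standardization (division by the batch std).
```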
Is BatchNorm the best (only?) way to smoothen the landscape?
[Figure: (a) VGG training accuracy and (b) deep linear model training loss over 10k steps for Standard, Standard + BatchNorm, and Standard + L1/L2/L∞ normalization.]
Is BatchNorm the best (only?) way to smoothen the landscape?
[Figure: activation distributions at layer #11 for Standard, Standard + BatchNorm, and Standard + L1/L2/L∞ norm.]
Is BatchNorm the best (only?) way to smoothen the landscape?
All the normalization strategies offer comparable performance to BatchNorm
For the deep linear network, L1 normalization performs even better than BatchNorm
L_p normalization leads to larger distributional covariate shift than the vanilla network, yet still yields improved optimization performance
Conclusion
BatchNorm might not even be reducing internal covariate shift.
BatchNorm makes the landscape of the corresponding optimization problem significantly smoother.
Provides empirical demonstration and theoretical justification (Lipschitzness).
The smoothening effect is not uniquely tied to BatchNorm.
Q & A
Extra papers
Understanding Batch Normalization (https://arxiv.org/abs/1806.02375)
Norm matters: efficient and accurate normalization schemes in deep networks (https://arxiv.org/abs/1803.01814)
Batch-normalized Recurrent Highway Networks (https://arxiv.org/abs/1809.10271)
Differentiable Learning-to-Normalize via Switchable Normalization (https://arxiv.org/abs/1806.10779)

Más contenido relacionado

Similar a Batch Normalization Slides Explained Key Concepts

Question 1 of 205.0 PointsGiven the size of the product, an ai.docx
Question 1 of 205.0 PointsGiven the size of the product, an ai.docxQuestion 1 of 205.0 PointsGiven the size of the product, an ai.docx
Question 1 of 205.0 PointsGiven the size of the product, an ai.docxIRESH3
 
A Radical Challenge in Reliability Dynamic Life Test.pdf; Burn-In program Con...
A Radical Challenge in Reliability Dynamic Life Test.pdf; Burn-In program Con...A Radical Challenge in Reliability Dynamic Life Test.pdf; Burn-In program Con...
A Radical Challenge in Reliability Dynamic Life Test.pdf; Burn-In program Con...Phillip Chan
 
WeightWatcher Update: January 2021
WeightWatcher Update:  January 2021WeightWatcher Update:  January 2021
WeightWatcher Update: January 2021Charles Martin
 
IE431 Final Presentation
IE431 Final PresentationIE431 Final Presentation
IE431 Final PresentationArnaud Marquet
 
Javier Garcia - Verdugo Sanchez - Six Sigma Training - W1 Process Capability
Javier Garcia - Verdugo Sanchez - Six Sigma Training - W1 Process CapabilityJavier Garcia - Verdugo Sanchez - Six Sigma Training - W1 Process Capability
Javier Garcia - Verdugo Sanchez - Six Sigma Training - W1 Process CapabilityJ. García - Verdugo
 
Scaling machine learning to millions of users with Apache Beam
Scaling machine learning to millions of users with Apache BeamScaling machine learning to millions of users with Apache Beam
Scaling machine learning to millions of users with Apache BeamTatiana Al-Chueyr
 
Module-5 Integration testing and System Testing.pdf
Module-5 Integration testing and System Testing.pdfModule-5 Integration testing and System Testing.pdf
Module-5 Integration testing and System Testing.pdfSATHYANARAYANAKB
 
Computer-Aided Assembly Planning
Computer-Aided Assembly PlanningComputer-Aided Assembly Planning
Computer-Aided Assembly PlanningRichard Farr
 
Mastering the C_S4CMA_2308 Certification Exam Your Path to IT Excellence
Mastering the C_S4CMA_2308 Certification Exam Your Path to IT ExcellenceMastering the C_S4CMA_2308 Certification Exam Your Path to IT Excellence
Mastering the C_S4CMA_2308 Certification Exam Your Path to IT ExcellenceAliza Oscar
 
ITG-Nov15-MgmtBrief-Cost-Benefit-Comparison-IBM-VMware
ITG-Nov15-MgmtBrief-Cost-Benefit-Comparison-IBM-VMwareITG-Nov15-MgmtBrief-Cost-Benefit-Comparison-IBM-VMware
ITG-Nov15-MgmtBrief-Cost-Benefit-Comparison-IBM-VMwareMichael Martin
 
System Kaizen Process Feb 6 2011
System Kaizen Process Feb 6 2011System Kaizen Process Feb 6 2011
System Kaizen Process Feb 6 2011ExerciseLeanLLC
 
Using error stack and error dt ps in sap bi 7.0
Using error stack and error dt ps in sap bi 7.0Using error stack and error dt ps in sap bi 7.0
Using error stack and error dt ps in sap bi 7.0gireesho
 
Precomputing recommendations with Apache Beam
Precomputing recommendations with Apache BeamPrecomputing recommendations with Apache Beam
Precomputing recommendations with Apache BeamTatiana Al-Chueyr
 
Lead Time Reduction in Manufacturing Process of CNC Machines by Lean Principles
Lead Time Reduction in Manufacturing Process of CNC Machines by Lean PrinciplesLead Time Reduction in Manufacturing Process of CNC Machines by Lean Principles
Lead Time Reduction in Manufacturing Process of CNC Machines by Lean PrinciplesIRJET Journal
 
Production model lifecycle management 2016 09
Production model lifecycle management 2016 09Production model lifecycle management 2016 09
Production model lifecycle management 2016 09Greg Makowski
 
FixMatch:simplifying semi supervised learning with consistency and confidence
FixMatch:simplifying semi supervised learning with consistency and confidenceFixMatch:simplifying semi supervised learning with consistency and confidence
FixMatch:simplifying semi supervised learning with consistency and confidenceLEE HOSEONG
 

Similar a Batch Normalization Slides Explained Key Concepts (20)

4 h0 002
4 h0 0024 h0 002
4 h0 002
 
Question 1 of 205.0 PointsGiven the size of the product, an ai.docx
Question 1 of 205.0 PointsGiven the size of the product, an ai.docxQuestion 1 of 205.0 PointsGiven the size of the product, an ai.docx
Question 1 of 205.0 PointsGiven the size of the product, an ai.docx
 
000 237
000 237000 237
000 237
 
A Radical Challenge in Reliability Dynamic Life Test.pdf; Burn-In program Con...
A Radical Challenge in Reliability Dynamic Life Test.pdf; Burn-In program Con...A Radical Challenge in Reliability Dynamic Life Test.pdf; Burn-In program Con...
A Radical Challenge in Reliability Dynamic Life Test.pdf; Burn-In program Con...
 
WeightWatcher Update: January 2021
WeightWatcher Update:  January 2021WeightWatcher Update:  January 2021
WeightWatcher Update: January 2021
 
IE431 Final Presentation
IE431 Final PresentationIE431 Final Presentation
IE431 Final Presentation
 
Javier Garcia - Verdugo Sanchez - Six Sigma Training - W1 Process Capability
Javier Garcia - Verdugo Sanchez - Six Sigma Training - W1 Process CapabilityJavier Garcia - Verdugo Sanchez - Six Sigma Training - W1 Process Capability
Javier Garcia - Verdugo Sanchez - Six Sigma Training - W1 Process Capability
 
Scaling machine learning to millions of users with Apache Beam
Scaling machine learning to millions of users with Apache BeamScaling machine learning to millions of users with Apache Beam
Scaling machine learning to millions of users with Apache Beam
 
Module-5 Integration testing and System Testing.pdf
Module-5 Integration testing and System Testing.pdfModule-5 Integration testing and System Testing.pdf
Module-5 Integration testing and System Testing.pdf
 
Computer-Aided Assembly Planning
Computer-Aided Assembly PlanningComputer-Aided Assembly Planning
Computer-Aided Assembly Planning
 
Apd and bpc
Apd and bpcApd and bpc
Apd and bpc
 
Wcre08.ppt
Wcre08.pptWcre08.ppt
Wcre08.ppt
 
Mastering the C_S4CMA_2308 Certification Exam Your Path to IT Excellence
Mastering the C_S4CMA_2308 Certification Exam Your Path to IT ExcellenceMastering the C_S4CMA_2308 Certification Exam Your Path to IT Excellence
Mastering the C_S4CMA_2308 Certification Exam Your Path to IT Excellence
 
ITG-Nov15-MgmtBrief-Cost-Benefit-Comparison-IBM-VMware
ITG-Nov15-MgmtBrief-Cost-Benefit-Comparison-IBM-VMwareITG-Nov15-MgmtBrief-Cost-Benefit-Comparison-IBM-VMware
ITG-Nov15-MgmtBrief-Cost-Benefit-Comparison-IBM-VMware
 
System Kaizen Process Feb 6 2011
System Kaizen Process Feb 6 2011System Kaizen Process Feb 6 2011
System Kaizen Process Feb 6 2011
 
Using error stack and error dt ps in sap bi 7.0
Using error stack and error dt ps in sap bi 7.0Using error stack and error dt ps in sap bi 7.0
Using error stack and error dt ps in sap bi 7.0
 
Precomputing recommendations with Apache Beam
Precomputing recommendations with Apache BeamPrecomputing recommendations with Apache Beam
Precomputing recommendations with Apache Beam
 
Lead Time Reduction in Manufacturing Process of CNC Machines by Lean Principles
Lead Time Reduction in Manufacturing Process of CNC Machines by Lean PrinciplesLead Time Reduction in Manufacturing Process of CNC Machines by Lean Principles
Lead Time Reduction in Manufacturing Process of CNC Machines by Lean Principles
 
Production model lifecycle management 2016 09
Production model lifecycle management 2016 09Production model lifecycle management 2016 09
Production model lifecycle management 2016 09
 
FixMatch:simplifying semi supervised learning with consistency and confidence
FixMatch:simplifying semi supervised learning with consistency and confidenceFixMatch:simplifying semi supervised learning with consistency and confidence
FixMatch:simplifying semi supervised learning with consistency and confidence
 

Más de 岳華 杜

[COSCUP 2023] 我的Julia軟體架構演進之旅
[COSCUP 2023] 我的Julia軟體架構演進之旅[COSCUP 2023] 我的Julia軟體架構演進之旅
[COSCUP 2023] 我的Julia軟體架構演進之旅岳華 杜
 
Julia: The language for future
Julia: The language for futureJulia: The language for future
Julia: The language for future岳華 杜
 
The Language for future-julia
The Language for future-juliaThe Language for future-julia
The Language for future-julia岳華 杜
 
20190907 Julia the language for future
20190907 Julia the language for future20190907 Julia the language for future
20190907 Julia the language for future岳華 杜
 
Metaprogramming in julia
Metaprogramming in juliaMetaprogramming in julia
Metaprogramming in julia岳華 杜
 
Introduction to julia
Introduction to juliaIntroduction to julia
Introduction to julia岳華 杜
 
自然語言處理概覽
自然語言處理概覽自然語言處理概覽
自然語言處理概覽岳華 杜
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning岳華 杜
 
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic SegmentationSemantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation岳華 杜
 
從 VAE 走向深度學習新理論
從 VAE 走向深度學習新理論從 VAE 走向深度學習新理論
從 VAE 走向深度學習新理論岳華 杜
 
COSCUP: Foreign Function Call in Julia
COSCUP: Foreign Function Call in JuliaCOSCUP: Foreign Function Call in Julia
COSCUP: Foreign Function Call in Julia岳華 杜
 
COSCUP: Metaprogramming in Julia
COSCUP: Metaprogramming in JuliaCOSCUP: Metaprogramming in Julia
COSCUP: Metaprogramming in Julia岳華 杜
 
COSCUP: Introduction to Julia
COSCUP: Introduction to JuliaCOSCUP: Introduction to Julia
COSCUP: Introduction to Julia岳華 杜
 
Introduction to Julia
Introduction to JuliaIntroduction to Julia
Introduction to Julia岳華 杜
 
20180506 Introduction to machine learning
20180506 Introduction to machine learning20180506 Introduction to machine learning
20180506 Introduction to machine learning岳華 杜
 
20171127 當julia遇上資料科學
20171127 當julia遇上資料科學20171127 當julia遇上資料科學
20171127 當julia遇上資料科學岳華 杜
 
20171117 oop and design patterns in julia
20171117 oop and design patterns in julia20171117 oop and design patterns in julia
20171117 oop and design patterns in julia岳華 杜
 
20171014 tips for manipulating filesystem in julia
20171014 tips for manipulating filesystem in julia20171014 tips for manipulating filesystem in julia
20171014 tips for manipulating filesystem in julia岳華 杜
 
20170807 julia的簡單而高效資料處理
20170807 julia的簡單而高效資料處理20170807 julia的簡單而高效資料處理
20170807 julia的簡單而高效資料處理岳華 杜
 
20170715 北Bio meetup
20170715 北Bio meetup20170715 北Bio meetup
20170715 北Bio meetup岳華 杜
 

Más de 岳華 杜 (20)

[COSCUP 2023] 我的Julia軟體架構演進之旅
[COSCUP 2023] 我的Julia軟體架構演進之旅[COSCUP 2023] 我的Julia軟體架構演進之旅
[COSCUP 2023] 我的Julia軟體架構演進之旅
 
Julia: The language for future
Julia: The language for futureJulia: The language for future
Julia: The language for future
 
The Language for future-julia
The Language for future-juliaThe Language for future-julia
The Language for future-julia
 
20190907 Julia the language for future
20190907 Julia the language for future20190907 Julia the language for future
20190907 Julia the language for future
 
Metaprogramming in julia
Metaprogramming in juliaMetaprogramming in julia
Metaprogramming in julia
 
Introduction to julia
Introduction to juliaIntroduction to julia
Introduction to julia
 
自然語言處理概覽
自然語言處理概覽自然語言處理概覽
自然語言處理概覽
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic SegmentationSemantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
 
從 VAE 走向深度學習新理論
從 VAE 走向深度學習新理論從 VAE 走向深度學習新理論
從 VAE 走向深度學習新理論
 
COSCUP: Foreign Function Call in Julia
COSCUP: Foreign Function Call in JuliaCOSCUP: Foreign Function Call in Julia
COSCUP: Foreign Function Call in Julia
 
COSCUP: Metaprogramming in Julia
COSCUP: Metaprogramming in JuliaCOSCUP: Metaprogramming in Julia
COSCUP: Metaprogramming in Julia
 
COSCUP: Introduction to Julia
COSCUP: Introduction to JuliaCOSCUP: Introduction to Julia
COSCUP: Introduction to Julia
 
Introduction to Julia
Introduction to JuliaIntroduction to Julia
Introduction to Julia
 
20180506 Introduction to machine learning
20180506 Introduction to machine learning20180506 Introduction to machine learning
20180506 Introduction to machine learning
 
20171127 當julia遇上資料科學
20171127 當julia遇上資料科學20171127 當julia遇上資料科學
20171127 當julia遇上資料科學
 
20171117 oop and design patterns in julia
20171117 oop and design patterns in julia20171117 oop and design patterns in julia
20171117 oop and design patterns in julia
 
20171014 tips for manipulating filesystem in julia
20171014 tips for manipulating filesystem in julia20171014 tips for manipulating filesystem in julia
20171014 tips for manipulating filesystem in julia
 
20170807 julia的簡單而高效資料處理
20170807 julia的簡單而高效資料處理20170807 julia的簡單而高效資料處理
20170807 julia的簡單而高效資料處理
 
20170715 北Bio meetup
20170715 北Bio meetup20170715 北Bio meetup
20170715 北Bio meetup
 