Novi Sad AI is the first AI community in Serbia, with the goal of democratizing AI knowledge. At our first event we talked about belief networks, deep learning, and more.
4. NOVI SAD APPLIED INTELLIGENCE COMMUNITY
City.AI Novi Sad is shaping up around a community whose goals are:
To help local actors efficiently develop the Serbian AI branch internationally
To work on applied AI challenges with local & global ecosystem actors
To democratize AI innovation and close the gap between technology and society
To train and challenge the local community
5. LEVERAGING THE POTENTIAL OF AI IN 40+ CITIES
AFRICA
Accra - Lagos
ASIA
Bangalore - Bangkok - Beirut - Chiang Mai - Hanoi - Hong Kong -
Jakarta - Johor Bahru - Karachi - Lahore - Manila - Pune - Seoul -
Singapore - Taipei
AUSTRALASIA
Wellington
EUROPE
Amsterdam - Berlin - Bratislava - Bristol - Brussels - Bucharest -
Budapest - Cambridge - Cluj - Cologne - Copenhagen - Hamburg -
Iasi - Krakow - Kyiv - London - Madrid - Munich - Novi Sad - Oxford
- Paris - Sofia - Stockholm - Stuttgart - Tallinn - Tirana - Valencia -
Valletta - Vienna - Vilnius
NORTH AMERICA
Austin - LA - New York - San Diego - San Francisco
SOUTH AMERICA
Bogota - La Paz - Sao Paulo
7. Our team
NOVI SAD AI TEAM
Jovan Stojanovic
Ambassador of Novi Sad-AI
Marko Jocic
Co-Ambassador of Novi Sad-AI
Jovana Miletic
Operations Manager of Novi Sad-AI
8. LESSONS LEARNED BY
Dejan Vukobratovic
PhD, Professor and Researcher, FTN
on Belief Networks
Ivan Peric
Deep Learning Engineer
on NLP in Fintech
9. Outline of the Talk
• Who are we?
– iCONIC Centre – Hotspot for massive communications and
information processing
• Our Focus?
– Large-scale networks for information acquisition and
processing (5G)
• Topic of This Talk?
– Probabilistic Graphical Models and Belief Propagation
– Applications and Connections (to Deep Learning?)
15. 3GPP Narrowband IoT (NB-IoT)
https://shop.sodaq.com/en/nb-iot-shield-deluxe.html
• Standardized within 3GPP Release 13 (Nov 2016)
• Already in testing/deployment at many mobile operators
• Low-Power WAN
• Alternative to LoRa, SigFox
• “Next big thing” for mobile operators
• Early market solutions
16. 5G Focus No. 2: Distributed Information Processing in
Mobile Edge Computing (MEC)
17. Example: Massive Data Acquisition and Distributed
Information Processing in Smart Grids
IEEE Communications Magazine, Vol. 55, No. 10, October 2017. (http://ieeexplore.ieee.org/document/8067687/)
19. Probabilistic Graphical Models
• Model dependencies between random variables of a large-scale system
[Factor graph figure: variable nodes x1 … xN connected to factor nodes fs1 … fsM]
21. Probabilistic Graphical Models
• We observe (measure) a subset of the random variables of a large-scale system
[Factor graph figure: variable nodes x1 … xN connected to factor nodes fs1 … fsM, with observed nodes highlighted]
22. Belief Propagation Algorithm
• Based on new evidence, we infer the values of all variables in the system
[Factor graph figure: variable nodes x1 … xN connected to factor nodes fs1 … fsM, with messages passing along the edges]
24. Belief Propagation Algorithm
• After BP converges, we obtain new beliefs about all system variables
[Factor graph figure: variable nodes x1 … xN connected to factor nodes fs1 … fsM, with updated marginal beliefs p(xN) at the nodes]
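The sum-product message passing sketched in these slides can be illustrated on a tiny chain-shaped factor graph. The factor tables and evidence below are made up for illustration; the point is that passing messages from the observed node toward x1 recovers exactly the brute-force marginal:

```python
import numpy as np

# Toy factor graph: a chain x1 - f1 - x2 - f2 - x3 over binary
# variables, with an evidence factor e(x3) on the observed node.
f1 = np.array([[0.9, 0.1],
               [0.2, 0.8]])          # f1(x1, x2)
f2 = np.array([[0.7, 0.3],
               [0.4, 0.6]])          # f2(x2, x3)
e3 = np.array([0.1, 0.9])            # evidence: x3 is probably 1

# Sum-product messages, passed from x3 towards x1.
m_f2_to_x2 = f2 @ e3                 # sum over x3 of f2(x2, x3) e(x3)
m_f1_to_x1 = f1 @ m_f2_to_x2         # sum over x2 of f1(x1, x2) m(x2)
belief_x1 = m_f1_to_x1 / m_f1_to_x1.sum()

# Brute-force marginal over the full joint, for comparison.
joint = f1[:, :, None] * f2[None, :, :] * e3[None, None, :]
brute = joint.sum(axis=(1, 2))
brute /= brute.sum()

assert np.allclose(belief_x1, brute)
```

On a tree-shaped graph like this chain, BP is exact; on graphs with cycles it is run iteratively ("loopy BP") and gives approximate beliefs.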
29. BP and Deep Learning
• https://sinews.siam.org/Details-Page/deep-deep-trouble
• Michael Elad (Technion) blog “Deep, deep trouble: Deep Learning’s
Impact on Image Processing, Mathematics and Humanity ”
– “Unfortunately, all of these great empirical achievements were obtained with hardly any theoretical understanding of the underlying paradigm. Moreover, the optimization employed in the learning process is highly non-convex and intractable from a theoretical viewpoint.”
– “Should we be happy about this trend? Well, if we are in the business of
solving practical problems, the answer must be positive. Right? Therefore, a
company seeking such a solution should be satisfied. But what about us
scientists?”
– “This is clearly not the school of research we have been taught, and not the
kind of science we want to practice. Should we insist on our more rigorous
ways, even at the cost of falling behind in terms of output quality?”
31. Thinking Fast and Slow
• Two cognitive mechanisms:
• FAST: akin to DNN
– Instinctive
– Instantaneous
– Computationally fast shortcut
• SLOW: akin to PGM and BP
– Thoughtful but slow
– Deep
– Natural
33. Application of Deep Learning to NLP tasks in commercial projects
IVAN PERIĆ, NOVI SAD 2018
34. Who am I?
AI/ML Engineer – Synechron Serbia, Novi Sad
Teaching Assistant – Chair of Informatics, Faculty of Technical Sciences, University of Novi Sad
35. What is Deep Learning?
Main goal – “completely” simulate the human brain
Mathematical models that approximate biological concepts, such as neurons
Very large networks of artificial neurons
A lot of computation
The result is a hierarchical feature extractor that can be used for classification, transformation into other feature sets, etc.
Well-known architectures:
Multilayer Perceptron Neural Networks – MLPs
Convolutional Neural Networks – CNNs
Recurrent Neural Networks – RNNs
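As a minimal illustration of the "hierarchical feature extractor" idea, here is a sketch of a two-layer MLP forward pass in plain NumPy. The layer sizes and weights are random placeholders, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    ez = np.exp(z)
    return ez / ez.sum(axis=-1, keepdims=True)

# Illustrative random weights for a 4 -> 8 -> 3 network.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def mlp(x):
    h = relu(x @ W1 + b1)        # hidden layer: extracted features
    return softmax(h @ W2 + b2)  # output layer: class probabilities

x = rng.normal(size=(2, 4))      # batch of two 4-dimensional inputs
probs = mlp(x)
assert probs.shape == (2, 3)
assert np.allclose(probs.sum(axis=1), 1.0)
```

Stacking more hidden layers gives the "deep" hierarchy: each layer re-represents the previous layer's features.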
36. Where are we now?
State-of-the-art deep artificial neural networks contain at most thousands of neurons
The biological human brain contains 15–30 billion neurons
37. Use Case - Automated Question Answering Engine
Terabytes (or even petabytes) of documents
Banking contracts
Financial reports
Financial news articles
Client emails
Answer questions that might have an answer in
these documents
No predefined question list
38. Automated Question Answering Engine – Idea
[Pipeline: (Question, Answer, Context) → Word Embedding → Bidirectional LSTM with Attention Mechanism → Answer]
39. Word Embedding
Words need to be encoded as numbers to be a valid input to an ANN
One-hot representation (high dimensionality, no semantics in word positions)
Word embeddings represent words in a dense vector space and add contextual similarity between words (word2vec, GloVe)
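The difference can be seen in a small sketch. The dense vectors below are invented for illustration; word2vec or GloVe would learn such vectors from a corpus:

```python
import numpy as np

vocab = ["bank", "finance", "banana"]

# One-hot: every pair of distinct words is equally (un)related.
one_hot = np.eye(len(vocab))

# Hypothetical dense embeddings, made up for illustration.
dense = np.array([[0.9, 0.8, 0.1],    # "bank"
                  [0.8, 0.9, 0.0],    # "finance"
                  [0.1, 0.0, 0.9]])   # "banana"

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cos(one_hot[0], one_hot[1]))   # 0.0 — one-hot carries no semantics
print(cos(dense[0], dense[1]))       # high — related financial terms
print(cos(dense[0], dense[2]))       # low — unrelated words
```

In the dense space, cosine similarity reflects meaning, which is exactly what the downstream question-answering model needs.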
40. Bidirectional LSTM (Long Short-Term Memory)
Unidirectional LSTMs capture dependencies in only one direction
Bidirectional LSTMs capture both past and future dependencies and work very well in tasks that benefit from this
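The bidirectional wiring can be sketched in a few lines of NumPy. A plain tanh RNN cell stands in for the LSTM cell here to keep the sketch short; the point is that the same sequence is processed forwards and backwards and the two state sequences are concatenated:

```python
import numpy as np

rng = np.random.default_rng(1)

# Random placeholder weights for a cell with 3 inputs, 4 hidden units.
Wx, Wh = rng.normal(size=(3, 4)), rng.normal(size=(4, 4))

def run(seq):
    """Run a simple tanh RNN over a sequence, returning all hidden states."""
    h = np.zeros(4)
    states = []
    for x in seq:
        h = np.tanh(x @ Wx + h @ Wh)
        states.append(h)
    return np.array(states)

seq = rng.normal(size=(5, 3))           # 5 time steps, 3 features each

fwd = run(seq)                          # past -> future
bwd = run(seq[::-1])[::-1]              # future -> past, re-aligned
states = np.concatenate([fwd, bwd], axis=1)

# Each time step now carries both past and future context.
assert states.shape == (5, 8)
```

A real implementation would use LSTM gates (input, forget, output) inside `run`, but the forward/backward concatenation is identical.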
41. Attention mechanism in LSTM
The attention mechanism lets the network decide which part of the input sequence it should focus on
The focused part is probably more important for the task the sequence model is being trained for than the rest of the sequence
In this case, the decoder output depends on a weighted combination of all the input states, not just the last state
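The weighted-combination idea can be sketched with simple dot-product attention. The states and query below are random placeholders for illustration (this is the generic mechanism, not the specific BiDAF variant used in the project):

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    ez = np.exp(z - z.max())
    return ez / ez.sum()

states = rng.normal(size=(6, 4))   # 6 encoder states, 4 dims each
query = rng.normal(size=4)         # current decoder state

# Dot-product scores: how relevant is each input state to the query?
scores = states @ query
weights = softmax(scores)          # attention weights, sum to 1

# The context is a weighted combination of ALL input states,
# not just the last one.
context = weights @ states
assert np.isclose(weights.sum(), 1.0)
assert context.shape == (4,)
```

The decoder then conditions its output on `context`, so distant but relevant parts of the input still influence the answer.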
42. Bidirectional Attention Flow mechanism
Machine Comprehension of textual data
Published as a conference paper at ICLR 2017
43. Constraints in Deep Learning application to commercial projects
Technical constraints in commercial use of Deep Learning:
Lack of any kind of data
Vast amounts of data, but unlabeled
Data is not open to the public
Poor infrastructure for model training
44. Constraints in Deep Learning application to commercial projects
Functional constraints in commercial use of Deep Learning:
Lack of understanding of the capabilities of Deep Learning
Black-box models
Hard or impossible to explain non-working cases to the client
Hard or impossible to explain working cases to the client
Traditional data science approaches based on statistical models, as well as conventional AI and ML reasoning (search, optimization algorithms, fuzzy logic, …), still look more acceptable to clients:
Usually easier to explain
They can work very well without large amounts of data