2. If you think the language
industry is new
haithem.afli@cit.ie 2
3. If you think the language
industry is new, think again!
Rosetta Stone (British Museum)
4. Natural Language:
An age-old industry?
§ For as far back as we can see, humans have needed to
communicate, so the origin of the language industry is closely
intertwined with the need for communication itself.
04/02/2020 haithem.afli@cit.ie 4
The Tower of Babel and The House of Wisdom in Baghdad (Bait-al-Hikma)
5. The importance of Language
Processing
07/02/2020 haithem.afli@cit.ie 5
Media agencies and translators interpreted the word as “treat with silent contempt” or “not take
into account” (to ignore), that is, as a categorical rejection by the Prime Minister.
The Americans understood that there would never be a diplomatic end to the war and
were naturally annoyed by what they considered the arrogant tone used in the Japanese
translation of the Prime Minister’s response. International news agencies reported to the
world that in the eyes of the Japanese government the ultimatum was “not worthy of
comment.”
8. The Rise of Natural Language Processing
(NLP), and How it is Changing the Way we
Retrieve Information
The 'creator' of Bitcoin, Satoshi Nakamoto, is
the world's most elusive billionaire. Very few
people outside of the Department of
Homeland Security know Satoshi's real
name. Satoshi has taken great care to keep
his identity secret, employing the latest
encryption and obfuscation methods in his
communications.
Despite these efforts, Satoshi Nakamoto gave
investigators the only tool they needed to find him:
his own words. Using NLP, the NSA (and anyone else!)
can compare texts to determine the authorship
of a particular work.
More info: https://tech.slashdot.org/story/17/08/28/1725232/how-the-nsa-identified-satoshi-nakamoto
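Comparing texts to determine authorship can be illustrated with a toy stylometry sketch. It is not the method used in the reported investigation: it assumes a tiny hand-picked function-word list (`FUNCTION_WORDS`), hypothetical helper names, and made-up sample texts, and it simply compares relative function-word frequencies with cosine similarity.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative feature set; real stylometry uses many more features.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "it", "but"]


def style_vector(text):
    """Relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_author(unknown_text, known_texts):
    """Return the candidate name whose known writing is stylistically closest."""
    target = style_vector(unknown_text)
    return max(known_texts, key=lambda name: cosine(target, style_vector(known_texts[name])))
```

Usage: pass the disputed text and a dictionary mapping candidate names to samples of their known writing; the function returns the closest candidate. With so few features the result is only indicative.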
9. Timeline of (modern) AI
Graph from The University Of Queensland Brain Institute
Graph annotations: the first AI winter, the second AI winter
Including CIT MSc in AI
https://www.cit.ie/course/CRKARIN9
10. The first AI winter
By 1964, the National Research Council (NRC)
had become concerned about the lack of progress
and formed the Automatic Language Processing
Advisory Committee (ALPAC) to look into the
problem.
They concluded, in a famous 1966 report, that
machine translation was more expensive, less
accurate and slower than human translation.
After spending some 20 million dollars, the NRC
ended all support.
Image from Wikipedia
11. The second AI winter
In 1984, John McCarthy criticized expert systems because they lacked common sense
and knowledge about their own limitations.
Schwarz, Director of DARPA ISTO from 1987 to 1989, concluded that AI research has
always had
“… very limited success in particular areas, followed immediately by failure to reach the
broader goal at which these initial successes seem at first to hint…”.
Ø Decrease in funding for AI research.
Ø Many AI companies closed their doors.
Ø Attendance at the AAAI conference, which attracted over 6000
visitors in 1986, quickly decreased to just 2000
by 1991.
12. The survivors
The Deep Learning Godfathers
Turing Award given for:
• “The conceptual and engineering breakthroughs that have made deep neural
networks a critical component of computing.”
14. 2014: Generative Adversarial
Networks
§ The neural network at the top is the discriminator, and its task is to distinguish the
training set’s real information from the generator’s creations.
§ In the simplest GAN structure, the generator starts with random data and learns to
transform this noise into information that matches the distribution of the real data.
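The interplay between the two networks described above is usually written as the minimax objective of the original 2014 GAN formulation. The symbols are notation introduced here, not taken from the slide: D is the discriminator, G the generator, p_data the distribution of the real data, and p_z the distribution of the random noise the generator starts from.

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to maximise V by scoring real samples high and generated samples low, while the generator tries to minimise it by producing samples the discriminator scores as real.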
15. Do you know this person?
https://thispersondoesnotexist.com/
19. DeepFake
§ The development of
deepfakes has taken place
to a large extent in two
settings: research at
academic institutions, and
development by amateurs
in online communities.
20. GAN
Applications of GANs
Ø GANs for Image Editing
Ø Using GANs for Security
(SSGAN: Secure Steganography Based on GAN)
Ø De-aging Robert De Niro!
(Martin Scorsese spent millions of Netflix's money to digitally de-age De Niro, Pacino,
and Pesci so they could portray these men throughout different parts of their lives.)
21. 2016: Sequence to Sequence
Learning with Attention
This mechanism allows the
network to refer back to the input
sequence, instead of forcing it to
encode all information into one
fixed-length vector
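A minimal sketch of how a decoder step can "refer back" to the input sequence. It assumes dot-product scoring for brevity (the original attention papers used other scoring functions), and the names `attend`, `query` and `encoder_states` are illustrative, not taken from the slides.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Return (context, weights) for one decoder step.

    query: the current decoder state (list of floats).
    encoder_states: one vector per input position, same length as query.
    """
    # Assumption: similarity is a plain dot product between query and each encoder state.
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context is a weighted average of ALL encoder states, recomputed at every
    # decoder step, so no single fixed-length summary of the input is needed.
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return context, weights
```

Calling `attend` with a different query at each output step yields a different set of weights, which is what lets the network focus on different input positions over time.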
30. Addressing the commonsense problem
Cunxiang Wang, Shuailong Liang, Yue Zhang, Xiaonan Li and Tian Gao. Does It Make Sense?
And Why? A Pilot Study for Sense Making and Explanation.
31. Addressing real-world challenges
§ AI Technologies
- Natural Language Processing (NLP)
- Social Media and UGC Analysis
- Computer Vision (CV)
- Machine/Deep Learning (ML-DL)
§ Applications
- Digital Humanities
- Fintech
- Digital Health and Life-science
- Social Science and Psychology
- Security and Cybersecurity
32. NLP and ML to Address the
European migration crisis
§ ITFLOWS will model migration to the EU in two stages:
The first stage comprises
migration flows from third
countries to the EU borders.
Within this first stage,
migration flows are broadly
differentiated into regular
and irregular flows. ITFLOWS
will focus on predicting
irregular flows at this stage,
as regular migration is
authorised and regulated by
the receiving countries, in
this case the EU member
states.
33. NLP and ML to Address the European
migration crisis
§ ITFLOWS will model migration to the EU in two stages:
The second stage of movement takes place between the crossing of the
borders into the EU and the final settlement of migrants
in the EU member states.
Ø Models for the accurate prediction of irregular migration flows from regions in five
countries of origin to the EU, and
Ø A holistic global model that will give predictions of the arrivals of irregular migrants
in all EU Member States.
36. Ethics and Data Privacy
§ The collection of tweets related to the countries of origin
will be based mainly on the language (and dialect) and an
estimated location. Taking Syrian users as an example,
ITFLOWS will focus on collecting public data from users
who write in Levantine Arabic (spoken in Lebanon, Jordan, Syria,
Palestine, and Israel) and who are located (based on
the Twitter API information) in at least the following
locations: https://data2.unhcr.org/en/situations/syria
§ Since the location is only approximated, there will be no
discrimination based on nationality in this task.
37. Ethics and Data Privacy
§ De-identification methods (Authorship Obfuscation) for
natural language processing tasks: multiple steps need to be
addressed. ITFLOWS technological partners (CIT and FIZ) will
extract identifiers from text, and they will anonymise the
data set used for NLP tasks. For example, all addresses,
names, and so on will be removed using a named entity
recogniser.
§ This practice will be conducted according to the EU data
protection laws and, from a technical point of view, it will
be based on Differential Privacy for text documents.
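The identifier-removal step can be sketched as follows. This is an illustration only, not the ITFLOWS pipeline: simple regular expressions and a caller-supplied name list stand in for a trained named entity recogniser, the placeholder labels are made up, and differential privacy is not implemented here.

```python
import re

# Assumption: regex patterns stand in for a trained named entity recogniser.
# EMAIL runs before HANDLE so that addresses are replaced as a whole.
PATTERNS = [
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("URL", re.compile(r"https?://\S+")),
    ("HANDLE", re.compile(r"(?<!\w)@\w+")),
]


def deidentify(text, known_names=()):
    """Replace direct identifiers in text with placeholder labels.

    known_names is a hypothetical list of person names that a real system
    would obtain from an NER model rather than from the caller.
    """
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    for name in known_names:
        text = re.sub(rf"\b{re.escape(name)}\b", "[NAME]", text, flags=re.IGNORECASE)
    return text
```

Replacing identifiers with typed placeholders rather than deleting them keeps the sentence structure intact for downstream NLP tasks.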
38. CIT team
CIT team received €528k in H2020 funding and will be led by:
Dr Haithem Afli, Computer Science Dep., RIOMH, ADAPT@CIT
Eileen Crowley, Halpin Centre for Research & Innovation
41. ML meets NLP to address Digital
Health challenges
The STOP project is addressing the
health societal challenge of
obesity through the foundation of
an innovative platform
to support Persons with Obesity
(PwO) with better nutrition under
the supervision of Healthcare
Professionals.
https://cordis.europa.eu/project/rcn/218245/factsheet/en
43. ML meets NLP to address Digital
Health challenges
The STOP Platform will capture
various PwO data from different kinds
of smart sensor streams and chatbot
technology, manage and enrich the
available data with existing
knowledge bases, and fuse these using
machine-learning-driven data fusion
approaches for sophisticated AI data
analysis.
https://cordis.europa.eu/project/rcn/218245/factsheet/en
45. CIT team
Yanxin Wu, PhD candidate in computer science
Ryan Donovan, PhD candidate in psychology
Dr Haithem Afli, Principal Investigator
48. ML meets CV to address the limitations of
current network infrastructures
[Figure: end-to-end eHealth network slice (E2E eHealth_slice: {type: eMBB}) for the vertical
National Ambulance Service, provided through a Digital Service Provider (DSP).
Network Service Provider (NSP) B: RAN, Core IP/MPLS, MEC, Core DC, EPC;
NSSI: core slice; NSSI: RAN slice; NSI2: [RAN, Core IP/MPLS] Network Slice]
https://slicenet.eu/
50. ML meets CV to address the limitations of
current network infrastructures
[Figure: end-to-end eHealth network slice (E2E eHealth_slice: {type: eMBB}) for the vertical
National Ambulance Service, provided through a Digital Service Provider (DSP).
Network Service Provider (NSP) A: RAN, EPC, MEC, Core DC, Core IP/MPLS;
NSSI: RAN slice; NSSI: MEC slice; NSSI: Core slice;
NSI1: [RAN + EPC + Core DC + Core IP/MPLS] Network Slice.
Network Service Provider (NSP) B: RAN, Core IP/MPLS, MEC, Core DC, EPC;
NSSI: core slice; NSSI: RAN slice; NSI2: [RAN, Core IP/MPLS] Network Slice.
QoE: Perceived SNR, RSRP and RSRQ measurements
…
Prediction: the signal quality will be degraded for the next 5 minutes.
One Stop API / P&P; Vertical feedback]
https://slicenet.eu/
51. ML meets DA to address
Microbiability challenges
Microbiability in Beef Cattle
Microbiome components: Archaea, Bacteria, Protozoa, Fungi
Variations in the microbiome can make better cattle:
- Feed and water efficiency
- Meat tenderness
- Environmental impact
52. ML meets DA to address
Microbiability challenges
- Investigate the relation between the microbiome
components.
- Investigate the impact of the microbiome
components on the cattle biology.
- Characterize the microbiome composition.
Samples: rumen and feces, N = 52 animals
- Several phenotypes measured.
- Microbial relative abundances.
- Nelore is the predominant breed in Brazil.
Dr Bruno Gabriel Andrade collecting samples...