All the information needed to find and stop bad actors from entering our financial system already exists and is available to you today; it’s just buried in terabits of messy, unstructured data all over the internet. For those performing investigations and evaluating risk, this needle-in-a-stack-of-needles problem is huge and growing: unstructured data already dominates the web (growing exponentially year over year), and the traditional technology these departments use cannot keep up. Recent developments in natural language processing (NLP), the field of AI that focuses on human language, have, for the first time, made it possible for automated systems to find and deliver identity-relevant intelligence hidden in unstructured textual data.
4. JIHADI BRIDES TRAGEDY
AI FOR GOOD ● BASIS TECHNOLOGY
Image Sources:
- Bethnal trio: Mirror
- Article: Independent
5. ALL THE EVIDENCE EXISTS
Scotland Yard
Report
ID
Social Activity
Image Sources:
- Tweet: ISD Global
6. WHAT’S AT STAKE
FINANCIAL STABILITY
Global Money Laundering Operations
1% of Illegal Funds Captured
PUBLIC SAFETY
Deaths from Terrorist Attacks in Europe
11,288 from 1970-2017
Sources:
- Terrorism: Washington Post
- Money laundering: Wall Street Journal
9. COMMON PATTERN
COLLECT ● EXTRACT ● COMBINE ● ANALYZE
80% of data is unstructured
1) Natural Language Processing Extracts Facts
2) Scored for confidence & relevance
Join Processed and Structured Data into Knowledge Graph
Mine Graph For Patterns & Changes
People, Organizations, Locations, Relationships
Searching, Alerting, Anomaly Detection, Reporting
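The four stages on this slide can be sketched end to end in a few lines. This is only a toy illustration under stated assumptions, not the speaker's or Basis Technology's implementation: the gazetteer `KNOWN_ENTITIES`, the fixed confidence score, the sample documents, and the function names `extract`, `combine`, and `analyze` are all hypothetical stand-ins for trained models and real data sources.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical gazetteer standing in for a trained entity extractor.
KNOWN_ENTITIES = {
    "Alice Smith": "PERSON",
    "Bob Jones": "PERSON",
    "Acme Holdings": "ORGANIZATION",
    "Zurich": "LOCATION",
}

def extract(document):
    """EXTRACT: return (name, type, confidence) for each known name found."""
    found = []
    for name, kind in KNOWN_ENTITIES.items():
        if name in document:
            # Toy confidence: a constant. A real system would model this.
            found.append((name, kind, 0.9))
    return found

def combine(documents, min_confidence=0.5):
    """COMBINE: count how often two entities are mentioned in the same document."""
    graph = defaultdict(int)  # (entity_a, entity_b) -> co-mention count
    for doc in documents:
        names = sorted(n for n, _, conf in extract(doc) if conf >= min_confidence)
        for a, b in combinations(names, 2):
            graph[(a, b)] += 1
    return dict(graph)

def analyze(graph, min_count=2):
    """ANALYZE: surface entity pairs that co-occur at least min_count times."""
    return sorted(pair for pair, count in graph.items() if count >= min_count)

# COLLECT: in this sketch the collected data is just a list of strings.
documents = [
    "Alice Smith met Bob Jones in Zurich.",
    "Acme Holdings hired Alice Smith.",
    "Bob Jones and Alice Smith left Zurich.",
]

graph = combine(documents)
alerts = analyze(graph)
print(alerts)
```

The edge weights here are plain co-mention counts; a production graph would more likely carry typed relationships and per-fact confidence, as the slide suggests.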
10. CHALLENGES AT EVERY LEVEL
COLLECT ● EXTRACT ● COMBINE ● ANALYZE
● Domains
● Languages
● Training Data
● Data Salad!
● Data Access
● Duplication
● Variation
● Ambiguity
● Semantics
● Honey Pots
● Training Data
● GIGO
● Data Overload
● Alert Bombs
● Privacy
● Trust
11. CHALLENGES AND ANTI-PATTERNS
Identifying Context
... government officials were convicted of corruption. ABC Company saw a drop in sales as …
1) Reliance on Keywords
2) Naive Rules
Leads to False Positives and False Negatives
COLLECT ● EXTRACT ● COMBINE ● ANALYZE
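The keyword anti-pattern on this slide can be made concrete with a small sketch. The sample text is adapted from the slide's snippet with the elided parts left out; the watch list `RISK_KEYWORDS` and both functions are hypothetical. The first rule flags any entity that shares a document with a risk keyword, which produces the false positive the slide warns about. The second requires the keyword and the entity to share a sentence, which is still a crude proxy for real context modeling.

```python
import re

# Hypothetical watch terms for this illustration.
RISK_KEYWORDS = {"corruption", "convicted"}

def keyword_flag(text, entity):
    """Naive rule: flag the entity if any risk keyword appears anywhere in the text."""
    lowered = text.lower()
    return entity.lower() in lowered and any(k in lowered for k in RISK_KEYWORDS)

def sentence_flag(text, entity):
    """Slightly less naive: require the keyword and the entity in the same sentence."""
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        if entity.lower() in lowered and any(k in lowered for k in RISK_KEYWORDS):
            return True
    return False

text = ("Government officials were convicted of corruption. "
        "ABC Company saw a drop in sales.")

print(keyword_flag(text, "ABC Company"))    # flagged, although the company is unrelated
print(sentence_flag(text, "ABC Company"))   # not flagged
print(sentence_flag(text, "Government officials"))  # flagged
```

Sentence boundaries alone will still miss pronouns, negation, and multi-sentence context, so this narrows the problem rather than solving it.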
12. CHALLENGES AND ANTI-PATTERNS
Identifying Proper Names
3) Name Variants
4) Name Parts (common keys)
Leads to False Positives and False Negatives
abdul rashid, abdal rashide, abdal-rasheed, abdul-rashiyd, abdul-rachid, abd-errshiyd, abd-errchide, abd-errcheed ➔ Abdul-Rasheed
COLLECT ● EXTRACT ● COMBINE ● ANALYZE
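A short sketch shows why both failure modes on this slide appear. Exact string comparison misses every listed variant of Abdul-Rasheed (false negatives). A crude "common key", here a hypothetical consonant skeleton invented for this example, catches all of them but also collides with an unrelated name (a false positive). The `consonant_key` rules and the name "Bader Shad" are assumptions made purely for illustration; this is not the matching method used by any particular product.

```python
import re

VARIANTS = [
    "abdul rashid", "abdal rashide", "abdal-rasheed", "abdul-rashiyd",
    "abdul-rachid", "abd-errshiyd", "abd-errchide", "abd-errcheed",
]
TARGET = "Abdul-Rasheed"

def exact_match(a, b):
    """Case-insensitive exact comparison."""
    return a.lower() == b.lower()

def consonant_key(name):
    """Toy common key: a consonant skeleton with a few ad hoc rules."""
    s = re.sub(r"[^a-z]", "", name.lower())  # keep letters only
    s = s.replace("ch", "sh")                # treat "ch" and "sh" spellings alike
    s = re.sub(r"[aeiouy]", "", s)           # drop vowels and y
    s = s.replace("lr", "r")                 # rough stand-in for "al-r" vs "er-r"
    s = re.sub(r"(.)\1+", r"\1", s)          # collapse doubled letters
    return s

exact_hits = [v for v in VARIANTS if exact_match(v, TARGET)]
key_hits = [v for v in VARIANTS if consonant_key(v) == consonant_key(TARGET)]

print(len(exact_hits), "of", len(VARIANTS), "variants found by exact match")
print(len(key_hits), "of", len(VARIANTS), "variants found by common key")

# The same key also matches an unrelated (hypothetical) name.
print(consonant_key("Bader Shad") == consonant_key(TARGET))
```

The trade-off is visible in the last line: loosening the key enough to cover the spelling variants also admits names that merely share a consonant pattern, which is why scored, language-aware matching is usually preferred over a single key.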
14. Challenges & Anti-patterns
3) Failure to match variants
4) Failure to disambiguate
5) Failure to model what matters
6) Monolingual design
“Operation Hairball”
COLLECT ● EXTRACT ● COMBINE ● ANALYZE
17. CROSS-LINGUAL SEMANTIC MODELING
Machine Learning
למידה חישובית
計算学習
AI
Eagle Pharmaceuticals Inc.
Eagle Drugs, Co.
Tesla
טסלה
تيسلا موتورز
Energy Storage
אחסון אנרגיה
18. AI BUILDING BLOCKS: Algorithms & High Quality Data
COLLECT ● EXTRACT ● COMBINE ● ANALYZE
● NN NER
● NN CLASS
● NN RELAX
● SVM
● TEXT EMBEDDINGS
● NNs
● NL SEARCH
● CLASSIC ML
● ANOMALY DETECTION
● HMM
● SEMANTIC MODELING
● GRAPH SIMILARITY
● Data Filtering
● Classification
● Deduplication
● High Quality Annotations
● Language & domain combos
● Active Learning Feedback
● High Quality Name Pairs in every language pair
● Confidence Modeling
● Semantic Model
● Baseline “normal”
● Queries
● Visualizations
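The "Baseline normal" and "Anomaly detection" items can be illustrated with a minimal sketch: compare a new observation against the mean and standard deviation of a baseline window and flag large deviations. The z-score rule, the threshold of 3.0, the function name `is_anomalous`, and the sample counts are all assumptions chosen for this example; the slide does not say which anomaly detection method is used.

```python
from statistics import mean, stdev

def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it lies more than `threshold` sample standard
    deviations away from the mean of the baseline `history`."""
    if len(history) < 2:
        return False  # not enough data to define "normal"
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold

# Hypothetical daily counts of mentions linking two entities.
baseline = [4, 5, 6, 5, 4, 6, 5]

print(is_anomalous(baseline, 5))    # within the normal range
print(is_anomalous(baseline, 12))   # far outside the baseline
```

A fixed threshold like this is also one source of the "alert bombs" mentioned earlier in the deck: set it too low and analysts are flooded, too high and real changes are missed.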
19. PUTTING IT ALL TOGETHER
COLLECT ● EXTRACT ● COMBINE ● ANALYZE
People, Organizations, Locations, Relationships
Searching, Alerting, Anomaly Detection, Reporting
20. THIS TECHNOLOGY IS ALREADY AT WORK
21. CAPTURING EL CHAPO
Source: U.S. Immigration and Customs Enforcement
22. KEY DOMAINS OF IMPACT
National Security ● Financial Services ● Law Enforcement ● Intelligence
23. THANK YOU
Gil Irizarry
● Basis Technology
● I engineer NLP / NLU tech for good
● Please reach out!
@conoagil
In this talk, I will share some of the common patterns, common mistakes, and opportunities that I see in the field.
These innovations in NLP unlock a new world of actionable insight, providing much-needed ammunition in the fight against fraud, money laundering, financial crime, and terrorism.
As we all know, you can’t take liquids or gels onto commercial flights.
But most people don’t know the events that led up to that regulation.
USA “3-1-1 Liquids Rule” (source)
Each passenger may carry liquids, gels and aerosols in travel-size containers that are 3.4 ounces or 100 milliliters. Each passenger is limited to one quart-size bag of liquids, gels and aerosols. Common travel items that must comply with the 3-1-1 liquids rule include toothpaste, shampoo, conditioner, mouthwash and lotion.
German Rule (source)
Containers holding liquids may not be larger than 100 ml, otherwise you may not carry them in your hand luggage. All such containers must be placed in a transparent, reclosable plastic bag with a capacity of no more than one liter (for example, an ordinary freezer bag with zipper). The bag may contain any number of containers as long as it is still possible to completely close it. Please remember: Each passenger may only take one such bag on board the plane.
PUNCHLINE
In August of 2006, seven aircraft did not explode during their flight over the Atlantic. Instead of plunging into the ocean, they landed on runways—a happy ending to what would have been a human tragedy had law enforcement not been tipped off by some carefully crafted AI.
The sad part is that this could have been avoided.
The data exists; the crime (which was effectively manipulating and kidnapping a minor) could have been prevented.
For US audience:
US Homeland Attacks (2015-2018)
64 plots disrupted
21 plots executed
The data and technology exist to make the world a safer place, and they have already begun to make an impact.
A more recent example of NLP being used to analyze documents for national security is the 2016 capture and subsequent conviction of El Chapo. To find him, intel officers analyzed email, phone, and SMS communications (SIGINT) from El Chapo’s network. They used semantic technology to examine the content of his and his network’s conversations, determine who was talking about drugs, and link the people mentioned in the text into a network. This led to the identification of the people involved with El Chapo, allowing agencies to find his location and capture him.