SlideShare a Scribd company logo
1 of 22
Download to read offline
Apply Chinese Radicals Into
Neural Machine Translation:
Deeper Than Character Level
Lifeng Han

lifeng.han@adaptcentre.ie

https://github.com/poethan

ADAPT, Dublin City University

Limerick, Ireland, May 24

LPRC 2018: Limerick Postgraduate Research Conference
Agenda
• Myself

• Topic intro

• Related work

• Proposed idea/model

• Experiments design 

• Evaluation results

• Future work
•collaborative work
with Shaohui Kuang, Soochow
Uni. NLP lab
Myself
• PhD student, ADAPT Centre, DCU, Dublin, 2016 on

• Student Researcher, Amsterdam, 2014-16

• Master degree of Sci., Macau, 2011-14

• Bachelor of Maths, Shijiazhuang, 2007-11

• Primary ~ high school, Handan

• No kindergarten
Intro
• Machine Translation

• - what I’ m doing. Translate human languages via
Machine.

• Natural Language Processing

• - different processing tasks of human languages

• Artificial Intelligence

• - teach machine to perform human intelligences
MT-NLP-AI
From: vikingsna.org
Related work
• Machine Translation: Rule to Neural

• - rule, example-based, statistical, phrase-based, hierarchical
structure, tree-best, forest, neural models

• Neural MT, sequence to sequence, attention, coverage

• - word embeddings, sequence to sequence encoding-decoding,
attention, coverage, document/discourse level

• Chinese NLP, radical applications

• - Word Segmentation, Entity recognition, MT, Sentiment Analysis,
text mining
Chinese radical: example
Chinese radical: example
Proposed Model
• Apply Chinese Radical into Translation

• - how to apply radicals into MT

• - how to split character into radicals

• Combine radical-level MT with Neural Model

• - attention-based Neural MT

• - radical combination into input data
Combinations
• Details of setup=>
• Lifeng Han and Shaohui Kuang. 2018. Apply Chinese radicals into neural machine translation:
Deeper than character level. ArXiv pre-print https://arxiv.org/abs/1805.01565v1
Experiments
• Attention Neural MT

• Word+Character+Radical

• Word+Character

• Character+Radical

• Word+Radical

• Data prepration

• Training: 1.25 million parallel Chinese-English sentences / 80.9 millions Chinese
words and 86.4 millions English

• Development / testing: NIST06/NIST08 (National Institute for Standards and
Technology, USA)
Settings
Evaluation
• Broader Evaluation Metrics

• hLEPOR, BEER, CharacTER -> BLEU, NIST

• Evaluation Scores

• - in-depth analysis
MT evaluation metric LEPOR
Code & WIKI: https://en.wikipedia.org/wiki/LEPOR
Development data BLEU
Development data NIST
Development data Broader
Testing data BLEU
Testing data NIST
Testing data Broader
Future work
• Improve parameter optimisation/tuning models

• Include more testing data

• Include different domain data

• Reduce training data and test low-resource scenario

• This paper pre-print: https://arxiv.org/pdf/1805.01565.pdf
Follow the project
• LEPOR: https://github.com/poethan/LEPOR/ 

• Chinese character decomposition: https://github.com/
poethan/MWE4MT/tree/master/radical4mt 

• Acknowledgment:

• The ADAPT Centre for Digital Content Technology is
funded under the SFI Research Centres Programme
(Grant 13/RC/2106) and is co-funded under the
European Regional Development Fund.
Selected references
• ALF Han, DF Wong, LS Chao. 2013. Chinese named entity recognition with conditional
random fields in the light of Chinese characteristics. Intelligent Information Systems
Symposium, 57-68. https://link.springer.com/chapter/10.1007/978-3-642-38634-3_8 

• ALF Han, DF Wong, LS Chao. 2013. LEPOR: A Robust Evaluation Metric for Machine
Translation with Augmented Factors. Proceedings of the 24th International Conference
on Computational Linguistics. https://aclanthology.org/C12-2044/ 

• ALF Han, DF Wong, LS Chao, L He, Y Lu, J Xing, X Zeng . 2013. Language-independent
Model for Machine Translation Evaluation with Reinforced Factors. Machine Translation
Summit XIV, 215-222. https://aclanthology.org/2013.mtsummit-posters.3.pdf 

• L Han. 2018. Machine Translation Evaluation Resources and Methods: A Survey. IPRC:
Ireland Postgraduate Research Conference. http://doras.dcu.ie/24493/ 

• Lifeng Han and Shaohui Kuang. 2018. Incorporating Chinese Radicals Into Neural
Machine Translation: Deeper Than Character Level. ESSLLI-2018: student session.
http://doras.dcu.ie/24732/

More Related Content

What's hot

Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...Ahmed Magdy Ezzeldin, MSc.
 
Answer Selection and Validation for Arabic Questions
Answer Selection and Validation for Arabic QuestionsAnswer Selection and Validation for Arabic Questions
Answer Selection and Validation for Arabic QuestionsAhmed Magdy Ezzeldin, MSc.
 
Practical machine learning - Part 1
Practical machine learning - Part 1Practical machine learning - Part 1
Practical machine learning - Part 1Traian Rebedea
 
Question Answering - Application and Challenges
Question Answering - Application and ChallengesQuestion Answering - Application and Challenges
Question Answering - Application and ChallengesJens Lehmann
 
Intro to Deep Learning for Question Answering
Intro to Deep Learning for Question AnsweringIntro to Deep Learning for Question Answering
Intro to Deep Learning for Question AnsweringTraian Rebedea
 
Natural language processing for requirements engineering: ICSE 2021 Technical...
Natural language processing for requirements engineering: ICSE 2021 Technical...Natural language processing for requirements engineering: ICSE 2021 Technical...
Natural language processing for requirements engineering: ICSE 2021 Technical...alessio_ferrari
 
Detecting and Describing Historical Periods in a Large Corpora
Detecting and Describing Historical Periods in a Large CorporaDetecting and Describing Historical Periods in a Large Corpora
Detecting and Describing Historical Periods in a Large CorporaTraian Rebedea
 
Transfer learning-presentation
Transfer learning-presentationTransfer learning-presentation
Transfer learning-presentationBushra Jbawi
 
Natural Language Processing (NLP) for Requirements Engineering (RE): an Overview
Natural Language Processing (NLP) for Requirements Engineering (RE): an OverviewNatural Language Processing (NLP) for Requirements Engineering (RE): an Overview
Natural Language Processing (NLP) for Requirements Engineering (RE): an Overviewalessio_ferrari
 
Transfer Learning -- The Next Frontier for Machine Learning
Transfer Learning -- The Next Frontier for Machine LearningTransfer Learning -- The Next Frontier for Machine Learning
Transfer Learning -- The Next Frontier for Machine LearningSebastian Ruder
 
ODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPindico data
 
Presentation of Domain Specific Question Answering System Using N-gram Approach.
Presentation of Domain Specific Question Answering System Using N-gram Approach.Presentation of Domain Specific Question Answering System Using N-gram Approach.
Presentation of Domain Specific Question Answering System Using N-gram Approach.Tasnim Ara Islam
 
Opinion mining for social media and news items in Romanian
Opinion mining for social media and news items in RomanianOpinion mining for social media and news items in Romanian
Opinion mining for social media and news items in RomanianTraian Rebedea
 
SSSW 2013 - Feeding Recommender Systems with Linked Open Data
SSSW 2013 - Feeding Recommender Systems with Linked Open DataSSSW 2013 - Feeding Recommender Systems with Linked Open Data
SSSW 2013 - Feeding Recommender Systems with Linked Open DataPolytechnic University of Bari
 

What's hot (20)

Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
 
Answer Selection and Validation for Arabic Questions
Answer Selection and Validation for Arabic QuestionsAnswer Selection and Validation for Arabic Questions
Answer Selection and Validation for Arabic Questions
 
Arabic question answering ‫‬
Arabic question answering ‫‬Arabic question answering ‫‬
Arabic question answering ‫‬
 
Practical machine learning - Part 1
Practical machine learning - Part 1Practical machine learning - Part 1
Practical machine learning - Part 1
 
Question Answering - Application and Challenges
Question Answering - Application and ChallengesQuestion Answering - Application and Challenges
Question Answering - Application and Challenges
 
Intro to Deep Learning for Question Answering
Intro to Deep Learning for Question AnsweringIntro to Deep Learning for Question Answering
Intro to Deep Learning for Question Answering
 
Natural language processing for requirements engineering: ICSE 2021 Technical...
Natural language processing for requirements engineering: ICSE 2021 Technical...Natural language processing for requirements engineering: ICSE 2021 Technical...
Natural language processing for requirements engineering: ICSE 2021 Technical...
 
Searching for the Best Machine Translation Combination
Searching for the Best Machine Translation CombinationSearching for the Best Machine Translation Combination
Searching for the Best Machine Translation Combination
 
Question answering
Question answeringQuestion answering
Question answering
 
Detecting and Describing Historical Periods in a Large Corpora
Detecting and Describing Historical Periods in a Large CorporaDetecting and Describing Historical Periods in a Large Corpora
Detecting and Describing Historical Periods in a Large Corpora
 
Transfer learning-presentation
Transfer learning-presentationTransfer learning-presentation
Transfer learning-presentation
 
Natural Language Processing (NLP) for Requirements Engineering (RE): an Overview
Natural Language Processing (NLP) for Requirements Engineering (RE): an OverviewNatural Language Processing (NLP) for Requirements Engineering (RE): an Overview
Natural Language Processing (NLP) for Requirements Engineering (RE): an Overview
 
co:op-READ-Convention Marburg - Roger Labahn
co:op-READ-Convention Marburg - Roger Labahnco:op-READ-Convention Marburg - Roger Labahn
co:op-READ-Convention Marburg - Roger Labahn
 
co:op-READ-Convention Marburg - Enrique Vidal
co:op-READ-Convention Marburg - Enrique Vidalco:op-READ-Convention Marburg - Enrique Vidal
co:op-READ-Convention Marburg - Enrique Vidal
 
Transfer Learning -- The Next Frontier for Machine Learning
Transfer Learning -- The Next Frontier for Machine LearningTransfer Learning -- The Next Frontier for Machine Learning
Transfer Learning -- The Next Frontier for Machine Learning
 
ODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLP
 
Presentation of Domain Specific Question Answering System Using N-gram Approach.
Presentation of Domain Specific Question Answering System Using N-gram Approach.Presentation of Domain Specific Question Answering System Using N-gram Approach.
Presentation of Domain Specific Question Answering System Using N-gram Approach.
 
Opinion mining for social media and news items in Romanian
Opinion mining for social media and news items in RomanianOpinion mining for social media and news items in Romanian
Opinion mining for social media and news items in Romanian
 
NLP Project Full Cycle
NLP Project Full CycleNLP Project Full Cycle
NLP Project Full Cycle
 
SSSW 2013 - Feeding Recommender Systems with Linked Open Data
SSSW 2013 - Feeding Recommender Systems with Linked Open DataSSSW 2013 - Feeding Recommender Systems with Linked Open Data
SSSW 2013 - Feeding Recommender Systems with Linked Open Data
 

Similar to Apply chinese radicals into neural machine translation: deeper than character level

Scalable Ensemble Machine Learning @ Harvard Health Policy Data Science Lab
Scalable Ensemble Machine Learning @ Harvard Health Policy Data Science LabScalable Ensemble Machine Learning @ Harvard Health Policy Data Science Lab
Scalable Ensemble Machine Learning @ Harvard Health Policy Data Science LabSri Ambati
 
How Oracle Uses CrowdFlower For Sentiment Analysis
How Oracle Uses CrowdFlower For Sentiment AnalysisHow Oracle Uses CrowdFlower For Sentiment Analysis
How Oracle Uses CrowdFlower For Sentiment AnalysisCrowdFlower
 
Machine Learning Vs. Deep Learning – An Example Implementation
Machine Learning Vs. Deep Learning – An Example ImplementationMachine Learning Vs. Deep Learning – An Example Implementation
Machine Learning Vs. Deep Learning – An Example ImplementationSynerzip
 
Coursera data science specialization
Coursera data science specializationCoursera data science specialization
Coursera data science specializationMengshu Liu
 
Data carpentry run-a-workshop
Data carpentry run-a-workshopData carpentry run-a-workshop
Data carpentry run-a-workshoptracykteal
 
Essentials for a Better ICT Student in Palestine
Essentials for a Better ICT Student in PalestineEssentials for a Better ICT Student in Palestine
Essentials for a Better ICT Student in PalestineJafar Hajeer
 
Reproducible Research with R, The Tidyverse, Notebooks, and Spark
Reproducible Research with R, The Tidyverse, Notebooks, and SparkReproducible Research with R, The Tidyverse, Notebooks, and Spark
Reproducible Research with R, The Tidyverse, Notebooks, and SparkAdaryl "Bob" Wakefield, MBA
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OSri Ambati
 
02-Lifecycle.pptx
02-Lifecycle.pptx02-Lifecycle.pptx
02-Lifecycle.pptxShree Shree
 
How to do science in a large IT company (ICPC World Finals 2021, Moscow)
How to do science in a large IT company (ICPC World Finals 2021, Moscow)How to do science in a large IT company (ICPC World Finals 2021, Moscow)
How to do science in a large IT company (ICPC World Finals 2021, Moscow)Alexander Borzunov
 
New Developments in H2O: April 2017 Edition
New Developments in H2O: April 2017 EditionNew Developments in H2O: April 2017 Edition
New Developments in H2O: April 2017 EditionSri Ambati
 
Cheng_Wang_resume
Cheng_Wang_resumeCheng_Wang_resume
Cheng_Wang_resumeCheng Wang
 
Software Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSASoftware Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSADaniel S. Katz
 
Building NLP solutions for Davidson ML Group
Building NLP solutions for Davidson ML GroupBuilding NLP solutions for Davidson ML Group
Building NLP solutions for Davidson ML Groupbotsplash.com
 
Keeping ontology development Agile
Keeping ontology development AgileKeeping ontology development Agile
Keeping ontology development Agilerobertstevens65
 
Data carpentry instructor-onboarding
Data carpentry instructor-onboardingData carpentry instructor-onboarding
Data carpentry instructor-onboardingtracykteal
 
Intro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsIntro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsSri Ambati
 
The Canadian Linked Data Initiative: Charting a Path to a Linked Data Future
The Canadian Linked Data Initiative: Charting a Path to a Linked Data FutureThe Canadian Linked Data Initiative: Charting a Path to a Linked Data Future
The Canadian Linked Data Initiative: Charting a Path to a Linked Data FutureNASIG
 

Similar to Apply chinese radicals into neural machine translation: deeper than character level (20)

Scalable Ensemble Machine Learning @ Harvard Health Policy Data Science Lab
Scalable Ensemble Machine Learning @ Harvard Health Policy Data Science LabScalable Ensemble Machine Learning @ Harvard Health Policy Data Science Lab
Scalable Ensemble Machine Learning @ Harvard Health Policy Data Science Lab
 
How Oracle Uses CrowdFlower For Sentiment Analysis
How Oracle Uses CrowdFlower For Sentiment AnalysisHow Oracle Uses CrowdFlower For Sentiment Analysis
How Oracle Uses CrowdFlower For Sentiment Analysis
 
Designing e-Learning Objects
Designing e-Learning ObjectsDesigning e-Learning Objects
Designing e-Learning Objects
 
Machine Learning Vs. Deep Learning – An Example Implementation
Machine Learning Vs. Deep Learning – An Example ImplementationMachine Learning Vs. Deep Learning – An Example Implementation
Machine Learning Vs. Deep Learning – An Example Implementation
 
Coursera data science specialization
Coursera data science specializationCoursera data science specialization
Coursera data science specialization
 
Data carpentry run-a-workshop
Data carpentry run-a-workshopData carpentry run-a-workshop
Data carpentry run-a-workshop
 
Essentials for a Better ICT Student in Palestine
Essentials for a Better ICT Student in PalestineEssentials for a Better ICT Student in Palestine
Essentials for a Better ICT Student in Palestine
 
Reproducible Research with R, The Tidyverse, Notebooks, and Spark
Reproducible Research with R, The Tidyverse, Notebooks, and SparkReproducible Research with R, The Tidyverse, Notebooks, and Spark
Reproducible Research with R, The Tidyverse, Notebooks, and Spark
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2O
 
02-Lifecycle.pptx
02-Lifecycle.pptx02-Lifecycle.pptx
02-Lifecycle.pptx
 
How to do science in a large IT company (ICPC World Finals 2021, Moscow)
How to do science in a large IT company (ICPC World Finals 2021, Moscow)How to do science in a large IT company (ICPC World Finals 2021, Moscow)
How to do science in a large IT company (ICPC World Finals 2021, Moscow)
 
New Developments in H2O: April 2017 Edition
New Developments in H2O: April 2017 EditionNew Developments in H2O: April 2017 Edition
New Developments in H2O: April 2017 Edition
 
Cheng_Wang_resume
Cheng_Wang_resumeCheng_Wang_resume
Cheng_Wang_resume
 
Software Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSASoftware Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSA
 
Building NLP solutions for Davidson ML Group
Building NLP solutions for Davidson ML GroupBuilding NLP solutions for Davidson ML Group
Building NLP solutions for Davidson ML Group
 
Deep learning for NLP
Deep learning for NLPDeep learning for NLP
Deep learning for NLP
 
Keeping ontology development Agile
Keeping ontology development AgileKeeping ontology development Agile
Keeping ontology development Agile
 
Data carpentry instructor-onboarding
Data carpentry instructor-onboardingData carpentry instructor-onboarding
Data carpentry instructor-onboarding
 
Intro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsIntro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data Scientists
 
The Canadian Linked Data Initiative: Charting a Path to a Linked Data Future
The Canadian Linked Data Initiative: Charting a Path to a Linked Data FutureThe Canadian Linked Data Initiative: Charting a Path to a Linked Data Future
The Canadian Linked Data Initiative: Charting a Path to a Linked Data Future
 

More from Lifeng (Aaron) Han

WMT2022 Biomedical MT PPT: Logrus Global and Uni Manchester
WMT2022 Biomedical MT PPT: Logrus Global and Uni ManchesterWMT2022 Biomedical MT PPT: Logrus Global and Uni Manchester
WMT2022 Biomedical MT PPT: Logrus Global and Uni ManchesterLifeng (Aaron) Han
 
Measuring Uncertainty in Translation Quality Evaluation (TQE)
Measuring Uncertainty in Translation Quality Evaluation (TQE)Measuring Uncertainty in Translation Quality Evaluation (TQE)
Measuring Uncertainty in Translation Quality Evaluation (TQE)Lifeng (Aaron) Han
 
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...Lifeng (Aaron) Han
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...Lifeng (Aaron) Han
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio... HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...Lifeng (Aaron) Han
 
Meta-evaluation of machine translation evaluation methods
Meta-evaluation of machine translation evaluation methodsMeta-evaluation of machine translation evaluation methods
Meta-evaluation of machine translation evaluation methodsLifeng (Aaron) Han
 
Monte Carlo Modelling of Confidence Intervals in Translation Quality Evaluati...
Monte Carlo Modelling of Confidence Intervals in Translation Quality Evaluati...Monte Carlo Modelling of Confidence Intervals in Translation Quality Evaluati...
Monte Carlo Modelling of Confidence Intervals in Translation Quality Evaluati...Lifeng (Aaron) Han
 
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...Lifeng (Aaron) Han
 
Build moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longer
Build moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longerBuild moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longer
Build moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longerLifeng (Aaron) Han
 
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...Lifeng (Aaron) Han
 
AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations ...
AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations ...AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations ...
AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations ...Lifeng (Aaron) Han
 
MultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel Corpora
MultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel CorporaMultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel Corpora
MultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel CorporaLifeng (Aaron) Han
 
A deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine TranslationA deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine TranslationLifeng (Aaron) Han
 
machine translation evaluation resources and methods: a survey
machine translation evaluation resources and methods: a surveymachine translation evaluation resources and methods: a survey
machine translation evaluation resources and methods: a surveyLifeng (Aaron) Han
 
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning ModelChinese Named Entity Recognition with Graph-based Semi-supervised Learning Model
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning ModelLifeng (Aaron) Han
 
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...Lifeng (Aaron) Han
 
PubhD talk: MT serving the society
PubhD talk: MT serving the societyPubhD talk: MT serving the society
PubhD talk: MT serving the societyLifeng (Aaron) Han
 
Lepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metricLepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metricLifeng (Aaron) Han
 
Machine translation evaluation: a survey
Machine translation evaluation: a surveyMachine translation evaluation: a survey
Machine translation evaluation: a surveyLifeng (Aaron) Han
 

More from Lifeng (Aaron) Han (20)

WMT2022 Biomedical MT PPT: Logrus Global and Uni Manchester
WMT2022 Biomedical MT PPT: Logrus Global and Uni ManchesterWMT2022 Biomedical MT PPT: Logrus Global and Uni Manchester
WMT2022 Biomedical MT PPT: Logrus Global and Uni Manchester
 
Measuring Uncertainty in Translation Quality Evaluation (TQE)
Measuring Uncertainty in Translation Quality Evaluation (TQE)Measuring Uncertainty in Translation Quality Evaluation (TQE)
Measuring Uncertainty in Translation Quality Evaluation (TQE)
 
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio... HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 
Meta-evaluation of machine translation evaluation methods
Meta-evaluation of machine translation evaluation methodsMeta-evaluation of machine translation evaluation methods
Meta-evaluation of machine translation evaluation methods
 
Monte Carlo Modelling of Confidence Intervals in Translation Quality Evaluati...
Monte Carlo Modelling of Confidence Intervals in Translation Quality Evaluati...Monte Carlo Modelling of Confidence Intervals in Translation Quality Evaluati...
Monte Carlo Modelling of Confidence Intervals in Translation Quality Evaluati...
 
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...
 
Build moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longer
Build moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longerBuild moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longer
Build moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longer
 
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...
 
AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations ...
AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations ...AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations ...
AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations ...
 
MultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel Corpora
MultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel CorporaMultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel Corpora
MultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel Corpora
 
A deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine TranslationA deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine Translation
 
machine translation evaluation resources and methods: a survey
machine translation evaluation resources and methods: a surveymachine translation evaluation resources and methods: a survey
machine translation evaluation resources and methods: a survey
 
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning ModelChinese Named Entity Recognition with Graph-based Semi-supervised Learning Model
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model
 
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
 
PubhD talk: MT serving the society
PubhD talk: MT serving the societyPubhD talk: MT serving the society
PubhD talk: MT serving the society
 
Lepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metricLepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metric
 
Thesis-Master-MTE-Aaron
Thesis-Master-MTE-AaronThesis-Master-MTE-Aaron
Thesis-Master-MTE-Aaron
 
Machine translation evaluation: a survey
Machine translation evaluation: a surveyMachine translation evaluation: a survey
Machine translation evaluation: a survey
 

Recently uploaded

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 

Recently uploaded (20)

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

Apply chinese radicals into neural machine translation: deeper than character level

  • 1. Apply Chinese Radicals Into Neural Machine Translation: Deeper Than Character Level Lifeng Han lifeng.han@adaptcentre.ie https://github.com/poethan ADAPT, Dublin City University Limerick, Ireland, May 24 LPRC 2018: Limerick Postgraduate Research Conference
  • 2. Agenda • Myself • Topic intro • Related work • Proposed idea/model • Experiments design • Evaluation results • Future work •collaborative work with Shaohui Kuang, Soochow Uni. NLP lab
  • 3. Myself • PhD student, ADAPT Centre, DCU, Dublin, 2016 on • Student Researcher, Amsterdam, 2014-16 • Master degree of Sci., Macau, 2011-14 • Bachelor of Maths, Shijiazhuang, 2007-11 • Primary ~ high school, Handan • No kindergarten
  • 4. Intro • Machine Translation • - what I’ m doing. Translate human languages via Machine. • Natural Language Processing • - different processing tasks of human languages • Artificial Intelligence • - teach machine to perform human intelligences
  • 6. Related work • Machine Translation: Rule to Neural • - rule, example-based, statistical, phrase-based, hierarchical structure, tree-best, forest, neural models • Neural MT, sequence to sequence, attention, coverage • - word embeddings, sequence to sequence encoding-decoding, attention, coverage, document/discourse level • Chinese NLP, radical applications • - Word Segmentation, Entity recognition, MT, Sentiment Analysis, text mining
  • 9. Proposed Model • Apply Chinese Radical into Translation • - how to apply radicals into MT • - how to split character into radicals • Combine radical-level MT with Neural Model • - attention-based Neural MT • - radical combination into input data
  • 10. Combinations • Details of setup=> • Lifeng Han and Shaohui Kuang. 2018. Apply Chinese radicals into neural machine translation: Deeper than character level. ArXiv pre-print https://arxiv.org/abs/1805.01565v1
  • 11. Experiments • Attention Neural MT • Word+Character+Radical • Word+Character • Character+Radical • Word+Radical • Data prepration • Training: 1.25 million parallel Chinese-English sentences / 80.9 millions Chinese words and 86.4 millions English • Development / testing: NIST06/NIST08 (National Institute for Standards and Technology, USA)
  • 13. Evaluation • Broader Evaluation Metrics • hLEPOR, BEER, CharacTER -> BLEU, NIST • Evaluation Scores • - in-depth analysis MT evaluation metric LEPOR Code & WIKI: https://en.wikipedia.org/wiki/LEPOR
  • 20. Future work • Improve parameter optimisation/tuning models • Include more testing data • Include different domain data • Reduce training data and test low-resource scenario • This paper pre-print: https://arxiv.org/pdf/1805.01565.pdf
  • 21. Follow the project • LEPOR: https://github.com/poethan/LEPOR/ • Chinese character decomposition: https://github.com/ poethan/MWE4MT/tree/master/radical4mt • Acknowledgment: • The ADAPT Centre for Digital Content Technology is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.
  • 22. Selected references • ALF Han, DF Wong, LS Chao. 2013. Chinese named entity recognition with conditional random fields in the light of Chinese characteristics. Intelligent Information Systems Symposium, 57-68. https://link.springer.com/chapter/10.1007/978-3-642-38634-3_8 • ALF Han, DF Wong, LS Chao. 2013. LEPOR: A Robust Evaluation Metric for Machine Translation with Augmented Factors. Proceedings of the 24th International Conference on Computational Linguistics. https://aclanthology.org/C12-2044/ • ALF Han, DF Wong, LS Chao, L He, Y Lu, J Xing, X Zeng . 2013. Language-independent Model for Machine Translation Evaluation with Reinforced Factors. Machine Translation Summit XIV, 215-222. https://aclanthology.org/2013.mtsummit-posters.3.pdf • L Han. 2018. Machine Translation Evaluation Resources and Methods: A Survey. IPRC: Ireland Postgraduate Research Conference. http://doras.dcu.ie/24493/ • Lifeng Han and Shaohui Kuang. 2018. Incorporating Chinese Radicals Into Neural Machine Translation: Deeper Than Character Level. ESSLLI-2018: student session. http://doras.dcu.ie/24732/