ACL-IJCNLP 2015
Beijing, China
Stats
• 173 long papers
  • 105 as oral
  • 68 as poster presentations
  • 692 total long paper submissions
• 145 short papers
  • 50 as oral
  • 95 as poster presentations
  • 648 total short paper submissions
• 13 TACL papers
• 7 Student Research Workshop papers
• 25 system demonstrations
• 8 tutorials
• 15 workshops
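The acceptance rates implied by these counts can be checked in a few lines. This is only a small sketch that uses the numbers on the slide; the variable and function names are my own.

```python
# Counts copied from the Stats slide.
long_accepted, long_submitted = 173, 692
short_accepted, short_submitted = 145, 648

# Oral and poster counts should add up to the accepted totals.
assert 105 + 68 == long_accepted
assert 50 + 95 == short_accepted


def acceptance_rate(accepted: int, submitted: int) -> float:
    """Return the acceptance rate as a percentage."""
    return 100.0 * accepted / submitted


long_rate = acceptance_rate(long_accepted, long_submitted)
short_rate = acceptance_rate(short_accepted, short_submitted)
print(f"long papers:  {long_rate:.1f}%")   # 25.0%
print(f"short papers: {short_rate:.1f}%")  # 22.4%
```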
Many Minions
A Computational Approach to Automatic
Prediction of Drunk-Texting
Alcohol abuse may lead to unsociable behavior such
as crime, drunk driving, or privacy leaks. We introduce
automatic drunk-texting prediction as the task of
identifying whether a text was written when under
the influence of alcohol. We experiment with tweets
labeled using hashtags as distant supervision. Our
classifiers use a set of N-gram and stylistic features to
detect drunk tweets. Our observations present the
first quantitative evidence that text contains signals
that can be exploited to detect drunk-texting.
• Dataset 1 (2435 drunk, 762 sober)
  • #drunk, #drank, #imdrunk
  • #notdrunk, #imnotdrunk, #sober
• Dataset 2 (2435 drunk, 5644 sober)
• Dataset H (193 drunk, 317 sober)
http://ej.uz/Drunk-Texting
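Hashtag-based distant supervision and N-gram features can be sketched roughly as follows. This is not the authors' code: the function names are hypothetical, tokenization is plain whitespace splitting, and stripping the label hashtags before feature extraction is my assumption (a common precaution so that a classifier cannot simply memorize the labels).

```python
import re
from collections import Counter

# Hashtags listed on the slide for Dataset 1.
DRUNK_TAGS = {"#drunk", "#drank", "#imdrunk"}
SOBER_TAGS = {"#notdrunk", "#imnotdrunk", "#sober"}


def distant_label(tweet: str):
    """Return 'drunk', 'sober', or None, based only on the label hashtags."""
    tags = {t.lower() for t in re.findall(r"#\w+", tweet)}
    has_drunk = bool(tags & DRUNK_TAGS)
    has_sober = bool(tags & SOBER_TAGS)
    if has_drunk == has_sober:  # no label hashtag, or conflicting ones
        return None
    return "drunk" if has_drunk else "sober"


def strip_label_tags(tweet: str) -> str:
    """Remove the label hashtags (assumption: they are not used as features)."""
    label_tags = DRUNK_TAGS | SOBER_TAGS

    def keep(match):
        return "" if match.group(0).lower() in label_tags else match.group(0)

    return re.sub(r"#\w+", keep, tweet).strip()


def ngram_features(text: str, n_max: int = 2) -> Counter:
    """Count word N-grams for N = 1..n_max."""
    tokens = text.lower().split()
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats


tweet = "cant find my keys again #drunk"
label = distant_label(tweet)
features = ngram_features(strip_label_tags(tweet))
print(label, features["my keys"])
```

The stylistic features mentioned in the abstract are not covered here; they would be added as further entries in the same feature dictionary.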
Feature Set for Drunk-texting Prediction
Drunk-Poster?
Modeling Argument Strength in Student Essays
While recent years have seen a surge of interest in automated essay grading, including
work on grading essays with respect to particular dimensions such as prompt
adherence, coherence, and technical quality, there has been relatively little work on
grading the essay dimension of argument strength, which is arguably the most
important aspect of argumentative essays. We introduce a new corpus of
argumentative student essays annotated with argument strength scores and propose a
supervised, feature-rich approach to automatically scoring the essays along this
dimension. Our approach significantly outperforms a baseline that relies solely on
heuristically applied sentence argument function labels by up to 16.1%.
http://ej.uz/ArgStrInStudEssays
Novel features:
• POS N-grams
• Semantic Frames
• Transitional Phrases
• Coreference
• Prompt Agreement
• Argument Component Predictions
• Argument Errors
Driving ROVER with Segment-based
ASR Quality Estimation
ROVER is a widely used method to combine the output of multiple
automatic speech recognition (ASR) systems. Though effective, the basic approach
and its variants suffer from potential drawbacks: i) their results depend on the
order in which the hypotheses are used to feed the combination process, ii) when
applied to combine long hypotheses, they disregard possible differences in
transcription quality at local level, iii) they often rely on word confidence information.
We address these issues by proposing a segment-based ROVER in which hypothesis
ranking is obtained from a confidence-independent ASR quality estimation method.
Our results on English data from the IWSLT2012 and IWSLT2013 evaluation
campaigns significantly outperform standard ROVER and approximate two strong
oracles.
http://ej.uz/ROVER-SegASR-QEst
1. Split the utterance into segments (ideally at sentence level);
2. For each segment, automatically estimate the quality (e.g. in terms of WER) of the
corresponding M (segment-level) hypotheses;
3. Use the estimates to rank the hypotheses and feed ROVER based on the ranking;
4. Reconstruct the entire utterance transcription by concatenating the combined
segment level transcriptions produced by ROVER;
5. Measure the overall WER differences against standard ROVER and other oracles.
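The five steps can be sketched as below. This is an illustration under stated assumptions, not the authors' system: the quality estimates are taken as given numbers, `combine` is a hypothetical callable standing in for ROVER, and the example passes a trivial stand-in that just returns the top-ranked hypothesis. Only `wer` is a standard computation (word-level edit distance divided by reference length).

```python
from typing import Callable, List, Sequence

Hypothesis = List[str]


def wer(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Word error rate: word-level edit distance / reference length."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / max(len(reference), 1)


def rank_by_estimate(hypotheses: List[Hypothesis],
                     estimated_wer: List[float]) -> List[Hypothesis]:
    """Step 3: order hypotheses from lowest to highest estimated WER."""
    order = sorted(range(len(hypotheses)), key=lambda k: estimated_wer[k])
    return [hypotheses[k] for k in order]


def combine_utterance(segments: List[List[Hypothesis]],
                      estimates: List[List[float]],
                      combine: Callable[[List[Hypothesis]], Hypothesis]) -> Hypothesis:
    """segments[s][m] is hypothesis m for segment s; estimates[s][m] is its estimated WER."""
    output: Hypothesis = []
    for hyps, ests in zip(segments, estimates):
        ranked = rank_by_estimate(hyps, ests)
        output.extend(combine(ranked))  # step 4: concatenate per-segment results
    return output


segments = [
    [["the", "cat", "sat"], ["the", "cat", "sad"]],
    [["on", "a", "mat"], ["on", "the", "mat"]],
]
estimates = [[0.1, 0.4], [0.5, 0.2]]  # made-up quality estimates (step 2)
pick_best = lambda ranked: ranked[0]  # stand-in combiner, NOT ROVER
result = combine_utterance(segments, estimates, pick_best)
reference = "the cat sat on the mat".split()
print(result, wer(reference, result))  # step 5: score against a reference
```

Because the ranking is done per segment, a system that is best on one segment and worst on the next can still contribute its good parts, which is the point of item ii) in the abstract.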
Multi-level Translation Quality Prediction
with QUEST++
This paper presents QUEST++, an open source tool for quality estimation which
can predict quality for texts at word, sentence and document level. It also provides
pipelined processing, whereby predictions made at a lower level (e.g. for words) can
be used as input to build models for predictions at a higher level (e.g. sentences).
QUEST++ allows the extraction of a variety of features, and provides machine
learning algorithms to build and test quality estimation models. Results on recent
datasets show that QUEST++ achieves state-of-the-art performance.
http://ej.uz/QUESTpp
• 148 sentence level features
• 40 word level features
• 67 document level features
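The pipelining idea (lower-level predictions feeding a higher-level model) can be illustrated with a short sketch. This is not QUEST++'s API: the function, the feature names and the OK/BAD word labels are assumptions chosen for the example.

```python
from typing import Dict, List


def sentence_features_from_words(word_labels: List[str]) -> Dict[str, float]:
    """Summarize word-level OK/BAD predictions as sentence-level features."""
    n = len(word_labels)
    bad = sum(1 for label in word_labels if label == "BAD")

    # Length of the longest run of consecutive BAD words.
    longest = run = 0
    for label in word_labels:
        run = run + 1 if label == "BAD" else 0
        longest = max(longest, run)

    return {
        "num_words": float(n),
        "bad_ratio": bad / n if n else 0.0,
        "longest_bad_run": float(longest),
    }


# Hypothetical output of a word-level model for one translated sentence.
word_predictions = ["OK", "BAD", "BAD", "OK", "OK"]
features = sentence_features_from_words(word_predictions)
print(features)
```

Features like these would then be appended to the other sentence-level features before training the sentence-level model.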
Unsupervised Decomposition of a Multi-Author
Document Based on Naive-Bayesian Model
This paper proposes a new unsupervised method for decomposing a multi-author
document into authorial components. We assume that we do not know anything about
the document and the authors, except the number of the authors of that document.
The key idea is to exploit the difference in the posterior probability of the Naive-
Bayesian model to increase the precision of the clustering assignment and the accuracy
of the classification process of our method. Experimental results show that the
proposed method outperforms two state-of-the-art methods.
http://ej.uz/Mult-AuthDocDecomposition
Decomposition of a Multi-Author Document
• Step 1 Divide the document into segments of fixed length.
• Step 2 Represent the resulting segments as vectors using an appropriate feature set which can differentiate
the writing styles among authors.
• Step 3 Cluster the resulting vectors into l clusters using an appropriate clustering algorithm aimed at
achieving high recall rates.
• Step 4 Re-vectorize the segments using a different feature set to more accurately discriminate the segments
in each cluster.
• Step 5 Apply the "Segment Elicitation Procedure" to select the best segments from each cluster to increase
the precision rates.
• Step 6 Re-vectorize all selected segments using another feature set that can capture the differences among
the writing styles of all sentences in a document.
• Step 7 Train the classifier using the Naive-Bayesian model.
• Step 8 Classify each sentence using the learned classifier.
• Step 9 Apply the "Probability Indication Procedure" to increase the accuracy of the classification results
using five criteria.
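Steps 7 and 8 can be sketched as follows. This is a sketch under several assumptions: the segments are taken as already clustered (Steps 1 to 6 are skipped), features are plain word counts, and the classifier is a multinomial Naive Bayes with add-one smoothing. The `gap` value, the difference between the two highest log posteriors, only illustrates the idea of using posterior differences as a confidence signal; the five criteria of Step 9 are not given on the slide and are not reproduced here.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts: List[str], labels: List[int]) -> "NaiveBayes":
        self.word_counts: Dict[int, Counter] = {}
        self.doc_counts = Counter(labels)
        self.total_docs = len(labels)
        for text, label in zip(texts, labels):
            self.word_counts.setdefault(label, Counter()).update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_posteriors(self, text: str) -> Dict[int, float]:
        """Return the normalized log posterior of each author for `text`."""
        scores = {}
        for label, counts in self.word_counts.items():
            total = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[label] / self.total_docs)
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / total)
            scores[label] = score
        # Normalize with log-sum-exp so the posteriors sum to one.
        top = max(scores.values())
        log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
        return {label: s - log_z for label, s in scores.items()}

    def classify(self, text: str) -> Tuple[int, float]:
        """Return (author, gap between the two highest log posteriors)."""
        ranked = sorted(self.log_posteriors(text).items(),
                        key=lambda kv: kv[1], reverse=True)
        gap = ranked[0][1] - ranked[1][1] if len(ranked) > 1 else float("inf")
        return ranked[0][0], gap


# Toy segments with made-up cluster labels standing in for Steps 1 to 6.
segments = [
    "the ship sailed over the stormy sea",
    "the sea and the ship at dawn",
    "the court ruled on the appeal",
    "the judge and the court heard the appeal",
]
clusters = [0, 0, 1, 1]
model = NaiveBayes().fit(segments, clusters)
author, gap = model.classify("the ship left the sea")
print(author, round(gap, 2))
```

A sentence with a small gap is one the classifier is unsure about, so it is a natural candidate for the kind of post-correction that Step 9 describes.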
Automatic Identification of
Age-Appropriate Ratings of Song Lyrics
This paper presents a novel task, namely the
automatic identification of age-appropriate ratings
of a musical track, or album, based on its lyrics.
Details are provided regarding the construction of a
dataset of lyrics from 12,242 tracks across 1,798
albums along with age-appropriate ratings
obtained from various web resources, along with
results from various text classification experiments.
The best accuracy of 71.02% for classifying albums
by age groups is achieved by combining vector
space model and psycholinguistic features.
http://ej.uz/IDofSongAgeRatings
Statistics of the dataset:
Linguistic Harbingers of Betrayal:
A Case Study on an Online Strategy Game
Interpersonal relations are fickle, with close friendships
often dissolving into enmity. In this work, we explore
linguistic cues that presage such transitions by studying
dyadic interactions in an on-line strategy game where
players form alliances and break those alliances through
betrayal. We characterize friendships that are unlikely to
last and examine temporal patterns that foretell betrayal.
We reveal that subtle signs of imminent betrayal are
encoded in the conversational patterns of the dyad, even if
the victim is not aware of the relationship’s fate. In
particular, we find that lasting friendships exhibit a form of
balance that manifests itself through language. In contrast,
sudden changes in the balance of certain conversational
attributes—such as positive sentiment, politeness, or
focus on future planning—signal impending betrayal.
http://ej.uz/LinguisticBetrayal
Features for recognizing imminent betrayal, in decreasing order:
An analysis of the user occupational class
through Twitter content
Social media content can be used as a complementary source
to the traditional methods for extracting and studying
collective social attributes. This study focuses on the
prediction of the occupational class for a public user profile.
Our analysis is conducted on a new annotated corpus of
Twitter users, their respective job titles, posted textual
content and platform-related attributes. We frame our task as
classification using latent feature representations such as
word clusters and embeddings. The employed linear and,
especially, non-linear methods can predict a user’s
occupational class with strong accuracy for the coarsest level
of a standard occupation taxonomy which includes nine
classes. Combined with a qualitative assessment, the derived
results confirm the feasibility of our approach in inferring a
new user attribute that can be embedded in a multitude of
downstream applications.
http://ej.uz/occupationalClass-Twitter
Occupational class through Twitter content
User level attributes for a Twitter user:
Topics, represented by their most central and most frequent 10 words:
Beijing
Food
More photos & blog post
www.lielakeda.lv

TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 

Último (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

ACL-IJCNLP 2015

  • 2. Stats
      • 173 long papers
          • 105 as oral
          • 68 as poster presentations
          • 692 total long paper submissions
      • 145 short papers
          • 50 as oral
          • 95 as poster presentations
          • 648 total short paper submissions
      • 13 TACL papers
      • 7 Student Research Workshop papers
      • 25 system demonstrations
      • 8 tutorials
      • 15 workshops
  • 4. A Computational Approach to Automatic Prediction of Drunk-Texting
    Alcohol abuse may lead to unsociable behavior such as crime, drunk driving, or privacy leaks. We introduce automatic drunk-texting prediction as the task of identifying whether a text was written when under the influence of alcohol. We experiment with tweets labeled using hashtags as distant supervision. Our classifiers use a set of N-gram and stylistic features to detect drunk tweets. Our observations present the first quantitative evidence that text contains signals that can be exploited to detect drunk-texting.
      • Dataset 1 (2435 drunk, 762 sober)
      • #drunk, #drank, #imdrunk
      • #notdrunk, #imnotdrunk, #sober
      • Dataset 2 (2435 drunk, 5644 sober)
      • Dataset H (193 drunk, 317 sober)
    http://ej.uz/Drunk-Texting
  • 5. Prediction of Drunk-Texting
    http://ej.uz/Drunk-Texting
    Feature Set for Drunk-texting Prediction
    Drunk-Poster?
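The hashtag-based distant supervision mentioned on slide 4 could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names `distant_label` and `word_ngrams` are hypothetical, and the choices to discard tweets carrying both or neither kind of hashtag and to strip the labelling hashtags from the text are assumptions made here. Only the hashtag lists come from the slide.

```python
import re

# Hashtag lists as they appear on the slide.
DRUNK_TAGS = {"#drunk", "#drank", "#imdrunk"}
SOBER_TAGS = {"#notdrunk", "#imnotdrunk", "#sober"}
LABEL_TAGS = DRUNK_TAGS | SOBER_TAGS

HASHTAG_RE = re.compile(r"#\w+")


def distant_label(tweet):
    """Return (label, cleaned_text), or None when no single label applies."""
    tags = {t.lower() for t in HASHTAG_RE.findall(tweet)}
    has_drunk = bool(tags & DRUNK_TAGS)
    has_sober = bool(tags & SOBER_TAGS)
    if has_drunk == has_sober:
        # Assumption: tweets with both or neither kind of tag are skipped.
        return None
    label = "drunk" if has_drunk else "sober"

    def drop_label_tag(match):
        # Assumption: remove labelling hashtags so they cannot leak the label.
        return "" if match.group(0).lower() in LABEL_TAGS else match.group(0)

    cleaned = " ".join(HASHTAG_RE.sub(drop_label_tag, tweet).split())
    return label, cleaned


def word_ngrams(text, n):
    """Word n-grams of a text, one simple kind of N-gram feature."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```

The labelled, cleaned text could then be turned into N-gram counts for any standard classifier; the stylistic features from the slide are not reproduced here because the transcript does not list them.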
  • 6. Modeling Argument Strength in Student Essays
    While recent years have seen a surge of interest in automated essay grading, including work on grading essays with respect to particular dimensions such as prompt adherence, coherence, and technical quality, there has been relatively little work on grading the essay dimension of argument strength, which is arguably the most important aspect of argumentative essays. We introduce a new corpus of argumentative student essays annotated with argument strength scores and propose a supervised, feature-rich approach to automatically scoring the essays along this dimension. Our approach significantly outperforms a baseline that relies solely on heuristically applied sentence argument function labels by up to 16.1%.
    http://ej.uz/ArgStrInStudEssays
  • 7. Modeling Argument Strength in Student Essays
    http://ej.uz/ArgStrInStudEssays
    Novel features:
      • POS N-grams
      • Semantic Frames
      • Transitional Phrases
      • Coreference
      • Prompt Agreement
      • Argument Component Predictions
      • Argument Errors
  • 8. Driving ROVER with Segment-based ASR Quality Estimation
    ROVER is a widely used method to combine the output of multiple automatic speech recognition (ASR) systems. Though effective, the basic approach and its variants suffer from potential drawbacks: i) their results depend on the order in which the hypotheses are used to feed the combination process, ii) when applied to combine long hypotheses, they disregard possible differences in transcription quality at local level, iii) they often rely on word confidence information. We address these issues by proposing a segment-based ROVER in which hypothesis ranking is obtained from a confidence-independent ASR quality estimation method. Our results on English data from the IWSLT2012 and IWSLT2013 evaluation campaigns significantly outperform standard ROVER and approximate two strong oracles.
    http://ej.uz/ROVER-SegASR-QEst
  • 9. Driving ROVER with Segment-based ASR Quality Estimation
      1. Split the utterance into segments (ideally at sentence level).
      2. For each segment, automatically estimate the quality (e.g. in terms of WER) of the corresponding M (segment-level) hypotheses.
      3. Use the estimates to rank the hypotheses and feed ROVER based on the ranking.
      4. Reconstruct the entire utterance transcription by concatenating the combined segment-level transcriptions produced by ROVER.
      5. Measure the overall WER differences against standard ROVER and other oracles.
    http://ej.uz/ROVER-SegASR-QEst
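Steps 2 to 4 of the procedure above can be sketched as a small driver loop. This is only an outline under stated assumptions: `estimate_wer` (a quality estimator returning a predicted WER for one hypothesis) and `rover` (a combiner that takes hypotheses in ranked order) are hypothetical callables supplied by the caller, and neither is implemented here. The function name `combine_by_segment` is also invented for illustration.

```python
from typing import Callable, List, Sequence


def combine_by_segment(
    segments: Sequence[Sequence[str]],
    estimate_wer: Callable[[str], float],
    rover: Callable[[List[str]], str],
) -> str:
    """Rank each segment's hypotheses by estimated WER, combine, concatenate.

    segments[i] holds the M system hypotheses for the i-th segment.
    """
    combined = []
    for hypotheses in segments:
        # Steps 2-3: a lower estimated WER means better quality, so it is fed first.
        ranked = sorted(hypotheses, key=estimate_wer)
        combined.append(rover(ranked))
    # Step 4: rebuild the utterance from the per-segment outputs.
    return " ".join(combined)
```

Because the ranking is recomputed for every segment, a system that is best on one segment does not have to be fed first on the next, which is the local-quality point raised in the abstract.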
  • 10. Multi-level Translation Quality Prediction with QUEST++
    This paper presents QUEST++, an open source tool for quality estimation which can predict quality for texts at word, sentence and document level. It also provides pipelined processing, whereby predictions made at a lower level (e.g. for words) can be used as input to build models for predictions at a higher level (e.g. sentences). QUEST++ allows the extraction of a variety of features, and provides machine learning algorithms to build and test quality estimation models. Results on recent datasets show that QUEST++ achieves state-of-the-art performance.
    http://ej.uz/QUESTpp
      • 148 sentence level features
      • 40 word level features
      • 67 document level features
  • 13. Unsupervised Decomposition of a Multi-Author Document Based on Naive-Bayesian Model
    This paper proposes a new unsupervised method for decomposing a multi-author document into authorial components. We assume that we do not know anything about the document and the authors, except the number of the authors of that document. The key idea is to exploit the difference in the posterior probability of the Naive-Bayesian model to increase the precision of the clustering assignment and the accuracy of the classification process of our method. Experimental results show that the proposed method outperforms two state-of-the-art methods.
    http://ej.uz/Mult-AuthDocDecomposition
  • 14. Unsupervised Decomposition of a Multi-Author Document Based on Naive-Bayesian Model http://ej.uz/Mult-AuthDocDecomposition
  • 15. Decomposition of a Multi-Author Document
      • Step 1: Divide the document into segments of fixed length.
      • Step 2: Represent the resulting segments as vectors using an appropriate feature set that can differentiate the writing styles among authors.
      • Step 3: Cluster the resulting vectors into l clusters using an appropriate clustering algorithm that targets high recall rates.
      • Step 4: Re-vectorize the segments using a different feature set to discriminate the segments in each cluster more accurately.
      • Step 5: Apply the “Segment Elicitation Procedure” to select the best segments from each cluster and increase the precision rates.
      • Step 6: Re-vectorize all selected segments using another feature set that can capture the differences among the writing styles of all sentences in a document.
      • Step 7: Train the classifier using the Naive-Bayesian model.
      • Step 8: Classify each sentence using the learned classifier.
      • Step 9: Apply the “Probability Indication Procedure” to increase the accuracy of the classification results using five criteria.
    http://ej.uz/Mult-AuthDocDecomposition
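Steps 7 and 8 above (training a Naive-Bayesian classifier on the selected segments and classifying each sentence) might look like the sketch below. It is an illustration only: word unigrams with add-one smoothing are assumed as the feature set, since the slides do not say which features are used, and the names `train_nb` and `posteriors` are hypothetical. The "Segment Elicitation Procedure" and the five criteria of the "Probability Indication Procedure" are not described in the transcript and are not reproduced.

```python
import math
from collections import Counter


def train_nb(segments_by_author):
    """Build a word-count model from {author_id: [segment text, ...]}."""
    counts = {
        author: Counter(w for seg in segs for w in seg.lower().split())
        for author, segs in segments_by_author.items()
    }
    vocab = set().union(*counts.values())
    n_segments = sum(len(segs) for segs in segments_by_author.values())
    priors = {a: len(segs) / n_segments for a, segs in segments_by_author.items()}
    return {"counts": counts, "vocab": vocab, "priors": priors}


def posteriors(model, sentence):
    """Return {author_id: P(author | sentence)} using add-one smoothing."""
    vocab_size = len(model["vocab"])
    log_scores = {}
    for author, word_counts in model["counts"].items():
        total = sum(word_counts.values())
        score = math.log(model["priors"][author])
        for word in sentence.lower().split():
            score += math.log((word_counts[word] + 1) / (total + vocab_size))
        log_scores[author] = score
    # Normalise in log space to avoid underflow on long sentences.
    top = max(log_scores.values())
    norm = sum(math.exp(s - top) for s in log_scores.values())
    return {a: math.exp(s - top) / norm for a, s in log_scores.items()}
```

A sentence would be assigned to the author with the largest posterior. The gap between the top two posteriors is the kind of signal the abstract describes exploiting, although how the paper uses it is not specified here.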
  • 16. Automatic Identification of Age-Appropriate Ratings of Song Lyrics
    This paper presents a novel task, namely the automatic identification of age-appropriate ratings of a musical track, or album, based on its lyrics. Details are provided regarding the construction of a dataset of lyrics from 12,242 tracks across 1,798 albums, together with age-appropriate ratings obtained from various web resources and results from various text classification experiments. The best accuracy of 71.02% for classifying albums by age groups is achieved by combining vector space model and psycholinguistic features.
    http://ej.uz/IDofSongAgeRatings
    Statistics of the dataset:
  • 17. Linguistic Harbingers of Betrayal: A Case Study on an Online Strategy Game
    Interpersonal relations are fickle, with close friendships often dissolving into enmity. In this work, we explore linguistic cues that presage such transitions by studying dyadic interactions in an on-line strategy game where players form alliances and break those alliances through betrayal. We characterize friendships that are unlikely to last and examine temporal patterns that foretell betrayal. We reveal that subtle signs of imminent betrayal are encoded in the conversational patterns of the dyad, even if the victim is not aware of the relationship’s fate. In particular, we find that lasting friendships exhibit a form of balance that manifests itself through language. In contrast, sudden changes in the balance of certain conversational attributes, such as positive sentiment, politeness, or focus on future planning, signal impending betrayal.
    http://ej.uz/LinguisticBetrayal
  • 18. Linguistic Harbingers of Betrayal
    http://ej.uz/LinguisticBetrayal
    Features for recognizing imminent betrayal, in decreasing order:
  • 19. An analysis of the user occupational class through Twitter content
    Social media content can be used as a complementary source to the traditional methods for extracting and studying collective social attributes. This study focuses on the prediction of the occupational class for a public user profile. Our analysis is conducted on a new annotated corpus of Twitter users, their respective job titles, posted textual content and platform-related attributes. We frame our task as classification using latent feature representations such as word clusters and embeddings. The employed linear and, especially, non-linear methods can predict a user’s occupational class with strong accuracy for the coarsest level of a standard occupation taxonomy which includes nine classes. Combined with a qualitative assessment, the derived results confirm the feasibility of our approach in inferring a new user attribute that can be embedded in a multitude of downstream applications.
    http://ej.uz/occupationalClass-Twitter
  • 20. Occupational class through Twitter content
    User level attributes for a Twitter user:
    Topics, represented by their most central and most frequent 10 words:
  • 27. Food
  • 28. More photos & blog post www.lielakeda.lv

Editor's notes

  1. TACL – Transactions of the Association for Computational Linguistics