File(s) under permanent embargo
ASOBEK: Twitter paraphrase identification with simple overlap features and SVMs
chapter
posted on 2023-06-09, 01:51 authored by Asli Eyecioglu, Bill Keller
We present an approach to identifying Twitter paraphrases using simple lexical overlap features. The work is part of ongoing research into the applicability of knowledge-lean techniques to paraphrase identification. We use features based on the overlap of word and character n-grams and train a support vector machine (SVM). Our results demonstrate that character- and word-level overlap features in combination can give performance comparable to methods employing more sophisticated NLP processing tools and external resources. We achieve the highest F-score for identifying paraphrases on the Twitter Paraphrase Corpus as part of SemEval-2015 Task 1.
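As a rough illustration of what word and character n-gram overlap features can look like, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the exact feature definitions, so the set-size features, the lowercasing, the whitespace tokenization, and the function names (`ngrams`, `overlap_features`, `pair_features`) are all assumptions made for this example.

```python
def ngrams(seq, n):
    """Return the set of contiguous n-grams of a string or a token list."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def overlap_features(a, b):
    """Illustrative overlap features for two n-gram sets (an assumption,
    not taken from the abstract): union size, intersection size, and the
    size of each set."""
    return [len(a | b), len(a & b), len(a), len(b)]


def pair_features(s1, s2, word_n=1, char_n=2):
    """Build one feature vector for a sentence pair by concatenating
    word-level and character-level overlap features."""
    s1, s2 = s1.lower(), s2.lower()
    # Whitespace tokenization is a simplifying assumption.
    words1, words2 = s1.split(), s2.split()
    feats = overlap_features(ngrams(words1, word_n), ngrams(words2, word_n))
    feats += overlap_features(ngrams(s1, char_n), ngrams(s2, char_n))
    return feats


if __name__ == "__main__":
    print(pair_features("the cat sat", "the cat ran"))
```

Vectors of this kind, one per tweet pair, could then be passed with paraphrase/non-paraphrase labels to any SVM implementation for training.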
History
Publication status
- Published
File Version
- Published version
Journal
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)
Publisher
Association for Computational Linguistics (ACL)
Page range
64-69
Book title
SemEval-2015: The 9th International Workshop on Semantic Evaluation: proceedings of SemEval-2015: June 4-5, 2016, Denver, Colorado, USA
Place of publication
Stroudsburg, PA
ISBN
9781941643402
Department affiliated with
- Informatics Publications
Full text available
- No
Peer reviewed?
- Yes