University of Sussex
File(s) under permanent embargo

ASOBEK: Twitter paraphrase identification with simple overlap features and SVMs

chapter
posted on 2023-06-09, 01:51 authored by Asli Eyecioglu, Bill Keller
We present an approach to identifying Twitter paraphrases using simple lexical overlap features. The work is part of ongoing research into the applicability of knowledge-lean techniques to paraphrase identification. We utilize features based on the overlap of word and character n-grams and train a support vector machine (SVM). Our results demonstrate that character-level and word-level overlap features in combination can give performance comparable to methods employing more sophisticated NLP processing tools and external resources. We achieve the highest F-score for identifying paraphrases on the Twitter Paraphrase Corpus as part of SemEval-2015 Task 1.
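To make the idea of word and character n-gram overlap features concrete, the following is a minimal Python sketch. It is not taken from the paper: the function names, the whitespace tokenization, the lowercasing, the choice of word unigrams and character bigrams, and the four set-size features per n-gram type are all assumptions made for illustration.

```python
def word_ngrams(text, n):
    """Set of word n-grams (as tuples) from lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def char_ngrams(text, n):
    """Set of character n-grams from lowercased text (spaces are kept)."""
    s = text.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def overlap_features(a, b):
    """Sizes of each n-gram set, of their intersection, and of their union."""
    return [len(a), len(b), len(a & b), len(a | b)]


def pair_features(t1, t2):
    """Hypothetical feature vector for a tweet pair:
    word-unigram overlap followed by character-bigram overlap."""
    feats = []
    feats += overlap_features(word_ngrams(t1, 1), word_ngrams(t2, 1))
    feats += overlap_features(char_ngrams(t1, 2), char_ngrams(t2, 2))
    return feats


if __name__ == "__main__":
    print(pair_features("the cat sat", "the cat ran"))
```

A vector of this kind, computed for each labelled tweet pair, could then be passed to an off-the-shelf SVM implementation (for example scikit-learn's `sklearn.svm.SVC`) to train a binary paraphrase classifier. The kernel and parameters the authors used are not stated in this record.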

History

Publication status

  • Published

File Version

  • Published version

Journal

Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

Publisher

Association for Computational Linguistics (ACL)

Page range

64-69

Book title

SemEval-2015: The 9th International Workshop on Semantic Evaluation: proceedings of SemEval-2015: June 4-5, 2015, Denver, Colorado, USA

Place of publication

Stroudsburg, PA

ISBN

9781941643402

Department affiliated with

  • Informatics Publications

Full text available

  • No

Peer reviewed?

  • Yes

Legacy Posted Date

2016-06-23

First Open Access (FOA) Date

2016-06-23

First Compliant Deposit (FCD) Date

2016-06-23
