The distributional similarity of sub-parses.
Presentation posted on 2023-06-07, 19:32, authored by Julie Weeds, David Weir, Bill Keller
This work explores computing distributional similarity between sub-parses, i.e., fragments of a parse tree, as an extension to general lexical distributional similarity techniques. In the same way that lexical distributional similarity is used to estimate lexical semantic similarity, we propose using distributional similarity between sub-parses to estimate the semantic similarity of phrases. Such a technique will allow us to identify paraphrases where the component words are not semantically similar. We demonstrate the potential of the method by applying it to a small number of examples and showing that the paraphrases are more similar than the non-paraphrases.
Publisher: Association for Computational Linguistics
Event name: ACL Workshop on Empirical Modelling of Semantic Equivalence and Entailment
Event location: Ann Arbor
Event date: June 2005
Department affiliated with
- Informatics Publications
Notes
Originality: This paper proposes a novel application of distributional similarity techniques in order to estimate the semantic similarity of phrases. The approach allows the identification of paraphrases where the component words are not semantically similar.
Rigour: The approach was evaluated using the Pascal Textual Entailment Challenge dataset as a suitable source of paraphrase test data. In order to avoid sparse data problems when computing distributional similarity for phrases, corpus data was gathered for each phrase by mining the World Wide Web.
Significance: The work advances research on semantic similarity by extending lexical similarity techniques to phrases. It provides a new approach to the identification of paraphrases, with application in areas such as information retrieval, summarization and language understanding.
Impact: Appeared in a specialist ACL workshop focussed on the rapidly developing topics of semantic equivalence and textual entailment. The same year saw the first in the annual series of workshops on the Pascal Textual Entailment Challenge, which has since set a benchmark for work in this area.
Full text available