University of Sussex

File(s) not publicly available

The distributional similarity of sub-parses.

posted on 2023-06-07, 19:32 authored by Julie Weeds, David Weir, Bill Keller
This work explores computing distributional similarity between sub-parses, i.e., fragments of a parse tree, as an extension to general lexical distributional similarity techniques. In the same way that lexical distributional similarity is used to estimate lexical semantic similarity, we propose using distributional similarity between sub-parses to estimate the semantic similarity of phrases. Such a technique will allow us to identify paraphrases where the component words are not semantically similar. We demonstrate the potential of the method by applying it to a small number of examples and showing that the paraphrases are more similar than the non-paraphrases.
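The general idea of comparing phrases by the contexts they occur in can be illustrated with a minimal sketch. This is not the authors' method: it assumes cosine similarity over raw context counts as a stand-in for whatever similarity measure the paper uses, and the phrases, context features (such as `subj:king`) and counts are invented toy data, not drawn from the paper or any corpus.

```python
from collections import Counter
from math import sqrt


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse context-count vectors."""
    # Only features shared by both vectors contribute to the dot product.
    dot = sum(u[f] * v[f] for f in u.keys() & v.keys())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical context observations for three parse fragments.
# Each string is an invented feature: a relation plus the word filling it.
kick_the_bucket = Counter(
    ["subj:grandfather", "subj:king", "mod:suddenly", "subj:king"]
)
die = Counter(
    ["subj:king", "subj:grandfather", "mod:suddenly", "mod:peacefully"]
)
fill_the_bucket = Counter(
    ["subj:farmer", "with:water", "subj:grandfather"]
)

paraphrase_score = cosine(kick_the_bucket, die)
non_paraphrase_score = cosine(kick_the_bucket, fill_the_bucket)
print(f"paraphrase pair:     {paraphrase_score:.3f}")
print(f"non-paraphrase pair: {non_paraphrase_score:.3f}")
```

In this toy data, "kick the bucket" and "die" share no content words yet share most of their contexts, so they score higher than "kick the bucket" and "fill the bucket", which share words but few contexts. That mirrors the kind of case the abstract describes, under the assumptions stated above.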


Publication status

  • Published

Publisher

Association for Computational Linguistics

Presentation Type

  • paper

Event name

ACL Workshop on Empirical Modelling of Semantic Equivalence and Entailment

Event location

Ann Arbor

Event date

June 2005

Department affiliated with

  • Informatics Publications


Originality: This paper proposes a novel application of distributional similarity techniques to estimate the semantic similarity of phrases. The approach allows the identification of paraphrases whose component words are not semantically similar.

Rigour: The approach was evaluated using the Pascal Textual Entailment Challenge dataset as a source of paraphrase test data. To avoid sparse data problems when computing distributional similarity for phrases, corpus data was gathered for each phrase by mining the World Wide Web.

Significance: The work advances research on semantic similarity by extending lexical similarity techniques to phrases. It provides a new approach to the identification of paraphrases, with applications in areas such as information retrieval, summarization and language understanding.

Impact: The paper appeared in a specialist ACL workshop focussed on the rapidly developing topics of semantic equivalence and textual entailment. The same year saw the first in the annual series of workshops on the Pascal Textual Entailment Challenge, which has since set a benchmark for work in this area.

Full text available

  • No

Peer reviewed?

  • Yes
