File(s) not publicly available
Detecting a continuum of compositionality in phrasal verbs
presentation
posted on 2023-06-07, 20:09 authored by Diana McCarthy, Bill Keller, John Carroll
We investigate the use of an automatically acquired thesaurus for measures designed to indicate the compositionality of candidate multiword verbs, specifically English phrasal verbs identified automatically using a robust parser. We examine various measures using the nearest neighbours of the phrasal verb, and in some cases the neighbours of the simplex counterpart, and show that some of these correlate significantly with human rankings of compositionality on the test set. We also show that whilst the compositionality judgements correlate with some statistics commonly used for extracting multiwords, the relationship is not as strong as that using the automatically constructed thesaurus.
History
Publication status
- Published
Publisher URL
External DOI
Page range
73-80
Pages
8
Presentation Type
- paper
Event name
Workshop on Multi-Word Expressions: Analysis, Acquisition and Treatment (ACL 2003)
Event location
Sapporo, Japan
Event type
conference
Department affiliated with
- Informatics Publications
Notes
Originality: Describes an original approach to determining the degree to which multi-word expressions (phrasal verbs) are compositional in meaning, based on an automatically acquired thesaurus. Proposes a continuum of compositionality.
Rigour: Evaluated on a novel dataset with human judgements of compositionality showing a highly significant figure for inter-annotator agreement. Highly significant correlations were obtained between the human judgements and measures proposed in the paper.
Significance: The methodology and dataset have been taken up by other researchers, though to date several of the measures proposed have not been outperformed on this data. Other researchers have adapted the methodology to detect compositionality of other multiword constructions.
Impact: 38 Google Scholar citations (not counting two cites by co-authors). The dataset has been made publicly available and several international researchers have used it in subsequent experiments.
Outlet: Appeared in the first in a series of 4 workshops to date in the burgeoning field of multiword expressions. The workshop forms part of the ACL conference.
Full text available
- No
Peer reviewed?
- Yes