Improving sparse word representations with distributional inference for semantic composition

Kober, Thomas; Weeds, Julie; Reffin, Jeremy; Weir, David

D16-1175.pdf (232.22 kB)

chapter

posted on 2023-06-09, 05:14, authored by Thomas Kober, Julie Weeds, Jeremy Reffin, David Weir

Distributional models are derived from co-occurrences in a corpus, where only a small proportion of all possible plausible co-occurrences will be observed. This results in a very sparse vector space, requiring a mechanism for inferring missing knowledge. Most methods face this challenge in ways that render the resulting word representations uninterpretable, with the consequence that semantic composition becomes hard to model. In this paper we explore an alternative which involves explicitly inferring unobserved co-occurrences using the distributional neighbourhood. We show that distributional inference improves sparse word representations on several word similarity benchmarks and demonstrate that our model is competitive with the state-of-the-art for adjective-noun, noun-noun and verb-object compositions while being fully interpretable.

History

Publication status

Published

File Version

Published version

Publisher

Association for Computational Linguistics

Publisher URL

http://aclanthology.info/events/emnlp-2016

Page range

1691-1702

Pages

2392

Event name

Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Event location

Austin, TX

Event type

conference

Event date

1-5 November 2016

Book title

Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

ISBN

9781945626258

Department affiliated with

Informatics Publications

Research groups affiliated with

Data Science Research Group Publications

Full text available

Yes

Peer reviewed?

Yes

Legacy Posted Date

2017-02-20

First Open Access (FOA) Date

2017-02-20

First Compliant Deposit (FCD) Date

2017-02-20

Licence

CC BY 4.0
