Inducing crosslingual distributed representations of words

Alexandre Klementiev; Ivan Titov; Binod Bhattarai

Computing Science

Saarland University

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

Abstract

Distributed representations of words have proven extremely useful in numerous natural language processing tasks. Their appeal is that they can help alleviate data sparsity problems common to supervised learning. Methods for inducing these representations require only unlabeled language data, which are plentiful for many natural languages. In this work, we induce distributed representations for a pair of languages jointly. We treat it as a multitask learning problem where each task corresponds to a single word, and task relatedness is derived from co-occurrence statistics in bilingual parallel data. These representations can be used for a number of crosslingual learning tasks, where a learner can be trained on annotations present in one language and applied to test data in another. We show that our representations are informative by using them for crosslingual document classification, where classifiers trained on these representations substantially outperform strong baselines (e.g. machine translation) when applied to a new language.
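
As a rough illustration of how task relatedness might be derived from co-occurrence statistics in parallel data, the minimal Python sketch below counts how often a source word and a target word appear in the same sentence pair, then normalizes the counts per source word. Everything in it is an assumption made for illustration and is not taken from the paper: the toy English-German corpus, the sentence-level counting, the per-word normalization, and the squared-distance coupling penalty (a common form of multitask regularizer, not necessarily the objective the authors use).

```python
from collections import Counter
from itertools import product

# Hypothetical toy parallel corpus: (English tokens, German tokens) pairs.
parallel = [
    ("the house".split(), "das haus".split()),
    ("the cat".split(), "die katze".split()),
    ("a house".split(), "ein haus".split()),
]

def cooccurrence_counts(pairs):
    """Count how often each (source word, target word) pair shares a sentence pair."""
    counts = Counter()
    for src, tgt in pairs:
        for s, t in product(set(src), set(tgt)):
            counts[(s, t)] += 1
    return counts

def relatedness(counts):
    """Normalize counts per source word so that each word's weights sum to 1."""
    totals = Counter()
    for (s, _), c in counts.items():
        totals[s] += c
    return {(s, t): c / totals[s] for (s, t), c in counts.items()}

def coupling_penalty(weights, src_vecs, tgt_vecs):
    """Weighted sum of squared distances between related word vectors.

    Assumed form: sum over (s, t) of weight[s, t] * ||src_vecs[s] - tgt_vecs[t]||^2.
    """
    total = 0.0
    for (s, t), w in weights.items():
        total += w * sum((a - b) ** 2 for a, b in zip(src_vecs[s], tgt_vecs[t]))
    return total

counts = cooccurrence_counts(parallel)
weights = relatedness(counts)
```

Under these assumptions, a penalty of this kind pulls the vectors of frequently co-occurring word pairs toward each other, which is one way a classifier trained on one language's representations could be applied to another language.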

Original language	English
Title of host publication	Proceedings of COLING 2012
Pages	1459-1474
Number of pages	16
Publication status	Published - 2012

Bibliographical note

Proceedings of COLING 2012: Technical Papers, pages 1459–1474,
COLING 2012, Mumbai, December 2012.

Access to Document

https://aclanthology.org/C12-1089.pdf

Cite this

@inproceedings{258c696e6806491fa34fd5446ad7a58f,
  title = "Inducing crosslingual distributed representations of words",
  abstract = "Distributed representations of words have proven extremely useful in numerous natural language processing tasks. Their appeal is that they can help alleviate data sparsity problems common to supervised learning. Methods for inducing these representations require only unlabeled language data, which are plentiful for many natural languages. In this work, we induce distributed representations for a pair of languages jointly. We treat it as a multitask learning problem where each task corresponds to a single word, and task relatedness is derived from co-occurrence statistics in bilingual parallel data. These representations can be used for a number of crosslingual learning tasks, where a learner can be trained on annotations present in one language and applied to test data in another. We show that our representations are informative by using them for crosslingual document classification, where classifiers trained on these representations substantially outperform strong baselines (e.g. machine translation) when applied to a new language.",
  author = "Alexandre Klementiev and Ivan Titov and Binod Bhattarai",
  note = "Proceedings of COLING 2012: Technical Papers, pages 1459–1474, COLING 2012, Mumbai, December 2012.",
  year = "2012",
  language = "English",
  pages = "1459--1474",
  booktitle = "Proceedings of COLING 2012",
}

TY - GEN

T1 - Inducing crosslingual distributed representations of words

AU - Klementiev, Alexandre

AU - Titov, Ivan

AU - Bhattarai, Binod

N1 - Proceedings of COLING 2012: Technical Papers, pages 1459–1474, COLING 2012, Mumbai, December 2012.

PY - 2012

Y1 - 2012

AB - Distributed representations of words have proven extremely useful in numerous natural language processing tasks. Their appeal is that they can help alleviate data sparsity problems common to supervised learning. Methods for inducing these representations require only unlabeled language data, which are plentiful for many natural languages. In this work, we induce distributed representations for a pair of languages jointly. We treat it as a multitask learning problem where each task corresponds to a single word, and task relatedness is derived from co-occurrence statistics in bilingual parallel data. These representations can be used for a number of crosslingual learning tasks, where a learner can be trained on annotations present in one language and applied to test data in another. We show that our representations are informative by using them for crosslingual document classification, where classifiers trained on these representations substantially outperform strong baselines (e.g. machine translation) when applied to a new language.

M3 - Published conference contribution

SP - 1459

EP - 1474

BT - Proceedings of COLING 2012

ER -
