Russian SuperGLUE

Task RUSSE

Name	Identifier	Type of the task	Metrics	License	Download	HB score	Baseline score
Russian WiC - RUSSE	RUSSE	Binary Classification	Accuracy	MIT License		0.805

Description

WiC: The Word-in-Context Dataset. A reliable benchmark for the evaluation of context-sensitive word embeddings.

Depending on its context, an ambiguous word can refer to multiple, potentially unrelated, meanings. Mainstream static word embeddings, such as Word2vec and GloVe, are unable to reflect this dynamic semantic nature. Contextualised word embeddings are an attempt at addressing this limitation by computing dynamic representations for words which can adapt based on context.

The Russian SuperGLUE task borrows its original data from the Russe project, the Word Sense Induction and Disambiguation shared task (2018).

Task Type

Reading Comprehension. Binary Classification: true/false
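Because the task is a true/false classification scored by accuracy, the metric can be sketched in a few lines. This is only an illustrative sketch, not the official scoring script: the function name `accuracy` and the toy label lists are hypothetical.

```python
def accuracy(gold, predicted):
    """Fraction of examples whose predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Hypothetical labels, for illustration only (not real RUSSE data).
gold_labels = [False, True, True, False]
predicted_labels = [False, True, False, False]

score = accuracy(gold_labels, predicted_labels)
print(score)  # 3 of 4 predictions match
```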

Example


{
  "idx" : 8,
  "word" : "дорожка",
  "sentence1" : "Бурые ковровые дорожки заглушали шаги",
  "sentence2" : "Приятели решили выпить на дорожку в местном баре",
  "start1" : 15,
  "end1" : 23,
  "start2" : 26,
  "end2" : 34,
  "label" : false,
  "gold_sense1" : 1,
  "gold_sense2" : 2
}
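A record like the one above can be read as follows. This is a minimal sketch that assumes `start`/`end` are character offsets into the sentences; the helper name `target_form` is hypothetical. In the example as printed here, the `end` offsets land one character past the word, so the sketch strips surrounding whitespace rather than trusting the slice boundaries exactly.

```python
import json

# The example record from this page, embedded as a JSON string.
RECORD_JSON = """
{
  "idx": 8,
  "word": "дорожка",
  "sentence1": "Бурые ковровые дорожки заглушали шаги",
  "sentence2": "Приятели решили выпить на дорожку в местном баре",
  "start1": 15,
  "end1": 23,
  "start2": 26,
  "end2": 34,
  "label": false,
  "gold_sense1": 1,
  "gold_sense2": 2
}
"""


def target_form(record, which):
    """Return the inflected form of the target word in sentence 1 or 2."""
    sentence = record[f"sentence{which}"]
    start = record[f"start{which}"]
    end = record[f"end{which}"]
    # Assumption: offsets are character indices; strip() drops any
    # whitespace the slice picks up at its edges.
    return sentence[start:end].strip()


record = json.loads(RECORD_JSON)
form1 = target_form(record, 1)
form2 = target_form(record, 2)
same_sense = record["label"]  # JSON false becomes Python False

print(record["word"], form1, form2, same_sense)
```

Here the two extracted forms are different inflections of the lemma in `word`, and `label` is `false` because the senses differ, which matches the distinct `gold_sense1` and `gold_sense2` values.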

How did we collect data?

All text examples were taken from the original Russe dataset, which had already been collected by Russian Semantic Evaluation at ACL SIGSLAV. Human assessment was carried out on Yandex.Toloka.

In version 2, we manually collected a test set in the same format.

State of the Art

English WiC - Accuracy 76.9%

Related papers

Original Russe paper: Panchenko, A., Lopukhina, A., Ustalov, D., Lopukhin, K., Arefyev, N., Leontyev, A., Loukachevitch, N.: RUSSE’2018: A Shared Task on Word Sense Induction for the Russian Language. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue”. pp. 547–564. RSUH, Moscow, Russia (2018)
Original WiC paper: Pilehvar, M.T., Camacho-Collados, J.: WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations. In: NAACL 2019 (Minneapolis, USA). Note: Results differ slightly between the NAACL and arXiv versions of the paper. Please use the results in the arXiv version, which is more up to date, as the baseline for your evaluations.
Wang, A. et al.: SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems. In: Advances in Neural Information Processing Systems. pp. 3261-3275 (2019)