| Name | Identifier | Type of the task | Metrics | License | Download | HB score |
|---|---|---|---|---|---|---|
| Choice of Plausible Alternatives for Russian language | PARus | Binary Classification | Accuracy | MIT License | | 0.982 |
Choice of Plausible Alternatives for Russian language (PARus) evaluation provides researchers with a tool for assessing progress in open-domain commonsense causal reasoning. Each question in PARus is composed of a premise and two alternatives, where the task is to select the alternative that more plausibly has a causal relation with the premise. The correct alternative is randomized so that the expected performance of randomly guessing is 50%.
The benchmark targets the evaluation of commonsense causal reasoning and is framed as sentence pair classification: suitable vs. not suitable. A sample instance:
```json
{
    "premise": "Гости вечеринки прятались за диваном.",
    "choice1": "Это была вечеринка-сюрприз.",
    "choice2": "Это был день рождения.",
    "question": "cause",
    "label": 0,
    "idx": 4
}
```

In English: the premise is "The party guests were hiding behind the couch.", choice1 is "It was a surprise party.", choice2 is "It was a birthday."; the question type is "cause", and label 0 indicates that choice1 is the correct alternative.
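Because the official metric is plain accuracy over the binary label, scoring predictions takes only a few lines. The sketch below is a minimal, unofficial illustration: it reads a PARus-style JSON Lines file (the path `parus_train.jsonl` is a placeholder, not an official file name) and scores a uniform random baseline, which should land near the 50% expectation noted above.

```python
import json
import random

def score_parus(path, predict):
    """Accuracy of `predict` over a PARus-style JSON Lines file.

    Each line holds a JSON object with "premise", "choice1", "choice2",
    "question" ("cause" or "effect") and a gold "label" (0 or 1).
    """
    correct = total = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            example = json.loads(line)
            prediction = predict(example)  # 0 -> choice1, 1 -> choice2
            correct += int(prediction == example["label"])
            total += 1
    return correct / total if total else 0.0

# Uniform random baseline: expected accuracy is about 0.5,
# matching the random-guessing reference point described above.
def random_baseline(example):
    return random.randint(0, 1)

if __name__ == "__main__":
    # "parus_train.jsonl" is a hypothetical local path, not part of the benchmark.
    print(f"Random baseline accuracy: {score_parus('parus_train.jsonl', random_baseline):.3f}")
```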
All text examples were collected from open news sources and literary magazines, then manually reviewed and supplemented by human assessment on Yandex.Toloka.
Please be careful: PArsed RUssian Sentences, which shares the PARus abbreviation, is a separate dataset for morphological and syntactic annotation and is not part of Russian SuperGLUE.
For comparison, the English COPA benchmark reports an accuracy of 94.8%.
See also: SemEval 2012 Task 7. COPA was used as a shared task (Task 7) in the 6th International Workshop on Semantic Evaluation (SemEval 2012). The winning system was created by Travis Goodwin, Bryan Rink, Kirk Roberts, and Sanda M. Harabagiu of the Human Language Technology Research Institute at the University of Texas at Dallas. Details about this shared task and the performance of competing systems are provided in the following paper:
Gordon, A., Kozareva, Z., and Roemmele, M. (2012) SemEval-2012 Task 7: Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning. Proceedings of the 6th International Workshop on Semantic Evaluation (SemEval 2012), June 7-8, 2012, Montreal, Canada.