| Name | Identifier | Type of the task | Metrics | License | Download | HB score |
|---|---|---|---|---|---|---|
| Choice of Plausible Alternatives for Russian language | PARus | Binary Classification | Accuracy | MIT License | | 0.982 |
Choice of Plausible Alternatives for Russian language (PARus) evaluation provides researchers with a tool for assessing progress in open-domain commonsense causal reasoning. Each question in PARus is composed of a premise and two alternatives, where the task is to select the alternative that more plausibly has a causal relation with the premise. The correct alternative is randomized so that the expected performance of randomly guessing is 50%.
The benchmark targets the evaluation of commonsense causal reasoning and is framed as sentence pair classification: suitable vs. not suitable. A sample instance:
```json
{
    "premise": "Гости вечеринки прятались за диваном.",
    "choice1": "Это была вечеринка-сюрприз.",
    "choice2": "Это был день рождения.",
    "question": "cause",
    "label": 0,
    "idx": 4
}
```

In English: the premise is "The party guests were hiding behind the couch.", choice1 is "It was a surprise party.", choice2 is "It was a birthday."; the question type is "cause", and label 0 indicates that choice1 is the correct alternative.
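Because the official metric is plain accuracy over the binary label, scoring predictions takes only a few lines. The sketch below is a minimal, unofficial illustration: it reads a PARus-style JSON Lines file (the path `parus_train.jsonl` is a placeholder, not an official file name) and scores a uniform random baseline, which should land near the 50% expectation noted above.

```python
import json
import random

def score_parus(path, predict):
    """Accuracy of `predict` over a PARus-style JSON Lines file.

    Each line holds a JSON object with "premise", "choice1", "choice2",
    "question" ("cause" or "effect") and a gold "label" (0 or 1).
    """
    correct = total = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            example = json.loads(line)
            prediction = predict(example)  # 0 -> choice1, 1 -> choice2
            correct += int(prediction == example["label"])
            total += 1
    return correct / total if total else 0.0

# Uniform random baseline: expected accuracy is about 0.5,
# matching the random-guessing reference point described above.
def random_baseline(example):
    return random.randint(0, 1)

if __name__ == "__main__":
    # "parus_train.jsonl" is a hypothetical local path, not part of the benchmark.
    print(f"Random baseline accuracy: {score_parus('parus_train.jsonl', random_baseline):.3f}")
```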
All text examples were collected from open news sources and literary magazines, then manually reviewed and supplemented by human assessment on Yandex.Toloka.
Please be careful: PArsed RUssian Sentences, which shares the PARus abbreviation, is a separate dataset for morphological and syntactic annotation and is not part of Russian SuperGLUE.
For comparison, the English COPA benchmark reports an accuracy of 94.8%.
See also: SemEval 2012 Task 7. COPA was used as a shared task (Task 7) in the 6th International Workshop on Semantic Evaluation (SemEval 2012). The winning system was created by Travis Goodwin, Bryan Rink, Kirk Roberts, and Sanda M. Harabagiu of the Human Language Technology Research Institute at the University of Texas at Dallas. Details about this shared task and the performance of competing systems are provided in the following paper:
Gordon, A., Kozareva, Z., and Roemmele, M. (2012) SemEval-2012 Task 7: Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning. Proceedings of the 6th International Workshop on Semantic Evaluation (SemEval 2012), June 7-8, 2012, Montreal, Canada.