Task PARus

Name: Choice of Plausible Alternatives for Russian language
Identifier: PARus
Type of the task: Binary Classification
Metrics: Accuracy
License: MIT License
HB score: 0.982
Baseline score: 0.588

Description

The Choice of Plausible Alternatives for Russian language (PARus) evaluation provides researchers with a tool for assessing progress in open-domain commonsense causal reasoning. Each question in PARus consists of a premise and two alternatives; the task is to select the alternative that more plausibly has a causal relation with the premise. The position of the correct alternative is randomized, so the expected accuracy of random guessing is 50%.
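Since the task is scored by accuracy, the metric can be sketched in a few lines. This is only an illustration, not the benchmark's official scorer: the function name `accuracy` and the toy label lists are assumptions made for this example.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of questions where the predicted alternative matches the gold one."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy data (hypothetical, not taken from PARus): 0 = first alternative, 1 = second.
gold = [0, 1, 1, 0]
pred = [0, 1, 0, 0]
print(accuracy(pred, gold))  # 0.75

# Always answering 0 on a balanced set scores 0.5, matching the 50% chance level.
print(accuracy([0, 0, 0, 0], gold))  # 0.5
```

On this scale, the table's baseline score of 0.588 sits only slightly above chance, while the HB score of 0.982 is close to the maximum of 1.0.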

Task Type

Evaluation of commonsense causal reasoning

Sentence Pair Classification: suitable - not suitable

Example


{
  "premise": "Гости вечеринки прятались за диваном.",
  "choice1": "Это была вечеринка-сюрприз.",
  "choice2": "Это был день рождения.",
  "question": "cause",
  "label": 0,
  "idx": 4
}
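The record above can be read as follows. This is a minimal sketch, and it rests on two assumptions: that `label` 0 points at `choice1` and 1 at `choice2` (consistent with this example, where the surprise party explains the hiding), and that `question` may also take the value "effect", as in the English COPA format. The helper names `gold_alternative` and `describe` are hypothetical.

```python
import json

# The example record from this page, kept verbatim.
# English gloss: premise "The party guests hid behind the couch.",
# choice1 "It was a surprise party.", choice2 "It was a birthday."
RECORD_JSON = """
{
  "premise": "Гости вечеринки прятались за диваном.",
  "choice1": "Это была вечеринка-сюрприз.",
  "choice2": "Это был день рождения.",
  "question": "cause",
  "label": 0,
  "idx": 4
}
"""


def gold_alternative(record: dict) -> str:
    """Return the text of the correct alternative (assumes 0 -> choice1, 1 -> choice2)."""
    if record["label"] not in (0, 1):
        raise ValueError(f"unexpected label: {record['label']!r}")
    return record["choice1"] if record["label"] == 0 else record["choice2"]


def describe(record: dict) -> str:
    """Render the record as a readable question, depending on the 'question' field."""
    # Assumption: any value other than "cause" is treated as "effect".
    relation = "cause of" if record["question"] == "cause" else "effect of"
    return f"Most plausible {relation}: {record['premise']} -> {gold_alternative(record)}"


record = json.loads(RECORD_JSON)
print(describe(record))
```

A model evaluated on PARus would predict 0 or 1 for each record, and that prediction is compared against `label`.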

How did we collect data?

All text examples were collected from open news sources and literary magazines, then manually reviewed and supplemented through human assessment on Yandex.Toloka.

State of the Art

English COPA - Accuracy 94.8%

Related papers