| Dataset | Score | Metric | 
|---|---|---|
| LiDiRus | 0.369 | Matthews Corr | 
| RCB | 0.328 / 0.457 | F1/Acc | 
| PARus | 0.59 | Accuracy | 
| MuSeRC | 0.809 / 0.501 | F1a/EM | 
| TERRa | 0.798 | Accuracy | 
| RUSSE | 0.765 | Accuracy | 
| RWSD | 0.669 | Accuracy | 
| DaNetQA | 0.757 | Accuracy | 
| RuCoS | 0.89 / 0.886 | F1/EM | 

The underlying model (560M parameters) was pre-trained by Facebook on the multilingual CC100 dataset, which includes Russian. We then fine-tune the pre-trained model separately for each RussianSuperGLUE task. For LiDiRus, we start from the TERRa-fine-tuned model. For RWSD, we use the majority-class baseline. For each task, we save a checkpoint after every epoch and submit the one with the best validation metrics.

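
The two selection rules mentioned above (a majority-class baseline and picking the best per-epoch checkpoint) can be sketched in a few lines. This is only an illustration: the function names `majority_class_baseline` and `best_checkpoint` and the toy inputs are hypothetical, and the tie-breaking rule (earliest epoch wins) is an assumption, not something the description specifies.

```python
from collections import Counter


def majority_class_baseline(train_labels, n_test):
    """Predict the most frequent training label for every test example."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    return [majority_label] * n_test


def best_checkpoint(val_scores):
    """Return the 1-based epoch whose validation score is highest.

    val_scores[i] is the validation metric measured after epoch i + 1.
    Ties go to the earliest epoch (an assumption made for this sketch).
    """
    best_index = max(range(len(val_scores)), key=lambda i: val_scores[i])
    return best_index + 1


# Toy inputs, invented for illustration only.
print(majority_class_baseline([False, True, False, False], n_test=3))
print(best_checkpoint([0.71, 0.78, 0.80, 0.79]))
```
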

Hyperparameters for fine-tuning xlm-roberta-large: batch size {8, 16, 32, 64}, epochs {20, 30}, learning rate {1e-06, 2e-06, 1e-05, 3e-05}, learning-rate scheduler {constant, linear}, warmup ratio {0.02, 0.05, 0.1}, weight decay {0, 0.01, 0.1}.

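
To give a sense of the size of this search space, the sketch below enumerates every combination of the listed values. It assumes an exhaustive grid purely for illustration; the description does not say whether all combinations were actually tried, and the dictionary keys and the helper `iter_configs` are hypothetical names.

```python
from itertools import product

# Values copied from the hyperparameter list; key names are illustrative.
search_space = {
    "batch_size": [8, 16, 32, 64],
    "epochs": [20, 30],
    "learning_rate": [1e-06, 2e-06, 1e-05, 3e-05],
    "lr_scheduler": ["constant", "linear"],
    "warmup_ratio": [0.02, 0.05, 0.1],
    "weight_decay": [0, 0.01, 0.1],
}


def iter_configs(space):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(search_space))
print(len(configs))  # 4 * 2 * 4 * 2 * 3 * 3 = 576
print(configs[0])
```

A full grid of this size is expensive for a 560M-parameter model, so in practice a random or hand-picked subset of `configs` may be a more realistic reading of the list.
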
| Category | Score | 
|---|---|
| LOGIC | 0.1963386058559272 | 
| KNOWLEDGE | 0.3156834130078425 | 
| PREDICATE-ARGUMENT STRUCTURE | 0.3777715485318068 | 
| LEXICAL SEMANTICS | 0.45839855677638014 | 

| Category | Score | 
|---|---|
| Lexical Semantics - Lexical Entailment | 0.4282616143136151 | 
| Lexical Semantics - Morphological Negation | 0.45175395145262565 | 
| Lexical Semantics - Factivity | 0.18428853505018536 | 
| Lexical Semantics - Symmetry/Collectivity | 0.43852900965351466 | 
| Lexical Semantics - Redundancy | 0.27348301713730944 | 
| Lexical Semantics - Named Entities | 0.38949041885226005 | 
| Lexical Semantics - Quantifiers | 0.501227406091964 | 
| Predicate-Argument Structure - Core Args | 0.40943028340181464 | 
| Predicate-Argument Structure - Prepositional Phrases | 0.5823384109195534 | 
| Predicate-Argument Structure - Ellipsis/Implicits | 0.3263956049169334 | 
| Predicate-Argument Structure - Anaphora/Coreference | 0.26875600982680947 | 
| Predicate-Argument Structure - Active/Passive | 0.1835325870964494 | 
| Predicate-Argument Structure - Nominalization | 0.4687501237868722 | 
| Predicate-Argument Structure - Genitives/Partitives | 0.8660254037844386 | 
| Predicate-Argument Structure - Datives | 0.629940788348712 | 
| Predicate-Argument Structure - Relative Clauses | 0.16265001215808886 | 
| Predicate-Argument Structure - Coordination Scopes | 0.33454829277463405 | 
| Predicate-Argument Structure - Intersectivity | 0.2553606237816764 | 
| Predicate-Argument Structure - Restrictivity | 0.2748737083745107 | 
| Logic - Negation | 0.045531262041544854 | 
| Logic - Double Negation | 0.30151134457776363 | 
| Logic - Interval/Numbers | -0.1050485078938172 | 
| Logic - Conjunction | 0.42163702135578396 | 
| Logic - Disjunction | 0.008695652173913044 | 
| Logic - Conditionals | 0.14285714285714285 | 
| Logic - Universal | 0.4029114820126901 | 
| Logic - Existential | 0.04279604925109129 | 
| Logic - Temporal | 0.2548235957188128 | 
| Logic - Upward Monotone | 0.623033246356214 | 
| Logic - Downward Monotone | -0.06726727939963124 | 
| Logic - Non-Monotonic | 0.09267505241022214 | 
| Knowledge - Common Sense | 0.2876139817629179 | 
| Knowledge - World Knowledge | 0.3387649364472491 | 
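
The category table does not name its metric. Assuming the scores are Matthews correlation coefficients, as the LiDiRus row in the results table suggests, the negative entries simply mean predictions that are slightly anti-correlated with the gold labels. The sketch below computes the binary form of the coefficient from confusion-matrix counts; the function name `binary_mcc` and the toy labels are hypothetical, and returning 0.0 when the denominator is zero is a common convention rather than something stated here.

```python
import math


def binary_mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (truthy = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: undefined cases (e.g. constant predictions) score 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Toy labels, invented for illustration only.
gold = [1, 1, 0, 0]
print(binary_mcc(gold, [1, 1, 0, 0]))  # perfect agreement
print(binary_mcc(gold, [0, 0, 1, 1]))  # fully inverted predictions
print(binary_mcc(gold, [1, 0, 0, 0]))  # one missed positive
```

The coefficient ranges from -1 to 1, with 0 corresponding to chance-level predictions.
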

| Dataset | Speed | RAM | 
|---|---|---|
| LiDiRus | - | - | 
| RCB | - | - | 
| PARus | - | - | 
| MuSeRC | - | - | 
| TERRa | - | - | 
| RUSSE | - | - | 
| RWSD | - | - | 
| DaNetQA | - | - | 
| RuCoS | - | - |