Sept. 4, 2023, 7:48 a.m.
Team: Saiga team
| Dataset | Score | Metric | 
|---|---|---|
| LiDiRus | 0.365 | Matthews Corr | 
| RCB | 0.385 / 0.461 | F1/Acc | 
| PARus | 0.82 | Accuracy | 
| MuSeRC | 0.669 / 0.098 | F1a/EM | 
| TERRa | 0.811 | Accuracy | 
| RUSSE | 0.59 | Accuracy | 
| RWSD | 0.831 | Accuracy | 
| DaNetQA | 0.878 | Accuracy | 
| RuCoS | 0.69 / 0.678 | F1/EM | 
LLaMA-70B fine-tuned for Russian on several datasets.
See: https://github.com/IlyaGusev/rulm/tree/master/self_instruct
Evaluation code: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/src/eval_rsg.py
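The per-task scores in the results table can be collapsed into a single summary number. The sketch below assumes an unweighted mean over tasks, with two-metric tasks (RCB, MuSeRC, RuCoS) first averaged over their own metrics. That aggregation rule is an assumption for illustration and is not stated on this page; the names `scores`, `task_score`, and `overall_score` are hypothetical.

```python
# Scores copied from the results table; two-metric tasks list both values.
scores = {
    "LiDiRus": [0.365],
    "RCB": [0.385, 0.461],
    "PARus": [0.82],
    "MuSeRC": [0.669, 0.098],
    "TERRa": [0.811],
    "RUSSE": [0.59],
    "RWSD": [0.831],
    "DaNetQA": [0.878],
    "RuCoS": [0.69, 0.678],
}


def task_score(metrics):
    """Average the metrics reported for one task (assumed rule)."""
    return sum(metrics) / len(metrics)


def overall_score(table):
    """Unweighted mean of the per-task scores (assumed rule)."""
    per_task = [task_score(m) for m in table.values()]
    return sum(per_task) / len(per_task)


if __name__ == "__main__":
    print(f"Overall (assumed unweighted mean): {overall_score(scores):.3f}")
```

If the leaderboard weights tasks or metrics differently, only `task_score` and `overall_score` would need to change.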
| Category | Score | 
|---|---|
| LOGIC | 0.3228855331810909 | 
| KNOWLEDGE | 0.3518320326381725 | 
| PREDICATE-ARGUMENT STRUCTURE | 0.3516834681361791 | 
| LEXICAL SEMANTICS | 0.4558592159876882 | 
| Lexical Semantics - Lexical Entailment | 0.42364268926359044 | 
| Lexical Semantics - Morphological Negation | 0.6126374746329801 | 
| Lexical Semantics - Factivity | 0.34641016151377546 | 
| Lexical Semantics - Symmetry/Collectivity | 0.6454972243679028 | 
| Lexical Semantics - Redundancy | 0.6922186552431729 | 
| Lexical Semantics - Named Entities | 0.4472135954999579 | 
| Lexical Semantics - Quantifiers | 0.39074105418879573 | 
| Predicate-Argument Structure Core Args | 0.5692099788303083 | 
| Predicate-Argument Structure Prepositional Phrases | 0.2735506022160966 | 
| Predicate-Argument Structure Ellipsis/Implicits | 0.39148014634313566 | 
| Predicate-Argument Structure Anaphora/Coreference | 0.28539089649269644 | 
| Predicate-Argument Structure Active/Passive | 0.45241392835886407 | 
| Predicate-Argument Structure Nominalization | 0.7006490497453707 | 
| Predicate-Argument Structure Genitives/Partitives | 0.6666666666666666 | 
| Predicate-Argument Structure Datives | 0.5091750772173156 | 
| Predicate-Argument Structure Relative Clauses | 0.13912166872805048 | 
| Predicate-Argument Structure Coordination Scopes | 0.1175019408963036 | 
| Predicate-Argument Structure Intersectivity | 0.2929462832020831 | 
| Predicate-Argument Structure Restrictivity | 0.0 | 
| Logic Negation | 0.3452479097635773 | 
| Logic Double Negation | 0.4382862782019066 | 
| Logic Interval/Numbers | 0.07147416898918632 | 
| Logic Conjunction | 0.25819888974716115 | 
| Logic Disjunction | 0.3143473067309657 | 
| Logic Conditionals | 0.38331077069024155 | 
| Logic Universal | 0.4264014327112209 | 
| Logic Existential | 0.36689969285267143 | 
| Logic Temporal | 0.3114510614341045 | 
| Logic Upward Monotone | 0.22213082915965962 | 
| Logic Downward Monotone | 0.14511418742827673 | 
| Logic Non-Monotonic | 0.18389242812245682 | 
| Knowledge Common Sense | 0.334653129013519 | 
| Knowledge World Knowledge | 0.3649501415425901 | 
| Dataset | Speed | RAM | 
|---|---|---|
| LiDiRus | - | - | 
| RCB | - | - | 
| PARus | - | - | 
| MuSeRC | - | - | 
| TERRa | - | - | 
| RUSSE | - | - | 
| RWSD | - | - | 
| DaNetQA | - | - | 
| RuCoS | - | - |