Oct. 9, 2023, 5:38 p.m.
Team: Saiga team
Model URL: https://huggingface.co/IlyaGusev/saiga_mistral_7b_lora
| Dataset | Score | Metric | 
|---|---|---|
| LiDiRus | 0.322 | Matthews Corr | 
| RCB | 0.436 / 0.5 | F1/Acc | 
| PARus | 0.698 | Accuracy | 
| MuSeRC | 0.84 / 0.553 | F1a/EM | 
| TERRa | 0.807 | Accuracy | 
| RUSSE | 0.587 | Accuracy | 
| RWSD | 0.727 | Accuracy | 
| DaNetQA | 0.839 | Accuracy | 
| RuCoS | 0.58 / 0.571 | F1/EM | 
Saiga Mistral 7B zero-shot
Mistral 7B (https://huggingface.co/mistralai/Mistral-7B-v0.1) was fine-tuned on 5 Russian datasets: ru_turbo_saiga, ru_sharegpt_cleaned, oasst1_ru_main_branch, gpt_roleplay_realm, ru_instruct_gpt4.
More information is available on HuggingFace: https://huggingface.co/IlyaGusev/saiga_mistral_7b_lora
Zero-shot evaluation script used for Saiga Mistral inference: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/src/benchmarks/eval_zs_rsg.py
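
The LiDiRus row in the results table is reported as a Matthews correlation rather than accuracy. As a rough illustration of what that number measures, here is a minimal sketch of the binary Matthews correlation coefficient in plain Python. It assumes labels are encoded as 0/1; the function name and the toy labels are hypothetical and are not taken from the submission or its evaluation script.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Binary Matthews correlation coefficient for 0/1 labels (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0.0 when a row or column of the confusion matrix is empty.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy example with made-up labels (not benchmark data).
toy_true = [1, 1, 1, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 1]
toy_mcc = matthews_corrcoef(toy_true, toy_pred)
```

The coefficient ranges from -1 to 1, with 0 corresponding to chance-level predictions, so a value such as 0.322 is not directly comparable to the accuracy figures in the other rows.
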
| Category | Score | 
|---|---|
| LOGIC | 0.24068909266898375 | 
| KNOWLEDGE | 0.3467006882891795 | 
| PREDICATE-ARGUMENT STRUCTURE | 0.32835497921408513 | 
| LEXICAL SEMANTICS | 0.38897213279554893 | 
| Lexical Semantics - Lexical Entailment | 0.39133162093562956 | 
| Lexical Semantics - Morphological Negation | 0.4816727030991569 | 
| Lexical Semantics - Factivity | 0.16140004156875074 | 
| Lexical Semantics - Symmetry/Collectivity | 0.43852900965351466 | 
| Lexical Semantics - Redundancy | 0.6922186552431729 | 
| Lexical Semantics - Named Entities | 0.4472135954999579 | 
| Lexical Semantics - Quantifiers | 0.404651319111256 | 
| Predicate-Argument Structure Core Args | 0.40987803063838396 | 
| Predicate-Argument Structure Prepositional Phrases | 0.3020666145593326 | 
| Predicate-Argument Structure Ellipsis/Implicits | 0.39148014634313566 | 
| Predicate-Argument Structure Anaphora/Coreference | 0.33606722016672236 | 
| Predicate-Argument Structure Active/Passive | 0.3244428422615251 | 
| Predicate-Argument Structure Nominalization | 0.6255432421712243 | 
| Predicate-Argument Structure Genitives/Partitives | 0.6666666666666666 | 
| Predicate-Argument Structure Datives | 0.35043832202523123 | 
| Predicate-Argument Structure Relative Clauses | 0.13912166872805048 | 
| Predicate-Argument Structure Coordination Scopes | 0.0 | 
| Predicate-Argument Structure Intersectivity | 0.32489314482696546 | 
| Predicate-Argument Structure Restrictivity | 0.2748737083745107 | 
| Logic Negation | 0.23037198846919968 | 
| Logic Double Negation | 0.19312181983410703 | 
| Logic Interval/Numbers | 0.033731512431528755 | 
| Logic Conjunction | 0.25819888974716115 | 
| Logic Disjunction | 0.2364331218717302 | 
| Logic Conditionals | 0.3646984043128985 | 
| Logic Universal | 0.15228622596829317 | 
| Logic Existential | 0.24459979523511427 | 
| Logic Temporal | 0.007053982594841415 | 
| Logic Upward Monotone | 0.27640672769878033 | 
| Logic Downward Monotone | 0.16238586255274184 | 
| Logic Non-Monotonic | 0.18389242812245682 | 
| Knowledge Common Sense | 0.3170077616914835 | 
| Knowledge World Knowledge | 0.37712430952994663 | 
| Dataset | Speed | RAM | 
|---|---|---|
| LiDiRus | - | - | 
| RCB | - | - | 
| PARus | - | - | 
| MuSeRC | - | - | 
| TERRa | - | - | 
| RUSSE | - | - | 
| RWSD | - | - | 
| DaNetQA | - | - | 
| RuCoS | - | - |