Dataset | Score | Metric |
---|---|---|
LiDiRus | 0.293 | Matthews Corr |
RCB | 0.42 / 0.466 | F1/Acc |
PARus | 0.63 | Accuracy |
MuSeRC | 0.681 / 0.223 | F1a/EM |
TERRa | 0.702 | Accuracy |
RUSSE | 0.565 | Accuracy |
RWSD | 0.675 | Accuracy |
DaNetQA | 0.763 | Accuracy |
RuCoS | 0.47 / 0.458 | F1/EM |

LLaMA-13B fine-tuned for Russian on several datasets. See https://github.com/IlyaGusev/rulm/tree/master/self_instruct

Evaluation code: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/src/eval_rsg.py

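
The F1/EM pairs reported for RuCoS in the results table can be read with the following minimal sketch in mind. It assumes the usual reading-comprehension definitions: exact match after light normalization, and token-overlap F1 taken as the maximum over the reference answers. The function names, the lowercase and whitespace normalization, and the whitespace tokenization are illustrative assumptions, not taken from `eval_rsg.py`, whose details may differ.

```python
from collections import Counter
from typing import Sequence


def exact_match(prediction: str, references: Sequence[str]) -> float:
    """Return 1.0 if the normalized prediction equals any normalized reference."""
    norm = prediction.strip().lower()
    return float(any(norm == ref.strip().lower() for ref in references))


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a prediction and one reference answer."""
    # Assumption: tokens are lowercase whitespace-separated words.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it occurs in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction: str, references: Sequence[str]) -> float:
    """Score a prediction against its closest reference answer."""
    return max(token_f1(prediction, ref) for ref in references)
```

Averaging `exact_match` and `best_f1` over all questions would give dataset-level EM and F1 values on the same 0 to 1 scale as the table.
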
Category | Score |
---|---|
LOGIC | 0.21917517305160486 |
KNOWLEDGE | 0.2057287918779686 |
PREDICATE-ARGUMENT STRUCTURE | 0.27409125950273655 |
LEXICAL SEMANTICS | 0.388895694649691 |
Lexical Semantics - Lexical Entailment | 0.33262580622443827 |
Lexical Semantics - Morphological Negation | 0.45647498109908324 |
Lexical Semantics - Factivity | 0.2721655269759087 |
Lexical Semantics - Symmetry/Collectivity | 0.3243723035407737 |
Lexical Semantics - Redundancy | -0.08695652173913043 |
Lexical Semantics - Named Entities | 0.2891574659831201 |
Lexical Semantics - Quantifiers | 0.38402732637062953 |
Predicate-Argument Structure - Core Args | 0.4222222222222222 |
Predicate-Argument Structure - Prepositional Phrases | 0.4612405150708389 |
Predicate-Argument Structure - Ellipsis/Implicits | 0.1901597073139162 |
Predicate-Argument Structure - Anaphora/Coreference | 0.09396796986295447 |
Predicate-Argument Structure - Active/Passive | 0.2135744251723958 |
Predicate-Argument Structure - Nominalization | 0.18856180831641267 |
Predicate-Argument Structure - Genitives/Partitives | 0.28867513459481287 |
Predicate-Argument Structure - Datives | 0.629940788348712 |
Predicate-Argument Structure - Relative Clauses | 0.29814239699997197 |
Predicate-Argument Structure - Coordination Scopes | 0.0615486357075876 |
Predicate-Argument Structure - Intersectivity | 0.33040329483301567 |
Predicate-Argument Structure - Restrictivity | 0.055227791305300936 |
Logic - Negation | 0.28001211610132504 |
Logic - Double Negation | 0.24771684715343115 |
Logic - Interval/Numbers | 0.12776451341301862 |
Logic - Conjunction | 0.24575816200113812 |
Logic - Disjunction | 0.11582810137580164 |
Logic - Conditionals | 0.22084711628963774 |
Logic - Universal | 0.20385887657505022 |
Logic - Existential | 0.12087912087912088 |
Logic - Temporal | 0.2040518334346933 |
Logic - Upward Monotone | 0.10799789913383483 |
Logic - Downward Monotone | 0.10701842524298437 |
Logic - Non-Monotonic | 0.27583864218368526 |
Knowledge - Common Sense | 0.15201252993723102 |
Knowledge - World Knowledge | 0.2700595235395947 |
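
The per-category scores above look like Matthews correlation values computed on the subset of diagnostic examples tagged with each category, which would explain how a row such as Redundancy can be negative. The sketch below illustrates that reading for binary labels. The `(categories, gold, pred)` tuple layout and both function names are hypothetical, and this is not the code in `eval_rsg.py`.

```python
import math
from collections import defaultdict


def matthews_corr(y_true, y_pred):
    """Matthews correlation coefficient for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def mcc_by_category(examples):
    """Group (categories, gold, pred) triples by category and score each group.

    An example tagged with several categories contributes to each of them.
    """
    buckets = defaultdict(lambda: ([], []))
    for categories, gold, pred in examples:
        for cat in categories:
            buckets[cat][0].append(gold)
            buckets[cat][1].append(pred)
    return {cat: matthews_corr(g, p) for cat, (g, p) in buckets.items()}
```

A value near 0 means the predictions on that subset are no better than chance, and a negative value means they are inversely correlated with the gold labels.
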
Dataset | Speed | RAM |
---|---|---|
LiDiRus | - | - |
RCB | - | - |
PARus | - | - |
MuSeRC | - | - |
TERRa | - | - |
RUSSE | - | - |
RWSD | - | - |
DaNetQA | - | - |
RuCoS | - | - |