Sept. 4, 2023, 7:48 a.m.
Team: Saiga team
Dataset | Score | Metric |
---|---|---|
LiDiRus | 0.365 | Matthews Corr |
RCB | 0.385 / 0.461 | F1/Acc |
PARus | 0.82 | Accuracy |
MuSeRC | 0.669 / 0.098 | F1a/EM |
TERRa | 0.811 | Accuracy |
RUSSE | 0.59 | Accuracy |
RWSD | 0.831 | Accuracy |
DaNetQA | 0.878 | Accuracy |
RuCoS | 0.69 / 0.678 | F1/EM |
LLaMA-70B fine-tuned for Russian on several datasets. See https://github.com/IlyaGusev/rulm/tree/master/self_instruct
Evaluation code: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/src/eval_rsg.py
Category | Score |
---|---|
LOGIC | 0.3228855331810909 |
KNOWLEDGE | 0.3518320326381725 |
PREDICATE-ARGUMENT STRUCTURE | 0.3516834681361791 |
LEXICAL SEMANTICS | 0.4558592159876882 |
Lexical Semantics - Lexical Entailment | 0.42364268926359044 |
Lexical Semantics - Morphological Negation | 0.6126374746329801 |
Lexical Semantics - Factivity | 0.34641016151377546 |
Lexical Semantics - Symmetry/Collectivity | 0.6454972243679028 |
Lexical Semantics - Redundancy | 0.6922186552431729 |
Lexical Semantics - Named Entities | 0.4472135954999579 |
Lexical Semantics - Quantifiers | 0.39074105418879573 |
Predicate-Argument Structure Core Args | 0.5692099788303083 |
Predicate-Argument Structure Prepositional Phrases | 0.2735506022160966 |
Predicate-Argument Structure Ellipsis/Implicits | 0.39148014634313566 |
Predicate-Argument Structure Anaphora/Coreference | 0.28539089649269644 |
Predicate-Argument Structure Active/Passive | 0.45241392835886407 |
Predicate-Argument Structure Nominalization | 0.7006490497453707 |
Predicate-Argument Structure Genitives/Partitives | 0.6666666666666666 |
Predicate-Argument Structure Datives | 0.5091750772173156 |
Predicate-Argument Structure Relative Clauses | 0.13912166872805048 |
Predicate-Argument Structure Coordination Scopes | 0.1175019408963036 |
Predicate-Argument Structure Intersectivity | 0.2929462832020831 |
Predicate-Argument Structure Restrictivity | 0.0 |
Logic Negation | 0.3452479097635773 |
Logic Double Negation | 0.4382862782019066 |
Logic Interval/Numbers | 0.07147416898918632 |
Logic Conjunction | 0.25819888974716115 |
Logic Disjunction | 0.3143473067309657 |
Logic Conditionals | 0.38331077069024155 |
Logic Universal | 0.4264014327112209 |
Logic Existential | 0.36689969285267143 |
Logic Temporal | 0.3114510614341045 |
Logic Upward Monotone | 0.22213082915965962 |
Logic Downward Monotone | 0.14511418742827673 |
Logic Non-Monotonic | 0.18389242812245682 |
Knowledge Common Sense | 0.334653129013519 |
Knowledge World Knowledge | 0.3649501415425901 |
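The LiDiRus score is reported as a Matthews correlation, and the per-category values above appear to be on the same scale (this is an assumption; the page does not say so explicitly). As a reading aid, here is a minimal sketch of the binary Matthews correlation coefficient computed from confusion counts. The function name `matthews_corr` and the convention of returning 0.0 for a zero denominator are illustrative choices, not taken from the linked evaluation code.

```python
import math


def matthews_corr(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary Matthews correlation coefficient from confusion counts.

    Returns a value in [-1, 1]: 1 is perfect agreement, 0 is no better
    than chance, -1 is total disagreement.
    Assumption: when any marginal total is zero the coefficient is
    undefined, and this sketch returns 0.0 in that case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not from the submission).
    print(matthews_corr(tp=6, tn=3, fp=1, fn=2))
```

Under this reading, a category score of 0.0 such as Restrictivity indicates predictions that are uncorrelated with the gold labels, or a degenerate case where only one class was predicted or present.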
Dataset | Speed | RAM |
---|---|---|
LiDiRus | - | - |
RCB | - | - |
PARus | - | - |
MuSeRC | - | - |
TERRa | - | - |
RUSSE | - | - |
RWSD | - | - |
DaNetQA | - | - |
RuCoS | - | - |