Russian SuperGLUE

Dataset	Score	Metric
LiDiRus	0.46	Matthews Corr
RCB	0.529 / 0.573	F1/Acc
PARus	0.824	Accuracy
MuSeRC	0.927 / 0.787	F1a/EM
TERRa	0.888	Accuracy
RUSSE	0.758	Accuracy
RWSD	0.786	Accuracy
DaNetQA	0.919	Accuracy
RuCoS	0.83 / 0.816	F1/EM

Model description:

Mistral-7B-v0.1, LoRA-tuned on the RSG (Russian SuperGLUE) datasets. Mistral-7B-v0.1 was first trained on all tasks jointly (multi-task) with the main LoRA config, and the resulting adapter was merged into the base model. Task-level LoRA adapters were then trained on top of this merged model.

Inference script: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/src/benchmarks/eval_lora_rsg.py
Training script: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/src/train.py
Main LoRA config: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/configs/mistral_7b_rsg.json (configs for the separate tasks are in the same folder)
Zero-shot evaluation script: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/src/benchmarks/eval_zs_rsg.py

Parameter description:

Diagnostic (Matthews Correlation): 0.46

Category	Score
LOGIC	0.4272477243463581
KNOWLEDGE	0.4267562897825355
PREDICATE-ARGUMENT STRUCTURE	0.4644487887609425
LEXICAL SEMANTICS	0.5423872872028785

Lexical Semantics - Lexical Entailment	0.5684993638729131
Lexical Semantics - Morphological Negation	0.6172133998483676
Lexical Semantics - Factivity	0.29814239699997197
Lexical Semantics - Symmetry/Collectivity	0.6243713415848884
Lexical Semantics - Redundancy	0.1444869078105018
Lexical Semantics - Named Entities	0.6708203932499369
Lexical Semantics - Quantifiers	0.4287214448277836
Predicate-Argument Structure - Core Args	0.5051219141436532
Predicate-Argument Structure - Prepositional Phrases	0.6207200740216239
Predicate-Argument Structure - Ellipsis/Implicits	0.4992872412627317
Predicate-Argument Structure - Anaphora/Coreference	0.3682002176496948
Predicate-Argument Structure - Active/Passive	0.623033246356214
Predicate-Argument Structure - Nominalization	0.40881490876633847
Predicate-Argument Structure - Genitives/Partitives	0.5773502691896257
Predicate-Argument Structure - Datives	0.28511240114923325
Predicate-Argument Structure - Relative Clauses	0.42289003161103106
Predicate-Argument Structure - Coordination Scopes	0.48038446141526137
Predicate-Argument Structure - Intersectivity	0.3629539763832752
Predicate-Argument Structure - Restrictivity	0.46549138385896505
Logic - Negation	0.631059217297185
Logic - Double Negation	0.420084025208403
Logic - Interval/Numbers	0.42051713353118003
Logic - Conjunction	0.6713171133426189
Logic - Disjunction	0.3768673314407158
Logic - Conditionals	0.2698412698412698
Logic - Universal	0.6700593942604899
Logic - Existential	0.3144854510165755
Logic - Temporal	0.33910215700436014
Logic - Upward Monotone	0.4146442144313646
Logic - Downward Monotone	0.1438234930593239
Logic - Non-Monotonic	0.2895702534395041
Knowledge - Common Sense	0.39380049095855213
Knowledge - World Knowledge	0.4619338817592217
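
The diagnostic scores above are Matthews correlation coefficients. As a minimal sketch of how such a score is computed for a binary task, the function below applies the standard formula to gold and predicted labels. The 0/1 label encoding and the toy data are assumptions made for illustration; this is not the benchmark's own scoring code.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation for binary labels encoded as 0/1 (assumed encoding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the score is 0 when any marginal count is zero.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy example (hypothetical labels, not benchmark data).
gold = [1, 1, 1, 0, 0, 0]
pred = [1, 1, 0, 0, 0, 1]
print(matthews_corrcoef(gold, pred))
```

The score ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction correct), so the per-category values in the table can be compared directly even when the categories have different class balance.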

Performance:

Dataset	Speed	RAM
LiDiRus	-	-
RCB	-	-
PARus	-	-
MuSeRC	-	-
TERRa	-	-
RUSSE	-	-
RWSD	-	-
DaNetQA	-	-
RuCoS	-	-

Submission Mistral 7B LoRA

Total score: 0.763
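
The total score is consistent with an unweighted mean over the nine tasks, where a task that reports two metrics first contributes the mean of the two. The sketch below checks this against the numbers in the results table. The aggregation rule is an assumption inferred from those numbers, not taken from the official scoring code.

```python
# Scores copied from the results table; two-metric tasks list both values.
task_scores = {
    "LiDiRus": [0.46],
    "RCB": [0.529, 0.573],
    "PARus": [0.824],
    "MuSeRC": [0.927, 0.787],
    "TERRa": [0.888],
    "RUSSE": [0.758],
    "RWSD": [0.786],
    "DaNetQA": [0.919],
    "RuCoS": [0.83, 0.816],
}


def total_score(scores):
    """Mean over tasks of each task's mean metric (assumed aggregation rule)."""
    per_task = [sum(values) / len(values) for values in scores.values()]
    return sum(per_task) / len(per_task)


print(round(total_score(task_scores), 3))  # 0.763
```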
