Russian SuperGLUE

Dataset	Score	Metric
LiDiRus	0.436	Matthews Corr
RCB	0.439 / 0.5	F1/Acc
PARus	0.694	Accuracy
MuSeRC	0.898 / 0.704	F1a/EM
TERRa	0.865	Accuracy
RUSSE	0.728	Accuracy
RWSD	0.714	Accuracy
DaNetQA	0.862	Accuracy
RuCoS	0.85 / 0.83	F1/EM

Model description:

LLaMA-13B fine-tuned on a mixture of Russian datasets. In addition, we tuned one LoRA adapter on all tasks simultaneously and a second adapter for RWSD/RUSSE/RuCoS. See: https://github.com/IlyaGusev/rulm/tree/master/self_instruct

Parameter description:

See: https://github.com/IlyaGusev/rulm/tree/master/self_instruct

Diagnostic (Matthews Correlation): 0.436

Category	Score
LOGIC	0.4152763176374242
KNOWLEDGE	0.3771107822764328
PREDICATE-ARGUMENT STRUCTURE	0.44137698015355864
LEXICAL SEMANTICS	0.5192475430833731

Lexical Semantics - Lexical Entailment	0.514443402768482
Lexical Semantics - Morphological Negation	0.3880141219409875
Lexical Semantics - Factivity	0.28426762180748055
Lexical Semantics - Symmetry/Collectivity	0.7302967433402214
Lexical Semantics - Redundancy	0.6922186552431729
Lexical Semantics - Named Entities	0.612056372482123
Lexical Semantics - Quantifiers	0.4113063728303113
Predicate-Argument Structure - Core Args	0.6155101914869567
Predicate-Argument Structure - Prepositional Phrases	0.4454180314547771
Predicate-Argument Structure - Ellipsis/Implicits	0.4992872412627317
Predicate-Argument Structure - Anaphora/Coreference	0.27941176470588236
Predicate-Argument Structure - Active/Passive	0.5331139899831832
Predicate-Argument Structure - Nominalization	0.651187825474021
Predicate-Argument Structure - Genitives/Partitives	0.375
Predicate-Argument Structure - Datives	0.49099025303098287
Predicate-Argument Structure - Relative Clauses	0.4666666666666667
Predicate-Argument Structure - Coordination Scopes	0.366007208697342
Predicate-Argument Structure - Intersectivity	0.3107299650387684
Predicate-Argument Structure - Restrictivity	0.3619613829965134
Logic - Negation	0.48227978168037317
Logic - Double Negation	0.420084025208403
Logic - Interval/Numbers	0.40250218775298874
Logic - Conjunction	0.39440531887330776
Logic - Disjunction	0.592220092263982
Logic - Conditionals	0.5039526306789696
Logic - Universal	0.4332001127219817
Logic - Existential	0.38981938376529196
Logic - Temporal	0.15289415743128767
Logic - Upward Monotone	0.4083133966424866
Logic - Downward Monotone	0.22171945701357465
Logic - Non-Monotonic	0.45044261646145084
Knowledge - Common Sense	0.32052793402378565
Knowledge - World Knowledge	0.4379610999329309
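
For reference, the diagnostic scores above are Matthews correlation coefficients. The sketch below shows how such a coefficient can be computed for binary labels, assuming 1 stands for "entailment" and 0 for "not entailment". The function name and the toy labels are illustrative only and are not taken from the benchmark's evaluation code.

```python
import math


def matthews_corrcoef_binary(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy labels (hypothetical, not benchmark data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
mcc = matthews_corrcoef_binary(y_true, y_pred)
print(mcc)  # 0.5
```

A value of 1 means perfect agreement, 0 is chance level, and -1 is total disagreement, so the per-category scores above can be read on that scale.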

Performance:

Dataset	Speed	RAM
LiDiRus	-	-
RCB	-	-
PARus	-	-
MuSeRC	-	-
TERRa	-	-
RUSSE	-	-
RWSD	-	-
DaNetQA	-	-
RuCoS	-	-

Submission Saiga 13B LoRA

Total score: 0.712
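
The total score is consistent with a plain average over the nine tasks, where tasks that report two metrics are first averaged into a single number. The sketch below reproduces that arithmetic from the scores in the results table. The aggregation rule is an assumption inferred from the numbers, not something stated in this submission, and the names `task_scores` and `total_score` are illustrative.

```python
# Scores copied from the results table; two-metric tasks carry both values.
task_scores = {
    "LiDiRus": (0.436,),
    "RCB": (0.439, 0.5),
    "PARus": (0.694,),
    "MuSeRC": (0.898, 0.704),
    "TERRa": (0.865,),
    "RUSSE": (0.728,),
    "RWSD": (0.714,),
    "DaNetQA": (0.862,),
    "RuCoS": (0.85, 0.83),
}


def total_score(scores):
    """Average the metrics within each task, then average across tasks."""
    per_task = [sum(metrics) / len(metrics) for metrics in scores.values()]
    return sum(per_task) / len(per_task)


total = total_score(task_scores)
print(round(total, 3))  # 0.712
```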
