Russian SuperGLUE

Dataset	Score	Metric
LiDiRus	0.46	Matthews correlation coefficient
RCB	0.529 / 0.573	F1 / Accuracy
PARus	0.824	Accuracy
MuSeRC	0.927 / 0.787	F1a / EM
TERRa	0.888	Accuracy
RUSSE	0.758	Accuracy
RWSD	0.786	Accuracy
DaNetQA	0.919	Accuracy
RuCoS	0.83 / 0.816	F1 / EM
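The per-task scores above can be combined into a single number. The sketch below assumes the SuperGLUE-style convention: metrics are averaged within a task first, then the nine task scores are averaged without weights. The function name `overall_score` is illustrative and not part of the benchmark code. Under that assumption the result matches the 0.763 overall result reported for this submission.

```python
# Per-task scores copied from the table above; two-metric tasks keep both values.
scores = {
    "LiDiRus": [0.46],
    "RCB": [0.529, 0.573],
    "PARus": [0.824],
    "MuSeRC": [0.927, 0.787],
    "TERRa": [0.888],
    "RUSSE": [0.758],
    "RWSD": [0.786],
    "DaNetQA": [0.919],
    "RuCoS": [0.83, 0.816],
}


def overall_score(task_scores):
    """Assumed aggregation: mean of metrics within a task, then mean over tasks."""
    per_task = [sum(values) / len(values) for values in task_scores.values()]
    return sum(per_task) / len(per_task)


print(round(overall_score(scores), 3))  # 0.763
```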

Model description:

Mistral-7B-v0.1, LoRA-tuned on the RSG (Russian SuperGLUE) datasets. The Mistral-7B-v0.1 model was first trained in a multi-task setup with the main LoRA config. That adapter was merged into the base model, and task-level LoRA adapters were then trained on top of the merged model.

Inference script: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/src/benchmarks/eval_lora_rsg.py
Training script: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/src/train.py
Main LoRA config: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/configs/mistral_7b_rsg.json (configs for the individual tasks are in the same folder)
Zero-shot evaluation script: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/src/benchmarks/eval_zs_rsg.py
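The train, merge, then per-task adapter procedure described above might look roughly like the following sketch built on the Hugging Face `peft` and `transformers` libraries. It is not the author's code (the linked `train.py` is the authoritative source). The function names, directory arguments, LoRA hyperparameters, and target modules are placeholders chosen for illustration, not values taken from `mistral_7b_rsg.json`.

```python
def merge_multitask_adapter(adapter_dir, out_dir, base_name="mistralai/Mistral-7B-v0.1"):
    """Fold an already trained multi-task LoRA adapter into the base weights."""
    # Imported lazily so this file loads without the heavy dependencies installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(base_name)
    # merge_and_unload() adds the LoRA deltas to the base weights and
    # returns a plain transformers model without adapter wrappers.
    merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()
    merged.save_pretrained(out_dir)
    return merged


def attach_task_adapter(merged_dir, r=16, lora_alpha=16, lora_dropout=0.05):
    """Put a fresh, trainable task-level LoRA adapter on the merged model."""
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    merged = AutoModelForCausalLM.from_pretrained(merged_dir)
    config = LoraConfig(
        r=r,                                  # placeholder rank
        lora_alpha=lora_alpha,                # placeholder scaling
        lora_dropout=lora_dropout,            # placeholder dropout
        target_modules=["q_proj", "v_proj"],  # assumed attention projections
        task_type="CAUSAL_LM",
    )
    return get_peft_model(merged, config)
```

Calling `attach_task_adapter` once per RSG task on the same merged checkpoint would give each task its own adapter while sharing the multi-task weights underneath.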

Parameter description:

Diagnostics: 0.46

Category	Score
LOGIC	0.4272477243463581
KNOWLEDGE	0.4267562897825355
PREDICATE-ARGUMENT STRUCTURE	0.4644487887609425
LEXICAL SEMANTICS	0.5423872872028785

Lexical Semantics - Lexical Entailment	0.5684993638729131
Lexical Semantics - Morphological Negation	0.6172133998483676
Lexical Semantics - Factivity	0.29814239699997197
Lexical Semantics - Symmetry/Collectivity	0.6243713415848884
Lexical Semantics - Redundancy	0.1444869078105018
Lexical Semantics - Named Entities	0.6708203932499369
Lexical Semantics - Quantifiers	0.4287214448277836
Predicate-Argument Structure - Core Args	0.5051219141436532
Predicate-Argument Structure - Prepositional Phrases	0.6207200740216239
Predicate-Argument Structure - Ellipsis/Implicits	0.4992872412627317
Predicate-Argument Structure - Anaphora/Coreference	0.3682002176496948
Predicate-Argument Structure - Active/Passive	0.623033246356214
Predicate-Argument Structure - Nominalization	0.40881490876633847
Predicate-Argument Structure - Genitives/Partitives	0.5773502691896257
Predicate-Argument Structure - Datives	0.28511240114923325
Predicate-Argument Structure - Relative Clauses	0.42289003161103106
Predicate-Argument Structure - Coordination Scopes	0.48038446141526137
Predicate-Argument Structure - Intersectivity	0.3629539763832752
Predicate-Argument Structure - Restrictivity	0.46549138385896505
Logic - Negation	0.631059217297185
Logic - Double Negation	0.420084025208403
Logic - Interval/Numbers	0.42051713353118003
Logic - Conjunction	0.6713171133426189
Logic - Disjunction	0.3768673314407158
Logic - Conditionals	0.2698412698412698
Logic - Universal	0.6700593942604899
Logic - Existential	0.3144854510165755
Logic - Temporal	0.33910215700436014
Logic - Upward Monotone	0.4146442144313646
Logic - Downward Monotone	0.1438234930593239
Logic - Non-Monotonic	0.2895702534395041
Knowledge - Common Sense	0.39380049095855213
Knowledge - World Knowledge	0.4619338817592217
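The diagnostic scores above are Matthews correlation values, the metric listed for LiDiRus. Each per-category number is presumably the same coefficient computed on the subset of diagnostic examples tagged with that category; that reading is an assumption, not something the page states. A minimal sketch of the coefficient for binary labels encoded as 0/1 follows. The function name mirrors scikit-learn's, but this is a standalone illustration.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when a row or column of the confusion matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example (not benchmark data): one positive example is missed.
print(matthews_corrcoef([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because the coefficient ranges from -1 to 1 and is computed on small category subsets, individual category scores can vary widely around the overall 0.46.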

Performance:

Dataset	Speed	RAM
LiDiRus	-	-
RCB	-	-
PARus	-	-
MuSeRC	-	-
TERRa	-	-
RUSSE	-	-
RWSD	-	-
DaNetQA	-	-
RuCoS	-	-

New submission: Mistral 7B LoRA

Baseline result: 0.763
