Submission FRED-T5 1.7B (only encoder 760M) finetune

Feb. 8, 2023, 7:42 a.m.

Team: SberDevices

Model url: https://huggingface.co/sberbank-ai/FRED-T5-1.7B


Total score: 0.694

Dataset Score Metric
LiDiRus 0.421 Matthews Corr
RCB 0.311 / 0.441 F1/Acc
PARus 0.806 Accuracy
MuSeRC 0.882 / 0.666 F1a/Em
TERRa 0.831 Accuracy
RUSSE 0.723 Accuracy
RWSD 0.669 Accuracy
DaNetQA 0.735 Accuracy
RuCoS 0.91 / 0.911 F1/EM
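
The total score above is consistent with the usual SuperGLUE-style aggregation: average the metrics within each task, then take the unweighted mean across the nine tasks. The source does not state the formula, so the sketch below treats that convention as an assumption and simply checks it against the reported numbers.

```python
# Per-task scores copied from the table above.
# Tasks that report two metrics list both values.
scores = {
    "LiDiRus": [0.421],
    "RCB": [0.311, 0.441],
    "PARus": [0.806],
    "MuSeRC": [0.882, 0.666],
    "TERRa": [0.831],
    "RUSSE": [0.723],
    "RWSD": [0.669],
    "DaNetQA": [0.735],
    "RuCoS": [0.91, 0.911],
}


def total_score(task_scores):
    """Average metrics within each task, then average across tasks.

    Assumption: unweighted SuperGLUE-style aggregation.
    """
    per_task = [sum(values) / len(values) for values in task_scores.values()]
    return sum(per_task) / len(per_task)


print(round(total_score(scores), 3))  # prints 0.694
```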
Model description:

Here we evaluate the encoder-only part of the pretrained FRED-T5-1.7B model (https://russiansuperglue.com/login/submit_info/1936), which has 760M parameters (2.2 times smaller than the full model). We fine-tune the model separately for each RussianSuperGLUE task. For LiDiRus we start from the TERRa-finetuned model. For RWSD we use the majority-class baseline. For each task we submit the best-performing checkpoint based on validation metrics; checkpoints are saved every epoch, and more frequently for RCB, PARus and RuCoS. No filters or fixes were applied to the datasets.
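
Two of the steps described above, the majority-class baseline used for RWSD and the selection of the best checkpoint by validation metric, can be sketched in a few lines. This is a minimal illustration, not the team's code: the function names, the toy labels and the checkpoint names are all hypothetical.

```python
from collections import Counter


def majority_class(train_labels):
    """Return the most frequent label in the training set."""
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(train_labels, n_examples):
    """Predict the majority training label for every test example."""
    return [majority_class(train_labels)] * n_examples


def best_checkpoint(val_metrics):
    """Return the checkpoint name with the highest validation metric."""
    return max(val_metrics, key=val_metrics.get)


# Hypothetical toy data, for illustration only.
toy_labels = [False, True, False, False, True]
toy_val_metrics = {"epoch-1": 0.70, "epoch-2": 0.78, "epoch-3": 0.75}

print(predict_majority(toy_labels, 3))   # [False, False, False]
print(best_checkpoint(toy_val_metrics))  # epoch-2
```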


Parameter description:

Hyper-parameters for fine-tuning: batch size 16, epochs {10, 20, 30}, lr {1e-6, 1e-5, 2e-5, 3e-5, 1e-4}, linear lr scheduler, warmup ratio {0.02, 0.05}, weight decay {0, 0.01, 0.1}.
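
To give a sense of the search space, the sketch below enumerates the listed values as a full grid and shows one common form of a linear schedule with warmup. Both are assumptions: the source does not say whether every combination was tried, nor how the scheduler was implemented, and the names `grid`, `configs` and `linear_lr` are hypothetical.

```python
from itertools import product

# Values taken from the hyper-parameter list above; batch size is fixed at 16.
grid = {
    "epochs": [10, 20, 30],
    "lr": [1e-6, 1e-5, 2e-5, 3e-5, 1e-4],
    "warmup_ratio": [0.02, 0.05],
    "weight_decay": [0.0, 0.01, 0.1],
}

# Assumption: a full Cartesian grid; the submission may have searched a subset.
configs = [
    dict(zip(grid, values), batch_size=16)
    for values in product(*grid.values())
]


def linear_lr(step, total_steps, base_lr, warmup_ratio):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


print(len(configs))  # 90 combinations: 3 * 5 * 2 * 3
```

Under this reading, a run with 1000 total steps and a warmup ratio of 0.05 would reach the peak learning rate at step 50 and decay to zero at step 1000.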

Diagnostic (Matthews Correlation): 0.421

Category Score
LOGIC 0.25548133714772964
KNOWLEDGE 0.35721247554168717
PREDICATE-ARGUMENT STRUCTURE 0.4530930715114425
LEXICAL SEMANTICS 0.49319873437048684
Lexical Semantics - Lexical Entailment 0.4770239916187289
Lexical Semantics - Morphological Negation 0.39477101697586137
Lexical Semantics - Factivity 0.4226770155886447
Lexical Semantics - Symmetry/Collectivity 0.3243723035407737
Lexical Semantics - Redundancy 0.27348301713730944
Lexical Semantics - Named Entities 0.612056372482123
Lexical Semantics - Quantifiers 0.3157894736842105
Predicate-Argument Structure Core Args 0.5487601413337525
Predicate-Argument Structure Prepositional Phrases 0.6588289607878823
Predicate-Argument Structure Ellipsis/Implicits 0.5260558322946913
Predicate-Argument Structure Anaphora/Coreference 0.3349672436203912
Predicate-Argument Structure Active/Passive 0.2843611155188746
Predicate-Argument Structure Nominalization 0.5017348819226064
Predicate-Argument Structure Genitives/Partitives 0.15724272550828775
Predicate-Argument Structure Datives 0.7637626158259734
Predicate-Argument Structure Relative Clauses 0.3333333333333333
Predicate-Argument Structure Coordination Scopes 0.5091750772173156
Predicate-Argument Structure Intersectivity 0.3973597071195131
Predicate-Argument Structure Restrictivity 0.28741691319281637
Logic Negation 0.12866255886641637
Logic Double Negation 0.21320071635561044
Logic Interval/Numbers 0.010615495921641366
Logic Conjunction 0.5680375574437545
Logic Disjunction 0.16447838793172298
Logic Conditionals 0.07100716024967263
Logic Universal 0.2548235957188128
Logic Existential 0.2058790548922549
Logic Temporal 0.33910215700436014
Logic Upward Monotone 0.821271097469555
Logic Downward Monotone -0.30207927000959933
Logic Non-Monotonic 0.2364331218717302
Knowledge Common Sense 0.3594723992410968
Knowledge World Knowledge 0.3394352270463883
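
For reading the category scores above, it helps to recall that the Matthews correlation coefficient ranges from -1 to 1, where 0 corresponds to chance-level or constant predictions and negative values (as for Downward Monotone) mean the predictions are anti-correlated with the gold labels. The sketch below is a plain binary implementation for illustration, assuming labels encoded as 0 and 1; it is not the leaderboard's scoring code.

```python
import math


def mcc(y_true, y_pred):
    """Binary Matthews correlation coefficient for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


gold = [1, 0, 1, 0]
print(mcc(gold, [1, 0, 1, 0]))  # 1.0, perfect agreement
print(mcc(gold, [0, 1, 0, 1]))  # -1.0, every prediction inverted
print(mcc(gold, [1, 1, 1, 1]))  # 0.0, constant prediction
```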

Performance:

Dataset Speed RAM
LiDiRus - -
RCB - -
PARus - -
MuSeRC - -
TERRa - -
RUSSE - -
RWSD - -
DaNetQA - -
RuCoS - -