Feb. 8, 2023, 7:42 a.m.
Team: SberDevices
Dataset | Score | Metric |
---|---|---|
LiDiRus | 0.421 | Matthews Corr |
RCB | 0.311 / 0.441 | F1/Acc |
PARus | 0.806 | Accuracy |
MuSeRC | 0.882 / 0.666 | F1a/EM |
TERRa | 0.831 | Accuracy |
RUSSE | 0.723 | Accuracy |
RWSD | 0.669 | Accuracy |
DaNetQA | 0.735 | Accuracy |
RuCoS | 0.91 / 0.911 | F1/EM |
Here we evaluate the encoder-only part of the pretrained FRED-T5-1.7B model (https://russiansuperglue.com/login/submit_info/1936), which has 760M parameters (about 2.2 times smaller than the full model). We fine-tune the model separately for each RussianSuperGLUE task. For LiDiRus we start from the TERRa-fine-tuned model. For RWSD we use the majority-class baseline. For each task we submit the best-performing checkpoint based on validation metrics; checkpoints are saved every epoch, and more frequently for RCB, PARus and RuCoS. No filters or fixes were applied to the datasets.
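As an illustration of the majority-class baseline mentioned for RWSD, here is a minimal sketch: it predicts the most frequent training label for every test example. The function name `majority_class_baseline` and the toy labels are assumptions made for this example; they are not taken from the submission or from the RWSD data.

```python
from collections import Counter


def majority_class_baseline(train_labels, n_test):
    """Predict the most frequent training label for every test example."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    return [majority_label] * n_test


# Toy labels for illustration only; these are NOT RWSD data.
toy_train_labels = [False, True, False, False, True]
toy_predictions = majority_class_baseline(toy_train_labels, n_test=3)
```

Such a baseline ignores the input text entirely, so its accuracy equals the share of the majority label in the evaluation set.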
Hyper-parameters for fine-tuning: batch size 16; epochs {10, 20, 30}; learning rate {1e-6, 1e-5, 2e-5, 3e-5, 1e-4}; linear learning-rate scheduler; warmup ratio {0.02, 0.05}; weight decay {0, 0.01, 0.1}.
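The hyper-parameter sets listed above can be enumerated as a grid, as in the sketch below. This is only an illustration: the submission does not say whether every combination was tried for every task, and the names `search_space`, `iter_configs`, `BATCH_SIZE` and `LR_SCHEDULER` are hypothetical.

```python
from itertools import product

# Value sets copied from the description; the dictionary keys are
# illustrative names, not identifiers from the authors' code.
search_space = {
    "num_epochs": [10, 20, 30],
    "learning_rate": [1e-6, 1e-5, 2e-5, 3e-5, 1e-4],
    "warmup_ratio": [0.02, 0.05],
    "weight_decay": [0.0, 0.01, 0.1],
}

# Settings reported as fixed across runs.
BATCH_SIZE = 16
LR_SCHEDULER = "linear"


def iter_configs(space):
    """Yield one configuration dict per point of the full Cartesian grid."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        config = dict(zip(keys, values))
        config["batch_size"] = BATCH_SIZE
        config["lr_scheduler"] = LR_SCHEDULER
        yield config


configs = list(iter_configs(search_space))
```

Under the full-grid assumption this gives 3 × 5 × 2 × 3 = 90 candidate configurations per task; a partial or per-task search would be smaller.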
Category | Score |
---|---|
LOGIC | 0.25548133714772964 |
KNOWLEDGE | 0.35721247554168717 |
PREDICATE-ARGUMENT STRUCTURE | 0.4530930715114425 |
LEXICAL SEMANTICS | 0.49319873437048684 |
Lexical Semantics - Lexical Entailment | 0.4770239916187289 |
Lexical Semantics - Morphological Negation | 0.39477101697586137 |
Lexical Semantics - Factivity | 0.4226770155886447 |
Lexical Semantics - Symmetry/Collectivity | 0.3243723035407737 |
Lexical Semantics - Redundancy | 0.27348301713730944 |
Lexical Semantics - Named Entities | 0.612056372482123 |
Lexical Semantics - Quantifiers | 0.3157894736842105 |
Predicate-Argument Structure Core Args | 0.5487601413337525 |
Predicate-Argument Structure Prepositional Phrases | 0.6588289607878823 |
Predicate-Argument Structure Ellipsis/Implicits | 0.5260558322946913 |
Predicate-Argument Structure Anaphora/Coreference | 0.3349672436203912 |
Predicate-Argument Structure Active/Passive | 0.2843611155188746 |
Predicate-Argument Structure Nominalization | 0.5017348819226064 |
Predicate-Argument Structure Genitives/Partitives | 0.15724272550828775 |
Predicate-Argument Structure Datives | 0.7637626158259734 |
Predicate-Argument Structure Relative Clauses | 0.3333333333333333 |
Predicate-Argument Structure Coordination Scopes | 0.5091750772173156 |
Predicate-Argument Structure Intersectivity | 0.3973597071195131 |
Predicate-Argument Structure Restrictivity | 0.28741691319281637 |
Logic Negation | 0.12866255886641637 |
Logic Double Negation | 0.21320071635561044 |
Logic Interval/Numbers | 0.010615495921641366 |
Logic Conjunction | 0.5680375574437545 |
Logic Disjunction | 0.16447838793172298 |
Logic Conditionals | 0.07100716024967263 |
Logic Universal | 0.2548235957188128 |
Logic Existential | 0.2058790548922549 |
Logic Temporal | 0.33910215700436014 |
Logic Upward Monotone | 0.821271097469555 |
Logic Downward Monotone | -0.30207927000959933 |
Logic Non-Monotonic | 0.2364331218717302 |
Knowledge Common Sense | 0.3594723992410968 |
Knowledge World Knowledge | 0.3394352270463883 |
Dataset | Speed | RAM |
---|---|---|
LiDiRus | - | - |
RCB | - | - |
PARus | - | - |
MuSeRC | - | - |
TERRa | - | - |
RUSSE | - | - |
RWSD | - | - |
DaNetQA | - | - |
RuCoS | - | - |