February 8, 2023, 7:42
SberDevices team
Model link: https://huggingface.co/sberbank-ai/FRED-T5-1.7B
Dataset | Result | Metric |
---|---|---|
LiDiRus | 0.421 | Matthews correlation coefficient |
RCB | 0.311 / 0.441 | F1 / Accuracy |
PARus | 0.806 | Accuracy |
MuSeRC | 0.882 / 0.666 | F1a / EM |
TERRa | 0.831 | Accuracy |
RUSSE | 0.723 | Accuracy |
RWSD | 0.669 | Accuracy |
DaNetQA | 0.735 | Accuracy |
RuCoS | 0.91 / 0.911 | F1 / EM |
Here we evaluate the encoder-only part of the pretrained FRED-T5-1.7B model (https://russiansuperglue.com/login/submit_info/1936), which has 760M parameters (2.2 times smaller than the full model). We fine-tune the model separately for each RussianSuperGLUE task. For LiDiRus we start from the TERRa-finetuned model. For RWSD we use the majority-class baseline. For each task we submit the best-performing checkpoint based on validation metrics (checkpoints are saved every epoch, and more frequently for RCB, PARus and RuCoS). No filters or fixes were applied to the datasets.
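As a rough illustration of the encoder-only setup, the sketch below loads just the encoder from the public checkpoint and counts its parameters. This is a minimal sketch, not the team's evaluation code: it assumes the checkpoint is compatible with `T5EncoderModel` and that `AutoTokenizer` resolves the right tokenizer class for this repository.

```python
# Minimal sketch: load the encoder-only part of FRED-T5-1.7B and count parameters.
# Assumes the HuggingFace checkpoint loads via T5EncoderModel/AutoTokenizer.
import torch
from transformers import AutoTokenizer, T5EncoderModel

model_name = "sberbank-ai/FRED-T5-1.7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = T5EncoderModel.from_pretrained(model_name)  # encoder weights only; decoder weights are dropped

# Parameter count of the encoder-only model (~760M according to the text above).
n_params = sum(p.numel() for p in encoder.parameters())
print(f"Encoder parameters: {n_params / 1e6:.0f}M")

# Encode a sample sentence; the hidden states would feed a task-specific head.
inputs = tokenizer("Пример предложения.", return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state  # (batch, seq_len, hidden_size)
print(hidden.shape)
```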
Hyper-parameters for fine-tuning: batch size of 16, epochs {10, 20, 30}, learning rate {1e-6, 1e-5, 2e-5, 3e-5, 1e-4}, linear learning-rate scheduler, warmup ratio {0.02, 0.05}, weight decay {0, 0.01, 0.1}.
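For reference, one point of this grid could be expressed with HuggingFace `TrainingArguments` roughly as below. This is a schematic sketch under the argument names used by transformers around early 2023, not the team's actual training script; the output path and best-model metric are hypothetical placeholders.

```python
# Schematic sketch of one fine-tuning configuration from the grid above.
# Output path and metric_for_best_model are hypothetical; pick per task.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fred-t5-encoder-terra",  # hypothetical output directory
    per_device_train_batch_size=16,
    num_train_epochs=20,                 # grid: {10, 20, 30}
    learning_rate=2e-5,                  # grid: {1e-6, 1e-5, 2e-5, 3e-5, 1e-4}
    lr_scheduler_type="linear",
    warmup_ratio=0.05,                   # grid: {0.02, 0.05}
    weight_decay=0.01,                   # grid: {0, 0.01, 0.1}
    evaluation_strategy="epoch",         # evaluate and save each epoch,
    save_strategy="epoch",               # keeping the best checkpoint by validation metric
    load_best_model_at_end=True,
    metric_for_best_model="accuracy",    # task-dependent (accuracy, F1, EM, MCC)
)
# training_args is then passed to a Trainer together with the encoder-based
# classifier and the task's train/validation splits.
```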
Category | Result |
---|---|
LOGIC | 0.255 |
KNOWLEDGE | 0.357 |
PREDICATE-ARGUMENT STRUCTURE | 0.453 |
LEXICAL SEMANTICS | 0.493 |
Lexical Semantics - Lexical Entailment | 0.477 |
Lexical Semantics - Morphological Negation | 0.395 |
Lexical Semantics - Factivity | 0.423 |
Lexical Semantics - Symmetry/Collectivity | 0.324 |
Lexical Semantics - Redundancy | 0.273 |
Lexical Semantics - Named Entities | 0.612 |
Lexical Semantics - Quantifiers | 0.316 |
Predicate-Argument Structure - Core Args | 0.549 |
Predicate-Argument Structure - Prepositional Phrases | 0.659 |
Predicate-Argument Structure - Ellipsis/Implicits | 0.526 |
Predicate-Argument Structure - Anaphora/Coreference | 0.335 |
Predicate-Argument Structure - Active/Passive | 0.284 |
Predicate-Argument Structure - Nominalization | 0.502 |
Predicate-Argument Structure - Genitives/Partitives | 0.157 |
Predicate-Argument Structure - Datives | 0.764 |
Predicate-Argument Structure - Relative Clauses | 0.333 |
Predicate-Argument Structure - Coordination Scopes | 0.509 |
Predicate-Argument Structure - Intersectivity | 0.397 |
Predicate-Argument Structure - Restrictivity | 0.287 |
Logic - Negation | 0.129 |
Logic - Double Negation | 0.213 |
Logic - Interval/Numbers | 0.011 |
Logic - Conjunction | 0.568 |
Logic - Disjunction | 0.164 |
Logic - Conditionals | 0.071 |
Logic - Universal | 0.255 |
Logic - Existential | 0.206 |
Logic - Temporal | 0.339 |
Logic - Upward Monotone | 0.821 |
Logic - Downward Monotone | -0.302 |
Logic - Non-Monotonic | 0.236 |
Knowledge - Common Sense | 0.359 |
Knowledge - World Knowledge | 0.339 |
Dataset | Speed | RAM |
---|---|---|
LiDiRus | - | - |
RCB | - | - |
PARus | - | - |
MuSeRC | - | - |
TERRa | - | - |
RUSSE | - | - |
RWSD | - | - |
DaNetQA | - | - |
RuCoS | - | - |