February 3, 2023, 14:21
Team: SberDevices
Model link: https://huggingface.co/sberbank-ai/ruElectra-medium

| Dataset | Result | Metric |
|---|---|---|
| LiDiRus | 0.182 | Matthews correlation coefficient |
| RCB | 0.413 / 0.525 | F1 / Accuracy |
| PARus | 0.576 | Accuracy |
| MuSeRC | 0.615 / 0.189 | F1a / EM |
| TERRa | 0.544 | Accuracy |
| RUSSE | 0.649 | Accuracy |
| RWSD | 0.669 | Accuracy |
| DaNetQA | 0.6 | Accuracy |
| RuCoS | 0.63 / 0.624 | F1 / EM |

ruElectra-medium is a critic encoder model pretrained with Google's ELECTRA code. It has 12 layers and a hidden size of 576. It was trained on a 100 GB Russian-language corpus, the same dataset used for the sbert_large_mt_nlu_ru models, and uses a WordPiece tokenizer. We use this model as a reward-critic model for RLHF and for black-box attacks. Training data sources: Taiga, Lenta, OpenSubtitles, Wiki, and others, all in the Russian-language domain. Scoring pipeline: https://github.com/ai-forever/rsg-baselines

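As a rough illustration of how the checkpoint linked above might be loaded as an encoder, the sketch below uses the Hugging Face `transformers` Auto classes. The model ID is taken from the model link; the `embed` helper and the mean-pooling step are illustrative assumptions, not the team's reward-critic setup or the RSG scoring pipeline.

```python
MODEL_ID = "sberbank-ai/ruElectra-medium"  # from the model link above


def embed(texts, model_id=MODEL_ID):
    """Return one mean-pooled vector per input text (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require
    # torch/transformers to be installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        output = model(**batch)

    # Mean pooling over real tokens only; padding is masked out.
    # Assumption: mean pooling is just one common way to get a sentence vector.
    mask = batch["attention_mask"].unsqueeze(-1).float()
    summed = (output.last_hidden_state * mask).sum(dim=1)
    return summed / mask.sum(dim=1).clamp(min=1e-9)
```

Calling `embed(["Привет, мир!"])` downloads the weights on first use; given the description above, the last dimension of the result should match the stated hidden size.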
| Category | Result |
|---|---|
| LOGIC | 0.13877200806086512 |
| KNOWLEDGE | 0.2542761568913483 |
| PREDICATE-ARGUMENT STRUCTURE | 0.16065050857951518 |
| LEXICAL SEMANTICS | 0.21126894813889507 |
| Lexical Semantics - Lexical Entailment | 0.05541499624957016 |
| Lexical Semantics - Morphological Negation | 0.05143444998736397 |
| Lexical Semantics - Factivity | 0.32387513781564786 |
| Lexical Semantics - Symmetry/Collectivity | -0.271746488194703 |
| Lexical Semantics - Redundancy | 0.33267391956523024 |
| Lexical Semantics - Named Entities | 0.35355339059327373 |
| Lexical Semantics - Quantifiers | 0.32360209754104324 |
| Predicate-Argument Structure - Core Args | 0.19599157740244455 |
| Predicate-Argument Structure - Prepositional Phrases | 0.29310451774759705 |
| Predicate-Argument Structure - Ellipsis/Implicits | 0.07042952122737638 |
| Predicate-Argument Structure - Anaphora/Coreference | 0.2811258418541589 |
| Predicate-Argument Structure - Active/Passive | 0.018620327436868072 |
| Predicate-Argument Structure - Nominalization | 0.09607689228305229 |
| Predicate-Argument Structure - Genitives/Partitives | 0.050251890762960605 |
| Predicate-Argument Structure - Datives | 0.2182178902359924 |
| Predicate-Argument Structure - Relative Clauses | -0.01642880193633814 |
| Predicate-Argument Structure - Coordination Scopes | 0.12087912087912088 |
| Predicate-Argument Structure - Intersectivity | 0.2724196464492864 |
| Predicate-Argument Structure - Restrictivity | -0.2958081738859997 |
| Logic - Negation | -0.004593297481561526 |
| Logic - Double Negation | 0.2119213177352503 |
| Logic - Interval/Numbers | -0.040522044923655395 |
| Logic - Conjunction | 0.08944271909999159 |
| Logic - Disjunction | 0.09855138041620624 |
| Logic - Conditionals | 0.1111111111111111 |
| Logic - Universal | 0.4029114820126901 |
| Logic - Existential | 0.2058790548922549 |
| Logic - Temporal | 0.15244937348544793 |
| Logic - Upward Monotone | 0.05923488777590923 |
| Logic - Downward Monotone | 0.014678923792502562 |
| Logic - Non-Monotonic | 0.18917776913478015 |
| Knowledge - Common Sense | 0.26383624620352564 |
| Knowledge - World Knowledge | 0.22110440420299576 |

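LiDiRus is scored with the Matthews correlation coefficient, which ranges from -1 to 1, so a negative value means predictions that correlate inversely with the gold labels. The sketch below shows the binary form of the metric, assuming labels encoded as 0/1 and assuming the per-category results above use the same metric as the overall LiDiRus score. The function name and the toy labels are illustrative; the actual scoring code is in the rsg-baselines repository.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Binary Matthews correlation coefficient for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: the score is 0 when any row or column of the
        # confusion matrix is empty (e.g. constant predictions).
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels (not taken from any dataset).
gold = [1, 0, 1, 0]
perfect = matthews_corrcoef(gold, [1, 0, 1, 0])    # all correct
inverted = matthews_corrcoef(gold, [0, 1, 0, 1])   # all wrong
constant = matthews_corrcoef(gold, [1, 1, 1, 1])   # always predicts 1
```

Here `perfect` is 1.0, `inverted` is -1.0, and `constant` is 0.0, which is why a score near zero indicates performance close to chance rather than a small share of correct answers.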
| Dataset | Speed | RAM |
|---|---|---|
| LiDiRus | - | - |
| RCB | - | - |
| PARus | - | - |
| MuSeRC | - | - |
| TERRa | - | - |
| RUSSE | - | - |
| RWSD | - | - |
| DaNetQA | - | - |
| RuCoS | - | - |