Leaderboard

* More information about speed scores and RAM usage is available here.
* Tasks showing two values report paired metrics: avg. F1 / accuracy for RCB, F1a / EM for MuSeRC, and F1 / EM for RuCoS. LiDiRus is scored with Matthews correlation; the remaining tasks report accuracy.

| Rank | Name | Team | Score | LiDiRus | RCB | PARus | MuSeRC | TERRa | RUSSE | RWSD | DaNetQA | RuCoS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | HUMAN BENCHMARK | AGI NLP | 0.811 | 0.626 | 0.68 / 0.702 | 0.982 | 0.806 / 0.42 | 0.92 | 0.805 | 0.84 | 0.915 | 0.93 / 0.89 |
| 2 | FRED-T5 1.7B finetune | SberDevices | 0.762 | 0.497 | 0.497 / 0.541 | 0.842 | 0.916 / 0.773 | 0.871 | 0.823 | 0.669 | 0.889 | 0.9 / 0.902 |
| 3 | Golden Transformer v2.0 | Avengers Ensemble | 0.755 | 0.515 | 0.384 / 0.534 | 0.906 | 0.936 / 0.804 | 0.877 | 0.687 | 0.643 | 0.911 | 0.92 / 0.924 |
| 4 | YaLM p-tune (3.3B frozen + 40k trainable params) | Yandex | 0.711 | 0.364 | 0.357 / 0.479 | 0.834 | 0.892 / 0.707 | 0.841 | 0.71 | 0.669 | 0.85 | 0.92 / 0.916 |
| 5 | FRED-T5 large finetune | SberDevices | 0.706 | 0.389 | 0.456 / 0.546 | 0.776 | 0.887 / 0.678 | 0.801 | 0.775 | 0.669 | 0.799 | 0.87 / 0.863 |
| 6 | RuLeanALBERT | Yandex Research | 0.698 | 0.403 | 0.361 / 0.413 | 0.796 | 0.874 / 0.654 | 0.812 | 0.789 | 0.669 | 0.76 | 0.9 / 0.902 |
| 7 | FRED-T5 1.7B (only encoder 760M) finetune | SberDevices | 0.694 | 0.421 | 0.311 / 0.441 | 0.806 | 0.882 / 0.666 | 0.831 | 0.723 | 0.669 | 0.735 | 0.91 / 0.911 |
| 8 | ruT5-large finetune | SberDevices | 0.686 | 0.32 | 0.45 / 0.532 | 0.764 | 0.855 / 0.608 | 0.775 | 0.773 | 0.669 | 0.79 | 0.86 / 0.859 |
| 9 | ruRoberta-large finetune | SberDevices | 0.684 | 0.343 | 0.357 / 0.518 | 0.722 | 0.861 / 0.63 | 0.801 | 0.748 | 0.669 | 0.82 | 0.87 / 0.867 |
| 10 | Golden Transformer v1.0 | Avengers Ensemble | 0.679 | 0.0 | 0.406 / 0.546 | 0.908 | 0.941 / 0.819 | 0.871 | 0.587 | 0.545 | 0.917 | 0.92 / 0.924 |
| 11 | xlm-roberta-large (Facebook) finetune | SberDevices | 0.654 | 0.369 | 0.328 / 0.457 | 0.59 | 0.809 / 0.501 | 0.798 | 0.765 | 0.669 | 0.757 | 0.89 / 0.886 |
| 12 | mdeberta-v3-base (Microsoft) finetune | SberDevices | 0.651 | 0.332 | 0.27 / 0.489 | 0.716 | 0.825 / 0.531 | 0.783 | 0.727 | 0.669 | 0.708 | 0.87 / 0.868 |
| 13 | ruT5-base finetune | SberDevices | 0.635 | 0.267 | 0.423 / 0.461 | 0.636 | 0.808 / 0.475 | 0.736 | 0.707 | 0.669 | 0.769 | 0.85 / 0.847 |
| 14 | ruBert-large finetune | SberDevices | 0.62 | 0.235 | 0.356 / 0.5 | 0.656 | 0.778 / 0.436 | 0.704 | 0.707 | 0.669 | 0.773 | 0.81 / 0.805 |
| 15 | ruBert-base finetune | SberDevices | 0.578 | 0.224 | 0.333 / 0.509 | 0.476 | 0.742 / 0.399 | 0.703 | 0.706 | 0.669 | 0.712 | 0.74 / 0.716 |
| 16 | YaLM 1.0B few-shot | Yandex | 0.577 | 0.124 | 0.408 / 0.447 | 0.766 | 0.673 / 0.364 | 0.605 | 0.587 | 0.669 | 0.637 | 0.86 / 0.859 |
| 17 | RuGPT3XL few-shot | SberDevices | 0.535 | 0.096 | 0.302 / 0.418 | 0.676 | 0.74 / 0.546 | 0.573 | 0.565 | 0.649 | 0.59 | 0.67 / 0.665 |
| 18 | RuBERT plain | DeepPavlov | 0.521 | 0.191 | 0.367 / 0.463 | 0.574 | 0.711 / 0.324 | 0.642 | 0.726 | 0.669 | 0.639 | 0.32 / 0.314 |
| 19 | SBERT_Large_mt_ru_finetuning | SberDevices | 0.514 | 0.218 | 0.351 / 0.486 | 0.498 | 0.642 / 0.319 | 0.637 | 0.657 | 0.675 | 0.697 | 0.35 / 0.347 |
| 20 | SBERT_Large | SberDevices | 0.51 | 0.209 | 0.371 / 0.452 | 0.498 | 0.646 / 0.327 | 0.637 | 0.654 | 0.662 | 0.675 | 0.36 / 0.351 |
| 21 | RuGPT3Large | SberDevices | 0.505 | 0.231 | 0.417 / 0.484 | 0.584 | 0.729 / 0.333 | 0.654 | 0.647 | 0.636 | 0.604 | 0.21 / 0.202 |
| 22 | RuBERT conversational | DeepPavlov | 0.5 | 0.178 | 0.452 / 0.484 | 0.508 | 0.687 / 0.278 | 0.64 | 0.729 | 0.669 | 0.606 | 0.22 / 0.218 |
| 23 | Multilingual Bert | DeepPavlov | 0.495 | 0.189 | 0.367 / 0.445 | 0.528 | 0.639 / 0.239 | 0.617 | 0.69 | 0.669 | 0.624 | 0.29 / 0.29 |
| 24 | heuristic majority | hse_ling | 0.468 | 0.147 | 0.4 / 0.438 | 0.478 | 0.671 / 0.237 | 0.549 | 0.595 | 0.669 | 0.642 | 0.26 / 0.257 |
| 25 | RuGPT3Medium | SberDevices | 0.468 | 0.01 | 0.372 / 0.461 | 0.598 | 0.706 / 0.308 | 0.505 | 0.642 | 0.669 | 0.634 | 0.23 / 0.224 |
| 26 | RuGPT3Small | SberDevices | 0.438 | -0.013 | 0.356 / 0.473 | 0.562 | 0.653 / 0.221 | 0.488 | 0.57 | 0.669 | 0.61 | 0.21 / 0.204 |
| 27 | Baseline TF-IDF1.1 | AGI NLP | 0.434 | 0.06 | 0.301 / 0.441 | 0.486 | 0.587 / 0.242 | 0.471 | 0.57 | 0.662 | 0.621 | 0.26 / 0.252 |
| 28 | Random weighted | hse_ling | 0.385 | 0.0 | 0.319 / 0.374 | 0.48 | 0.45 / 0.071 | 0.483 | 0.528 | 0.597 | 0.52 | 0.25 / 0.247 |
| 29 | majority_class | hse_ling | 0.374 | 0.0 | 0.217 / 0.484 | 0.498 | 0.0 / 0.0 | 0.513 | 0.587 | 0.669 | 0.503 | 0.25 / 0.247 |
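The numbers above are consistent with the overall Score being the unweighted mean of the nine task scores, where a task reporting two metrics is first collapsed to the mean of its pair. A minimal Python sketch checking this against the FRED-T5 1.7B finetune row (the dict below is transcribed from the table):

```python
def task_score(value):
    """Collapse a (metric1, metric2) pair to its mean; pass single floats through."""
    if isinstance(value, tuple):
        return sum(value) / len(value)
    return value

# Per-task results for FRED-T5 1.7B finetune, copied from the table above.
row = {
    "LiDiRus": 0.497,
    "RCB": (0.497, 0.541),
    "PARus": 0.842,
    "MuSeRC": (0.916, 0.773),
    "TERRa": 0.871,
    "RUSSE": 0.823,
    "RWSD": 0.669,
    "DaNetQA": 0.889,
    "RuCoS": (0.9, 0.902),
}

# Overall Score: unweighted mean over the nine tasks.
scores = [task_score(v) for v in row.values()]
overall = sum(scores) / len(scores)
print(f"overall score: {overall:.3f}")  # prints 0.762, matching the Score column
```

The same calculation also reproduces the HUMAN BENCHMARK row's 0.811, so this mean appears to be exactly what the Score column ranks by.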