Leaderboard

* More information about speed scores and RAM is available here.

| Rank | Name | Team | Score | LiDiRus | RCB | PARus | MuSeRC | TERRa | RUSSE | RWSD | DaNetQA | RuCoS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | HUMAN BENCHMARK | AGI NLP | 0.811 | 0.626 | 0.68 / 0.702 | 0.982 | 0.806 / 0.42 | 0.92 | 0.805 | 0.84 | 0.915 | 0.93 / 0.89 |
| 2 | ruadapt Solar 10.7 twostage | RCC MSU | 0.805 | 0.591 | 0.597 / 0.594 | 0.916 | 0.946 / 0.837 | 0.927 | 0.739 | 0.844 | 0.933 | 0.82 / 0.797 |
| 3 | Mistral 7B LoRA | Saiga team | 0.763 | 0.46 | 0.529 / 0.573 | 0.824 | 0.927 / 0.787 | 0.888 | 0.758 | 0.786 | 0.919 | 0.83 / 0.816 |
| 4 | FRED-T5 1.7B finetune | SberDevices | 0.762 | 0.497 | 0.497 / 0.541 | 0.842 | 0.916 / 0.773 | 0.871 | 0.823 | 0.669 | 0.889 | 0.9 / 0.902 |
| 5 | Golden Transformer v2.0 | Avengers Ensemble | 0.755 | 0.515 | 0.384 / 0.534 | 0.906 | 0.936 / 0.804 | 0.877 | 0.687 | 0.643 | 0.911 | 0.92 / 0.924 |
| 6 | LLaMA-2 13B LoRA | Saiga team | 0.718 | 0.398 | 0.489 / 0.543 | 0.784 | 0.919 / 0.761 | 0.793 | 0.74 | 0.714 | 0.907 | 0.78 / 0.76 |
| 7 | Saiga 13B LoRA | Saiga team | 0.712 | 0.436 | 0.439 / 0.5 | 0.694 | 0.898 / 0.704 | 0.865 | 0.728 | 0.714 | 0.862 | 0.85 / 0.83 |
| 8 | YaLM p-tune (3.3B frozen + 40k trainable params) | Yandex | 0.711 | 0.364 | 0.357 / 0.479 | 0.834 | 0.892 / 0.707 | 0.841 | 0.71 | 0.669 | 0.85 | 0.92 / 0.916 |
| 9 | ruadapt LLaMA-2 7B LoRA | RCC MSU | 0.71 | 0.417 | 0.545 / 0.555 | 0.756 | 0.894 / 0.695 | 0.876 | 0.668 | 0.708 | 0.878 | 0.76 / 0.733 |
| 10 | FRED-T5 large finetune | SberDevices | 0.706 | 0.389 | 0.456 / 0.546 | 0.776 | 0.887 / 0.678 | 0.801 | 0.775 | 0.669 | 0.799 | 0.87 / 0.863 |
| 11 | RuLeanALBERT | Yandex Research | 0.698 | 0.403 | 0.361 / 0.413 | 0.796 | 0.874 / 0.654 | 0.812 | 0.789 | 0.669 | 0.76 | 0.9 / 0.902 |
| 12 | FRED-T5 1.7B (only encoder 760M) finetune | SberDevices | 0.694 | 0.421 | 0.311 / 0.441 | 0.806 | 0.882 / 0.666 | 0.831 | 0.723 | 0.669 | 0.735 | 0.91 / 0.911 |
| 13 | ruT5-large finetune | SberDevices | 0.686 | 0.32 | 0.45 / 0.532 | 0.764 | 0.855 / 0.608 | 0.775 | 0.773 | 0.669 | 0.79 | 0.86 / 0.859 |
| 14 | ruRoberta-large finetune | SberDevices | 0.684 | 0.343 | 0.357 / 0.518 | 0.722 | 0.861 / 0.63 | 0.801 | 0.748 | 0.669 | 0.82 | 0.87 / 0.867 |
| 15 | gpt-3.5-turbo zero-shot | Saiga team | 0.682 | 0.422 | 0.484 / 0.505 | 0.888 | 0.817 / 0.532 | 0.795 | 0.596 | 0.714 | 0.878 | 0.68 / 0.667 |
| 16 | Golden Transformer v1.0 | Avengers Ensemble | 0.679 | 0.0 | 0.406 / 0.546 | 0.908 | 0.941 / 0.819 | 0.871 | 0.587 | 0.545 | 0.917 | 0.92 / 0.924 |
| 17 | xlm-roberta-large (Facebook) finetune | SberDevices | 0.654 | 0.369 | 0.328 / 0.457 | 0.59 | 0.809 / 0.501 | 0.798 | 0.765 | 0.669 | 0.757 | 0.89 / 0.886 |
| 18 | mdeberta-v3-base (Microsoft) finetune | SberDevices | 0.651 | 0.332 | 0.27 / 0.489 | 0.716 | 0.825 / 0.531 | 0.783 | 0.727 | 0.669 | 0.708 | 0.87 / 0.868 |
| 19 | Saiga2 70B zero-shot | Saiga team | 0.643 | 0.365 | 0.385 / 0.461 | 0.82 | 0.669 / 0.098 | 0.811 | 0.59 | 0.831 | 0.878 | 0.69 / 0.678 |
| 20 | Saiga Mistral 7B zero-shot | Saiga team | 0.635 | 0.322 | 0.436 / 0.5 | 0.698 | 0.84 / 0.553 | 0.807 | 0.587 | 0.727 | 0.839 | 0.58 / 0.571 |
| 21 | ruT5-base finetune | Sberdevices | 0.635 | 0.267 | 0.423 / 0.461 | 0.636 | 0.808 / 0.475 | 0.736 | 0.707 | 0.669 | 0.769 | 0.85 / 0.847 |
| 22 | ruBert-large finetune | SberDevices | 0.62 | 0.235 | 0.356 / 0.5 | 0.656 | 0.778 / 0.436 | 0.704 | 0.707 | 0.669 | 0.773 | 0.81 / 0.805 |
| 23 | ruBert-base finetune | SberDevices | 0.578 | 0.224 | 0.333 / 0.509 | 0.476 | 0.742 / 0.399 | 0.703 | 0.706 | 0.669 | 0.712 | 0.74 / 0.716 |
| 24 | YaLM 1.0B few-shot | Yandex | 0.577 | 0.124 | 0.408 / 0.447 | 0.766 | 0.673 / 0.364 | 0.605 | 0.587 | 0.669 | 0.637 | 0.86 / 0.859 |
| 25 | Saiga 13B zero-shot | Saiga team | 0.554 | 0.293 | 0.42 / 0.466 | 0.63 | 0.681 / 0.223 | 0.702 | 0.565 | 0.675 | 0.763 | 0.47 / 0.458 |
| 26 | RuGPT3XL few-shot | SberDevices | 0.535 | 0.096 | 0.302 / 0.418 | 0.676 | 0.74 / 0.546 | 0.573 | 0.565 | 0.649 | 0.59 | 0.67 / 0.665 |
| 27 | ruElectra-medium finetune | SberDevices | 0.524 | 0.182 | 0.413 / 0.525 | 0.576 | 0.615 / 0.189 | 0.544 | 0.649 | 0.669 | 0.6 | 0.63 / 0.624 |
| 28 | ruElectra-large finetune | SberDevices | 0.522 | 0.197 | 0.386 / 0.459 | 0.644 | 0.549 / 0.078 | 0.583 | 0.632 | 0.669 | 0.627 | 0.61 / 0.607 |
| 29 | RuBERT plain | DeepPavlov | 0.521 | 0.191 | 0.367 / 0.463 | 0.574 | 0.711 / 0.324 | 0.642 | 0.726 | 0.669 | 0.639 | 0.32 / 0.314 |
| 30 | SBERT_Large_mt_ru_finetuning | SberDevices | 0.514 | 0.218 | 0.351 / 0.486 | 0.498 | 0.642 / 0.319 | 0.637 | 0.657 | 0.675 | 0.697 | 0.35 / 0.347 |
| 31 | SBERT_Large | SberDevices | 0.51 | 0.209 | 0.371 / 0.452 | 0.498 | 0.646 / 0.327 | 0.637 | 0.654 | 0.662 | 0.675 | 0.36 / 0.351 |
| 32 | ruElectra-small finetune | SberDevices | 0.505 | 0.106 | 0.346 / 0.461 | 0.564 | 0.628 / 0.21 | 0.54 | 0.592 | 0.669 | 0.658 | 0.6 / 0.596 |
| 33 | RuGPT3Large | SberDevices | 0.505 | 0.231 | 0.417 / 0.484 | 0.584 | 0.729 / 0.333 | 0.654 | 0.647 | 0.636 | 0.604 | 0.21 / 0.202 |
| 34 | RuBERT conversational | DeepPavlov | 0.5 | 0.178 | 0.452 / 0.484 | 0.508 | 0.687 / 0.278 | 0.64 | 0.729 | 0.669 | 0.606 | 0.22 / 0.218 |
| 35 | Multilingual Bert | DeepPavlov | 0.495 | 0.189 | 0.367 / 0.445 | 0.528 | 0.639 / 0.239 | 0.617 | 0.69 | 0.669 | 0.624 | 0.29 / 0.29 |
| 36 | heuristic majority | hse_ling | 0.468 | 0.147 | 0.4 / 0.438 | 0.478 | 0.671 / 0.237 | 0.549 | 0.595 | 0.669 | 0.642 | 0.26 / 0.257 |
| 37 | RuGPT3Medium | SberDevices | 0.468 | 0.01 | 0.372 / 0.461 | 0.598 | 0.706 / 0.308 | 0.505 | 0.642 | 0.669 | 0.634 | 0.23 / 0.224 |
| 38 | RuGPT3Small | SberDevices | 0.438 | -0.013 | 0.356 / 0.473 | 0.562 | 0.653 / 0.221 | 0.488 | 0.57 | 0.669 | 0.61 | 0.21 / 0.204 |
| 39 | Baseline TF-IDF1.1 | AGI NLP | 0.434 | 0.06 | 0.301 / 0.441 | 0.486 | 0.587 / 0.242 | 0.471 | 0.57 | 0.662 | 0.621 | 0.26 / 0.252 |
| 40 | Random weighted | hse_ling | 0.385 | 0.0 | 0.319 / 0.374 | 0.48 | 0.45 / 0.071 | 0.483 | 0.528 | 0.597 | 0.52 | 0.25 / 0.247 |
| 41 | majority_class | hse_ling | 0.374 | 0.0 | 0.217 / 0.484 | 0.498 | 0.0 / 0.0 | 0.513 | 0.587 | 0.669 | 0.503 | 0.25 / 0.247 |
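
The page does not state how the Score column is derived from the per-task columns. For the three rows checked below, the listed Score is consistent with an unweighted mean over the nine tasks, where a task reported as two metrics (RCB, MuSeRC, RuCoS) is first reduced to the mean of its two values. The following Python sketch recomputes it under that assumption; the helper name `overall_score` and the `ROWS` structure are hypothetical and used only for illustration.

```python
from statistics import mean

# Values copied from the leaderboard rows above. Task order:
# LiDiRus, RCB, PARus, MuSeRC, TERRa, RUSSE, RWSD, DaNetQA, RuCoS.
# Tasks shown as "a / b" are stored as (a, b) tuples.
ROWS = {
    "HUMAN BENCHMARK": (0.811, [0.626, (0.68, 0.702), 0.982, (0.806, 0.42),
                                0.92, 0.805, 0.84, 0.915, (0.93, 0.89)]),
    "ruadapt Solar 10.7 twostage": (0.805, [0.591, (0.597, 0.594), 0.916,
                                            (0.946, 0.837), 0.927, 0.739,
                                            0.844, 0.933, (0.82, 0.797)]),
    "majority_class": (0.374, [0.0, (0.217, 0.484), 0.498, (0.0, 0.0),
                               0.513, 0.587, 0.669, 0.503, (0.25, 0.247)]),
}


def overall_score(task_values):
    """Assumed aggregation: average two-metric tasks, then average all tasks."""
    per_task = [mean(v) if isinstance(v, tuple) else v for v in task_values]
    return mean(per_task)


for name, (listed, tasks) in ROWS.items():
    print(f"{name}: listed={listed:.3f} recomputed={overall_score(tasks):.3f}")
```

Only these three rows were checked, so the agreement should be read as an observation about the table rather than as the benchmark's documented scoring rule.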