Leaderboard

* More information about speed scores and RAM usage is available here.
* Tasks showing two values report paired metrics: avg. F1 / accuracy for RCB, F1a / EM for MuSeRC, and F1 / EM for RuCoS. LiDiRus is scored with Matthews correlation; the remaining tasks report accuracy.

| Rank | Name | Team | Score | LiDiRus | RCB | PARus | MuSeRC | TERRa | RUSSE | RWSD | DaNetQA | RuCoS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | HUMAN BENCHMARK | AGI NLP | 0.811 | 0.626 | 0.68 / 0.702 | 0.982 | 0.806 / 0.42 | 0.92 | 0.805 | 0.84 | 0.915 | 0.93 / 0.89 |
| 2 | FRED-T5 1.7B finetune | SberDevices | 0.762 | 0.497 | 0.497 / 0.541 | 0.842 | 0.916 / 0.773 | 0.871 | 0.823 | 0.669 | 0.889 | 0.9 / 0.902 |
| 3 | Golden Transformer v2.0 | Avengers Ensemble | 0.755 | 0.515 | 0.384 / 0.534 | 0.906 | 0.936 / 0.804 | 0.877 | 0.687 | 0.643 | 0.911 | 0.92 / 0.924 |
| 4 | YaLM p-tune (3.3B frozen + 40k trainable params) | Yandex | 0.711 | 0.364 | 0.357 / 0.479 | 0.834 | 0.892 / 0.707 | 0.841 | 0.71 | 0.669 | 0.85 | 0.92 / 0.916 |
| 5 | FRED-T5 large finetune | SberDevices | 0.706 | 0.389 | 0.456 / 0.546 | 0.776 | 0.887 / 0.678 | 0.801 | 0.775 | 0.669 | 0.799 | 0.87 / 0.863 |
| 6 | RuLeanALBERT | Yandex Research | 0.698 | 0.403 | 0.361 / 0.413 | 0.796 | 0.874 / 0.654 | 0.812 | 0.789 | 0.669 | 0.76 | 0.9 / 0.902 |
| 7 | FRED-T5 1.7B (only encoder 760M) finetune | SberDevices | 0.694 | 0.421 | 0.311 / 0.441 | 0.806 | 0.882 / 0.666 | 0.831 | 0.723 | 0.669 | 0.735 | 0.91 / 0.911 |
| 8 | ruT5-large finetune | SberDevices | 0.686 | 0.32 | 0.45 / 0.532 | 0.764 | 0.855 / 0.608 | 0.775 | 0.773 | 0.669 | 0.79 | 0.86 / 0.859 |
| 9 | ruRoberta-large finetune | SberDevices | 0.684 | 0.343 | 0.357 / 0.518 | 0.722 | 0.861 / 0.63 | 0.801 | 0.748 | 0.669 | 0.82 | 0.87 / 0.867 |
| 10 | Golden Transformer v1.0 | Avengers Ensemble | 0.679 | 0.0 | 0.406 / 0.546 | 0.908 | 0.941 / 0.819 | 0.871 | 0.587 | 0.545 | 0.917 | 0.92 / 0.924 |
| 11 | xlm-roberta-large (Facebook) finetune | SberDevices | 0.654 | 0.369 | 0.328 / 0.457 | 0.59 | 0.809 / 0.501 | 0.798 | 0.765 | 0.669 | 0.757 | 0.89 / 0.886 |
| 12 | mdeberta-v3-base (Microsoft) finetune | SberDevices | 0.651 | 0.332 | 0.27 / 0.489 | 0.716 | 0.825 / 0.531 | 0.783 | 0.727 | 0.669 | 0.708 | 0.87 / 0.868 |
| 13 | ruT5-base finetune | SberDevices | 0.635 | 0.267 | 0.423 / 0.461 | 0.636 | 0.808 / 0.475 | 0.736 | 0.707 | 0.669 | 0.769 | 0.85 / 0.847 |
| 14 | ruBert-large finetune | SberDevices | 0.62 | 0.235 | 0.356 / 0.5 | 0.656 | 0.778 / 0.436 | 0.704 | 0.707 | 0.669 | 0.773 | 0.81 / 0.805 |
| 15 | ruBert-base finetune | SberDevices | 0.578 | 0.224 | 0.333 / 0.509 | 0.476 | 0.742 / 0.399 | 0.703 | 0.706 | 0.669 | 0.712 | 0.74 / 0.716 |
| 16 | YaLM 1.0B few-shot | Yandex | 0.577 | 0.124 | 0.408 / 0.447 | 0.766 | 0.673 / 0.364 | 0.605 | 0.587 | 0.669 | 0.637 | 0.86 / 0.859 |
| 17 | RuGPT3XL few-shot | SberDevices | 0.535 | 0.096 | 0.302 / 0.418 | 0.676 | 0.74 / 0.546 | 0.573 | 0.565 | 0.649 | 0.59 | 0.67 / 0.665 |
| 18 | RuBERT plain | DeepPavlov | 0.521 | 0.191 | 0.367 / 0.463 | 0.574 | 0.711 / 0.324 | 0.642 | 0.726 | 0.669 | 0.639 | 0.32 / 0.314 |
| 19 | SBERT_Large_mt_ru_finetuning | SberDevices | 0.514 | 0.218 | 0.351 / 0.486 | 0.498 | 0.642 / 0.319 | 0.637 | 0.657 | 0.675 | 0.697 | 0.35 / 0.347 |
| 20 | SBERT_Large | SberDevices | 0.51 | 0.209 | 0.371 / 0.452 | 0.498 | 0.646 / 0.327 | 0.637 | 0.654 | 0.662 | 0.675 | 0.36 / 0.351 |
| 21 | RuGPT3Large | SberDevices | 0.505 | 0.231 | 0.417 / 0.484 | 0.584 | 0.729 / 0.333 | 0.654 | 0.647 | 0.636 | 0.604 | 0.21 / 0.202 |
| 22 | RuBERT conversational | DeepPavlov | 0.5 | 0.178 | 0.452 / 0.484 | 0.508 | 0.687 / 0.278 | 0.64 | 0.729 | 0.669 | 0.606 | 0.22 / 0.218 |
| 23 | Multilingual Bert | DeepPavlov | 0.495 | 0.189 | 0.367 / 0.445 | 0.528 | 0.639 / 0.239 | 0.617 | 0.69 | 0.669 | 0.624 | 0.29 / 0.29 |
| 24 | heuristic majority | hse_ling | 0.468 | 0.147 | 0.4 / 0.438 | 0.478 | 0.671 / 0.237 | 0.549 | 0.595 | 0.669 | 0.642 | 0.26 / 0.257 |
| 25 | RuGPT3Medium | SberDevices | 0.468 | 0.01 | 0.372 / 0.461 | 0.598 | 0.706 / 0.308 | 0.505 | 0.642 | 0.669 | 0.634 | 0.23 / 0.224 |
| 26 | RuGPT3Small | SberDevices | 0.438 | -0.013 | 0.356 / 0.473 | 0.562 | 0.653 / 0.221 | 0.488 | 0.57 | 0.669 | 0.61 | 0.21 / 0.204 |
| 27 | Baseline TF-IDF1.1 | AGI NLP | 0.434 | 0.06 | 0.301 / 0.441 | 0.486 | 0.587 / 0.242 | 0.471 | 0.57 | 0.662 | 0.621 | 0.26 / 0.252 |
| 28 | Random weighted | hse_ling | 0.385 | 0.0 | 0.319 / 0.374 | 0.48 | 0.45 / 0.071 | 0.483 | 0.528 | 0.597 | 0.52 | 0.25 / 0.247 |
| 29 | majority_class | hse_ling | 0.374 | 0.0 | 0.217 / 0.484 | 0.498 | 0.0 / 0.0 | 0.513 | 0.587 | 0.669 | 0.503 | 0.25 / 0.247 |
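The numbers above are consistent with the overall Score being the unweighted mean of the nine task scores, where a task reporting two metrics is first collapsed to the mean of its pair. A minimal Python sketch checking this against the FRED-T5 1.7B finetune row (the dict below is transcribed from the table):

```python
def task_score(value):
    """Collapse a (metric1, metric2) pair to its mean; pass single floats through."""
    if isinstance(value, tuple):
        return sum(value) / len(value)
    return value

# Per-task results for FRED-T5 1.7B finetune, copied from the table above.
row = {
    "LiDiRus": 0.497,
    "RCB": (0.497, 0.541),
    "PARus": 0.842,
    "MuSeRC": (0.916, 0.773),
    "TERRa": 0.871,
    "RUSSE": 0.823,
    "RWSD": 0.669,
    "DaNetQA": 0.889,
    "RuCoS": (0.9, 0.902),
}

# Overall Score: unweighted mean over the nine tasks.
scores = [task_score(v) for v in row.values()]
overall = sum(scores) / len(scores)
print(f"overall score: {overall:.3f}")  # prints 0.762, matching the Score column
```

The same calculation also reproduces the HUMAN BENCHMARK row's 0.811, so this mean appears to be exactly what the Score column ranks by.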