Submission gpt-3.5-turbo zero-shot

July 6, 2023, 5:17 p.m.

Team: Saiga team

Model url: https://platform.openai.com/docs/models/gpt-3-5


Total score: 0.682

Dataset Score Metric
LiDiRus 0.422 Matthew`s Corr
RCB 0.484 / 0.505 F1/Acc
PARus 0.888 Accuracy
MuSeRC 0.817 / 0.532 F1a/Em
TERRa 0.795 Accuracy
RUSSE 0.596 Accuracy
RWSD 0.714 Accuracy
DaNetQA 0.878 Accuracy
RuCoS 0.68 / 0.667 F1/EM
Model description:

gpt-3.5-turbo as is with prompts from here: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/src/eval_rsg.py


Parameter description:

temperature = 0.0 top_p = 1.0

Diagnostic (Matthew`s Correlation): 0.422

Category Score
LOGIC 0.27724846114423907
KNOWLEDGE 0.48593972117950296
PREDICATE-ARGUMENT STRUCTURE 0.4222012507483124
LEXICAL SEMANTICS 0.4507478531035058
Lexical Semantics - Lexical Entailment 0.4587428680014169
Lexical Semantics - Morphological Negation 0.2140767569329586
Lexical Semantics - Factivity 0.34020690871988585
Lexical Semantics - Symmetry/Collectivity 0.6243713415848884
Lexical Semantics - Redundancy 0.2211629342323457
Lexical Semantics - Named Entities 0.6666666666666666
Lexical Semantics - Quantifiers 0.37854910518078255
Predicate-Argument Structure Core Args 0.5760964547890037
Predicate-Argument Structure Prepositional Phrases 0.48034143356248
Predicate-Argument Structure Ellipsis/Implicits 0.5493502655735357
Predicate-Argument Structure Anaphora/Coreference 0.44034755759456745
Predicate-Argument Structure Active/Passive 0.5771944181220839
Predicate-Argument Structure Nominalization 0.40881490876633847
Predicate-Argument Structure Genitives/Partitives 0.5773502691896257
Predicate-Argument Structure Datives 0.629940788348712
Predicate-Argument Structure Relative Clauses 0.33954987505086615
Predicate-Argument Structure Coordination Scopes 0.13725270326150324
Predicate-Argument Structure Intersectivity 0.33452515977294983
Predicate-Argument Structure Restrictivity 0.040422604172722164
Logic Negation 0.3221028323526659
Logic Double Negation 0.2763853991962833
Logic Interval/Numbers 0.12017278061240777
Logic Conjuction 0.24809590313546123
Logic Disjunction 0.053838190205816545
Logic Conditionals -0.07100716024967263
Logic Universal 0.5324675324675324
Logic Existential 0.3851644432598216
Logic Temporal 0.6386392673039035
Logic Upward Monotone 0.3612343522752406
Logic Downward Monotone 0.008988968316207744
Logic Non-Monotonic 0.2895702534395041
Knowledge Common Sense 0.45069440991558835
Knowledge World Knowledge 0.5198996752635257

Performance:

Dataset Speed RAM
LiDiRus - -
RCB - -
PARus - -
MuSeRC - -
TERRa - -
RUSSE - -
RWSD - -
DaNetQA - -
RuCoS - -