small features: add option to save cache in parquet, save judge input…#35
…, improve error handling of openrouter
    completion_A: str  # completion of the first model
    completion_B: str  # completion of the second model
    judge_completion: str  # output of the judge
    judge_input: str | None = None  # input that was passed to the judge
Should this be added to the estimate_elo_ratings.py workflow as well?
It uses judge_and_parse_prefs from this file so it is updated as well if I am not mistaken.
Yes, but I think we are dropping it here because we are constructing the DataFrame manually.
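One way to avoid dropping a newly added field when the DataFrame is assembled by hand is to derive the columns from the dataclass itself. A minimal sketch, assuming the annotations are dataclass instances; the class name `JudgeAnnotation` and the sample values are hypothetical, and only the four field names come from the diff above:

```python
from dataclasses import asdict, dataclass
from typing import Optional

import pandas as pd


@dataclass
class JudgeAnnotation:  # hypothetical name, not taken from the PR
    completion_A: str  # completion of the first model
    completion_B: str  # completion of the second model
    judge_completion: str  # output of the judge
    judge_input: Optional[str] = None  # input that was passed to the judge


# Made-up sample data for illustration only.
annotations = [
    JudgeAnnotation("a1", "b1", "A is better", judge_input="prompt 1"),
    JudgeAnnotation("a2", "b2", "B is better"),
]

# asdict() emits every field, so a field added to the dataclass later
# shows up as a column without touching this line.
df = pd.DataFrame([asdict(a) for a in annotations])
print(list(df.columns))
```

With this approach the column list follows the dataclass definition, so `judge_input` cannot be silently left out.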
        return x

    for col in df.select_dtypes(include="object").columns:
        df[col] = df[col].apply(_to_python).astype(str)
I think we should be careful here if the dataframe can contain missing values. Calling .astype(str) on missing values (None or np.nan) converts them into strings "None" and "nan". When the parquet file is read back, they would be processed as strings instead of missing values.
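The concern can be reproduced in a few lines. This is only a sketch: it uses `.map(str)` as a stand-in for the element-wise string conversion, since the exact treatment of missing values by `.astype(str)` may differ between pandas versions, and the sample values are made up:

```python
import numpy as np
import pandas as pd

# An object column holding one real value and two kinds of missing value.
raw = pd.Series(["some text", None, np.nan], dtype=object)

# Element-wise str() turns the missing entries into ordinary strings.
as_text = raw.map(str)
print(as_text.tolist())

# The missingness information is gone after the conversion.
missing_before = int(raw.isna().sum())
missing_after = int(as_text.isna().sum())
print(missing_before, missing_after)
```

After the conversion, nothing in the column is flagged as missing any more, so a reader of the saved file has no way to tell a real "None" string from a missing value.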
Yes, I agree, but I don't see another way to serialize to parquet. I agree that this conversion loses the missingness information, but I think all downstream code should probably exclude empty strings too when computing annotations.
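A possible alternative, sketched here under assumptions rather than taken from the PR, is to stringify only the non-missing entries and leave the rest as `None`. The helper names `_is_missing` and `stringify_object_columns` and the sample data are hypothetical:

```python
import math

import pandas as pd


def _is_missing(value):
    # Treat None and float NaN as missing; lists, dicts, etc. are not.
    return value is None or (isinstance(value, float) and math.isnan(value))


def stringify_object_columns(df):
    """Return a copy where object columns hold strings or missing values."""
    out = df.copy()
    for col in out.select_dtypes(include="object").columns:
        out[col] = out[col].map(lambda v: None if _is_missing(v) else str(v))
    return out


# Made-up sample data for illustration only.
df = pd.DataFrame(
    {
        "judge_input": pd.Series(["prompt", None, float("nan")], dtype=object),
        "meta": pd.Series([{"k": 1}, [1, 2], None], dtype=object),
    }
)
clean = stringify_object_columns(df)
print(clean.isna().sum().to_dict())
```

Because the missing entries stay missing in the DataFrame, a parquet writer has the chance to store them as nulls instead of the strings "None" and "nan". Whether that fits the cache format used in this PR would need to be checked.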
…, improve error handling of openrouter, remove compute_cohen_kappa
Reasons: