RYS loss

#5
by dnhkng - opened

How were the loss curves on the RYS models? Did you see a faster initial drop than with calme-2.1 on the base Qwen2-72B model?

The loss curve looks identical to that of the calme-2.1-qwen2-72b model. The differences are negligible in this case, considering the RLHF. The model is too large for me to evaluate locally on those benchmarks, so I don't know how much improvement to expect.

Is the training data public? I see a calme legal, is that it?

The dataset used for RLHF is listed in the README.

Ah cool, thanks!

Did you hand-generate these? 1k pairs seems within the limit of hand-tailored DPO pairs, barely 😅

I've not tried fine-tuning in anger before. When I'm back from holiday in September, let me know if you are interested in meeting. I'm always interested to meet other AI enthusiasts!

Wow, I'm really shocked this didn't take first place.

I assumed this would be top of the Leaderboard when the results came in. Very surprising outcome, considering the procedures used.

Maybe we should do some collaboration?

To be honest, this one scored higher locally. The seed, the batch, etc. cause some small precision changes. It's fine for me; I was trying to top my previous model without making it worse. So, mission accomplished.
