Ways to measure how well the model is performing:
Perplexity
Human Eval (optional)
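For anyone picking up the perplexity part, here is a minimal sketch of the calculation itself: perplexity is the exponential of the average negative log-likelihood per token. The function name `perplexity` and the sample log-probabilities are hypothetical and are not taken from the feat/evaluation branch. In practice the log-probabilities would come from running the model on held-out code.

```python
import math


def perplexity(token_log_probs):
    """Return perplexity given per-token natural-log probabilities.

    Hypothetical helper for illustration; not the branch's implementation.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    # Perplexity is exp of the average negative log-likelihood.
    return math.exp(avg_nll)


if __name__ == "__main__":
    # Made-up example: the model assigns probability 0.25 to each of 4 tokens.
    sample_log_probs = [math.log(0.25)] * 4
    print(perplexity(sample_log_probs))
```

Lower is better: a model that assigns probability 1 to every correct token has perplexity 1, and a model that guesses uniformly among N options has perplexity N.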
The work for this is on the feat/evaluation branch:
https://github.com/MichiganDataScienceTeam/F24-mini-copilot/tree/feat/evaluation
I will be gone this weekend, so I'm opening this issue now so that anyone can get started.