Conversation
| lam=10.0, | ||
| n_epochs=100, | ||
| verbose=False))), | ||
| train_ratio=0.8, |
There was a problem hiding this comment.
One simple improvement after discussing with Richard Zhang is to make the size of the cross-validation set adaptive:
- When we start off we have little data, so maybe a 80/20 split produces too small a validation set size.
- Once we have more data, a 80/20 split might be too big, i.e. we are being too conservative as the algorithm progresses (because we shrink the training set size more than necessary), and as a consequence calling Compass more than we should to meet the user's specified reconstruction quality.
There was a problem hiding this comment.
It sounds to me that what would be ideal would be to use a fixed CV size for each reaction, say size 50. In that case I would need to implement another concretion for CVMatrixCompletionModel which performs CV splitting based on a fixed CV size, rather than on a percentage.
There was a problem hiding this comment.
(or, rather than creating a new concretion, cleverly factor out the CV-splitting behavior to get the new behavior via composition)
There was a problem hiding this comment.
Or maybe, it's easier to just add a max_cv_size argument to the TrainValSplitCVMatrixCompletionModel API.
Was able to install on the server and run the test suite.