Hello Anomaly Detection Team,
I’m currently testing the workflow using a single plate, following the instructions and your paper. However, I foresee some issues and would appreciate your advice:
- Very few training rows: after splitting the controls (train/val/test), I have only 16 control rows in total, just 6 of them for training. This seems low - do you expect reasonable results with so few controls? What is a typical shape of your data here? Mine is currently:
  `INFO: Train controls: (6, 391), Validation controls: (1, 391), Test controls: (9, 391), Treatments: (368, 391)`
- Shape consistency: I assume the number of columns (features) must be identical between training and inference (as in CLIPn/PyTorch workflows), correct? Otherwise PyTorch complains?
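To make the shape-consistency question concrete, here is a minimal sketch of the kind of column alignment I have in mind before passing inference data to a fixed-width model. The column names and data are made up for illustration, not from the actual workflow:

```python
# Hypothetical sketch: a model trained on a fixed set of feature columns
# only accepts inputs with exactly those columns, in the same order, so
# align inference data to the training schema first.
import pandas as pd

train_cols = ["f1", "f2", "f3"]                      # columns seen at training
inference = pd.DataFrame({"f2": [0.1], "f3": [0.2],  # different order,
                          "f4": [9.9]})              # plus an unseen column

# Reorder to the training schema; unseen columns (f4) are dropped, and
# columns missing at inference (f1) appear as NaN and must be imputed
# or the rows rejected before prediction.
aligned = inference.reindex(columns=train_cols)
print(list(aligned.columns))  # ['f1', 'f2', 'f3']
```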
- Multi-plate datasets: if I want to use more than one plate, what is your suggested approach? Should I standardise the features on each plate with StandardScaler before concatenating, to reduce batch effects? This would let me integrate more data for training/validation, and then I could include other reference datasets … Is this how you approach this?
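For reference, this is roughly what I mean by per-plate standardisation before concatenation. It is only a sketch with synthetic data; the plate contents, column names, and shapes are invented for illustration:

```python
# Hypothetical sketch: fit a separate StandardScaler per plate, then stack
# the scaled plates so plate-level location/scale differences are removed.
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

def standardise_per_plate(plates):
    """Scale each plate independently, then concatenate row-wise."""
    scaled = []
    for df in plates:
        scaler = StandardScaler()  # zero mean, unit variance per feature
        scaled.append(pd.DataFrame(scaler.fit_transform(df),
                                   columns=df.columns, index=df.index))
    return pd.concat(scaled, axis=0, ignore_index=True)

rng = np.random.default_rng(0)
# Two synthetic "plates" with deliberately different offsets/scales.
plate_a = pd.DataFrame(rng.normal(0, 1, (16, 4)), columns=list("wxyz"))
plate_b = pd.DataFrame(rng.normal(5, 2, (20, 4)), columns=list("wxyz"))

combined = standardise_per_plate([plate_a, plate_b])
print(combined.shape)  # (36, 4)
```

After scaling, each plate's features have mean 0 and unit variance, so the concatenated matrix no longer carries the per-plate offsets.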
Any guidance or best practices for combining multiple plates and increasing training data would be really helpful.
Thanks for your time!
Peter Thorpe