This note is for whoever (including me) picks this work back up! I am currently testing the multimodal configuration:
docker compose --profile prod-rocm-wsl run --rm prod-rocm-wsl python -m oncolearn.trainer --config data/configs/modeling/multimodal/tcga_brca_cbioportal_pam5
This uses 5-fold cross-validation, where each fold has its own train / test split. I have run a hyperparameter search using Optuna:
- For each trial, training for 10 epochs on the train split, then testing on the test split
- Averaging the F1 scores on the test splits across the 5 folds to get the metric to maximize
- Tuning to maximize this F1 score
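For anyone picking this up, here is a rough sketch of the search loop described above. It is not the actual code in `oncolearn.trainer`: `make_folds`, `mean_fold_f1`, `run_search`, the `train_and_score` callback and the `lr` search range are all hypothetical names and values I made up for illustration, and the folds here are not stratified by class. Only the Optuna calls (`create_study`, `suggest_float`, `optimize`) are the real library API.

```python
import random
from statistics import mean
from typing import Callable, Dict, List, Tuple

Fold = Tuple[List[int], List[int]]  # (train indices, test indices)
# Hypothetical callback: trains on the train indices, returns F1 on the test indices.
Scorer = Callable[[Dict[str, float], List[int], List[int]], float]


def make_folds(n_samples: int, n_folds: int = 5, seed: int = 0) -> List[Fold]:
    """Shuffle indices once, then give every fold a disjoint test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = []
    for k in range(n_folds):
        test = indices[k::n_folds]  # every n_folds-th sample, offset by k
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        folds.append((train, test))
    return folds


def mean_fold_f1(params: Dict[str, float], folds: List[Fold],
                 train_and_score: Scorer) -> float:
    """Average the per-fold test F1 for one set of hyperparameters."""
    return mean(train_and_score(params, train, test) for train, test in folds)


def run_search(folds: List[Fold], train_and_score: Scorer, n_trials: int = 20):
    """Maximize the mean fold F1 with Optuna (imported lazily, must be installed)."""
    import optuna

    def objective(trial):
        params = {
            # Assumed search space; the real one lives in the project config.
            "lr": trial.suggest_float("lr", 1e-5, 1e-2, log=True),
            "epochs": 10,  # fixed at 10 epochs per fold, as in the note
        }
        return mean_fold_f1(params, folds, train_and_score)

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials)
    return study
```

With a real `train_and_score`, `run_search(make_folds(n_samples), train_and_score).best_params` would give the best hyperparameters found. Note that because the test splits drive the tuning, the averaged F1 is optimistic as a final performance estimate.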
I am not sure this is the best approach, but the goal would then be to set up a proper train / val / test split and use the best hyperparameters from the search for the final model. We can then do the same thing for the cancer stage subtypes.
I have included my current trials, which can be viewed with the VS Code extension "Optuna Dashboard". The code expects the unzipped .db file to be under "outputs" in the base directory:
brca_cbioportal_pam50.zip