[ENH] Benchmarking framework and csv loader #114
base: main
Conversation
…on tests and bug fixing
Can you kindly summarize how the requests were addressed, and what the changes since the last review are?
The changes made were the ones requested, so: …
Great!
May I request that you add a short notebook with a small benchmarking experiment and some reasonably chosen dataset?
I think that is important for understanding your design, as well as for review.
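For illustration, a minimal cell that such a notebook could contain might look like the following sketch; it uses scikit-learn stand-ins (toy data, a simple classifier), since the PR's actual benchmarking API is not shown in this thread, so all names here are assumptions.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# stand-in data; the notebook would use a real aptamer-protein dataset
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# minimal benchmarking run: evaluate one model with a chosen metric
results = cross_validate(
    LogisticRegression(max_iter=1000), X, y, cv=5, scoring="accuracy"
)
print(results["test_score"].mean())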
Do you want to do this in a different PR, or in this one? There are pros and cons for either option. Doing it in this one might allow us to spot bugs which would otherwise need to be fixed in an additional PR.
I would rather do it in a separate issue (#164); I can add bug fixing as part of the notebook PR as well.
from pyaptamer.datasets._loaders import load_pfoa_structure


def test_pfoa_loader():
    ...
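For context, the thread below notes that this file is being removed because the loader is already covered by test_loaders; a minimal sketch of such a consolidated test might look as follows (the test name, the no-argument call, and the assertion are assumptions, not the actual contents of test_loaders).

from pyaptamer.datasets._loaders import load_pfoa_structure


def test_load_pfoa_structure():
    # hypothetical smoke check; the real assertions may differ
    structure = load_pfoa_structure()
    assert structure is not None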
Why are we deleting this file?
Added in the description of the PR.
Could you kindly make sure you write a good PR description in the first post?
Done.
We are making changes to AptaNetPipeline: is this required for benchmarking? Would this not interact with other PRs, e.g., #153?
I would recommend making this a separate PR.
While investigating #144, I noticed that the benchmark class performs cross-validation on the given dataset (the training data) to generate validation splits on which the model is then evaluated. However, the purpose of cross-validation is to maximize the use of training data during model selection, in order to select the best hyperparameters; here there is no model selection, though. Shouldn't the purpose of the benchmarking class be to train the model on the training data using the best hyperparameters previously identified, and then perform a final evaluation on held-out test data (i.e., the …)?
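For illustration, the select-then-evaluate workflow described above could look like the following minimal sketch; it uses scikit-learn stand-ins (the toy data, SVC, and the small grid are all assumptions), not the benchmark class from this PR.

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# cross-validation on the training data only, to select hyperparameters
search = GridSearchCV(SVC(), {"C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# final evaluation of the refit best model on the held-out test set
print(search.best_params_, search.score(X_test, y_test))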
If you read the PR description, the …
We can use …
What exactly is the bug?
@NennoMP, re-sampling or CV can be used both in benchmarking and in tuning; it is even possible to combine both, to benchmark a model including the tuning algorithm. For benchmarking, one has to be careful about error bars etc.; CV-based error bars are not reliable (due to sample correlation from the CV splits).
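For illustration, "benchmarking a model including the tuning algorithm" can be done with nested resampling; below is a minimal sketch with scikit-learn stand-ins (the toy data and grid are assumptions, and this is not pyaptamer's benchmark class).

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

# inner CV tunes the model; outer CV benchmarks "model + tuning" as a whole
inner = GridSearchCV(SVC(), {"C": [0.1, 1.0, 10.0]}, cv=3)
scores = cross_val_score(inner, X, y, cv=5)

# the fold scores are correlated, so a naive std underestimates variability
print(scores.mean(), scores.std())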
closes #141
This PR:
- makes AptaNetPipeline inherit from BaseObject, to prevent errors during benchmarking (see the sketch after this list)
- removes test_pfoa (…); the loader is already being tested in test_loaders
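For illustration, the inheritance change in the first item might look like the sketch below, assuming BaseObject is scikit-base's skbase.base.BaseObject (an assumption; the thread does not say which BaseObject is meant), and with a purely hypothetical lr parameter.

from skbase.base import BaseObject


class AptaNetPipeline(BaseObject):
    """Sketch only: inheriting from BaseObject provides get_params,
    set_params, and clone, which benchmarking utilities rely on."""

    def __init__(self, lr=0.01):  # lr is a hypothetical parameter
        self.lr = lr
        super().__init__()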