Conversation

@satvshr (Collaborator) commented Aug 25, 2025

closes #141

This PR:

  • Fixes a small bug in AptaNetPipeline
  • Makes AptaNetPipeline inherit from BaseObject to prevent errors during benchmarking
  • Adds a CSV loader
  • Removes an unnecessary test (test_pfoa), since the loader is already covered by test_loaders
  • Adds the benchmarking framework

@satvshr requested a review from fkiraly, September 21, 2025 13:28
@fkiraly (Contributor) commented Sep 25, 2025

Can you kindly summarize how the requests were addressed and what the changes since the last review are?

@satvshr (Collaborator, Author) commented Sep 26, 2025

The changes made were the ones requested:

  1. The class now takes only three arguments: X, y, and cv.
  2. The CSV loader now returns DataFrames instead of NumPy arrays and Bunches.
  3. Added an example to show what output the run method yields (see the sketch after this list).
  4. Added tests.
  5. Removed task_check and metaclass.
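
For orientation, here is a minimal sketch of the interface this summary describes. The class name BenchmarkSketch, the estimators argument of run, and the result columns are illustrative assumptions, not the PR's actual code; only the X/y/cv constructor and the run method are taken from the summary above.

    import numpy as np
    import pandas as pd
    from sklearn.base import clone
    from sklearn.dummy import DummyClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import StratifiedKFold


    class BenchmarkSketch:
        """Illustrative stand-in for the PR's benchmark class."""

        def __init__(self, X, y, cv):
            # the three arguments mentioned in point 1 above
            self.X, self.y, self.cv = X, y, cv

        def run(self, estimators):
            # cross-validate each estimator and collect per-fold scores
            rows = []
            for name, est in estimators.items():
                for fold, (train, test) in enumerate(self.cv.split(self.X, self.y)):
                    model = clone(est).fit(self.X[train], self.y[train])
                    score = accuracy_score(self.y[test], model.predict(self.X[test]))
                    rows.append({"estimator": name, "fold": fold, "accuracy": score})
            return pd.DataFrame(rows)


    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(60, 4)), rng.integers(0, 2, size=60)
    bench = BenchmarkSketch(X, y, cv=StratifiedKFold(n_splits=3))
    print(bench.run({"dummy": DummyClassifier(strategy="most_frequent")}))

Returning per-fold scores as a plain DataFrame keeps aggregation and plotting straightforward, which matters for the notebook requested below.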

@fkiraly (Contributor) left a comment

Great!

May I request that you add a short notebook with a small benchmarking experiment and a reasonably chosen dataset?
I think that is important both for understanding your design and for review.

@fkiraly (Contributor) left a comment

Do you want to do this in a different PR or in this one? There are pros and cons to either option. Doing it in this one might allow us to spot bugs which would otherwise need to be fixed in an additional PR.

@satvshr (Collaborator, Author) commented Sep 29, 2025

Doing it in this one might allow us to spot bugs which would otherwise need to be fixed in an additional PR.

I would rather do it in a separate issue (#164); I can include the bug fixes as part of the notebook PR as well.

@satvshr requested a review from fkiraly, September 29, 2025 20:59
@satvshr changed the title from "[ENH] Benchmarking framework" to "[ENH] Benchmarking framework and csv loader", Sep 30, 2025
Review thread on the deleted test file:

from pyaptamer.datasets._loaders import load_pfoa_structure


def test_pfoa_loader():
Contributor: why are we deleting this file?

Collaborator (Author): Added in the PR description.

@fkiraly (Contributor) commented Sep 30, 2025

could you kindly make sure you write a good PR description in the first post?

@satvshr requested a review from fkiraly, September 30, 2025 10:13
@satvshr (Collaborator, Author) commented Sep 30, 2025

could you kindly make sure you write a good PR description in the first post?

Done.

@fkiraly (Contributor) left a comment

We are making changes to AptaNetPipeline - is this required for benchmarking? Would this not interact with other PRs, e.g., #153?

I would recommend making this a separate PR.

@NennoMP (Collaborator) commented Oct 2, 2025

While investigating #144, I noticed that the benchmark class performs cross-validation on the given dataset (the training data) to generate validation splits on which the model is then evaluated. However, the purpose of cross-validation is to maximize the use of training data during model selection, i.e., to choose the best hyperparameters. Here there is no model selection, though.

Shouldn't the purpose of the benchmarking class be to train the model on training data using the best hyperparameters previously identified, and then perform a final evaluation on held-out test data (i.e., the test_li2014.csv file from AptaTrans)?

@satvshr (Collaborator, Author) commented Oct 6, 2025

is this required for benchmarking?

If you read the PR description, the BaseObject part is required for benchmarking; the bug, on the other hand, is small and should be patched, so I thought I'd put it here too.

@satvshr (Collaborator, Author) commented Oct 6, 2025

Shouldn't the purpose of the benchmarking class be to train the model on training data using the best hyperparameters previously identified, and then perform a final evaluation on held-out test data (i.e., the test_li2014.csv file from AptaTrans)?

We can use StratifiedSplit if we want a train-test split for evaluation, and "generic" CV to ensure that a particular split is not the reason one model performs better than another; evaluating over n iterations also gives a more accurate result.
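
For illustration, both modes can be expressed with standard scikit-learn splitters. This is a sketch, not code from this PR, and it assumes StratifiedShuffleSplit is the stratified splitter meant above.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import (
        RepeatedStratifiedKFold,
        StratifiedShuffleSplit,
        cross_val_score,
    )

    X, y = make_classification(n_samples=200, random_state=0)
    clf = LogisticRegression(max_iter=1000)

    # held-out evaluation: a single stratified train/test split
    holdout = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=0)

    # "generic" CV: repeated stratified k-fold, so no single lucky split
    # decides which model looks better; scores are averaged over repeats
    repeated = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)

    for name, cv in [("holdout", holdout), ("repeated cv", repeated)]:
        scores = cross_val_score(clf, X, y, cv=cv)
        print(name, round(scores.mean(), 3))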

@satvshr requested a review from fkiraly, October 8, 2025 14:01
@fkiraly (Contributor) commented Oct 8, 2025

is this required for benchmarking?

If you read the PR description, the BaseObject part is required for benchmarking; the bug, on the other hand, is small and should be patched, so I thought I'd put it here too.

What exactly is the bug?

@fkiraly (Contributor) commented Oct 8, 2025

While investigating #144, I noticed that the benchmark class performs cross-validation on the given dataset (the training data) to generate validation splits on which the model is then evaluated. However, the purpose of cross-validation is to maximize the use of training data during model selection, i.e., to choose the best hyperparameters. Here there is no model selection, though.

Shouldn't the purpose of the benchmarking class be to train the model on training data using the best hyperparameters previously identified, and then perform a final evaluation on held-out test data (i.e., the test_li2014.csv file from AptaTrans)?

@NennoMP, re-sampling or CV can be used both in benchmarking and in tuning - it is even possible to combine the two and benchmark a model including its tuning algorithm. For benchmarking, one has to be careful about error bars etc.; CV-based error bars are not reliable (due to sample correlation across the CV splits).
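
A standard illustration of the combined case (again a sketch, not code from this PR): nested cross-validation, with tuning in the inner loop and benchmarking of "model including tuning" in the outer loop.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, random_state=0)

    # inner CV: tuning, i.e., model selection over C
    tuned = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=KFold(n_splits=3))

    # outer CV: benchmarking the SVC *including* its tuning algorithm
    scores = cross_val_score(tuned, X, y, cv=KFold(n_splits=5))

    # caveat from the comment above: the outer fold scores are correlated,
    # so their naive standard deviation is not a reliable error bar
    print(scores.mean(), scores.std())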

@satvshr (Collaborator, Author) commented Oct 9, 2025

What exactly is the bug?

FunctionTransformer takes a dictionary as input to kw_args; the test was only passing because I was not giving any input to kw_args. Once I did, the tests started failing.
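
For reference, the scikit-learn behaviour being described; the scale function here is a made-up stand-in for the pipeline's actual func.

    import numpy as np
    from sklearn.preprocessing import FunctionTransformer

    def scale(X, factor=1.0):
        return X * factor

    # kw_args must be a dict; it is forwarded to func on every transform
    # call, i.e., this computes scale(X, factor=2.0). With kw_args left at
    # its None default, mistakes in how func consumes the extra arguments
    # go unnoticed - exactly how the bug stayed hidden.
    ft = FunctionTransformer(scale, kw_args={"factor": 2.0})
    print(ft.transform(np.array([[1.0, 2.0]])))  # [[2. 4.]]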

Labels: enhancement (New feature or request)
Merging this pull request may close: [ENH] Benchmarking framework
3 participants