satvshr
Collaborator

@satvshr satvshr commented Jul 23, 2025

closes #59.

This PR moves the current PSeAAC algorithm into the aptanet directory and renames the old class to AptaNetPSeAAC (since it is aptanet specific).
This PR also makes the PSeAAC algorithm (in root) more flexible in three ways:

  1. You can choose how many physicochemical properties to use; it is no longer limited to 21 (enabled via prop_indices).
  2. You can choose custom groups, which are no longer limited to size 3 (via group_props, which groups based on the order of elements in prop_indices).
  3. You can define your own groups explicitly, instead of being limited to automatically constructed groups from group_props (via custom_groups).
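
As a rough illustration of how the three options could interact, here is a minimal sketch. It is not the code in this PR: the helper name `build_groups`, the defaults (21 properties, groups of 3), the precedence of custom_groups over group_props, and the error on uneven splits are all assumptions made for the example.

```python
from typing import Optional, Sequence


def build_groups(
    prop_indices: Optional[Sequence[int]] = None,
    group_props: int = 3,
    custom_groups: Optional[Sequence[Sequence[int]]] = None,
) -> list:
    """Hypothetical helper: resolve property groups from the three options."""
    # Assumption: explicitly defined groups take precedence over automatic ones.
    if custom_groups is not None:
        return [list(group) for group in custom_groups]

    # Assumption: the default is the 21 predefined properties, indexed 0..20.
    indices = list(range(21)) if prop_indices is None else list(prop_indices)

    # Assumption: the selected properties must split evenly into groups.
    if group_props <= 0 or len(indices) % group_props != 0:
        raise ValueError(
            f"cannot split {len(indices)} properties into groups of {group_props}"
        )

    # Groups follow the order of elements in prop_indices.
    return [
        indices[start : start + group_props]
        for start in range(0, len(indices), group_props)
    ]


default_groups = build_groups()                          # 7 groups of 3
pair_groups = build_groups([0, 2, 4, 6], group_props=2)  # [[0, 2], [4, 6]]
explicit_groups = build_groups(custom_groups=[[0, 1], [5]])
```

With no arguments the sketch reproduces the 21-property, groups-of-3 layout described above; the other two calls show the custom group size and the explicit grouping.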

@satvshr satvshr requested a review from fkiraly September 10, 2025 19:59
@satvshr satvshr marked this pull request as ready for review September 10, 2025 20:09
@satvshr
Collaborator Author

satvshr commented Sep 12, 2025

I think I will add a parameter to aa_props to be able to take a set of property vectors instead of working with the 21 defined ones. Moving this to "in progress" until then.
Edit: on the other hand, some feedback on whether or not I should make that change would be appreciated.

Contributor

@fkiraly fkiraly left a comment

Let's please split this into two classes: "vanilla" and generalized.

See #60 (review)

@satvshr
Collaborator Author

satvshr commented Sep 29, 2025

Let's please split this into two classes: "vanilla" and generalized.

The generalized version has all the defaults of the "vanilla" version, so a default call to either class produces the same results, and the old tests still pass without any change to the tests themselves. If the values being used had changed, I would split it. If you still want me to split it (I don't get why, if the default class calls match), what should the defaults for the generalized class be?

@fkiraly
Contributor

fkiraly commented Sep 30, 2025

I don't get why, if the default class calls match

I think a user will find the "generalized" class significantly more confusing to use than the current class. This is definitely true of the docstrings in their current state, which do not explain very well what the new variables would or could be, and do not give any examples.
I am guessing that even with perfect docstrings, the "generalized" class will be more confusing to the user; that may also be due to the chosen parametrization.

Also, it is easier to iterate on this class in review if it is a separate one, and we are less likely to reintroduce bugs in the existing class that is used elsewhere.

what should the defaults for the generalized class be?

The same, I would say.

@satvshr satvshr requested a review from fkiraly September 30, 2025 12:02
Contributor

@fkiraly fkiraly left a comment

Thanks. Can you kindly put the generalized and the specific algorithm both in the pseaac folder?

@satvshr
Collaborator Author

satvshr commented Oct 2, 2025

Thanks. Can you kindly put the generalized and the specific algorithm both in the pseaac folder?

Done.

@satvshr satvshr requested a review from fkiraly October 2, 2025 10:54
Contributor

@fkiraly fkiraly left a comment

I am not sure what you did, but now you seem to have two identical classes, one in pseaac, and one in pseaac.aptanet. The old class has been deleted.

Can you kindly do the following:

  • restore the original class exactly as it was before this PR.
  • add the new class to a module called _features_general, in pseaac.
  • put tests in the same folder, aptanet.tests.

@satvshr
Collaborator Author

satvshr commented Oct 4, 2025

I am not sure what you did, but now you seem to have two identical classes, one in pseaac, and one in pseaac.aptanet

I do not know why this happened either; I will look into it.

  • add the new class to a module called _features_general, in pseaac.

I would like to rename the old module to _features_aptanet then. I would also like to point out that if I do this, the properties which were used for aptanet will then also be used for the generalized class. And if someone wants to expand the available property groups for generalized pseaac, aptanet pseaac will end up using the newly added ones too.
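
To make the coupling concern concrete, here is a toy sketch. It is not the actual module: the table `SHARED_PROPS`, the placeholder property names and values, and the assumption that a class uses every property in the table are all hypothetical.

```python
# Hypothetical shared property table; names and values are placeholders,
# not the real physicochemical properties.
SHARED_PROPS = {
    "prop_a": {"A": 0.1, "C": 0.2},
    "prop_b": {"A": 0.3, "C": 0.4},
}


def n_props_used(table):
    """Assumed behaviour: a class uses every property found in the table."""
    return len(table)


# What the aptanet variant would see before anyone touches the table.
before = n_props_used(SHARED_PROPS)

# Someone extends the table for the benefit of the generalized class.
SHARED_PROPS["prop_c"] = {"A": 0.5, "C": 0.6}

# The aptanet variant reads the same table, so it now picks up the new entry too.
after = n_props_used(SHARED_PROPS)

print(before, after)
```

Under these assumptions the aptanet variant goes from two properties to three without any change to its own code, which is the side effect described above.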

  • put tests in the same folder, aptanet.tests.

You mean pseaac.tests?

@fkiraly
Contributor

fkiraly commented Oct 4, 2025

You mean pseaac.tests?

Yes, that is what I meant.

I would like to rename the old module to _features_aptanet then

Can you do the renaming in another PR? Just to avoid any accidents and keep the files tracked.

@satvshr
Collaborator Author

satvshr commented Oct 4, 2025

I would like to rename the old module to _features_aptanet then, I would also like to point out if I do this the properties which were used for aptanet will also then be used for generalized. And if someone wants to expand on the available property groups for generalized pseaac, aptanet pseaac will end up using the newly added ones too.

@fkiraly are you ok with this? I think we should have the generalized form and aptanet form be in separate directories, so that their feature sets can be different.

@fkiraly
Contributor

fkiraly commented Oct 6, 2025

@fkiraly are you ok with this? I think we should have the generalized form and aptanet form be in separate directories, so that their feature sets can be different.

Not sure if we agree on the final state or what to do in this PR.

Regarding the end state:

  • you say "separate directories", I think the algorithm should be in the same directory (pseaac)
  • and in separate files, _pseaac_aptanet and _pseaac_general

In this PR specifically, I would recommend not renaming any existing files, to avoid merge conflicts later on. Renaming _pseaac to _pseaac I would do in a separate PR.

@satvshr
Collaborator Author

satvshr commented Oct 6, 2025

  • and in separate files, _pseaac_aptanet and _pseaac_general

I disagree with that, as stated above: "the properties which were used for aptanet will also then be used for generalized. And if someone wants to expand on the available property groups for generalized pseaac, aptanet pseaac will end up using the newly added ones too." Instead, I'd rather have the aptanet version inside a sub-directory in pseaac.

@satvshr satvshr requested a review from fkiraly October 8, 2025 14:02
Contributor

@fkiraly fkiraly left a comment

Can you please do this as follows:

  • do not change the location or content of the current files
  • add the generalized algorithm in a separate file inside pseaac

Thanks

@fkiraly
Contributor

fkiraly commented Oct 8, 2025

I am not sure what you did, but now you seem to have two identical classes, one in pseaac, and one in pseaac.aptanet. The old class has been deleted.

Can you kindly do the following:

* restore the original class exactly as it was before this PR.

* add the new class to a module called `_features_general`, in `pseaac`.

* put tests in the same folder, `aptanet.tests`.

Like here

@fkiraly fkiraly merged commit b764bbc into main Oct 12, 2025
15 checks passed

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

[ENH] Generalized PSeAAC algorithm

2 participants