@cwperks
Member

cwperks commented Jul 16, 2025

Description

This PR resolves an issue seen in neural-search when trying to upgrade to JDK 24.

Action in Neural-Search where this was seen: https://github.com/opensearch-project/neural-search/actions/runs/16206585072/job/46116901486?pr=1436

Relevant change in the JDK where AccessController.doPrivileged calls were removed: openjdk/jdk24u@3d49665#diff-96f6e99fbc0d093a8b423d1a3fc86c0408768b2bb10747d926832d9286c6b3bb

See the removed CGroupUtil file
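
For readers unfamiliar with why removing the `AccessController.doPrivileged` calls matters here: under the classic access-control model, a permission check walks the call stack from the innermost frame outward, requires every frame's protection domain to hold the permission, and stops at the first frame that called `doPrivileged`. The sketch below is a simplified model of that rule only. It is not the JDK's code or the OpenSearch agent's code, and `PrivilegedWalk`, `Frame`, and `allowed` are hypothetical names chosen for illustration.

```java
import java.util.List;

public class PrivilegedWalk {
    // One stack frame in the model: a label, whether its protection domain
    // holds the requested permission, and whether it called doPrivileged.
    record Frame(String name, boolean domainGrants, boolean privileged) {}

    // Frames are ordered innermost first. Every inspected frame must grant the
    // permission; inspection stops after the first privileged frame.
    static boolean allowed(List<Frame> stack) {
        for (Frame f : stack) {
            if (!f.domainGrants()) {
                return false;
            }
            if (f.privileged()) {
                return true;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The plugin's domain has no file permissions, as in the trace below.
        Frame plugin = new Frame("opensearch-knn plugin", false, false);

        // With a doPrivileged call in the JDK cgroup code, the plugin frame is never inspected.
        List<Frame> withPrivileged = List.of(new Frame("JDK cgroup code", true, true), plugin);
        // Without it, the walk continues into the plugin frame.
        List<Frame> withoutPrivileged = List.of(new Frame("JDK cgroup code", true, false), plugin);

        System.out.println(allowed(withPrivileged));
        System.out.println(allowed(withoutPrivileged));
    }
}
```

In this model the first call is allowed and the second is denied, which matches the shape of the failure: once the privileged boundary is gone, the plugin's empty permission set is consulted for the cgroup file read.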

»  Caused by: java.lang.SecurityException: Denied OPEN (read) access to file: /sys/fs/cgroup/system.slice/hosted-compute-agent.service/memory.max, domain: ProtectionDomain  (file:/home/runner/work/neural-search/neural-search/qa/rolling-upgrade/build/testclusters/neuralSearchBwcCluster-rolling-0/distro/3.2.0-ARCHIVE/plugins/opensearch-knn/opensearch-knn-3.2.0.0-SNAPSHOT.jar <no signer certificates>)
»   java.net.URLClassLoader@2e4eda17
»   <no principals>
»   java.security.Permissions@3a2546d6 (
»  )
»  
»  
»  	at java.base/java.nio.channels.FileChannel.open(FileChannel.java:347) ~[?:?]
»  	at java.base/java.nio.file.Files.lines(Files.java:3738) ~[?:?]
»  	at java.base/java.nio.file.Files.lines(Files.java:3829) ~[?:?]
»  	at java.base/jdk.internal.platform.CgroupSubsystemController.getStringValue(CgroupSubsystemController.java:66) ~[?:?]
»  	at java.base/jdk.internal.platform.cgroupv2.CgroupV2Subsystem.getMemoryLimit(CgroupV2Subsystem.java:247) ~[?:?]
»  	at java.base/jdk.internal.platform.CgroupMetrics.getMemoryLimit(CgroupMetrics.java:129) ~[?:?]
»  	at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:246) ~[?:?]
»  	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) ~[?:?]
»  	... 38 more
» ERROR][o.o.b.OpenSearchUncaughtExceptionHandler] [neuralSearchBwcCluster-rolling-0] fatal error in thread [main], exiting
»  java.util.ServiceConfigurationError: org.apache.lucene.codecs.KnnVectorsFormat: Provider org.opensearch.knn.index.codec.KNN990Codec.NativeEngines990KnnVectorsFormat could not be instantiated
»  	at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:552) ~[?:?]
»  	at java.base/java.util.ServiceLoader$ProviderImpl.newInstance(ServiceLoader.java:712) ~[?:?]
»  	at java.base/java.util.ServiceLoader$ProviderImpl.get(ServiceLoader.java:672) ~[?:?]
»  	at java.base/java.util.ServiceLoader$2.next(ServiceLoader.java:1256) ~[?:?]
»  	at org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:68) ~[lucene-core-10.2.2.jar:10.2.2 279eb7aaafe985e5d0552b7f2a10b63185a3f893 - 2025-06-17 09:30:59]
»  	at org.apache.lucene.codecs.KnnVectorsFormat.reloadKnnVectorsFormat(KnnVectorsFormat.java:82) ~[lucene-core-10.2.2.jar:10.2.2 279eb7aaafe985e5d0552b7f2a10b63185a3f893 - 2025-06-17 09:30:59]
»  	at org.opensearch.plugins.PluginsService.reloadLuceneSPI(PluginsService.java:842) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.plugins.PluginsService.loadBundle(PluginsService.java:795) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.plugins.PluginsService.loadBundles(PluginsService.java:615) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.plugins.PluginsService.<init>(PluginsService.java:229) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.node.Node.<init>(Node.java:540) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.node.Node.<init>(Node.java:468) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:249) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.bootstrap.Bootstrap.setup(Bootstrap.java:249) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.bootstrap.Bootstrap.init(Bootstrap.java:411) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.bootstrap.OpenSearch.init(OpenSearch.java:168) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.bootstrap.OpenSearch.execute(OpenSearch.java:159) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.common.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:110) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.cli.Command.mainWithoutErrorHandling(Command.java:138) ~[opensearch-cli-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.cli.Command.main(Command.java:101) ~[opensearch-cli-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:125) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  	at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:91) ~[opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
»  Caused by: java.lang.ExceptionInInitializerError
»  	at org.opensearch.knn.index.codec.KNN990Codec.NativeEngines990KnnVectorsFormat.<init>(NativeEngines990KnnVectorsFormat.java:50) ~[?:?]
»  	at org.opensearch.knn.index.codec.KNN990Codec.NativeEngines990KnnVectorsFormat.<init>(NativeEngines990KnnVectorsFormat.java:42) ~[?:?]
»  	at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) ~[?:?]
»  	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499) ~[?:?]
»  	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:483) ~[?:?]
»  	at java.base/java.util.ServiceLoader$ProviderImpl.newInstance(ServiceLoader.java:707) ~[?:?]
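
The innermost frames show the denied access coming from `Files.lines`, called by `CgroupSubsystemController.getStringValue`. A minimal sketch of that kind of read follows, assuming a cgroup v2 file whose first line holds the value. It is not the JDK's implementation; `CgroupRead` and `firstLine` are hypothetical names, and the default path is simply the one from the exception message.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.stream.Stream;

public class CgroupRead {
    // Returns the first line of the file, or empty if the file is missing,
    // unreadable, empty, or the read is denied.
    static Optional<String> firstLine(Path file) {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.findFirst();
        } catch (IOException | UncheckedIOException | SecurityException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Path copied from the SecurityException above; pass another path as the first argument.
        String path = args.length > 0
            ? args[0]
            : "/sys/fs/cgroup/system.slice/hosted-compute-agent.service/memory.max";
        System.out.println(firstLine(Path.of(path)).orElse("unavailable"));
    }
}
```

On a machine without that cgroup path the program prints `unavailable`. In the failing run, the equivalent read was rejected because the calling plugin's protection domain had no file permissions.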

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@cwperks
Member Author

cwperks commented Jul 16, 2025

FYI @reta, there are more issues arising around JDK 24, not only socket access on Windows.

@github-actions
Contributor

✅ Gradle check result for db4186b: SUCCESS

@codecov

codecov bot commented Jul 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.83%. Comparing base (548e467) to head (00179f2).
Report is 2 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #18771      +/-   ##
============================================
- Coverage     72.90%   72.83%   -0.07%     
+ Complexity    68587    68535      -52     
============================================
  Files          5566     5566              
  Lines        314701   314701              
  Branches      45653    45653              
============================================
- Hits         229434   229221     -213     
- Misses        66655    66905     +250     
+ Partials      18612    18575      -37     


@cwperks
Member Author

cwperks commented Jul 16, 2025

Actually, I think it makes sense to add this in the general grant portion of security.policy.

If the JDK allows it generally, then so should our policy, and then it doesn't require an update to the agent.
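
As an illustration of what a general grant entry would need to cover, the sketch below uses the standard `java.io.FilePermission.implies` semantics to compare a few candidate path patterns against the denied path from the description. The candidate patterns are examples only, not the entries merged in this PR, and whether the OpenSearch agent evaluates policy entries in exactly this way is an assumption. `CgroupPolicyCheck` and `covers` are hypothetical names.

```java
import java.io.FilePermission;

public class CgroupPolicyCheck {
    // Path taken from the SecurityException in the PR description.
    static final String DENIED =
        "/sys/fs/cgroup/system.slice/hosted-compute-agent.service/memory.max";

    // True when a "read" permission on grantedPath would cover a read of target,
    // according to FilePermission.implies.
    static boolean covers(String grantedPath, String target) {
        return new FilePermission(grantedPath, "read")
            .implies(new FilePermission(target, "read"));
    }

    public static void main(String[] args) {
        // Exact file entry.
        System.out.println(covers(DENIED, DENIED));
        // Recursive entry: "-" matches all files under the directory, at any depth.
        System.out.println(covers("/sys/fs/cgroup/-", DENIED));
        // Non-recursive entry: "*" matches only direct children of the directory.
        System.out.println(covers("/sys/fs/cgroup/*", DENIED));
    }
}
```

The exact and recursive patterns cover the denied path; the `*` pattern does not, because the file sits two directories below `/sys/fs/cgroup`. In policy-file syntax such an entry takes the form `permission java.io.FilePermission "<path>", "read";` inside a `grant { ... };` block.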

@reta
Contributor

reta commented Jul 17, 2025

FYI @reta there are some more issues arising around JDK24, not only socket access for windows.

Thanks @cwperks, I believe the missing cgroups parts should be included in security.policy (https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/resources/org/opensearch/bootstrap/security.policy#L233).

@cwperks
Member Author

cwperks commented Jul 17, 2025

Pushing an update now. Sorry was afk for a couple of hours.

@cwperks cwperks changed the title from "Update java agent to ignore frames after jdk.internal.platform.CgroupSubsystemController.getStringValue" to "Add permission to read /sys/fs/cgroup/system.slice/hosted-compute-agent.service/memory.max in security.policy" Jul 17, 2025
@cwperks cwperks marked this pull request as ready for review July 17, 2025 01:44
@cwperks cwperks requested a review from a team as a code owner July 17, 2025 01:44
@cwperks
Member Author

cwperks commented Jul 17, 2025

@reta updated this PR.

Signed-off-by: Craig Perkins <[email protected]>
@github-actions
Contributor

❌ Gradle check result for 862b7a6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

❌ Gradle check result for 00179f2: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

✅ Gradle check result for 00179f2: SUCCESS

@cwperks cwperks merged commit 787a150 into opensearch-project:main Jul 17, 2025
30 of 32 checks passed
pranikum pushed a commit to pranikum/OpenSearch that referenced this pull request Jul 21, 2025
…nt.service/memory.max in security.policy (opensearch-project#18771)

* Ignore frames after jdk.internal.platform.CgroupSubsystemController.getStringValue

Signed-off-by: Craig Perkins <[email protected]>

* Add missing permission in security.policy

Signed-off-by: Craig Perkins <[email protected]>

* Add current as well

Signed-off-by: Craig Perkins <[email protected]>

---------

Signed-off-by: Craig Perkins <[email protected]>
rgsriram pushed a commit to rgsriram/OpenSearch that referenced this pull request Jul 22, 2025
…nt.service/memory.max in security.policy (opensearch-project#18771)

tandonks pushed a commit to tandonks/OpenSearch that referenced this pull request Aug 5, 2025
…nt.service/memory.max in security.policy (opensearch-project#18771)
