deepika-u
Contributor

Fixes #2316

As suggested, I tried fixing this in destroy(), but it looks like we need extra testing to ensure there are no shutdown regressions, considering the points below:

  • In shutdown, syncExec might block if the display is already disposed or unresponsive.
  • EventBroker cleanup still runs in the non-UI thread, which is fine. But two things now happen at different times: the part is disposed later, while the EventBroker is unsubscribed immediately.

Note: I have no reproducer to test this fix. Can this be tested with the existing unit test infrastructure?

Contributor

github-actions bot commented Aug 29, 2025

Test Results

 2 904 files  ±0   2 904 suites  ±0   1h 53m 3s ⏱️ - 5m 20s
 8 017 tests ±0   7 772 ✅ +1  245 💤 ±0  0 ❌  - 1 
23 585 runs  ±0  22 803 ✅ +1  782 💤 ±0  0 ❌  - 1 

Results for commit 8050325. ± Comparison against base commit 88774cd.

♻️ This comment has been updated with latest results.

@deepika-u deepika-u force-pushed the CompatibilityPart_2316 branch from 733cbf5 to 5bf0e5c Compare September 1, 2025 04:13
@eclipse-platform-bot
Contributor

eclipse-platform-bot commented Sep 1, 2025

This pull request changes some projects for the first time in this development cycle.
Therefore the following files need a version increment:

bundles/org.eclipse.ui.navigator.resources/META-INF/MANIFEST.MF

An additional commit containing all the necessary changes was pushed to the top of this PR's branch. To obtain these changes (for example if you want to push more changes) either fetch from your fork or apply the git patch.

Git patch
From dd1a36b314adcee608aff97fead6bc029ee868bb Mon Sep 17 00:00:00 2001
From: Eclipse Platform Bot <[email protected]>
Date: Thu, 4 Sep 2025 12:44:17 +0000
Subject: [PATCH] Version bump(s) for 4.38 stream


diff --git a/bundles/org.eclipse.ui.navigator.resources/META-INF/MANIFEST.MF b/bundles/org.eclipse.ui.navigator.resources/META-INF/MANIFEST.MF
index 30277bf444..484bfaa8e9 100644
--- a/bundles/org.eclipse.ui.navigator.resources/META-INF/MANIFEST.MF
+++ b/bundles/org.eclipse.ui.navigator.resources/META-INF/MANIFEST.MF
@@ -2,7 +2,7 @@ Manifest-Version: 1.0
 Bundle-ManifestVersion: 2
 Bundle-Name: %Plugin.name
 Bundle-SymbolicName: org.eclipse.ui.navigator.resources; singleton:=true
-Bundle-Version: 3.9.800.qualifier
+Bundle-Version: 3.9.900.qualifier
 Bundle-Activator: org.eclipse.ui.internal.navigator.resources.plugin.WorkbenchNavigatorPlugin
 Bundle-Vendor: %Plugin.providerName
 Bundle-Localization: plugin
-- 
2.51.0

Further information is available in Common Build Issues - Missing version increments.

@deepika-u deepika-u force-pushed the CompatibilityPart_2316 branch from bc431ae to 76b3d34 Compare September 1, 2025 10:16
@deepika-u
Contributor Author

@jukzi : can you take a look at this when you have some time please?

@merks
Contributor

merks commented Sep 1, 2025

Note that he is "missing in action" so don't expect anything from him.

if (!alreadyDisposed) {
invalidate();
Display display = Display.getDefault();
if (display != null && !display.isDisposed() && Display.getCurrent() != display) {
Contributor

Typically the isDisposed() check has to be repeated in the Display thread, because dispose could happen in the display thread in the meantime.

Contributor Author

Typically the isDisposed() check has to be repeated in the Display thread, because dispose could happen in the display thread in the meantime.

added another check as well.
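
The advice to repeat the check inside the Display thread can be illustrated without SWT. The following is a minimal sketch, assuming a hypothetical `Resource` class and a plain queue standing in for the UI event queue; none of these names come from the PR. It shows why a check made by the caller can be stale by the time the queued work actually runs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class RecheckSketch {
    // Hypothetical resource with a disposed flag; a stand-in, not an SWT class.
    static final class Resource {
        private boolean disposed;

        boolean isDisposed() {
            return disposed;
        }

        void dispose() {
            disposed = true;
        }
    }

    // Returns a log of what happened, in order.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Queue<Runnable> uiQueue = new ArrayDeque<>(); // stand-in for the UI event queue
        Resource resource = new Resource();

        // Caller side: the first check passes, so the work is queued.
        if (!resource.isDisposed()) {
            log.add("queued");
            uiQueue.add(() -> {
                // Second check, made at the moment the queued work actually runs.
                if (resource.isDisposed()) {
                    log.add("skipped");
                    return;
                }
                log.add("invalidated");
            });
        }

        // The resource is disposed before the queue is drained.
        resource.dispose();

        // Drain the queue, as the UI thread would do later.
        Runnable next;
        while ((next = uiQueue.poll()) != null) {
            next.run();
        }
        return log;
    }
}
```

Without the second check the queued work would touch an already disposed resource; with it, the work is skipped.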

@deepika-u deepika-u force-pushed the CompatibilityPart_2316 branch 2 times, most recently from 9128c12 to 467d4f5 Compare September 2, 2025 07:18
@deepika-u
Contributor Author

deepika-u commented Sep 2, 2025

@laeubi @jukzi : all the comments are incorporated, can you check now please. Thanks for your suggestions.

@laeubi
Contributor

laeubi commented Sep 2, 2025

@deepika-u I still think it is better fixed at the caller side than in this particular method, so in what cases can it really happen that it is called from non-ui thread?

@deepika-u
Contributor Author

deepika-u commented Sep 2, 2025

@deepika-u I still think it is better fixed at the caller side than in this particular method, so in what cases can it really happen that it is called from non-ui thread?

CompatibilityPart.destroy()
→ InjectorImpl.disposed()
→ EclipseContext.dispose()
→ OSGI/Equinox bundle stop thread
That path is not on the UI thread. It’s on the OSGi framework thread during bundle shutdown. So yes, it can and does happen that destroy() can be called from non-UI thread. That’s how we got the NPE in the first place.

In general

  • In normal user-driven part closure, destroy() will indeed be called from the UI thread.
  • But during application or bundle shutdown, destroy() can be invoked from OSGi/framework threads (non-UI). That’s exactly how the reported NPE occurred.
  • Since IntroPart.dispose() and other cleanup code touch SWT/JFace resources and UI components, they must be marshaled to the UI thread.

Therefore, the correct place to fix this is in CompatibilityPart.destroy(), which bridges the generic e4 lifecycle and the UI-thread-sensitive 3.x parts.

This way we can ensure

  • Lifecycle events are still respected.
  • invalidate() always runs in UI thread.
  • No raw Display usage, just e4-provided UISynchronize.
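
The marshaling idea described above can be sketched with plain JDK classes. This is only an illustration under stated assumptions: `UiThreadStandIn` is a hypothetical class in which a single-threaded executor plays the role of the UI thread, and its `syncExec` merely mimics the shape of a synchronous hand-off. It is not the e4 UISynchronize implementation and not the code in this PR.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a UI thread, built on a single-threaded executor.
class UiThreadStandIn {
    private final AtomicReference<Thread> uiThread = new AtomicReference<>();
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "ui-stand-in");
        t.setDaemon(true);
        uiThread.set(t);
        return t;
    });

    boolean isUiThread() {
        return Thread.currentThread() == uiThread.get();
    }

    // Runs the task on the stand-in UI thread and waits until it has finished.
    void syncExec(Runnable task) {
        if (isUiThread()) {
            task.run(); // already on the right thread, run inline
            return;
        }
        try {
            executor.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    void shutdown() {
        executor.shutdown();
    }
}

class MarshalSketch {
    // Calls syncExec from a non-UI thread and reports which thread ran the cleanup.
    static String destroyFromWorkerThread() {
        UiThreadStandIn ui = new UiThreadStandIn();
        AtomicReference<String> ranOn = new AtomicReference<>();
        Thread worker = new Thread(
                () -> ui.syncExec(() -> ranOn.set(Thread.currentThread().getName())),
                "framework-stand-in");
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        ui.shutdown();
        return ranOn.get();
    }
}
```

In this sketch the cleanup runs on the thread named "ui-stand-in" even though the call originates from the "framework-stand-in" thread, which is the effect the list above asks for.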

@laeubi
Contributor

laeubi commented Sep 2, 2025

But during application or bundle shutdown, destroy() can be invoked from OSGi/framework threads (non-UI). That’s exactly how the reported NPE occurred.

I think you should find out where exactly this enters the system and what is triggering that part. There should be one place where we can ensure it is called inside the UI thread.

So can you share a stacktrace of the problematic call so we can better decide what would be the best place here?

@deepika-u
Contributor Author

deepika-u commented Sep 3, 2025

@laeubi

So can you share a stacktrace of the problematic call so we can better decide what would be the best place here?

I tried running CTabItemTest myself and captured these logs.
I didn't find anything related to the error in the .log file, but I could see the same error in the Eclipse console. So I copied the console content to a file and am also sharing the .log file for reference.
console_output.txt
.log

Please have a look when you get some time and let me know your opinion.

@deepika-u deepika-u force-pushed the CompatibilityPart_2316 branch 2 times, most recently from b4d306b to cd07743 Compare September 4, 2025 12:29
@deepika-u
Contributor Author

deepika-u commented Sep 4, 2025

@laeubi
On further analysis, I felt EditActionGroup.dispose() also needs to be updated, so I am adding this change too to make the fix complete.
Now, with the EditActionGroup.dispose() changes in place along with CompatibilityPart.destroy(), I can no longer reproduce the originally reported NPE when running the same CTabItemTest. Attaching the console output ->
Console_output_4thSept.txt

Once you get some time, can you review the changes please.

But I see another problem here, less important but still a problem: "org.eclipse.swt.SWTException: Widget is disposed".
To address it I have a PR that adds safeguards without changing any functionality.
I will add the PR link once it is raised.

#3241 is created for this.

@deepika-u deepika-u force-pushed the CompatibilityPart_2316 branch 2 times, most recently from 0e23a26 to 671c7ae Compare September 4, 2025 12:39
@laeubi
Contributor

laeubi commented Sep 4, 2025

In the log I can only see errors about already disposed part / device but not any access from another thread?

@deepika-u
Contributor Author

In the log I can only see errors about already disposed part / device but not any access from another thread?

Yes, exactly. Now, with both of these file changes, I don't see the NPE anymore.

@laeubi
Contributor

laeubi commented Sep 4, 2025

In the log I can only see errors about already disposed part / device but not any access from another thread?

Yes, exactly. Now, with both of these file changes, I don't see the NPE anymore.

The point is that we should not fix side effects but the callers who call the method outside the Device thread, so the caller's stack trace would be required here to decide where best to fix it. Requiring every dispose method to handle this (and you have already discovered two places) does not really scale well.

@deepika-u
Contributor Author

@laeubi

The point is that we should not fix side effects but the callers who call the method outside the Device thread, so the caller's stack trace would be required here to decide where best to fix it. Requiring every dispose method to handle this (and you have already discovered two places) does not really scale well.

Maybe I missed your intent and didn't understand you clearly. But this is what I tried, as updated in #3232 (comment):

I have tried running /org.eclipse.e4.ui.tests.css.swt/src/org/eclipse/e4/ui/tests/css/swt/CTabItemTest.java locally to reproduce the NPE, as suggested by jukzi in #2316 (comment).

Please let me know if I should try anything else or capture any other logs.

@laeubi
Contributor

laeubi commented Sep 4, 2025

I have tried running /org.eclipse.e4.ui.tests.css.swt/src/org/eclipse/e4/ui/tests/css/swt/CTabItemTest.java locally to reproduce the NPE

And was this successful? Sorry to ask, but it's a bit unclear to me whether the problem currently happens, whether it only happens in the test, and how it relates to EditActionGroup.

@deepika-u
Contributor Author

And was this successful? Sorry to ask, but it's a bit unclear to me whether the problem currently happens, whether it only happens in the test, and how it relates to EditActionGroup.

From this comment -> #3232 (comment)
If you go through the full description there, you will see how I ended up adding the change in EditActionGroup. The respective log is also attached for reference.

Now, with the CompatibilityPart and EditActionGroup changes in place, when I run /org.eclipse.e4.ui.tests.css.swt/src/org/eclipse/e4/ui/tests/css/swt/CTabItemTest.java locally I no longer see the NPE reported in #2316.

But I do see errors related to "org.eclipse.swt.SWTException: Widget is disposed", which I am trying to address with PR #3241.

Never mind, thanks for asking again.

Observation: with both PRs' changes in place, when I run CTabItemTest locally I see neither error. It is clean.

@laeubi
Contributor

laeubi commented Sep 5, 2025

@deepika-u
Contributor Author

@laeubi
Thanks for merging #3241

is this PR actually needed?

Yes, this is definitely needed, because the two PRs address two different problems.
Today I updated my master to the latest code.
Now when I run CTabItemTest on master, I am still seeing the java.lang.RuntimeException caused by CompatibilityPart and EditActionGroup. Console output attached here ->
console_output_8thSept.txt

I then updated my branch to be in sync with master, and when I run CTabItemTest there I don't see any exceptions.
For reference, the console output is also attached ->
console_output_8thSept_onbranch_3232.txt

So this confirms that this PR is also needed, along with #3241, for a successful run of CTabItemTest. I hope that clarifies it. Thanks for asking.

@laeubi
Contributor

laeubi commented Sep 8, 2025

Now when I run CTabItemTest on master, I am still seeing the java.lang.RuntimeException caused by CompatibilityPart and EditActionGroup.

What I'm wondering about is the following:

  1. Do we see it anywhere when running Eclipse at all?
  2. Why does it not happen in the ibuild but only in the IDE?
  3. Should maybe the tests be simply marked to be running in the UI thread instead?

@deepika-u
Contributor Author

@laeubi
We don’t see this issue in day-to-day Eclipse usage - it only shows up in JUnit test runs because shutdown timing differs between the IDE and the ibuild environment. The change we introduced makes disposal more robust (guards against disposed widgets and device access), so it fixes the test failures without negative side effects. Marking the test as “UI-thread only” would only mask the problem; our fix ensures correctness in both runtime and test scenarios with no behavioral change.

@deepika-u deepika-u force-pushed the CompatibilityPart_2316 branch from 58f0da9 to 3b19584 Compare September 8, 2025 09:59
}
}
}
super.dispose();
Contributor

Why not call super.dispose() anymore?

Contributor Author

@laeubi
The problem is that if we still call "super.dispose();" inside dispose(), it runs immediately, before the clipboard is actually disposed, since disposal happens asynchronously later.
This creates a lifecycle ordering ambiguity:

  • The context is cleared too early.
  • The asynchronous disposal runs after super.dispose().

If that async runnable references anything indirectly tied to the context (logging, resources, actions), you can get NPEs or disposed-object access.

So this is why super.dispose() was dropped:

  • super.dispose() is trivial (just setContext(null)).
  • Dropping it avoids calling it before async disposal finishes, which prevents race conditions.
  • In practice, leaving the context uncleared is not a big resource leak compared to crashing the UI.

So it was sacrificed for safety.

But if we still want to be correct, the better solution is to call super.dispose() inside the async block. That way we don’t silently skip the context cleanup from ActionGroup.

We can do it this way:

@Override
public void dispose() {
    Display display = Display.getDefault();
    if (display != null && !display.isDisposed()) {
        display.asyncExec(() -> {
            if (clipboard != null && !clipboard.isDisposed()) {
                clipboard.dispose();
                clipboard = null;
            }
            EditActionGroup.super.dispose();
        });
    } else {
        // Display already gone, dispose synchronously as a fallback
        if (clipboard != null && !clipboard.isDisposed()) {
            try {
                clipboard.dispose();
                clipboard = null;
            } catch (SWTException e) {
                // Log or ignore safely
            }
        }
        super.dispose();
    }
}

@laeubi
Contributor

laeubi commented Sep 8, 2025

our fix ensures correctness in both runtime and test scenarios with no behavioral change.

Whether disposal happens during the call to dispose() or async / blocking in the UI running alongside could of course be seen as a behavior change. And if test and real environments differ, there is of course a chance of false positives (or that we did not capture problems). It might also have side effects, and it complicates the code.

e.g. in your current change I can see

Display display = Display.getDefault();
if (display != null && !display.isDisposed()) {

}

if you look at javadoc you will see

Returns the default display. One is created (making the thread that invokes this method its user-interface thread) if it did not already exist.

so your check for null will never succeed. If no display exists, one will be created here (and maybe never disposed). So it is always hard to guess whether adding such code is really "side-effect-free".
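
The difference between an accessor that lazily creates its object and one that only looks it up can be shown in a few lines. This is a minimal sketch, assuming a hypothetical `FakeDisplay` class: its `getDefault()` follows the behavior described in the quoted Javadoc, while `peek()` is an invented non-creating accessor used purely for contrast. Neither is the SWT class or API.

```java
class DisplayAccessSketch {
    // Hypothetical stand-in; this is not org.eclipse.swt.widgets.Display.
    static final class FakeDisplay {
        private static FakeDisplay instance;
        private static int createdCount;

        // Lazily creates the instance if none exists, so it never returns null.
        static synchronized FakeDisplay getDefault() {
            if (instance == null) {
                instance = new FakeDisplay();
                createdCount++;
            }
            return instance;
        }

        // Invented non-creating accessor: returns null when nothing exists yet.
        static synchronized FakeDisplay peek() {
            return instance;
        }

        static synchronized int createdCount() {
            return createdCount;
        }

        static synchronized void reset() {
            instance = null;
            createdCount = 0;
        }
    }

    // Reports: was peek() null, was getDefault() null, how many instances were created.
    static String demo() {
        FakeDisplay.reset();
        boolean peekIsNull = FakeDisplay.peek() == null;
        boolean defaultIsNull = FakeDisplay.getDefault() == null;
        return peekIsNull + "," + defaultIsNull + "," + FakeDisplay.createdCount();
    }
}
```

A null check after the lazy accessor is dead code, and the call itself has the side effect of creating an instance that somebody then has to dispose.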

clipboard = null;
Display display = Display.getDefault();
if (display != null && !display.isDisposed()) {
display.asyncExec(() -> {
Contributor

In the other place syncExec is used, why is this different here?

Contributor Author

EditActionGroup.dispose() uses asyncExec() because safety during shutdown is more important than ordering.

  • Disposing the Clipboard (and similar UI resources) must happen on the UI thread.
  • But dispose() might be called from a non-UI thread during shutdown.
  • If we used syncExec() here and called it from a non-UI thread while the display is already disposing, it could deadlock (the UI thread might already be blocked, tearing down widgets).
  • asyncExec() safely queues the disposal for "later" on the UI thread, avoiding blocking and deadlock risk.
  • If the display is already gone, we just try/catch to clean up as best we can.
    So in summary: asyncExec() => fire-and-forget cleanup, safe during shutdown.

CompatibilityPart.destroy() uses syncExec() because ordering and completion are crucial for consistent teardown.

  • destroy() is part of the Eclipse 4 lifecycle (@PreDestroy), which means the framework expects the part to be fully disposed when this method returns.
  • If you used asyncExec() here, invalidate() would run later, and there’s a risk that other teardown code (unsubscribes, model cleanup, etc.) would happen before invalidate() ran. That could cause inconsistencies or NPEs.
  • By using syncExec(), you ensure that invalidate() actually completes before the method continues and the lifecycle is maintained correctly.
    So in summary: syncExec() => enforce order, synchronous cleanup before continuing.
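
The ordering difference between the two styles can be sketched with a plain executor. This is only an illustration under stated assumptions: a single-threaded `ExecutorService` stands in for the UI thread, the class and method names are hypothetical, and a latch forces the "queued work runs later" interleaving so that the result is deterministic. It does not model the deadlock risk mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecOrderingSketch {
    // Synchronous style: the caller waits, so the UI work is logged first.
    static List<String> syncStyle() {
        ExecutorService ui = Executors.newSingleThreadExecutor();
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        try {
            Future<?> work = ui.submit(() -> {
                log.add("ui work");
            });
            waitFor(work);
            log.add("caller continues");
        } finally {
            ui.shutdown();
        }
        return new ArrayList<>(log);
    }

    // Asynchronous style: the caller moves on before the queued work runs.
    static List<String> asyncStyle() {
        ExecutorService ui = Executors.newSingleThreadExecutor();
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch callerMovedOn = new CountDownLatch(1);
        try {
            Future<?> work = ui.submit(() -> {
                callerMovedOn.await(); // hold the queued work until the caller has continued
                log.add("ui work");
                return null;
            });
            log.add("caller continues");
            callerMovedOn.countDown();
            waitFor(work); // only so the sketch can return a complete log
        } finally {
            ui.shutdown();
        }
        return new ArrayList<>(log);
    }

    private static void waitFor(Future<?> future) {
        try {
            future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

In the synchronous variant any teardown that follows can rely on the UI work being finished; in the asynchronous variant it cannot.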

@deepika-u deepika-u force-pushed the CompatibilityPart_2316 branch from 3b19584 to 8050325 Compare September 16, 2025 07:04
@deepika-u
Contributor Author

@laeubi

Display display = Display.getDefault();
if (display != null && !display.isDisposed()) {

}

if you look at javadoc you will see

> Returns the default display. **One is created (making the thread that invokes this method its user-interface thread) if it did not already exist**.

Thanks for the feedback! You're absolutely right about the side effects of Display.getDefault(). I've updated the code to use Display.getCurrent() instead, which avoids creating a new display and ensures we're only disposing UI resources on the UI thread. This should reduce complexity and avoid unintended behavior in both test and runtime environments.

When you get some time, could you take a look at it now please?

if (clipboard != null) {
clipboard.dispose();
clipboard = null;
Display display = Display.getCurrent();
Contributor

Why not use shell.getDisplay()? It seems obvious here.

Contributor Author

Yeah, shell.getDisplay() is also a good choice. But I need to ensure that the Shell reference is not null and not disposed before calling shell.getDisplay(). Can I go ahead this way?

Contributor

Thinking more, can you use the clipboard itself? If it is already disposed, what is the point of doing more work here?

Contributor Author

@laeubi
Sorry, I was not able to get your intent. Did you mean to use Display.getCurrent() or shell.getDisplay()?

Contributor Author

If we don't have this PR in place, running CTabItemTest locally again ends up in the original problem, reshaped (post PR 3241),
as shown in the console output below ->
Console_without_pr3232.txt

Contributor

@laeubi laeubi Sep 18, 2025

@deepika-u where do you get this, is it Windows?
I think it makes sense to go through each exception step by step. The first I see in your log is:

org.eclipse.swt.SWTException: Device is disposed
	at org.eclipse.swt.SWT.error(SWT.java:4946)
	at org.eclipse.swt.SWT.error(SWT.java:4861)
	at org.eclipse.swt.SWT.error(SWT.java:4832)
	at org.eclipse.swt.widgets.Display.error(Display.java:1332)
	at org.eclipse.swt.widgets.Display.getThread(Display.java:2696)
	at org.eclipse.swt.dnd.Clipboard.dispose(Clipboard.java:219)
	at org.eclipse.ui.internal.navigator.resources.actions.EditActionGroup.dispose(EditActionGroup.java:59)

This looks like a bug in SWT (Windows), because calling dispose on an already disposed component should be a no-op (and if the device is disposed, the clipboard is gone anyway...).

Looking into other widget implementations, they do

if (display.thread != Thread.currentThread ()) error (SWT.ERROR_THREAD_INVALID_ACCESS);

so calling getThread instead of accessing the field directly seems to be the culprit here.
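
The expectation that a repeated dispose is a no-op can be sketched as follows. This is a minimal illustration, assuming a hypothetical `Handle` class; it is not the SWT Clipboard implementation and says nothing about how SWT should be fixed.

```java
class IdempotentDisposeSketch {
    // Hypothetical resource handle; a stand-in, not an SWT class.
    static final class Handle {
        private boolean disposed;
        private int releaseCount;

        // Releases the underlying resource once; later calls return immediately.
        void dispose() {
            if (disposed) {
                return; // already disposed, nothing left to do and nothing to check
            }
            disposed = true;
            releaseCount++;
        }

        int releaseCount() {
            return releaseCount;
        }
    }

    // Disposes the same handle twice and reports how often the resource was released.
    static int demo() {
        Handle handle = new Handle();
        handle.dispose();
        handle.dispose();
        return handle.releaseCount();
    }
}
```

The early return comes before any other work, so a second call cannot fail on state that the first call already tore down.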

// Non-UI thread or display disposed
if (clipboard != null && !clipboard.isDisposed()) {
try {
clipboard.dispose();
Contributor

Can the clipboard really be disposed outside the UI thread?!?

Contributor Author

Ideally it is not supposed to be; doing so might result in an SWTException, in resource leaks if disposal fails silently, or in false positives.
In my opinion, if we are not on the UI thread and can't guarantee access to it, it's safer to skip disposal or log a warning.

Do you mean I can skip this line, clipboard.dispose();, at line 70?

@merks
Contributor

merks commented Sep 18, 2025

I hate to be a party pooper here, but I think a call to dispose on a UI part on a non-UI thread is wrong and the caller should not do that. And this doesn't happen in a normal IDE when I exit. But we're changing code here that will affect the normal behavior. Normally this happens before org.eclipse.e4.core.internal.contexts.osgi.EclipseContextOSGi.dispose() is called:

image

I really don't think we should add defensive code in the UI to deal with a situation that arises only in a test.

I highly recommend finding a way to fix this by modifying the test to dispose the UI (close the workbench) before this destroy handling kicks in.

Development

Successfully merging this pull request may close these issues.

CompatibilityPart/IntroPart: tries to dispose in not-SWT thread

5 participants