@gregw gregw commented Oct 21, 2025

Fix #13470 by calling completeStream only once even when there is a failure in the channel callback.

gregw commented Oct 21, 2025

The issue is fundamentally that completeStream is being called twice.

For this to happen, we need a failure to be detected in the ChannelCallback.succeeded() method so that the following code is run:

if (failure != null)
{
    httpChannelState._callbackFailure = failure;
    if (!stream.isCommitted())
        errorResponse = new ErrorResponse(request);
    else
        completeStream = true;
}

This means that completeStream will be called even though the other "legs of the 3-legged stool" are not complete. Specifically, we may still be inside the call to HandlerInvoker.run(), as in this stack trace for thread 258:

2025-10-20 08:14:42,457 INFO  [WebServerImpl-258] trace.jetty.session.complete: complete() called on session [ManagedSession@4df349fb{id=MYSECRETSESSIONID,x=MYSECRETSESSIONID.node0,req=6,res=true}]
java.lang.Exception: complete stack
	...
	at org.eclipse.jetty.session.AbstractSessionManager.complete(AbstractSessionManager.java)
	at org.eclipse.jetty.session.AbstractSessionManager$SessionStreamWrapper.doComplete(AbstractSessionManager.java:1509)
	at org.eclipse.jetty.server.handler.ContextHandler$ScopedContext.run(ContextHandler.java:1518)
	at org.eclipse.jetty.session.AbstractSessionManager$SessionStreamWrapper.failed(AbstractSessionManager.java:1479)
	at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.completeStream(HttpChannelState.java:788)
	at org.eclipse.jetty.server.internal.HttpChannelState$ChannelCallback.succeeded(HttpChannelState.java:1591)
	at org.eclipse.jetty.server.handler.gzip.GzipResponseAndCallback.succeeded(GzipResponseAndCallback.java:95)
	at org.eclipse.jetty.ee10.servlet.ServletChannel.onCompleted(ServletChannel.java:765)
	at org.eclipse.jetty.ee10.servlet.ServletChannel.handle(ServletChannel.java:429)
	at org.eclipse.jetty.ee10.servlet.ServletHandler.handle(ServletHandler.java:470)
	at org.eclipse.jetty.ee10.servlet.SessionHandler.handle(SessionHandler.java:717)
	at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1071)
	at org.eclipse.jetty.server.Handler$Wrapper.handle(Handler.java:740)
	at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:138)
	at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:611)
	at org.eclipse.jetty.server.Handler$Sequence.handle(Handler.java:805)
	at org.eclipse.jetty.server.Server.handle(Server.java:182)
	at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.run(HttpChannelState.java:677)
	at org.eclipse.jetty.util.thread.Invocable$ReadyTask.run(Invocable.java:177)
	at org.eclipse.jetty.http2.server.internal.HttpStreamOverHTTP2$1.run(HttpStreamOverHTTP2.java:144)
	...

We can see that there is a failure detected in ChannelCallback.succeeded() because SessionStreamWrapper.failed is ultimately called. This means that there must have been one of the following application errors:

These are all plausible application errors, especially with something like server sent events.

Once thread 258 has called completeStream, it returns all the way out of handling and can be seen calling completeStream again in this stack trace:

java.lang.Exception: complete stack
	...
	at org.eclipse.jetty.session.AbstractSessionManager.complete(AbstractSessionManager.java)
	at org.eclipse.jetty.session.AbstractSessionManager$SessionStreamWrapper.doComplete(AbstractSessionManager.java:1509)
	at org.eclipse.jetty.server.handler.ContextHandler$ScopedContext.run(ContextHandler.java:1524)
	at org.eclipse.jetty.session.AbstractSessionManager$SessionStreamWrapper.failed(AbstractSessionManager.java:1479)
	at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.completeStream(HttpChannelState.java:788)
	at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.run(HttpChannelState.java:712)
	at org.eclipse.jetty.util.thread.Invocable$ReadyTask.run(Invocable.java:177)
	at org.eclipse.jetty.http2.server.internal.HttpStreamOverHTTP2$1.run(HttpStreamOverHTTP2.java:144)
	...

which is called from this code after the handler has been invoked:

try (AutoLock ignored = _lock.lock())
{
    stream = _stream;
    _handling = null;
    _handled = true;
    failure = _callbackFailure;
    callbackCompleted = _callbackCompleted;
    lastStreamSendComplete = lockedIsLastStreamSendCompleted();
    completeStream = callbackCompleted && lastStreamSendComplete;
    if (LOG.isDebugEnabled())
        LOG.debug("handler invoked: completeStream={} failure={} callbackCompleted={} {}", completeStream, failure, callbackCompleted, HttpChannelState.this);
}
if (LOG.isDebugEnabled())
    LOG.debug("stream={}, failure={}, callbackCompleted={}, completeStream={}", stream, failure, callbackCompleted, completeStream);
if (completeStream)
{
    if (LOG.isDebugEnabled())
        LOG.debug("completeStream({}, {})", stream, Objects.toString(failure));
    completeStream(stream, failure);
}

Note that this code only calls completeStream if callbackCompleted && lastStreamSendComplete is true. This is normally not the case for HTTP/1, because the first call to completeStream would have recycled the HttpChannelState, so callbackCompleted and lastStreamSendComplete will both be false. However, for H2, HttpChannelStates are re-used after being recycled, so another request may have come in and set the fields of the state again, in which case the second call to completeStream incorrectly completes that new request.
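To make the re-use hazard concrete, here is a minimal single-threaded sketch. It is not Jetty code: the ChannelState class, the string "streams" and the simulate method are hypothetical stand-ins, and the interleaving that would need a racing thread in practice is simply written out in order.

```java
import java.util.ArrayList;
import java.util.List;

public class RecycledStateSketch
{
    /** Hypothetical stand-in for a channel state object that is recycled and re-used. */
    static final class ChannelState
    {
        String stream;
        boolean callbackCompleted;
        boolean lastStreamSendComplete;

        void recycle()
        {
            stream = null;
            callbackCompleted = false;
            lastStreamSendComplete = false;
        }
    }

    /** Records which stream was completed, then recycles the state. */
    static void completeStream(ChannelState state, List<String> completed)
    {
        completed.add(state.stream);
        state.recycle();
    }

    static List<String> simulate(boolean reusedBeforeHandlerReturns)
    {
        List<String> completed = new ArrayList<>();
        ChannelState state = new ChannelState();
        state.stream = "request-1";

        // First completion, made from the callback while still inside the handler.
        completeStream(state, completed);

        if (reusedBeforeHandlerReturns)
        {
            // Assumed race: the recycled state is picked up for a new request
            // before the original handler invocation has returned.
            state.stream = "request-2";
            state.callbackCompleted = true;
            state.lastStreamSendComplete = true;
        }

        // The check made after the handler returns.
        if (state.callbackCompleted && state.lastStreamSendComplete)
            completeStream(state, completed);
        return completed;
    }

    public static void main(String[] args)
    {
        System.out.println(simulate(false)); // prints [request-1]
        System.out.println(simulate(true));  // prints [request-1, request-2]
    }
}
```

Without re-use the second check is harmless because the recycled fields are false; with re-use the stale handler path completes a request it never handled.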

Thus I believe the core fix is to not call completeStream whilst we are still handling. Furthermore, if we are to ignore the last write leg of the stool, we should explicitly force lastStreamSendComplete to true.
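As a rough illustration of that idea (not the actual patch), the sketch below defers completion to the handler-return path whenever the callback fails while the handler is still running. The class, the plain monitor lock and the method names are assumptions made for the example; only the field names echo the snippets above.

```java
public class CompleteOnceSketch
{
    private final Object lock = new Object();
    private boolean handled;
    private boolean callbackCompleted;
    private boolean lastStreamSendComplete;
    private int completions;

    /** Callback completion that detected a failure on an already committed response. */
    void callbackFailedAfterCommit()
    {
        boolean completeStream;
        synchronized (lock)
        {
            callbackCompleted = true;
            // The last write leg is being ignored, so force it to complete explicitly.
            lastStreamSendComplete = true;
            // Complete here only if the handler has already returned;
            // otherwise leave it to invokeHandler below.
            completeStream = handled;
        }
        if (completeStream)
            completeStream();
    }

    /** Runs the handler, then completes the stream if the other legs are done. */
    void invokeHandler(Runnable handler)
    {
        handler.run();
        boolean completeStream;
        synchronized (lock)
        {
            handled = true;
            completeStream = callbackCompleted && lastStreamSendComplete;
        }
        if (completeStream)
            completeStream();
    }

    private void completeStream()
    {
        synchronized (lock)
        {
            completions++;
        }
    }

    int completions()
    {
        synchronized (lock)
        {
            return completions;
        }
    }

    public static void main(String[] args)
    {
        // Failure detected while still inside the handler.
        CompleteOnceSketch inHandler = new CompleteOnceSketch();
        inHandler.invokeHandler(inHandler::callbackFailedAfterCommit);
        System.out.println(inHandler.completions()); // prints 1

        // Failure detected after the handler has returned.
        CompleteOnceSketch afterHandler = new CompleteOnceSketch();
        afterHandler.invokeHandler(() -> {});
        afterHandler.callbackFailedAfterCommit();
        System.out.println(afterHandler.completions()); // prints 1
    }
}
```

Because both decisions are taken under the same lock, exactly one of the two paths sees all legs complete, whichever order they run in.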

Unfortunately I have been unable to produce a unit test for this, as I believe it needs precisely unlucky timing and an application error.

gregw commented Oct 21, 2025

@sbordet @lorban Can you review the diagnosis that @janbartel and I have come up with? I'm 90% sure this is it, but I cannot reproduce it (any thoughts on how we might be able to do that?).

gregw commented Oct 21, 2025

Note that we added this completeStream call in #9684


@sbordet sbordet left a comment


ChannelCallback.failed() seems not entirely correct either.

Can we write test cases for this scenario?

gregw commented Oct 21, 2025

ChannelCallback.failed() seems not entirely correct either.

@sbordet I will look....

Can we write test cases for this scenario?

Very hard, because the second completeStream is not called unless there is another thread racing. I'm open to suggestions.

gregw added 3 commits October 22, 2025 11:02
Fix #13470 by calling completeStream only once even when there is a failure in the channel callback.

refactor success and failure into single method.
Fix #13470 by calling completeStream only once even when there is a failure in the channel callback.

refactor success and failure into single method.

Improved EventSourceServlet
gregw commented Oct 22, 2025

@sbordet @lorban I'm getting concerned at the number of tests this PR is breaking in its current state.
I think we need to take time to consider the more "cleanup" changes, and probably only make them in 12.1.x.

So I propose that this PR should simply be 83c1718 for 12.0.x (perhaps with the EventSourceServlet cleanups), and then we can do a wider cleanup and refactor in 12.1.x next month.

@gregw gregw requested a review from sbordet October 22, 2025 19:50
gregw added 3 commits October 23, 2025 07:33
Fix #13470 by calling completeStream only once even when there is a failure in the channel callback.

refactor success and failure into single method.

Improved EventSourceServlet
lorban commented Oct 23, 2025

Regarding the minimal 12.0 fix, I think httpChannelState._handling == null should actually be httpChannelState._handled.

gregw commented Oct 24, 2025

@lorban this is passing tests now.... so let's go for this one?
@sbordet @lorban Review please!!

@gregw gregw requested a review from sbordet October 24, 2025 21:04
@gregw gregw requested a review from sbordet October 29, 2025 04:51
@gregw gregw merged commit 9b54bfe into jetty-12.0.x Oct 29, 2025
10 checks passed
@github-project-automation github-project-automation bot moved this to ✅ Done in Jetty 12.0.30 Oct 29, 2025
@gregw gregw deleted the fix/12.0.x/13470-completeStreamOnce branch October 29, 2025 20:23

Development

Successfully merging this pull request may close these issues.

Jetty 12.0: ManagedSession issues due to recursion and/pr multiple completions of the stream.
