stage1: rework systemd service structure #1407
Conversation
|
After this PR, when an app finishes, its exit status will be written but it will not bring down the whole pod. I'm not sure how to do that yet with this new scheme and we should decide how to expose it to the user (if we want to). Comments welcome :) |
4f3173d to
0a8a587
Compare
|
This looks good to me. |
How can I trigger them? |
Starting rkt from a systemd unit file and stopping it.
|
@iaguis Thanks, tried SIGTERM, didn't work. Turns out the pid returned by
|
@yifan-gu you can send http://www.freedesktop.org/software/systemd/man/systemd.html |
|
@iaguis Nice, it works! LGTM |
I would like to have some plan here. No ideas for how we might implement it? As far as how it's exposed to the user, I imagine it'll be part of the spec in the pod manifest, perhaps with a corresponding rkt run flag. |
|
BTW, does this fix #1365 ? |
I'll try to find a way to implement it tomorrow.
I can't reproduce it anymore with this rework. |
|
@jonboulle As for the spec, we can just make something like Kubernetes' pod lifecycle. |
|
@iaguis awesome diagram btw, would love to add a similar one to the documentation :-) |
Actually I can still reproduce it 😞. I don't get it using src-v225 but I do if I use host-v225. This is very weird. |
|
ACK that this doesn't fix host flavor with systemd v226. As discussed on IRC, the message is confusing because isolate is not used by the rkt services anymore. It's unclear what dependency is pulled in and why. |
00134c4 to
0f81603
Compare
|
Updated:
TODO:
|
|
@iaguis @jonboulle Thought a little bit about how we can implement Kubernetes' restart policy by using a systemd service to run a rkt pod (rough sketch below):
This PR solves the first two options. For the last one, we would need to:
|
|
Other things we need to consider: (FWIW, for now in rkt/kubernetes, systemd's [Service].Restart is not being used. There is a loop that checks each pod's state and restarts it according to the restart policy periodically, which I feel is suboptimal.) |
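For reference, a minimal sketch of the "systemd service running a rkt pod" idea. The unit name, image name and paths here are made up for illustration; Kubernetes' restart policies map roughly onto systemd's Restart= values (Always → always, OnFailure → on-failure, Never → no):

```ini
# /etc/systemd/system/rkt-mypod.service (hypothetical host-side unit)
[Unit]
Description=Example rkt pod supervised by the host systemd

[Service]
# Restart policy handled by systemd instead of an external polling loop.
# Roughly: Always -> always, OnFailure -> on-failure, Never -> no
Restart=on-failure
ExecStart=/usr/bin/rkt run example.com/myapp

[Install]
WantedBy=multi-user.target
```

Stopping such a unit (systemctl stop rkt-mypod.service) would also be an easy way to exercise the external-shutdown path discussed above.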
|
Thought more on the prober stuff. I think it can be solved by just killing the whole pod when any app fails the probing. But we would need a way for rkt to return a non-zero exit code. (Currently this can be achieved by killing the rkt process.) |
|
Regarding TestPodManifest: I did some tests with current master and it seems the pod exits when a single app fails or when all apps are finished. If an app exits with success, the pod doesn't exit. To preserve the current behavior we should add the mentioned line, but replacing ExecStop with OnFailure (see the sketch below). |
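To illustrate that last point, a hypothetical version of an app unit where the shutdown path is pulled in only on failure; the unit and target names are illustrative, not necessarily what this PR generates:

```ini
# myapp.service (illustrative)
[Unit]
Wants=reaper-myapp.service
# OnFailure= activates the listed unit only when this service
# enters a "failed" state, unlike ExecStop which runs on every stop.
OnFailure=halt.target

[Service]
ExecStart=/opt/myapp/bin/myapp
```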
f2ab380 to
dab8f6f
Compare
|
Updated and green :) |
|
Unimportant notes: I just tried to rethink this so that the shutdown.service wouldn't be needed, but it would probably be really ugly. The additional service makes it easier to understand what's happening, so I'd keep it. |
stage1/init/pod.go
Outdated
Given that the caller also builds up an error message, this would print something like:
Failed to write default.target: failed to create service unit file: Permission denied
Can it be made shorter? Also, it's not a service unit file but a target.
|
A couple of small things, but it looks good otherwise :) Do you want to add the documentation with the diagram in a separate PR or in this one? |
This changes the way rkt starts apps in a pod.
The default.target has a Wants and After dependency on each app, making
sure they all start. Each app has a Wants dependency on an associated
reaper service that deals with writing the app's exit status. Each
reaper service has a Wants and After dependency with a shutdown service
that simply shuts down the pod.
The reaper services and the shutdown service all start at the beginning
but do nothing and remain after exit (with the RemainAfterExit flag). By
using the StopWhenUnneeded flag, whenever they stop being referenced,
they'll do the actual work via the ExecStop command.
This means that when an app service is stopped, its associated reaper
will run and will write its exit status to /rkt/status/${app} without
having to wait until the pod exits. When all apps' services stop, their
associated reaper services will also stop and will cease referencing the
shutdown service. This will cause the pod to exit.
A Conflicts dependency was also added between each reaper service and
the halt and poweroff targets (they are triggered when the pod is
stopped from the outside). This will activate all the reaper services
when one of the targets is activated, causing the exit statuses to be
saved and the pod to finish as described in the previous
paragraph.
For now we preserve the current rkt lifecycle by shutting down the pod
when the first app exits with failure.
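To make the structure described above easier to follow, here is a rough, hypothetical sketch of the generated units. Names, paths and commands are illustrative; the actual files written by stage1 may differ:

```ini
# --- default.target: wants and orders itself after every app ---
[Unit]
Wants=myapp.service
After=myapp.service

# --- myapp.service: the app itself, pulling in its reaper ---
[Unit]
Wants=reaper-myapp.service
[Service]
ExecStart=/opt/myapp/bin/myapp

# --- reaper-myapp.service: idle until stopped, then records the exit status ---
[Unit]
Wants=shutdown.service
After=shutdown.service
# Stopped (and therefore reaping) when the pod is halted/powered off from outside:
Conflicts=halt.target poweroff.target
StopWhenUnneeded=yes
[Service]
RemainAfterExit=yes
ExecStart=/bin/true
# Hypothetical helper that writes the app's exit status to /rkt/status/myapp:
ExecStop=/reaper.sh myapp

# --- shutdown.service: idle until no reaper references it, then halts the pod ---
[Unit]
StopWhenUnneeded=yes
[Service]
RemainAfterExit=yes
ExecStart=/bin/true
# Illustrative; the actual shutdown command may differ:
ExecStop=/usr/bin/systemctl --force halt
```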
dab8f6f to
1a2bcb6
Compare
|
Updated.
Let's discuss the documentation in #1459 |
|
LGTM 👍 |
stage1: rework systemd service structure
|
please file a follow-up re: nuclear option? |