Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Service of Type=notify is started despite no notification, sidestepping ExecStartPost action #6049

Closed
michalpalka opened this issue May 30, 2017 · 4 comments

Comments

@michalpalka
Copy link

Submission type

  • Bug report

systemd version the issue has been seen with

systemd 232
+PAM +AUDIT -SELINUX +IMA +APPARMOR -SMACK -SYSVINIT +UTMP -LIBCRYPTSETUP +GCRYPT -GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID -ELFUTILS +KMOD -IDN

Used distribution

NixOS 8a11612d502d8578d7828e21778dcf3391084095 (26 May 2017)

In case of bug report: Expected behaviour you didn't see

A service of Type=notify should not be considered by systemd as having started if no notification is received, as explained by the man page systemd.service:

Behavior of notify is similar to simple; however, it is expected that the daemon sends a notification message via sd_notify(3) or an equivalent call when it has finished starting up. systemd will proceed with starting follow-up units after this notification message has been sent.

Furthermore, a service should be considered started only after having executed its ExecStartPost actions, although this is not stated clearly in the documentation.

In case of bug report: Unexpected behaviour you saw

Systemd considers a service of Type=notify as having started as soon as the main process executed by ExecStart terminates if no notification is received. The ExecStartPost actions are not executed.

In case of bug report: Steps to reproduce the problem

Below are the service files for myservice1 and myservice2, which depends on the first one. Included are also the scripts, and the results of starting service2:
myservice1.service:

[Unit]
Description=My Service 1

[Service]
ExecStart=/etc/myservice1.start
ExecStartPost=/etc/myservice1.startpost
NotifyAccess=all
Type=notify

myservice2.service:

[Unit]
After=myservice1.service
Description=My Service 2
Wants=myservice1.service

[Service]
ExecStart=/etc/myservice2.start
Type=simple

myservice1.start:

#!/bin/sh

echo myservice1 start
/etc/myservice1.run
sleep 5
echo myservice1 started

myservice1.run:

#!/bin/sh

echo myservice1 start running
sleep 3
echo myservice1 start notify
systemd-notify --ready
sleep 300
echo myservice1 finish running

myservice1.startpost:

#!/bin/sh

echo myservice1 startpost
sleep 5
echo myservice1 startpost finish

myservice2.start:

#!/bin/sh

echo myservice2 start
sleep 300
echo myservice2 finish

Running

# systemctl start myservice2.service

yields a correct behaviour, where myservice1 and myservice2 are started in order:

May 30 08:43:24 nixos systemd[1]: Starting My Service 1...
May 30 08:43:24 nixos myservice1.start[1359]: myservice1 start
May 30 08:43:24 nixos myservice1.start[1359]: myservice1 start running
May 30 08:43:27 nixos myservice1.start[1359]: myservice1 start notify
May 30 08:43:27 nixos myservice1.startpost[1364]: myservice1 startpost
May 30 08:43:29 nixos myservice1.start[1359]: myservice1 started
May 30 08:43:32 nixos myservice1.startpost[1364]: myservice1 startpost finish
May 30 08:43:32 nixos systemd[1]: Started My Service 1.
May 30 08:43:32 nixos systemd[1]: Started My Service 2.
May 30 08:43:32 nixos myservice2.start[1369]: myservice2 start

As visible in the trace, the ExecStartPost action of myservice1 is run after the notification is received, and myservice1 is considered as having started only after its ExecStartPost action terminates, which is followed by starting the second service.

On the other hand, if we replace the line containing sleep 3 in myservice1.run with sleep 7, the notification will be ignored, as it will be sent only after the main process started in ExecStart of myservice1 has terminated.

May 30 08:44:34 nixos systemd[1]: Starting My Service 1...
May 30 08:44:34 nixos myservice1.start[1380]: myservice1 start
May 30 08:44:34 nixos myservice1.start[1380]: myservice1 start running
May 30 08:44:39 nixos myservice1.start[1380]: myservice1 started
May 30 08:44:39 nixos systemd[1]: Started My Service 1.
May 30 08:44:39 nixos systemd[1]: Started My Service 2.
May 30 08:44:39 nixos myservice2.start[1386]: myservice2 start

As the trace above shows, systemd does not even wait for the notification after the main process from ExecStart terminates, but immediately declares myservice1 as having started, ignoring its ExecStartPost action, and goes on to start myservice2.

This behaviour is very unexpected, and is probably not intended.

@evverx
Copy link
Member

evverx commented May 30, 2017

This might have been fixed by #4745. Could you try v233, please?

@evverx evverx added pid1 needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer labels May 30, 2017
@evverx
Copy link
Member

evverx commented May 30, 2017

I'm sorry for giving you the wrong link. The correct one is #4259. That is also included in v233.

@michalpalka
Copy link
Author

With systemd v233, I get the same (correct) result with sleep 3:

May 31 06:08:42 nixos systemd[1]: Starting My Service 1...
May 31 06:08:42 nixos myservice1.start[1167]: myservice1 start
May 31 06:08:42 nixos myservice1.start[1167]: myservice1 start running
May 31 06:08:45 nixos myservice1.start[1167]: myservice1 start notify
May 31 06:08:46 nixos myservice1.startpost[1174]: myservice1 startpost
May 31 06:08:47 nixos myservice1.start[1167]: myservice1 started
May 31 06:08:51 nixos myservice1.startpost[1174]: myservice1 startpost finish
May 31 06:08:51 nixos systemd[1]: Started My Service 1.
May 31 06:08:51 nixos systemd[1]: Started My Service 2.
May 31 06:08:51 nixos myservice2.start[1177]: myservice2 start

With sleep 7, myservice1 fails to start, while myservice2 is still started afterwards:

May 31 06:10:05 nixos systemd[1]: Starting My Service 1...
May 31 06:10:05 nixos myservice1.start[1188]: myservice1 start
May 31 06:10:05 nixos myservice1.start[1188]: myservice1 start running
May 31 06:10:10 nixos myservice1.start[1188]: myservice1 started
May 31 06:10:10 nixos systemd[1]: Failed to start My Service 1.
May 31 06:10:10 nixos systemd[1]: myservice1.service: Unit entered failed state.
May 31 06:10:10 nixos systemd[1]: myservice1.service: Failed with result 'protocol'.
May 31 06:10:10 nixos systemd[1]: Started My Service 2.
May 31 06:10:10 nixos myservice2.start[1192]: myservice2 start

If I add Requires=myservice1.service to myservice2.service then the start of myservice2 is also correctly prevented, which is not the case with systemd v232.

May 31 06:21:53 nixos systemd[1]: Starting My Service 1...
May 31 06:21:53 nixos myservice1.start[2550]: myservice1 start
May 31 06:21:53 nixos myservice1.start[2550]: myservice1 start running
May 31 06:21:58 nixos myservice1.start[2550]: myservice1 started
May 31 06:21:58 nixos systemd[1]: Failed to start My Service 1.
May 31 06:21:58 nixos systemd[1]: Dependency failed for My Service 2.
May 31 06:21:58 nixos systemd[1]: myservice2.service: Job myservice2.service/start failed with result 'dependency'.
May 31 06:21:58 nixos systemd[1]: myservice1.service: Unit entered failed state.
May 31 06:21:58 nixos systemd[1]: myservice1.service: Failed with result 'protocol'.

It seems that the behaviour in v233 is correct and the issue can be closed.

I would recommend clarifying two things in the documentation:

  • Currently, the systemd.service man page says:

    ExecStartPost= commands are only run after the service has started successfully

    This suggests that a service will be considered started even before the ExecStartPost= commands are run, which is not the case and could be made clearer.

  • It would be good to mention that a service specified to start After another service will be started after the first service either succeeded or failed to start. This is mentioned in the description of Wants of the systemd.unit man page:

    A weaker version of Requires=. Units listed in this option will be started if the configuring unit is. However, if the listed units fail to start or cannot be added to the transaction, this has no impact on the validity of the transaction as a whole.

    All the available documentation discusses After, Wants and Requires separately, without giving any examples, which makes it difficult to understand how services will behave when different combinations of these are used.

@evverx
Copy link
Member

evverx commented Jun 6, 2017

Thanks for the feedback. I think the issue can be closed. Would you mind creating the new issue concerning the documentation?

@evverx evverx added already-implemented and removed needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer labels Jun 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

2 participants