Skip to content

Conversation

@hatifnatt
Copy link
Contributor

PR progress checklist (to be filled in by reviewers)

  • Changes to documentation are appropriate (or tick if not required)
  • Changes to tests are appropriate (or tick if not required)
  • Reviews completed

What type of PR is this?

Primary type

  • [build] Changes related to the build system
  • [chore] Changes to the build process or auxiliary tools and libraries such as documentation generation
  • [ci] Changes to the continuous integration configuration
  • [feat] A new feature
  • [fix] A bug fix
  • [perf] A code change that improves performance
  • [refactor] A code change that neither fixes a bug nor adds a feature
  • [revert] A change used to revert a previous commit
  • [style] Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc.)

Secondary type

  • [docs] Documentation changes
  • [test] Adding missing or correcting existing tests

Does this PR introduce a BREAKING CHANGE?

Yes.

BREAKING CHANGE: due changes in default service name, on systems where 'archive' installation method is used, duplicate service will be created. This can be avoided by updating pillar with 'docker:pkg:docker:service:name: dockerd'. Due fact that 'archive' method is default this change may affect a large number of users

Related issues and/or pull requests

Closes #273

Describe the changes you're proposing

Docker daemon service name in official Docker packages is 'docker' but formula uses 'dockerd'. Due to these differences docker service won't start after installation if package installation method is selected - docker:pkg:docker:use_upstream: repo .

Fix documentation link in systemd service

docker/software/archive/install.sls
Docker daemon binary in official Docker archive distribution is always 'dockerd' no reason to substitute service name

defaults.yaml
'null' from yaml is rendering as 'None' by Jinja therefore, systemd displays a warning in the log, replace with empty string to prevent this issue

Pillar / config required to test the proposed changes

docker:
  pkg:
    docker:
      use_upstream: repo

Debug log showing how the proposed changes work

2021-03-10 17:24:59,564 [salt.template    :122 ][DEBUG   ][2654] Rendered data from file: /var/cache/salt/minion/files/base/docker/software/service/running.sls:
# -*- coding: utf-8 -*-
# vim: ft=sls

include:
  - docker.software.package.install
  - docker.software.config.file

docker-software-service-running-unmasked:
  service.unmasked:
    - name: docker
    - onlyif: systemctl list-unit-files | grep docker >/dev/null 2>&1
    - require_in:
      - service: docker-software-service-running-docker

docker-software-service-running-docker:
  service.running:
    - name: docker
    - enable: True

docker-software-service-running-docker-fail-notify:
  test.fail_without_changes:
    - comment: |
        Formula is trying to start 'docker' service
        but failed, is it a correct name for Docker service in your OS?

        In certain circumstances the docker service will not start.
        Your kernel is missing some modules, or not in ideal state.
        See https://github.com/moby/moby/blob/master/contrib/check-config.sh
        * Rebooting your host is recommended!
    - onfail:
      - service: docker-software-service-running-docker
  service.enabled:
    - onlyif: True
    - name: docker
    - onfail:
      - service: docker-software-service-running-docker

Documentation checklist

  • Updated the README (e.g. Available states).
  • Updated pillar.example.

Testing checklist

  • Included in Kitchen (i.e. under state_top).
  • Covered by new/existing tests (e.g. InSpec, Serverspec, etc.).
  • Updated the relevant test pillar.

Additional context

Better merge this after #276

@myii
Copy link
Contributor

myii commented Mar 10, 2021

@hatifnatt Looks like you're going to need to modify this test as well:

describe service('dockerd') do

@hatifnatt
Copy link
Contributor Author

@myii good point, I'm still not using Kitchen locally
Does this need to be uncommented too?


Title is stating 'running and enabled' but it seems like running is not actually tested.
title 'should be running and enabled'

@myii
Copy link
Contributor

myii commented Mar 10, 2021

Does this need to be uncommented too?

Title is stating 'running and enabled' but it seems like running is not actually tested.

title 'should be running and enabled'

@hatifnatt Ideally, yes -- but there's probably a reason why @noelmcloughlin disabled it, so let's leave it for this particular PR. I suppose some of the instances will fail due to some container-based issues.

In fact, this block has also been disabled and would need to be investigated:

# describe service(name) do
# it { should be_enabled }
# it { should be_running }
# end

Docker daemon service name in official Docker packages
is 'docker' but formula uses 'dockerd'. Due to these differences
docker service won't start after installation.

Fix documentation link in systemd service

Docker daemon binary in official Docker archive distribution
is always 'dockerd' no reason to substitute service name

'null' from yaml is rendering as 'None' by Jinja therefore,
systemd displays a warning in the log, replace with empty string
to prevent this issue

BREAKING CHANGE: due changes in default service name, on systems
where 'archive' installation method is used, duplicate service
will be created. This can be avoided by updating pillar with
'docker:pkg:docker:service:name: dockerd'. Due fact that 'archive'
method is default this change may affect a large number of users
@hatifnatt hatifnatt force-pushed the fix-default-service-name branch from 79a56bc to f8f96f1 Compare March 10, 2021 20:53
Copy link
Contributor

@myii myii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @hatifnatt. Since this involves a BREAKING CHANGE, I'll leave this open for a second opinion, probably from @noelmcloughlin.

@noelmcloughlin
Copy link
Contributor

thanks @hatifnatt
Can we discuss service name at #273 (comment)

@noelmcloughlin
Copy link
Contributor

noelmcloughlin commented Mar 10, 2021

Title is stating 'running and enabled' but it seems like running is not actually tested.

@myii the docker service.running state did not work in Travis CI/CD so this was fudge. So yes, that was the reason those tests were disabled.

@myii
Copy link
Contributor

myii commented Mar 10, 2021

Title is stating 'running and enabled' but it seems like running is not actually tested.

@myii the docker service.running state did not work in Travis CI/CD so this was fudge. So yes, that was the reason those tests were disabled.

@noelmcloughlin Thanks for the response. I'm intrigued... let me try a run here in GitHub Actions to see what happens. I'll let you know shortly.

@noelmcloughlin
Copy link
Contributor

noelmcloughlin commented Mar 10, 2021

Service name is (probably) docker in linux native package format, & (definitely) dockerd with official archive releases. #273 (comment)

@noelmcloughlin
Copy link
Contributor

noelmcloughlin commented Mar 10, 2021

I'm intrigued... let me try a run here in GitHub Actions

In Travis CI/CD we used docker driver and means we cannot test docker.swarm states, etc.
In some linux distributions docker service has issues with iptables/kernel and reboot is necessary. This is all I recollect - so that is why the unit tests included the phase "reboot".

Suse: #203 #197
and I think there was CentOS ticket somewhere too.

@noelmcloughlin
Copy link
Contributor

Same issue in arch: #229 We need to figure out if it is package or archive or both formats that encounter the issue.

@noelmcloughlin
Copy link
Contributor

Actually maybe these issues only happen when kernel is upgraded and docker is installed before required reboots.

@myii
Copy link
Contributor

myii commented Mar 10, 2021

@noelmcloughlin @sticky-note So I've run a build with the service tests enabled, adding this commit as well -- almost there now:

Let me figure out the specifics of the failing builds and then we can add that to this PR.

@myii
Copy link
Contributor

myii commented Mar 10, 2021

@noelmcloughlin OK, the fedora-33 failure to start the service is due to this problem:

systemd[1]: /usr/lib/systemd/system/docker.service:16: PIDFile= references a path below legacy directory /var/run/, updating /var/run/docker.pid → /run/docker.pid; please update the unit file accordingly.
  • I.e. Need to use /run/ instead of /var/run/.

That will probably be the same for the other two failures as well.

@myii
Copy link
Contributor

myii commented Mar 10, 2021

Ah, that's not enough by itself, that then leads to the following error:

dockerd[1548]: failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: Iptables not found

So until there's a proper fix for that, perhaps the best gains would be to enable the service testing everywhere except for these three failing instances. That's easy enough to do from InSpec.

@noelmcloughlin
Copy link
Contributor

Yes that makes sense. @hatifnatt could you update the PR based on the test results please?

@myii
Copy link
Contributor

myii commented Mar 10, 2021

Yes that makes sense. @hatifnatt could you update the PR based on the test results please?

Don't worry, I'll push the fixed tests once I figure out the right ignores to use.

@hatifnatt Is Allow edits from maintainers selected?

@noelmcloughlin
Copy link
Contributor

dockerd[1548]: failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: Iptables not found

Here is ugly hack: https://forums.docker.com/t/failing-to-start-dockerd-failed-to-create-nat-chain-docker/78269/9
Yes - I think its best to disable those failing checks.

@myii
Copy link
Contributor

myii commented Mar 10, 2021

OK, the other two are failing with the same.

archive-fedora-32-master-py3:

dockerd[803]: failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: Iptables not found

archive-opensuse-leap-152-master-py3:

dockerd[758]: failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: Iptables not found

So let me fix up those tests and then push them here.

Excluding for `archive` on the following platforms until resolved.

---

`archive-fedora-33-master-py3`:

```
systemd[1]: /usr/lib/systemd/system/docker.service:16: PIDFile= references a path below legacy directory /var/run/,
            updating /var/run/docker.pid → /run/docker.pid; please update the unit file accordingly.
```

However, even when that's fixed:

```
dockerd[1548]: failed to start daemon: Error initializing network controller:
               error obtaining controller instance: failed to create NAT chain DOCKER: Iptables not found
```

---

`archive-fedora-32-master-py3`:

```
dockerd[803]: failed to start daemon: Error initializing network controller:
              error obtaining controller instance: failed to create NAT chain DOCKER: Iptables not found
```

---

`archive-opensuse-leap-152-master-py3`:

```
dockerd[758]: failed to start daemon: Error initializing network controller:
              error obtaining controller instance: failed to create NAT chain DOCKER: Iptables not found
```
@myii
Copy link
Contributor

myii commented Mar 10, 2021

@noelmcloughlin @hatifnatt Tests updated in c168ee1, service is being tested wherever possible.

@hatifnatt
Copy link
Contributor Author

@myii thanks for the fixes!

About iptables issues - iptables can be considered as mandatory and it's seems it can work with firewalld too
https://docs.docker.com/network/iptables/
I have already added iptables to docker:pkg:deps in #275 for CentOS 7 and 8 it was already here fo RedHat os_family


but due lists are not merged by grains.filter_by it's lost when docker:pkg:deps defined in osfamilymap

@hatifnatt
Copy link
Contributor Author

@noelmcloughlin could you please review and give any conclusion?

@noelmcloughlin
Copy link
Contributor

noelmcloughlin commented Apr 16, 2021

could you please review and give any conclusion?

I checked and found #273 (comment) and I was suggesting this line be fixed:

start: {{ d.pkg.docker.path }}/{{ d.pkg.docker.service.name }}

  • Add docker.pkg.docker.binary.name: docker to YAML
  • Change line 97 to start: {{ d.pkg.docker.path }}/{{ d.pkg.docker.binary.name }}

But I think you were not keen. @myii has approved feel free to merge yourself.

@hatifnatt
Copy link
Contributor Author

I outlined my thoughts about additional variable in #273 (comment) Do you still think additional variable is required?
Docker daemon binary is named dockerd so even if add new variable it will be docker.pkg.docker.binary.name: dockerd if it's supposed to be used in

start: {{ d.pkg.docker.path }}/{{ d.pkg.docker.service.name }}

Actually I already wrote my own formula for docker, just want to close this 'hanging' PR in some way.

Copy link
Contributor

@noelmcloughlin noelmcloughlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. You are probably correct. I reviewed the CI and both package/archive states are working across an impressive matrix of OS. So LGTM

@noelmcloughlin noelmcloughlin merged commit b0b1ba2 into saltstack-formulas:master Apr 16, 2021
@noelmcloughlin
Copy link
Contributor

Because your PR includes this fix: start: {{ d.pkg.docker.path }}/dockerd

@saltstack-formulas-travis

🎉 This PR is included in version 2.0.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

@hatifnatt
Copy link
Contributor Author

@noelmcloughlin thanks for review and merge!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] default service name must be docker not dockerd

4 participants