Skip to content

Conversation

cah-hbaum
Copy link
Contributor

@cah-hbaum cah-hbaum commented Apr 10, 2024

  • added a new input that enables testing the test by providing yaml files containing label information for different nodes
  • added "test-example.yaml" containing an example for such a test file
  • removed the internal config, since the normal config file is already provided and read in as a default
  • added a pytest that enables easy testing of the check_nodes function (thanks to @martinmo)

@cah-hbaum cah-hbaum added Container Issues or pull requests relevant for Team 2: Container Infra and Tooling standards Issues / ADR / pull requests relevant for standardization & certification SCS is continuously built and tested SCS is continuously built and tested in order to raise velocity and quality SCS is standardized SCS is standardized SCS-VP10 Related to tender lot SCS-VP10 labels Apr 10, 2024
@cah-hbaum cah-hbaum self-assigned this Apr 10, 2024
@cah-hbaum
Copy link
Contributor Author

Also, like its written in the issue itself, we could think about providing some more scenarios for this test, especially with the newly added testing possibilities via files.

@cah-hbaum cah-hbaum requested a review from martinmo April 10, 2024 08:49
Copy link
Member

@martinmo martinmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea with the test data. I have some suggestions about this to make it easier to unit test, see the inline comments.

Apart from that: I think it makes more sense to remove external logging configuration and the --config flag. It is not used anywhere and the short form -c also conflicts with the short form of --os-cloud in the IaaS scripts.

The logging config can be simplified like it is done here:

if __name__ == '__main__':
logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.INFO)
cli(obj=Config())

Also, when I run without any arguments, the script fails.

$ ./kaas/k8s-node-distribution/k8s-node-distribution-check.py
The config file under ./config.yaml couldn't be found.
Traceback (most recent call last):
  File "/home/martinmorgenstern/workspace/scs-standards/Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py", line 127, in initialize_config
    with open(config.config_path, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: './config.yaml'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/martinmorgenstern/workspace/scs-standards/Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py", line 131, in initialize_config
    exit(1)
  File "/usr/lib/python3.10/_sitebuiltins.py", line 26, in __call__
    raise SystemExit(code)
SystemExit: 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/martinmorgenstern/workspace/scs-standards/Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py", line 244, in <module>
    return_code = asyncio.run(main(sys.argv[1:]))
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/home/martinmorgenstern/workspace/scs-standards/Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py", line 197, in main
    config = initialize_config(parse_arguments(argv))
  File "/home/martinmorgenstern/workspace/scs-standards/Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py", line 135, in initialize_config
    setup_logging(config.logging)
  File "/home/martinmorgenstern/workspace/scs-standards/Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py", line 112, in setup_logging
    logging.config.dictConfig(config_log)
  File "/usr/lib/python3.10/logging/config.py", line 811, in dictConfig
    dictConfigClass(config).configure()
  File "/usr/lib/python3.10/logging/config.py", line 375, in __init__
    self.config = ConvertingDict(config)
TypeError: 'NoneType' object is not iterable

@cah-hbaum cah-hbaum linked an issue Apr 17, 2024 that may be closed by this pull request
3 tasks
Copy link
Member

@martinmo martinmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard states that

At least one control plane instance MUST be run in each "failure zone"

I played around with the test-example.yaml you provided in ipython and created a scenario where all control plane nodes are in the same zone and region. In this case, the script would still exit with 0 and just print

WARNING: There seems to be no distribution across multiple regions or labels aren't set correctly across nodes.
WARNING: There seems to be no distribution across multiple zones or labels aren't set correctly across nodes.
INFO: The master nodes are distributed across 3 host-ids.
INFO: The worker nodes are distributed across 2 regions.

Shouldn't it produce an ERROR? Or do we assume the physical host is the failure zone?

Furthermore, I would like to see more scenarios like the one in the yaml file, and have them tested automatically in the CI against the expected check outcome, with pytest.

The config yaml template file can be removed.

@cah-hbaum
Copy link
Contributor Author

The standard states that

At least one control plane instance MUST be run in each "failure zone"

I played around with the test-example.yaml you provided in ipython and created a scenario where all control plane nodes are in the same zone and region. In this case, the script would still exit with 0 and just print

WARNING: There seems to be no distribution across multiple regions or labels aren't set correctly across nodes.
WARNING: There seems to be no distribution across multiple zones or labels aren't set correctly across nodes.
INFO: The master nodes are distributed across 3 host-ids.
INFO: The worker nodes are distributed across 2 regions.

Shouldn't it produce an ERROR? Or do we assume the physical host is the failure zone?

The thing that was a bit problematic while writing the standard was that different CSPs probably have different ideas about their failure zone, especially having small CSPs in mind. So it could very well be possible that the physical hosts are dividing the cluster into failure zones.
Thats why the test doesn't throw an error here.

Personally I think thats perfectly reasonable, but if you have another opinion on that, we can discuss this further. (But that would probably also mean to change the standard again)

Furthermore, I would like to see more scenarios like the one in the yaml file, and have them tested automatically in the CI against the expected check outcome, with pytest.

I'm gonna add the other files I added with some additional examples.

The config yaml template file can be removed.
You're right.

@martinmo
Copy link
Member

The thing that was a bit problematic while writing the standard was that different CSPs probably have different ideas about their failure zone, especially having small CSPs in mind. So it could very well be possible that the physical hosts are dividing the cluster into failure zones.
Thats why the test doesn't throw an error here.

Personally I think thats perfectly reasonable, but if you have another opinion on that, we can discuss this further. (But that would probably also mean to change the standard again)

Thanks for the clarification, I agree 👍

@cah-hbaum
Copy link
Contributor Author

Added some more example files.

@cah-hbaum cah-hbaum requested a review from martinmo April 24, 2024 10:57
Copy link
Member

@martinmo martinmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second review round:

As we discussed today in person, I've committed a pytest script that uses the scenarios in your YAML files and I added a few more (in my opinion) interesting scenarios.

One test currently fails (intentionally) because the conformance test issues a warning twice if the host-id label is missing where I think it should only do that once (that's one of the new scenarios I added). What do you think?

Furthermore, I propose to replace all occurences of master with control-plane (or just control), because this fits better with the current terminology in Kubernetes (master is outdated).

Small typo, which is not part of the diff: in line 149, it should read "the label … doesn't …" instead of "… don't …".

@cah-hbaum
Copy link
Contributor Author

cah-hbaum commented Apr 25, 2024

One test currently fails (intentionally) because the conformance test issues a warning twice if the host-id label is missing
where I think it should only do that once (that's one of the new scenarios I added). What do you think?

Agree with that, that duplication shouldn't happen. I removed that in the coming commit.

I also did a general update to test test script in order to enforce the standard requirement for labels to be provided (MUST). And I made a small change to your test to reflect this.

@cah-hbaum cah-hbaum requested a review from mbuechse April 30, 2024 10:14
@cah-hbaum cah-hbaum force-pushed the 489-small-update-to-scs-0214-test branch from a2a432a to 976a078 Compare April 30, 2024 10:14
Copy link
Contributor

@mbuechse mbuechse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one minor PEP-8 thing.

Copy link
Contributor

@mbuechse mbuechse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, I would like to request two changes:

  • rename the script under test so that it can be imported directly
  • put the test data into a single yaml file, load that once, and hand out the sections using a fixture

@cah-hbaum
Copy link
Contributor Author

Ah sorry, I would like to request two changes:

  • rename the script under test so that it can be imported directly
  • put the test data into a single yaml file, load that once, and hand out the sections using a fixture

c60f895 should have the requested changes.

@cah-hbaum cah-hbaum requested a review from mbuechse May 22, 2024 07:12
@mbuechse
Copy link
Contributor

@cah-hbaum Thanks! Sorry for the long delay on my end. I think you need to update https://github.com/SovereignCloudStack/standards/blob/main/Tests/scs-compatible-kaas.yaml as well, no? Apart from that, I would prefer if the yaml file were loaded only once, which should be possible with pytest, if you use fixtures, or if you just do it on the module level. But it's not a deal breaker.

@cah-hbaum
Copy link
Contributor Author

@cah-hbaum Thanks! Sorry for the long delay on my end. I think you need to update https://github.com/SovereignCloudStack/standards/blob/main/Tests/scs-compatible-kaas.yaml as well, no? Apart from that, I would prefer if the yaml file were loaded only once, which should be possible with pytest, if you use fixtures, or if you just do it on the module level. But it's not a deal breaker.

I updated the scs-compatible-kaas.yaml and introduced fixtures (I hope thats how you meant it).

Copy link
Contributor

@mbuechse mbuechse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ;)

@cah-hbaum
Copy link
Contributor Author

Waiting on #524, will be merged afterwards.

@cah-hbaum cah-hbaum force-pushed the 489-small-update-to-scs-0214-test branch from 8630c60 to e64a2fb Compare June 5, 2024 11:44
@cah-hbaum cah-hbaum force-pushed the 489-small-update-to-scs-0214-test branch from e64a2fb to 3954816 Compare June 17, 2024 09:15
cah-hbaum and others added 10 commits June 17, 2024 14:01
- added a new input that enables testing the test by providing yaml files containing label information for different nodes
- added "test-example.yaml" containing an example for such a test file
- removed the internal config, since the normal config file is already provided and read in as a default

Signed-off-by: Hannes Baum <[email protected]>
Some dates fixing some problems mentioned by @martinmo.

Signed-off-by: Hannes Baum <[email protected]>
Adds additional test files and removes previously deprecated config (template) files.

Signed-off-by: Hannes Baum <[email protected]>
Signed-off-by: Martin Morgenstern <[email protected]>
Signed-off-by: Martin Morgenstern <[email protected]>
Some fixes and updates in order to be compliant with the testdata.
Thanks to @martinmo.

Signed-off-by: Hannes Baum <[email protected]>
Small pep-8 changes.

Signed-off-by: Hannes Baum <[email protected]>
@mbuechse had some change requests, that are tackled with this commit.

Signed-off-by: Hannes Baum <[email protected]>
@cah-hbaum cah-hbaum force-pushed the 489-small-update-to-scs-0214-test branch from 3954816 to 3eb16be Compare June 17, 2024 12:02
@cah-hbaum cah-hbaum merged commit a04755e into main Jun 17, 2024
@cah-hbaum cah-hbaum deleted the 489-small-update-to-scs-0214-test branch June 17, 2024 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Container Issues or pull requests relevant for Team 2: Container Infra and Tooling SCS is continuously built and tested SCS is continuously built and tested in order to raise velocity and quality SCS is standardized SCS is standardized SCS-VP10 Related to tender lot SCS-VP10 standards Issues / ADR / pull requests relevant for standardization & certification

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Small updates to conformance test for scs-0214-v1

3 participants