-
Notifications
You must be signed in to change notification settings - Fork 6
Refactor merge.py and add tests for it #165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
so it's easier to read what's going on
Add tests for some of the important functions in merge.py. Since I wrote the tests for pytest, I switched to using pytest as the runner for the rest of the tests.
5fbc5c6 to
6d219b4
Compare
QuanMPhm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few small questions before I approve
| if cluster_name is None: | ||
| cluster_name = metrics_from_file.get("cluster_name") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any concern that cluster_name is not being checked? What if the provided files are from different clusters. It seems this behavior has been in the code prior to this refactoring, but wanted to ask just in case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we checked that cluster_name could be different in different files, so this behavior is unchanged. But it doesn't hurt to add that additional check. I'll add that in a different PR.
openshift_metrics/merge.py
Outdated
| gpu_a100sxm4=rates_data.get_value_at( | ||
| cpu=rates_data.get_value_at("CPU SU Rate", report_month, Decimal), # type: ignore | ||
| gpu_a100=rates_data.get_value_at("GPUA100 SU Rate", report_month, Decimal), # type: ignore | ||
| gpu_a100sxm4=rates_data.get_value_at( # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you using a linter additional to the ruff that we use in the CI? I didn't got any pre-commit errors when removing these comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was vscode yelling at me so I put these, but I am going to remove these.
| with open(file, "r") as jsonfile: | ||
| metrics_from_file = json.load(jsonfile) | ||
| cpu_request_metrics = metrics_from_file["cpu_metrics"] | ||
| memory_request_metrics = metrics_from_file["memory_metrics"] | ||
| gpu_request_metrics = metrics_from_file.get("gpu_metrics", None) | ||
| processor.merge_metrics("cpu_request", cpu_request_metrics) | ||
| processor.merge_metrics("memory_request", memory_request_metrics) | ||
| if gpu_request_metrics is not None: | ||
| processor.merge_metrics("gpu_request", gpu_request_metrics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor suggestion, but I think this is more concise:
| with open(file, "r") as jsonfile: | |
| metrics_from_file = json.load(jsonfile) | |
| cpu_request_metrics = metrics_from_file["cpu_metrics"] | |
| memory_request_metrics = metrics_from_file["memory_metrics"] | |
| gpu_request_metrics = metrics_from_file.get("gpu_metrics", None) | |
| processor.merge_metrics("cpu_request", cpu_request_metrics) | |
| processor.merge_metrics("memory_request", memory_request_metrics) | |
| if gpu_request_metrics is not None: | |
| processor.merge_metrics("gpu_request", gpu_request_metrics) | |
| for resource in ["cpu_metrics", "memory_metrics", "gpu_metrics"]: | |
| if resource == "gpu_metrics": | |
| if gpu_request_metrics := metrics_from_file.get(resource): | |
| processor.merge_metrics(resource, gpu_request_metrics) | |
| else: | |
| request_metrics = metrics_from_file[resource] | |
| processor.merge_metrics(resource, request_metrics) |
If cpu_metrics and memory_metrics is always present, is it fine to make the loop even simpler?
for resource in ["cpu_metrics", "memory_metrics", "gpu_metrics"]:
if request_metrics := metrics_from_file.get(resource):
processor.merge_metrics(resource, request_metrics)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uh, while this is a good suggestion, due to the old clumsy naming of things it'll break stuff. See, the files put things in cpu_metrics but the processors call it cpu_request so our neat little loop won't work. I am going to leave this as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry, the strings looked similar and I thought they were the same
| return processor | ||
|
|
||
|
|
||
| def load_metadata(files: List[str]) -> MetricsMetadata: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I could load data and metadata in a single loop instead of loading files twice. And for that I reason I don't like what I've done. I am going to refactor it again later.
This refactors merge.py a bit, by moving some things into functions. Additionally it adds some basic tests and this time I switched to using pytest.
I ended up working on this because I realized I was adding more stuff in https://github.com/CCI-MOC/openshift-usage-scripts/pull/164/files and there were no tests.