Skip to content

Conversation

@treefern
Copy link
Collaborator

This PR improves content validity / consistency tests in SP3 reading, tidies up unit tests, and addresses some Pandas deprecation and performance warnings.

  • Checks for SP3 column alignment (looks for spaces in columns which the (SP3d) spec declares as unused)
  • Revised strict_mode logic and structure of options, including optional overrides for specific use cases (e.g. comments, duplicate epochs, content vs header discrepancy).
  • Various unit test clean ups, including:
    • Using bytes input rather than mock files for input in many SP3 tests (simpler).
    • Fixes for checking values of raised exceptions (some checks were previously unreachable and have been moved outside the assertRaises() ContextManager).
    • Quieter output, by turning off strict_mode for things a test is not trying to actively check.
  • Pandas deprecation fixes including .view() -> .astype()
  • Performance warning fixes, by sorting DataFrame and by specifying a level when dropping.

@treefern treefern requested a review from ronaldmaj July 11, 2025 07:40
@treefern
Copy link
Collaborator Author

Previous test issue fixed.
A few other improvements added including:

  • more backwards compatible defaults for filename property extraction
  • more reliable discrepancy check
  • quieter error logging for overlong SP3 content lines
  • better docstrings

ronaldmaj
ronaldmaj previously approved these changes Oct 8, 2025
Copy link
Collaborator

@ronaldmaj ronaldmaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor edits suggested here and a couple comments. I think it's basically good enough to go in as-is, but if you wanted to address the edits, I can re-approve again.

import io as _io
import os as _os
import re as _re
from typing import Callable, Literal, Mapping, Optional, Union, List, Tuple, overload
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need List or Tuple to be imported?
I guess for compatibility right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. These were deprecated in Python3.9, so we can clean them up.
https://docs.python.org/3/library/typing.html#typing.List

This PEP gives broader info: https://peps.python.org/pep-0585/

Copy link
Collaborator Author

@treefern treefern Oct 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of these across the codebase, so I'll put them in a separate PR to keep things tidy.
New PR here: #97


# Columns in SP3 we expect (by SP3 d spec) to be unused (contain a space).
# Deviation from this can be used to detect column misalignment
_SP3_UNUSED_COLUMN_INDEXES_EPOCH_HEADER: List[int] = [3, 8, 11, 14, 17, 20]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The List[int] could be list[int] but I assume for back compatibility we still need the one from Typing right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed since Python3.9 (thanks to PEP 585). Will be addressed in a subsequent PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixes for new code implemented. Fixes for the rest of the code base are in #97 (to be merged after this PR goes in)

@treefern treefern self-assigned this Oct 9, 2025
@treefern treefern merged commit be77e02 into main Oct 10, 2025
4 checks passed
@treefern treefern deleted the NPI-4067-check-sp3-column-alignment branch October 10, 2025 02:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants