Skip to content

Conversation

Spectre5
Copy link
Contributor

@Spectre5 Spectre5 commented May 25, 2021

Major Note: This PR does NOT start with the ongoing shapely PR --- so whichever one were to get merged second would require a lot of changes to make them compatible. @connorferster please take a look at this PR as we'll be clashing!

A few months ago I started a major PR to re-work a lot of the project tooling, per the various discussions I started in the GitHub Discussions. I wanted to address the packaging, code quality, testing, and documentation. My implementation is mostly done, but I have not had any time in the last 1-2 months to finish the last bits. That said, it is currently in a working state and so I've decided to go ahead and raise this PR now and make those minor updates in future PR's if this is well received. This is partly because I want to get input before I try to carve out any more time to work it --- I've already put in a LOT of time to get it to this point. If it isn't well received, then I'll save myself the time of continuing to work it!

Summary

There are a LOT of changes here and basically every file in the entire repository has been touched and things are re-organized. If you pull this branch, the structure of the directory will be quite different than you're used to. If you just want the quick-start guide, it would be:

  1. Install Poetry
  2. In a shell (something with bash, like git-bash, if on Windows), simply execute ./runner clean -a and then ./runner, which will clean all un-needed files (including .venv directory), and then it executes EVERYTHING (formatting, linting, testing, documentation). This will take awhile the first time it is run.
  3. To view the generated documentation, go into the docs/build directory and open index.html
  4. To test how benchmarking works, you can run ./runner benchmark. Adding benchmarking was one of my big motivations since I have some other future ideas to improve the run-time execution, but I needed a way to objectively say how much anything I change improves the run time.

The High-Level Changes

This is essentially 6 semi-related pull requests in one. I'm merging the "dev" branch from my fork, which consists of 6 branches merged into it on my fork. You can see what changed in each PR by clicking the links below.

  1. Fixes some comments
    Minor change, I simply started with this since I already had a PR open with this change

  2. Updated Figure Method
    Added the plotting context manager (I already had a PR with this change so included it here too)

  3. Code Quality Tools
    Added many tools for code and documentation quality. This PR has a lot of discussion on the tool selection, justification, and alternatives.

    • poetry for package and virutal environment management
    • black for code formatting
    • isort for import sorting
    • flake8 + extensions for code linting
    • pylint for static analysis/linting
    • rstcheck to lint RST files (to remove in the future, doc8 is better)
    • doc8 to lint RST files (added later than this pull request)
    • pre-commit to perform format/linting checks on git commit
    • pytest and coverage for all testing and code coverage
    • custom runner bash script to manage these tools (works with git-bash on Windows too)
  4. GitHub Actions
    Drops travisCI in favor of GitHib Actions. This runs many checks on each PR. It runs the tests on various versions of Python on each of Linux, macOS, and Windows as well as code quality checks and building the documentation. The advantage is enforceing good code quality and passing tests (we still need more tests!), but the disadvantage is that it does take a little while to run since the tests and documentation are somewhat slow. I've used caches in the GitHub actions though, which should speed up the execution on subsequent re-runs of the CI (as long as the lock file doesn't change). As an example, click the "view details" button on this PR to see all of the CI checks that get performed.

  5. Code/Test Benchmarking
    This PR added pytest-benchmark to run benchmarking tests. They are disabled by default, since they run slowly, but can be run using the runner script with ./runner benchmark. This allows us to run the benchmark locally, make changes to the code, re-run the benchmarks and compare the results to see if our changes improve or slow down the code.

  6. Automated Docs
    MAJOR changes to the documentation setup, highlights are listed below

    • Remove all documentation build artifacts --- they don't belong in source control
    • Converted all source code docstrings from sphinx style to numpy style, which are far easier to read and write (this took a painful amount of time to do!)
    • Moved to a src directory layout and moved examples and tests to top level (more common modern layout and allows separate conftest.py files for using pytest)
    • ALL code is executed on documentation build and ALL images are generated at that time too (so all documentation images have been deleted from the repo). I saw that many of the doc examples were out of date --- with this setup, they are always updated since they are always run with the latest code.
      • Uses the excellent sphinx-gallery to run all examples and make a nice gallery view of the examples. It runs them as a notebook and allows for natural dialog between bits of code. See the linked PR for a screenshot.
      • Other code in the RST files are run by either jupyter_sphinx or the matplotlib plot directive, which automatically run the code and insert the resulting text or images right into the doc.
    • By far the biggest downside is that building the docs takes much, much longer now due to running ALL of the docstrings and examples. That said, they get written to the source directory (but excluded from git), so subsequent re-runs will be much quicker as only the changed parts should need to be re-run. I have some ideas for future PRs that can speed this up some too.

Still todo (future PRs)

  • Add more tests, including tests to check stress results
  • In the RST files, I've used some jupyter_sphinx and matplotlib plot directives. I wanted to play with both to see which I like better, but I'm not sure that I've made a decision yet and so there are some uses of both. Eventually we should choose just one to use and probably remove the other, unless there is a good reason to keep both (there are some different features).
  • Recall that the examples are run using sphinx-gallery, which converts them to notebooks to run. You can choose where to break the examples to add dialog (like a notebook!) and I've currently just done this in semi-arbitrary places as a proof of concept. We should probably walk through each example and put the breaks at the most natural place. Removing them all together is an option too, which would just show all of the code in 1 code block.
  • Re-structure the docs. I think we can make it overall a little more natural. Most of the classes are in the API section and also in other subsections which results in a lot of duplication, which seems un-neccessary. In addition, it means that the RST code in those docstrings is run twice when building the documentation, which results in nearly doubling the time to build the docs (the stress plots take a long time and there are many of them and they are all duplicated!). The autoautosummmary in the conf.py file can be used to better group some of the attributes and methods in the documentation, but I haven't had time to play much with it yet.

@Czarified
Copy link
Contributor

Wow! This is big, and all the changes seem great. I'm not knowledgeable enough on the programming side to comment/review most of these changes...

Also, would this require any future dev-work and PR's to use Poetry? I know it's popular, but I'm behind the curve there... Would I need to setup a poetry environment and learn how to use it, if I wanted to expand our testing?

@Spectre5 Spectre5 changed the title Dev Tooling, Testing, and Documentation Updates May 25, 2021
@Spectre5
Copy link
Contributor Author

You would need to use Poetry with this update. Even with minimal knowledge, it is very simple to get it installed and the runner script kind of automated the use of it for basic usage. They have a suggested method to install it on the Poetry website, but I prefer to just use pipx myself:

python3 -m pip install pipx
pipx ensurepath
pipx install poetry

Then you can start a new shell and use Poetry. If you pull this branch, you can just run the runner script as I mentioned, which uses Poetry.

IMPORTANT NOTE - part of the installation with this runner script, it installs a pre-commit hook. You may need to remove it if you switch back to another branch (rm .git/hooks/pre-commit). Alternatively, you could pull this branch into a new/separate directory and then just remove the entire directory.

@Spectre5
Copy link
Contributor Author

See also the contributing guide in the branch (can view it on the web on my fork)

@connorferster
Copy link
Collaborator

@Spectre5 Nice! Thanks for the heads up. I look forward to reviewing.

@Spectre5
Copy link
Contributor Author

I don't suppose anyone has had a chance to look at or play with this yet? :)

I'd like to make some other changes and improvements at some point, but want this merged first

@Czarified
Copy link
Contributor

I was honestly hoping we could get Connor's PR merged in, and release a v2.0 first. I think Robbie is occupied elsewhere right now, though. He's the only one that can actually merge and push releases. For this PR, I've still got a lot of learning to do. I'll try to pull it down this week and look at all your new features, but it'll definitely be a learning curve. 😄

@connorferster
Copy link
Collaborator

connorferster commented Jul 26, 2021 via email

@Czarified
Copy link
Contributor

Hi @Spectre5 , I'm still learning poetry and I'm on Windows. I didn't have git-bash installed, so I wanted to just run the poetry commands for your venv in run_install() from the runner script. It didn't seem to create a new virtual environment, it just used the one I had already made. I ran the 3 commands listed in the run_install function, in order.

I basically had a blank venv with only pip, setuptools, and poetry installed. So it's no big deal, but your descriptions seem to indicate it would create a new one. Am I missing something here?

Hopefully there will be more meaningful feedback from me in the near future. 😄

@Spectre5
Copy link
Contributor Author

Hi @Czarified - If there is already a virtual environment in the current directory (the project directory), then poetry will use that. The "normal" way to install it is in some global location (i.e. using pipx or the recommended installation script on their GitHub page). In this case, running it will make a new virtual environment in the project directory (if local venv is turned on globally or in the project settings for it).

@Spectre5 Spectre5 closed this May 20, 2022
@Spectre5 Spectre5 deleted the dev branch October 14, 2023 04:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants