Commit b3f1a71

Update and fill out documentation (#13)
* Update documentation to match the current state of code
* Add documentation for search tools, more doc cleanup.
1 parent 549c0f7 commit b3f1a71

7 files changed: +238 -54 lines changed

docs/configuration.rst

Lines changed: 23 additions & 10 deletions
Original file line number | Diff line number | Diff line change
@@ -7,13 +7,15 @@ Configuring Code Annotations is a pretty simple affair. Here is an example showi
77
88
source_path: /path/to/be/searched/
99
report_path: /path/to/write/report/to/
10+
safelist_path: .annotation_safe_list.yml
11+
coverage_target: 100.0
1012
annotations:
11-
name_of_annotation: ".. annotation_token::"
12-
another_annotation: ".. annotation_token2::"
13-
choice_annotation:
13+
".. annotation_token::":
14+
".. annotation_token2::":
15+
".. choice_annotation::":
1416
choices: [choice_1, choice_2, choice_3]
1517
name_of_annotation_group:
16-
- ".. first_group_token::"
18+
- ".. first_group_token::":
1719
- ".. second_group_token::":
1820
choices: [choice_4, choice_5]
1921
- ".. third_group_token::":
@@ -32,12 +34,23 @@ Configuring Code Annotations is a pretty simple affair. Here is an example showi
3234
``report_path``
3335
The directory where the YAML report file will be written. If it does not exist, it will be created.
3436

37+
``safelist_path``
38+
The path to a safelist, used by the Django Search tool to find annotations in models that are defined outside of
39+
the local source tree. See :doc:`safelist` for more information.
40+
41+
``coverage_target``
42+
A number from 0 - 100 that represents the percentage of Django models in the project that should have annotations.
43+
The Django Search tool will fail when run with the ``--coverage`` option if the covered percentage is below this
44+
number. See :doc:`django_coverage` for more information.
45+
3546
``annotations``
3647
The definition of annotations to be searched for. There are two types of annotations.
3748

38-
- Basic, or comment, annotations such as ``name_of_annotation`` and ``another_annotation`` above, allow for
39-
free-form text following the annotation itself. At this time the comment must be all on one line to be included
40-
in the report. Multi-line annotation comments are not yet supported.
49+
- Basic, or comment, annotations such as ``annotation_token`` and ``first_group_token`` above, allow for
50+
free-form text following the annotation itself. Note the colon after the annotation token! In configuration this
51+
type is a mapping type, mapping to a null value.
52+
53+
Note: At this time the comment must be all on one line. Multi-line annotation comments are not yet supported.
4154

4255
- Choice annotations, such as ``choice_annotation``, ``second_group_token`` and ``third_group_token``, limit the
4356
potential values of the annotation to the ones listed in ``choices``. This can help enforce consistency across the
@@ -46,9 +59,9 @@ Configuring Code Annotations is a pretty simple affair. Here is an example showi
4659

4760
In addition to the two types of annotations, it is also possible to group several annotations together into a fixed
4861
structure. In our example ``name_of_annotation_group`` is a group consisting of 3 annotations. When grouped, all
49-
of the annotations in the group **must** be present, or linting will fail. The order of the grouping does not matter
50-
except that the first annotation in the group **must** come first in the comments. Only one of each annotation is
51-
allowed in a group. See :doc:`writing_annotations` for more information and examples.
62+
of the annotations in the group **must** be present, or linting will fail. The order of the grouping does not
63+
matter as long as all of them are found before any other annotations. Only one of each annotation is allowed in a
64+
group. See :doc:`writing_annotations` for more information and examples.
5265

5366
``extensions``
5467
Code Annotations uses Stevedore extensions to extend the capability of finding new language comments. Language

docs/django_search.rst

Lines changed: 122 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,122 @@
1+
Django Model Search Tool
2+
------------------------
3+
4+
code_annotations django_find_annotations::
5+
Usage: code_annotations django_find_annotations [OPTIONS]
6+
7+
Subcommand for dealing with annotations in Django models.
8+
9+
--config_file FILE Path to the configuration file
10+
--seed_safelist
11+
Generate an initial safelist file based on
12+
the current Django environment. [default:
13+
False]
14+
15+
--list_local_models
16+
List all locally defined models (in the
17+
current repo) that require annotations.
18+
[default: False]
19+
20+
--report_path TEXT Location to write the report
21+
-v Verbosity level (-v through -vvv)
22+
--lint Enable or disable linting checks [default:
23+
False]
24+
--report Enable or disable writing the report
25+
[default: False]
26+
--coverage Enable or disable coverage checks [default:
27+
False]
28+
--help Show this message and exit.
29+
30+
31+
Overview
32+
========
33+
The Django Model Search Tool, or Django Tool, is written to provide more structured searching and validation in a place
34+
where data is often stored. Since all of the models in a package can be enumerated it is possible, though not required,
35+
to use this tool to positively assert that **all** concrete (non-proxy, non-abstract) models in a project are annotated
36+
in some way. If you do not need this functionality and simply want to find annotations and create a report, the static
37+
search tool is much easier to configure and can search all of your code (instead of just model docstrings).
38+
39+
.. important::
40+
To use the Django tool you must first set the ``DJANGO_SETTINGS_MODULE`` environment variable to point to
41+
a valid settings file. The tool will initialize Django and use its introspection to find models. The settings file
42+
should have ``INSTALLED_APPS`` configured for all Django apps that you wish to have annotated. See the
43+
`Django Docs`_ for details.
44+
45+
.. _Django Docs: https://docs.djangoproject.com/en/dev/topics/settings/#designating-the-settings
46+
47+
The edX use case that prompted the creation of this tool is evident in many of our tests and code samples: we need to
48+
track the storage, use, and retirement of personally identifiable information (PII) across our many projects
49+
and repositories. Since the majority of our information is stored via Django models, this tool helps us make sure that
50+
at least all of those are annotated to assert whether they contain PII or not.
51+
52+
The tool works by actually running your Django app or project in a development-like environment. It then uses Django's
53+
introspection tools to find all installed apps and enumerate their models. Each model further enumerates its inheritance
54+
tree and all model docstrings are checked for annotations. All annotations in all models and their ancestors are
55+
added to the list.
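
The inheritance walk described above can be sketched in plain Python. This is a minimal illustration, not the tool's actual implementation: the ``collect_annotations`` helper, the regular expression, and the two sample classes are all hypothetical, and no Django is required to run it.

```python
import re

# Hypothetical token pattern: matches tokens such as ".. pii::" at the
# start of a docstring line, followed by optional free-form text.
ANNOTATION_RE = re.compile(r"^[ \t]*(\.\. \w+::)[ \t]*(.*)$", re.MULTILINE)


def collect_annotations(model_class):
    """Return (class name, token, data) tuples for a class and its ancestors."""
    found = []
    for klass in model_class.__mro__:
        # Read only the docstring defined on this class itself.
        docstring = klass.__dict__.get("__doc__") or ""
        for token, data in ANNOTATION_RE.findall(docstring):
            found.append((klass.__name__, token, data.strip()))
    return found


class BaseProfile:
    """
    .. pii:: Stores an email address
    """


class StudentProfile(BaseProfile):
    """
    .. pii_types:: other
    """


annotations = collect_annotations(StudentProfile)
```

Here ``annotations`` contains the child's own annotation followed by the one inherited from ``BaseProfile``, mirroring the statement that annotations from all ancestors are added to the list.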
56+
57+
The Safelist
58+
============
59+
In order to assert that **all** concrete models in a project are annotated, it is also necessary to be able to annotate
60+
models that are otherwise installed in the Python virtual environment and are not part of your source tree. Models in
61+
your source tree are called "local models", and ones otherwise installed in the Python environment are "non-local"
62+
models. In order to annotate non-local models, which may come from other repositories or PyPI packages, use the
63+
"safelist" feature.
64+
65+
"Safe" in safelist doesn't mean that the models themselves do not require annotation, but rather it gives developers a
66+
place to annotate those models and put them in a known state. When setting up a repository to use the Django tool, you
67+
should use the ``--seed_safelist`` option to generate an initial safelist template that contains empty entries for all
68+
non-local models. In order for those models to count as "covered", you must add annotations to them in the safelist.
69+
70+
A freshly created safelist:
71+
72+
.. code-block:: yaml
73+
74+
social_django.Association: {}
75+
social_django.Code: {}
76+
77+
And one that has been annotated:
78+
79+
.. code-block:: yaml
80+
81+
social_django.Association:
82+
".. no_pii::": "This model has no PII"
83+
social_django.Code:
84+
".. pii::": "Email address"
85+
".. pii_types::": other
86+
".. pii_retirement::": local_api
87+
88+
.. note::
89+
Each model can only have one annotation for each token type. For example, it would be invalid to add a
90+
second ``.. no_pii::`` annotation to ``social_django.Association``.
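
As a rough sketch of the rule that empty safelist entries do not count as covered, the snippet below assumes the safelist has already been parsed into a Python dict (for example by a YAML parser). The ``uncovered_safelist_models`` helper is hypothetical and is not part of the tool.

```python
def uncovered_safelist_models(safelist):
    """Return the ids of safelisted models that still have no annotations."""
    return sorted(
        model_id for model_id, annotations in safelist.items() if not annotations
    )


# A safelist in which only one of the two models has been annotated.
safelist = {
    "social_django.Association": {".. no_pii::": "This model has no PII"},
    "social_django.Code": {},
}

remaining = uncovered_safelist_models(safelist)
```

With this input, ``remaining`` lists only ``social_django.Code``, the entry that is still an empty mapping.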
91+
92+
.. important::
93+
Some types of "local" models are procedurally generated and do not have files in code, e.g. models created by
94+
django-simple-history. In those unusual circumstances you can choose to annotate them in the safelist to make
95+
sure they are covered.
96+
97+
Coverage
98+
========
99+
The second unique part of the Django tool is the model coverage report and check. Since we are able to find all models
100+
in a project with a reasonable degree of accuracy we can target a percentage of them that must be annotated. When you
101+
run the tool with the ``--coverage`` option it will compare the percentage of annotated models against the configuration
102+
variable ``coverage_target``. If the ``coverage_target`` is not met the search will fail and a list of the un-annotated
103+
models will be displayed.
104+
105+
Having annotations at any level of a model's inheritance will result in that model being considered "covered".
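
The coverage comparison can be sketched as follows. This is an illustrative calculation under stated assumptions, not the tool's code: ``check_coverage`` is a hypothetical helper, the model ids are made up, and treating a project with no models as fully covered is an assumption of this sketch.

```python
def check_coverage(annotated_models, all_models, coverage_target):
    """Compare the share of annotated models against a target percentage.

    Returns (passed, percent_covered, uncovered_model_ids).
    """
    all_models = set(all_models)
    covered = set(annotated_models) & all_models
    uncovered = sorted(all_models - covered)
    # Assumption: a project with no models counts as 100% covered.
    percent = 100.0 * len(covered) / len(all_models) if all_models else 100.0
    return percent >= coverage_target, percent, uncovered


passed, percent, uncovered = check_coverage(
    annotated_models={"app.ModelA"},
    all_models={"app.ModelA", "app.ModelB"},
    coverage_target=50.0,
)
```

Raising ``coverage_target`` to ``100.0`` with the same inputs would make the check fail and report ``app.ModelB`` as un-annotated.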
106+
107+
Lint and Report
108+
===============
109+
This tool supports the same ``--lint`` and ``--report`` options as the :doc:`static_search` tool, and
110+
they are functionally the same. Linting will fail on malformed annotations found in model docstrings, such as bad
111+
choices or incomplete groups. Reporting will write out a report file in the same format as the Static Tool, but with
112+
some additional information in the ``extra`` key such as the ``model_id``, which is a string in the format of
113+
"parentApp.ModelClassName", as Django uses to represent models internally. It also has the full model docstring in
114+
``full_comment``.
115+
116+
If a model inherits from another model that has annotations, those annotations will be included in the report under the
117+
child model's name, as well as any annotations in the model itself.
118+
119+
Local Models
120+
============
121+
Finally, to help find models in the local source tree that still need to be annotated, the tool has a
122+
``--list_local_models`` option. This will output the model id of all models that still need to be annotated.

docs/extensions.rst

Lines changed: 5 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,8 @@ Extensions
33

44
Code Annotations uses `Stevedore`_ to allow new languages to be statically searched in an easily extensible fashion. All
55
language searches, even the ones that come by default, are implemented as extensions. A language extension is
6-
responsible for finding all comments in files of the given type.
6+
responsible for finding all comments in files of the given type. Note that extensions are only used in the Static Search
7+
and not in Django Search, as Django models are obviously all written in Python.
78

89
.. _Stevedore: https://docs.openstack.org/stevedore/latest/
910

@@ -13,20 +14,13 @@ be fully functional. This is how the Javascript and Python extensions work, see
1314

1415
If a language has more than one single-line or multi-line comment type you may need to work at the lower level and
1516
inherit from ``AnnotationExtension``. ``SimpleRegexAnnotationExtension`` inherits from ``AnnotationExtension`` and
16-
serve as an example.
17+
serves as an example.
1718

1819
When inheriting from ``AnnotationExtension`` you must override:
1920

2021
``extension_name`` - A unique name for your extension, usually the name of the language it supports. This must match the
2122
name given in ``setup.py`` or ``setup.cfg`` (see below).
2223

23-
``_add_annotation_token`` - On construction this will be called once for each single annotation that is configured,
24-
allowing you to do any setup necessary to find these tokens in your search.
25-
26-
``_add_annotation_group`` - On construction this will be called once for each annotation group that is configured,
27-
allowing you to do any setup necessary to find these tokens in your search. Note that annotations in a group are
28-
*not* also sent to ``_add_annotation_token``, though you can do that yourself.
29-
3024
``search`` - Called to search for all annotations in a given file. Takes an open file handle, returns a list of dicts.
3125
Extensions do not need to worry about linting groups or choices, just returning all found annotations in the order
3226
they were discovered in the file.
@@ -40,7 +34,8 @@ When inheriting from ``AnnotationExtension`` you must override:
4034
'filename': name of the file passed in (available from file_handle.name),
4135
'line_number': line number of the beginning of the comment,
4236
'annotation_token': the annotation token,
43-
'annotation_data': the rest of the text after the annotation token (choices do not need to be split out here)
37+
'annotation_data': the rest of the text after the annotation token (choices do not need to be split out here),
38+
'extra': a dict containing any additional information your extension would like to include in the report
4439
}
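
To make the shape of the returned list concrete, here is a minimal standalone sketch of a search routine for ``#`` comments. It does not inherit from ``AnnotationExtension`` and is not the library's implementation: the function signature, the regular expression, and the ``python_sketch`` name are hypothetical, and the ``found_by`` key is taken from the report format described for the Static Search tool.

```python
import io
import re

# Hypothetical pattern for "#" comments that carry an annotation token.
COMMENT_ANNOTATION_RE = re.compile(r"#\s*(\.\. \w+::)[ \t]*(.*)$")


def search(file_handle, extension_name="python_sketch"):
    """Return one dict per annotation, in the order found in the file."""
    results = []
    for line_number, line in enumerate(file_handle, start=1):
        match = COMMENT_ANNOTATION_RE.search(line.rstrip("\n"))
        if match:
            results.append({
                "found_by": extension_name,
                # In-memory handles have no name, so fall back to a placeholder.
                "filename": getattr(file_handle, "name", "<unknown>"),
                "line_number": line_number,
                "annotation_token": match.group(1),
                "annotation_data": match.group(2).strip(),
                "extra": {},
            })
    return results


source = io.StringIO("x = 1\n# .. no_pii:: This model contains no PII.\n")
found = search(source)
```

Note that this sketch returns the raw text after the token and does no linting of groups or choices, in line with the division of labour described above.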
4540
4641
In order to test your extension you will need to install it into your Python environment or virtualenv. First you must
@@ -58,4 +53,3 @@ define it as an entry point in your setup.py (or setup.cfg). The entry point nam
5853
Then you can simply ``pip install -e .`` from your project directory. If all goes well you should see your extension
5954
being loaded when you run the static annotation tool with the ``-vv`` or ``-vvv`` option. For your extension to work you
6055
will also need to add it to the ``extensions`` section of your configuration file.
61-

docs/getting_started.rst

Lines changed: 19 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -24,23 +24,29 @@ following is an example of a minimal configuration file. See ``.annotations_samp
2424

2525
.. code-block:: yaml
2626
27-
# Path that you wish to search, can be passed on the command line.
28-
# Directories will be searched recursively, can also point to a single file.
29-
source_path: ../path/to/be/searched/
27+
# Path that you wish to search statically, can be passed on the command line
28+
# Directories will be searched recursively, but this can also point to a single file
29+
source_path: ../
3030
31-
# Directory to write the report to, can be passed on the command line.
32-
report_path: /path/to/write/report/to/
31+
# Directory to write the report to, can be passed on the command line
32+
report_path: reports
3333
34-
# Definitions of the annotations to search for.
34+
# Path to the Django annotation safelist file
35+
safelist_path: .annotation_safe_list.yml
36+
37+
# Percentage of Django models which must have annotations in order to pass coverage checking
38+
coverage_target: 50.0
39+
40+
# Definitions of the annotations to search for. Notice the trailing colon: this is a mapping type!
41+
# For more information see "Writing Annotations"
3542
annotations:
36-
name-of-annotation: ".. annotation_token::"
43+
".. annotation_token::":
3744
3845
# Code Annotations extensions to load and the file extensions to map them to
3946
extensions:
4047
python:
4148
- py
42-
javascript:
43-
- js
49+
4450
4551
Create some annotations
4652
-----------------------
@@ -79,10 +85,10 @@ your favorite text editor to make sure all of your annotations were found. Diffe
7985
this command, try ``-v``, ``-vv``, and ``-vvv`` to assist in debugging. ``--help`` will provide information on all of
8086
the available options.
8187

82-
By default the annotation search will perform linting, which makes sure that any found annotations match the structure
83-
listed in configuration. If any issues are found the command will fail with no report written, otherwise a YAML file
84-
containing the results of the search will be written to your ``report_path``. Both linting and reporting features can be
85-
turned off via command line flags.
88+
By default the static annotation search will perform linting, which makes sure that any found annotations match the
89+
structure listed in configuration. If any issues are found the command will fail with no report written, otherwise a
90+
YAML file containing the results of the search will be written to your ``report_path``. Both linting and reporting
91+
features can be turned off via command line flags.
8692

8793
Add more structure to your annotations
8894
--------------------------------------

docs/index.rst

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,8 @@ Contents:
1515
readme
1616
getting_started
1717
writing_annotations
18+
static_search
19+
django_search
1820
configuration
1921
extensions
2022
testing

docs/static_search.rst

Lines changed: 57 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,57 @@
1+
Static Search Tool
2+
------------------
3+
4+
code_annotations static_find_annotations::
5+
Usage: code_annotations static_find_annotations [OPTIONS]
6+
7+
Subcommand to find annotations via static file analysis.
8+
9+
Options:
10+
--config_file FILE Path to the configuration file
11+
--source_path PATH Location of the source code to search
12+
--report_path TEXT Location to write the report
13+
-v, --verbosity Verbosity level (-v through -vvv)
14+
--lint Enable or disable linting checks [default: True]
15+
--report Enable or disable writing the report file [default: True]
16+
--help Show this message and exit.
17+
18+
Overview
19+
========
20+
The Static Search Tool, or Static Tool, is written as an extensible way to find annotations in code. The tool performs
21+
static analysis on the files themselves instead of relying on the language's runtime and introspection. It
22+
will optionally write a report file in YAML, and optionally check for annotation validity (linting).
23+
24+
Linting
25+
=======
26+
When passed the ``--lint`` option, each annotation will be checked for the following:
27+
28+
- Choice annotations must have one or more of the configured choices
29+
- Groups must have their annotations occur consecutively, though their order doesn't matter
30+
31+
If any of these checks fails, all errors will be printed and the return code of the command will be non-zero. If the
32+
``--report`` option was also provided, no report will be written.
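
The two checks listed above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the tool's linter: ``lint_annotations`` is a hypothetical helper, the allowed choice values are invented for the example, and splitting choices on commas and whitespace is an assumption of this sketch.

```python
def lint_annotations(found, choices, groups):
    """Lint (token, data) pairs given in file order; return error strings."""
    errors = []

    # Check 1: choice annotations must use one or more configured choices.
    for token, data in found:
        if token in choices:
            # Assumption: choices are separated by commas and/or whitespace.
            picked = data.replace(",", " ").split()
            bad = [c for c in picked if c not in choices[token]]
            if not picked or bad:
                errors.append(f"{token} has invalid or missing choices: {data!r}")

    # Check 2: group members must appear consecutively, in any order.
    token_to_group = {t: name for name, tokens in groups.items() for t in tokens}
    i = 0
    while i < len(found):
        name = token_to_group.get(found[i][0])
        if name is None:
            i += 1
            continue
        expected = groups[name]
        window = [token for token, _ in found[i:i + len(expected)]]
        if sorted(window) == sorted(expected):
            i += len(expected)
            continue
        errors.append(f"group {name!r} is incomplete or interrupted")
        # Skip the rest of this run so the same group is reported only once.
        while i < len(found) and token_to_group.get(found[i][0]) == name:
            i += 1
    return errors


choices = {".. pii_types::": ["id", "name", "other"]}  # hypothetical values
groups = {"pii_group": [".. pii::", ".. pii_types::", ".. pii_retirement::"]}

good = [
    (".. pii_types::", "other"),
    (".. pii::", "Email address"),
    (".. pii_retirement::", "local_api"),
]
bad = [(".. pii::", "Email address"), (".. pii_types::", "bogus")]
```

Calling ``lint_annotations(good, choices, groups)`` returns an empty list, while the ``bad`` input yields one error for the unknown choice and one for the incomplete group.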
33+
34+
Reporting
35+
=========
36+
The YAML report is the main output of the Static Tool. It is a simple YAML document that contains a list of found
37+
annotations, grouped by file. Each annotation entry has the following keys:
38+
39+
.. code-block:: yaml
40+
41+
{
42+
'found_by': 'python', # The name of the extension which found the annotation
43+
        'filename': 'foo/bar/file.py', # The filename where the annotation was found
44+
'line_number': 101, # The line number of the beginning of the comment which contained the annotation
45+
'annotation_token': '.. no_pii::', # The annotation token found
46+
'annotation_data': 'This model contains no PII.', # The comment, or choices, found with the annotation token
47+
}
48+
49+
Extensions can also send back some additional data in an ``extra`` key, if desired. The Django Model Search Tool does
50+
this to return the Django app and model name.
51+
52+
Extensions
53+
==========
54+
The Static Tool uses Stevedore named extensions to allow for language-specific functionality. Python and Javascript
55+
extensions are included, and many others can be made easily as needed. We will gladly accept pull requests for new languages
56+
or you can release them yourself on PyPI. For more information on extensions, see :doc:`extensions` and
57+
:doc:`configuration`.
