[Discussion] Tag Usage & Tagging Strategy #73

@BethanyG

Please review this PR comment as background: #69 (comment)


The TL;DR

We are going to be tagging literally everything, and now is probably the time to have a discussion about our thoughts and architecture around it.

Some Background

  • We've decided to use a Django add-on library called taggit to manage tagging for our backend API endpoints. We currently use the default configuration.

  • Taggit's default configuration uses Django's contenttypes framework and the Django ORM's generic relations to model a many-to-many relationship between the items being tagged and the tags.

  • We can change this behavior to be more direct and more explicit by using what is called a through model, but this requires tracking tags for, say, Resources in a resources_tags table and tags for Hangouts in a hangouts_tags table.
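To make the two layouts concrete, here is a rough sketch using Python's built-in sqlite3 module rather than Django itself. The generic tables approximate what taggit and contenttypes create (simplified columns); resources_tags is the hypothetical through table from the bullet above, and its column names are an assumption, not something we have defined.

```python
import sqlite3

SCHEMA = """
CREATE TABLE resources_resource (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE django_content_type (
    id INTEGER PRIMARY KEY, app_label TEXT NOT NULL, model TEXT NOT NULL
);
CREATE TABLE taggit_tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);

-- Generic layout: object_id is a plain integer, not a foreign key.
CREATE TABLE taggit_taggeditem (
    id INTEGER PRIMARY KEY,
    tag_id INTEGER NOT NULL REFERENCES taggit_tag(id),
    content_type_id INTEGER NOT NULL REFERENCES django_content_type(id),
    object_id INTEGER NOT NULL
);

-- Explicit layout: hypothetical through table for Resources only.
CREATE TABLE resources_tags (
    id INTEGER PRIMARY KEY,
    tag_id INTEGER NOT NULL REFERENCES taggit_tag(id),
    content_object_id INTEGER NOT NULL REFERENCES resources_resource(id)
);
"""


def referenced_tables(conn, table):
    """Return the set of tables that `table` points at via foreign keys."""
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    # Column 2 of each PRAGMA row is the name of the referenced table.
    return {row[2] for row in rows}


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

generic_links = referenced_tables(conn, "taggit_taggeditem")
explicit_links = referenced_tables(conn, "resources_tags")
```

In this sketch the generic table only links to the tag and content type tables, while the through table links straight to resources_resource. The cost of the explicit layout is one such table per tagged model.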

What this means

(not an exhaustive list of points, but a start)

  • We can "tag" any model (resources, hangouts, people, discussions, notes) in a relatively straightforward manner by adding a "field" that points to taggit's TaggableManager class. See this for basic usage. The TaggableManager takes care of looking up and saving the appropriate entries into the three associated "general" tables used for tagging.

  • We can use common taggit code to pull associated tags for an item, update those tags, or remove those tags, without having to write additional ORM code.

  • We can use common code to serialize and de-serialize tags for our API.

  • Pulling tags across many items or endpoints could get expensive, due to the generic relations used. See this warning from the docs.

  • It is not easy to trace the relationship between the models being tagged and the tags/taggeditems/contenttypes just by looking at the DB: there are no explicit links from, for example, the resources_resource table to the taggit_taggeditem table.

  • Custom serializer code had to be written to serialize the tags into the format we wanted for Resources. We'll need to reuse that code for any other endpoint that uses tags, and it may still have bugs in it.
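As an illustration of the lookup and traceability points above, here is a small sqlite3 sketch (again not Django, and not a benchmark) of what fetching tags looks like against the generic layout. The table names follow taggit's defaults with simplified columns; the sample rows and the tags_for helper are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE django_content_type (id INTEGER PRIMARY KEY, model TEXT NOT NULL);
CREATE TABLE taggit_tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE taggit_taggeditem (
    id INTEGER PRIMARY KEY,
    tag_id INTEGER NOT NULL REFERENCES taggit_tag(id),
    content_type_id INTEGER NOT NULL REFERENCES django_content_type(id),
    object_id INTEGER NOT NULL
);
INSERT INTO django_content_type VALUES (1, 'resource'), (2, 'hangout');
INSERT INTO taggit_tag VALUES (1, 'python'), (2, 'django'), (3, 'social');
-- Resource 1 and Hangout 1 share object_id = 1 in the same table.
INSERT INTO taggit_taggeditem (tag_id, content_type_id, object_id) VALUES
    (1, 1, 1), (2, 1, 1), (3, 2, 1);
""")


def tags_for(conn, model, object_id):
    """Hypothetical helper: tag names for one object under the generic layout."""
    rows = conn.execute(
        """
        SELECT t.name
        FROM taggit_taggeditem AS ti
        JOIN taggit_tag AS t ON t.id = ti.tag_id
        JOIN django_content_type AS ct ON ct.id = ti.content_type_id
        WHERE ct.model = ? AND ti.object_id = ?
        ORDER BY t.name
        """,
        (model, object_id),
    ).fetchall()
    return [name for (name,) in rows]


resource_tags = tags_for(conn, "resource", 1)
hangout_tags = tags_for(conn, "hangout", 1)
```

Every lookup has to go through the content type to tell a Resource with id 1 apart from a Hangout with id 1, which is the extra hop a per-model through table would avoid. Whether that hop matters at our scale is exactly the open question.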

Some Questions

(again, not exhaustive, but a start)

  • Do we re-configure taggit to use explicit relations for each item being tagged? Pros? Cons?

  • Do we want an explicit Tags endpoint to handle things like translating, filtering, pulling relations, etc.?

  • What makes sense for how we'll be using tags?

  • Because of the queries involved, will we be setting ourselves up for performance issues?

  • Some additional related concerns in the conversation here: Issues 43 & 67 #69 (comment)

  • Some side debate about generic relations here

Labels: needs discussion, question
