-
Notifications
You must be signed in to change notification settings - Fork 27
INTPYTHON-729: (PoC) Improve flexibility and QOL of Atlas/Vector Search Index Configurations #370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances MongoDB Atlas search index functionality by adding field mapping capabilities and index status monitoring. The changes allow developers to specify custom field mappings for search indexes and ensure proper synchronization during index operations.
Key changes:
- Added
field_mappings
parameter toSearchIndex
to allow custom Atlas Search field configurations - Introduced index status monitoring functions to wait for index creation/deletion completion
- Added
DynamicSearchIndex
class for dynamic field mapping scenarios
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
django_mongodb_backend/schema.py | Added index status monitoring functions and integrated them into index operations |
django_mongodb_backend/indexes.py | Enhanced SearchIndex with field_mappings support and added DynamicSearchIndex class |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
) | ||
self.field_mappings = field_mappings or {} | ||
|
||
fields = list({*fields, *self.field_mappings.keys()}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using set unpacking with {*fields, *self.field_mappings.keys()}
may not preserve the original order of fields. Consider using list(dict.fromkeys(list(fields) + list(self.field_mappings.keys())))
to maintain order while removing duplicates.
fields = list({*fields, *self.field_mappings.keys()}) | |
fields = list(dict.fromkeys(list(fields) + list(self.field_mappings.keys()))) |
Copilot uses AI. Check for mistakes.
fields_mappings
to get added to SearchIndexModel
configurationsThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imagined the index type API as subclasses like AutocompleteSearchIndex
but I guess that's not flexible enough if an index has multiple fields with different types.
field_path = column_prefix + model._meta.get_field(field_name).column | ||
fields[field_path] = {"type": type_} | ||
if field_name in self.field_mappings: | ||
fields[field_path] = self.field_mappings[field_name].copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why copy()
?
super().__init__(fields=fields, name=name) | ||
|
||
def deconstruct(self): | ||
path, args, kwargs = super().deconstruct() | ||
kwargs["field_mappings"] = self.field_mappings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if self.field_mappings != {}:
(or None check), field_mappings or {}
in __init__()
is expedient, but unsure if it's the cleanest.
if field_name in self.field_mappings: | ||
fields[field_path] = self.field_mappings[field_name].copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is field_mappings
really supposed to contain the entire mapping? (e.g. "type" too). I'd think it would be more likely to be interpreted as "extra options to add to the field".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, type
in the Atlas Search Field Mapping refers to the Atlas Search Field Type. We infer type from our fields, but, for instance, strings can be interpreted as four different types:
- string (we infer)
- token
- stringFacet
- autocomplete
class DynamicSearchIndex(SearchIndex): | ||
suffix = "dsix" | ||
_error_id_prefix = "django_mongodb_backend.indexes.DynamicSearchIndex" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I can override the __init__()
and have it be something like this:
class DynamicSearchIndex(SearchIndex):
def __init__(...):
super().__init__(fields=("id"), name=name, field_mappings=field_mappings)
Overall design should be properly bike-shed either way.
Summary
Defining Field Mappings for Atlas Search and Vector Search indexes can get complicated. Our initial
SearchIndex
andVectorSearchIndex
solutions provide reasonable defaults for categorized fields -- however for the typical MongoDB poweruser, there may be more nuanced indexes they may want to use. This PR introduces an avenue to provide more custom field mappings on a field.Key changes
field_mappings
parameter to SearchIndex to allow custom Atlas Search field configurationsTest Plan
Screenshots
Image of a customized field_mapping added in a migration

It's representation on MongoDB Compass

Checklist
Checklist for Author
Checklist for Reviewer
TODOS
have JIRA tickets?Additional Considerations
SearchIndex
constructed without anything provided infields
fields
andfields_mapping