Adding multi-column indexes to the DB #243
9ba0b95 to
9282989
Compare
cloutiertyler
left a comment
Generally seems quite straightforward. Please see the comments.
| } | ||
| } | ||
| pub(crate) fn get_fields(&self, row: &ProductValue) -> Result<AlgebraicValue, DBError> { |
This is doing a projection of a ProductValue, right? Shouldn't it also return a ProductValue? Why is it returning an AlgebraicValue?
| pub(crate) cols: NonEmpty<u32>, | ||
| pub(crate) name: String, | ||
| pub(crate) is_unique: bool, | ||
| idx: BTreeSet<IndexKey>, |
Shouldn't the IndexKey have a ProductValue now rather than an AlgebraicValue given that it is indexing over multiple (but at least one) columns?
OK, the main reason is that a single column is the most common case for an index key, while multiple columns are relatively rare.
So, most of the time, the index definitions will look like:
CREATE INDEX a ON t1 (column_i32); -- Most common case index `BuiltIn`
CREATE INDEX b ON t1 (column_i32, column_i32); -- Sometimes we have more complex keys
In memory, the keys will be stored as:
[AlgebraicValue::I32(0)] = Row(ProductValue(...))
[AlgebraicValue::Product(AlgebraicValue::I32(0), AlgebraicValue::I32(1))] = Row(ProductValue(...))
The comment in product_value.rs explains the logic.
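As a rough illustration of the key layout described here, the following is a minimal sketch and not the PR's actual code. The `AlgebraicValue` enum is a cut-down stand-in with only two variants, and `index_key` is a hypothetical helper name: a single indexed column yields the bare value, while several columns are wrapped in a product.

```rust
// Simplified stand-in for the real AlgebraicValue (assumption: only the
// variants needed for this illustration are modelled).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub enum AlgebraicValue {
    I32(i32),
    Product(Vec<AlgebraicValue>),
}

/// Hypothetical helper: build the index key for `row` from the positions
/// of the indexed columns. Returns `None` if `cols` is empty or any
/// position is out of bounds.
pub fn index_key(row: &[AlgebraicValue], cols: &[usize]) -> Option<AlgebraicValue> {
    match cols {
        [] => None,
        // Single column (the common case): store the value itself.
        [col] => row.get(*col).cloned(),
        // Several columns: wrap the projected values in a product.
        _ => cols
            .iter()
            .map(|&c| row.get(c).cloned())
            .collect::<Option<Vec<_>>>()
            .map(AlgebraicValue::Product),
    }
}
```

With this layout the common single-column key avoids the extra wrapper, at the cost of the key type no longer saying how many columns it covers.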
Can this handle the difference between a field with a product value or several fields as a product value? Otherwise we could do something like enum OneOrMore<T> { One(T), More(Vec<T>) } for a tighter representation
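For reference, the `OneOrMore<T>` idea from this comment could look roughly like the sketch below. Only the enum shape comes from the comment; the `from_vec` and `arity` helpers are hypothetical names added for illustration.

```rust
/// The representation suggested in the review: one key part, or several.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub enum OneOrMore<T> {
    One(T),
    More(Vec<T>),
}

impl<T> OneOrMore<T> {
    /// Hypothetical constructor: `None` for empty input, `One` for a single
    /// part, `More` otherwise.
    pub fn from_vec(mut parts: Vec<T>) -> Option<Self> {
        match parts.len() {
            0 => None,
            1 => parts.pop().map(OneOrMore::One),
            _ => Some(OneOrMore::More(parts)),
        }
    }

    /// Number of key parts (hypothetical helper).
    pub fn arity(&self) -> usize {
        match self {
            OneOrMore::One(_) => 1,
            OneOrMore::More(parts) => parts.len(),
        }
    }
}
```

Under this representation, a single field whose value is itself a product stays `One(..)`, while several fields become `More(..)`, so the two cases differ in the type rather than by convention.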
Yes it does.
Could you go into how that is achieved / where in the code?
| pub(crate) fn get_rows_that_violate_unique_constraint<'a>( | ||
| &'a self, | ||
| row: &'a ProductValue, | ||
| row: &'a AlgebraicValue, |
Are rows represented as AlgebraicValues now? Can a row be a SumType or a Builtin type?
| table.indexes.remove(&ColId(col)); | ||
| table.schema.indexes.retain(|x| x.col_id != col); | ||
| table.indexes.remove(&col.clone().map(ColId)); | ||
| table.schema.indexes.retain(|x| x.cols != col); |
This second part could be done by index_id instead of by cols, but I suppose this works as well.
| for index in table.indexes.iter() { | ||
| let [index_col_id] = *index.col_ids else { | ||
| anyhow::bail!("multi-column indexes not yet supported") | ||
| //Ignore multi-column indexes |
Why are we ignoring multi-column indexes?
This needs constraints to fully resolve this info, and those come in the next PR.
Looks like you should also give this PR a rebase.
e10d191 to
410a125
Compare
| pub(crate) index_id: IndexId, | ||
| pub(crate) table_id: u32, | ||
| pub(crate) col_id: u32, | ||
| pub(crate) cols: NonEmpty<u32>, |
| pub(crate) cols: NonEmpty<u32>, | |
| pub(crate) cols: NonEmpty<ColId>, |
All these suggestions are already applied in the next PR.
DOH! xD
| impl BTreeIndex { | ||
| pub(crate) fn new(index_id: IndexId, table_id: u32, col_id: u32, name: String, is_unique: bool) -> Self { | ||
| pub(crate) fn new(index_id: IndexId, table_id: u32, cols: NonEmpty<u32>, name: String, is_unique: bool) -> Self { |
| pub(crate) fn new(index_id: IndexId, table_id: u32, cols: NonEmpty<u32>, name: String, is_unique: bool) -> Self { | |
| pub(crate) fn new(index_id: IndexId, table_id: u32, cols: NonEmpty<ColId>, name: String, is_unique: bool) -> Self { |
| &'a self, | ||
| table_id: &TableId, | ||
| col_id: &ColId, | ||
| cols: NonEmpty<u32>, |
| cols: NonEmpty<u32>, | |
| cols: NonEmpty<ColId>, |
| let mut cols = vec![]; | ||
| for index in table.indexes.values_mut() { | ||
| if index.index_id == *index_id { | ||
| cols.push(index.col_id); | ||
| cols.push(index.cols.clone()); | ||
| } | ||
| } |
Simpler: table.indexes.values().filter(|i| i.index_id == *index_id).map(|i| i.cols.clone()).collect();
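A self-contained sketch of that iterator chain follows. The types are simplified stand-ins rather than the PR's definitions: `cols` is a plain `Vec<u32>` instead of `NonEmpty<u32>`, the map key type is assumed, and `cols_for_index` is a hypothetical function name.

```rust
use std::collections::BTreeMap;

// Simplified stand-ins for the PR's types; field names follow the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct IndexId(pub u32);

pub struct BTreeIndex {
    pub index_id: IndexId,
    pub cols: Vec<u32>, // NonEmpty<u32> in the PR
}

/// Collect the column lists of every index whose id matches `index_id`.
/// Replaces the manual `for` loop with `push` shown in the diff.
pub fn cols_for_index(
    indexes: &BTreeMap<Vec<u32>, BTreeIndex>,
    index_id: &IndexId,
) -> Vec<Vec<u32>> {
    indexes
        .values()
        // `index_id` is a reference, so it is dereferenced for comparison.
        .filter(|i| i.index_id == *index_id)
        .map(|i| i.cols.clone())
        .collect()
}
```

Because nothing is mutated, `values()` is enough here; the loop in the diff used `values_mut()` without needing mutable access.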
| col_names: index | ||
| .cols | ||
| .iter() | ||
| .map(|&x| insert_table.schema.columns[x as usize].col_name.clone()) | ||
| .collect(), |
Let's dedup the code here
| let value = row.project_not_empty(&index.cols)?; | ||
| return Err(IndexError::UniqueConstraintViolation { | ||
| constraint_name: index.name.clone(), | ||
| table_name: table.schema.table_name.clone(), | ||
| col_name: table.schema.columns[index.col_id as usize].col_name.clone(), | ||
| value: value.clone(), | ||
| col_names: index | ||
| .cols | ||
| .iter() | ||
| .map(|&x| insert_table.schema.columns[x as usize].col_name.clone()) | ||
| .collect(), | ||
| value, | ||
| } | ||
| .into()); |
and here
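One way the duplicated `col_names` expression could be factored out is a small helper on the schema, sketched below. The struct definitions are simplified stand-ins and the method name `col_names` is hypothetical; only the `columns` and `col_name` field names are taken from the diff.

```rust
// Simplified stand-ins for the schema types referenced in the diff.
pub struct ColumnSchema {
    pub col_name: String,
}

pub struct TableSchema {
    pub columns: Vec<ColumnSchema>,
}

impl TableSchema {
    /// Hypothetical helper: resolve column positions to column names.
    /// Like the expression in the diff, this indexes directly and so
    /// panics if a position is out of range.
    pub fn col_names(&self, cols: &[u32]) -> Vec<String> {
        cols.iter()
            .map(|&c| self.columns[c as usize].col_name.clone())
            .collect()
    }
}
```

Both error sites could then build the field with a single call on the table schema instead of repeating the iterator chain.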
410a125 to
5865a29
Compare
ee4ca12 to
8be67fd
Compare
8be67fd to
c2e7a71
Compare
cloutiertyler
left a comment
This generally LGTM.
Description of Changes
Note: This is a multi-step set of PRs to bring constraints + multi-column indexes
Step:
IMPORTANT:
The use of `multi-column` indexes changes what is stored in the `index` according to how many fields are defined:
One field: AlgebraicValue
More than one field: AlgebraicValue::Product
If the API is breaking, please state below what will break