-
Notifications
You must be signed in to change notification settings - Fork 645
Wrangle benchmarks #289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrangle benchmarks #289
Conversation
62e0a3c to
f797782
Compare
|
benchmarks please |
fef1058 to
73e591b
Compare
c3e555d to
303b332
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome. Inline is mostly minor comments about comments. I think this PR can land whenever, as far as I'm concerned, but I want two things.
One is the framework or schema or strategy or whatever you want to call it, by which you derived the set of benchmarks to write out. If Joshua (say) adds an Update operation that isn't just a delete+insert (say), I'd love for him to be able to read that framework and just slot in exactly the right set of benchmarks. I'd like this to be part of this PR.
Two, if there's a ways to go, I'd like other people to be able to see either in the code or in a ticket how far there is to go until this work is complete. I'm happy if there are more PRs following this one until you get there.
| fn clear_table(&self, table_name: String) -> Result<(), anyhow::Error> { | ||
| let db = &*self.worker_database_instance.relational_db; | ||
| db.with_auto_commit(|tx| { | ||
| let tables = db.get_all_tables(tx)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could plumb through a table_by_name function?
| let tables = db.get_all_tables(tx)?; | ||
| for table in tables { | ||
| if table.table_name != table_name { | ||
| continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also return once we find it. This probably can also be expressed as a call to find or whatever on tables.iter, if we don't do a table_by_name approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure if there could be multiple tables with the same name here, can there be?
| fn build(prefill: bool, fsync: bool) -> ResultBench<Self> | ||
| pub struct SQLite { | ||
| db: Connection, | ||
| _temp_dir: TempDir, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The finickiness of the TempDir drop stuff makes me want a comment here, we've had enough bugs in tests and CI about it that I don't want this refactored away to a local or whatever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact I think refactoring it to a local could cause the tempdir to be dropped, which might break things... then again, the earlier benchmarks version did that and the sqlite benches still worked. So idk.
crates/bench/src/sqlite.rs
Outdated
|
|
||
| if prefill { | ||
| prefill_data(&mut db, Runs::Small)?; | ||
| fn create_table<T: BenchTable>(&mut self, table_style: crate::schemas::TableStyle) -> ResultBench<Self::TableId> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mario or someone might be a better reviewer of the sqlite stuff here than me.
crates/bench/src/spacetime_raw.rs
Outdated
| } | ||
|
|
||
| type PreparedFilter = PreparedFilter; | ||
| #[inline(never)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I picked this #[inline(never)] one arbitrarily out of all of them to comment: you should describe where and why you do this in some top level place if possible!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added them all in in an attempt to improve compile times, but it didn't have a huge effect, because everything is still heavily monomorphizing due to the generics. I'll strip them out.
crates/bench/src/spacetime_module.rs
Outdated
|
|
||
| pub struct SpacetimeModule { | ||
| runtime: Runtime, | ||
| // it's necessary for this to be dropped BEFORE the runtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally prefer this (an explicit Drop impl) to e.g. relying on a comment about relying on the drop ordering based on field order. Maybe Mazdak or Noa have a stronger style opinion about it, I'd be curious.
|
@kulakowski, I'll address your comments and then I think this is ready for merge.
Good point, I'll document this.
Yeah, I wasn't really sure where I was going while working on this which meant I had a hard time communicating what I was doing. I have some more targeted benchmark ideas which I'll stick in a followup PR. I do have one remaining concern, which is benchmark time. Looking at the tests on this PR, it seems that running benchmarks on GitHub actions currently takes about 80 minutes. We could:
|
|
Oh, also, should I rename "Person" to "IntStringInt" and "Location" to "IntIntInt" or something similar? Having mysterious schema names in benchmark results might confuse people. |
|
I have a path to fix benchmark times -- it's because everything is being run twice, including the sqlite benchmarks. I'll mess with our Github Actions benchmark script to fix that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
These calls unconditionally panicked before anyway; now they just panic in a way that Clippy doesn't object to.
Description of Changes
Rewrite benchmarks to allow direct comparison between:
spacetime_module: Spacetime via a modulespacetime_raw: Spacetime via the RelationalDB structsqliteAnd add some targeted serialization benchmarks.