OlmoEarth fine-tuning evaluations #226
LGTM! This is really nice. Just one small thing: you can remove the `forest_loss_driver*.yaml` files from models.
I thought I needed it because, with forest loss driver, the model architecture is a bit different (it adds another level of SimpleTimeSeries), so the specification of what gets frozen has to be a bit different too. I do think the model configs could be consolidated, though; there are really just two categories (except Satlas, which also uses its config to restore the model weights).
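For concreteness, here is a minimal sketch of prefix-based freezing (all names here are hypothetical, not the actual config mechanism) showing why the extra SimpleTimeSeries level forces a different freeze specification:

```python
# Hypothetical sketch: with the extra SimpleTimeSeries level, the encoder
# parameters live one prefix deeper, so a prefix-based freeze list has to
# change accordingly.
import torch.nn as nn

def freeze_by_prefix(model: nn.Module, prefixes: list[str]) -> None:
    """Freeze every parameter whose name starts with one of the prefixes."""
    for name, param in model.named_parameters():
        if any(name.startswith(p) for p in prefixes):
            param.requires_grad = False

# Most tasks might freeze e.g. ["encoder."], while the forest loss driver
# variant would need something like ["time_series.encoder."] instead.
```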
btw, I think it's better to include the Nandi and AWF tasks here, and their Sentinel-2 ts configs are ready. This setup is much better for testing time-series and multi-modal performance. In Helios, I'm mostly sweeping learning rates (with cosine decay and patience), so those experiments don't really show the ts/mm advantages, which are important for crop type and land cover mapping. To keep things consistent, we can remove Nandi and AWF from the KNN experiments. That way, the fine-tuning covers both (1) research benchmarks (the ones we run KNN/LP on) and (2) real-world tasks.
This is code for OlmoEarth fine-tuning evaluations. Because we have 11 baselines and 12 tasks to compare, we want to avoid maintaining one config per task per baseline (n × m configs); instead there should be just one config per task and one per baseline.
This adds a new rslp.olmoearth_evals module, which provides some extra infrastructure to make all of the models accept a consistent input. For each model, there is code there to get the model architecture for a given task and output shape, and also to apply whatever model-specific normalization or band re-ordering might be needed.
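Roughly, the adapter idea looks like the following. This is a minimal sketch, not the actual rslp.olmoearth_evals API; EvalAdapter, band_order, and the normalization buffers are all illustrative names:

```python
# Sketch: each baseline contributes (1) a backbone for a given task/output
# shape and (2) its own normalization and band re-ordering, so the task
# configs can feed every model the same input layout.
import torch
import torch.nn as nn

class EvalAdapter(nn.Module):
    def __init__(self, backbone: nn.Module, band_order: list[int],
                 means: torch.Tensor, stds: torch.Tensor):
        super().__init__()
        self.backbone = backbone
        self.band_order = band_order  # indices into the shared band layout
        self.register_buffer("means", means.view(1, -1, 1, 1))
        self.register_buffer("stds", stds.view(1, -1, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Re-order the consistent input bands into this model's layout,
        # then apply the model-specific normalization.
        x = x[:, self.band_order]
        x = (x - self.means) / self.stds
        return self.backbone(x)
```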
Then the data/helios_v3/tasks/ configs all load data for each task in the same consistent way. An exception is needed for PASTIS, since it only has a subset of bands, and some models can take that subset as input instead of needing imputation; otherwise it mostly works, since most of the tasks are materialized using rslearn.
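The PASTIS exception amounts to something like this (a hedged sketch; align_bands and its arguments are hypothetical, and zero-filling is just one possible imputation strategy):

```python
# Sketch: if a model can run on the subset of bands a task provides, just
# select them; otherwise impute the missing bands so the shared input
# contract still holds.
import torch

def align_bands(x: torch.Tensor, available: list[str], wanted: list[str],
                allow_subset: bool) -> torch.Tensor:
    """x: (N, C, H, W) with channels ordered as `available`."""
    if allow_subset:
        # Model accepts whatever subset of its bands the task has.
        idx = [available.index(b) for b in wanted if b in available]
        return x[:, idx]
    # Otherwise build the full band stack, zero-filling missing bands.
    n, _, h, w = x.shape
    out = torch.zeros(n, len(wanted), h, w, dtype=x.dtype, device=x.device)
    for i, b in enumerate(wanted):
        if b in available:
            out[:, i] = x[:, available.index(b)]
    return out
```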
The launcher is in data/helios_v3/run.py, and data/helios_v3/README.md provides some documentation about it. There are also model-specific configs, but they basically just configure freezing/unfreezing. The launcher passes the model to rslp.olmoearth_evals.eval_adapter via an environment variable. Some new code is also added to assign splits specifically for this evaluation.
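The launch pattern is roughly the following. This is a sketch only: the EVAL_MODEL variable, the entry point, and the config paths are placeholders, not the real names from run.py:

```python
# Sketch: one config per task and one per model, combined at launch time,
# with the model name handed to the eval adapter through the environment.
import itertools
import os
import subprocess

MODELS = ["olmoearth", "satlas", "croma"]  # illustrative baseline names
TASKS = ["pastis", "nandi", "awf"]         # illustrative task names

for model, task in itertools.product(MODELS, TASKS):
    env = dict(os.environ, EVAL_MODEL=model)  # read by the eval adapter
    subprocess.run(
        ["python", "train.py",                # placeholder entry point
         "--task-config", f"tasks/{task}.yaml",
         "--model-config", f"models/{model}.yaml"],
        env=env, check=True,
    )
```

This keeps the config count at n + m rather than n × m: adding a baseline or a task means writing one new config, not twelve.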
This depends on allenai/rslearn#319, allenai/rslearn#320, and allenai/rslearn#324.