@anthony-yip anthony-yip commented Oct 27, 2025

Fixes #123, #279, #277, #273.

Sorry for the gigantic PR, but all the little fixes built on each other.

Summary of changes:

  • Preliminary mypy support and corresponding type annotations
    • Implementation of VMMS and Local/Remote as Protocols
  • Code cleanup (removing dead code), based on the invariant that a job always has a preVM (provided by the jobManager)
  • Functional semantics for local/remote data structures (since Redis is weird)
  • Fixing infinite retries (coherence bug in makeDead)
    • Immutable attributes for TangoJob for better maintainability
  • Spot instances added
    • Errors at all stages of the pipeline (e.g. copy in) now trigger a rescheduling (instead of just failing). This is because spot termination can occur at an arbitrary stage
    • Enforcing the invariant that detachVM is always called upon a worker's termination
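
For reviewers, here is a rough sketch of how three of the ideas above could fit together: a Protocol-typed VMMS, an immutable job record, and reschedule-on-error with a guaranteed detach. It is not the code in this PR. Every name in it (`Job`, `VMMS`, `copyIn`, `runJob`, `run_worker`, `MAX_RETRIES`, `FlakyVMMS`, and the `reschedule` and `detach_vm` callbacks) is a hypothetical stand-in, and the retry limit is an assumed value.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Protocol


@dataclass(frozen=True)
class Job:
    """Hypothetical immutable job record: updates build a new object."""
    id: int
    retries: int = 0


class VMMS(Protocol):
    """Hypothetical structural interface; method names are illustrative."""

    def copyIn(self, job: Job) -> None: ...
    def runJob(self, job: Job) -> None: ...


MAX_RETRIES = 2  # assumed limit, not taken from this PR


def run_worker(
    job: Job,
    vmms: VMMS,
    reschedule: Callable[[Job], None],
    detach_vm: Callable[[Job], None],
) -> str:
    try:
        vmms.copyIn(job)
        vmms.runJob(job)
        return "done"
    except Exception:
        # A failure at any stage is treated as a possible spot termination.
        if job.retries < MAX_RETRIES:
            reschedule(replace(job, retries=job.retries + 1))
            return "rescheduled"
        return "dead"  # bounded retries, so no infinite loop
    finally:
        # Runs on every exit path, so the VM is always detached.
        detach_vm(job)


class FlakyVMMS:
    """Test double that fails during copy-in, as a reclaimed instance might."""

    def copyIn(self, job: Job) -> None:
        raise RuntimeError("instance terminated")

    def runJob(self, job: Job) -> None:
        pass


queue: List[Job] = []
detached: List[int] = []
status = run_worker(Job(id=1), FlakyVMMS(), queue.append,
                    lambda j: detached.append(j.id))
exhausted = run_worker(Job(id=2, retries=MAX_RETRIES), FlakyVMMS(),
                       queue.append, lambda j: detached.append(j.id))
```

`FlakyVMMS` satisfies `VMMS` structurally without inheriting from it, which is the main appeal of Protocols for mypy. Because `Job` is frozen, the retry count can only change by creating a new record with `replace`, so a stale copy cannot silently reset it.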

Also associated with autolab/Autolab#2301

coder6583 and others added 25 commits April 7, 2025 15:05
…preallocator (do note that my initial worry about incorrectness was unfounded due to weird Redis sharing behavior)
@anthony-yip anthony-yip requested a review from KesterTan October 27, 2025 02:38
Base automatically changed from copy-in to ec2-new-implementation October 30, 2025 21:57

@KesterTan KesterTan left a comment

Can we resolve the merge conflicts?

5 participants