Fix critical bugs in MLE-STAR data leakage checker #302

Open
wants to merge 1 commit into main


0xnavarro

Summary

Fixed two critical bugs in the data leakage checker that caused crashes when use_data_leakage_checker=True.

Bugs Fixed

Bug 1: Missing functools.partial() wrapper

  • Error: TypeError: replace_leakage_code() missing 1 required positional argument: 'prefix'
  • Cause: after_model_callback was not wrapped with functools.partial() to pass the prefix parameter
  • Fix: Added functools.partial() wrapper consistent with other callbacks in the same file
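
A minimal sketch of the failure and the fix, not the project's actual code. It assumes the framework calls after_model_callback with two positional arguments. The FakeAgent class, the callback body, and the prefix value "leak_check" are hypothetical; only the names replace_leakage_code, after_model_callback, and prefix come from this PR.

```python
import functools


def replace_leakage_code(callback_context, llm_response, prefix):
    """Hypothetical stand-in: the real callback body is not shown in this PR."""
    return f"{prefix}:{llm_response}"


class FakeAgent:
    """Hypothetical agent that invokes its callback with two positional args."""

    def __init__(self, after_model_callback):
        self.after_model_callback = after_model_callback

    def run(self):
        # The caller knows nothing about 'prefix', so it cannot supply it.
        return self.after_model_callback("ctx", "response")


# Before the fix: the bare function is registered, so 'prefix' is never passed.
broken_error = None
try:
    FakeAgent(after_model_callback=replace_leakage_code).run()
except TypeError as exc:
    broken_error = str(exc)

# After the fix: functools.partial binds 'prefix' ahead of time.
fixed = FakeAgent(
    after_model_callback=functools.partial(
        replace_leakage_code, prefix="leak_check"  # assumed prefix value
    )
)
fixed_result = fixed.run()
```

Binding prefix by keyword keeps the two positional slots free for whatever the caller passes, which is why the wrapped callback works without changing the caller.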

Bug 2: Undefined variable in exception handling

  • Error: UnboundLocalError: cannot access local variable 'leakage_status' where it is not associated with a value
  • Cause: leakage_status was not defined in the except block of update_extract_status()
  • Fix: Added leakage_status = "Unknown" in the exception handler
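
The pattern behind this error can be reproduced in isolation. The sketch below is an assumption about the shape of the bug, not the real update_extract_status(): the JSON parsing, the function signatures, and the _buggy/_fixed suffixes are hypothetical. Only the variable name leakage_status and the fallback value "Unknown" come from this PR.

```python
import json


def update_extract_status_buggy(raw):
    """Hypothetical reduction: the name is bound only if the try body succeeds."""
    try:
        leakage_status = json.loads(raw)["leakage_status"]
    except Exception:
        pass  # leakage_status is never assigned on this path
    return leakage_status  # raises UnboundLocalError after a failed parse


def update_extract_status_fixed(raw):
    """Same logic, with the fallback assignment added in the except block."""
    try:
        leakage_status = json.loads(raw)["leakage_status"]
    except Exception:
        leakage_status = "Unknown"
    return leakage_status
```

With malformed input such as "not json", the first version raises UnboundLocalError at the return statement, while the second returns "Unknown".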

Impact

  • Lets runs with use_data_leakage_checker=True complete instead of crashing with the TypeError or UnboundLocalError described above
  • Makes MLE-STAR more robust when the extraction step fails, since leakage_status now falls back to "Unknown"

Testing

  • Verified with use_data_leakage_checker=True on a task using this feature.
  • No regressions when feature is disabled (default behavior unchanged)

- Fix TypeError: replace_leakage_code() missing required 'prefix' argument
  by adding functools.partial() wrapper consistent with other callbacks
- Fix UnboundLocalError: leakage_status undefined in exception handler
  by setting leakage_status = 'Unknown' in except block

These bugs caused crashes when use_data_leakage_checker=True, preventing
users from utilizing this important MLE-STAR feature for detecting data
leakage in machine learning competitions.