⚡️ Speed up function find_last_node by 11,311% #108
📄 11,311% (113.11x) speedup for `find_last_node` in `src/dsa/nodes.py`

⏱️ Runtime: 98.8 milliseconds → 866 microseconds (best of 161 runs)

📝 Explanation and details
The optimization transforms the algorithm from O(N*M) to O(N+M) complexity by precomputing source IDs into a set for O(1) lookups.
Key Changes:
- Compute `source_ids = {e["source"] for e in edges}` once upfront
- Replace `all(e["source"] != n["id"] for e in edges)` with `n["id"] not in source_ids`
Why This Is Faster:
The original code performed a nested loop: for each node, it compared the node's ID against every edge's source (O(N*M) comparisons in the worst case). The optimized version builds a hash set of source IDs once (O(M)), then performs an average constant-time lookup for each node (O(N)), resulting in O(N+M) total complexity.
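The difference can be sketched side by side. This is a minimal reconstruction, not the code from `src/dsa/nodes.py`: only the two expressions quoted under Key Changes come from this PR, while the surrounding `next(..., None)` wrapper, the function names with `_original` and `_optimized` suffixes, and the sample data are assumptions made for illustration.

```python
def find_last_node_original(nodes, edges):
    # Assumed shape of the original: for each node, scan every edge.
    # Worst case is O(N*M) comparisons.
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_optimized(nodes, edges):
    # Build the set of source IDs once: O(M).
    source_ids = {e["source"] for e in edges}
    # Each membership test is O(1) on average, so the scan over nodes is O(N).
    return next((n for n in nodes if n["id"] not in source_ids), None)


# Hypothetical sample graph: 1 -> 2 -> 3, so node 3 has no outgoing edge.
nodes = [{"id": 1}, {"id": 2}, {"id": 3}]
edges = [{"source": 1, "target": 2}, {"source": 2, "target": 3}]

last_original = find_last_node_original(nodes, edges)
last_optimized = find_last_node_optimized(nodes, edges)
```

One caveat of the set-based version in this sketch: it requires node and edge IDs to be hashable (strings, integers, tuples), whereas the comparison-based version only requires them to support `!=`.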
Performance Benefits by Test Case:
The optimization maintains identical behavior while dramatically improving scalability for larger graphs.
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-find_last_node-mfhvkfq0` and push.