
@davidberard98 davidberard98 commented Sep 27, 2025

While running eval.py locally, I observed very long iteration times for any Cauchy distributions. Profiles show that most of the time is spent on the CPU generating the Cauchy samples.

For example, on an H100, a 100-iteration benchmarking run for a BS=2, dim=256, seqlen=128 Cauchy distribution took 13 seconds without this change, and ~0.3s with it. For such a distribution (i.e. generating a [2, 128, 128, 256] tensor), the torch profiler shows the Cauchy data generation taking ~113ms.

My suspicion is that the previous behavior could cause some submissions to time out if they have high variance, such that they trigger the full 100 iterations on some large Cauchy-distribution test cases.
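For context, a minimal sketch of the pattern being described, assuming the bottleneck is CPU-side `cauchy_()` generation (the tensor shape is taken from the description above; the exact call sites in eval.py may differ):

```python
import torch

# Shape from the description: BS=2, seqlen=128 (x2), dim=256.
shape = (2, 128, 128, 256)

# CPU path: Cauchy sampling runs on the host (~113ms per call in the
# profile above), and the result would then need a host-to-device copy.
x_cpu = torch.empty(shape).cauchy_()

# Device path: allocating the tensor on the GPU makes cauchy_() run as a
# CUDA kernel, avoiding the host-side generation and the copy.
if torch.cuda.is_available():
    x_gpu = torch.empty(shape, device="cuda").cauchy_()
```

`cauchy_()` fills the tensor in place with samples from a standard Cauchy distribution, so only the placement of the `empty` allocation determines where the work happens.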
