Conversation

@jstac (Contributor) commented Aug 3, 2025

Misc improvements -- make it nicer:

- Finish the `.py` file and then convert back to MyST with jupytext.
- Clean up the vectorization code and the different versions of `B`.
- Get the plot working again and add some brief comments at the end.

@netlify netlify bot commented Aug 3, 2025

Deploy Preview for incomparable-parfait-2417f8 ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | 292b69c |
| 🔍 Latest deploy log | https://app.netlify.com/projects/incomparable-parfait-2417f8/deploys/68994d078945d8000807e8b0 |
| 😎 Deploy Preview | https://deploy-preview-229--incomparable-parfait-2417f8.netlify.app |

@jstac (Contributor, Author) commented Aug 5, 2025

Could someone please review?

On my laptop I get a huge compile time on the first run -- not sure why.

Also note the use of `block_until_ready`. Because JAX dispatches work asynchronously, we have to be careful to add this to all timings involving JAX -- otherwise the results are completely misleading.
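
As a minimal illustration of the pitfall (a toy example, not the lecture code): JAX returns control to Python as soon as the work is *dispatched* to the device, so a naive timer can stop long before the computation actually finishes.

```python
import time
import jax.numpy as jnp

x = jnp.ones((2000, 2000))
jnp.dot(x, x).block_until_ready()   # warm-up, so compilation isn't timed below

# Naive timing: jnp.dot returns as soon as the work is dispatched
start = time.perf_counter()
y = jnp.dot(x, x)
print(f"dispatch only:  {(time.perf_counter() - start) * 1000:.3f} ms")

# Correct timing: block until the device has actually finished
start = time.perf_counter()
y = jnp.dot(x, x).block_until_ready()
print(f"full execution: {(time.perf_counter() - start) * 1000:.3f} ms")
```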

CC @HumphreyYang @mmcky

@HumphreyYang (Member) commented Aug 8, 2025

Many thanks @jstac!

It looks great to me! For compilation and execution time, I got

```
compile plus execution time = 3930.385113 ms
execution time = 1590.973854 ms
```

on a Tesla T4, which seems reasonable to me.

I just pushed some minor cosmetic changes and applied `block_until_ready` to both v_star and σ_star:

```python
v_star, σ_star = solve_inventory_model(v_init, model)

# Pause until execution finishes on every leaf of the output tuple
jax.tree_util.tree_map(lambda x: x.block_until_ready(), (v_star, σ_star))
```

@HumphreyYang (Member) commented Aug 8, 2025
Hi @jstac,

I just had another look, and I think I can optimize this slightly more. I will work on it and report back!

@mmcky (Contributor) commented Aug 8, 2025

@HumphreyYang I have noticed the preview is referencing python=3.11, for some reason I can't yet explain. I will try to regenerate the cache.

@jstac (Contributor, Author) commented Aug 8, 2025

> Hi @jstac,
>
> I just had another look, and I think I can optimize this slightly more. I will work on it and report back!

Looking forward to seeing the new version :-)

@HumphreyYang (Member) commented Aug 8, 2025

Hi @jstac and @mmcky,

Please find my updated code in the gist below:
https://gist.github.com/HumphreyYang/88d6881363227d470d656ac7b1ab67e4

It is 4x faster than the old code for compile + run, and 5x faster for reruns on my GPU:

```
new compile time = 466.747 ms
old compile time = 1822.423 ms
new exec time = 292.510 ms
old exec time = 1566.622 ms
```

The solutions generated are identical.

My main motivation (and change): the old code applied a double vmap to B. I pulled the invariant terms out of B and precomputed them, so that the "double loop" runs over a much smaller operation. Of course, this introduces a memory/speed trade-off (and it might be a good example to show).

Once we remove (x, z_idx) from B, the functions T and get_greedy become slightly simpler.
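
To make the idea concrete, here is a minimal self-contained sketch of the precomputation pattern. The model, shapes, and names below are hypothetical, chosen only for illustration -- they are not the lecture's actual code.

```python
import jax.numpy as jnp

# Hypothetical toy model: actions choose the next state directly (x' = a),
# rewards are R[x, z, a], shocks follow transition matrix P[z, z'], and the
# discount factor is state-dependent, beta(z) = z_grid[z].
n_x, n_z = 5, 3
R = jnp.ones((n_x, n_z, n_x))
P = jnp.full((n_z, n_z), 1 / n_z)
z_grid = jnp.linspace(0.90, 0.98, n_z)

def T(v):
    # v[x, z] is the value function.
    # Precompute expected continuation values with a single matmul:
    #   EV[x', z] = sum_{z'} P[z, z'] * v[x', z']
    EV = v @ P.T
    # Action values via broadcasting:
    #   vals[x, z, a] = R[x, z, a] + beta(z) * EV[a, z]
    vals = R + z_grid[None, :, None] * EV.T[None, :, :]
    return vals.max(axis=-1)
```

Here the expectation over z' is shared across all (x, a) pairs, so it is computed once per iteration instead of once per state-action pair -- that is where the speed-up comes from, and the enlarged intermediate arrays are where the memory cost comes from.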

One downside is that the resulting code follows the math in a less "operator-like" and more "matrix-like" way, just as you've accused me of in meetings : )

I have to admit, the operator approach is always more concise, more elegant, and easier to understand.

Due to the trade-off, I didn’t push it to the branch, but please let me know which version you would prefer.

@jstac (Contributor, Author) commented Aug 10, 2025

Thanks @HumphreyYang, this is very cool.

Daisuke Oyama used the same strategy for the DP code in quantecon.py -- precomputing reward and transition matrices to reduce execution time. As you show, the speed gain can be really impressive.

Nonetheless, I think we should resist this approach here since, as you point out, memory can become a problem (some people ran into this limitation when using the QE library) and the code moves a bit further away from the maths.

(The vmap code is also messy in some ways, but the right-hand side of the Bellman equation remains easy to understand.)
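
For comparison, here is an operator-style sketch over the same hypothetical toy model as above (again, for illustration only, not the lecture's actual code):

```python
import jax
import jax.numpy as jnp

# Same hypothetical toy model: R[x, z, a], P[z, z'],
# beta(z) = z_grid[z], next state x' = a.
n_x, n_z = 5, 3
R = jnp.ones((n_x, n_z, n_x))
P = jnp.full((n_z, n_z), 1 / n_z)
z_grid = jnp.linspace(0.90, 0.98, n_z)

def B(x, z, a, v):
    # Mirrors the right-hand side of the Bellman equation term by term:
    #   B(x, z, a) = r(x, z, a) + beta(z) * E[v(x', z') | z]
    return R[x, z, a] + z_grid[z] * P[z] @ v[a]

# vmap over actions, then shocks, then states
B_vals = jax.vmap(
    jax.vmap(
        jax.vmap(B, in_axes=(None, None, 0, None)),
        in_axes=(None, 0, None, None)),
    in_axes=(0, None, None, None))

def T(v):
    xs, zs, actions = jnp.arange(n_x), jnp.arange(n_z), jnp.arange(n_x)
    return B_vals(xs, zs, actions, v).max(axis=-1)
```

Evaluating B pointwise keeps each term of the equation visible, at the cost of recomputing the inner expectation for every (x, z, a) triple -- the same memory/speed trade-off as above, in reverse.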

(Let's also resist adding methods to our Model class in all QE lectures, just to keep everything as consistent as possible. Once we start adding methods, it's not clear where to stop.)

One small issue that needs to be fixed: we use z as the discount factor without explanation. Somewhere we need to state that we set β(z) = z.

@HumphreyYang (Member) commented Aug 10, 2025

Many thanks @jstac, I agree! Keeping the lectures clear should be our priority!

I will keep your suggestions in mind when reviewing lectures.

I will push a minor edit incorporating your suggestion and merge this PR.

@jstac
Copy link
Contributor Author

jstac commented Aug 10, 2025

Thanks @HumphreyYang, I appreciate it.

@HumphreyYang merged commit e6f362b into main on Aug 11, 2025 (6 checks passed).
@HumphreyYang deleted the ud_inv_ssd branch on August 11, 2025, 01:53.