diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 5bb8bd01..47d1801d 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -7,6 +7,13 @@ jobs:
       - uses: actions/checkout@v4
         with:
           ref: ${{ github.event.pull_request.head.sha }}
+      - name: Upgrade cuDNN
+        shell: bash -l {0}
+        run: |
+          wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
+          sudo dpkg -i cuda-keyring_1.1-1_all.deb
+          sudo apt-get update
+          sudo apt-get -y install cudnn-cuda-12
       - name: Setup Anaconda
         uses: conda-incubator/setup-miniconda@v3
         with:
diff --git a/lectures/kesten_processes.md b/lectures/kesten_processes.md
index 6948c631..03750e4f 100644
--- a/lectures/kesten_processes.md
+++ b/lectures/kesten_processes.md
@@ -40,9 +40,8 @@ In addition to JAX and Anaconda, this lecture will need the following libraries:
 This lecture describes Kesten processes, which are an important class of
 stochastic processes, and an application of firm dynamics.
 
-The lecture draws on [an earlier QuantEcon
-lecture](https://python.quantecon.org/kesten_processes.html), which uses Numba
-to accelerate the computations.
+The lecture draws on [an earlier QuantEcon lecture](https://python.quantecon.org/kesten_processes.html),
+which uses Numba to accelerate the computations.
 
 In that earlier lecture you can find a more detailed discussion of the concepts involved.
 
@@ -137,10 +136,8 @@ We now study the implications of this specification.
 
 #### Heavy tails
 
-If the conditions of the [Kesten--Goldie
-Theorem](https://python.quantecon.org/kesten_processes.html#the-kestengoldie-theorem)
-are satisfied, then {eq}`firm_dynam` implies that the firm size distribution
-will have Pareto tails.
+If the conditions of the [Kesten--Goldie Theorem](https://python.quantecon.org/kesten_processes.html#the-kestengoldie-theorem)
+are satisfied, then {eq}`firm_dynam` implies that the firm size distribution will have Pareto tails.
 
 This matches empirical findings across many data sets.
 
@@ -190,12 +187,11 @@ class Firm(NamedTuple):
     μ_e:   float = 0.0
     σ_e:   float = 0.5
     s_bar: float = 1.0
-
-#
-# Here's code to update a cross-section of firms according to the dynamics in
-# [](firm_dynam_ee).
 ```
 
+Here's code to update a cross-section of firms according to the dynamics in
+[](firm_dynam_ee).
+
 ```{code-cell} ipython3
 @jax.jit
 def update_cross_section(s, a, b, e, firm):
@@ -250,7 +246,6 @@ data = generate_cross_section(firm).block_until_ready()
 toc()
 ```
 
-
 Let's produce the rank-size plot and check the distribution:
 
 ```{code-cell} ipython3
@@ -271,7 +266,7 @@ The plot produces a straight line, consistent with a Pareto tail.
 
 We did not JIT-compile the `for` loop above because
 acceleration of outer loops makes relatively little difference terms of
-   compute time.
+compute time.
 
 However, to maximize performance, let's try squeezing out a bit more speed
 by replacing the `for` loop with
@@ -311,10 +306,10 @@ def generate_cross_section_lax(
         0, T, update_cross_section, initial_state
     )
     return final_s
-
-# Let's see if we got any speed gain
 ```
 
+Let's see if we get any speed gain.
+
 ```{code-cell} ipython3
 tic()
 data = generate_cross_section_lax(firm).block_until_ready()
@@ -339,14 +334,27 @@ ax.set_ylabel("log size")
 
 plt.show()
 
-#
-# If the time horizon is not too large, we can also try generating all shocks at
-# once.
-#
-# Note, however, that this approach consumes more memory, as we need to have to
-# store large matrices of random draws
-#
-# Hence the code below will fail due to out-of-memory errors when `T` and `M` are large.
+```
+
+## Exercises
+
+```{exercise-start}
+:label: kp_ex1
+```
+
+Try writing an alternative version of `generate_cross_section_lax()` where the entire sequence of random draws is generated at once, so that all of `a`, `b`, and `e` are of shape `(T, M)`.
+
+(The `update_cross_section()` function should not generate any random numbers.)
+
+Does it improve the runtime?
+
+What are the pros and cons of this approach?
+
+```{exercise-end}
+```
+
+```{solution-start} kp_ex1
+:class: dropdown
 ```
 
 ```{code-cell} ipython3
@@ -393,6 +401,11 @@ data = generate_cross_section_lax(firm).block_until_ready()
 toc()
 ```
 
-This second method might be slightly faster in some cases but in general the
+This method might be faster in some cases, but in general the
 relative speed will depend on the size of the cross-section and the length of
 the simulation paths.
+
+Also, this method is far more memory intensive, since all three `(T, M)` arrays of random draws must be held in memory at once.
+
+```{solution-end}
+```
\ No newline at end of file