@@ -4,7 +4,7 @@ jupytext:
     extension: .md
     format_name: myst
     format_version: 0.13
-    jupytext_version: 1.16.4
+    jupytext_version: 1.16.7
 kernelspec:
   display_name: Python 3 (ipykernel)
   language: python
@@ -58,7 +58,7 @@ Next we import some tools from Keras.

 ```{code-cell} ipython3
 import keras
-from keras.models import Sequential
+from keras import Sequential
 from keras.layers import Dense
 ```

@@ -70,22 +70,30 @@ The data has the form

 $$
     y_i = f(x_i) + \epsilon_i,
-    \qquad i=1, \ldots, n
+    \qquad i=1, \ldots, n,
 $$

-The map $f$ is specified inside the function and $\epsilon_i$ is an independent
-draw from a fixed normal distribution.
+where
+
+* the input sequence $(x_i)$ is an evenly spaced grid,
+* $f$ is a nonlinear transformation, and
+* each $\epsilon_i$ is an independent Gaussian noise term.

 Here's the function that creates vectors `x` and `y` according to the rule
 above.

 ```{code-cell} ipython3
-def generate_data(x_min=0, x_max=5, data_size=400):
+def generate_data(x_min=0,          # Minimum x value
+                  x_max=5,          # Max x value
+                  data_size=400,    # Default size for dataset
+                  seed=1234):
+    np.random.seed(seed)
     x = np.linspace(x_min, x_max, num=data_size)
-    x = x.reshape(data_size, 1)
-    ϵ = 0.2 * np.random.randn(*x.shape)
+
+    ϵ = 0.2 * np.random.randn(data_size)
     y = x**0.5 + np.sin(x) + ϵ
-    x, y = [z.astype('float32') for z in (x, y)]
+    # Keras expects two dimensions, not flat arrays
+    x, y = [np.reshape(z, (data_size, 1)) for z in (x, y)]
     return x, y
 ```

@@ -115,43 +123,61 @@ x_validate, y_validate = generate_data()

 We supply functions to build two types of models.

+### Regression model
+
 The first implements linear regression.

 This is achieved by constructing a neural network with just one layer that maps
 to a single dimension (since the prediction is real-valued).

-The input `model` will be an instance of `keras.Sequential`, which is used to
+The object `model` will be an instance of `keras.Sequential`, which is used to
 group a stack of layers into a single prediction model.

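+In other words, the regression model is the affine map
+
+$$
+    \hat y = w x + b,
+$$
+
+where the scalar weight $w$ and intercept $b$ are learned from the data.
+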
 ```{code-cell} ipython3
-def build_regression_model(model):
-    model.add(Dense(units=1))
+def build_regression_model():
+    # Generate an instance of Sequential, to store layers and training attributes
+    model = Sequential()
+    # Add a single layer with scalar output
+    model.add(Dense(units=1))
+    # Configure the model for training
     model.compile(optimizer=keras.optimizers.SGD(),
                   loss='mean_squared_error')
     return model
 ```

-In the function above you can see that we use stochastic gradient descent to
-train the model, and that the loss is mean squared error (MSE).
+In the function above you can see that
+
+* we use stochastic gradient descent to train the model, and
+* the loss is mean squared error (MSE).
+
+The call `model.add` adds a single layer with activation function equal to the identity map (the Keras default when no `activation` argument is supplied).

 MSE is the standard loss function for ordinary least squares regression.

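+Concretely, for $n$ observations with predictions $\hat y_i$, this loss is
+
+$$
+    \frac{1}{n} \sum_{i=1}^n (y_i - \hat y_i)^2.
+$$
+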
+### Deep network
+
 The second function creates a dense (i.e., fully connected) neural network with
 3 hidden layers, where each hidden layer maps to a $k$-dimensional output space.

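+With $\sigma$ denoting the activation function, the network computes the composition
+
+$$
+    \hat y = A_4 \, \sigma(A_3 \, \sigma(A_2 \, \sigma(A_1 x + b_1) + b_2) + b_3) + b_4,
+$$
+
+where each $A_j$ is a matrix of weights and each $b_j$ is a vector of biases that are learned in training.
+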
 ```{code-cell} ipython3
-def build_nn_model(model, k=10, activation_function='tanh'):
-    # Construct network
-    model.add(Dense(units=k, activation=activation_function))
-    model.add(Dense(units=k, activation=activation_function))
-    model.add(Dense(units=k, activation=activation_function))
-    model.add(Dense(1))
+def build_nn_model(output_dim=10, num_layers=3, activation_function='tanh'):
+    # Create a Keras Model instance using Sequential()
+    model = Sequential()
+    # Add layers to the network sequentially, from inputs towards outputs
+    for _ in range(num_layers):
+        model.add(Dense(units=output_dim, activation=activation_function))
+    # Add a final layer that maps to a scalar value, for regression
+    model.add(Dense(units=1))
     # Embed training configurations
     model.compile(optimizer=keras.optimizers.SGD(),
                   loss='mean_squared_error')
     return model
 ```

+### Tracking errors
+
++++
+
 The following function will be used to plot the MSE of the model during the
 training process.

@@ -160,12 +186,16 @@ as the parameters are adjusted to better fit the data.

 ```{code-cell} ipython3
 def plot_loss_history(training_history, ax):
-    ax.plot(training_history.epoch,
+    # Plot MSE of training data against epoch
+    epochs = training_history.epoch
+    ax.plot(epochs,
             np.array(training_history.history['loss']),
             label='training loss')
-    ax.plot(training_history.epoch,
+    # Plot MSE of validation data against epoch
+    ax.plot(epochs,
             np.array(training_history.history['val_loss']),
             label='validation loss')
+    # Add labels
     ax.set_xlabel('Epoch')
     ax.set_ylabel('Loss (Mean squared error)')
     ax.legend()
@@ -180,19 +210,16 @@ Now let's go ahead and train our models.

 We'll start with linear regression.

-First we create a `Model` instance using `Sequential()`.
-
 ```{code-cell} ipython3
-model = Sequential()
-regression_model = build_regression_model(model)
+regression_model = build_regression_model()
 ```

 Now we train the model using the training data.

 ```{code-cell} ipython3
 training_history = regression_model.fit(
     x, y, batch_size=x.shape[0], verbose=0,
-    epochs=4000, validation_data=(x_validate, y_validate))
+    epochs=2000, validation_data=(x_validate, y_validate))
 ```
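+
+Since `batch_size` equals the number of observations, each epoch computes the gradient on the full sample, so training here is full-batch gradient descent rather than stochastic minibatch descent.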

 Let's have a look at the evolution of MSE as the model is trained.
@@ -241,14 +268,13 @@ Now let's switch to a neural network with multiple layers.
 We implement the same steps as before.

 ```{code-cell} ipython3
-model = Sequential()
-nn_model = build_nn_model(model)
+nn_model = build_nn_model()
 ```

 ```{code-cell} ipython3
 training_history = nn_model.fit(
     x, y, batch_size=x.shape[0], verbose=0,
-    epochs=4000, validation_data=(x_validate, y_validate))
+    epochs=2000, validation_data=(x_validate, y_validate))
 ```

 ```{code-cell} ipython3
@@ -286,3 +312,9 @@ fig, ax = plt.subplots()
 plot_results(x_validate, y_validate, y_predict, ax)
 plt.show()
 ```
+
+Not surprisingly, the multilayer neural network does a much better job of fitting the data.
+
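+As a quick check, we can compare the two fitted models on the validation data with `model.evaluate`, which returns the compiled loss (here MSE).
+
+```{code-cell} ipython3
+# Compare final validation MSE of the two fitted models
+print('linear regression MSE:', regression_model.evaluate(x_validate, y_validate, verbose=0))
+print('neural network MSE:   ', nn_model.evaluate(x_validate, y_validate, verbose=0))
+```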
318+ ``` {code-cell} ipython3
319+
320+ ```