4 changes: 2 additions & 2 deletions docs/source/learn/core_notebooks/pymc_overview.ipynb
@@ -3342,7 +3342,7 @@
"\n",
"This is a more realistic problem than the first regression example, as we are now dealing with a **multivariate regression** model. However, while there are several potential predictors in the LSL-DR dataset, it is difficult *a priori* to determine which ones are relevant for constructing an effective statistical model. There are a number of approaches for conducting variable selection, but a popular automated method is *regularization*, whereby ineffective covariates are shrunk towards zero via regularization (a form of penalization) if they do not contribute to predicting outcomes. \n",
"\n",
"You may have heard of regularization from machine learning or classical statistics applications, where methods like the lasso or ridge regression shrink parameters towards zero by applying a penalty to the size of the regression parameters. In a Bayesian context, we apply an appropriate prior distribution to the regression coefficients. One such prior is the *hierarchical regularized horseshoe*, which uses two regularization strategies, one global and a set of local local parameters, one for each coefficient. The key to making this work is by selecting a long-tailed distribution as the shrinkage priors, which allows some to be nonzero, while pushing the rest towards zero.\n",
"You may have heard of regularization from machine learning or classical statistics applications, where methods like the lasso or ridge regression shrink parameters towards zero by applying a penalty to the size of the regression parameters. In a Bayesian context, we apply an appropriate prior distribution to the regression coefficients. One such prior is the *hierarchical regularized horseshoe*, which uses two regularization strategies, one global and a set of local parameters, one for each coefficient. The key to making this work is by selecting a long-tailed distribution as the shrinkage priors, which allows some to be nonzero, while pushing the rest towards zero.\n",
"\n",
"The horeshoe prior for each regression coefficient $\\beta_i$ looks like this:\n",
"\n",
@@ -3396,7 +3396,7 @@
"source": [
"### Model Specification\n",
"\n",
"Specifying the model in PyMC mirrors its statistical specification. This model employs a couple of new distributions: the {class}`~pymc.HalfStudentT` distribution for the $\\tau$ and $\\lambda$ priors, and the `InverseGamma` distribution for the $c2$ variable.\n",
"Specifying the model in PyMC mirrors its statistical specification. This model employs a couple of new distributions: the {class}`~pymc.HalfStudentT` distribution for the $\\tau$ and $\\lambda$ priors, and the `InverseGamma` distribution for the $c^2$ variable.\n",
"\n",
"In PyMC, variables with purely positive priors like {class}`~pymc.InverseGamma` are transformed with a log transform. This makes sampling more robust. Behind the scenes, a variable in the unconstrained space (named `<variable-name>_log`) is added to the model for sampling. Variables with priors that constrain them on two sides, like {class}`~pymc.Beta` or {class}`~pymc.Uniform`, are also transformed to be unconstrained but with a log odds transform. \n",
"\n",
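The transform behaviour described in the last changed paragraph can be checked directly. A small sketch, not part of this PR; the exact suffixes of the transformed variable names may vary by PyMC version:

```python
import pymc as pm

with pm.Model() as m:
    # Positive support: sampled on the log scale behind the scenes.
    c2 = pm.InverseGamma("c2", alpha=1, beta=1)
    # Two-sided support: sampled on an unconstrained, log-odds-like scale.
    p = pm.Uniform("p", lower=0, upper=1)

# The unconstrained variables PyMC actually samples: a log-transformed
# version of c2 and an interval-transformed version of p.
print(m.value_vars)
```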