
Commit 8eb5ccc

committed
update
1 parent c8aa957 commit 8eb5ccc

20 files changed

+4405
-428
lines changed

doc/LectureNotes/project1.ipynb

Lines changed: 641 additions & 0 deletions
Large diffs are not rendered by default.

doc/Projects/2025/Project1/html/._Project1-bs000.html

Lines changed: 9 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -175,18 +175,20 @@ <h4>September 2</h4>
175175
<h2 id="preamble-note-on-writing-reports-using-reference-material-ai-and-other-tools" class="anchor">Preamble: Note on writing reports, using reference material, AI and other tools </h2>
176176

177177
<p>We want you to answer the three different projects by handing in
178-
reports written like a standard scientific/technical report. The links
179-
at <a href="https://github.com/CompPhysics/MachineLearning/tree/master/doc/Projects" target="_self"><tt>https://github.com/CompPhysics/MachineLearning/tree/master/doc/Projects</tt></a>
180-
Furthermore, at the same link,
181-
you can find examples of previous reports. How to write reports will
182-
also be discussed during the various lab sessions. Please do ask us if you are in doubt.
178+
reports written like a standard scientific/technical report. The
179+
links at
180+
<a href="https://github.com/CompPhysics/MachineLearning/tree/master/doc/Projects" target="_self"><tt>https://github.com/CompPhysics/MachineLearning/tree/master/doc/Projects</tt></a>
181+
contain more information. There you can find examples of previous
182+
reports, the projects themselves, how we grade reports, etc. How to
183+
write reports will also be discussed during the various lab
184+
sessions. Please do ask us if you are in doubt.
183185
</p>
184186

185187
<p>When using codes and material from other sources, you should refer to
186188
these in the bibliography of your report, indicating wherefrom you for
187189
example got the code, whether this is from the lecture notes,
188-
softwares like Scikit-Learn, TensorFlow, PyTorch or other sources such
189-
AI software. These should always be cited correctly. How to cite some
190+
software like Scikit-Learn, TensorFlow, PyTorch or other sources. These sources
191+
should always be cited correctly. How to cite some
190192
of the libraries is often indicated from their corresponding GitHub
191193
sites or websites, see for example how to cite Scikit-Learn at
192194
<a href="https://scikit-learn.org/dev/about.html" target="_self"><tt>https://scikit-learn.org/dev/about.html</tt></a>.

doc/Projects/2025/Project1/html/._Project1-bs001.html

Lines changed: 54 additions & 41 deletions
Original file line numberDiff line numberDiff line change
@@ -154,7 +154,7 @@
154154
<h2 id="regression-analysis-and-resampling-methods" class="anchor">Regression analysis and resampling methods </h2>
155155

156156
<p>The main aim of this project is to study in more detail various
157-
regression methods, including the Ordinary Least Squares (OLS) method.
157+
regression methods, including Ordinary Least Squares (OLS) regression, Ridge regression and LASSO regression.
158158
In addition to the scientific part, in this course we want also to
159159
give you an experience in writing scientific reports.
160160
</p>
@@ -170,27 +170,26 @@ <h2 id="regression-analysis-and-resampling-methods" class="anchor">Regression an
170170

171171
<p>Our first step will be to perform an OLS regression analysis of this
172172
function, trying out a polynomial fit with an \( x \) dependence of the
173-
form \( [x,x^2,\dots] \). We can use a uniform distribution to set up the
173+
form \( [x,x^2,\dots] \). You can use a uniform distribution to set up the
174174
arrays of values for \( x \in [-1,1] \), or alternatively use a fixed step size.
175-
Thereafter we will repeat much of the
176-
same procedure using the Ridge and Lasso regression methods,
177-
introducing thus a dependence on the hyperparameter (penalty) \( \lambda \).
175+
Thereafter we will repeat many of the same steps when using the Ridge and Lasso regression methods,
176+
introducing thereby a dependence on the hyperparameter (penalty) \( \lambda \).
178177
</p>
179178

180179
<p>We will also include bootstrap as a resampling technique in order to
181180
study the so-called <b>bias-variance tradeoff</b>. After that we will
182-
include the cross-validation technique.
181+
include the so-called cross-validation technique.
183182
</p>
184183
<h3 id="part-a-ordinary-least-square-ols-for-the-runge-function" class="anchor">Part a : Ordinary Least Square (OLS) for the Runge function </h3>
185184

186-
<p>We will generate our own dataset for a function
185+
<p>We will generate our own dataset for the abovementioned
187186
\( \mathrm{Runge}(x) \) function with \( x\in [-1,1] \). You should explore also the addition
188187
of an added stochastic noise to this function using the normal
189188
distribution \( N(0,1) \).
190189
</p>
191190

192191
<p><em>Write your own code</em> (using for example the pseudoinverse function <b>pinv</b> from <b>Numpy</b> ) and perform a standard <b>ordinary least square regression</b>
193-
analysis using polynomials in \( x \) up to order \( 15 \). Explore the dependence on the number of data points and the polynomial degree.
192+
analysis using polynomials in \( x \) up to order \( 15 \) or higher. Explore the dependence on the number of data points and the polynomial degree.
194193
</p>
195194

196195
<p>Evaluate the mean squared error (MSE)</p>
@@ -214,13 +213,13 @@ <h3 id="part-a-ordinary-least-square-ols-for-the-runge-function" class="anchor">
214213
\bar{y} = \frac{1}{n} \sum_{i=0}^{n - 1} y_i.
215214
$$
216215

217-
<p>Plot the resulting scores (MSE and R$^2$) as functions of the polynomial degree (here up to polymial degree 20).
216+
<p>Plot the resulting scores (MSE and \( R^2 \)) as functions of the polynomial degree (here up to polynomial degree 15).
218217
Plot also the parameters \( \theta \) as you increase the order of the polynomial. Comment your results.
219218
</p>
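The steps of part a) above, together with the scaling/centering and train/test split required below, can be sketched in a few lines of NumPy. This is a minimal illustration, not a prescribed implementation; it assumes the standard Runge function \( f(x)=1/(1+25x^2) \), a noise amplitude of 0.1, and a hand-rolled 80/20 split (Scikit-Learn's train_test_split does the same job).

```python
import numpy as np

def runge(x):
    # The standard Runge function on [-1, 1]
    return 1.0 / (1.0 + 25.0 * x**2)

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1.0, 1.0, n)
y = runge(x) + 0.1 * rng.standard_normal(n)      # added stochastic N(0,1) noise, scaled

# Polynomial features [x, x^2, ..., x^15]; the intercept is handled by centering
X = np.vander(x, 16, increasing=True)[:, 1:]

# Simple 80/20 train/test split
idx = rng.permutation(n)
train, test = idx[:160], idx[160:]

# Center with training-set statistics only, then solve OLS via the pseudoinverse
X_mean, y_mean = X[train].mean(axis=0), y[train].mean()
theta = np.linalg.pinv(X[train] - X_mean) @ (y[train] - y_mean)

def predict(X_new):
    return (X_new - X_mean) @ theta + y_mean

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred)**2)

def r2(y_true, y_pred):
    return 1.0 - np.sum((y_true - y_pred)**2) / np.sum((y_true - y_true.mean())**2)

print("test MSE:", mse(y[test], predict(X[test])))
print("test R^2:", r2(y[test], predict(X[test])))
```

Looping this over the polynomial degree and the number of data points gives the score curves asked for above.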
220219

221220
<p>Your code has to include a scaling/centering of the data (for example by
222221
subtracting the mean value), and
223-
a split of the data in training and test data. For this exercise you can
222+
a split of the data in training and test data. For the splitting you can
224223
either write your own code or use for example the function for
225224
splitting training data provided by the library <b>Scikit-Learn</b> (make
226225
sure you have installed it). This function is called
@@ -243,11 +242,11 @@ <h3 id="part-a-ordinary-least-square-ols-for-the-runge-function" class="anchor">
243242
<h3 id="part-b-adding-ridge-regression-for-the-runge-function" class="anchor">Part b: Adding Ridge regression for the Runge function </h3>
244243

245244
<p>Write your own code for the Ridge method as done in the previous
246-
exercise. The lecture notes from week 35 and 36 contain more information. Furthermore, the exercise from week 36 is something you can reuse here.
245+
exercise. The lecture notes from weeks 35 and 36 contain more information. Furthermore, the results from the exercise set from week 36 are something you can reuse here.
247246
</p>
248247

249248
<p>Perform the same analysis as you did in the previous exercise but now for different values of \( \lambda \). Compare and
250-
analyze your results with those obtained in part a) with the ordinary least squares method. Study the
249+
analyze your results with those obtained in part a) with the OLS method. Study the
251250
dependence on \( \lambda \).
252251
</p>
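A minimal sketch of the closed-form Ridge step on centered data; the \( \lambda \) grid and the degree-15 design matrix are illustrative choices, not requirements.

```python
import numpy as np

def ridge_theta(X, y, lam):
    # Closed-form Ridge solution on centered data:
    # theta = (X^T X + lam * I)^(-1) X^T y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 100)
y = 1.0 / (1.0 + 25.0 * x**2) + 0.1 * rng.standard_normal(100)
X = np.vander(x, 16, increasing=True)[:, 1:]
X_c, y_c = X - X.mean(axis=0), y - y.mean()

for lam in [1e-4, 1e-2, 1.0]:
    theta = ridge_theta(X_c, y_c, lam)
    # Larger lambda shrinks the size of the coefficient vector
    print(lam, np.linalg.norm(theta))
```

Sweeping \( \lambda \) on a logarithmic grid and recording the test MSE reproduces the comparison with OLS asked for here.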
253252
<h3 id="part-c-writing-your-own-gradient-descent-code" class="anchor">Part c: Writing your own gradient descent code </h3>
@@ -268,15 +267,15 @@ <h3 id="part-d-including-momentum-and-more-advanced-ways-to-update-the-learning-
268267
the gradient descent method by including <b>momentum</b>, <b>ADAgrad</b>,
269268
<b>RMSprop</b> and <b>ADAM</b> as methods for iteratively updating your learning
270269
rate. Discuss the results and compare the different methods applied to
271-
the one-dimensional Runge function.
270+
the one-dimensional Runge function. The lecture notes from week 37 contain several examples on how to implement these methods.
272271
</p>
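The momentum and ADAM updates for the OLS cost can be sketched as follows; all hyperparameter values (step size, momentum factor, ADAM moment decays) are illustrative defaults, not values prescribed by the project text, and a low-degree design matrix is used to keep the problem well conditioned.

```python
import numpy as np

def gd_momentum(X, y, eta=0.05, gamma=0.9, n_iter=5000):
    # Gradient descent with momentum for C(theta) = (1/n) ||y - X theta||^2
    n, p = X.shape
    theta, v = np.zeros(p), np.zeros(p)
    for _ in range(n_iter):
        grad = (2.0 / n) * X.T @ (X @ theta - y)
        v = gamma * v + eta * grad
        theta -= v
    return theta

def adam(X, y, eta=0.01, b1=0.9, b2=0.999, eps=1e-8, n_iter=5000):
    # ADAM: running first and second moments of the gradient with bias correction
    n, p = X.shape
    theta, m, s = np.zeros(p), np.zeros(p), np.zeros(p)
    for t in range(1, n_iter + 1):
        grad = (2.0 / n) * X.T @ (X @ theta - y)
        m = b1 * m + (1.0 - b1) * grad
        s = b2 * s + (1.0 - b2) * grad**2
        theta -= eta * (m / (1.0 - b1**t)) / (np.sqrt(s / (1.0 - b2**t)) + eps)
    return theta

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 100)
y = 1.0 / (1.0 + 25.0 * x**2) + 0.1 * rng.standard_normal(100)
X = np.vander(x, 4, increasing=True)[:, 1:]       # columns [x, x^2, x^3]
theta_exact = np.linalg.pinv(X) @ y               # analytical reference solution
print(gd_momentum(X, y), adam(X, y), theta_exact)
```

Both iterative solutions should approach the pseudoinverse reference; comparing their convergence speed is exactly the discussion asked for in this part.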
273272
<h3 id="part-e-writing-our-own-code-for-lasso-regression" class="anchor">Part e: Writing our own code for Lasso regression </h3>
274273

275274
<p>LASSO regression (see lecture slides from week 36 and week 37)
276275
represents our first encounter with a machine learning method which
277-
cannot be solved through analytical expressions. Use the gradient
276+
cannot be solved through analytical expressions (as in OLS and Ridge regression). Use the gradient
278277
descent methods you developed in parts c) and d) to solve the LASSO
279-
optimization problem. You can compare your results using
278+
optimization problem. You can compare your results with
280279
the functionalities of <b>Scikit-Learn</b>.
281280
</p>
282281

@@ -286,14 +285,16 @@ <h3 id="part-e-writing-our-own-code-for-lasso-regression" class="anchor">Part e:
286285
</p>
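One common way to adapt the gradient machinery from parts c) and d) to the LASSO problem is proximal gradient descent (ISTA) with soft-thresholding; the sketch below makes that assumption, and the step size and penalty values are illustrative.

```python
import numpy as np

def soft_threshold(z, tau):
    # Proximal operator of tau * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def lasso_ista(X, y, lam, eta=0.01, n_iter=5000):
    # Proximal gradient (ISTA) for (1/n)||y - X theta||^2 + lam * ||theta||_1:
    # a gradient step on the quadratic part, then soft-thresholding
    n, p = X.shape
    theta = np.zeros(p)
    for _ in range(n_iter):
        grad = (2.0 / n) * X.T @ (X @ theta - y)
        theta = soft_threshold(theta - eta * grad, eta * lam)
    return theta

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 100)
y = 1.0 / (1.0 + 25.0 * x**2) + 0.1 * rng.standard_normal(100)
X = np.vander(x, 6, increasing=True)[:, 1:]
# The l1 penalty can drive coefficients exactly to zero
print(lasso_ista(X, y, lam=0.1))
```

The result can be checked against Scikit-Learn's Lasso estimator, as the text suggests (note that its penalty parameterization differs by a constant factor, so the values of \( \lambda \) may need rescaling).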
287286
<h3 id="part-f-stochastic-gradient-descent" class="anchor">Part f: Stochastic gradient descent </h3>
288287

289-
<p>Our last gradient step is to include stochastic gradient descent using the
290-
same methods to update the learning rates as in parts c-e).
291-
Compare and discuss your results with and without stochastic gradient and give a critical assessment of the various methods.
288+
<p>Our last gradient step is to include stochastic gradient descent using
289+
the same methods to update the learning rates as in parts c-e).
290+
Compare and discuss your results with and without stochastic gradient
291+
and give a critical assessment of the various methods.
292292
</p>
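A plain mini-batch SGD loop for the OLS cost, onto which the learning-rate updates from parts c)-e) can be grafted; the batch size, learning rate and epoch count are illustrative choices.

```python
import numpy as np

def sgd_ols(X, y, eta=0.05, batch=10, n_epochs=200, rng=None):
    # Mini-batch stochastic gradient descent for C(theta) = (1/n)||y - X theta||^2
    rng = rng or np.random.default_rng(0)
    n, p = X.shape
    theta = np.zeros(p)
    for _ in range(n_epochs):
        idx = rng.permutation(n)                  # new shuffle each epoch
        for start in range(0, n, batch):
            b = idx[start:start + batch]
            grad = (2.0 / len(b)) * X[b].T @ (X[b] @ theta - y[b])
            theta -= eta * grad
    return theta

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 100)
y = 1.0 / (1.0 + 25.0 * x**2) + 0.1 * rng.standard_normal(100)
X = np.vander(x, 4, increasing=True)[:, 1:]
theta_sgd = sgd_ols(X, y)
print("SGD cost:", np.mean((X @ theta_sgd - y)**2))
```

Swapping the plain update `theta -= eta * grad` for the momentum or ADAM updates gives the stochastic variants to compare against their full-gradient counterparts.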
293293
<h3 id="part-g-bias-variance-trade-off-and-resampling-techniques" class="anchor">Part g: Bias-variance trade-off and resampling techniques </h3>
294294

295-
<p>Our aim here is to study the bias-variance trade-off by implementing the <b>bootstrap</b> resampling technique.
296-
<b>We will only use the simpler ordinary least squares here</b>.
295+
<p>Our aim here is to study the bias-variance trade-off by implementing
296+
the <b>bootstrap</b> resampling technique. <b>We will only use the simpler
297+
ordinary least squares here</b>.
297298
</p>
298299

299300
<p>With a code which does OLS and includes resampling techniques,
@@ -303,11 +304,14 @@ <h3 id="part-g-bias-variance-trade-off-and-resampling-techniques" class="anchor"
303304
tasks and basically all Machine Learning algorithms.
304305
</p>
305306

306-
<p>Before you perform an analysis of the bias-variance trade-off on your test data, make
307-
first a figure similar to Fig. 2.11 of Hastie, Tibshirani, and
308-
Friedman. Figure 2.11 of this reference displays only the test and training MSEs. The test MSE can be used to
309-
indicate possible regions of low/high bias and variance. You will most likely not get an
310-
equally smooth curve!
307+
<p>Before you perform an analysis of the bias-variance trade-off on your
308+
test data, make first a figure similar to Fig. 2.11 of Hastie,
309+
Tibshirani, and Friedman. Figure 2.11 of this reference displays only
310+
the test and training MSEs. The test MSE can be used to indicate
311+
possible regions of low/high bias and variance. You will most likely
312+
not get an equally smooth curve! You may also need to increase the
313+
polynomial order and play around with the number of data points as
314+
well (see also the exercise set from week 35).
311315
</p>
312316

313317
<p>With this result we move on to the bias-variance trade-off analysis.</p>
@@ -317,7 +321,7 @@ <h3 id="part-g-bias-variance-trade-off-and-resampling-techniques" class="anchor"
317321
\( \mathbf{X}_\mathcal{L}=\{(y_j, \boldsymbol{x}_j), j=0\ldots n-1\} \).
318322
</p>
319323

320-
<p>As in part d), we assume that the true data is generated from a noisy model</p>
324+
<p>We assume that the true data is generated from a noisy model</p>
321325

322326
$$
323327
\boldsymbol{y}=f(\boldsymbol{x}) + \boldsymbol{\epsilon}.
@@ -329,29 +333,32 @@ <h3 id="part-g-bias-variance-trade-off-and-resampling-techniques" class="anchor"
329333

330334
<p>In our derivation of the ordinary least squares method we defined then
331335
an approximation to the function \( f \) in terms of the parameters
332-
\( \boldsymbol{\beta} \) and the design matrix \( \boldsymbol{X} \) which embody our model,
333-
that is \( \boldsymbol{\tilde{y}}=\boldsymbol{X}\boldsymbol{\beta} \).
336+
\( \boldsymbol{\theta} \) and the design matrix \( \boldsymbol{X} \) which embody our model,
337+
that is \( \boldsymbol{\tilde{y}}=\boldsymbol{X}\boldsymbol{\theta} \).
334338
</p>
335339

336-
<p>The parameters \( \boldsymbol{\beta} \) are in turn found by optimizing the mean
340+
<p>The parameters \( \boldsymbol{\theta} \) are in turn found by optimizing the mean
337341
squared error via the so-called cost function
338342
</p>
339343

340344
$$
341-
C(\boldsymbol{X},\boldsymbol{\beta}) =\frac{1}{n}\sum_{i=0}^{n-1}(y_i-\tilde{y}_i)^2=\mathbb{E}\left[(\boldsymbol{y}-\boldsymbol{\tilde{y}})^2\right].
345+
C(\boldsymbol{X},\boldsymbol{\theta}) =\frac{1}{n}\sum_{i=0}^{n-1}(y_i-\tilde{y}_i)^2=\mathbb{E}\left[(\boldsymbol{y}-\boldsymbol{\tilde{y}})^2\right].
342346
$$
343347

344348
<p>Here the expected value \( \mathbb{E} \) is the sample value. </p>
345349

346-
<p>Show that you can rewrite this in terms of a term which contains the variance of the model itself (the so-called variance term), a
347-
term which measures the deviation from the true data and the mean value of the model (the bias term) and finally the variance of the noise.
348-
That is, show that
350+
<p>Show that you can rewrite this in terms of a term which contains the
351+
variance of the model itself (the so-called variance term), a term
352+
which measures the deviation from the true data and the mean value of
353+
the model (the bias term) and finally the variance of the noise.
349354
</p>
355+
356+
<p>That is, show that</p>
350357
$$
351358
\mathbb{E}\left[(\boldsymbol{y}-\boldsymbol{\tilde{y}})^2\right]=\mathrm{Bias}[\tilde{y}]+\mathrm{var}[\tilde{y}]+\sigma^2,
352359
$$
353360

354-
<p>with </p>
361+
<p>with (we approximate \( f(\boldsymbol{x})\approx \boldsymbol{y} \)) </p>
355362
$$
356363
\mathrm{Bias}[\tilde{y}]=\mathbb{E}\left[\left(\boldsymbol{y}-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right]\right)^2\right],
357364
$$
@@ -361,8 +368,12 @@ <h3 id="part-g-bias-variance-trade-off-and-resampling-techniques" class="anchor"
361368
\mathrm{var}[\tilde{y}]=\mathbb{E}\left[\left(\tilde{\boldsymbol{y}}-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right]\right)^2\right]=\frac{1}{n}\sum_i(\tilde{y}_i-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right])^2.
362369
$$
363370

364-
<p>The answer to this exercise should be included in the theory part of the report. This exercise is also part of the weekly exercises of week 38.
365-
Explain what the terms mean and discuss their interpretations.
371+
<p><b>Important note</b>: Since the function \( f(x) \) is unknown, in order to be able to evaluate the bias, we replace \( f(\boldsymbol{x}) \) in the expression for the bias with \( \boldsymbol{y} \). </p>
372+
373+
<p>The answer to this exercise should be included in the theory part of
374+
the report. This exercise is also part of the weekly exercises of
375+
week 38. Explain what the terms mean and discuss their
376+
interpretations.
366377
</p>
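The bootstrap estimate of this decomposition can be sketched as follows, using the note's convention of replacing \( f(\boldsymbol{x}) \) by \( \boldsymbol{y} \) on the test set; with that replacement the estimated bias and variance terms add up exactly to the estimated error (the noise \( \sigma^2 \) is then absorbed into the bias term). The polynomial degree and number of bootstrap samples are illustrative.

```python
import numpy as np

def bias_variance_bootstrap(x_tr, y_tr, x_te, y_te, degree, n_boot=200, rng=None):
    # Bootstrap the OLS fit and decompose the test error into (bias^2 + noise)
    # and variance, with f(x) approximated by the test targets y_te
    rng = rng or np.random.default_rng(0)
    n = len(x_tr)
    preds = np.empty((n_boot, len(x_te)))
    for b in range(n_boot):
        idx = rng.integers(0, n, n)               # resample with replacement
        Xb = np.vander(x_tr[idx], degree + 1, increasing=True)
        theta = np.linalg.pinv(Xb) @ y_tr[idx]
        preds[b] = np.vander(x_te, degree + 1, increasing=True) @ theta
    error = np.mean((y_te[None, :] - preds)**2)
    bias2 = np.mean((y_te - preds.mean(axis=0))**2)
    variance = np.mean(preds.var(axis=0))
    return error, bias2, variance

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 100)
y = 1.0 / (1.0 + 25.0 * x**2) + 0.1 * rng.standard_normal(100)
error, bias2, variance = bias_variance_bootstrap(x[:80], y[:80], x[80:], y[80:], degree=5)
print(error, bias2, variance)
```

Repeating this for increasing polynomial degree produces the trade-off plot: the bias term falls with model complexity while the variance term grows.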
367378

368379
<p>Perform then a bias-variance analysis of the Runge function by
@@ -380,16 +391,18 @@ <h3 id="part-h-cross-validation-as-resampling-techniques-adding-more-complexity"
380391
resampling technique, the so-called cross-validation method.
381392
</p>
382393

383-
<p>Implement the \( k \)-fold cross-validation algorithm (feel free to use the functionality of <b>Scikit-Learn</b> or write your own code) and evaluate again the MSE function resulting
384-
from the test folds.
394+
<p>Implement the \( k \)-fold cross-validation algorithm (feel free to use
395+
the functionality of <b>Scikit-Learn</b> or write your own code) and
396+
evaluate again the MSE function resulting from the test folds.
385397
</p>
386398

387399
<p>Compare the MSE you get from your cross-validation code with the one
388-
you got from your <b>bootstrap</b> code. Comment your results. Try \( 5-10 \)
389-
folds.
400+
you got from your <b>bootstrap</b> code from the previous exercise. Comment and interpret your results.
390401
</p>
391402

392-
<p>In addition to using the ordinary least squares method, you should include both Ridge and Lasso regression in the analysis. </p>
403+
<p>In addition to using the ordinary least squares method, you should
404+
include both Ridge and Lasso regression in the final analysis.
405+
</p>
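A \( k \)-fold sketch written with plain NumPy (Scikit-Learn's KFold or cross_val_score would serve equally well); the choice of \( k=5 \) and the polynomial degree are illustrative.

```python
import numpy as np

def kfold_mse(x, y, degree, k=5, rng=None):
    # k-fold cross-validation of an OLS polynomial fit; returns the average
    # MSE over the k test folds
    rng = rng or np.random.default_rng(0)
    idx = rng.permutation(len(x))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        X_tr = np.vander(x[train], degree + 1, increasing=True)
        theta = np.linalg.pinv(X_tr) @ y[train]
        X_te = np.vander(x[test], degree + 1, increasing=True)
        scores.append(np.mean((y[test] - X_te @ theta)**2))
    return float(np.mean(scores))

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 100)
y = 1.0 / (1.0 + 25.0 * x**2) + 0.1 * rng.standard_normal(100)
cv_mse = kfold_mse(x, y, degree=5, k=5)
print("5-fold CV MSE:", cv_mse)
```

Replacing the pseudoinverse solve with the Ridge or LASSO solvers from the earlier parts extends the same loop to the full comparison asked for here.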
393406
<h2 id="background-literature" class="anchor">Background literature </h2>
394407

395408
<ol>

doc/Projects/2025/Project1/html/Project1-bs.html

Lines changed: 9 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -175,18 +175,20 @@ <h4>September 2</h4>
175175
<h2 id="preamble-note-on-writing-reports-using-reference-material-ai-and-other-tools" class="anchor">Preamble: Note on writing reports, using reference material, AI and other tools </h2>
176176

177177
<p>We want you to answer the three different projects by handing in
178-
reports written like a standard scientific/technical report. The links
179-
at <a href="https://github.com/CompPhysics/MachineLearning/tree/master/doc/Projects" target="_self"><tt>https://github.com/CompPhysics/MachineLearning/tree/master/doc/Projects</tt></a>
180-
Furthermore, at the same link,
181-
you can find examples of previous reports. How to write reports will
182-
also be discussed during the various lab sessions. Please do ask us if you are in doubt.
178+
reports written like a standard scientific/technical report. The
179+
links at
180+
<a href="https://github.com/CompPhysics/MachineLearning/tree/master/doc/Projects" target="_self"><tt>https://github.com/CompPhysics/MachineLearning/tree/master/doc/Projects</tt></a>
181+
contain more information. There you can find examples of previous
182+
reports, the projects themselves, how we grade reports, etc. How to
183+
write reports will also be discussed during the various lab
184+
sessions. Please do ask us if you are in doubt.
183185
</p>
184186

185187
<p>When using codes and material from other sources, you should refer to
186188
these in the bibliography of your report, indicating wherefrom you for
187189
example got the code, whether this is from the lecture notes,
188-
softwares like Scikit-Learn, TensorFlow, PyTorch or other sources such
189-
AI software. These should always be cited correctly. How to cite some
190+
software like Scikit-Learn, TensorFlow, PyTorch or other sources. These sources
191+
should always be cited correctly. How to cite some
190192
of the libraries is often indicated from their corresponding GitHub
191193
sites or websites, see for example how to cite Scikit-Learn at
192194
<a href="https://scikit-learn.org/dev/about.html" target="_self"><tt>https://scikit-learn.org/dev/about.html</tt></a>.
