|
103 | 103 | 2, |
104 | 104 | None, |
105 | 105 | 'building-code-using-pytorch'), |
| 106 | + ('What is dropout?', 2, None, 'what-is-dropout'), |
| 107 | + ('Key benefits of Dropout:', 3, None, 'key-benefits-of-dropout'), |
106 | 108 | ('Building our own CNN code', |
107 | 109 | 2, |
108 | 110 | None, |
|
328 | 330 | <!-- navigation toc: --> <li><a href="#final-visualization" style="font-size: 80%;"><b>Final visualization</b></a></li> |
329 | 331 | <!-- navigation toc: --> <li><a href="#finally-evaluate-the-model" style="font-size: 80%;"><b>Finally, evaluate the model</b></a></li> |
330 | 332 | <!-- navigation toc: --> <li><a href="#building-code-using-pytorch" style="font-size: 80%;"><b>Building code using Pytorch</b></a></li> |
| 333 | + <!-- navigation toc: --> <li><a href="#what-is-dropout" style="font-size: 80%;"><b>What is dropout?</b></a></li> |
| 334 | + <!-- navigation toc: --> <li><a href="#key-benefits-of-dropout" style="font-size: 80%;"> Key benefits of Dropout:</a></li> |
331 | 335 | <!-- navigation toc: --> <li><a href="#building-our-own-cnn-code" style="font-size: 80%;"><b>Building our own CNN code</b></a></li> |
332 | 336 | <!-- navigation toc: --> <li><a href="#list-of-contents" style="font-size: 80%;"> List of contents:</a></li> |
333 | 337 | <!-- navigation toc: --> <li><a href="#schedulers" style="font-size: 80%;"> Schedulers</a></li> |
@@ -827,9 +831,7 @@ <h2 id="example-of-how-we-can-up-a-model-without-a-specific-image" class="anchor |
827 | 831 | model<span style="color: #666666">.</span>add(layers<span style="color: #666666">.</span>Conv2D(<span style="color: #666666">64</span>, (<span style="color: #666666">3</span>, <span style="color: #666666">3</span>), activation<span style="color: #666666">=</span><span style="color: #BA2121">'relu'</span>)) |
828 | 832 | model<span style="color: #666666">.</span>add(layers<span style="color: #666666">.</span>MaxPooling2D((<span style="color: #666666">2</span>, <span style="color: #666666">2</span>))) |
829 | 833 | model<span style="color: #666666">.</span>add(layers<span style="color: #666666">.</span>Conv2D(<span style="color: #666666">64</span>, (<span style="color: #666666">3</span>, <span style="color: #666666">3</span>), activation<span style="color: #666666">=</span><span style="color: #BA2121">'relu'</span>)) |
830 | | - |
831 | 834 | <span style="color: #408080; font-style: italic"># Here we display the architecture of our model so far.</span> |
832 | | - |
833 | 835 | model<span style="color: #666666">.</span>summary() |
834 | 836 | </pre> |
835 | 837 | </div> |
@@ -870,7 +872,7 @@ <h2 id="add-dense-layers-on-top" class="anchor">Add Dense layers on top </h2> |
870 | 872 | <pre style="line-height: 125%;">model<span style="color: #666666">.</span>add(layers<span style="color: #666666">.</span>Flatten()) |
871 | 873 | model<span style="color: #666666">.</span>add(layers<span style="color: #666666">.</span>Dense(<span style="color: #666666">64</span>, activation<span style="color: #666666">=</span><span style="color: #BA2121">'relu'</span>)) |
872 | 874 | model<span style="color: #666666">.</span>add(layers<span style="color: #666666">.</span>Dense(<span style="color: #666666">10</span>)) |
873 | | -Here<span style="color: #BA2121">'s the complete architecture of our model.</span> |
| 875 | +<span style="color: #408080; font-style: italic"># Now we list the complete architecture of our model.</span> |
874 | 876 | model<span style="color: #666666">.</span>summary() |
875 | 877 | </pre> |
876 | 878 | </div> |
@@ -1023,7 +1025,7 @@ <h2 id="running-with-keras-and-setting-up-the-model" class="anchor">Running with |
1023 | 1025 | model<span style="color: #666666">.</span>compile(loss<span style="color: #666666">=</span><span style="color: #BA2121">'categorical_crossentropy'</span>, optimizer<span style="color: #666666">=</span>sgd, metrics<span style="color: #666666">=</span>[<span style="color: #BA2121">'accuracy'</span>]) |
1024 | 1026 |
|
1025 | 1027 | <span style="color: #008000; font-weight: bold">return</span> model |
1026 | | - |
| 1028 | +<span style="color: #408080; font-style: italic">#model.summary()</span> |
1027 | 1029 | epochs <span style="color: #666666">=</span> <span style="color: #666666">100</span> |
1028 | 1030 | batch_size <span style="color: #666666">=</span> <span style="color: #666666">100</span> |
1029 | 1031 | input_shape <span style="color: #666666">=</span> X_train<span style="color: #666666">.</span>shape[<span style="color: #666666">1</span>:<span style="color: #666666">4</span>] |
@@ -1300,6 +1302,29 @@ <h2 id="building-code-using-pytorch" class="anchor">Building code using Pytorch |
1300 | 1302 | </div> |
1301 | 1303 |
|
1302 | 1304 |
|
| 1305 | +<!-- !split --> |
| 1306 | +<h2 id="what-is-dropout" class="anchor">What is dropout? </h2> |
| 1307 | + |
| 1308 | +<p>Dropout is a regularization technique used to prevent overfitting in |
| 1309 | +neural networks. During training, a randomly selected fraction |
| 1310 | +(typically 20-50%) of the neurons in each layer is temporarily |
| 1311 | +deactivated (dropped out). This compels the network to distribute |
| 1312 | +feature representations across multiple neurons instead of relying on |
| 1313 | +a small subset. |
| 1314 | +</p> |
| 1315 | +<h3 id="key-benefits-of-dropout" class="anchor">Key benefits of Dropout: </h3> |
| 1316 | + |
| 1317 | +<ol> |
| 1318 | +<li> Random Neuron Deactivation: Neurons are randomly ignored during each forward pass.</li> |
| 1319 | +<li> Overfitting Reduction: Encourages the learning of more robust data representations by preventing reliance on specific neurons.</li> |
| 1320 | +<li> Test-time Rescaling: At test time, all neurons are active, with weights scaled down to account for the training-phase dropout rate.</li> |
| 1321 | +</ol> |
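The three points above can be sketched in a few lines of NumPy. Note that the example below uses the equivalent "inverted" variant implemented in modern frameworks: the surviving activations are scaled up by 1/(1-p) during training, so no weight rescaling is needed at test time. The function name and the rate p=0.5 are illustrative choices, not values from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(x, p, training=True):
    """Inverted dropout: zero a fraction p of the activations during
    training and rescale the survivors by 1/(1-p); at test time the
    input passes through unchanged."""
    if not training:
        return x  # all neurons active at test time
    mask = rng.random(x.shape) >= p  # keep each unit with probability 1-p
    return x * mask / (1.0 - p)

x = np.ones((4, 1000))
y_train = dropout_forward(x, p=0.5, training=True)  # roughly half the units zeroed
y_test = dropout_forward(x, p=0.5, training=False)  # identical to x
```

Because of the 1/(1-p) rescaling, the expected value of each activation is the same in training and test mode.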
| 1322 | +<p>Dropout is commonly applied after fully connected layers in CNNs, but can also be used after convolutional layers.</p> |
| 1323 | + |
| 1324 | +<ol> |
| 1325 | +<li> Application after convolutional layers: While less frequent, dropout can be applied between convolutional and pooling layers. The inherent regularization provided by shared weights and pooling in convolutional layers already mitigates overfitting.</li> |
| 1326 | +<li> Application after fully connected layers: Dropout is typically applied to fully connected layers due to the increased risk of overfitting caused by the large number of parameters.</li> |
| 1327 | +</ol> |
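The two placements described above can be illustrated with the Keras API already used in these notes: a lighter dropout rate after the convolutional block and a heavier one after the fully connected layer. The rates (0.25 and 0.5) and the CIFAR-like input shape (32, 32, 3) are conventional illustrative assumptions, not values from these notes.

```python
from tensorflow import keras
from tensorflow.keras import layers

# CNN with dropout after the convolutional block and after the dense layer.
model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.25),   # lighter rate between conv/pooling and dense layers
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.5),    # heavier rate on the fully connected activations
    layers.Dense(10),
])
model.summary()
```

Keras handles the training/test distinction automatically: the `Dropout` layers are active only while fitting and become identity maps during evaluation and prediction.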
1303 | 1328 | <!-- !split --> |
1304 | 1329 | <h2 id="building-our-own-cnn-code" class="anchor">Building our own CNN code </h2> |
1305 | 1330 |
|
|