Merged

2026 #206

Changes from all commits

34 commits
75a6844
updated Liquid model name in lab 3
shrika-eddula Dec 23, 2025
cc8abbd
should be state instead of hidden_state as per solutions
shrika-eddula Dec 23, 2025
84e329f
ptp was removed from the ndarray class in NumPy 2.0. Use np.ptp(arr, …
shrika-eddula Dec 23, 2025
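For reference: NumPy 2.0 removed the `ndarray.ptp` method, while the module-level `np.ptp` function remains. A minimal sketch of the migration (the array here is illustrative):

```python
import numpy as np

arr = np.array([[1.0, 5.0], [2.0, 8.0]])  # illustrative data

# NumPy < 2.0 allowed the method form:
# rng = arr.ptp(axis=0)        # AttributeError in NumPy 2.0+

# NumPy 2.0+: call the module-level function instead.
rng = np.ptp(arr, axis=0)      # peak-to-peak (max - min) per column
print(rng)                     # [1. 3.]
```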
dc2b4a1
In Keras 3, add_weight changed its function signature. Need to pass s…
shrika-eddula Dec 31, 2025
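For reference: in Keras 3, `add_weight` no longer accepts the weight's name as its first positional argument; `shape` leads and `name` must be passed as a keyword. A minimal sketch of the pattern the diff below adopts (the layer and sizes are illustrative, not the lab's solution):

```python
import tensorflow as tf

class TinyDense(tf.keras.layers.Layer):  # illustrative layer
    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        d = int(input_shape[-1])
        # Keras 2 accepted add_weight("weight", shape=...);
        # Keras 3 wants shape first and name as a keyword.
        self.W = self.add_weight(shape=(d, self.units), name="weight")
        self.b = self.add_weight(shape=(1, self.units), name="bias")

    def call(self, x):
        return tf.sigmoid(tf.matmul(x, self.W) + self.b)

layer = TinyDense(3)
print(layer(tf.constant([[1.0, 2.0]])).shape)  # (1, 3)
```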
930e1c3
Was running into an issue with reset_states() not running on 'Sequent…
shrika-eddula Dec 31, 2025
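For reference: Keras 3's `Sequential` no longer exposes a model-level `reset_states()`, which is why the diffs below reset state layer by layer. A minimal sketch of that loop wrapped as a helper (the function name is ours; the guarded loop mirrors the fix in this PR):

```python
import tensorflow as tf

def reset_recurrent_state(model: tf.keras.Model) -> None:
    """Reset state on every layer that supports it.

    Sequential.reset_states() is gone in Keras 3; stateful RNN layers
    still carry their own reset method, so walk the layers and guard
    with hasattr to skip stateless ones (Dense, Embedding, ...).
    """
    for layer in model.layers:
        if hasattr(layer, "reset_states"):
            layer.reset_states()
```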
a8ba334
update PTLab1 links
avaamini Jan 3, 2026
eaa9fc5
update 2026
avaamini Jan 3, 2026
9b5e542
update 2026
avaamini Jan 3, 2026
e20e060
update 2026 + link
avaamini Jan 3, 2026
6805092
update 2026
avaamini Jan 3, 2026
f48bf4f
update 2026 + submission link
avaamini Jan 3, 2026
11248eb
update 2026
avaamini Jan 3, 2026
6211f55
update 2026 + submission link
avaamini Jan 3, 2026
dc1434c
update 2026
avaamini Jan 3, 2026
f33f764
Update copyright year and submission link
avaamini Jan 3, 2026
88af21f
update 2026
avaamini Jan 3, 2026
7743df7
update 2026 + submission link
avaamini Jan 3, 2026
389f46c
update 2026
avaamini Jan 3, 2026
13734e6
update 2026 + submission link
avaamini Jan 4, 2026
ac37a5a
update 2026
avaamini Jan 4, 2026
d447d99
update 2026 + submission link
avaamini Jan 4, 2026
b4ddb15
img pointers to master
avaamini Jan 4, 2026
ba56203
img pointers to master
avaamini Jan 4, 2026
11fdf5c
update img pointers to master
avaamini Jan 4, 2026
c3c26b0
update img pointers to master
avaamini Jan 4, 2026
0fc9548
update img pointers to master
avaamini Jan 4, 2026
9065e63
update img pointers to master
avaamini Jan 4, 2026
9c28b17
update img pointers to master
avaamini Jan 4, 2026
dad4310
update img pointers to master
avaamini Jan 4, 2026
87e323d
Update copyright year and submission link
avaamini Jan 4, 2026
9f21659
update copyright year and submission link
avaamini Jan 4, 2026
1fb79a5
lfm2 finetuning, gemini2.5 judge, opik eval
avaamini Jan 4, 2026
d46e580
removing the mention of stop for json
avaamini Jan 4, 2026
1338884
student version w/ LFM, Gemini, Opik updates
avaamini Jan 4, 2026
10 changes: 5 additions & 5 deletions lab1/PT_Part1_Intro.ipynb
@@ -27,7 +27,7 @@
},
"outputs": [],
"source": [
"# Copyright 2025 MIT Introduction to Deep Learning. All Rights Reserved.\n",
"# Copyright 2026 MIT Introduction to Deep Learning. All Rights Reserved.\n",
"#\n",
"# Licensed under the MIT License. You may not use this file except in compliance\n",
"# with the License. Use and/or modification of this code outside of MIT Introduction\n",
@@ -53,7 +53,7 @@
"\n",
"## 0.1 Install PyTorch\n",
"\n",
"[PyTorch](https://pytorch.org/) is a popular deep learning library known for its flexibility and ease of use. Here we'll learn how computations are represented and how to define a simple neural network in PyTorch. For all the labs in Introduction to Deep Learning 2025, there will be a PyTorch version available.\n",
"[PyTorch](https://pytorch.org/) is a popular deep learning library known for its flexibility and ease of use. Here we'll learn how computations are represented and how to define a simple neural network in PyTorch. For all the labs in Introduction to Deep Learning 2026, there will be a PyTorch version available.\n",
"\n",
"Let's install PyTorch and a couple of dependencies."
]
@@ -203,7 +203,7 @@
"\n",
"A convenient way to think about and visualize computations in a machine learning framework like PyTorch is in terms of graphs. We can define this graph in terms of tensors, which hold data, and the mathematical operations that act on these tensors in some order. Let's look at a simple example, and define this computation using PyTorch:\n",
"\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2025/lab1/img/add-graph.png)"
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/add-graph.png)"
]
},
{
@@ -235,7 +235,7 @@
"\n",
"Now let's consider a slightly more complicated example:\n",
"\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2025/lab1/img/computation-graph.png)\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/computation-graph.png)\n",
"\n",
"Here, we take two inputs, `a, b`, and compute an output `e`. Each node in the graph represents an operation that takes some input, does some computation, and passes its output to another node.\n",
"\n",
@@ -306,7 +306,7 @@
"\n",
"Let's consider the example of a simple perceptron defined by just one dense (aka fully-connected or linear) layer: $ y = \\sigma(Wx + b) $, where $W$ represents a matrix of weights, $b$ is a bias, $x$ is the input, $\\sigma$ is the sigmoid activation function, and $y$ is the output.\n",
"\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2025/lab1/img/computation-graph-2.png)\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/computation-graph-2.png)\n",
"\n",
"We will use `torch.nn.Module` to define layers -- the building blocks of neural networks. Layers implement common neural networks operations. In PyTorch, when we implement a layer, we subclass `nn.Module` and define the parameters of the layer as attributes of our new class. We also define and override a function [``forward``](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.forward), which will define the forward pass computation that is performed at every step. All classes subclassing `nn.Module` should override the `forward` function.\n",
"\n",
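The markdown cells touched above describe a single dense layer, y = sigma(Wx + b), built by subclassing `nn.Module` and overriding `forward`. A minimal sketch of that pattern (names and sizes are illustrative, not the lab's solution):

```python
import torch
import torch.nn as nn

class SimplePerceptron(nn.Module):  # illustrative, not the lab solution
    def __init__(self, num_inputs, num_outputs):
        super().__init__()
        # W and b are registered as trainable parameters; init is random.
        self.W = nn.Parameter(torch.randn(num_inputs, num_outputs))
        self.b = nn.Parameter(torch.randn(1, num_outputs))

    def forward(self, x):
        # y = sigmoid(x W + b), applied to a batch of row vectors
        return torch.sigmoid(x @ self.W + self.b)

layer = SimplePerceptron(2, 3)
print(layer(torch.tensor([[1.0, 2.0]])).shape)  # torch.Size([1, 3])
```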
24 changes: 12 additions & 12 deletions lab1/PT_Part2_Music_Generation.ipynb
@@ -27,7 +27,7 @@
},
"outputs": [],
"source": [
"# Copyright 2025 MIT Introduction to Deep Learning. All Rights Reserved.\n",
"# Copyright 2026 MIT Introduction to Deep Learning. All Rights Reserved.\n",
"#\n",
"# Licensed under the MIT License. You may not use this file except in compliance\n",
"# with the License. Use and/or modification of this code outside of MIT Introduction\n",
@@ -399,7 +399,7 @@
"* [`nn.LSTM`](https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html): Our LSTM network, with size `hidden_size`.\n",
"* [`nn.Linear`](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html): The output layer, with `vocab_size` outputs.\n",
"\n",
"<img src=\"https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2019/lab1/img/lstm_unrolled-01-01.png\" alt=\"Drawing\"/>\n",
"<img src=\"https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/lstm_unrolled-01-01.png\" alt=\"Drawing\"/>\n",
"\n",
"\n",
"\n",
@@ -415,7 +415,7 @@
"* [`tf.keras.layers.Dense`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense): The output layer, with `vocab_size` outputs.\n",
"\n",
"\n",
"<img src=\"https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2019/lab1/img/lstm_unrolled-01-01.png\" alt=\"Drawing\"/> -->"
"<img src=\"https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/lstm_unrolled-01-01.png\" alt=\"Drawing\"/> -->"
]
},
{
@@ -652,6 +652,11 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "GuGUJB0ZT_Uo"
},
"outputs": [],
"source": [
"### compute the loss on the predictions from the untrained model from earlier. ###\n",
"y.shape # (batch_size, sequence_length)\n",
@@ -663,12 +668,7 @@
"\n",
"print(f\"Prediction shape: {pred.shape} # (batch_size, sequence_length, vocab_size)\")\n",
"print(f\"scalar_loss: {example_batch_loss.mean().item()}\")"
-],
-"metadata": {
-"id": "GuGUJB0ZT_Uo"
-},
-"execution_count": null,
-"outputs": []
+]
},
{
"cell_type": "markdown",
@@ -875,7 +875,7 @@
"\n",
"* At each time step, the updated RNN state is fed back into the model, so that it now has more context in making the next prediction. After predicting the next character, the updated RNN states are again fed back into the model, which is how it learns sequence dependencies in the data, as it gets more information from the previous predictions.\n",
"\n",
"![LSTM inference](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2019/lab1/img/lstm_inference.png)\n",
"![LSTM inference](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/lstm_inference.png)\n",
"\n",
"Complete and experiment with this code block (as well as some of the aspects of network definition and training!), and see how the model performs. How do songs generated after training with a small number of epochs compare to those generated after a longer duration of training?"
]
@@ -906,7 +906,7 @@
"\n",
" for i in tqdm(range(generation_length)):\n",
" '''TODO: evaluate the inputs and generate the next character predictions'''\n",
" predictions, hidden_state = model('''TODO''', '''TODO''', return_state=True) # TODO\n",
" predictions, state = model('''TODO''', '''TODO''', return_state=True) # TODO\n",
"\n",
" # Remove the batch dimension\n",
" predictions = predictions.squeeze(0)\n",
@@ -1004,7 +1004,7 @@
"* What if you alter or augment the dataset?\n",
"* Does the choice of start string significantly affect the result?\n",
"\n",
"Try to optimize your model and submit your best song! **Participants will be eligible for prizes during the January 2025 offering. To enter the competition, you must upload the following to [this submission link](https://www.dropbox.com/request/U8nND6enGjirujVZKX1n):**\n",
"Try to optimize your model and submit your best song! **Participants will be eligible for prizes during the January 2026 offering. To enter the competition, you must upload the following to [this submission link](https://www.dropbox.com/request/4hqfsOnLtX4jH1W3ynfp):**\n",
"\n",
"* a recording of your song;\n",
"* iPython notebook with the code you used to generate the song;\n",
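The generation loop patched above (`hidden_state` renamed to `state`, matching the solutions) feeds each prediction and the updated recurrent state back into the model. A sketch of one sampling step under that convention; the `model(x, state, return_state=True)` signature is the lab's custom wrapper, assumed here, not a generic PyTorch API:

```python
import torch

def sample_next_char(model, input_ids, state):
    # Assumed lab convention: returns (logits, state), with logits
    # shaped (batch, seq_len, vocab_size).
    predictions, state = model(input_ids, state, return_state=True)
    predictions = predictions.squeeze(0)       # drop the batch dimension
    last_logits = predictions[-1]              # logits at the final position
    probs = torch.softmax(last_logits, dim=-1)
    next_id = torch.multinomial(probs, num_samples=1)  # sample, not argmax
    return next_id, state
```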
20 changes: 13 additions & 7 deletions lab1/TF_Part1_Intro.ipynb
@@ -27,7 +27,7 @@
},
"outputs": [],
"source": [
"# Copyright 2025 MIT Introduction to Deep Learning. All Rights Reserved.\n",
"# Copyright 2026 MIT Introduction to Deep Learning. All Rights Reserved.\n",
"#\n",
"# Licensed under the MIT License. You may not use this file except in compliance\n",
"# with the License. Use and/or modification of this code outside of MIT Introduction\n",
@@ -53,7 +53,7 @@
"\n",
"## 0.1 Install TensorFlow\n",
"\n",
"TensorFlow is a software library extensively used in machine learning. Here we'll learn how computations are represented and how to define a simple neural network in TensorFlow. For all the TensorFlow labs in Introduction to Deep Learning 2025, we'll be using TensorFlow 2, which affords great flexibility and the ability to imperatively execute operations, just like in Python. You'll notice that TensorFlow 2 is quite similar to Python in its syntax and imperative execution. Let's install TensorFlow and a couple of dependencies.\n"
"TensorFlow is a software library extensively used in machine learning. Here we'll learn how computations are represented and how to define a simple neural network in TensorFlow. For all the TensorFlow labs in Introduction to Deep Learning 2026, we'll be using TensorFlow 2, which affords great flexibility and the ability to imperatively execute operations, just like in Python. You'll notice that TensorFlow 2 is quite similar to Python in its syntax and imperative execution. Let's install TensorFlow and a couple of dependencies.\n"
]
},
{
@@ -208,7 +208,7 @@
"\n",
"A convenient way to think about and visualize computations in TensorFlow is in terms of graphs. We can define this graph in terms of Tensors, which hold data, and the mathematical operations that act on these Tensors in some order. Let's look at a simple example, and define this computation using TensorFlow:\n",
"\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2025/lab1/img/add-graph.png)"
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/add-graph.png)"
]
},
{
@@ -240,7 +240,7 @@
"\n",
"Now let's consider a slightly more complicated example:\n",
"\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2025/lab1/img/computation-graph.png)\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/computation-graph.png)\n",
"\n",
"Here, we take two inputs, `a, b`, and compute an output `e`. Each node in the graph represents an operation that takes some input, does some computation, and passes its output to another node.\n",
"\n",
@@ -311,7 +311,7 @@
"\n",
"Let's first consider the example of a simple perceptron defined by just one dense layer: $ y = \\sigma(Wx + b)$, where $W$ represents a matrix of weights, $b$ is a bias, $x$ is the input, $\\sigma$ is the sigmoid activation function, and $y$ is the output. We can also visualize this operation using a graph:\n",
"\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2025/lab1/img/computation-graph-2.png)\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/computation-graph-2.png)\n",
"\n",
"Tensors can flow through abstract types called [```Layers```](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Layer) -- the building blocks of neural networks. ```Layers``` implement common neural networks operations, and are used to update weights, compute losses, and define inter-layer connectivity. We will first define a ```Layer``` to implement the simple perceptron defined above."
]
@@ -339,8 +339,14 @@
" d = int(input_shape[-1])\n",
" # Define and initialize parameters: a weight matrix W and bias b\n",
" # Note that parameter initialization is random!\n",
" self.W = self.add_weight(\"weight\", shape=[d, self.n_output_nodes]) # note the dimensionality\n",
" self.b = self.add_weight(\"bias\", shape=[1, self.n_output_nodes]) # note the dimensionality\n",
" self.W = self.add_weight(\n",
" shape=(d, self.n_output_nodes),\n",
" name=\"weight\",\n",
" )\n",
" self.b = self.add_weight(\n",
" shape=(1, self.n_output_nodes),\n",
" name=\"bias\",\n",
" )\n",
"\n",
" def call(self, x):\n",
" '''TODO: define the operation for z (hint: use tf.matmul)'''\n",
12 changes: 7 additions & 5 deletions lab1/TF_Part2_Music_Generation.ipynb
@@ -27,7 +27,7 @@
},
"outputs": [],
"source": [
"# Copyright 2025 MIT Introduction to Deep Learning. All Rights Reserved.\n",
"# Copyright 2026 MIT Introduction to Deep Learning. All Rights Reserved.\n",
"#\n",
"# Licensed under the MIT License. You may not use this file except in compliance\n",
"# with the License. Use and/or modification of this code outside of MIT Introduction\n",
@@ -399,7 +399,7 @@
"* [`tf.keras.layers.Dense`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense): The output layer, with `vocab_size` outputs.\n",
"\n",
"\n",
"<img src=\"https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2019/lab1/img/lstm_unrolled-01-01.png\" alt=\"Drawing\"/>"
"<img src=\"https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/lstm_unrolled-01-01.png\" alt=\"Drawing\"/>"
]
},
{
@@ -858,7 +858,7 @@
"\n",
"* At each time step, the updated RNN state is fed back into the model, so that it now has more context in making the next prediction. After predicting the next character, the updated RNN states are again fed back into the model, which is how it learns sequence dependencies in the data, as it gets more information from the previous predictions.\n",
"\n",
"![LSTM inference](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2019/lab1/img/lstm_inference.png)\n",
"![LSTM inference](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/lstm_inference.png)\n",
"\n",
"Complete and experiment with this code block (as well as some of the aspects of network definition and training!), and see how the model performs. How do songs generated after training with a small number of epochs compare to those generated after a longer duration of training?"
]
@@ -884,7 +884,9 @@
" text_generated = []\n",
"\n",
" # Here batch size == 1\n",
" model.reset_states()\n",
" for layer in model.layers:\n",
" if hasattr(layer, \"reset_states\"):\n",
" layer.reset_states()\n",
" tqdm._instances.clear()\n",
"\n",
" for i in tqdm(range(generation_length)):\n",
@@ -991,7 +993,7 @@
"* What if you alter or augment the dataset?\n",
"* Does the choice of start string significantly affect the result?\n",
"\n",
"Try to optimize your model and submit your best song! **Participants will be eligible for prizes during the January 2025 offering. To enter the competition, you must upload the following to [this submission link](https://www.dropbox.com/request/U8nND6enGjirujVZKX1n):**\n",
"Try to optimize your model and submit your best song! **Participants will be eligible for prizes during the January 2025 offering. To enter the competition, you must upload the following to [this submission link](https://www.dropbox.com/request/4hqfsOnLtX4jH1W3ynfp):**\n",
"\n",
"* a recording of your song;\n",
"* iPython notebook with the code you used to generate the song;\n",
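On the TensorFlow side, the same inference loop samples the next character from the model's output logits, typically with `tf.random.categorical`. A minimal sketch with stand-in logits (batch size, sequence length, and vocab size are illustrative):

```python
import tensorflow as tf

logits = tf.random.normal((1, 5, 83))   # stand-in for model output:
                                        # (batch, seq_len, vocab_size)
last = logits[:, -1, :]                 # logits at the final time step
next_id = tf.random.categorical(last, num_samples=1)  # sampled index
print(int(tf.squeeze(next_id)))         # a character id in [0, 83)
```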
8 changes: 4 additions & 4 deletions lab1/solutions/PT_Part1_Intro_Solution.ipynb
@@ -27,7 +27,7 @@
},
"outputs": [],
"source": [
"# Copyright 2025 MIT Introduction to Deep Learning. All Rights Reserved.\n",
"# Copyright 2026 MIT Introduction to Deep Learning. All Rights Reserved.\n",
"#\n",
"# Licensed under the MIT License. You may not use this file except in compliance\n",
"# with the License. Use and/or modification of this code outside of MIT Introduction\n",
@@ -241,7 +241,7 @@
"\n",
"A convenient way to think about and visualize computations in a machine learning framework like PyTorch is in terms of graphs. We can define this graph in terms of tensors, which hold data, and the mathematical operations that act on these tensors in some order. Let's look at a simple example, and define this computation using PyTorch:\n",
"\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2025/lab1/img/add-graph.png)"
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/add-graph.png)"
]
},
{
@@ -282,7 +282,7 @@
"\n",
"Now let's consider a slightly more complicated example:\n",
"\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2025/lab1/img/computation-graph.png)\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/computation-graph.png)\n",
"\n",
"Here, we take two inputs, `a, b`, and compute an output `e`. Each node in the graph represents an operation that takes some input, does some computation, and passes its output to another node.\n",
"\n",
@@ -364,7 +364,7 @@
"\n",
"Let's consider the example of a simple perceptron defined by just one dense (aka fully-connected or linear) layer: $ y = \\sigma(Wx + b) $, where $W$ represents a matrix of weights, $b$ is a bias, $x$ is the input, $\\sigma$ is the sigmoid activation function, and $y$ is the output.\n",
"\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/2025/lab1/img/computation-graph-2.png)\n",
"![alt text](https://raw.githubusercontent.com/MITDeepLearning/introtodeeplearning/master/lab1/img/computation-graph-2.png)\n",
"\n",
"We will use `torch.nn.Module` to define layers -- the building blocks of neural networks. Layers implement common neural networks operations. In PyTorch, when we implement a layer, we subclass `nn.Module` and define the parameters of the layer as attributes of our new class. We also define and override a function [``forward``](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.forward), which will define the forward pass computation that is performed at every step. All classes subclassing `nn.Module` should override the `forward` function.\n",
"\n",