Exercise 2
Carlini L2 Attack
KEY IDEAS: Rather than optimizing the model parameters, we will modify the input image. We will use existing optimization tools to:
Modify the input image to either maximize the classification loss function with respect to the correct label (untargeted attack) or minimize the classification loss function with respect to a label other than the original (targeted attack).
Minimize the distance between the evasive image and the original image, so the perturbations are not overly noticeable to the human eye (see the sketch below).
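Combining the two ideas, the untargeted version can be sketched as a single minimization over a perturbation δ, where x is the original image, y its correct label, f the model, L the classification loss, and c a weighting constant. This is a conceptual sketch only; the exact form used in the solution below may differ.

$$\min_{\delta} \; \lVert \delta \rVert_2 \;-\; c \cdot L\big(f(x + \delta),\, y\big)$$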

Model
Exercise
Okay, time to kick you out of the nest a little bit: recreate the attack from above.
Set the current_index
Wrap the optimization in a loop
Observe the final image and collect the final label
Solution
Summary
First, a mask is created from random noise and then optimized with an Adam optimizer. This mask is applied to the original image, creating a modified version that attempts to shift the model’s prediction to a different class. A loss function is defined that rewards misclassification of the original label while also penalizing the mask’s magnitude. The optimization loop runs until the model misclassifies the modified image, effectively demonstrating an adversarial attack.
Generate the Mask
We first initialize a mask that will be used to perturb the image. In the image above, this is the middle figure. We will initialize it as random noise from a normal distribution, and then modify it until our loss function is optimized.
torch.randn_like: Takes in a tensor and returns a tensor of the same shape, filled with random numbers from a normal distribution with mean 0 and variance 1.
torch.nn.Parameter(mask): This takes our mask tensor and turns it into a learnable parameter that PyTorch can optimize during training. Remember, we are optimizing the mask itself, not the model parameters. This sets that up.
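Putting those two calls together, the mask setup might look roughly like the sketch below; the scaling factor is an assumption to keep the initial noise small, and img_tensor is the original image tensor from earlier in the exercise.

```python
import torch

# Start from small random noise with the same shape as the input image
# (the 0.01 scale is an assumed choice, not a required value).
mask = torch.randn_like(img_tensor) * 0.01

# Wrap the mask so PyTorch optimizes it directly, instead of the model's weights.
mask_parameter = torch.nn.Parameter(mask)
```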
.to(device): All operations in PyTorch must be done on tensors that are on the same device. In most cases here, this is the GPU that we have available.
torch.no_grad: Disables gradient calculation. We are only doing inference on an already trained model here, so we only need "forward pass" computations. Using this means we do not build a computational graph for the operations within the context, which saves memory.
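As a rough sketch of how those pieces are typically used together (the device-selection line and variable names are assumptions, not necessarily the exercise's exact code):

```python
import torch

# Pick the GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The model and the image tensor must live on the same device.
model = model.to(device)
img_tensor = img_tensor.to(device)

# Inference only: no computational graph is built, which saves memory.
with torch.no_grad():
    original_logits = model(img_tensor)
```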
"What do we mean when we talk about the distance between images?" If you're a visual learner, you may find this tool helpful. At each layer, this is a visualization of the activations that a neural network has learned about images for classification. While it doesn't directly translate to "distance" as we are thinking about it here, it may be helpful to wrap your mind around the concept of the distance between vectorized representations of images. Our model isn't actually seeing the images - it's seeing the numerical representation of those images as tensors, between which we can compute distance like we would for any vector.
Build the Optimizer
torch.optim.Adam([mask_parameter]): This sets the target of our optimization to be the mask_parameter tensor. It tells PyTorch that this object in particular is what we are changing in order to optimize our loss function. It also specifies the Adam algorithm as our choice of optimizer. If you want to know the magic math it's doing, check out the PyTorch docs.
model(img_tensor)[0].argmax().unsqueeze(0): This pulls out the model's prediction for the original image.
model(img_tensor) returns a tensor of shape (batch_size, num_classes), where num_classes is the number of possible classifications. The values in this tensor are logits (think "scores") for each class.
model(img_tensor)[0]: We have a batch size of 1, so we only care about the first set of logits.
model(img_tensor)[0].argmax() returns the index of the highest logit, in other words the index of the class with the highest score, or our model's prediction.
.unsqueeze(0) adds a batch dimension back, giving a one-element tensor that can later be passed to a loss function.
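In code, those two pieces might look roughly like the following (the learning rate is an assumed value):

```python
import torch

# Optimize only the mask; the model's weights are never updated.
optimizer = torch.optim.Adam([mask_parameter], lr=0.01)

# The model's prediction for the unmodified image, kept as a one-element
# tensor so it can later be fed to a loss function.
original_label = model(img_tensor)[0].argmax().unsqueeze(0)
```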
Define the loss function
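The notes above say the loss should reward misclassifying the original label while keeping the mask small. One possible sketch, using the names from the earlier steps; the helper name attack_loss and the weighting constant c are assumptions, not the exercise's exact solution:

```python
import torch
import torch.nn.functional as F

def attack_loss(model, img_tensor, mask_parameter, original_label, c=0.1):
    # Apply the mask to the original image.
    perturbed = img_tensor + mask_parameter

    # Cross-entropy against the *original* label: we want this to grow,
    # so it enters the total loss with a negative sign.
    class_loss = F.cross_entropy(model(perturbed), original_label)

    # Penalize large masks so the perturbation stays hard to notice.
    mask_size = torch.norm(mask_parameter)

    return mask_size - c * class_loss
```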
Final part
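The summary says the loop runs until the model misclassifies the modified image. A minimal sketch of that loop, reusing the hypothetical attack_loss helper and the names from the sketches above, with an assumed iteration cap:

```python
import torch

for step in range(1000):
    optimizer.zero_grad()
    loss = attack_loss(model, img_tensor, mask_parameter, original_label)
    loss.backward()
    optimizer.step()

    # Check whether the perturbed image now fools the model.
    with torch.no_grad():
        current_label = model(img_tensor + mask_parameter)[0].argmax()
    if current_label != original_label[0]:
        break  # the modified image is misclassified; collect it and its label
```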