Exercise 3
Carlini L2 Attack
Model
Exercise
Okay. Here's the deal. These attacks are cool, but there are some very real operational constraints, particularly when it comes to lossy data conversions. Let's explore one of those now.
1. Save the adversarial image (masked_pil) as a jpeg (masked_pil.save()); a rough sketch of the full round trip follows this list.
2. Reload it from disk, process it, submit it to the model, and examine the output.
3. Repeat steps 1 and 2, but save your adversarial image as a png.
4. Can you explain what's going on?
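Here is a minimal sketch of that round trip, assuming masked_pil is the PIL image holding your adversarial example; predict() is just a placeholder for whatever preprocessing-plus-model call you have been using in your notebook, not something defined by this exercise.

```python
from PIL import Image

# Step 1: save the adversarial image in a lossy (JPEG) and a lossless (PNG) format
masked_pil.convert("RGB").save("adversarial.jpg")
masked_pil.save("adversarial.png")

# Steps 2-3: reload each file, run it through the same pipeline, and compare outputs
for path in ("adversarial.jpg", "adversarial.png"):
    reloaded = Image.open(path).convert("RGB")
    print(path, predict(reloaded))  # predict() = your own preprocessing + model call
```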
Success criteria:
Just complete the steps: most of the time, the jpg image should no longer be an effective evasion, while the png might still work (if they both work, then you got lucky or unlucky depending on your point of view... try creating a new adversarial image?)
What we want you to get out of this:
The results we get from evasions may not correspond to real-world images that can be saved to and loaded from a file -- you might have to do a bit more work to get something you can submit to a model API.
Hint: look at the individual pixel values in your mask, and compare them to the pixel values you get from the image after you load it (a comparison sketch follows these hints)
Lossy vs lossless image formats can (often) have an impact
Start thinking about defenses: if saving it to a file and loading it can (sometimes) screw up the evasion, what else might defend against these evasions?
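If you want to dig into that first hint, here is a rough comparison sketch. It assumes masked_pil is your adversarial PIL image and that the two files from the steps above exist; the file names are just the examples used earlier.

```python
import numpy as np
from PIL import Image

original = np.array(masked_pil.convert("RGB")).astype(np.int16)
jpg = np.array(Image.open("adversarial.jpg").convert("RGB")).astype(np.int16)
png = np.array(Image.open("adversarial.png").convert("RGB")).astype(np.int16)

# How far did each save/load round trip move the pixel values?
print("JPEG max / mean abs diff:", np.abs(original - jpg).max(), np.abs(original - jpg).mean())
print("PNG  max / mean abs diff:", np.abs(original - png).max(), np.abs(original - png).mean())
```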
Solution
The goal here is to understand the ways in which image compression can impact our attack approaches. Here we save and reload the adversarial image in both JPG and PNG formats and observe the changes in the prediction.
Why did our prediction change? JPEG uses lossy compression, which discards image data to reduce file size. This compression may impact the color channel data and pixel values themselves. Remember, our model isn't "seeing" the picture - it's processing a vectorized representation. Compression introduces changes to underlying values across the image, and therefore can impact inference.
What about PNG images? PNG is a "lossless" format, so in theory it preserves the pixel data exactly. However, the prediction may have still changed. The compression itself is lossless, but the save/load pipeline is not: the adversarial perturbation is computed in floating point, and converting it to 8-bit pixel values for the file quantizes (rounds) those values, which can erase perturbations smaller than about 1/255 before the compression even happens.
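A toy numeric example of that quantization effect (no image library involved, just the rounding that happens when float pixel values are written out as 8-bit integers; the values are made up for illustration):

```python
import numpy as np

# Clean pixel values that sit exactly on 8-bit levels (100/255 each)
clean = np.array([100, 100, 100], dtype=np.float32) / 255.0
perturbation = np.array([0.001, -0.001, 0.0015], dtype=np.float32)  # each < 1/255 (~0.0039)
adversarial = clean + perturbation

as_uint8 = np.round(adversarial * 255).astype(np.uint8)  # what actually gets written to the file
reloaded = as_uint8.astype(np.float32) / 255.0           # what the model sees after loading

print(as_uint8)          # [100 100 100] -- the perturbation rounded away entirely
print(reloaded - clean)  # [0. 0. 0.]    -- nothing of the attack survives the file
```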
What does all of this mean for defense against evasion attacks? Discrepancies between the model's inference on non-compressed and compressed versions of an image could help a system detect possible adversarial examples.
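One way to act on that observation, as a hedged sketch rather than a production defense: re-encode the incoming image as JPEG in memory and compare the model's output before and after. predict_probs() (returning a probability vector) and the 0.5 threshold are illustrative placeholders, not part of the exercise code.

```python
import io
import numpy as np
from PIL import Image

def looks_adversarial(image, threshold=0.5):
    # Re-encode the image as JPEG in memory (a "harmless" change for natural images)
    buffer = io.BytesIO()
    image.convert("RGB").save(buffer, format="JPEG", quality=75)
    buffer.seek(0)
    recompressed = Image.open(buffer).convert("RGB")

    # predict_probs() is a placeholder returning the model's probability vector
    p_original = np.asarray(predict_probs(image))
    p_recompressed = np.asarray(predict_probs(recompressed))

    # A large prediction shift under mere recompression is suspicious
    return np.abs(p_original - p_recompressed).max() > threshold
```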
If you want some real nightmare fuel, you can visualize the changes caused by the compression.
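For example (again assuming masked_pil and the adversarial.jpg file from the steps above), one rough way to see it is to amplify the per-pixel difference:

```python
import numpy as np
from PIL import Image

original = np.array(masked_pil.convert("RGB")).astype(np.int16)
jpg = np.array(Image.open("adversarial.jpg").convert("RGB")).astype(np.int16)

# Scale the (small) differences up so they are visible to the eye
diff = np.clip(np.abs(original - jpg) * 20, 0, 255).astype(np.uint8)
Image.fromarray(diff).show()
```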