One weird trick can fool an A.I.’s image recognition (computers hate it). Robots may have found sneaky ways to beat CAPTCHA, but convolutional neural networks still have gaps in their image-recognition capabilities, which is why we’re sometimes asked to “choose which photos are a storefront” to prove we’re human (and give free labor to companies training neural networks to identify objects). And now Google researchers have found a surprisingly effective way to trick neural networks into thinking everything is a toaster.
When not teaching computers to beat everyone at Go or make nightmare imagery, Google’s researchers discovered a way to make the prospect of self-driving cars seem even more dangerous. A simple, psychedelic-looking sticker tricked the VGG16 neural network, a standard image classifier, into identifying pictures as a toaster.
Called an “adversarial patch,” the sticker was classified as a toaster even more reliably than a sticker showing an actual toaster. The reason is salience: the patch is designed to be deemed the most important part of an image, monopolizing a deep learning model that outputs only one label per image.
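To make that “one label per image” point concrete, here’s a minimal sketch, assuming PyTorch/torchvision and a pretrained VGG16; this is not the researchers’ code, and the scene photo and patch image file names are hypothetical placeholders.

```python
# Illustrative only: paste a patch image into a scene and ask a pretrained
# VGG16 for its single best guess. Because the classifier reports just one
# class, a very "salient" sticker can crowd out everything else in the photo.
import torch
from PIL import Image
from torchvision import models, transforms

model = models.vgg16(pretrained=True).eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

scene = Image.open("banana_on_table.jpg").convert("RGB")    # hypothetical photo
patch = Image.open("adversarial_patch.png").convert("RGB")  # hypothetical sticker
scene.paste(patch.resize((60, 60)), (10, 10))               # drop the sticker in a corner

with torch.no_grad():
    logits = model(preprocess(scene).unsqueeze(0))
print(logits.argmax(dim=1).item())  # 859 is "toaster" in ImageNet-1k indexing
```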
The researchers found the patches to be universal, robust, and targeted, “universal because they can be used to attack any scene, robust because they work under a wide variety of transformations, and targeted because they can cause a classifier to output any target class.” The research paper also notes that humans who see the stickers may not even realize how potentially dangerous they are. This easily exploited aspect of neural networks will need to be addressed if we’re going to let A.I. drive our cars.
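For the curious, here’s a loose sketch of the idea behind training such a patch, assuming PyTorch and a pretrained VGG16. The random-placement loop below stands in for the richer set of transformations (rotation, scale, lighting) the paper optimizes over, and the random “scenes” are placeholders for real photos; it paraphrases the approach rather than reproducing the authors’ code.

```python
# Hypothetical sketch: freeze a pretrained classifier and run gradient descent
# on the patch pixels so that, wherever the patch lands, the model is pushed
# toward the target class ("toaster", index 859 in ImageNet-1k).
import random
import torch
import torch.nn.functional as F
from torchvision import models

model = models.vgg16(pretrained=True).eval()
for p in model.parameters():
    p.requires_grad_(False)

TARGET_CLASS = 859                                   # "toaster"
patch = torch.rand(3, 60, 60, requires_grad=True)    # the learnable sticker
optimizer = torch.optim.Adam([patch], lr=0.05)

def apply_patch(images, patch):
    # Paste the patch at a random spot in each image; real training would also
    # randomize rotation, scale, and brightness so the sticker stays "robust".
    out = images.clone()
    _, _, h, w = images.shape
    ph, pw = patch.shape[1:]
    for i in range(images.shape[0]):
        y, x = random.randint(0, h - ph), random.randint(0, w - pw)
        out[i, :, y:y + ph, x:x + pw] = patch.clamp(0, 1)
    return out

# Stand-in "scenes": random tensors where real training would use many photos.
image_loader = [torch.rand(8, 3, 224, 224) for _ in range(10)]

for images in image_loader:
    logits = model(apply_patch(images, patch))
    target = torch.full((images.shape[0],), TARGET_CLASS, dtype=torch.long)
    loss = F.cross_entropy(logits, target)   # lower loss = higher p(toaster)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```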
On the brighter side, when the metal ones come for you (and they will), these stickers will be even more indispensable than Old Glory Robot Insurance. Also, the part of the neural network that kept seeing spaghetti seems pretty chill. We can probably be friends.