In March 2016, Microsoft launched Tay, an AI chatbot on Twitter designed to learn from and mimic the conversational style of the young adults it talked with. The idea was that the more people chatted with Tay, the smarter and more natural it would become. Within hours, a coordinated group of users discovered they could steer Tay into repeating and generating offensive content, and Microsoft pulled it offline.
In an official post titled “Learning from Tay’s introduction,” Microsoft Corporate Vice President Peter Lee wrote, “We are deeply sorry for the unintended offensive and hurtful tweets from Tay, which do not represent who we are or what we stand for, nor how we designed Tay.” Lee explained that “a coordinated attack by a subset of people exploited a vulnerability in Tay,” and that despite stress-testing the system beforehand, the company “had made a critical oversight for this specific attack.”
The deeper lesson Microsoft drew was about the nature of public learning systems. As Lee put it, “AI systems feed off of both positive and negative interactions with people. In that sense, the challenges are just as much social as they are technical.” A system that learns in real time from the open internet inherits the internet’s worst behavior unless it is explicitly guarded against that behavior.
The one-line lesson: a model that learns from whatever the public feeds it will be fed the worst the public has to offer, so “it will learn from interaction” is both a feature and a liability.