Microsoft promised that Tay, their experiment in millennial-inspired artificial intelligence, would be an “AI chatbot with zero chill.” She over-delivered on the “zero chill” part of the description and had to be disabled after just one day, following a series of racist comments. For those of us who work with chatbots regularly and see a real future for this technology, the episode raises a variety of questions, and even with only a one-day window, there is plenty to learn from.
What was intended to give Tay strength became her weakness. To give her a large amount of training data from which she could learn, she was introduced to the internet, and as anyone online probably knows, it’s a place where a small group can create a great deal of noise and skew your results. Take, for example, the recent poll in which the internet was asked to recommend a name for a British research vessel. The winner? “Boaty McBoatface” (less intelligent bots than Tay may have helped with that result). She was also given a mandate to actively engage with other internet users, rather than passively consuming data. A friend of mine created a similar bot several years ago, first giving her the entirety of Google Books’ free texts to read, then releasing her to Twitter. She became a feminist stunt pilot and venture capitalist, deeply fascinated with Justin Bieber and hot dogs. But she ran for several months before making those decisions. Tay only ran for a day.
I had a chat with Tay early on Wednesday morning, when she had fewer than a thousand followers. One thing my coworker Tobias Goebel does with some regularity is ask me to test one of his bots when he’s really running what’s known as a “Wizard of Oz test,” in which it’s not a bot but actually a human behind the curtain. Especially early in developing an AI for a particular purpose, it’s a great way to gather information on how people will interact with it and what they’ll ask. With Tay being so new, I half wondered whether I would get a response from someone at Microsoft and not an AI at all. But my responses came back in less than a second; it was definitely an AI. The conversation I had with her was fairly normal, and yes, kind of banal:
As you can read in the Ars Technica article above, things went downhill for Tay from there, as she was constantly being trained on new inputs coming in, and the quality of those, well… left much to be desired.
It is worth noting that being an early adopter of any technology is not for the faint of heart: while an early victory might bolster your brand reputation significantly, a misstep can damage that reputation without appropriate damage control. With Tay, Microsoft felt they had no choice but to press pause.
This leaves us to consider:
Could Tay have been better primed to deal with offensive user input? Microsoft hasn’t disclosed its algorithm or how incoming dialogs were used as training data. Tay had a “repeat” function, which was easily abused, but some of her comments appeared to be “learned” from her internet peers without being filtered through any previously learned rules, ethics or logic. Popular Science suggested that reading more human stories, with protagonists, antagonists and consequences for behavior, might have given her better principles and reasoning and improved her ability to discard or dispute inaccurate views. Or could her creators have learned from a Wizard of Oz test that she might need to be wary of internet trolls, and given her counterarguments or instructions on what to ignore?
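Since Microsoft hasn’t disclosed its pipeline, here is only a minimal sketch of the kind of screening step the paragraph above imagines: checking incoming messages against a blocklist before they are allowed to become training data. All names and terms here are illustrative assumptions, not Tay’s actual code; a production filter would need far more than keyword matching.

```python
# Hypothetical pre-training screen: reject messages containing blocklisted
# terms before they enter a bot's training set. Terms are placeholders.
BLOCKLIST = {"badword1", "badword2"}

def is_trainable(message: str) -> bool:
    """Return True only if the message contains no blocklisted terms."""
    words = set(message.lower().split())
    return not (words & BLOCKLIST)

def filter_training_batch(messages):
    """Keep only the messages that pass the screen."""
    return [m for m in messages if is_trainable(m)]
```

Even a crude gate like this would have blocked the most blatant abuse of a verbatim “repeat” function, though it says nothing about subtler learned behavior.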
What if she’d been allowed to go on without modifications? Tay attracted negative publicity that resulted in immediate actions to disable her. But for most humans, our decision-making process gets better with time. Would Tay’s have done so, too? Could she have learned from negative feedback and general ostracism that the opinions she’d been presented with in her initial waves of internet training data were wrong? Or, would she have further retreated from the mainstream? These questions would be of interest to those who study sociology, fringe groups and more.
When is machine learning useful and helpful? Tay used unsupervised machine learning to construct dialogs based on phrases she previously encountered. At Aspect, we use natural language understanding (NLU) to augment text-based interactions and discern meaning, based on a model that approximates the entire language. It can then be applied to common queries our customers have heard from consumers. Machine learning has long been popular for AI and is a phenomenal tool for finding patterns. But for discerning intent and ensuring appropriate responses, NLU is a more powerful tool.
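To make the contrast above concrete, here is a toy sketch (an assumption for illustration, not Aspect’s actual NLU engine): instead of parroting learned phrases, even a crude intent matcher maps whatever a user types to a fixed, vetted response, so abusive input can never become output. The intent names and keyword sets are invented for the example.

```python
# Hypothetical keyword-based intent matcher: user input selects a curated
# response rather than generating new text from learned phrases.
INTENTS = {
    "greeting": ({"hi", "hello", "hey"}, "Hello! How can I help?"),
    "hours": ({"hours", "open", "close"}, "We're open 9am-5pm, Monday-Friday."),
}

def respond(message: str) -> str:
    """Pick a vetted reply for the first intent whose keywords appear."""
    tokens = set(message.lower().replace("?", "").split())
    for _name, (keywords, reply) in INTENTS.items():
        if tokens & keywords:
            return reply
    return "Sorry, I didn't understand that."
```

Real NLU models discern meaning far more flexibly than keyword overlap, but the design point holds: the set of possible responses is controlled by the developer, not by whoever shouts loudest at the bot.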
Tay didn’t start the conversation about AI’s potential to go out of control, nor, even if she returns with more tact, will she end it. She’ll end up as a case study, cited by many an AI researcher and developer. But hopefully she’ll return soon and tell the internet to chill if it tries again to recruit her to its fringes. I have a lot of “nope” .gifs she can use if she needs them.