“Don’t look down on someone who uses broken English. That means they know another language.”
– a meme on my Facebook feed this week
Natural Language Processing, I always tell audiences, is a really hard problem for computers. I then joke that this is a good thing – it means I will always have a job trying to solve it. But as challenging as interpreting natural human language can be – with its complex structures, massive ambiguities, and infinite possibilities – the problem becomes even harder when the writer is not a native user of the language. Now the structures are no longer predictable, and although the writer may still be using systematic, regular patterns, those patterns don’t conform to what native writers produce. Any natural language system designed for native speakers will fall short when applied to non-native writing.
My early research in Computational Linguistics, and my PhD dissertation, focused entirely on this issue. I wish I could say that I solved it, or that anyone since then has made significant progress; but we haven’t. Now I am thinking about this issue again in the context of building conversational interfaces. Any dialogue agent, even one whose user base is localized, is likely to run across one of the realities of our globalized culture: that not everyone speaks one language. A chatbot that understands only English will run head-on into the non-native writer problem, and probably perform rather poorly if the problem has not been properly anticipated.
One way to get ahead of this issue would be to explicitly model not just native language use but non-native language use: to build into your language model not just the way L1 (native) writers use the language, but also the structures seen in L2 (non-native) writing. However, this is a difficult phenomenon to model. Different L2 writers apply different systems, based on where they are in their language acquisition process and what other language(s) they use. (For a more in-depth discussion of this, see my 205-page dissertation, published 15 years ago.) In the end, the problem of dealing with L2 writing may be intractable.
So what is the solution? I can propose one: let your users pick the language of the interaction. A customer service chatbot that aims to deliver a good customer experience should be able to speak (and understand) the language of the customer. Although crafting a multilingual bot requires additional effort, is it more effort than it would be to staff a contact center around the clock with agents who can speak the languages of your customers?
And crafting a multilingual bot does not necessarily mean that for N languages, you need N times the effort. The Aspect NLU framework embedded in our self-service CX platform is perfectly positioned to address this issue. The dialogue logic is language-independent: craft it once, and it works in every language the NLU supports. Only the response text needs to be localized for each language you want to cover.
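To make the idea concrete, here is a minimal sketch of this pattern – language-independent dialogue logic paired with per-language response templates. The naive language detector, the keyword intent classifier, and all the names below are illustrative stand-ins of my own invention, not the Aspect NLU API:

```python
# Per-language response templates: the only part that must be localized.
RESPONSES = {
    "en": {"greet": "Hello! How can I help you?",
           "balance": "Your balance is {amount}."},
    "es": {"greet": "¡Hola! ¿En qué puedo ayudarle?",
           "balance": "Su saldo es {amount}."},
}

# Toy keyword-based language guesser, standing in for a real detector.
SPANISH_HINTS = ("hola", "saldo", "ayuda", "cuál")

def detect_language(text: str) -> str:
    lowered = text.lower()
    return "es" if any(hint in lowered for hint in SPANISH_HINTS) else "en"

def classify_intent(text: str) -> str:
    # Stand-in for the NLU: maps an utterance to a language-neutral intent.
    lowered = text.lower()
    if "balance" in lowered or "saldo" in lowered:
        return "balance"
    return "greet"

def respond(text: str, amount: str = "$42.00") -> str:
    # The dialogue logic is written once, over intents, not languages;
    # only the surface text is looked up per language at the end.
    lang = detect_language(text)
    intent = classify_intent(text)
    return RESPONSES[lang][intent].format(amount=amount)

print(respond("Hello there"))         # → Hello! How can I help you?
print(respond("¿Cuál es mi saldo?"))  # → Su saldo es $42.00.
```

Adding a third language here means adding one more entry to the response table – the intent handling is untouched, which is the whole point.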
Write to a chatbot in English, and it should respond in English. Write to it in Spanish, and it should respond in Spanish. Crafting bots this way from the beginning may increase customer engagement, because it lowers the effort required of the customer, and it will definitely yield more successful interactions and a better user experience.
- Je ne parle pas le français: Why We Need Multilingual Bots - April 17, 2017