Now that we’re about 25 years into the communication revolution and artificial intelligence has become the focus of the technology sector, predictions abound: faultless automated translation and interpretation systems will soon make it unnecessary to speak different languages, and voice recognition technology will replace typing, writing, and even reading. No doubt these innovations are enhancing our communicative capacities. But the human desire to communicate is too strong for us to give up personal avenues of socialization, so these technologies should be seen as additional means of communication, not replacements for direct human interaction.
One of the most challenging and revolutionary things artificial intelligence (AI) can do is speak, write, listen to, and understand human language. Natural language processing (NLP) is the branch of AI that extracts meaning from human language in order to make decisions based on that information. The technology is still evolving, but it already powers everything from chatbots and spam filters to voice assistants. ResearchAndMarkets.com forecasts that the global NLP market will grow from $10.2 billion in 2019 to $26.4 billion by 2024, a compound annual growth rate of 21% over five years. That’s a lot of automated calls, texts, and emails. Yet at the same time, we’re seeing an upsurge in small family businesses with a personal touch, restaurants that push the farm-to-table movement, and artisans who drive the maker movement.
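To make that definition concrete, here is a minimal sketch, in Python, of what “extracting meaning to make a decision” can look like at its very simplest: a toy keyword matcher that routes customer messages to a department. Everything here (the ROUTES table, the route_message function, the keyword lists) is invented for illustration; production NLP systems learn such associations statistically rather than from hand-written rules.

```python
# Toy illustration of NLP-driven decision-making: extract a rough
# "meaning" (a topic) from a message, then act on it. All names and
# keyword lists are hypothetical; real systems learn these from data.

ROUTES = {
    "billing":   {"invoice", "refund", "charge", "payment"},
    "technical": {"error", "crash", "bug", "login"},
}

def route_message(message: str) -> str:
    """Pick the department whose keywords best match the message."""
    words = set(message.lower().split())
    scores = {dept: len(words & keywords) for dept, keywords in ROUTES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"

print(route_message("I need a refund please"))                  # billing
print(route_message("The app crashes with an error on login"))  # technical
```

Even this crude rule-based version shows the shape of the pipeline: turn language into something a program can score, then let the score drive a decision.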
Over the last decade, free online translators have improved dramatically. Google Translate now uses an engine that translates complete sentences with an artificial neural network: digital “neurons” are linked in layers, with each layer feeding its output to the next, a model based loosely on the brain.
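The layering is easy to picture in code. The NumPy sketch below is a generic feedforward pass in which each layer’s output becomes the next layer’s input; it illustrates only the layering idea, not Google Translate’s actual architecture, and all the sizes and weights are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three made-up layers: each is a weight matrix mapping one layer's
# output to the next layer's input (the sizes are arbitrary).
layer_weights = [rng.normal(size=(8, 16)),
                 rng.normal(size=(16, 16)),
                 rng.normal(size=(16, 4))]

def forward(x):
    """Feed the input through each layer in turn; each 'neuron'
    computes a weighted sum followed by a nonlinearity (tanh)."""
    for W in layer_weights:
        x = np.tanh(x @ W)  # this layer's output feeds the next layer
    return x

sentence_features = rng.normal(size=8)  # stand-in for an encoded sentence
print(forward(sentence_features))       # 4 numbers from the final layer
```

In a trained system the weights are learned from data rather than drawn at random, but the flow of information, layer by layer, is the same.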
Neural translation systems are first trained on huge volumes of human-translated text. They take each word and use the surrounding context to turn it into an abstract digital representation, then look for the closest matching representation in the target language, which enables much better translation of long sentences. Newer upgrades go further, integrating computer vision and self-learning capabilities so the system can capture and interpret multimodal cues from the speaker. Yet while apps and automation might work for consumers solving simple problems and carrying out mundane tasks, more complex uses still require human input. Even in the same language, sales messages have to be customized for each target market; using machine translation to produce nuanced messages for a new market risks a less effective pitch and critical cultural mistakes. The consequences of such errors in global interactions could be catastrophic.
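For readers curious how “closest matching representation” might work mechanically, here is a toy sketch under heavy assumptions: each source word is turned into a vector blended with its neighbors (a crude stand-in for context), and the target-language word with the highest cosine similarity is chosen. The vocabularies and random vectors are invented; real systems learn their embeddings from millions of sentence pairs and decode whole sentences, not word by word.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 32

# Invented vocabularies with random vectors, standing in for learned embeddings.
source_vocab = {w: rng.normal(size=DIM) for w in ["the", "red", "house"]}
target_vocab = {w: rng.normal(size=DIM) for w in ["la", "casa", "roja"]}

def contextual(words, i):
    """Blend a word's vector with its neighbors: a crude stand-in for
    the context-aware representations described above."""
    window = words[max(0, i - 1): i + 2]
    return np.mean([source_vocab[w] for w in window], axis=0)

def closest_target(vec):
    """Return the target word whose vector has the highest cosine similarity."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(target_vocab, key=lambda w: cos(vec, target_vocab[w]))

sentence = ["the", "red", "house"]
# With random vectors the output is arbitrary; trained embeddings would
# align "house" with "casa", and so on.
print([closest_target(contextual(sentence, i)) for i in range(len(sentence))])
```

The sketch also hints at why nuance is hard: the “closest” vector is only as good as the statistics it was trained on, which is exactly where cultural subtlety slips through the cracks.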
Almost 30 years ago, I witnessed the launch of IBM’s latest innovation: a voice recognition system that, the company claimed, would make typing a thing of the past. Soon afterward, IBM’s Deep Blue shocked the world by beating world chess champion Garry Kasparov, seeming to prove that computers were capable of the most complex human reckoning. Yet we’re still typing, we’re still joking, and we’re still singing in multiple languages, and the U.S. has become more multilingual than ever before.
Communication is much more than words: context, body language, intonation, and cultural inference all help us understand meaning beyond the words themselves. A machine’s ability to understand human speech is a spectacular achievement, but the human desire to interact with one another is so fundamental that we will always treat these tools as additional means of communication rather than as replacements for it.