Information Inequality and the Languages of the Internet

The Guardian recently published an interactive guide by Holly Young, editor of the Case for Language Learning series on education, that details how language barriers shape internet users’ experiences. English was the first language used online, and by the mid-1990s it made up 80% of web-based content. Now ten languages account for 80% of internet content: English, Chinese, Spanish, Arabic, Portuguese, Japanese, Russian, German, French, and Malaysian. Out of the world’s approximately 6,000 languages, only about 130 are online. This means that the remaining 20% of internet content is spread across roughly 120 languages. Daniel Prado, a researcher on linguistic diversity, noted in 2012, “The famous engine [Google] that recognizes 30 European languages recognizes only one African language and no indigenous American or Pacific languages.” This information inequality online has implications for who and what gets represented. Mark Graham, from the Oxford Internet Institute, said, “Rich countries largely get to define themselves and poor countries largely get defined by others.”

User experience is shaped not only by the amount of content available in a given language but also by the nature of the languages themselves. For example, the 140-character limit on Twitter means that much more can be said in a Chinese tweet than in an English one. Language structure correlates with user behavior, but the line between cultural and linguistic influence is fuzzy, and the question is part of a larger debate over whether your language alone can change the way you see the world. Data available online helps clarify one part of this question, however: language clearly affects the way you experience the world, if only through Google’s search algorithms. In a 2014 case study of the West Bank, Graham searched for local restaurants in Hebrew, Arabic, and English. Each search returned different results. Graham concluded, “It isn’t good enough for Google to throw their hands in the air and point to their algorithms when asked why data are mediated and presented in certain ways. Whether they like it or not, they shape how millions of people interact with their cities.”

In 2011 the UN declared access to the internet a basic human right. However, the experience and usefulness of the internet as a resource are not equal for all. While technology offers a way to preserve and teach endangered languages, it is also perpetuating the strength and spread of the most popular ones.