Research on 2,400 languages shows nearly half the world’s language diversity is at risk

We’re losing languages, we’re losing language diversity, and unless we do something, these windows into our collective history will close.

By Hedvig Skirgård and Simon Greenhill

There are more than 7,000 languages in the world, and their grammar can vary a lot. Linguists are interested in these differences because of what they tell us about our history, our cognitive abilities and what it means to be human.

But this great diversity is threatened as more and more languages aren’t taught to children and fall into slumber.

In a new paper published in Science Advances, we’ve launched an extensive database of language grammars called Grambank. With this resource, we can answer many research questions about language and see how much grammatical diversity we may lose if the crisis isn’t stopped.

Our findings are alarming: we’re losing languages, we’re losing language diversity, and unless we do something, these windows into our collective history will close.

What is grammar?

The grammar of a language is the set of rules that determines what counts as a sentence in that language, and what is gibberish. For example, tense is obligatory in English. To combine “Sarah”, “write” and “paper” into a well-formed sentence, I have to indicate a time, as in “Sarah wrote a paper” or “Sarah will write a paper”. If you don’t have tense in an English sentence, then it’s not grammatical.

That’s not the case in all languages though. In the indigenous language of Hokkaido Ainu in Japan, speakers don’t need to specify time at all. They can add words such as “already” or “tomorrow” – but speakers consider the sentence correct without them.

As the great anthropologist Franz Boas once said:

“grammar […] determines those aspects of each experience that must be expressed.”

Linguists aren’t interested in “correct” grammar. We know grammar changes over time and from place to place – and that variation isn’t a bad thing to us, it’s amazing!

By studying these rules across languages, we can get an insight into how our minds work, and how we transfer meaning from ourselves to others. We can also learn about our history, where we come from, and how we got here. It’s rather extraordinary.

A huge linguistic database of grammar

We’re thrilled to release Grambank into the world. Our team of international colleagues built it over several years by reading many books about language rules, and speaking to experts and community members about specific languages.

It was a difficult task. Grammars can differ greatly from one language to the next. Moreover, different people have different ways of describing how these rules work. Linguists love jargon, so understanding their descriptions was sometimes a special challenge.

In Grambank, we used 195 questions to compare more than 2,400 languages – including two signed languages. The map below provides an overview of what we have captured.

Each dot represents a language, and the more similar the colour, the more similar the languages. To create this map, we used a technique called “principal component analysis” – it reduced the 195 questions to three dimensions, which we then mapped onto red, green and blue.
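For readers curious about the mechanics, the colour mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the published map: the function name pca_to_rgb and the toy answers are invented here, every question is treated as a complete yes/no answer (real data has gaps, which this sketch ignores), and NumPy is assumed to be installed.

```python
import numpy as np

def pca_to_rgb(features):
    """Project a (languages x questions) matrix onto its first three
    principal components and rescale each component to the 0-1 range."""
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)          # centre each question's answers
    # SVD of the centred matrix: the rows of Vt are the principal axes
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ Vt[:3].T           # coordinates on the first three components
    lo = scores.min(axis=0)
    span = scores.max(axis=0) - lo
    span[span == 0] = 1.0           # avoid dividing by zero for a flat component
    return (scores - lo) / span     # columns are read as red, green, blue

# Toy data (invented): 6 languages answering 5 yes/no questions (1 = yes, 0 = no)
toy = [
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
]
rgb = pca_to_rgb(toy)               # one (red, green, blue) triple per language
```

Each row of rgb is a colour for one language. Languages that answer the questions in similar ways land close together on the three components, so they receive similar colours.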

The large variation in colours reveals how different all these languages are from each other. Where we get regions with similar colours, such as in the Pacific, this could mean the languages are related, or that they have borrowed a lot from each other.

World map of languages included in the Grambank dataset. The colour represents grammatical similarity – the more similar the colours, the more similar the grammars. Skirgård et al. (2023), CC BY-SA

Language is very special to humans; it’s part of what makes us who we are.

Sadly, the world’s indigenous languages are facing an endangerment crisis due to colonisation and globalisation. We know each language lost heavily impacts the health of Indigenous individuals and communities by severing ties to ancestry and traditional knowledge.

Almost half the world’s linguistic diversity is threatened

In addition to the loss of individual languages, our team wanted to understand what we stand to lose in terms of grammatical diversity.

The Grambank database reveals a dazzling variety of languages around the world – a testament to the human capacity for change, variation and ingenuity.

Using an ecological measure of diversity, we assessed what kind of loss we could expect if languages that are currently under threat were to disappear. We found certain regions will be hit harder than others.

Frighteningly, some regions of the world, such as South America and Australia, are expected to lose all of their indigenous linguistic diversity, because all of the indigenous languages there are threatened. Even regions where languages are relatively safe, such as the Pacific, South-East Asia and Europe, still show a dramatic decrease of about 25%.

Barplot of grammatical diversity (functional richness) across regions. Light green shows the current diversity, dark green shows the diversity that remains after endangered languages are removed. Author provided

What’s next?

Without sustained support for language revitalisation, many people will be harmed and our shared linguistic window into human history, cognition and culture will become seriously fragmented.

The United Nations declared 2022–2032 the International Decade of Indigenous Languages. Around the world, grassroots organisations including the Ngukurr Language Centre, Noongar Boodjar Language Centre, and the Canadian Heiltsuk Cultural Education Centre are working towards language maintenance and revitalisation. To get a feel for what this can be like, check out this interactive animation by Angelina Joshua.

Hedvig Skirgård, Postdoctoral researcher, Australian National University and Simon Greenhill, Associate Professor, University of Auckland

This article is republished from The Conversation under a Creative Commons license. Read the original article.