WMFI and the Year of Indigenous Languages

In 2019, Wikimedia Finland began to play a more active role in facilitating the use of the three indigenous languages spoken in Finland (Northern Saami, Inari Saami, and Skolt Saami) in Wikimedia projects. The Saami cultures and languages are underrepresented and even misrepresented on these projects. In addition, we are trying to make our activities more visible and transparent to our language communities. The fact that 2019 is UNESCO’s International Year of Indigenous Languages gives further impetus to this work.


We are taking three main approaches. First, we aim to make the Wikimedia projects more useful to the communities themselves by promoting the Northern Saami Wikipedia, enhancing support for the Saami languages in Wikimedia projects, and advocating their use for tagging and language revitalisation. Second, we wish to help the communities control how representations of their culture circulate by working together with GLAMs and the Saami communities to identify and interpret materials and to define the terms of consent for their circulation. Finally, we aim to make the Saami communities, languages, and cultures more visible and more accurately represented across Wikimedia projects. It is vital that we are respectful and do not perpetuate colonial discourse in the process.

It is important to build trust and plan the activities together with the Saami communities; this is a slow process and will take time. To work towards the goals above, we are gradually creating a diverse network of individuals and organisations for long-lasting collaboration. For example, we have identified mutual interests with Yle Sápmi and the Saami Archives. We have also discussed how we, as outsiders, can support the communities in attaining their self-determined needs and goals, and whether they would like us to be their ally in the process. One point that was made was that it would be more acceptable for us to help out with language-related content than to interpret their cultural heritage items or representations. For this reason, our work capitalizing on the myriad linguistic features of Wikimedia projects is already in full swing, while we are still dipping our toes in the water with workshops on Saami culture images.

Northern Saami Wikipedia

One of the Northern Saami articles created as part of a university course
The Northern Saami Wikipedia's Facebook post celebrating the 15th anniversary of the founding of the Northern Saami Wikipedia
The Northern Saami Wikipedia's Facebook post celebrating the 400th anniversary of the first book in Saami
The Northern Saami Wikipedia's Facebook posts in Northern Saami, Inari Saami, and Skolt Saami (from left to right) celebrating being able to finally use Inari and Skolt Saami in Wikidata

The current goal for the Northern Saami Wikipedia is to increase the amount and quality of material by raising awareness of its existence.

One way of both increasing the number of articles and improving their quality that we are exploring is by working with the universities where Saami is taught. For example, one project member in Finland has been creating and translating articles in Northern Saami as part of her university courses. While we expected the feedback she received to directly impact the quality of the articles in the Northern Saami Wikipedia, we did not anticipate the impact it would have on other projects. For example, the feedback she received about unclear passages in the translations was also used to improve the quality of the original articles in the source Wikipedias. In addition, the words, terms, and phrases that she found missing from dictionaries when translating will be forwarded to the appropriate parties for improving Finnish-Northern Saami-Finnish online reference sources, which will eventually trickle down into printed reference sources, too.

So far in 2019, we have raised the community’s awareness of this Wikipedia by bringing information to where the community already is, using the resources we have: mainly the Northern Saami Wikipedia’s Facebook page, Wikimedia Finland events, and increasingly Wikimedia Finland’s own page too. On the Northern Saami Wikipedia’s Facebook page, we have posted in Northern Saami about edit-a-thons, Women in Red events, the 10 most read articles of 2018 and of various weeks, photos and other material related to the community, Wikimedia Finland’s meet-ups at the new central library Oodi in Helsinki, new articles, and the advocacy work that we are doing to be able to use Northern Saami in other Wikimedia projects. For example, this year we celebrated the 15th anniversary of the Northern Saami Wikipedia and the 400th anniversary of the first book written in any Saami language by posting about them on Facebook.

When a post concerns the Inari and Skolt communities, we also try to post in these closely related languages on Facebook and elsewhere, since these languages do not have full-fledged Wikipedias of their own and their Incubator projects have no Facebook pages. One of the most popular posts on the Northern Saami Wikipedia Facebook page is actually in Inari Saami. In response to that post, the community recommended notable community members whom we did not know about and for whom we do not yet have articles. We are extremely grateful for this and will create articles on them as soon as possible.

Wikidata and Structured Data on Commons

Commons description capabilities for the Saami languages as of June 21, 2019

Inari and Skolt Saami are spoken by approximately 250 people, so it was not deemed feasible to try to move their Wikipedia projects out of the Incubator or to try to translate the user interface of Wikimedia projects into these languages. Instead we decided to focus on doing what could be done with the resources at hand: describing and captioning the content of images in Commons and creating and translating concepts in Wikidata for these three languages.

This, however, appeared to be a pipe dream, as it was not actually possible to use these languages in several Wikimedia projects due to long-standing language policies. Once we realized this, we quickly created a parent task in Phabricator and subtasks to rectify the issues we had found to date. Some of these were resolved at the Wikimedia Hackathon in Prague while others are still being solved. For example, we can now use these three languages in Wikidata as well as most of the other Saami languages too, thanks to the help of a wide group of people involved, from hackathon participants to the Language Team.

In the process, we discovered that language definitions are scattered around the projects. When a language is added to one project, it is not automatically added to the others. To make things more complicated, a single project can define its languages in several different places. For example, the two places in Commons where we need to be able to use the Saami languages are the caption box and the summary/description box, and adding a language made it usable in only one of them. At the beginning of June, the only available Saami languages were Southern and Northern Saami. Progress has been made since then: the caption box and the summary/description box can now be used with Inari and Skolt Saami too. The rest of the Saami languages are still not recognized, although we have recently created a joint Phabricator ticket to make it possible to add linguistic material in two of these languages and in Finnish Romani. In the summary/description box, the same four languages are displayed with their actual names, while the rest are displayed only as their ISO 639-3 codes.
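The situation described above can be pictured with a small sketch. It is an illustration only: the feature names, the per-feature language sets, and the helper functions are hypothetical and do not reflect how MediaWiki or Commons actually store their language lists. The sets simply mirror the state reported in the text, in which four Saami languages are usable and named, while any other code falls back to being shown as a bare code.

```python
# Hypothetical, simplified model: each feature keeps its OWN list of languages,
# so adding a code in one place does not make it available in the other.
FEATURE_LANGUAGES = {
    "caption": {"se", "sma", "smn", "sms"},
    "description": {"se", "sma", "smn", "sms"},
}

# Names known to the (hypothetical) display layer; anything else shows as its code.
DISPLAY_NAMES = {
    "se": "Northern Saami",
    "sma": "Southern Saami",
    "smn": "Inari Saami",
    "sms": "Skolt Saami",
}


def unsupported_in(code):
    """Return the features whose language list does not include this code."""
    return sorted(name for name, langs in FEATURE_LANGUAGES.items() if code not in langs)


def display_name(code):
    """Return the language name if known, otherwise fall back to the bare code."""
    return DISPLAY_NAMES.get(code, code)


if __name__ == "__main__":
    for code in ("smn", "smj"):  # smj is the ISO 639-3 code for Lule Saami
        print(code, display_name(code), unsupported_in(code))
```

A check like `unsupported_in` is the kind of audit that becomes necessary when language lists are maintained separately per feature rather than in one shared place.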

Some of the Saami languages do not have a standardized orthography. To address this issue and to make available the abundant linguistic materials related to the Finno-Ugric languages, we also initiated the creation of a Wikidata property for Uralic Phonetic Alphabet transcription.

These open issues have encouraged discussion about what members of better-resourced language communities can truly demand of under-resourced language communities before their languages can be used in Wikimedia projects. In addition, the point has been repeatedly made that many language communities around the world do not have the resources that languages like English, Russian, and Spanish do, and that it is more than possible to have a wealth of linguistic data for a language even when it has no written form beyond that used by linguists. More work still needs to be done on finding new ways of incorporating non-written linguistic data into Wikimedia projects.

Saami tagging in action

On June 5, Wikimedia Finland held a workshop to test the workflow of captioning historical Saami images in Wikimedia Commons and tagging them with the relevant concepts in the Saami languages. The work made us aware of incorrect nuances and actual mistakes in the descriptive texts that had been imported from the GLAM sector. To be both successful and ethical, this kind of work can therefore only happen in collaboration with representatives of these cultures. Next, we practiced creating translations for Wikidata items. Some participants were completely new contributors, so we went through all the basics and principles of Wikidata during the event. Admittedly, this is quite a lot to take in, and we must see how we can facilitate the different aspects of the workflow in the future without overwhelming participants with excess information.

At the workshop, Pia Virtanen from the Finnish Broadcasting Company Yle presented their tagging process and Jarmo Saarikko from the National Library displayed a collection of language sources to help import Saami terminology into Wikidata. This collection was then further expanded by workshop participants.

Decolonising the Commons

Together with Open Knowledge Finland, we are investigating how the indigenous communities could take control over the distribution of visual representations of their culture. This goes beyond copyright and personality rights. We are interested in researching the use of Traditional Knowledge labels as part of this activity. The work is just getting underway and the communities are still forming. We will continue our workshops with the Saami Archives; these could be part of this collaboration.

Tagging Wikidata concepts enables tagging with Saami languages

Tagging with multilingual Wikidata enables every language available in Wikidata to be used. Our efforts advocating for Skolt and Inari Saami have ensured this is possible for these languages. As the Finnish Broadcasting Company Yle currently tags with Wikidata concepts, we wish to develop a content tagging program with Yle Sápmi and other regional institutions.
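As a rough illustration of why a Wikidata concept can serve as a multilingual tag, the sketch below asks the Wikidata API (`wbgetentities`) for an item's labels in the three Saami languages and reports which ones are still missing. The helper function names and the sample response are assumptions made for this example, and the label value in the sample is a placeholder, not real data. The language codes `se`, `smn`, and `sms` are the Wikimedia codes for Northern, Inari, and Skolt Saami.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

# Wikimedia language codes for the three Saami languages spoken in Finland.
SAAMI_CODES = {"se": "Northern Saami", "smn": "Inari Saami", "sms": "Skolt Saami"}


def build_label_query(entity_id, codes=SAAMI_CODES):
    """Return a wbgetentities URL that asks only for labels in the given languages."""
    params = {
        "action": "wbgetentities",
        "ids": entity_id,
        "props": "labels",
        "languages": "|".join(codes),
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def saami_labels(response, entity_id, codes=SAAMI_CODES):
    """Map each language name to its label, or None when no label exists yet."""
    labels = response.get("entities", {}).get(entity_id, {}).get("labels", {})
    return {name: labels.get(code, {}).get("value") for code, name in codes.items()}


def missing_languages(response, entity_id, codes=SAAMI_CODES):
    """List the languages in which the item still needs a label."""
    found = saami_labels(response, entity_id, codes)
    return [name for name, value in found.items() if value is None]


if __name__ == "__main__":
    # Placeholder response in the shape wbgetentities returns; not real data.
    sample = {"entities": {"Q1": {"labels": {"se": {"language": "se", "value": "placeholder label"}}}}}
    print(build_label_query("Q1"))
    print(missing_languages(sample, "Q1"))
```

Because a tag stores only the item identifier, any label added later in one of these languages becomes available to every system that tags with that item, without the tags themselves being touched.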

Representation of the Saami cultures in Wikimedia projects

Beyond the Northern Saami Wikipedia, we aim to ensure the quality and veracity of material concerning the Saami, from articles to images. For example, images of the fake “Saami” world sold to tourists are often added to Commons; these are being moved to a separate category instead of being included in categories about the Saami. We then use these images to educate editors on other Wikipedias about the difference between the real Saami culture and the fake version created by the tourist industry, so that the Saami are not misrepresented in our projects. This fake version is still prevalent in many projects, and we will continue our educational outreach.

We have also made a conscious effort to ensure that these cultures and languages can be used in multiple projects in multiple ways. For example, when Wikimedia Finland’s weekly competition about women was held at the beginning of March, various articles were written about Saami women in Finnish and many of these articles were also translated into English. A month later, Wikimedia Finland held its version of the Women in Red competition and Northern Saami, Inari Saami, and Skolt Saami were included for the first time. While only 14 articles were created or modified in Northern Saami, zero in Inari Saami, and two in Skolt Saami, this is still 14 more articles than we had before the competition in these languages. In addition, we hope to be able to continue this tradition next year and for years to come.

Another goal is to ensure that any article about the Saami created in Finnish or one of the Saami languages is promptly translated into English, as a lot of editors translate articles from the English Wikipedia for their own Wikipedia. This allows the up-to-date and accurate information in these articles to reach people who would not otherwise have access to this information or who would only have access to old or unsound information about these communities, their languages, and their culture.  

Cross-border collaboration

WMFI's and WMNO's joint presentation at Celtic Knot 2019

As Northern Saami is also spoken in Norway and Sweden, we have worked and will continue to work closely with Wikimedia Norway (WMNO) on projects for this language community to maximize our results while working with the minimal resources we have. For example, we now have Wikimedia Finland logos in Northern Saami, Inari Saami, and Skolt Saami thanks to Jon Harald Søby of WMNO. In July, we gave a joint presentation together with WMNO about the Saami Languages and the International Year of Indigenous Languages at the Celtic Knot Wikimedia Language Conference. We have also started to sketch out ideas for collaboration with Wikimedia Sweden.

The impact of the Saami project extends far beyond our own borders: because our work is discussed publicly on Phabricator and at various events, other under-resourced communities are following what we have done and what we plan to do. In particular, the process of adding Inari and Skolt Saami to Wikimedia projects is being closely monitored by other language communities that do not have the resources to create or maintain a full-fledged Wikipedia in their language or to translate the user interface for the other projects, but that would have the resources to caption photos of their own community in their own language, to create lexemes in Wikidata, and so on.