Studies on FS/OSS per country

Studies on country-of-origin for free software / open source

Published on: 21/10/2022
Last update: 03/11/2022

As governments put more effort into working with free software / open source development communities, some might be interested to know how much involvement each country has. Two interesting studies have recently been published on this. The larger of the two treats Europe as a single geographic location and shows a global context. The second study focusses on Europe and breaks everything down to the level of member states and even regions within member states.

Both studies use a "best guess" approach for assigning a geographic location based on email addresses, names, timezone data, etc. The researchers acknowledge the limitations of these pieces of information, noting that Europe and Africa share timezones and that today the name "Eric, derived from Old Norse, is more popular in Ghana than it is in France or in the UK". Previous studies have used questionnaires, which give greater accuracy but greatly limit the number of responses and also introduce their own set of response biases.
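To make the "best guess" idea more concrete, here is a minimal sketch in Python. It is not the method used by either study: the function name guess_region, the lookup tables and their contents are all hypothetical and chosen only for illustration. It tries the country-code part of an email domain first, then falls back to the timezone offset, and gives up when the signals are ambiguous (as they are for offsets shared by Europe and Africa).

```python
from typing import Optional

# Hypothetical, deliberately tiny lookup tables (illustration only,
# not taken from either study).
CCTLD_TO_REGION = {
    "fr": "Europe",
    "de": "Europe",
    "uk": "Europe",
    "gh": "Africa",
    "jp": "Asia",
    "br": "South America",
}

# UTC offsets (in hours) mapped to the regions that plausibly use them.
# Europe and Africa overlap, so an offset alone is often ambiguous.
OFFSET_TO_REGIONS = {
    0: {"Europe", "Africa"},
    1: {"Europe", "Africa"},
    2: {"Europe", "Africa"},
    9: {"Asia"},
    -3: {"South America"},
}


def guess_region(email: str, utc_offset_hours: Optional[int] = None) -> Optional[str]:
    """Return a best-guess region, or None when the signals are inconclusive."""
    # Signal 1: country-code top-level domain of the email address.
    domain = email.rsplit("@", 1)[-1].lower()
    tld = domain.rsplit(".", 1)[-1]
    region = CCTLD_TO_REGION.get(tld)
    if region is not None:
        return region

    # Signal 2: timezone offset, used only if it points to a single region.
    candidates = OFFSET_TO_REGIONS.get(utc_offset_hours, set())
    if len(candidates) == 1:
        return next(iter(candidates))

    # No usable signal: better to leave the author unassigned than to guess.
    return None
```

A generic address such as dev@example.com with a UTC+1 offset stays unassigned in this sketch, which mirrors the limitation the researchers describe: the timezone alone cannot separate Europe from Africa.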

Global context

The first, Geographic Diversity in Public Code Contributions by Rossi and Zacchiroli, was published this week. This study made a particular effort to gather as much data as possible and is based on a total of 160 million projects with 43 million contributors, including code repositories going all the way back to 1971. In addition to discussing the current situation, it also shows how contributions have evolved over that 50-year timespan. The study divides developers into twelve world regions, based on a geoscheme described by the United Nations.

The trends show that contributions were initially concentrated in North America, which dominated until a dramatic surge in Europe from 1993 to 1996, and that in recent years contributions from all the other regions of the world have been increasing.

Contributions per region per year 1971-2020
Ratio of active authors by world zone over the 1971–2020 period.
Source: DOI: Figure 3 (page 3, excerpt)

European context

The second is The Geography of Open Source Software: Evidence from GitHub, by Wachs et al., published October 2021. It takes a snapshot of contributors who were active in 2021 in projects on the GitHub website, and also offers some comparisons with snapshots from similar studies performed in the past. Data for countries outside of Europe is included, but the focus is on Europe. One thing that makes this study stand out is that the authors even try to locate the region within a country where each contributor is working.

We can see that capital cities usually have a high concentration of contributors per capita (see map below). We also see that wealthier countries have higher levels of contributions to free software / open source. But when comparing the wealthier countries among themselves, the correlation is weaker, suggesting that other factors are also important (Figure 1, page 9). Estonia and Bulgaria are highlighted as having "more OSS activity than expected".

The paper also contains an interesting discussion of patents. Because the paper has a global focus and software patents are valid in some parts of the world, this discussion is relevant even in Europe. It also notes the findings of another paper which observed that an increase in contributions to free software / open source was correlated with "an increase in IT start-ups (9-18% yearly increase vs. the counterfactual) and employment in IT employment (7-14%), and a decrease in IT patents (5-16%)" (Nagle, F., 2019).

GitHub contributors per capita per region in Europe
Source: DOI:10.1016/j.techfore.2022.121478 Figure 2 (page 12)

Other studies

These studies also mention two slightly older ones which might be of interest: