1 Motivation
At rOpenSci we work to transform science through open data, software and reproducibility. We provide community support, standards, and infrastructure for scientists and research software engineers working in R to develop, maintain, and publish high-quality open-source scientific software. In addition, we develop and maintain high-quality documentation and resources to support these activities.
We also believe that science, and the research software we create to support it, should serve everyone in our communities, which means it needs to be sustainable, open, and built by and for all groups. Currently, however, there is a dismaying lack of diversity in scientific and open source communities in general. This lack of diversity is potentially detrimental to the sustainability, utility, and productivity of projects.
rOpenSci is carrying out a series of activities and projects to transform science to ensure that it serves everyone in our communities. One of these projects is our multilingual publishing project.
1.1 Language barrier
English is the lingua franca for science, which creates a significant barrier for non-English speakers wanting to join the field. It is also the predominant language of open source in code, content, and community interactions, making language access one of the environmental barriers to equity in open source.1
The UNESCO Recommendation on Open Science highlights the need to overcome language barriers in order to achieve the open science core values and guiding principles of equity and fairness; diversity and inclusiveness; equality of opportunities; and collaboration, participation and inclusion.2
A recent study3 quantified the consequences of language barriers for the career development of researchers. The authors found that, compared with native English speakers, non-native English speakers:
Need up to 91% more time to read papers in English.
Need up to 51% more time to write papers in English. Their papers are rejected 2.6 times more often, and they are asked to revise them 12.5 times more often.
Need up to 94% more time to prepare presentations in English, and often avoid English-language conferences and oral presentations.
Furthermore, the Linux Foundation report states, “English proficiency is a metric by which performance and personality can be judged”4.
These studies also highlight that “the magnitude of the disadvantage seems far beyond the level that can be overcome with individuals’ efforts.”5
By developing the technical and social infrastructure for publishing multilingual resources, and by publishing our own resources in several languages, we can lower these barriers, widening access to knowledge and to quality resources and thereby increasing the potential for individuals to contribute to software and open science projects.
1.2 Community-driven localizations and translations
Community-driven translations and localizations are efforts by community members to create resources in a language of their choice. The community organizes and agrees on different aspects of localization projects6.
The Spanish-speaking R community has been very active and growing in recent years and has undertaken various translation activities for technical materials such as books, cheat sheets, guides, and datasets.
In 2017, several Latin American R-Ladies began translating the R-Ladies’ Code of Conduct and their Rules and Guidelines into Spanish.
In 2018, the R community in Latin America collectively translated the R for Data Science book into Spanish. This included the translation of all the data sets used in the book, which were compiled in the datos package, making it an excellent tool for teaching.
The community continued with the translation of Teaching Tech Together and contributed to Spanish translations of the Posit Cheatsheets and of The Carpentries’ and The Programming Historian lessons.
Driven by this active and growing Spanish-speaking community, rOpenSci successfully piloted our first Spanish-language peer review, where the submission, reviews, and editorial responses were in Spanish.
These works and previous community experiences created the conditions for us to start our multilingual publishing in Spanish. Spanish is the second most-spoken native language in the world and is one of the most geographically widespread languages, being an official language in many countries7.
1.3 Building community
At rOpenSci, we understand review as a way of building community. We hope that our review process in Spanish and Portuguese will allow us to continue building the community in regions where these languages are spoken, to increase the number of contributors, and to get feedback on how our tools and processes can be improved to better serve these community members.
In addition to using our material to learn how to contribute to open source as a developer, maintainer, reviewer, or editor, people can also contribute through translation. This type of contribution is a good way to engage in open source and is recognized as valuable by the community.
We also expect that the multilingual project’s documentation and tooling will be useful for extending this effort to other languages and for other communities and projects undertaking translation efforts.
1.4 References
1. Hilary Carter and Jessica Groopman, “Diversity, Equity, and Inclusion in Open Source: Exploring the Challenges and Opportunities to Create Equity and Agency Across Open Source Ecosystems”, foreword by Jim Zemlin, The Linux Foundation, December, 2021, https://www.linuxfoundation.org/research/the-2021-linux-foundation-report-on-diversity-equity-and-inclusion-in-open-source
2. UNESCO. UNESCO recommendation on open science. Paris, France; 2021.
3. Amano T, Ramírez-Castañeda V, Berdejo-Espinola V, Borokini I, Chowdhury S, Golivets M, et al. (2023) The manifold costs of being a non-native English speaker in science. PLoS Biol 21(7): e3002184. https://doi.org/10.1371/journal.pbio.3002184
4. Hilary Carter and Jessica Groopman, “Diversity, Equity, and Inclusion in Open Source: Exploring the Challenges and Opportunities to Create Equity and Agency Across Open Source Ecosystems”, foreword by Jim Zemlin, The Linux Foundation, December, 2021, https://www.linuxfoundation.org/research/the-2021-linux-foundation-report-on-diversity-equity-and-inclusion-in-open-source
5. Amano T, Ramírez-Castañeda V, Berdejo-Espinola V, Borokini I, Chowdhury S, Golivets M, et al. (2023) The manifold costs of being a non-native English speaker in science. PLoS Biol 21(7): e3002184. https://doi.org/10.1371/journal.pbio.3002184
6. Yanina Bellini Saibene and Natalia Soledad Morandeira. Multilingual Data Science: Ten Tips to Translate Science and Tech Content. Chapter in Our Environment. A collection of work by data designers, artists, and scientists. ISBN: 979-8-218-20191-3. http://datasciencebydesign.org/blog/multilingual-data-science
7. List of languages by number of native speakers. Accessed on December 1, 2022. https://en.wikipedia.org/wiki/List_of_languages_by_number_of_native_speakers