What about open source software in all this?

14 November 2018 / by Bastien Guerry, EIG Link

Open source software and public administrations

Open source software is software that guarantees four freedoms for everyone: that of running the software as they see fit; that of analysing the program's operation (in particular by accessing its source code) and customizing it to their needs (by changing the source code); that of redistributing copies of the program; that of improving the program and redistributing copies of improved versions.

These "four freedoms" of open source software are the pillars of the movement of the same name, launched in the 1980s by Richard Stallman, then a computer scientist at MIT, who mobilized the hacker community to write an open source operating system called GNU ("GNU's Not Unix"). The movement gained momentum with the arrival of the Linux kernel in the early 1990s, which allowed the GNU open source system to run on ordinary computers.

GNU+Linux

You knew about the penguin, the mascot of the Linux kernel, but did you know about the wildebeest, the symbol of the GNU's Not Unix project?

In the 2000s, this movement spread in two ways: on the one hand with the emergence of the "open source" label, which sought to emphasize the practical and economic value of this software while avoiding the movement's political aspects; on the other with the spread of its ideas to fields beyond information technology, notably the launch of the Creative Commons licenses by Lawrence Lessig and of the free encyclopedia Wikipedia by Jimmy Wales and Larry Sanger. The next two decades would see "open" initiatives in other areas, in particular open access (for the open publication of research articles) and open data (for the opening up of public data).

Today, open source software is represented by flagship products well known to the general public, such as the GNU/Linux system, the Firefox web browser, the VLC media player, and many others.

What does this have to do with administrations? They are large consumers of software: since at least the Ayrault circular of 2012, efforts have regularly been made to encourage their use of open source software, with varying degrees of success. Administrations are also producers of open source software. The Law for a Digital Republic of 7 October 2016 clearly states that the source code of software used by a public body is an administrative document coming under the open data regime. As such, any citizen may ask to be provided with the source code of software commissioned by administrations.

A recent example of a government-led open source project is the Clip OS distribution, released by the ANSSI (the National Agency for Information Systems Security).

Overview of the security-oriented Clip OS distribution published under free license by ANSSI - © NextINpact

These outreach efforts are just beginning: many administrations are not familiar with open source software; many projects are still developed without proper knowledge of the issues, constraints and potential benefits of the open source code approach.

The "Public interest entrepreneur" program assigns an important role to open source software. Follow me to see this in detail.

You can learn how to use open source software!

Administrations produce open source software either by ordering it from a service provider or by developing it themselves. Few administrations are able to mobilize the resources to develop products in-house. The value of the EIG program is that it puts technical skills at the heart of departments, and we have seen the benefits of this approach: public servants are involved in stating requirements and in improving products on a daily basis, EIGs have the satisfaction of testing their products at an early stage, and public servants and EIGs form complementary teams in constant dialogue.

We started from the idea that EIGs, whether designers, developers or data scientists, have heard of open source but don't necessarily know what it actually means: for many, "open source" simply means that a software library can be used free of charge.

To go further, we did some collective teaching. We organized two workshops open to EIGs and mentors: one in March during a coaching session, and the other at DINSIC a few weeks later, with people from other administrations, to dig deeper into licensing issues and to get to grips with the government's policy for contributing to open source software, published last May.

The Government's open source software contribution policy, published in May 2018, provides the framework within which government agencies can contribute to the open source software ecosystem.

I also wrote and shared an introductory document about open source software, maintained a Frequently Asked Questions section, published a one-page mini-guide in PDF format and occasionally answered questions and help requests.

What are the first lessons after ten months of work on the subject?

  1. Yes, EIGs may not know what open source software is and may believe that they have the right to copy code found on GitHub without looking into licensing issues…
  2. Yes, open source licenses are complicated, but in practice, with a little clarification at the right time, they are never a sticking point.
  3. The approach seems natural for all EIGs: none of them complained about having to follow it.
  4. There is still a lot to be done to make mentors feel comfortable with the subject: many concepts to explain and many doubts to dispel about the real value of all this.
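
The first observation suggests a simple habit: before reusing code from a repository, check whether it states a license at all. Here is a minimal Python sketch of that check, assuming you have a local copy of the repository and that it follows the common convention of a license file at its root. The function name `find_license_files` and the list of file names are illustrative assumptions, not part of any EIG tool, and finding such a file only tells you where to start reading, not what you are allowed to do.

```python
from pathlib import Path

# Conventional license file names, compared without extension and
# case-insensitively. This list is an assumption: projects may use
# other names (for example LICENSE-MIT), which this sketch will miss.
LICENSE_NAMES = {"license", "licence", "copying", "unlicense"}


def find_license_files(repo_dir):
    """Return the sorted names of files at the repository root that look like license files."""
    root = Path(repo_dir)
    found = []
    for entry in root.iterdir():
        # Path.stem drops the last extension, so LICENSE, LICENSE.md
        # and Licence.txt are all recognized.
        if entry.is_file() and entry.stem.lower() in LICENSE_NAMES:
            found.append(entry.name)
    return sorted(found)
```

Calling `find_license_files(".")` from the root of a cloned project returns the list of matching file names. An empty list is a signal to ask the authors before reusing anything: code published without a license is not open source by default.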

EIGs use mainly open source software

What are the open source tools/software/frameworks used by EIGs?

In no particular order: angular.js, antizer, apache airflow, apache hive, atom, babel, bootstrap, bulma, chart.js, cider, clojure, clojurescript, d3.js, elasticsearch, emacs, embulk, flask, git, jupyter, laravel, leaflet, mongodb, neo4j, postgresql, pyspark, python, r, react.js, redash, rstudio, sass, scala, scikit-learn, tensorflow, tornado, vim, visual studio code, vue.js, webpack.

That's a lot! It reflects a plain fact: nowadays it is impossible to develop a software project without using one or more open source products, either as a development tool or in the product's software "stack".

The most popular products included on Github - © Github

On the proprietary software side, there are only three: the Sublime Text editor, the Adobe suite and the Vertica database. The other proprietary tools EIGs have to use are the ones already available in their administrations (in particular Oracle databases).

EIGs have produced open source software

But EIGs do not only consume open source software, they also produce it! Their output falls into several categories: complete applications, software libraries, generic tools, scripts and other ad hoc tools.

Applications include:

  • Open Scraper: an open source tool for retrieving data from several websites at the same time and structuring the resulting data.
  • Gobelins: a distribution and search tool for Mobilier national's collections.
  • Stalactite: a tool for viewing, classifying and processing a tree structure containing all types of documents (e-mails, images, documents, presentations, etc.).
  • Graph Explorer: a tool for viewing and exploring a large graph of financial transactions.

An overview of the Graph-Explorer interface

Note the efforts made to communicate well regarding these projects: writing a good README is an essential step in producing open source software. Graph-Explorer, for example, guides the user step by step through the application's installation and testing, increasing the potential for reuse.

Generic tools include metadocs, which combines several Sphinx documentation projects; Open API Schemas to Markdown, which generates Markdown documentation from schemas that follow the OpenAPI specification; and spacy-lefff, a package for lemmatization and part-of-speech tagging in French.

Animated gif presenting the metadocs tool, used to include several Sphinx documentation projects.

Libraries include H3.Standard, which provides C# bindings for the C library developed by Uber for geospatial indexing based on a hexagonal grid.

And finally, a few tools: a small Twitter bot in Clojure, an Org-mode export module that renders HTML as a timeline, a familiarization tool with a Python backend and a vue.js frontend, a csv2html mini application to publish CSV files as datatables, a library for finding public holidays in France and another one for school holidays… there is something for everyone! All this just needs to be tested, debugged, used… and to receive your contributions: it's open source!
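
To give a flavor of how small such a tool can be, here is a minimal Python sketch of the general idea behind turning CSV into an HTML table. It is not the code of the csv2html application mentioned above: the function name `csv_to_html_table` is hypothetical, the first row is assumed to be the header, and the output is a plain static table rather than an interactive datatable.

```python
import csv
import html
import io


def csv_to_html_table(csv_text):
    """Convert CSV text into a plain HTML table, treating the first row as the header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1:]
    parts = ["<table>", "<thead><tr>"]
    # Escape every cell so that characters such as < or & in the data
    # cannot break the generated markup.
    parts += [f"<th>{html.escape(cell)}</th>" for cell in header]
    parts.append("</tr></thead>")
    parts.append("<tbody>")
    for row in body:
        cells = "".join(f"<td>{html.escape(cell)}</td>" for cell in row)
        parts.append(f"<tr>{cells}</tr>")
    parts += ["</tbody>", "</table>"]
    return "\n".join(parts)
```

Passing the text of a CSV file to `csv_to_html_table` returns a string that can be dropped into a web page. Using the standard `csv` module rather than splitting on commas means quoted fields containing commas are handled correctly.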

EIG and the open source ecosystem: learn and/or share

Being "open source" also means participating in communities that discuss, share and learn together.

Some EIGs published several technical blog entries, others asked their questions on Stack Overflow, and others still contributed to existing open source software. And above all, we have sometimes received help from people outside the EIG program, and that's great!

These exchanges of knowledge take EIGs beyond their field of competence and their comfort zone. Some took part in the day organized with the Framasoft association around writing a Storify clone; others dabbled with the free database Wikidata during a workshop at Etalab on publishing data.gouv.fr data; others still discovered Wikimedia projects during a day's visit to Mobilier national's workshops and to Wikimedia Commons. These were all opportunities to meet and cooperate with important "open source" actors.

EIG designers helped design a clone for Storify during the workshop undertaken with Framasoft.

The aim was to move beyond the strictly utilitarian standpoint sometimes observed with reference to "open source" and meet communities that are involved in various aspects of this open source culture.

How can projects be kept open source?

Well-known open source programs are common digital assets. A "common asset" involves three things: a shared resource (in this case the source code), a community to maintain it, and the governance rules that this community establishes.

Open source software produced by EIGs is not yet such a common asset: it is maintained by a small team, and external contributions are not large enough for the question of governance to have arisen. But that leaves two problems unsolved: how can this software be maintained over time? And why and how should it be turned into a digital common asset?

The first problem already arises for administrations. The EIG program has seen two ways of approaching this: the first is to increase the skills of the in-house technical teams to allow them to become power users of the application, or even be able to debug and upgrade it; the second is to invest in the creation of a collective body whose task is to fund the future changes in the software. This is the case for the Open Scraper software, which is potentially useful for administrations other than the one that started its development within the EIG program.

It should be noted here that these issues go beyond the framework of EIGs and arise for all open source software. In 2017, Framasoft launched the Contributopia campaign to draw users' attention to the importance of contributing to these common assets and putting them on a permanent footing.

Administrations are an interesting environment in this respect, as they are in a position to invest in the development of stable pooled resources. The required change of culture is twofold: moving from being a mere consumer of open source software to being a contributor, and from being a contributor to being a maintainer of a digital resource with its own governance, shared by a community extending beyond the boundaries of the contributing administration (the OpenFisca project is a good example of this). These changes have a cost and they will not happen spontaneously. The EIG program shows one way of approaching them: by putting the energy of committed developers at the heart of administration departments.

If the question of how to produce and maintain open source software in administrations interests you, join us in Paris on 6 December at the Paris Open Source Summit, where DINSIC is organizing the first gathering on open source software in administrations.

We'll keep you posted!