Berlin Technical University and Nextcloud: showing public services how to interact with open source projects
QED. Public services commonly wonder how to interact with fluid communities of open source developers and users. The seven-year relationship between the Technische Universität Berlin and the developer community of Nextcloud, an open source solution for cloud storage, file sharing and file synchronisation, shows how such a relationship can be both easy and mutually beneficial.
A universe of files
At the Technische Universität Berlin (Technical University of Berlin, TU Berlin), cloud storage, file sharing and file synchronisation began in May 2012. Since then, the tubCloud service managed by tubIT, the university’s IT department, has grown to about 22,000 users - students, staff and guests of the university.
Every day, tubIT’s Nextcloud implementation manages changes made to some 80,000 out of a total of 100 million files, using about 70 TB of disk space. Up to now this disk space has been managed using General Parallel File System, a proprietary high-performance cluster file management solution that (also) runs on Linux servers. However, tubIT is nearly ready with preparations for this year’s switch to Ceph, an open source alternative cluster file management system.
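To put these figures in proportion, the short Python calculation below is a back-of-the-envelope sketch based only on the rounded numbers quoted above. It assumes decimal units (1 TB = 10^12 bytes), which the article does not specify, and the variable names are purely illustrative.

```python
# Rounded figures quoted in the article.
files_total = 100_000_000        # about 100 million files
files_changed_per_day = 80_000   # about 80,000 files changed daily
disk_space_tb = 70               # about 70 TB of disk space

# Assumption: decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes).
BYTES_PER_TB = 10**12
BYTES_PER_MB = 10**6

# Share of all files that change on a typical day, in percent.
daily_change_percent = files_changed_per_day / files_total * 100

# Average space per file, in megabytes.
avg_file_size_mb = disk_space_tb * BYTES_PER_TB / files_total / BYTES_PER_MB

print(f"Files changed per day: {daily_change_percent:.2f}% of the total")
print(f"Average space per file: {avg_file_size_mb:.1f} MB")
```

On these rounded numbers, fewer than one file in a thousand changes on a given day, and the average file takes up less than a megabyte.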
A second instance taking up slightly less disk space is part of the cloud research program by the Deutsches Forschungsnetz (German Research Network or DFN). This cloud file storage system is now used by 16 research and higher education organisations that are DFN members.
Staying in control
Preparations for cloud storage began at tubIT in 2011. First, the IT department staff decided they would evaluate only file sharing and synchronisation services they could host and manage themselves. This would ensure that the system complied with German privacy rules and demands placed on contracts by third parties.
Second, enticing the students and staff of TU Berlin to keep their data on the university’s own file sharing service required a user-friendly solution. It also would have to accommodate big files and large file quotas, and allow for strict rules regarding security, data protection and data backup.
A third reason for tubIT to consider only on-premise ’sync-n-share’ solutions is that it wanted to gain and retain open source expertise. “Knowledge is the most valuable good of a university,” says Dr.-Ing. Thomas Hildmann, head of tubIT’s infrastructure department. The department, with its 23 sysadmins, is responsible for the university’s network and server infrastructure and its hosting facility.
A slim fit
In 2011 Dr.-Ing. Hildmann and his colleagues tested several solutions. After a few months they decided to use ownCloud. This file hosting system, originally written by the German software engineer Frank Karlitschek, had first been released under an open source licence less than a year before, yet already was the most user-friendly product available. It also fitted snugly into the existing LAMP environment – Linux operating systems, Apache web server, MySQL relational database management system and the PHP web programming language – and supported LDAP for accessing directory information services.
Importantly, the original successful test with a few thousand users drew attention to how well tubIT interacted with the ownCloud developer and user community.
In May 2017, the university made a smooth switch from ownCloud to Nextcloud. The latter is a fork created by Frank Karlitschek and others following disagreement over a licence change by the new owners of ownCloud.
Over the years, several features of ownCloud and Nextcloud have been developed in tandem with TU Berlin. Early examples were support for Kerberos, a mechanism to allow secure authentication of users, and SAML, an open standard for the exchange of authentication and authorisation data.
Bigger, better, faster, more
The computer scientists at TU Berlin frequently discuss technical solutions with the Nextcloud developers, providing them with empirical knowledge and keeping them up-to-date on academic research. Several computer science students have written bachelor’s and master’s theses about ownCloud and Nextcloud.
The implementation by TU Berlin was crucial in helping to scale Nextcloud and make it enterprise-ready. This began in 2012 when the university technicians started alerting the developers whenever they found tables that were not indexed – an issue that slows the system down.
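Why a missing index slows things down can be illustrated with a small, self-contained Python sketch. It uses an in-memory SQLite database and a made-up `files` table; the table, column and index names are hypothetical and are not taken from the ownCloud or Nextcloud schema. The sketch asks the database how it plans to run the same lookup before and after an index is created.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")

# Hypothetical table, for illustration only (not the real Nextcloud schema).
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, owner TEXT, path TEXT)")
conn.executemany(
    "INSERT INTO files (owner, path) VALUES (?, ?)",
    [(f"user{i % 100}", f"/docs/file{i}.txt") for i in range(10_000)],
)

lookup = "SELECT path FROM files WHERE owner = ?"

# Without an index, the database has to scan every row of the table.
plan_before = query_plan(conn, lookup, ("user42",))

# With an index on the filtered column, it can jump straight to matching rows.
conn.execute("CREATE INDEX idx_files_owner ON files (owner)")
plan_after = query_plan(conn, lookup, ("user42",))

print("Before index:", plan_before)
print("After index: ", plan_after)
```

The exact wording of the plan differs between SQLite versions, but the first plan reports a full table scan and the second reports a search using the new index. With a hundred million rows rather than ten thousand, that difference becomes very noticeable.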
Soon after, the university wanted the implementation to deliver faster response times and higher availability. The tubIT team began working on multi-master replication of their relational database management system (RDBMS), MySQL.
Open access policy
Their solution, which for MySQL (or the GPL fork MariaDB) databases is known as a Galera cluster, is not strictly part of Nextcloud, whose users can choose between three different brands of RDBMS: MySQL/MariaDB, SQLite or PostgreSQL. The developments at TU Berlin brought unforeseen issues to light, however, and Nextcloud development benefited greatly from the university’s help in tracking these down.
“As academics we are happy to try out new solutions,” Dr.-Ing. Hildmann says. The university is ’not a nuclear power plant’, he notes, and is happy to accommodate testing even if this means going offline for a few hours. That does not mean that the university accepts downtime lightly, however. Maintenance tasks that require tubCloud to go offline are scheduled to take place in the late evening hours or at weekends.
A running joke, says Dr.-Ing. Hildmann, starts afresh whenever TU Berlin finds a bug, or experiments with new features: “We submit comments and possible solutions to the developers, who then work to get that into working code. And then they will totally rewrite it, but thank us for the idea.”
University staff are allowed to contribute to open source projects. In fact, this is encouraged under the university’s new Open Access Policy, passed unanimously by the university senate in December 2017. The policy recommends that “members of the university publish their work as Open Access and under a free licence (preferably a Creative Commons licence CC BY).” Though source code contributions are not mentioned explicitly, they are covered by the policy, says Dr.-Ing. Hildmann.
The prime example of an open source contribution from the university is Nextcloud’s end-to-end encryption. Encrypted communication is the topic of Dr.-Ing. Hildmann’s 1998 Diplomarbeit (Master’s thesis) at the computer science faculty of TU Berlin (Institut für Kommunikations- und Softwaretechnik), and features in articles he published in 2001, 2002, 2014 and 2015.
“Frank [Karlitschek, the founder of both ownCloud and Nextcloud] and I have been talking about end-to-end encryption since we first implemented ownCloud at TU Berlin in 2012,” says Dr.-Ing. Hildmann. The topic is not just of academic interest. Some of the university’s contract research, for instance, requires researchers to use end-to-end encryption for electronic communication (or fall back to paper documents sent in sealed envelopes).
Encryption is also required by law for documents containing personal data. It is also used for draft research papers, to keep the competition away from closely-guarded research projects, and for research and inventions that could qualify for a patent application.
“I am very happy with the encryption implementation in Nextcloud version 13,” Dr.-Ing. Hildmann says. “The theory looks very solid.”
Nextcloud’s end-to-end encryption allows users to select folders they wish to encrypt, share these with others, and sync them across devices. The contents will never be readable by the server administrators or anyone else who is not authorised.
The university is currently testing the end-to-end encryption functionality. Given its large number of users, testing is a second major way TU Berlin can contribute. Testing lets Nextcloud find performance gaps and determine the system requirements for bigger organisations.
A third great contribution offered by TU Berlin lies in hosting the annual user conference. The university has made room for this meeting six times since 2012, for free. The IT staff happily contribute their time and effort, and help out with conference logistics - including carrying around a fridge to keep fizzy drinks cool in the IT department’s hardware storage room.
The week-long conference includes three days of presentations and practical workshops in six or seven classrooms of the Institute of Mathematics building on the university campus. On the other days, developers get together to talk about free software and politics, work on new functionality, and fix bugs. The conference attracts anywhere between 80 and 150 visitors.
The 2018 conference will take place over 23-30 August. As always, participation is free and registration is not required.
Seven years on, and counting the ways a public service can easily interact with an open source project, TU Berlin ticks nearly all the boxes:
[ ] Join the board
Only one thing is missing: tubIT is not officially involved in the open source project as a member of the board or a steering committee. Nor is it officially part of the process through which the Nextcloud roadmap is laid out. “Searching for open source on the TU Berlin pages returned 40,000 results,” says Dr.-Ing. Hildmann. “I’m sure some of my colleagues at the university are officially involved in an open source project. But I am not one of them. Yet.”
There is an opportunity to tick that final box. Nextcloud is considering an advisory board, to be set up at the conference this summer.