RELAIS (Record Linkage At Istat) is a toolkit for record linkage. It provides a collection of techniques for each phase of the record linkage process, and these techniques can be combined dynamically, so that the resulting workflow is built on the basis of the requirements of the application at hand. RELAIS has been implemented in Java and R and relies on a database architecture (MySQL).
The functionalities currently available are:
1) Reading of input files in textual format;
2) Creation of the search space of candidate pairs by means of the cross product, the blocking method, the sorted neighborhood method and the nested blocking method;
3) Choice of the matching variables;
4) Data profiling;
5) Set of comparison functions;
6) Probabilistic record linkage (estimation of the Fellegi and Sunter model parameters via the EM (Expectation-Maximization) algorithm);
7) Deterministic record linkage;
8) Reduction from an N:M to a 1:1 matching solution (with several methods).
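To give a rough idea of what the search space creation and the probabilistic step involve, the following Python sketch builds candidate pairs by blocking and by sorted neighborhood, and estimates the Fellegi and Sunter parameters with EM. It is only an illustration and not the RELAIS implementation (which is in Java and R): all function and parameter names are hypothetical, the comparison vectors are assumed to be binary (agree/disagree), and the matching variables are assumed to be conditionally independent.

```python
from collections import defaultdict
from itertools import combinations


def blocking_pairs(records_a, records_b, key):
    """Blocking: pair only records that share the same blocking key value."""
    blocks = defaultdict(list)
    for j, rec in enumerate(records_b):
        blocks[key(rec)].append(j)
    return [(i, j) for i, rec in enumerate(records_a)
            for j in blocks.get(key(rec), [])]


def sorted_neighborhood_pairs(records_a, records_b, key, window=3):
    """Sorted neighborhood: sort both files on a key, pair records in a sliding window."""
    tagged = [("a", i, key(r)) for i, r in enumerate(records_a)]
    tagged += [("b", j, key(r)) for j, r in enumerate(records_b)]
    tagged.sort(key=lambda t: t[2])
    pairs = set()
    for start in range(len(tagged)):
        in_window = tagged[start:start + window]
        for (s1, i1, _), (s2, i2, _) in combinations(in_window, 2):
            if s1 != s2:  # keep only pairs made of one record per file
                pairs.add((i1, i2) if s1 == "a" else (i2, i1))
    return sorted(pairs)


def fellegi_sunter_em(gammas, n_iter=50, p=0.1, m_init=0.9, u_init=0.1, eps=1e-6):
    """EM estimate of p = P(match), m[k] = P(agree on k | match),
    u[k] = P(agree on k | non-match) from binary comparison vectors."""
    def clamp(x):
        # Keep probabilities away from 0 and 1 to avoid divisions by zero.
        return min(max(x, eps), 1 - eps)

    n, n_vars = len(gammas), len(gammas[0])
    m, u = [m_init] * n_vars, [u_init] * n_vars
    for _ in range(n_iter):
        # E-step: posterior probability that each pair is a match.
        post = []
        for gamma in gammas:
            pm, pu = p, 1 - p
            for k, agree in enumerate(gamma):
                pm *= m[k] if agree else 1 - m[k]
                pu *= u[k] if agree else 1 - u[k]
            post.append(pm / (pm + pu))
        # M-step: re-estimate the parameters from the posteriors.
        total = sum(post)
        p = clamp(total / n)
        for k in range(n_vars):
            agree_match = sum(g * gamma[k] for g, gamma in zip(post, gammas))
            agree_unmatch = sum((1 - g) * gamma[k] for g, gamma in zip(post, gammas))
            m[k] = clamp(agree_match / total)
            u[k] = clamp(agree_unmatch / (n - total))
    return p, m, u
```

In this sketch a record can be any object, and `key` is a function that extracts the blocking or sorting value from it, for example `lambda r: r["year"]`. The returned pairs are index pairs (position in the first file, position in the second file); comparing the matching variables on each pair yields the comparison vectors that `fellegi_sunter_em` takes as input.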
The next functionalities that we plan to add are:
1) Preprocessing (character conversions, schema reconciliation, standardization, etc.);
2) Enhanced support for commercial relational DBMSs;
3) Improvement of GUI functionalities for output management and user interactions.
RELAIS is a framework consisting of several record linkage techniques, and it has been designed so that new techniques can be added to the pool already available.
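One way to picture such an extensible pool is a registry indexed by phase and technique name, to which a new technique is added without touching the existing ones. The Python sketch below is purely illustrative: the registry, the `register` and `run` helpers and the phase name "reduction" are hypothetical and do not describe the RELAIS code base. The registered technique is a simple greedy reduction from N:M to 1:1, chosen here only as an example and not necessarily one of the methods offered by RELAIS.

```python
from typing import Callable, Dict, List, Tuple

ScoredPair = Tuple[int, int, float]  # (record in file A, record in file B, score)

# Hypothetical registry: phase name -> technique name -> function.
TECHNIQUES: Dict[str, Dict[str, Callable]] = {}


def register(phase: str, name: str):
    """Decorator that adds a technique to the pool for a given phase."""
    def decorator(func: Callable) -> Callable:
        TECHNIQUES.setdefault(phase, {})[name] = func
        return func
    return decorator


@register("reduction", "greedy")
def greedy_one_to_one(scored_pairs: List[ScoredPair]) -> List[ScoredPair]:
    """Keep the highest-scoring pairs so that each record is linked at most once."""
    used_a, used_b, kept = set(), set(), []
    for a, b, score in sorted(scored_pairs, key=lambda t: t[2], reverse=True):
        if a not in used_a and b not in used_b:
            kept.append((a, b, score))
            used_a.add(a)
            used_b.add(b)
    return kept


def run(phase: str, name: str, *args, **kwargs):
    """Look up a technique by phase and name, then apply it."""
    return TECHNIQUES[phase][name](*args, **kwargs)
```

With this layout a workflow is just a sequence of (phase, technique name) choices, and adding a technique amounts to writing one more decorated function.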
Public administration reference
RELAIS is a project developed at the Italian National Institute of Statistics (Istat). The current release of the system is also published on the Istat web site at: http://www.istat.it/strumenti/metodi/software/analisi_dati/relais/.