Concept
Shared Code Base
Develop routines in a language that can be called from statistical tools, avoiding the need to re-implement any algorithm in the tool’s specific syntax:
- C/C++: compiled language for fast loop operations
- Java: compiled language with a large choice of domain-specific libraries; callable from most tools, including web applications and anything running on a Java Virtual Machine (JVM)
- Scala: recent language inspired by Java; designed for software platforms requiring parallel or concurrent computations
- Python: scripting language with a large ecosystem for efficient scientific computation; Python functions can be used in LibreOffice Calc, an open source alternative to MS Excel, or invoked through system commands
- R: developed by statisticians and geared towards statistics; recently adopted by Microsoft with Microsoft R Server and Microsoft R Open (MRO); enhanced packaging features make it easy to develop software for use by community members
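As an illustration of the shared code base idea, here is a minimal sketch in Python of one routine written once and exposed in two ways: as a plain function that a tool with Python bindings could import, and as a command-line entry point that any tool able to issue system commands could call. The names weighted_mean and main, and the --values and --weights options, are hypothetical and chosen only for this example; they do not refer to an existing library.

```python
import argparse
import json
from typing import List, Optional, Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted arithmetic mean of `values` (illustrative routine)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def main(argv: Optional[List[str]] = None) -> int:
    """Hypothetical command-line wrapper around the shared routine."""
    parser = argparse.ArgumentParser(description="Weighted mean (illustrative).")
    parser.add_argument("--values", required=True, help="JSON list, e.g. [1, 2, 3]")
    parser.add_argument("--weights", required=True, help="JSON list, e.g. [1, 1, 2]")
    args = parser.parse_args(argv)
    # Inputs arrive as JSON text so that any calling tool can produce them.
    print(weighted_mean(json.loads(args.values), json.loads(args.weights)))
    return 0


# Simulate the call a statistical tool would make through a system command.
main(["--values", "[1, 2, 3]", "--weights", "[1, 1, 2]"])  # prints 2.25
```

In an installed script the last line would instead pass the real command-line arguments, for example main(sys.argv[1:]). The point of the sketch is that the algorithm itself lives in one place and the wrappers stay thin.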
Approach
- create functions for use in statistical tools based on actual usage information obtained from the technical review
- interact with other directorates of the Organisation, e.g. the Statistics Directorate
- establish links to existing projects, e.g. “program code management” and “developer user collaboration”
- establish governance and policies for submission and review of algorithms
- propose opening the developed algorithm library to non-contributing users (covering aspects such as license, server, PAC, etc.)
Guiding Principles
Mission
Making the analysis more natural and convenient
Prime Directive
Making the results more trustworthy
Implementation
Classes
Creating classes and methods responds to both guiding principles: making the analysis more natural and convenient (the Mission) and making the results more trustworthy (the Prime Directive). The key challenge is dealing with complexity and growth: how to expand the computing capabilities in a way that is easy to use and leads to trustworthy software.
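A minimal sketch in Python of how a class could serve both principles, assuming a hypothetical Sample class (its name, fields and methods are illustrative and not part of any existing library). Methods give the analyst a natural way to express common operations, and validation at construction time means that every later computation works on data that has already been checked.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Tuple


@dataclass(frozen=True)
class Sample:
    """Hypothetical container for a numeric sample with a unit label."""

    values: Tuple[float, ...]
    unit: str = ""

    def __post_init__(self) -> None:
        # Prime Directive: reject invalid input once, when the object is built.
        if len(self.values) == 0:
            raise ValueError("a sample needs at least one observation")
        if any(v != v for v in self.values):  # NaN is the only value unequal to itself
            raise ValueError("missing values (NaN) must be handled explicitly")

    def mean(self) -> float:
        # Mission: a common operation is available as a method on the data.
        return fmean(self.values)

    def rescale(self, factor: float, unit: str) -> "Sample":
        # Returns a new object; the frozen original cannot be altered by accident.
        return Sample(tuple(v * factor for v in self.values), unit)


heights = Sample((1.0, 2.0, 3.0), unit="m")
print(heights.mean())                       # prints 2.0
print(heights.rescale(100.0, "cm").values)  # prints (100.0, 200.0, 300.0)
```

Making the class immutable is one possible design choice: growth then happens by adding methods that return new objects, which keeps each step of an analysis reproducible.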