RFC 6 Static, Unit and Integration testing
Authors: M. Ubeda Garcia
Last Modified: 15/VI/2012
This proposal accompanies the adoption of the Continuous Integration tool Jenkins.
The "raison d'être" of this proposal is the lack of standardized tests at any level in the DIRAC framework. In other words, the goal is to put together all the bits of code that run tests and launch them in a coherent way. The sooner errors are spotted in the development / testing phase, the better. This requires different approaches and granularities where tests are concerned.
In this proposal three testing levels are described: static, unit and integration. We omit system testing and system integration testing, as they would need a requirements specification that we currently lack.
You can find the running prototype used for LHCbDirac here - use your NICE credentials to log in. We understand that this prototype "lives" in an LHCb environment. LHCb is willing to volunteer to maintain such prototypes. Access by members of other VOs would be granted on request.
Every developer MUST ensure there are no bugs or typos in the code. That is easily achievable, and it can be argued that no external tool is needed for it. Certainly true, but since there is a specific DIRAC code convention, all developers must follow it, which is not always the case.
This step ensures the quality of the code written by developers. More interesting is the check for bad programming practices, such as the use of mutable objects as default arguments, or the well-known 'catch Exception'. These are just two examples, but under certain conditions they can hide real problems ( or introduce a problem into perfectly healthy code ). Sometimes small checks like these can spot unimaginable amounts of errors.
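As an illustration, the two pitfalls mentioned above can be reproduced in a few lines ( the function names here are made up for the example ):

```python
def add_job(job, queue=[]):  # BUG: the default list is created once and shared across calls
    queue.append(job)
    return queue

first = add_job("jobA")   # ["jobA"]
second = add_job("jobB")  # ["jobA", "jobB"] - "jobB" lands in the same shared list!

def add_job_fixed(job, queue=None):
    # The fix: create a fresh list per call instead of reusing a mutable default.
    if queue is None:
        queue = []
    queue.append(job)
    return queue

# A blanket 'except Exception' silently swallows real errors:
try:
    result = int("not a number")
except Exception:  # hides the ValueError - and any unrelated bug - from the caller
    result = 0
```

Both patterns look harmless in isolation, which is exactly why an automated checker such as pyLint is useful for catching them.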
The proposed tools are:
- pyLint: a source code bug and quality checker for the Python programming language. It follows the style recommended by PEP 8, the Python style guide.
- clonedigger: aimed at detecting similar / duplicated / cloned code in Python programs.
- sloccount: a set of tools for counting physical Source Lines of Code (SLOC).
- cyclomatic complexity: analyzes the linearly independent paths through a program.
The first one ( pyLint ) is already running, and shows there is a LOT of work to do. Putting the others in place is not a time-consuming task, but until the first tests are successful there is no point in moving forward. pyLint reports h ( high ), m ( medium ) and l ( low ) warnings. The first are bugs, which must be fixed straight away. The second are usually bad coding practices, which can be dangerous. The last are mostly warnings about missing documentation.
clonedigger, sloccount and cyclomatic complexity are not running, nor have they been actively tried out. Possible actions on them will come out of this RFC.
Unit tests must be able to run without a database back-end, a service up and running, or a running agent. They test the simplest pieces of code, although this is not always easy. Unit tests are a very good indicator of spaghetti code - if there is no way to write a unit test for a function, it is better to rewrite it and save future headaches.
When developing, it must be taken into account that writing tests takes on average as much time as writing the code. ( IMHO ) there is no reason not to write them. For code that is already mature, this point is subject to discussion.
Unit testing goes hand in hand with mocking and faking code. So far, we lack guidelines for doing so. Once they are established, writing the unit tests is a piece of cake. If the mocking is done properly, the tests will provide a very good description of what the code is doing behind the scenes.
For those unit tests that are already written within DIRAC, the tool used is the Python library unittest. Within Jenkins, we have set up the nose tool for automatic runs: Python unit tests are launched with nose, which is the tool returning a complete report of successes / failures. As for mocking, it is done using the mock library, which is already part of the DIRAC externals. While the usage of these tools is subject to formal approval with this RFC, we believe there is no concrete reason for changing. The use of unittest, mock and nose is well documented, and no deviations from their standard usage are proposed.
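A minimal sketch of what such a test could look like with unittest and mock ( the CatalogClient class and countReplicas function are hypothetical stand-ins, not actual DIRAC API; the import below is the standard-library form of the same mock package shipped in the DIRAC externals ):

```python
import unittest
from unittest import mock  # DIRAC externals ship this as the standalone 'mock' package

class CatalogClient:
    """Hypothetical client that would normally talk to a real service."""
    def getReplicas(self, lfn):
        raise NotImplementedError("talks to a real service")

def countReplicas(client, lfn):
    # Function under test: interprets the usual S_OK / S_ERROR style dict.
    res = client.getReplicas(lfn)
    if not res["OK"]:
        return 0
    return len(res["Value"])

class Test_CountReplicas(unittest.TestCase):
    def setUp(self):
        # spec= makes the mock reject attributes the real client does not have.
        self.client = mock.Mock(spec=CatalogClient)

    def test_success(self):
        self.client.getReplicas.return_value = {"OK": True, "Value": ["SE1", "SE2"]}
        self.assertEqual(countReplicas(self.client, "/lhcb/some/file"), 2)

    def test_failure(self):
        self.client.getReplicas.return_value = {"OK": False, "Message": "boom"}
        self.assertEqual(countReplicas(self.client, "/lhcb/some/file"), 0)

    def tearDown(self):
        self.client = None
```

A file like this needs no database or service to run, and nose picks it up automatically by name.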
Together with the unit tests, within the Jenkins prototype we run cobertura to know what percentage of the code is actually tested. In a perfect world, it should be 100%.
For some LHCbDirac packages the results of nose (and unittest with mock) and cobertura are already available.
This is the most complex test by far. It requires a fully functional system ( which includes databases, services and very probably running agents ), and it must also be reproducible, which is what makes it problematic.
In order to ensure repeatability, we need the same information in the database ( and CS ). This means we need a snapshot of the databases and CS at the time the integration tests are written. Getting all the data snapshots will be expensive the first time; for future code modifications, it should be the developer who updates the test data if needed.
The proposal is to ship the snapshots with the code, populate the test database with them, and run the needed integration tests.
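A sketch of the snapshot approach, using an in-memory sqlite database as a stand-in for the real back-end ( the table, data and test name are invented for the example; in DIRAC the snapshot would be an SQL dump of the real database taken when the test was written ):

```python
import sqlite3
import unittest

# Hypothetical snapshot shipped alongside the code.
SNAPSHOT_SQL = """
CREATE TABLE Jobs (JobID INTEGER PRIMARY KEY, Status TEXT);
INSERT INTO Jobs VALUES (1, 'Done');
INSERT INTO Jobs VALUES (2, 'Failed');
"""

class Test_JobDB_Integration(unittest.TestCase):
    def setUp(self):
        # Populate a fresh test database from the snapshot, so every run
        # starts from exactly the same state and the test is reproducible.
        self.db = sqlite3.connect(":memory:")
        self.db.executescript(SNAPSHOT_SQL)

    def test_failed_jobs(self):
        rows = self.db.execute(
            "SELECT JobID FROM Jobs WHERE Status = 'Failed'").fetchall()
        self.assertEqual(rows, [(2,)])

    def tearDown(self):
        self.db.close()
```

The key point is that setUp rebuilds the state from the shipped snapshot on every run, so the test never depends on what a previous run left behind.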
How and where to run them is still to be decided.
As it is now, Jenkins checks every hour whether there are changes in the repository ( svn or git ). If there are, it schedules a new Jenkins job. This job runs static and unit tests against trunk (SVN) / integration (Git). It is intended to be used by developers to see whether their new code is in good shape or not. If a system does not pass this first test, there is no need to propose a release candidate until it is successful. As of today (14 Jun 2012), the Jenkins server is running tests for only this use case.
Same use case, but can be done on demand through the web portal. It schedules a test automatically for those impatient developers that do not want to wait for the cron-job mode Jenkins job.
Once there is a pre-release candidate, a tag is created by whoever is in charge. Jenkins picks it up automatically and generates a new Jenkins job for that tagged code. It automatically runs all integration tests ( on top of the static and unit tests ).
If needed, commissioning of a new environment can be done with a few clicks. Jenkins can run if needed over a set of nodes, which are configurable to emulate any HW / SW.
The proposed implementation follows these guidelines:
- tests are located in a subdirectory named 'tests' in each system ( e.g. Core/tests )
- test names follow the pattern Test_.py ( e.g. Test_DMS_Client_Dataset.py )
- no sys.modules redefinition in the tests! Overwrite imported modules with mocked ones if needed.
- do not forget about tearDown
- feel free to use fixtures
- unittesting & mocking guide by Krzys
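For the 'no sys.modules redefinition' guideline, mock.patch offers a clean alternative: patch the dependency where it is looked up, and let the framework undo it. A sketch ( the function under test and the patched value are invented for the example ):

```python
import os
import unittest
from unittest import mock

def workDirName():
    # Toy function under test: depends on the current working directory.
    return os.path.basename(os.getcwd())

class Test_Core_WorkDir(unittest.TestCase):
    def setUp(self):
        # Patch the imported function instead of overwriting sys.modules;
        # addCleanup guarantees the patch is removed after each test, even
        # on failure, so there is no tearDown bookkeeping to forget.
        patcher = mock.patch("os.getcwd", return_value="/scratch/jenkins")
        self.addCleanup(patcher.stop)
        patcher.start()

    def test_workdir(self):
        self.assertEqual(workDirName(), "jenkins")
```

Once the patched test finishes, os.getcwd behaves normally again for every other test in the suite.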
to be discussed:
- a fake subdirectory with a proper fake implementation of all modules, maintained by the developer. This would solve the problem of a fake piece of code not being updated when the original is. It requires some work and coordination. ( e.g. Test_ModuleA uses ModuleB, and runs with its own fake version of ModuleB. ModuleB is updated, but not the fakeModuleB used by ModuleA in its tests; if ModuleB already provides its own fake version, we avoid false positives ). The problem boils down to the following: if there is a single entry point to the data per system ( the clients ), then maintaining the fake code is easy; if the data is accessed in multiple places, it is a nightmare. Are we in a position to guarantee a single entry point ?
The testing framework can notify users by email if tests are failing. This way there is no need to go through the portal to check the status. The proposal is the following:
- notify developers when their changes are crashing ( this is used in the cron-job mode ). User A commits some modifications, Jenkins picks up this latest code with some regression errors, and it notifies User A.
- notify the developer(s) in charge of the system whenever there is a crash in their code, independently of the source. In this respect, we would need a list of developers for each system. We have such a list in an informal way, but why not write it down ? This way, there will be no room for forgotten / unseen bugs.
- notify a 'power user', most likely the person in charge of preparing next release. After all, this person needs to know whether the code he / she is about to release is in good shape or not.