XenonStack

6/04/2018 03:53:00 pm

Unit Testing, TDD and BDD in Machine Learning and Deep Learning

Introduction to Test Driven Development (TDD)

Test Driven Development (TDD) is a development pattern in which tests are written before the code they verify. It lets developers state the intended behavior of the application as tests and then write code that satisfies them.
The Test Driven Development process should meet the following requirements:

Detect changes in the intended behavior.
Keep a rapid iteration cycle that produces working software after each iteration.
Identify bugs. If no test is failing but unexpected behavior is still found, it is treated as a new feature rather than as a bug.

Tests can be written for functions and methods, whole classes, programs, web services, whole machine learning pipelines, neural networks, random forests, mathematical implementations and many more.

Test Driven Development Lifecycle

The TDD cycle lets the programmer build functionality in small increments. Each increment goes through the three steps described below:

Failed Test (RED) - The first step of TDD is to write a test that fails. In Machine Learning, a failing test might be one run against an algorithm that always predicts the same thing. This acts as a kind of baseline test for Machine Learning algorithms.
Pass the Failed Test (GREEN) - After writing the failing test, the next step is to write code that makes it pass. The failing test is divided into a number of smaller failing tests, which are then exercised with random values and dummy objects.
Refactoring the Code - After the test passes, the code is refactored. While refactoring, changes to the code must not alter its behavior.

If the developer adds special handling to the code, such as an if statement, the change is no longer a refactoring. If a previously passing test changes its result during refactoring, the code has to go through the test cycle again.
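The three steps above can be sketched on a toy classification problem. This is a minimal illustration under stated assumptions, not a definitive implementation: `ConstantClassifier`, `ThresholdClassifier`, `accuracy`, and `beats_baseline` are hypothetical names introduced here, the labels are assumed to be 0 and 1, and the data is made up.

```python
from collections import Counter


class ConstantClassifier:
    """Hypothetical baseline: always predicts the most common training label."""

    def fit(self, xs, ys):
        self.label_ = Counter(ys).most_common(1)[0][0]
        return self

    def predict(self, xs):
        return [self.label_ for _ in xs]


class ThresholdClassifier:
    """Hypothetical model: splits at the midpoint between the two class means."""

    def fit(self, xs, ys):
        zeros = [x for x, y in zip(xs, ys) if y == 0]
        ones = [x for x, y in zip(xs, ys) if y == 1]
        self.threshold_ = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        return self

    def predict(self, xs):
        return [1 if x > self.threshold_ else 0 for x in xs]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def beats_baseline(model, train, test):
    """The test: a fitted model must be strictly more accurate than the baseline."""
    (x_train, y_train), (x_test, y_test) = train, test
    baseline = ConstantClassifier().fit(x_train, y_train)
    model.fit(x_train, y_train)
    model_acc = accuracy(y_test, model.predict(x_test))
    baseline_acc = accuracy(y_test, baseline.predict(x_test))
    return model_acc > baseline_acc


# Made-up, one-dimensional data for illustration only.
TRAIN = ([0.1, 0.2, 0.3, 0.4, 0.8, 0.9], [0, 0, 0, 0, 1, 1])
TEST = ([0.15, 0.35, 0.85, 0.95], [0, 0, 1, 1])

# RED: the first "model" always predicts the same thing, so the test fails.
red = beats_baseline(ConstantClassifier(), TRAIN, TEST)
# GREEN: a model that actually uses the input makes the test pass.
green = beats_baseline(ThresholdClassifier(), TRAIN, TEST)
print(red, green)  # False True
```

A refactoring step would then change only the internals of `ThresholdClassifier` (for example, extracting the mean computation into a helper) while `beats_baseline` keeps returning the same result.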

Acceptance Test Driven Development (ATDD)

ATDD stands for Acceptance Test Driven Development. It is carried out before development starts and brings customers, testers, and developers into the loop.
Together they decide on the acceptance criteria and work to meet those requirements.
ATDD helps ensure that all project members understand what needs to be done and implemented. Failing tests give quick feedback that a requirement is not being met.
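One way to make agreed acceptance criteria executable is to keep them as plain examples and check any candidate implementation against them. The sketch below assumes a hypothetical sentiment-labelling requirement; `ACCEPTANCE_CRITERIA`, `unmet_criteria`, `predict_stub`, and `predict_keywords` are illustrative names, not part of any real project.

```python
# Hypothetical criteria agreed by customers, testers, and developers.
# Each entry states WHAT is expected, not HOW it is computed.
ACCEPTANCE_CRITERIA = [
    ("great product, works well", "positive"),
    ("terrible, broke after a day", "negative"),
    ("great value", "positive"),
]


def unmet_criteria(predict, criteria):
    """Return the criteria that `predict` does not satisfy yet."""
    return [(text, expected) for text, expected in criteria
            if predict(text) != expected]


def predict_stub(text):
    # Before development starts, nothing is implemented.
    return None


def predict_keywords(text):
    # One possible implementation; the criteria do not care how it works.
    return "positive" if "great" in text else "negative"


print(len(unmet_criteria(predict_stub, ACCEPTANCE_CRITERIA)))   # 3
print(unmet_criteria(predict_keywords, ACCEPTANCE_CRITERIA))    # []
```

The list returned by `unmet_criteria` is the quick feedback: every entry in it is a requirement that is not being met yet.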

Advantages of Acceptance Test Driven Development

Because ATDD happens first, it reduces defects and bug-fixing effort as the project progresses.
ATDD focuses only on the ‘What’ and not the ‘How’, which makes it easier to meet the customer’s requirements.
ATDD makes developers, testers, and customers work together, which helps everyone understand what is required from the system.

Importance of Test Driven Development in Machine and Deep Learning

Often the code does not raise an error, yet the result is not what was expected; in other words, the output we get is not exactly what we wanted.
Suppose we want to use a package and start by importing it. The package may already have been imported, in which case we would be importing it again.
To avoid this, we want to test whether the package is already imported. When the whole code is submitted to the test case, the test case should be able to determine whether the package is already imported. This avoids duplication.
Similarly, when we use pre-trained models for predictions, the models can be huge, so we want to load a model only once. Loading it multiple times slows processing down because it occupies more memory than is needed. Here too, duplication has to be avoided.
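Both duplication checks can be written as small tests. Python already records loaded modules in `sys.modules`, so a repeated `import` does not re-execute the module, and a test can look a package up there. For the model, the sketch below assumes a hypothetical `load_model` function that stands in for an expensive load and wraps it with `functools.lru_cache` so that repeated calls reuse the same object. `LOAD_LOG`, `is_imported`, and the file name are illustrative only.

```python
import functools
import sys

LOAD_LOG = []  # records every real load, so a test can count them


def is_imported(package_name):
    """True if the package has already been imported in this process."""
    return package_name in sys.modules


@functools.lru_cache(maxsize=None)
def load_model(path):
    """Stand-in for an expensive model load (hypothetical)."""
    LOAD_LOG.append(path)
    return {"path": path, "weights": [0.0, 0.0, 0.0]}


def test_model_is_loaded_once():
    load_model.cache_clear()  # start from an empty cache
    loads_before = len(LOAD_LOG)
    first = load_model("sentiment-model.bin")
    second = load_model("sentiment-model.bin")
    assert first is second                      # same object, not a copy
    assert len(LOAD_LOG) == loads_before + 1    # only one real load happened


def test_package_is_imported():
    assert is_imported("functools")
    assert not is_imported("a_package_that_is_not_imported")


test_model_is_loaded_once()
test_package_is_imported()
```

Caching is only one possible design; a module-level singleton or an explicit model registry would satisfy the same test.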
Other cases to consider involve necessary and sufficient conditions. A function takes an input and returns an output. We are interested in a sufficient condition for saying that the function works properly. An example of a necessary condition is that each step in the function is error free.
If a function raises an error when given an input, for example an indentation error, the function is not well defined. So one necessary condition is that every step is error free. But if the function runs successfully and gives an output, does that mean we have the correct answer?
Say we have two functions in a package, addition and multiplication, but the developer has put the addition code in multiplication and vice versa (a mistake made while defining the functions).
If we use the function directly, we will get a result, but not the expected one. So we can create a test case from known inputs and their known outputs, using several test examples rather than just one, and set the condition that if all the test cases pass, the given function is correct.
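The swapped-function scenario can be sketched as follows. The function names and the test cases are assumptions made for illustration; `multiply_swapped` carries the deliberate bug described above.

```python
def multiply_swapped(a, b):
    # Deliberate bug for illustration: the addition code ended up here.
    return a + b


def multiply(a, b):
    return a * b


# Known inputs with known outputs. (2, 2) alone cannot tell the two
# operations apart, because 2 + 2 == 2 * 2, hence several cases.
MULTIPLY_CASES = [((2, 2), 4), ((3, 4), 12), ((5, 0), 0)]


def passes_all(func, cases):
    """True only if func returns the known output for every known input."""
    return all(func(*args) == expected for args, expected in cases)


print(passes_all(multiply_swapped, MULTIPLY_CASES[:1]))  # True: one case is not enough
print(passes_all(multiply_swapped, MULTIPLY_CASES))      # False: the bug is caught
print(passes_all(multiply, MULTIPLY_CASES))              # True
```

The first call shows why a single example is not a sufficient condition: the buggy function runs without error and even returns the right number for that one input.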

Simple Testing Module in Python

First, a simple testing module implemented in Python is described; it is then used for TDD in Machine Learning and Deep Learning.
To start writing tests, one first has to write a failing test. A simple failing test is described below -
In the above example, a NumGues object is instantiated. Before running the testing script, save it under a name ending in _tests.py. Then move to the directory that contains it and run the following command -