AI In Testing – The art of balancing logic and data

“It is a capital mistake to theorize before one has data. Insensibly, one begins to twist facts to suit theories, instead of theories to suit facts.”
― Sir Arthur Conan Doyle, Sherlock Holmes

In the pre-industrial world, the foundation of work was the product itself and its validation through test cases. In the industrial world, it was automation: the specialized tools and skills required to execute test cases in a production process. In the quality engineering world, it is automation of the whole process, from requirements through design to testing.

Logically, the foundation in the next phase will go beyond engineering to the intelligent use of data to meet business outcomes.

Let us consider a business problem with which everyone in the testing community is familiar: “How can I go to market with my product quickly and as inexpensively as possible, and how can testing help?” The first answer is: “I should automate all my test cases.” The next thought is: “But automating all my test cases is expensive.” The next question is: “What should I test or automate?”

To answer that, I need to make decisions based on data. Artificial intelligence can potentially help in analyzing this data and filtering it with appropriate techniques, thus enabling decision making.

The result, though, will depend on how reliable this data is. My data in this case comprises requirements, design, code, historic test cases, defect data, and data from operations. Depending on which of this data is available, one can use AI techniques to decide what to test or what to automate, achieving speed without impacting quality. Below are eight use cases to consider in this scenario:

Automated unit testing – If you have good development data, meaning code that is managed with appropriate software configuration management practices, then code analytics algorithms can be leveraged to automate unit tests.
Automated API tests – If you have robust design data, for example a microservices architecture, then algorithms can be written to automatically generate API tests.
Automated behavior-driven development scripts – If you have user stories written in a simple English format, then natural language processing parsers can convert them into Gherkin format and create automated tests.
Automated test data generation – If you can monitor data from production, then synthetic test data can be automatically generated by applying regression algorithms to historic test data.
Defect analytics-based optimization – If you have clean historic defect data, then correlation algorithms can be used to determine which features in the application are most defect-prone, so that execution can focus on the test cases covering those features.
Regression optimization and automation – If you have well-defined test cases that are mapped to defects and code, then natural language processing algorithms in conjunction with random forest algorithms can be used to determine which tests are most important to execute and automate.
Test case optimization based on production analytics – If you can get data in the form of user behaviors from production, then unsupervised learning algorithms can be used to determine which scenarios are most important to test.
Performance testing – If you can access performance data through operational logs, then regression algorithms can be leveraged to predict performance bottlenecks and benchmarks.
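
To make one of these use cases concrete, here is a minimal Python sketch of defect analytics-based optimization. It is an illustration under stated assumptions, not a reference implementation: the defect log, the test-case-to-feature mapping, and the function names rank_features and select_tests are all hypothetical, and a simple defect count per feature stands in for the correlation algorithms mentioned above.

```python
from collections import Counter

# Hypothetical historic defect log: one entry per defect,
# tagged with the feature in which the defect was found.
defects = ["checkout", "checkout", "login", "checkout", "search", "login"]

# Hypothetical mapping of test cases to the feature each one exercises.
test_cases = {
    "TC-01": "login",
    "TC-02": "checkout",
    "TC-03": "search",
    "TC-04": "checkout",
    "TC-05": "profile",
}


def rank_features(defect_log):
    """Return features ordered by defect count, most defect-prone first."""
    return [feature for feature, _ in Counter(defect_log).most_common()]


def select_tests(defect_log, tests, top_n=2):
    """Keep only the test cases that exercise the top_n most defect-prone features."""
    focus = set(rank_features(defect_log)[:top_n])
    return sorted(tc for tc, feature in tests.items() if feature in focus)


if __name__ == "__main__":
    print(rank_features(defects))
    print(select_tests(defects, test_cases))
```

With this sample data, the two features with the most recorded defects are checkout and login, so only the test cases mapped to them are selected. In practice the ranking would likely need to account for feature size, defect severity, and how recent the defects are, which a raw count ignores.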

In conclusion, intelligence (natural or artificial) needs good data. Artificial intelligence will support the testing community by augmenting data with theories. The ability to leverage these theories and make decisions with judgement and reasoning will define success.

Capgemini

May 4, 2020