My feedback from testing the Ponicode AI-powered test generation tool. Recently, the company Ponicode has attracted the attention of some investors, and consequently the attention of the media and of developers from the tooling / Software Craftsmanship ecosystem. As you may know, I am a modest Software Craftsmanship practitioner, and I have created some test generation and quality tools in my career.
I decided to give it a shot and see whether the Ponicode product deserves the praise.
- 1 About the Ponicode Company
- 2 Installation
- 3 The Admin UI
- 4 The tool in itself: The test for dummies
- 5 Conclusion and personal feeling
About the Ponicode Company
To present the company, I extracted the following quotes from their website:
The creation of Ponicode is the vision of a great company which is loved by its users and its developers.
Our ambition is plain, to unleash the creativity of millions of developers and allow them to imagine the code of tomorrow.
Ponicode raises €3 million in seed funding to continue developing Artificial Intelligence for improved code quality
More empirically, the company has developed a tool that aims to automate the generation of unit tests (more precisely, non-regression and boundary tests), and that is supposed to use artificial intelligence to achieve this goal.
The installation is quite straightforward and I ran into almost no issues. I used my GitHub login to obtain Ponicode access and installed the Visual Studio Code plugin.
I stumbled on the access-code step, since the code did not pop up in my VS Code. For reference, something like this should appear:
The Admin UI
The Admin UI (or "My Account" page, as you may call it) does not bring much value; it displays some documentation links (the Slack channel link was missing on my first test, thanks for having added it 🙂).
One important thing I suggest to the Ponicode team is to become GDPR compliant as soon as possible. For a young company storing data, a data breach could kill the business, and there is currently no way to request data deletion. I should have a form or a button:
- to request my whole personal data that the plugin has collected
- to request its deletion
- ideally, a link to a GDPR notice with a friendly explanation of what is stored.
Like many coders, most of my professional source code is subject to licenses, intellectual property, and trademarks, and cannot be indexed, even by AI cloud tools.
The tool in itself: The test for dummies
I won’t explain how the tool works, since I don’t have much space in this article; I would rather show you my experiment, results, and questions. For more information about Ponicode, please go to their (good) documentation pages.
My first attempt at using the tool was to test the following function:
You may download my code example there.
The method is basic: it performs a simple addition of two values.
How would you test such a method? As a coder, probably the most basic test is to pass numerical values (integer, float) and check the result.
Another, more complicated, test could be to check boundary issues (overflow, negative signs, and so on).
Here are the tests generated by Ponicode:
The initial data prediction for the arguments is pretty bad. The output of the function does not seem to be evaluated based on the code or the name of the function (as we could try to obtain with deep learning networks; see Facebook's recent projects for more information).
I supposed my parameters were poorly named, since the offered values did not match the function's purpose. I tried renaming them to see if the result was different:
Bingo! I received better values, but this context-insensitive approach to guessing the argument values is a bit disappointing.
The red signal on the right of my screen indicates that the unit tests cannot be run. I could not (quickly) figure out why, since I did not have any log.
I would recommend that Ponicode check the package.json of my project and possibly suggest some modifications if some dependencies need to be installed. I didn't want to browse the whole documentation to figure out why the tests were not launched (honestly, I could have tried to figure it out, but I don't think the average customer would).
My first attempt was disappointing, and I could not figure out how to re-launch (retrain) the prediction with some additional scenarios to improve the argument data prediction.
Function fullName()
My second attempt is a little better with the really obvious fullName() method:
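The original snippet appears as a screenshot, so here is a hypothetical reconstruction of the kind of method under test; the signature and the phone-number parameter are my guesses based on the discussion that follows, not the actual code:

```javascript
// Hypothetical reconstruction of the tested method; the real code is
// shown as a screenshot in the original post.
function fullName(firstName, lastName, phoneNumber) {
  return `${firstName} ${lastName} (${phoneNumber})`;
}

console.log(fullName("John", "Doe", "+33 6 12 34 56 78"));
```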
I also fixed the Jest issue with my project by copy/pasting the demo project's configuration code.
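For reference, the relevant part of a minimal Jest setup in `package.json` looks something like this (the version number is illustrative; the demo project may differ):

```json
{
  "scripts": {
    "test": "jest"
  },
  "devDependencies": {
    "jest": "^26.6.0"
  }
}
```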
In this case, I think the method may be sufficiently tested with the offered values. However, the boundaries are not tested.
The first prediction for the arguments is half correct. It provides me with a string containing some digits, but the goal of obtaining a valid phone number is not reached.
I provided a real working value and the offered values got a bit better, but the data generation tool still does not understand the concept of a "phone number".
The generated test is also disappointing, as you can see in the following gist. The only test case is the one I wrote. Nothing more.
The next test uses a bit of logic, to see whether Ponicode will generate several use-cases per branch. A static automatic code generation tool can use control-flow analysis, and potentially data-flow analysis, to compute the number of variations required to completely cover my test (much like the PIT tool). The test cases are quite easy to guess and generate.
However, here the prediction is a complete failure. First, we do not get any numerical value. Second, there is no kind of "cleverness" based on the input/output of the test to compute the number of test cases required to achieve decent code coverage. A simple but dumb algorithm (no AI, ugh) could have been:
Yes, it requires a thin integration between Jest, the code coverage, and the prediction. However, the main advantage is that it would detect and offer interesting test cases to synthetically increase the code coverage (for lazy coders).
The test generator does not work at all if I do not provide any feedback.
Conclusion and personal feeling
I have been toying with Ponicode for some hours. As a disclaimer, before jumping to any conclusion, please understand that there is a key feature I did not use, because I think it is more a hack to improve the low quality of the generated tests than a feature.
This feature is code instrumentation. Ponicode is able to track the inputs of a method at runtime to generate test cases. In my opinion, this should be a way to improve the data predictor, not a direct way to generate the test cases. After all, if I can instrument my whole code and capture the inputs/outputs, I do not need Ponicode to generate my test cases. I have been doing that in C/C++, Java, and so on. You can even do it at the integration level using WireMock and other SOAP/REST proxies. I have also done it with the 3270 protocol on mainframe zSeries, capturing the packets to replay and automate functional use-cases.
I want to avoid a somewhat simplistic and harsh conclusion about the tool, mainly because it is a startup project from a French company, and I am glad we still have companies investing in #devtech. At present, the tool is, in my opinion, unfinished and unsuitable for real use-cases. The installation and the basic functionality are there, but the outcome is not achieved.
To give a better explanation, I will compare the tool to an Eclipse IDE plugin (a paid plugin, later acquired by Google): CodePro AnalytiX.
The CodePro AnalytiX plugin was able to generate non-regression tests for the Java language using solely static analysis. Some good scientific papers compare the different approaches to automated test generation and their challenges: there, there, there, there, and there. The combinatorial explosion between the possible paths of a method and the possible values requires either oracles, heuristics, or shortcuts like code instrumentation. A simple way to reach efficient code coverage with a deterministic code generation approach may be to use genetic algorithms, where the population is the set of arguments and outputs, and mutations are introduced using your data prediction.
The well-known SALT framework proposed the following approach. We can see that they combine oracles, events from tests, traces, and static analyses.
To sum up my tryout: I am a bit disappointed with the results and the outcome. Back in 2008, some scientists were already generating automated tests for JS with the code coverage shown in the figure below. For the moment, the Ponicode tool does not produce viable automated tests; it only captures my inputs and offers to generate test cases based on them, with mutations that are mostly irrelevant. If the Ponicode team is reading my article, I offer them a right of reply on my website if they want to give more insight into the Ponicode tool's mechanism.