3 Java performance killers and how to detect them
Whether you are a professional programmer or a beginner, identifying and fixing slow code is highly satisfying. Today, microservices, the Cloud and Kubernetes have popularized a somewhat distorted vision of IT, where performance is a secondary concern. Is your service too slow? Create a Kubernetes cluster, add an Amazon instance, or increase your server capacity. At first glance, the solution is quick to implement: no analysis needed, and it seems to work every time.
Why code performance is ignored and why that is a mistake
On closer inspection, increasing server resources provides a quick, temporary solution that is costly in the long term.
At first, treating a performance problem as a matter of hardware or virtual resource availability seems more suitable and more efficient.
Fixing a performance issue in the code can cost several hours or days of a developer’s time, whereas a server configuration change may take a few minutes in the Cloud.
The first issue with that solution is that the increase in resources becomes permanent or is forgotten, and its yearly cost ends up exceeding the cost of the fix. I once worked in a company where servers were rebooted on the 20th of every month because three nodes of a cluster failed due to a performance problem. The application had had this defect for several years. The fix took 10 lines of code.
The second issue arises when the cost of the problematic code no longer grows linearly with the load, but polynomially or exponentially. The quick fix no longer applies: under certain conditions, the gluttonous code devours all available resources and blocks the system.
Our recommendation is to learn the common performance issues you may find in Java code and how to detect them. With some practice, you will be an even better developer.
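To make the difference between linear and polynomial growth concrete, here is a minimal sketch. The class and method names are hypothetical and chosen for illustration only. Both methods answer the same question (does the list contain a duplicate?), but the first one rescans a list on every iteration, so doubling the input roughly quadruples the work, which no reasonable amount of extra hardware can absorb for long.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical example class, not taken from a real code base.
class DuplicateCheck {

    // Quadratic: List.contains scans the list, so for n distinct
    // values the loop performs about n * n / 2 comparisons.
    static boolean hasDuplicateQuadratic(List<Integer> values) {
        List<Integer> seen = new ArrayList<>();
        for (Integer value : values) {
            if (seen.contains(value)) {
                return true;
            }
            seen.add(value);
        }
        return false;
    }

    // Linear on average: HashSet.add returns false when the
    // value is already present, without scanning the other values.
    static boolean hasDuplicateLinear(List<Integer> values) {
        Set<Integer> seen = new HashSet<>();
        for (Integer value : values) {
            if (!seen.add(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < 20_000; i++) {
            values.add(i);
        }
        // Naive timing, good enough to see the order of magnitude;
        // a real measurement would use a benchmark harness.
        long start = System.nanoTime();
        boolean slow = hasDuplicateQuadratic(values);
        long middle = System.nanoTime();
        boolean fast = hasDuplicateLinear(values);
        long end = System.nanoTime();
        System.out.printf("quadratic: %b in %d ms%n", slow, (middle - start) / 1_000_000);
        System.out.printf("linear:    %b in %d ms%n", fast, (end - middle) / 1_000_000);
    }
}
```

The timings depend on the machine and the JVM, so treat them as an indication only. The point is how the gap widens when the input grows.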
Common performance issues in Java
Three types of resources need to be monitored because they are limited: CPU, memory and network.
CPU resources can be exhausted by code stuck at 100% CPU or by the use of too many threads or processes. Conversely, code whose execution is frequently interrupted also performs poorly.
Memory is a critical aspect of Java applications because it is expensive to use due to virtualization and complex environment settings. The size of objects, the frequency of object creation and the failure to release objects are the main causes of memory exhaustion.
The network is the most dangerous and underestimated resource because it cannot easily be increased (most links are limited to some megabits or gigabits per second). The network can be saturated by exchanges that are too large or too chatty.
Usual CPU performance problems in Java
In my experience, here are the most common problems found in CPU-intensive Java applications:
- loops that iterate over too many objects or whose iterations are individually expensive
- recursive code
- calls to library methods or functions with a high cost or complexity (for example System.arraycopy on large arrays)
- since Java 8: streams (parallel or not)
- code that calls blocking resources (as opposed to asynchronous functions) and penalizes execution: file manipulation, SQL queries, REST or SOAP calls, IO in general
- concurrency bugs (wait/notify, Lock, synchronized)
Embold: Efficiency issues
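As an illustration of an innocent-looking library call inside a loop, here is a minimal sketch with hypothetical class and method names. System.arraycopy is fast for a single call, but calling it on an ever-growing array at each iteration copies about n * n / 2 elements in total. The second method grows the array by doubling, which keeps the total number of copied elements proportional to n.

```java
import java.util.Arrays;

// Hypothetical example class, for illustration only.
class ArrayGrowth {

    // Grows the array by one slot per element: every append copies
    // the whole array, so the total copy work is quadratic in n.
    static int[] appendOneByOne(int n) {
        int[] data = new int[0];
        for (int i = 0; i < n; i++) {
            int[] bigger = new int[data.length + 1];
            System.arraycopy(data, 0, bigger, 0, data.length);
            bigger[data.length] = i;
            data = bigger;
        }
        return data;
    }

    // Doubles the capacity when the array is full: copies happen
    // rarely and the total copy work stays proportional to n.
    static int[] appendWithDoubling(int n) {
        int[] data = new int[16];
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size == data.length) {
                data = Arrays.copyOf(data, data.length * 2);
            }
            data[size++] = i;
        }
        // Trim the unused capacity before returning.
        return Arrays.copyOf(data, size);
    }

    public static void main(String[] args) {
        int n = 30_000;
        long start = System.nanoTime();
        int[] slow = appendOneByOne(n);
        long middle = System.nanoTime();
        int[] fast = appendWithDoubling(n);
        long end = System.nanoTime();
        System.out.println("same content: " + Arrays.equals(slow, fast));
        System.out.printf("one by one: %d ms%n", (middle - start) / 1_000_000);
        System.out.printf("doubling:   %d ms%n", (end - middle) / 1_000_000);
    }
}
```

In practice, java.util.ArrayList already applies a similar growth strategy, so the sketch mainly shows what to look for when a copy call sits inside a loop.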
Usual memory performance problems in Java
As we know, it is necessary to track how many objects are created and how large they are, more specifically:
- array manipulations
- string manipulations (strings are backed by arrays): concatenation, instantiation and the size of the manipulated strings
- loops that instantiate objects (and recursive methods as well)
- the use of buffers (to read files, XML, JSON, database query results)
- the presence of stateful beans
- the presence of static object containers as class attributes
- calls to library methods with poor performance
- complex streams with uncertain performance.
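The case of static object containers deserves a short sketch. A static map that only ever receives entries is never released and grows until the heap is full. One possible mitigation, shown below with hypothetical names, is to bound the container. The sketch relies on the standard LinkedHashMap constructor with access ordering and on its removeEldestEntry hook to evict the least recently used entry. It is not thread-safe and is only one option among others.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example class, for illustration only.
class BoundedRegistry {

    // Returns a map that never holds more than maxEntries entries.
    // The third constructor argument (true) asks for access order,
    // so the "eldest" entry is the least recently used one.
    static <K, V> Map<K, V> newBoundedMap(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called by LinkedHashMap after each insertion.
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        // Assumed limit, to be tuned to the real memory budget.
        Map<Integer, byte[]> registry = newBoundedMap(1_000);
        for (int i = 0; i < 10_000; i++) {
            registry.put(i, new byte[1024]);
        }
        // The size stays capped at the limit instead of growing with i.
        System.out.println("entries kept: " + registry.size());
    }
}
```

With an unbounded HashMap in place of the bounded map, the same loop would keep all 10,000 buffers reachable for as long as the container itself is reachable.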
Usual network performance problems in Java
Problems related to the use of the network are difficult to detect at the code level and require reasoning above the code, at the architecture level.
However, there are a number of possible issues to be checked at the code level:
Example of network method call
- look for network calls inside loops, for example DNS resolutions
- check for missing caches in front of REST calls (@Cacheable for Spring)
- identify the use of cursors in the database
- search for uses of Clob, Blob, etc.
- check stream openings (HTTP streams, URL.openConnection())
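The first two points of this list can be sketched as follows. The example assumes a hypothetical remote call, represented here by a plain Function that only counts its invocations, so that the code runs without any network access. The cache is a ConcurrentHashMap filled through computeIfAbsent. It has neither expiry nor size limit, which a real cache (for instance the one behind @Cacheable) would normally provide.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical example class, for illustration only.
class CachedLookup<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> remoteCall;

    CachedLookup(Function<K, V> remoteCall) {
        this.remoteCall = remoteCall;
    }

    // Performs the remote call only the first time a key is requested.
    V get(K key) {
        return cache.computeIfAbsent(key, remoteCall);
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Stand-in for a DNS resolution or a REST call (assumption:
        // the real call would go over the network and be much slower).
        Function<String, String> fakeResolver = host -> {
            calls.incrementAndGet();
            return "resolved:" + host;
        };

        CachedLookup<String, String> lookup = new CachedLookup<>(fakeResolver);
        List<String> hosts = Arrays.asList("a.example", "b.example", "a.example", "a.example");
        for (String host : hosts) {
            lookup.get(host);
        }
        // Four lookups in the loop, but one remote call per distinct host.
        System.out.println("lookups: " + hosts.size() + ", remote calls: " + calls.get());
    }
}
```

Without the cache, the loop would trigger one network round trip per iteration, which is exactly the chatty pattern to look for during a review.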
Which code scanner can find these issues?
This checklist would be of little use if we had to check everything by hand.
That’s why we need a tool that checks these points for us and can be included in our pull request reviews.
Documentation in Embold
Code analysis tools contain hundreds of bug detection rules and metrics. To perform an effective code analysis and find genuine performance problems, the tool must include the following features:
- classification of detection by quality impact: you need to be able to quickly identify rules that improve quality.
- rules that support an audit (they point out places to be checked in the code)
- simple, relevant metrics (loop nesting depth in particular)
- complete documentation of each rule, to understand the possibility of false positives.
I selected two tools for that purpose:
- SonarQube (sonarqube.org):
  - plus: the tags may help you filter the rules
  - plus: it contains a large range of bug detection rules, including some really useful performance rules
  - plus: the documentation is a strong point
  - minus: the rules are not classified by quality impact
  - minus: the specific metrics are not provided
  - minus: no hint rules are available for the performance audit
- Embold (https://www.embold.io/):
  - plus: the rules are sorted by quality criteria (Efficiency for the performance issues and Resource Utilization)
  - plus: Embold contains a full list of code metrics
  - plus: Embold contains both its own documentation and links to the underlying linter documentation
  - minus: the Java code analyzer has fewer features than SonarQube’s
  - minus: no hint rules are available either
I recommend that you give Embold a try, if you have not done so yet. It is a recent and powerful code analysis solution that is evolving in the right direction. Another big plus: it is much easier to use than SonarQube.
In this article, we have discussed the most common types of performance errors and how to detect them. The correction of code errors detected by these tools will be discussed later. So don’t panic, subscribe to our newsletter so you don’t miss future articles on this topic.