Fri, 04/22/2011 - 10:19
An article on DZone's JavaLobby, "Three Common Application Performance Challenges for Developers" by Bhaskar Sunkara, presents a high-level overview of some considerations for Java application engineers. The following is a concise summary.
The 3 main points are:
- Memory Leaks
- Slow SQL
- Threading and Synchronization
Let's look at these in reverse order. Why? Because threading and synchronization are much more interesting as they relate to performance testing and tuning. Also, the article doesn't really go into much detail about memory leaks.
Threading and Synchronization
Java allows many threads to run concurrently in order to increase the efficiency of processing. That is a standard concept in computing that began in operating systems as "parallelism". Properly distributing a program's work across processors and data resources has been performance engineering 101 for 50 years.
Java's ability to take this concept out of the operating system and down to the level of threads in an application is intriguing, but it is also where performance challenges arise. There is a keyword in Java that programmers can employ - "synchronized". It forces concurrent threads to take turns on a shared resource in memory (e.g. an object), so that they cannot make conflicting changes to it at the same time.
The performance problem comes from contention over those shared resources. Regardless of how "concurrent" you think your code may be, there will always be a bottleneck where processing reduces to sequential execution of requests for those resources in memory. Again, it is the same contention we have been fighting since mainframes started using complex algorithms to squeeze more processing out of CPUs while a program was waiting for slow disk reads to return. The bottleneck has moved to a new place, and the complexity is an order of magnitude worse.
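To make the keyword concrete, here is a minimal sketch of a counter shared by several threads. The class name SynchronizedCounterDemo and its run() helper are made up for illustration and do not come from Bhaskar's article. Without "synchronized" on increment(), two threads could read the same value and overwrite each other's update.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the name and methods are illustrative only.
final class SynchronizedCounterDemo {
    private int count = 0;

    // Only one thread at a time may hold this object's monitor, so the
    // read-modify-write of count cannot interleave with another thread's.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Starts `threads` workers that each increment `perThread` times and
    // returns the final count once every worker has finished.
    static int run(int threads, int perThread) {
        SynchronizedCounterDemo counter = new SynchronizedCounterDemo();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // prints 40000
    }
}
```

The result is correct, but notice the cost: all four threads queue up for the same monitor, so the increments effectively run one at a time.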
The website Java Performance Tuning has several tips related to synchronization. There is a bullet point attributed to Jack Shirazi, the author of the book and presumably the website Java Performance Tuning, stating, "Avoid synchronization where possible."
On the same website, Karthik Rangaraju is credited with these tips:
- Use Dijkstra semaphores (synchronized acquire()/release()) to control access to a finite pool of resources.
- Conditional events provide a more sophisticated version of the wait()/notify() mechanism which avoids some potential problems of that mechanism.
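The first tip can be sketched as a small counting semaphore built from synchronized methods plus wait()/notify(). This is only an illustration under my own assumptions: the class CountingSemaphore and the peakHolders() helper are hypothetical names, not code from the Java Performance Tuning site. For real projects the standard library already ships java.util.concurrent.Semaphore.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical hand-rolled semaphore, for illustration only.
final class CountingSemaphore {
    private int permits;

    CountingSemaphore(int permits) {
        if (permits < 0) {
            throw new IllegalArgumentException("permits must be >= 0");
        }
        this.permits = permits;
    }

    // Blocks until a permit is free, then takes it.
    synchronized void acquire() throws InterruptedException {
        while (permits == 0) { // the loop guards against spurious wakeups
            wait();
        }
        permits--;
    }

    // Returns a permit and wakes one waiting thread, if there is one.
    synchronized void release() {
        permits++;
        notify();
    }

    // Runs `threads` workers against a pool of `poolSize` permits and
    // returns the largest number of workers that held a permit at once.
    static int peakHolders(int poolSize, int threads) {
        CountingSemaphore pool = new CountingSemaphore(poolSize);
        AtomicInteger holders = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    pool.acquire();
                    try {
                        int now = holders.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(5); // simulate using the pooled resource
                    } finally {
                        holders.decrementAndGet();
                        pool.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return peak.get();
    }
}
```

Calling CountingSemaphore.peakHolders(2, 6) should never report more than 2 concurrent holders, no matter how the six threads are scheduled.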
Brian Goetz gives us 4 tips about avoiding synchronization deadlocks:
- Deadlocks are difficult to identify from code analysis, and can occur unexpectedly.
- Always acquire locks in the same order to avoid one common cause of deadlocking. If you can guarantee that all locks will always be acquired in a consistent order, then your program will not deadlock.
- Try to avoid acquiring more than one lock at a time (though this is usually impractical).
- Keep synchronized blocks of code as short as possible.
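The lock-ordering tip is easier to see in code. Below is a rough sketch, assuming each account has a unique numeric id that we can use to decide which lock to take first. The names OrderedLockTransfer, Account and transfer() are mine and purely illustrative. Two threads transfer money in opposite directions, which is the classic setup for a deadlock if each thread simply locks "from" and then "to".

```java
// Hypothetical example; class and method names are illustrative only.
final class OrderedLockTransfer {

    static final class Account {
        final int id;          // assumed unique; used only to order the locks
        private long balance;

        Account(int id, long balance) {
            this.id = id;
            this.balance = balance;
        }

        synchronized long balance() {
            return balance;
        }
    }

    // Always locks the account with the smaller id first, whichever way the
    // money moves, so two opposite transfers cannot wait on each other.
    static void transfer(Account from, Account to, long amount) {
        if (from.id == to.id) {
            return; // same account, nothing to do
        }
        Account first = from.id < to.id ? from : to;
        Account second = first == from ? to : from;
        synchronized (first) {
            synchronized (second) {
                // Keep the critical section short: just the two updates.
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Two threads move 1 unit back and forth `rounds` times each and the
    // final balances are returned as {a, b}.
    static long[] run(int rounds) {
        Account a = new Account(1, 1_000);
        Account b = new Account(2, 1_000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new long[] { a.balance(), b.balance() };
    }
}
```

Because both threads perform the same number of opposite transfers, run() finishes with both balances back at 1000. If transfer() locked "from" first and "to" second instead, the same test could hang forever.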
Slow SQL
Anyone who has worked with web applications can attest to the axiom that most performance issues center on the database. Bhaskar wisely jumps on that bandwagon too.
He focuses on the Object Relational Mapper (ORM). The premise is a common trade-off between ease of use and performance. Most developers who are concerned with getting more speed will avoid added layers of architecture and extra tools that do some of the work for you. Arguably, Java itself is designed to make certain parts of the development cycle easier for coders; however, the ORM can be particularly expensive.
The idea is to allow the ORM to handle the inherent mismatch between objects and a relational database table structure. The "ObjectRelationalImpedanceMismatch" is a label for a set of problems encountered when using a relational database to store (the state of) objects from software written in an object-oriented programming language.
While abstracting the translation is a good thing for programming productivity, Bhaskar tells us that it puts "...a significant weight on an application's performance". Makes sense to me. There is no further explanation or quantification of the ORM impact.
C2.com has a page in their wiki called Object Relational Mapping Costs Time And Money that states:
"This runtime response time cost has been anecdotally observed by some, with caveats, to be anywhere from twice to an order of magnitude more..."
Josh Marotti commented on Bhaskar's post by saying: "I find that unless you have everything setup perfectly, it is more work to get it working the way you want than just writing the damn queries/rowmappers by hand".
Memory Leaks
Bhaskar talks about the advantages of Java handling the memory model for you, but notes that it isn't perfect. His main point is that heap management becomes problematic when a developer does not "relieve all references to a object". The garbage collector can only reclaim objects that nothing references any more, so objects you have finished with but still reference stay on the heap, and the heap keeps growing as your application runs.
He states, "...your heap builds up and your app comes to a grinding halt". Intuitively, it is understandable that the available memory is consumed by these lingering objects; thus, the leakage starves the rest of the application of memory. The performance challenge sounds rather easy to address - RELEASE EVERY REFERENCE TO OBJECTS YOU NO LONGER NEED.
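Here is a small sketch of what such a leak can look like. The scenario is my own assumption for illustration and is not taken from the article: a long-lived registry (the hypothetical SessionRegistry class) stores per-session buffers in a map and never removes them unless close() is called. Everything still in the map is reachable, so the garbage collector has to keep it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example; class and method names are illustrative only.
final class SessionRegistry {
    // A long-lived map: every value stored here stays reachable, so the
    // garbage collector cannot reclaim it until the entry is removed.
    private final Map<String, byte[]> sessions = new HashMap<>();

    void open(String id) {
        sessions.put(id, new byte[1024]); // per-session buffer
    }

    // The fix: drop the reference once the session is finished.
    void close(String id) {
        sessions.remove(id);
    }

    int retained() {
        return sessions.size();
    }

    // Opens `count` sessions and optionally closes each one when done.
    // Returns how many buffers the registry is still holding afterwards.
    static int retainedAfter(int count, boolean closeWhenDone) {
        SessionRegistry registry = new SessionRegistry();
        for (int i = 0; i < count; i++) {
            String id = "session-" + i;
            registry.open(id);
            // ... work with the session would happen here ...
            if (closeWhenDone) {
                registry.close(id);
            }
        }
        return registry.retained();
    }

    public static void main(String[] args) {
        System.out.println(retainedAfter(1000, false)); // prints 1000
        System.out.println(retainedAfter(1000, true));  // prints 0
    }
}
```

In the first call nothing is ever released, so all 1000 buffers stay on the heap for as long as the registry lives. In the second call each reference is dropped as soon as the session ends, and the buffers become eligible for collection.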
Bhaskar touches on the use of heap dumps and profilers to find memory leaks. He does not explain how to use these tools, and he says both have drawbacks. He says the heap dump helps you identify objects, but it does not allow finding the root cause.
Using a profiler in conjunction with a heap dump doesn't improve the diagnosis process significantly. Neither provides the functionality to determine what is causing the leak. This article leaves us wanting more information, as exhibited by a few comments left by other developers saying things like, "Was hoping for a bit more insight into tackling performance issues and/or designing to prevent them, gleaned from and explained through real-life experiences. This article is very superficial."
Web performance challenges exist no matter what language you code in - Java, PHP, .NET, Python, etc. There are simply some problems that will always exist, especially around the database.
This article shares 3 of the most common issues for Java developers. If you have comments or suggestions, this topic can easily grow into a whole series of posts that we can put together to help web coders. Thoughts?