J2EE application performance optimization
How to extract maximum performance from your J2EE Web applications
In today's world of larger-than-ever software applications, users still expect real-time data and software that can process data at blazing speeds. With the advent of broadband access, users have grown accustomed to receiving responses in seconds, irrespective of how much data the server handles. It is becoming increasingly important for Web applications to provide the shortest possible response times. The most obvious and simple way to improve a Website's performance is by scaling up hardware. But scaling the hardware does not work in all cases and is definitely not the most cost-effective approach. Other solutions can improve performance without extra hardware costs. This article provides some suggestions that prove helpful when trying to maximize a J2EE Web application's performance.
If you decide to try this article's recommendations, keep in mind this article provides suggestions only. Performance tuning is as much an art as it is a science. Changes that often result in improvement might not make any difference in some cases, or, in rare scenarios, they can result in degradation. For best results with performance tuning, take a holistic approach.
Overview
Figure 1 illustrates at a broad level how a J2EE application appears when deployed in a production environment. To get the best performance from a J2EE application, all the underlying layers must be tuned, including the application itself.
Figure 1. J2EE application architecture.
- Set a goal: Before you begin tuning your J2EE application's performance, set a goal. Often this goal addresses the maximum concurrent users the application will support for a given limit on response times. But the goal can also focus on other variables—for example, the response times should not increase more than 10 percent during the peak hour of user load.
- Identify problem areas: It is important to identify the bottlenecks when you start making changes to improve performance. A little investigation into problems might reveal the specific component that causes poor performance. For example, if the CPU usage on an application server is high, you will want to focus on tuning the application server first.
- Follow a methodical and focused path: Once the goal is set, try to make changes that are expected to have the biggest impact on performance. Your time is better spent tuning a method that takes 10 seconds but gets called 100 times than tuning a method that takes one minute but gets called only once. In an ideal world, you test one change at a time before using it in a production environment. You make one change and stress-test it. If the change results in positive impact, only then will you make it permanent.
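The comparison in the last point comes down to simple arithmetic: a method's total cost is its time per call multiplied by its call count. The sketch below makes that explicit. The class names `MethodCost` and `HotspotRanking` and the method labels are hypothetical, chosen only for illustration, and the sketch assumes a Java 8 or later compiler.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical record of how expensive one method is under load. */
class MethodCost {
    final String name;
    final double secondsPerCall;
    final long calls;

    MethodCost(String name, double secondsPerCall, long calls) {
        this.name = name;
        this.secondsPerCall = secondsPerCall;
        this.calls = calls;
    }

    /** Total time spent in the method: per-call time multiplied by call count. */
    double totalSeconds() {
        return secondsPerCall * calls;
    }
}

/** Hypothetical helper that orders tuning candidates by total time spent. */
class HotspotRanking {
    static List<MethodCost> rankByTotalTime(List<MethodCost> methods) {
        List<MethodCost> sorted = new ArrayList<>(methods);
        // Largest total first, so the best tuning candidate comes out on top.
        sorted.sort((a, b) -> Double.compare(b.totalSeconds(), a.totalSeconds()));
        return sorted;
    }

    public static void main(String[] args) {
        List<MethodCost> methods = new ArrayList<>();
        methods.add(new MethodCost("calledOnce", 60.0, 1));        // one minute, one call
        methods.add(new MethodCost("calledHundredTimes", 10.0, 100)); // 10 seconds, 100 calls
        for (MethodCost m : rankByTotalTime(methods)) {
            System.out.println(m.name + ": " + m.totalSeconds() + " s in total");
        }
    }
}
```

With the figures from the text, the 10-second method accounts for 1,000 seconds in total against 60 seconds for the one-minute method, which is why it is the better place to start.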
Identify bottlenecks
The goal of performance tuning is to identify bottlenecks and remove them. It is an iterative process. Once one area of the application improves, another area will become a bottleneck. You must repeat the cyclic process of first identifying the bottleneck, then resolving the bottleneck, then identifying the next bottleneck until the desired goal has been reached. You will need two kinds of tools that prove helpful in this process. First, you need stress tools that generate load for your application. Second, you need monitoring tools that collect data for various performance indicators.
Stress tools
Many different stress tools are available in the market today. Some of the popular ones are:
- Mercury Interactive's LoadRunner
- Segue's SilkPerformer
- RadView Software's WebLoad
Whichever tool you choose, look for features such as:
- Support for a large number of concurrent Web users, each running through a predefined set of URL requests
- Ability to record test scripts automatically from browser
- Support for cookies, HTTP request parameters, HTTP authentication mechanisms, and HTTP over SSL (Secure Socket Layer)
- Option to simulate requests from multiple client IP addresses
- Flexible reporting module that lets you specify the log level and analyze the results long after the tests were run
- Option to specify the test duration, schedule test for a later time, and specify the total load
Performance monitors
Using a monitoring tool, you collect data for various system performance indicators for all the appropriate nodes in your network topology. Many stress tools also provide monitoring tools. Windows OS also has a built-in performance monitor sufficient for many purposes. This Windows performance monitor can be started from the Administrative Tools menu, accessed from the Control Panel menu, or by typing "perfmon" in the Run window (accessed from the Start menu). You can display the performance counters data in real time, but usually, you'll want to log this data into a file so it can be viewed later. To log the data into a file, go to the Counter Logs selection in the left-hand side of the Performance window, right click with your mouse, and select New Log Settings as shown in Figure 2.
Figure 2. Windows performance monitor.
Many performance counters are available in Windows OS. The following table lists some of the important counters that you should always monitor:
You should add the above counters (and any others as appropriate) in your counter log and collect this data while you stress-test your application using the stress tool. A file generated by the counter log can be opened later by clicking on the View Log File Data button on the right-hand side toolbar. Looking at these counters should give you some hint as to where the problem exists—application server, Web server, or database server.
After identifying the bottleneck this way, you should try to resolve it. You can use two different strategies—either tune the hosting environment in which your application runs or tune the application itself.
Environment tuning
In this section, we look at the possible tuning options in a typical J2EE Web application hosting environment. As already discussed above, a J2EE application environment usually consists of an application server, Web server, and a backend database.
Web server/application server tuning
Most application servers and Web servers provide similar kinds of configuration options though they have different mechanisms to set them. In this article, I cover Tomcat 4.1.x and Apache 1.3.27, which talk to each other using the JK connector. The configuration options presented here exist in most of the other application servers and Web servers, but you will need to locate the correct place to set them.
Apache
Probably the most important setting for your Windows Apache HTTP Server is the option for number of threads. This value should be high enough to handle the maximum number of concurrent users, but not so high that it starts adding its own overhead of too many context switches. The optimum value can be determined by monitoring the number of threads in use during peak hours. To monitor the threads in use, make sure you have the following configuration directives present in the Apache configuration file (httpd.conf):
LoadModule status_module modules/mod_status.so
<Location /server-status>
    SetHandler server-status
    Allow from all
</Location>
Now, from your browser, make an HTTP request to your Apache server with this URL: http://<apache_machine>/server-status. It displays how many requests are being processed and their status (reading request, writing response, etc.). Monitor this page during peak load on the server to ensure the server is not running out of idle threads. After you come up with the optimum number of threads for your application, change the ThreadsPerChild directive in the configuration file to an appropriate value.
A few other items that improve performance in the Apache HTTP Server are:
- DNS reverse lookups are inefficient. Switch off DNS lookups by setting HostnameLookups in the configuration file to off.
- Do not load unnecessary modules. Apache allows dynamic modules that extend basic functionality of the Apache HTTP Server. Comment out all the LoadModule directives you don't need.
- Try to minimize logging as much as possible. Look for directives LogLevel, CustomLog, and LogFormat in the configuration file for changing logging level.
- Minimize the JK connector's logging also by setting the JkLogLevel directive to emerg.
Tomcat
Set the JVM heap size by passing -Xms<size> -Xmx<size> as the JVM parameter in the command line that starts Tomcat. <size> is the JVM heap size, usually specified in megabytes by appending the suffix m, for example, 512m. -Xms sets the initial heap size, and -Xmx sets the maximum heap size. For server applications, both should be set to the same value.
The number of threads in Tomcat can be modified by changing the values of the minProcessors and maxProcessors attributes for the appropriate connector in <Tomcat>/conf/server.xml. If you are using the JK connector, change the values of its attributes. Again, there is no simple way to decide the optimum value for these attributes. The value should be set such that enough threads are available to handle your Web application's peak load. You can monitor a process's current thread count in the Windows Task Manager, which can assist in determining the correct value of these attributes.
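As a complement to Task Manager, the heap limits and thread count can also be read from inside the JVM. The following is a minimal sketch, not part of Tomcat: the class name `JvmSnapshot` is hypothetical, and it assumes a Java 5 or later runtime, because `java.lang.management` does not exist in the JRE 1.4 releases discussed in this article.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical diagnostic helper that reports heap limits and thread count. */
class JvmSnapshot {
    private static final long MB = 1024L * 1024L;

    /** Upper bound the heap may grow to; this reflects the -Xmx setting. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** Heap currently reserved by the JVM; it starts near the -Xms setting. */
    static long committedHeapMb() {
        return Runtime.getRuntime().totalMemory() / MB;
    }

    /** Number of live threads in this JVM, daemon and non-daemon. */
    static int liveThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return threads.getThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB):       " + maxHeapMb());
        System.out.println("Committed heap (MB): " + committedHeapMb());
        System.out.println("Live threads:        " + liveThreads());
    }
}
```

Calling these methods periodically during a stress test (for example from a diagnostic servlet) would show whether the thread count approaches the configured maximum under peak load.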
A few other options you should be aware of:
- If you are not using the latest JRE (Java Runtime Environment), consider upgrading to the latest one. I have seen up to a 30 percent performance improvement after upgrading from JRE 1.3.1 to JRE 1.4.1.
- Add the -server option to the JVM options for Tomcat, which should result in better performance for server applications. Note that this option, in some cases, causes the JVM to crash for no apparent reason. If you face this problem, remove the option.
- Change the default Jasper (JavaServer Pages, or JSP, compiler) settings in <Tomcat>/conf/web.xml by setting development="false", reloading="false", and logVerbosityLevel="FATAL".
- Minimize logging in Tomcat by setting debug="0" everywhere in <Tomcat>/conf/server.xml.
- Remove any unnecessary resources from the Tomcat configuration file. Some examples include the Tomcat examples Web application and extra <Connector> and <Listener> elements.
- Set the autoDeploy attribute of the <Host> tag to false (unless you need any of the default Tomcat applications like Tomcat Manager).
- Make sure you have set reloadable="false" for all your Web application contexts in <Tomcat>/conf/server.xml.
Database tuning
In the case of Microsoft SQL Server, more often than not, you do not need to modify any configuration options, since it automatically tunes your database to a great degree. You should change these settings only if your stress tests identify the database as a bottleneck. Some of the configuration options that you can try are:
- Run your SQL Server on a dedicated server instead of a shared machine.
- Keep your application database and your temporary database on different hard disks.
- Consider taking local backups and moving them to a different machine. The backups should complete much faster.
- Normalize your database to the third normal form. This is usually the best compromise, as the fourth and fifth forms of normalization can result in performance degradation.
- If you have more than 4 GB of physical RAM available, set the awe enabled configuration option to 1, which will allow SQL Server to use more than 4 GB of memory up to a maximum of 64 GB (depending on the SQL Server edition).
- In case you have many concurrent queries executing and enough memory is available, you can increase the value of the min memory per query option (default is 1,024 KB).
- Change the value of the max worker threads option, which indicates the maximum number of user connections allowed. Once this limit is reached, any new user requests will wait until one of the existing worker threads finishes its current task. The default value for this option is 255.
- Set the priority boost option to 1. This will allow SQL Server to run with a higher priority as compared to the other applications running on the same server. If you are running on a dedicated server, it is usually safe to set this option.
Application tuning
After you have tuned your hosting environment, it is time to get down and dirty inside your application source code and database schema. In this section, we look at one of the many possible ways to tune your Java code and your SQL queries.
Java code optimization
The most popular way to optimize Java code is by using a profiler. Sun's JVM has built-in support for profiling (Java Virtual Machine Profiler Interface, or JVMPI) that can be switched on at execution time by passing the right JVM parameters. Many commercial profilers are available; some rely on JVMPI, others provide their own custom hooks into Java applications (using bytecode instrumentation or some other method). But be aware that all these profilers add significant overhead. Thus, your application cannot be profiled at a realistic load level. Use these profilers with a single user or a limited number of users. It is still a good idea to run your application through the profiler and analyze the results for any obvious bottlenecks.
To identify your application's slowest areas in a full-fledged deployed environment, you can add your own timing logs to the application, which can be switched off easily in the production environment. A logging API, such as log4j or J2SE 1.4's Java Logging API, is handy for this purpose. The code below shows a sample utility class that can help you add timing logs to your application:
import java.util.HashMap;
// import org.apache.log4j.Logger;

public class LogTimeStamp {
    private static HashMap ht = new HashMap();
    // Preferably we should use log4j instead of System.out
    // private static Logger logger = Logger.getLogger("LogTimeStamp");

    private static class ThreadExecInfo {
        long timestamp;
        int stepno;
    }

    public static void LogMsg(String Msg) {
        LogMsg(Msg, false);
    }

    /*
     * Passing true in the second parameter of this function resets the counter for
     * the current thread. Otherwise it keeps track of the last invocation and prints
     * the current counter value and the time difference between the two invocations.
     */
    public static void LogMsg(String Msg, boolean flag) {
        LogTimeStamp.ThreadExecInfo thr;
        long timestamp = System.currentTimeMillis();
        synchronized (ht) {
            thr = (LogTimeStamp.ThreadExecInfo) ht.get(Thread.currentThread().getName());
            if (thr == null) {
                thr = new LogTimeStamp.ThreadExecInfo();
                ht.put(Thread.currentThread().getName(), thr);
            }
        }
        if (flag == true) {
            thr.stepno = 0;
        }
        if (thr.stepno != 0) {
            // logger.debug(Thread.currentThread().getName() + ":" + thr.stepno + ":" +
            //     Msg + ":" + (timestamp - thr.timestamp));
            System.out.println(Thread.currentThread().getName() + ":" + thr.stepno + ":" +
                Msg + ":" + (timestamp - thr.timestamp));
        }
        thr.stepno = thr.stepno + 1;
        thr.timestamp = timestamp;
    }
}
After adding the above class in your application, you must invoke method LogTimeStamp.LogMsg() at various checkpoints in your code. This method prints the time (in milliseconds) it took for one thread to get from one checkpoint to the next one. First, call LogTimeStamp.LogMsg("Your Msg", true) at one place in the code that is the start of a user request. Now you can insert the following invocations in your code:
public void startingMethod() {
    ...
    LogTimeStamp.LogMsg("This is a test message", true); // This is starting point
    ...
    LogTimeStamp.LogMsg("One more test message"); // This will become check point 1
    method1();
    ...
}

public void method1() {
    ...
    LogTimeStamp.LogMsg("Yet another test message"); // This will become check point 2
    method2();
    ...
    LogTimeStamp.LogMsg("Oh no another test message"); // This will become check point 4
}

public void method2() {
    ...
    LogTimeStamp.LogMsg("Wow! another test message"); // This will become check point 3
    ...
}
The Perl script analyze.pl, which can be downloaded from Resources, can take the output of the above log messages as input and print the results in the format below. From these results, you now know which part of the code requires the most time and can concentrate on optimizing that part:
Transactions                          Avg. Time   Max Time   Min Time
--------------------------------------------------------------------
[This is a ...] to [One more t...]        14410      20937       7500
[One more t...] to [Yet anothe...]           16         62          0
[Yet anothe...] to [Wow! anoth...]        39860      50844      27703
[Wow! anoth...] to [Oh no anot...]          711       1844         94
[Oh no anot...] to [OK thats e...]        68089     228452      19718
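The analyze.pl script itself is not reproduced in this article. As a rough illustration of the kind of aggregation involved, the Java sketch below groups LogTimeStamp output lines by checkpoint and computes the average, maximum, and minimum elapsed time. It is not the author's script: the class name `TimingSummary` is hypothetical, the sketch assumes lines of the form thread:step:message:elapsedMillis with no colon in the thread name, and it labels each interval by its destination checkpoint because LogTimeStamp does not print the starting call. It also assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not the article's analyze.pl: summarizes LogTimeStamp output. */
class TimingSummary {

    /** Running statistics for one checkpoint. */
    static final class Stats {
        long count;
        long total;
        long max = Long.MIN_VALUE;
        long min = Long.MAX_VALUE;

        void add(long elapsed) {
            count++;
            total += elapsed;
            max = Math.max(max, elapsed);
            min = Math.min(min, elapsed);
        }

        long average() {
            return count == 0 ? 0 : total / count;
        }
    }

    /**
     * Parses lines of the assumed form thread:step:message:elapsedMillis and
     * groups them by "step:message". Lines that do not match are skipped.
     */
    static Map<String, Stats> summarize(List<String> lines) {
        Map<String, Stats> byCheckpoint = new LinkedHashMap<>();
        for (String line : lines) {
            int first = line.indexOf(':');
            int last = line.lastIndexOf(':');
            if (first < 0 || last <= first) {
                continue; // not a timing line
            }
            long elapsed;
            try {
                elapsed = Long.parseLong(line.substring(last + 1).trim());
            } catch (NumberFormatException e) {
                continue; // trailing field is not a number
            }
            String key = line.substring(first + 1, last); // "step:message"
            byCheckpoint.computeIfAbsent(key, k -> new Stats()).add(elapsed);
        }
        return byCheckpoint;
    }

    public static void main(String[] args) {
        List<String> sample = new ArrayList<>();
        sample.add("Thread-1:1:One more test message:120");
        sample.add("Thread-2:1:One more test message:80");
        sample.add("Thread-1:2:Yet another test message:10");
        for (Map.Entry<String, Stats> e : summarize(sample).entrySet()) {
            Stats s = e.getValue();
            System.out.println(e.getKey() + " avg=" + s.average()
                    + " max=" + s.max + " min=" + s.min);
        }
    }
}
```

Grouping by step number and message keeps samples from different threads together, so one slow thread shows up in the maximum without hiding the typical case in the average.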
The above approach represents just one of the ways to tune your Java code. You can use whatever methodology works for you. Some excellent resources are available for Java performance tuning. Check Resources for some links. Some general suggestions you should be aware of while developing a J2EE application are:
- Avoid using synchronized blocks in your code as much as possible. That does not mean you should abdicate handling synchronization for your code's multithreaded parts, but you should try to limit its usage. Synchronized blocks can severely impair your application's scalability.
- Proper logging proves necessary in serious software development. You should try to use a logging mechanism (like log4j) that lets you switch off logging in the production environment to reduce logging overhead.
- Instead of creating and destroying resources every time you need them, use a resource pool for every resource that is costly to create. One obvious choice for this is your JDBC (Java Database Connectivity) Connection objects. Threads are also usually good candidates for pooling. Many free APIs are available for pooling various resources.
- Try to minimize the objects you store in HttpSession. Extra objects in HttpSession not only lead to more memory usage, they also add additional overhead for serialization/deserialization in case of persistent sessions.
- Wherever possible, use RequestDispatcher.forward() instead of HttpServletResponse.sendRedirect(), as the latter involves a trip to the browser.
- Minimize the use of SingleThreadModel in servlets so that the servlet container does not have to create many instances of your servlet.
- Java stream objects perform better than reader/writer objects because they do not have to deal with string conversion to bytes. Use OutputStream in place of PrintWriter.
- Reduce the default session timeout either by changing your servlet container configuration or by calling HttpSession.setMaxInactiveInterval() in your code.
- Just as we switched off the DNS lookup in the Web server configuration, try not to use ServletRequest.getRemoteHost(), which involves a reverse DNS lookup.
- Always add directive <%@ page session="false"%> to JSP pages where you do not need a session.
- Excessive use of custom tags also may result in poor performance. Keep performance in mind while designing your custom tag library.
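To illustrate the resource pooling suggestion, here is a minimal sketch of a fixed-size pool. The class `SimplePool` and its method names are hypothetical and do not come from any particular pooling library. A `StringBuilder` stands in for a resource that is costly to create, such as a JDBC Connection. The sketch assumes Java 8 or later, since `java.util.concurrent` and lambdas postdate the JRE versions discussed in this article.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool; all names are illustrative only. */
class SimplePool<T> {
    private final BlockingQueue<T> idle;

    /** Creates every pooled object up front so requests never pay the creation cost. */
    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Blocks until a pooled object is free, then hands it to the caller. */
    T borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("Interrupted while waiting for a pooled resource", e);
        }
    }

    /** Returns a borrowed object so that other threads can reuse it. */
    void giveBack(T resource) {
        idle.offer(resource);
    }

    /** Number of objects currently available for borrowing. */
    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(2, StringBuilder::new);
        StringBuilder resource = pool.borrow();
        try {
            resource.append("work done with a pooled object");
            System.out.println(resource);
        } finally {
            resource.setLength(0);   // reset state before reuse
            pool.giveBack(resource); // always return the resource, even on failure
        }
    }
}
```

Returning the resource in a finally block matters as much as the pool itself: a resource that is never given back shrinks the pool until every caller blocks in borrow().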
SQL query optimization
The optimization of SQL queries is a vast subject in itself, and many books cover only this topic. SQL query running times can vary by many orders of magnitude even if they return the same results in all cases. Here I just show how to identify the slow queries and offer a few suggestions as to how to fix some of the most common mistakes.
First of all, to identify slow queries, you can use SQL Profiler, a tool from Microsoft that comes standard with SQL Server 2000. This tool should be run on a machine other than your SQL Server database server and the results should be stored in a different database as well. Storing results in a database allows all kinds of reports to be generated using standard SQL queries. Profiling any application inherently adds a lot of overhead, so try to use appropriate filters that can reduce the total amount of data collected.
To start the profiling, from SQL Profiler's File menu, select New, then Trace, and give the connection information and the appropriate credentials to connect to the database you want to profile. A Trace Properties window will open, where you should enter a meaningful name you will recognize later. Select the Save To Table option and give the connection information and credentials for the database server (this should differ from the server you are profiling) where you want to store the data collected by the profiler. Next, you should be asked to provide the database and the table name where the results will be stored. Usually, you would also want to add a filter, so go to the Filters tab and add the appropriate filters (for example "duration greater than or equal to 500 milliseconds" or "CPU greater than or equal to 20" as shown in Figure 3). Now click on the Run button and the profiling will start.
Figure 3. SQL Profiler.
SELECT [TABLE1].[T1COL1], [TABLE1].[T1COL2], [TABLE1].[T1COL3], [TABLE1].[T1COL4]
FROM ((([TABLE1]
    LEFT JOIN [TABLE4] ON [TABLE1].[T1COL4] = [TABLE4].[T4COL4])
    LEFT JOIN [TABLE3] ON [TABLE1].[T1COL3] = [TABLE3].[T3COL3])
    LEFT JOIN [TABLE2] ON [TABLE1].[T1COL2] = [TABLE2].[T2COL2])
WHERE [TABLE1].[T1COL5] = 'VALUE1'
Now your task is to optimize this query to improve performance. You look at your database and find that TABLE1 has 700,000 records, TABLE2 has 16, TABLE3 has 100, and TABLE4 happens to have more than 4 million records. You should first understand this query's cost, and Query Analyzer comes in handy for this task. Select Show Execution Plan in the Query menu and execute this query in Query Analyzer. Figure 4 shows the resulting execution plan.
Figure 4. Execution plan before indexes.
Figure 5. Index Tuning Wizard.
Figure 6. Execution plan after indexes.
- Keep the transactions as short as possible. The longer a transaction is open, the longer it holds the locks on the resources acquired, and every other transaction must wait.
- Do not use DISTINCT clauses unnecessarily. If you know the rows will be unique, do not add DISTINCT in the SELECT clause.
- When possible, avoid using SELECT *. Select only the columns from which you need the data.
- Consider adding indexes to those columns causing full-table scans for your queries. Indexes can result in a big performance gain, as shown above, even though they consume extra disk space.
- Avoid using too many string functions or operators in your queries. Functions and operators such as SUBSTRING, LOWER, UPPER, and LIKE can result in poor performance.