May 20, 2009

"Mysterious" core dumps of Apache Tomcat in a Solaris 10 zone

We have a Sun Fire V240 with 8 zones. Each zone runs several applications
on Apache Tomcat or JBoss, plus a few other Java applications.

Today a colleague experienced a "mysterious" core dump of Tomcat.

The hs_err* log written by Java showed that the JVM had crashed with signal 10 (SIGBUS). Looking at the core file with "mdb core" showed the same thing.

A look at "swap -s" showed that swap was nearly exhausted: only 180 MB were
available to the whole system.

The first step was to add more swap space from the second hard drive,
and after that everything worked fine again.

The other issue was the configuration of the applications running in Tomcat.
The application can use a thread pool, and the initial size of the pool was
set to 1000 threads.
This caused the problem, because every thread uses memory (for its stack,
among other things), even if the thread is just sitting idle in the pool.
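
To illustrate the point about idle threads, here is a minimal sketch of a pool that starts empty and only grows under load, plus a rough estimate of what 1000 pre-started threads might reserve. This is not the configuration of our application: the class name, the pool sizes, and the 512 KB stack size are assumptions for illustration only. The real per-thread stack size depends on the platform and on the -Xss setting.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class LazyPoolSketch {
    // Hypothetical values, chosen only for this example.
    static final int CORE_THREADS = 10;    // threads kept alive once created
    static final int MAX_THREADS = 1000;   // upper bound under load
    static final long IDLE_SECONDS = 60;   // extra threads exit after this idle time

    // ThreadPoolExecutor creates threads on demand, so the pool starts with
    // zero threads instead of allocating all of them up front.
    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(
                CORE_THREADS, MAX_THREADS,
                IDLE_SECONDS, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    // Rough estimate of the stack memory reserved by a number of threads.
    // stackBytesPerThread is an assumption; check -Xss on the real system.
    static long reservedStackBytes(int threads, long stackBytesPerThread) {
        return threads * stackBytesPerThread;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool();
        System.out.println("threads at startup: " + pool.getPoolSize());

        long assumedStack = 512L * 1024; // assumed 512 KB per thread
        long bytes = reservedStackBytes(1000, assumedStack);
        System.out.println("1000 idle threads reserve roughly "
                + (bytes / (1024 * 1024)) + " MB of stack");

        pool.shutdown();
    }
}
```

With several zones each running a few JVMs configured like this, idle pools alone can eat a large share of swap, so a small initial size with a sensible maximum is the safer default.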