|
|
|
Toni, thank you for this report. Could you also provide a a thread dump showing the deadlock? (In reply to comment #2)
> Toni, thank you for this report. > > Could you also provide a a thread dump showing the deadlock? > 2007-12-12 12:41:51 Full thread dump Java HotSpot(TM) Client VM (1.6.0_02-b06 mixed mode, sharing): "DestroyJavaVM" prio=6 tid=0x002e6000 nid=0x1cc waiting on condition [0x00000000..0x0091fd4c] java.lang.Thread.State: RUNNABLE "Thread-0" prio=6 tid=0x02fad400 nid=0x5b0 waiting for monitor entry [0x032cf000..0x032cfa94] java.lang.Thread.State: BLOCKED (on object monitor) at ch.qos.logback.classic.Logger.callAppenders(Logger.java:265) - waiting to lock <0x22eade50> (a ch.qos.logback.classic.Logger) at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:491) at ch.qos.logback.classic.Logger.filterAndLog(Logger.java:445) at ch.qos.logback.classic.Logger.debug(Logger.java:404) at fi.imagesoft.test.Main.run(Main.java:67) "pool-5-thread-1" prio=6 tid=0x02fac400 nid=0x530 waiting for monitor entry [0x0327f000..0x0327fc14] java.lang.Thread.State: BLOCKED (on object monitor) at ch.qos.logback.classic.Logger.callAppenders(Logger.java:265) - waiting to lock <0x22eade50> (a ch.qos.logback.classic.Logger) at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:491) at ch.qos.logback.classic.Logger.filterAndLog(Logger.java:464) at ch.qos.logback.classic.Logger.debug(Logger.java:408) at fi.imagesoft.test.Task.run(Task.java:39) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(Unknown Source) at java.util.concurrent.FutureTask.runAndReset(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) "pool-4-thread-1" prio=6 tid=0x02fab800 nid=0x660 waiting for monitor entry [0x0322f000..0x0322fb14] java.lang.Thread.State: BLOCKED (on object monitor) at ch.qos.logback.classic.Logger.callAppenders(Logger.java:265) - waiting to lock <0x22eade50> (a ch.qos.logback.classic.Logger) at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:491) at ch.qos.logback.classic.Logger.filterAndLog(Logger.java:464) at ch.qos.logback.classic.Logger.debug(Logger.java:408) at fi.imagesoft.test.Task.run(Task.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(Unknown Source) at java.util.concurrent.FutureTask.runAndReset(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) "pool-3-thread-1" prio=6 tid=0x02faa800 nid=0x67c waiting for monitor entry [0x031df000..0x031dfc94] java.lang.Thread.State: BLOCKED (on object monitor) at ch.qos.logback.classic.Logger.callAppenders(Logger.java:265) - waiting to lock <0x22eade50> (a ch.qos.logback.classic.Logger) at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:491) at ch.qos.logback.classic.Logger.filterAndLog(Logger.java:464) at ch.qos.logback.classic.Logger.debug(Logger.java:408) at fi.imagesoft.test.Task.run(Task.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(Unknown Source) at java.util.concurrent.FutureTask.runAndReset(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) "pool-2-thread-1" prio=6 tid=0x02fe6c00 nid=0x130 waiting for monitor entry [0x0318f000..0x0318fb94] java.lang.Thread.State: BLOCKED (on object monitor) at ch.qos.logback.classic.Logger.callAppenders(Logger.java:265) - waiting to lock <0x22eade50> (a ch.qos.logback.classic.Logger) at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:491) at ch.qos.logback.classic.Logger.filterAndLog(Logger.java:464) at ch.qos.logback.classic.Logger.debug(Logger.java:408) at fi.imagesoft.test.Task.run(Task.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(Unknown Source) at java.util.concurrent.FutureTask.runAndReset(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) "pool-1-thread-1" prio=6 tid=0x02fed000 nid=0x330 runnable [0x0313f000..0x0313fd14] java.lang.Thread.State: RUNNABLE at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(Unknown Source) at java.net.SocketOutputStream.write(Unknown Source) at java.io.ObjectOutputStream$BlockDataOutputStream.write(Unknown Source) at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source) at java.io.ObjectOutputStream.writeSerialData(Unknown Source) at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source) at java.io.ObjectOutputStream.writeObject0(Unknown Source) at java.io.ObjectOutputStream.writeArray(Unknown Source) at java.io.ObjectOutputStream.writeObject0(Unknown Source) at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source) at java.io.ObjectOutputStream.defaultWriteObject(Unknown Source) at ch.qos.logback.classic.spi.LoggingEvent.writeObject(LoggingEvent.java:287) at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at java.io.ObjectStreamClass.invokeWriteObject(Unknown Source) at java.io.ObjectOutputStream.writeSerialData(Unknown Source) at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source) at java.io.ObjectOutputStream.writeObject0(Unknown Source) at java.io.ObjectOutputStream.writeObject(Unknown Source) at ch.qos.logback.core.net.SocketAppenderBase.append(SocketAppenderBase.java:160) at ch.qos.logback.core.AppenderBase.doAppend(AppenderBase.java:81) - locked <0x22ee7838> (a ch.qos.logback.classic.net.SocketAppender) at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:51) at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:282) at ch.qos.logback.classic.Logger.callAppenders(Logger.java:266) - locked <0x22eade50> (a ch.qos.logback.classic.Logger) at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:491) at ch.qos.logback.classic.Logger.filterAndLog(Logger.java:464) at ch.qos.logback.classic.Logger.debug(Logger.java:408) at fi.imagesoft.test.Task.run(Task.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(Unknown Source) at java.util.concurrent.FutureTask.runAndReset(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) "Low Memory Detector" daemon prio=6 tid=0x02aa9800 nid=0x1f4 runnable [0x00000000..0x00000000] java.lang.Thread.State: RUNNABLE "CompilerThread0" daemon prio=10 tid=0x02aa8000 nid=0x258 waiting on condition [0x00000000..0x02d5f81c] java.lang.Thread.State: RUNNABLE "Attach Listener" daemon prio=10 tid=0x02aa7000 nid=0x734 runnable [0x00000000..0x00000000] java.lang.Thread.State: RUNNABLE "Signal Dispatcher" daemon prio=10 tid=0x02aa6000 nid=0x180 waiting on condition [0x00000000..0x00000000] java.lang.Thread.State: RUNNABLE "Finalizer" daemon prio=8 tid=0x02aa1800 nid=0x68c in Object.wait() [0x02c6f000..0x02c6fc94] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x22eaabd0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(Unknown Source) - locked <0x22eaabd0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(Unknown Source) at java.lang.ref.Finalizer$FinalizerThread.run(Unknown Source) "Reference Handler" daemon prio=10 tid=0x02a9d400 nid=0x214 in Object.wait() [0x02c1f000..0x02c1fd14] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x22eaac60> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:485) at java.lang.ref.Reference$ReferenceHandler.run(Unknown Source) - locked <0x22eaac60> (a java.lang.ref.Reference$Lock) "VM Thread" prio=10 tid=0x02a94000 nid=0x390 runnable "VM Periodic Task Thread" prio=10 tid=0x02ac4400 nid=0x208 waiting on condition JNI global references: 1071 Heap def new generation total 960K, used 441K [0x229b0000, 0x22ab0000, 0x22e90000) eden space 896K, 42% used [0x229b0000, 0x22a0e6d0, 0x22a90000) from space 64K, 99% used [0x22aa0000, 0x22aafff8, 0x22ab0000) to space 64K, 0% used [0x22a90000, 0x22a90000, 0x22aa0000) tenured generation total 4096K, used 3804K [0x22e90000, 0x23290000, 0x269b0000) the space 4096K, 92% used [0x22e90000, 0x23247328, 0x23247400, 0x23290000) compacting perm gen total 12288K, used 1528K [0x269b0000, 0x275b0000, 0x2a9b0000) the space 12288K, 12% used [0x269b0000, 0x26b2e348, 0x26b2e400, 0x275b0000) ro space 8192K, 62% used [0x2a9b0000, 0x2aeaf728, 0x2aeaf800, 0x2b1b0000) rw space 12288K, 52% used [0x2b1b0000, 0x2b7f1eb8, 0x2b7f2000, 0x2bdb0000) That's not a "dead" lock. It's just a simple lock aquired by thread "pool-1-thread-1", which then waits for socket operation to complete. All other threads consequently wait for the same operation.
Anyway, introducing a sync around callAppenders loop (as a simple way to keep logging system runtime-reconfigurable) was one of the most stupid design decisions made in Log4J, which then carried on to logback, as its successor. It utterly defeats possibility of using log4j/logback in applications that utilize multiple threads and require lots of logging (as opposed to non-logging, which log4j/logback does pretty well). As Earwin observes, there is no deadlock. However, it is true that there is contention on the same resource, that is the logger with appenders attached, in this case, the root logger. Thus, during logging all threads a synchronized on the root logger, see Logger.callAppenders(Logger.java:265). We could loosen this particular synchronization by using read-locks allowing concurrent access to the root logger. However, this would displace the contention point to appenders which can process only one event at a time. (The doAppend method in method AppenderBase is synchronized.)
I think the slowness observed is caused by the Console plug-in which grinds to a near-halt after 5 minutes of intensive use by the "Main" application above. If you restart Main, then the laggard behavior is observed almost immediately, which leads me to think that the problem is with the Console plug-in. My suspicion can be confirmed by commenting out <consolePlugin /> in the configuration file because thereafter the problem can no longer observed. I am leaving this bug open so that we can come back to it to fix the Console plug-in. If you think that there is a dead-lock or a contention issues in logback, then I invite you to open a new issue. This bug needs to be fixed as does the bug in AppenderBase. I have a patch that fixes this issue but I don't seem to have permission to attach it to this issue. I have sent the patch to the dev list. I would appreciate it if someone with permission could attach it to this issue for me.
FWIW, in some cases slow appenders wouldn't be as much of a problem if it weren't for these synchronization points. I have a custom appender that is slow, but is thread safe so many of them can be run in parallel. With these two synchronization points however, Logback ends up showing up as a system wide bottleneck. Ralph,
If it will make the code simpler, I have no qualms with making appender names non-mutable. Moreover, we can further impose that loggers only accept named appenders rejecting anonymous one. I see that you have displaced the synchronization blocks from Logger to AppenderAttachableImpl, which seems quite reasonable to me. However, and if I am reading the patch correctly, you have not removed the synchronization from callAppender (line 265). The code still reads public void callAppenders(LoggingEvent event) { int writes = 0; // System.out.println("callAppenders"); for (Logger l = this; l != null; l = l.parent) { // System.out.println("l="+l.getName()); // Protected against simultaneous call to addAppender, removeAppender,... synchronized (l) { <--- this is what I am takling about writes += l.appendLoopOnAppenders(event); if (!l.additive) { break; } } } You have also kept a synchronization point in addAppender - public synchronized void addAppender(Appender<LoggingEvent> newAppender) { - if (aai == null) { - aai = new AppenderAttachableImpl<LoggingEvent>(); + public void addAppender(Appender<LoggingEvent> newAppender) { + synchronized(this) { + if (aai == null) { + aai = new AppenderAttachableImpl<LoggingEvent>(); + } } So I don't see any gain with respect to the existing code. What am I missing? Having said that, I do not wish to offhandedly discard your contribution. I was thinking that given that the 'aai' variable is assigned only once (and never set to null again), it would be possible to remove both remaining synchronization points. This would completely get rid of synchronization on "this" (the Logger instance) with synchronization point moved into ApppenderAttachableImpl, which frankly does not seem to be much of a gain to be worth the trouble. Would you concur with my analysis? Well, that was silly. Of course I should have removed the synchronization on the logger in callAppenders. I can't believe I didn't do that since that was the whole point.
Yes, requiring Appenders to have immutable names that are set before the appender is attached will make AppenderAttachableImpl much simpler since the name can then reliably be used in the ConcurrentMap. The extra code to account for the name being missing can just go away. As for the synchronization in addAppender, it is marginally better since it doesn't block aai.addAppender. The synchronization it is still doing only protects against a race condition on creating the AppenderAttachableImpl. However, this might not not even occur. My concern is for what happens when a reconfigure occurs. Is it possible that a Logger didn't have an AppenderAttachableImpl (i.e. it has no Appenders) and then does after a reconfigure? If so, then aai may need to be an AtomicReference. I haven't looked at how the Interpreter works and how loggers get updated, etc.so I'm not sure. I appreciate you looking at the code. Multiple pairs of eyeballs are always better than 1 set. Updated patch. This one makes the Appender name required and immutable, removes all the synchronization from Logger while calling Appenders and makes sure that an AppenderAttachableImpl is always attached to the Logger eliminating any race conditions that might occur during reconfiguration.
I think we should talk over your patch over the phone or via IRC.
Uploaded patch3.txt. This version uses CopyOnWriteArrayList instead of ConcurrentHashMap in AppenderAttachableImpl. It does not change the constructor for AppenderBase so is a lot less intrusive than patch2. I included all the updates from the patch in LBCORE-58 in this patch. I also added a test into LoggerPerfTest that demonstrates the difference. If LoggerPerfTest is run on trunk without the patch the elapsed time is significantly longer.
Hi Ralph,
I have applied your patch on my work copy. However, I suspect using ReadLocks on read operations on AppenderAttachableImpl (appenderOnLoop, getAppender, isAttached, iteratorForAppenders) and WriteLocks on write operations (addAppender, detachAndStopAllAppenders, detachAppender) would be safer although appenderOnLoop would be a bit slower and still free of interference among threads. Nevertheless, I will commit a variant of your patch in the next few minutes. What is the current status of this issue? I think I got hit by this one, at least that's what I'm seeing right now in Eclipse when I suspend all threads in my application - one thread has callAppenders in its stack trace and it hanged (yesterday) on socketWrite. Some of the other threads seem to be waiting on a lock in callAppenders. As a result, I don't see any more logging appended to my log file and the console.
I'm using logback-core/classic 0.9.9 with slf4j 1.5.3. Did the Ralph's patch make it to any of the newer logback releases? I was not able to find any change log or release announcement for releases newer than 0.9.9... I would really appreciate some feedback, thank you. Below is the stack trace of the "offending" thread (only relevant calls included): Thread [Thread-12] (Suspended) SocketOutputStream.socketWrite0(FileDescriptor, byte[], int, int) line: not available [native method] SocketOutputStream.socketWrite(byte[], int, int) line: 92 SocketOutputStream.write(byte[], int, int) line: 136 ObjectOutputStream$BlockDataOutputStream.drain() line: 1685 ObjectOutputStream$BlockDataOutputStream.write(byte[], int, int, boolean) line: 1649 ObjectOutputStream.defaultWriteFields(Object, ObjectStreamClass) line: 1368 ObjectOutputStream.writeSerialData(Object, ObjectStreamClass) line: 1347 ObjectOutputStream.writeOrdinaryObject(Object, ObjectStreamClass, boolean) line: 1290 ObjectOutputStream.writeObject0(Object, boolean) line: 1079 ObjectOutputStream.writeArray(Object, ObjectStreamClass, boolean) line: 1251 ObjectOutputStream.writeObject0(Object, boolean) line: 1075 ObjectOutputStream.defaultWriteFields(Object, ObjectStreamClass) line: 1375 ObjectOutputStream.defaultWriteObject() line: 391 LoggingEvent.writeObject(ObjectOutputStream) line: 288 GeneratedMethodAccessor1118.invoke(Object, Object[]) line: not available DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 25 Method.invoke(Object, Object...) line: 585 ObjectStreamClass.invokeWriteObject(Object, ObjectOutputStream) line: 917 ObjectOutputStream.writeSerialData(Object, ObjectStreamClass) line: 1339 ObjectOutputStream.writeOrdinaryObject(Object, ObjectStreamClass, boolean) line: 1290 ObjectOutputStream.writeObject0(Object, boolean) line: 1079 ObjectOutputStream.writeObject(Object) line: 302 SocketAppender(SocketAppenderBase<E>).append(E) line: 163 SocketAppender(AppenderBase<E>).doAppend(E) line: 81 AppenderAttachableImpl<E>.appendLoopOnAppenders(E) line: 51 Logger.appendLoopOnAppenders(LoggingEvent) line: 282 Logger.callAppenders(LoggingEvent) line: 266 Logger.buildLoggingEventAndAppend(String, Marker, Level, String, Object[], Throwable) line: 487 Logger.filterAndLog(String, Marker, Level, String, Object, Object, Throwable) line: 460 Logger.debug(String, Object, Object) line: 406 And those are the ones being blocked: Thread [Thread-4] (Suspended) Logger.callAppenders(LoggingEvent) line: 265 Logger.buildLoggingEventAndAppend(String, Marker, Level, String, Object[], Throwable) line: 487 Logger.filterAndLog(String, Marker, Level, String, Object[], Throwable) line: 479 Logger.debug(String, Throwable) line: 414 Daemon Thread [ActiveMQ Connection Worker: tcp://localhost/127.0.0.1:61616] (Suspended) Logger.callAppenders(LoggingEvent) line: 265 Logger.buildLoggingEventAndAppend(String, Marker, Level, String, Object[], Throwable) line: 487 Logger.filterAndLog(String, Marker, Level, String, Object[], Throwable) line: 479 Logger.log(Marker, String, int, String, Throwable) line: 818 SLF4JLocationAwareLog.info(Object, Throwable) line: 152 AdvisoryConsumer.dispose() line: 58 ActiveMQConnection.close() line: 563 CachingConnectionFactory(SingleConnectionFactory).closeConnection(Connection) line: 364 CachingConnectionFactory(SingleConnectionFactory).resetConnection() line: 302 CachingConnectionFactory.resetConnection() line: 120 CachingConnectionFactory(SingleConnectionFactory).onException(JMSException) line: 283 SingleConnectionFactory$InternalChainedExceptionListener(ChainedExceptionListener).onException(JMSException) line: 60 ActiveMQConnection$3.run() line: 1690 ThreadPoolExecutor$Worker.runTask(Runnable) line: 650 ThreadPoolExecutor$Worker.run() line: 675 Thread.run() line: 595 Pawel,
Ralph's made has made it into the trunk and was part of the 0.9.11 release. The bad news is that the problem that you are suffering from is in the eclipse plug-in. Ralph's patch merely displaces the deadlock elsewhere. I should point out, this isn't a deadlock but a bottleneck. By removing the locking in Logback we allow Appenders to better handle situations like this. But to do that the Appender will have to extend UnsynchronizedAppenderBase, which the SocketAppender doesn't do. Even if it did, because the SocketAppender has a scarce resource, the socket, it would eventually have to block on that.
The SocketAppender could be modified to use a pool of sockets, but then you would just end up with a bunch of threads sitting in SocketWrite. The advantage is that if the sockets have a 5 second timeout and you have 10 threads then the average log event takes 50 seconds. With a socket pool the same size as the number of threads the average log event would be 5 seconds. Still awful but 10 times better. What is needed is a better way to handle Appenders that are "Out of Service". I will create a Jira issue for that and create a patch at some point. Thanks, I will update to 0.9.11 ASAP then.
Good luck with fixing it "the right way" :-) Any progress on this issue? I'm getting:
java.lang.Thread.State: BLOCKED on ch.qos.logback.core.rolling.RollingFileAppender@2db3f4 owned by: http-127.0.0.1-8080-2 at ch.qos.logback.core.AppenderBase.doAppend(AppenderBase.java:66) at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:60) at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:270) at ch.qos.logback.classic.Logger.callAppenders(Logger.java:257) at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:439) at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:393) at ch.qos.logback.classic.Logger.debug(Logger.java:500) Thanks. Hello Robert,
Any progress on what exactly? This issue is related to logback console plug-in for eclipse as such as logback's threading model. Are you using the the logback console plug-in for eclipse? I'm getting that deadlock in logback core (0.9.14 and 0.9.15). No console plugin or socket appenders, just plain RollingFileAppender.
In that case, please open a new jira issue. It would also be very helpful if you could provide the full thread dump. Please do not continue to comment on the present issue. (I intend to mark it as closed.)
Hi guys,
Our webapp running on a 4-core solaris is suffering* the same problem as described in http://jira.qos.ch/browse/LBCORE-63 and http://jira.qos.ch/browse/LBCORE-96 and I think I have an idea about the reason behind this. Threading policy has changed from 1.5 to 1.6. I can't give you a reference for this, it's purely based on observation. In 1.5, a thread waiting for a lock was somewhat preferred by the VM over threads that would arrive at the wait() at a later time. In 1.6, however, a thread that releases synchronization is much more likely to immediately obtain the object lock again if it's in a consumer/producer loop. for(;;) { synchronized(obj) { // thread is likely to reobtain lock } } The same is the case for wait/notifyAll combinations. One can not assume that a longer waiting thread will likely obtain the lock after notifyAll. This is documented in http://java.sun.com/javase/6/docs/api/java/lang/Object.html#notifyAll() so the changed behavior between 1.5 and 1.6 is perfectly legal. My assumption is that the perceived deadlock, which is actually a congestion/starvation problem, occurs if one thread is logging "like crazy" while the other thread is sluggish as hell because he can't obtain the lock. This should be solvable by using a fair ReentrantLock (See http://java.sun.com/javase/6/docs/api/java/util/concurrent/locks/ReentrantLock.html#ReentrantLock(boolean) ) From the javadoc: "The constructor for this class accepts an optional fairness parameter. When set true, under contention, locks favor granting access to the longest-waiting thread. Otherwise this lock does not guarantee any particular access order. Programs using fair locks accessed by many threads may display lower overall throughput (i.e., are slower; often much slower) than those using the default setting, but have smaller variances in times to obtain locks and guarantee lack of starvation." The guaranteed lack of starvation is what we are looking for, I think, This is something that pure synchronization simply can't provide. I'd suggest to give the attached patch a try, I can't do so myself because I can't risk our webapp and I have no other chance to test this right now. I'll crosspost this to both tickets cause I'm not sure which one is more "active" right now. * The webapp shows that behavior in case of either console, file and/or lilith/socket appenders. It's just much more likely to happen with the lilith/socket appender in place since it takes significantly longer to send events over the wire than just writing them to disk or console. This issue is tainted by the fact that the eclipse console plug-in is being used. If you are really concerned about a deadlock issue, please open a new and separate issue. Suppling a test case would be most appropriate. In the mean time, as forewarned on March the 23rd, I am closing this issue.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||
---------------------------------
logger.xml
---------------------------------
<?xml version="1.0" encoding="UTF-8" ?>
<configuration>
<consolePlugin />
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
<layout class="ch.qos.logback.classic.PatternLayout">
<pattern>%d{HH:mm:ss.SSS} - %msg%n</pattern>
</layout>
</appender>
<appender name="FILE-MAIN" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>logs/main.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
<FileNamePattern>logs/main-%i.log</FileNamePattern>
<MinIndex>1</MinIndex>
<MaxIndex>3</MaxIndex>
</rollingPolicy>
<triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
<MaxFileSize>500MB</MaxFileSize>
</triggeringPolicy>
<layout class="ch.qos.logback.classic.PatternLayout">
<Pattern>%date %level [%thread] %msg%n</Pattern>
</layout>
</appender>
<root>
<appender-ref ref="FILE-MAIN" />
<level value="DEBUG" />
</root>
</configuration>