Affects Version/s: None
Fix Version/s: None
Windows Server 2003 Enterprise edition
We have a clustered Windows environment (2 nodes). Each node logs to the same file, which is stored on a clustered storage device (prudent is set to true).
Logging works well until we switch the owner of the storage device that holds the log files, at which point one of two things happens:
- One or both servers go into a sort of deadlock state, meaning the code that logs just hangs indefinitely. The safeWrite() method in FileAppender seems to be to blame because of its "lock()" call (note that a JVM restart fixes the problem). I have been able to reproduce this locked state and verify that lock() is the culprit by trying to get a lock on the log file in the same JVM without using any logback code. That code hangs as well.
- If there is no deadlock, any log entry written after the switch simply vanishes (I have no idea where the output stream points once the owner has been switched).
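For anyone who wants to repeat the lock check from the first item, here is a minimal sketch of that kind of probe. It is not the exact code I ran: the class name LockProbe, the method canLock, and the timeout handling are made up for illustration. It only uses the standard java.nio file locking API and runs the lock() call on a daemon thread, so a hung lock shows up as a timeout instead of blocking the caller.

```java
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of logback.
final class LockProbe {

    /**
     * Tries to take an exclusive lock on the whole file.
     * Returns true if the lock was obtained (and released) within the timeout,
     * false if lock() had not returned by then.
     */
    static boolean canLock(Path file, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "lock-probe");
            t.setDaemon(true); // a hung probe must not keep the JVM alive
            return t;
        });
        Future<Boolean> result = executor.submit(() -> {
            // Both the channel and the lock are released by try-with-resources.
            try (FileChannel channel = FileChannel.open(
                     file, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
                 FileLock lock = channel.lock()) {
                return lock.isValid();
            }
        });
        try {
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return false; // lock() is still blocked
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Calling LockProbe.canLock(pathToLogFile, 5000) before and after switching the storage owner should show whether lock() still returns. Any I/O error raised while opening or locking the file is rethrown wrapped in an ExecutionException.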
I have no idea how this can be fixed without resorting to a reload/reboot of our servers (which is not an option in production).
We used to log synchronously, but then every thread that called logback would hang, leaving a massive number of idle threads. We now log asynchronously (the system we built around logback already offered this option), which gives us one advantage: we never write concurrently to the same file, although different nodes may write serially.
As a temporary workaround, I am thinking of implementing a simple appender that opens and closes the output stream for each write (performance is less of an issue) and that does not implement locking.
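A rough sketch of the core of such a workaround, written as a plain Java class rather than a real logback appender. The name ReopeningLogWriter is hypothetical, and wiring it into an appender subclass is left out. It assumes, as described above, that writes never overlap between nodes, since nothing here takes a file lock.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: opens, appends to, and closes the file on every write.
final class ReopeningLogWriter {

    private final Path file;

    ReopeningLogWriter(Path file) {
        this.file = file;
    }

    /**
     * Appends one line to the log file. Files.write opens the file,
     * writes the bytes, and closes it again, so no stream is kept open
     * across a switch of the storage owner. No file lock is taken.
     */
    synchronized void write(String line) throws IOException {
        byte[] bytes = (line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
        Files.write(file, bytes, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

The synchronized keyword only serializes writers inside one JVM. Whether reopening the file per write actually survives an owner switch on the clustered storage is untested here.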