Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: 1.5.7
Affects Version/s: 1.2.11
Component/s: logback-core
Labels: None
Environment: Windows 10; Java 17; locale en-US

Description

Let's say I have this typical Logback configuration to write output to stderr, with pretty colors and such:

<configuration>
  <property scope="context" name="COLORIZER_COLORS" value="boldred@,boldyellow@,boldcyan@,@,@" />
  <conversionRule conversionWord="colorize" converterClass="org.tuxdude.logback.extensions.LogColorizer" />
  <statusListener class="ch.qos.logback.core.status.NopStatusListener" />
  <appender name="STDERR" class="ch.qos.logback.core.ConsoleAppender">
    <target>System.err</target>
    <withJansi>true</withJansi>
    <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
      <pattern>[%colorize(%level)] %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="STDERR" />
  </root>
</configuration>

The key part is that I have a PatternLayoutEncoder (a descendant of LayoutWrappingEncoder) logging via a ConsoleAppender to System.err.

The default charset for a LayoutWrappingEncoder (discussed in depth on Stack Overflow) is Charset.defaultCharset(). (How it gets that is complicated, but ultimately it relies on String.getBytes().) There's just one big problem: the default charset of System.out and System.err is System.console().charset(), not Charset.defaultCharset(), as per the API documentation for e.g. System.out:

The "standard" output stream. This stream is already open and ready to accept output data. Typically this stream corresponds to display output or another output destination specified by the host environment or user. The encoding used in the conversion from characters to bytes is equivalent to Console.charset() if the Console exists, Charset.defaultCharset() otherwise.
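To check which charsets are in play on a given machine, something like the following minimal sketch can be run from a real terminal. It assumes Java 17 or later (where Console.charset() exists); the class and method names are made up for illustration. Note that System.console() returns null when no console is attached (for example under many IDEs or when output is redirected), in which case the sketch falls back to Charset.defaultCharset(), mirroring the rule in the quoted documentation.

```java
import java.io.Console;
import java.nio.charset.Charset;

public class ConsoleCharsetCheck {

    // Mirrors the documented rule for System.out/System.err:
    // Console.charset() if a console exists, Charset.defaultCharset() otherwise.
    static Charset consoleCharsetOrDefault() {
        Console console = System.console();
        return console != null ? console.charset() : Charset.defaultCharset();
    }

    public static void main(String[] args) {
        // The charset a LayoutWrappingEncoder ends up using when none is configured.
        System.out.println("Charset.defaultCharset(): " + Charset.defaultCharset());
        // The charset the console itself expects.
        System.out.println("console charset (or default): " + consoleCharsetOrDefault());
    }
}
```

If the two lines print different values, any non-ASCII text encoded with the first charset and written to the console is at risk of being garbled.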

On my system for example, Charset.defaultCharset() is set to windows-1252, while System.console().charset() returns IBM437. This results in mojibake: if I try to log the string "é" via Logback, it appears in System.out or System.err as Θ instead! (See discussion on Stack Overflow.)
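The mismatch can be reproduced without Logback at all. The sketch below encodes "é" with windows-1252 (the encoder's side) and decodes the resulting byte with IBM437 (the console's side). The class and method names are hypothetical, and it assumes the running JDK provides the IBM437 charset, which a full JDK normally does.

```java
import java.nio.charset.Charset;

public class MojibakeDemo {

    // Encodes text with one charset and decodes the bytes with another,
    // which is what happens when encoder and console disagree.
    static String misread(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        Charset encoderCharset = Charset.forName("windows-1252");
        Charset consoleCharset = Charset.forName("IBM437");

        // "\u00e9" is "é"; windows-1252 encodes it as the single byte 0xE9.
        String shown = misread("\u00e9", encoderCharset, consoleCharset);

        // Print the code point rather than the character so the result
        // does not depend on the console's own encoding.
        System.out.println(String.format("U+%04X", (int) shown.charAt(0))); // prints U+0398 (Θ)
    }
}
```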

Thus LayoutWrappingEncoder somehow needs to default to System.console().charset() (instead of Charset.defaultCharset() as it does now) if it is appending to System.out or System.err. (I can't manually specify a charset because I certainly don't know what the console default charset will be on each user's machine, as there will be many different values for different users.)

Unfortunately LayoutWrappingEncoder probably has no idea where it's writing to and probably shouldn't care. So instead, LayoutWrappingEncoder should be able to ask the enclosing OutputStreamAppender for the current charset. OutputStreamAppender could then default to Charset.defaultCharset() if not specified, and ConsoleAppender could override the default to return System.console().charset() instead of Charset.defaultCharset(). Problem solved, with the added benefit that the default charset now comes explicitly from the OutputStreamAppender implementation rather than indirectly from String.getBytes() hidden in the bowels of LayoutWrappingEncoder.
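To make the proposal concrete, here is a rough, self-contained sketch of the delegation idea. None of these classes are Logback's: StreamAppender, ConsoleStreamAppender, and Encoder are hypothetical stand-ins for OutputStreamAppender, ConsoleAppender, and LayoutWrappingEncoder, and the method name defaultCharset() is invented for illustration. It is not meant to describe how the actual fix was implemented.

```java
import java.io.Console;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;

public class CharsetDelegationSketch {

    /** Hypothetical stand-in for OutputStreamAppender. */
    static class StreamAppender {
        private final OutputStream out;
        private final Encoder encoder;

        StreamAppender(OutputStream out) {
            this.out = out;
            this.encoder = new Encoder(this);
        }

        /** Charset the encoder should use when none is configured explicitly. */
        Charset defaultCharset() {
            return Charset.defaultCharset();
        }

        Encoder encoder() {
            return encoder;
        }

        void append(String message) throws IOException {
            out.write(encoder.encode(message));
            out.flush();
        }
    }

    /** Hypothetical stand-in for ConsoleAppender writing to System.err. */
    static class ConsoleStreamAppender extends StreamAppender {
        ConsoleStreamAppender() {
            super(System.err);
        }

        @Override
        Charset defaultCharset() {
            // Prefer the console's charset; fall back when no console is attached.
            Console console = System.console();
            return console != null ? console.charset() : Charset.defaultCharset();
        }
    }

    /** Hypothetical stand-in for LayoutWrappingEncoder. */
    static class Encoder {
        private final StreamAppender parent;
        private Charset charset; // null means "ask the appender"

        Encoder(StreamAppender parent) {
            this.parent = parent;
        }

        void setCharset(Charset charset) {
            this.charset = charset;
        }

        byte[] encode(String text) {
            // An explicitly configured charset wins; otherwise delegate to the appender.
            Charset effective = charset != null ? charset : parent.defaultCharset();
            return text.getBytes(effective);
        }
    }

    public static void main(String[] args) throws IOException {
        // "\u00e9" is "é"; it is encoded with whatever charset the console reports.
        new ConsoleStreamAppender().append("caf\u00e9" + System.lineSeparator());
    }
}
```

The point of the design is that the encoder never calls String.getBytes() without a charset: an explicit setting still takes precedence, and the only component that knows about the console is the console appender.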

Issue Links

links to

commit/7c2947

People

Assignee: Ceki Gülcü

Reporter: Garret Wilson

Votes: 0

Watchers: 2

Dates

Created: 29/May/22 19:59

Updated: 12/Aug/24 17:16

Resolved: 12/Aug/24 17:13