answered 2018-06-29 11:44:06 +0800

cor3000

The specific problem with the "И" character appears when cp1251 is deliberately used to convert the bytes, or when it is set as the default charset via -Dfile.encoding=cp1251:

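import java.nio.charset.Charset;

public class EncodingTest {
    public static void main(String[] args) throws Exception {
        // Build the Cyrillic range U+0410..U+044F (А..я)
        String s = "";
        for (char ch = 0x0410; ch <= 0x044F; ch++)
            s += ch;

        String s2 = s;

        System.out.println(Charset.defaultCharset());

        // Case 1: rely on the default charset (run with -Dfile.encoding=cp1251)
        System.out.println(s);
        s = new String(s.getBytes("utf-8"));    // UTF-8 bytes decoded with the default charset
        System.out.println(s);
        s = new String(s.getBytes(), "utf-8");  // default-charset bytes decoded as UTF-8
        System.out.println(s);

        // Case 2: the same mismatch with cp1251 named explicitly
        System.out.println(s2);
        s2 = new String(s2.getBytes("utf-8"), "cp1251");
        System.out.println(s2);
        s2 = new String(s2.getBytes("cp1251"), "utf-8");
        System.out.println(s2);
    }
}

Output:
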
windows-1251
АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя
АБВГДЕЖЗР?ЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя
АБВГДЕЖЗ??ЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя
АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя
АБВГДЕЖЗР?ЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя
АБВГДЕЖЗ??ЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя

This kind of invalid byte/string conversion between cp1251 and UTF-8 might explain the problem. At least, it reproduces the symptom in this little test case.
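A likely reason why only "И" is affected: its UTF-8 encoding is the byte pair 0xD0 0x98, and 0x98 is an unassigned position in windows-1251, so decoding it as cp1251 loses information that cannot be recovered afterwards. The neighbouring letters map to bytes that cp1251 does define, so they survive the round trip. Below is a minimal sketch to check this in isolation; the class and method names (Cp1251RoundTrip, misdecode, undo, survives) are made up for illustration, and it assumes your JVM ships the windows-1251 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Cp1251RoundTrip {
    // Assumption: the JVM provides windows-1251 (standard JDK builds do).
    static final Charset CP1251 = Charset.forName("windows-1251");

    // Encode as UTF-8, then (wrongly) decode those bytes as windows-1251.
    static String misdecode(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), CP1251);
    }

    // Try to undo the mistake: encode as windows-1251, decode as UTF-8.
    static String undo(String s) {
        return new String(s.getBytes(CP1251), StandardCharsets.UTF_8);
    }

    // True if the text comes back unchanged after both steps.
    static boolean survives(String s) {
        return s.equals(undo(misdecode(s)));
    }

    public static void main(String[] args) {
        // U+0417, U+0418 (the problem letter), U+0419; escapes keep the source ASCII-only
        for (String ch : new String[] {"\u0417", "\u0418", "\u0419"}) {
            System.out.printf("U+%04X survives: %b%n", (int) ch.charAt(0), survives(ch));
        }
    }
}
```

With a standard JDK this should report that U+0417 and U+0419 survive while U+0418 does not, which matches the output of the test case above.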

How and where this can occur inside your application, or potentially any other ZK application, is something I hope to find out from your stack trace.

I quite like these kinds of mysteries, so hang in there.
