
Content type of media objects containing text won't ever display charset

asked 2015-09-01 08:14:58 +0800

cvarona

updated 2015-09-01 08:15:43 +0800

Hi,

I've discovered that ZK systematically removes or hides the charset information from Media objects containing plain text. I can't imagine why this is done, but I think this information has value, especially when it comes to downloading that very media object (the browser cannot tell which charset to apply and is likely to get it wrong) or if, for whatever reason, you need to reproduce the actual bytes that were uploaded.
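To illustrate why the charset matters once only the characters or only the bytes are left, here is a minimal, ZK-independent sketch using plain JDK classes. The class name `CharsetRoundTrip` and its helper methods are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of ZK.
final class CharsetRoundTrip {

    // Decode raw bytes with an explicitly chosen charset.
    static String decodeAs(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Encode characters back into bytes with an explicitly chosen charset.
    static byte[] encodeAs(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        // "\u00e9" is the letter e with an acute accent; in UTF-8 it takes two bytes.
        byte[] uploaded = encodeAs("\u00e9", StandardCharsets.UTF_8);

        // Decoding with the wrong charset yields two unrelated characters.
        String wrong = decodeAs(uploaded, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.length()); // 2

        // Re-encoding the correct text with a different charset does not
        // reproduce the uploaded bytes: one byte instead of two.
        String right = decodeAs(uploaded, StandardCharsets.UTF_8);
        System.out.println(encodeAs(right, StandardCharsets.ISO_8859_1).length); // 1
    }
}
```

Without knowing the original charset, neither the download side nor a byte-level reproduction can be done reliably.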

  1. If you write a custom charset finder, the charset it yields is used to turn bytes into characters and is discarded afterwards (except for files big enough to be turned into ReaderMedia, which does not publicly expose the charset anyway).

  2. If you write a custom AuUploader that appends the charset detected by a custom charset finder to the content type, AMedia systematically strips it from the content type in its internal setup method.

That being so, the only remaining option is to write special media wrappers that preserve the charset information or expose it through a getter, as explained here.
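As a rough idea of what such a wrapper could look like, here is a minimal sketch. It is deliberately ZK-independent: the class name `CharsetAwareText` and all of its methods are hypothetical, and a real version would presumably implement or delegate to ZK's Media interface, which is omitted here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper: keeps the uploaded bytes together with the charset
// that was detected for them, so neither is lost after decoding.
final class CharsetAwareText {
    private final byte[] bytes;
    private final String baseContentType; // e.g. "text/plain", without parameters
    private final Charset charset;

    CharsetAwareText(byte[] bytes, String baseContentType, Charset charset) {
        this.bytes = bytes.clone(); // defensive copy
        this.baseContentType = baseContentType;
        this.charset = charset;
    }

    // The getter the question asks for.
    Charset getCharset() {
        return charset;
    }

    // Content type with the charset parameter preserved.
    String getContentType() {
        return baseContentType + ";charset=" + charset.name();
    }

    // The exact bytes that were uploaded.
    byte[] getByteData() {
        return bytes.clone();
    }

    // The decoded characters, using the preserved charset.
    String getStringData() {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        byte[] raw = "d\u00eda".getBytes(StandardCharsets.ISO_8859_1);
        CharsetAwareText media =
                new CharsetAwareText(raw, "text/plain", StandardCharsets.ISO_8859_1);
        System.out.println(media.getContentType());
        System.out.println(media.getByteData().length);
    }
}
```

Holding the bytes rather than the decoded string is a design choice made for this sketch: the characters can always be derived from bytes plus charset, while the reverse is only safe if the charset is still known.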

Wouldn't it be possible for text media objects to preserve the charset info in the content type? Or at least to expose a getter method that grants access to this information?

Kind regards


1 Answer


answered 2015-09-03 12:18:48 +0800

cor3000

updated 2015-09-03 12:20:05 +0800

Yes, it's surely possible, but I don't think it makes sense.

When you have text documents, this information is automatically used to convert the binary data into a character String or Reader.

If it's binary, it's well ... binary, and the charset information is ignored on purpose (or is embedded in the document itself, as with XML). Is there a case where a text/* content type with an encoding is not transformed into a String/Reader properly and needs to be fixed?

According to the MIME standard, the charset parameter is only applicable to text content types.

From https://www.ietf.org/rfc/rfc2045.txt

For example, the "charset" parameter is applicable to any subtype of "text", while the "boundary" parameter is required for any subtype of the "multipart" media type.
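As an illustration of how the charset parameter sits inside a content type, here is a minimal, ZK-independent sketch that extracts it from a header value. The class `ContentTypeCharset` and its method are hypothetical, and the parsing is simplified: it does not handle every quoting or comment form that the MIME grammar allows.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ZK.
final class ContentTypeCharset {

    // Returns the charset named in the content type's "charset" parameter,
    // or the fallback if the parameter is missing or names an unknown charset.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the type/subtype; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim();
            if (!name.equalsIgnoreCase("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip surrounding double quotes, if any.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            try {
                return Charset.forName(value);
            } catch (IllegalArgumentException e) {
                // Covers both illegal and unsupported charset names.
                return fallback;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(
                charsetOf("text/plain; charset=ISO-8859-1", StandardCharsets.UTF_8));
        System.out.println(charsetOf("text/plain", StandardCharsets.UTF_8));
    }
}
```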

So adding a charset to a binary content type doesn't make sense to me.

In the other post you vaguely mentioned such a case: "If by whichever reason you cannot read the media object's content as characters and are forced to read them as raw bytes ...". Do you have a more concrete example? Then I'd be happy to help.
