Opened 16 years ago
Closed 16 years ago
#293 closed defect (fixed)
sample field is not converted to html in non-utf8 html documents
Reported by: | ruslan shevchenko | Owned by: | Olly Betts |
---|---|---|---|
Priority: | normal | Milestone: | 1.0.8 |
Component: | Omega | Version: | 1.0.7 |
Severity: | normal | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Operating System: | All |
Description
The problem is that the sample field is not converted to UTF-8: Omega sets the sample from the value of the description attribute, but the HTML parser does not convert attribute values to UTF-8.
A patch to fix this is attached. (It is against 1.0.7, but I can't see 1.0.7 in the Trac version options list.)
Attachments (1)
Change History (8)
by , 16 years ago
Attachment: | omega-rssh-293.patch added |
---|
comment:1 by , 16 years ago
Description: | modified (diff) |
---|---|
Summary: | sample field is not converted to html8 in non-utf8 htmpl documents → sample field is not converted to html in non-utf8 htmpl documents |
Version: | other → 1.0.7 |
comment:2 by , 16 years ago
Summary: | sample field is not converted to html in non-utf8 htmpl documents → sample field is not converted to html in non-utf8 html documents |
---|
comment:3 by , 16 years ago
Well, I fixed the versions at least - it turns out that trac sorts them by date so by setting the dates for all the old releases, they now sort sensibly.
comment:4 by , 16 years ago
An example document is:
<html>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
<description content="Моя сторінка (My page)"></description>
</html>
As I understand it, process_text is called only for text fragments in htmlparse.cc.
And htmlparse.cc contains one and only one call to convert_to_utf8: before process_text (line 216). Attributes are passed to open_tag as is, without conversion.
So I still think that the patch is correct.
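A minimal C++ sketch of the point made in this comment. It is neither the attached patch nor Omega's actual code. If only text fragments are converted, attribute values such as the description stay in the document's original encoding, so the same conversion has to be applied to them as well. `cp1251_to_utf8`, `attributes_to_utf8` and the `Attributes` alias are hypothetical names, and the decoder is deliberately incomplete: it covers only ASCII, bytes 0xC0 to 0xFF and two Ukrainian letters.

```cpp
#include <cassert>
#include <map>
#include <string>

// Append code point cp (BMP only) to out as UTF-8.
static void append_utf8(std::string& out, unsigned cp) {
    assert(cp <= 0xFFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Simplified windows-1251 decoder (an assumption made for illustration):
// ASCII, 0xC0..0xFF (U+0410..U+044F), and the letters 0xB2/0xB3.
// Every other byte becomes U+FFFD.
std::string cp1251_to_utf8(const std::string& in) {
    std::string out;
    for (unsigned char ch : in) {
        unsigned cp;
        if (ch < 0x80) cp = ch;
        else if (ch >= 0xC0) cp = 0x0410 + (ch - 0xC0);
        else if (ch == 0xB2) cp = 0x0406;
        else if (ch == 0xB3) cp = 0x0456;
        else cp = 0xFFFD;
        append_utf8(out, cp);
    }
    return out;
}

// Hypothetical stand-in for the attribute map handed to a tag callback.
using Attributes = std::map<std::string, std::string>;

// Convert every attribute value, mirroring what is already done for text.
Attributes attributes_to_utf8(const Attributes& raw) {
    Attributes converted;
    for (const auto& kv : raw) {
        converted[kv.first] = cp1251_to_utf8(kv.second);
    }
    return converted;
}
```

With this in place, a description value read from a windows-1251 page would reach the sample field as UTF-8 rather than as raw windows-1251 bytes.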
comment:5 by , 16 years ago
comment:6 by , 16 years ago
Milestone: | → 1.0.8 |
---|
Ah, I'd misread the code around the call to process_text().
Fixed in trunk [11162].
comment:7 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Backported to 1.0 branch [11167].
1.0.7 is in the list, but not next to 1.0.6. We imported data from bugzilla. The import script orders the versions with the newest first, but trac adds new entries to the end. I don't know how to correct this stupidity.
This patch doesn't look right to me either. It would lead to a double conversion to UTF-8 in some situations.
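To illustrate why a double conversion matters, here is a small C++ sketch. It uses ISO-8859-1 rather than windows-1251 because that decoder fits in a few lines; `latin1_to_utf8` and `show_double_conversion` are hypothetical names, not Omega functions. Running a converter over text that is already UTF-8 treats each UTF-8 byte as a separate character and produces mojibake.

```cpp
#include <cassert>
#include <string>

// Treat every input byte as an ISO-8859-1 character and encode it as UTF-8.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    for (unsigned char ch : in) {
        if (ch < 0x80) {
            out += static_cast<char>(ch);
        } else {
            out += static_cast<char>(0xC0 | (ch >> 6));
            out += static_cast<char>(0x80 | (ch & 0x3F));
        }
    }
    return out;
}

// Convert the single Latin-1 byte 0xE9 (e with acute accent) once, then again.
void show_double_conversion() {
    const std::string once = latin1_to_utf8("\xE9");
    const std::string twice = latin1_to_utf8(once);
    assert(once == "\xC3\xA9");           // correct UTF-8 for U+00E9
    assert(twice == "\xC3\x83\xC2\xA9");  // garbled: now decodes as two characters
}
```

Any fix therefore needs to make sure each string is converted exactly once, whichever code path it takes.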
I suspect that this and #292 are actually the same issue, and that we need to parse the document to find any meta http-equiv which specifies the character set, then convert the document using that character set and reparse.
Could you supply a sample document which shows this problem too? Or a single document which shows both.
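The "find the character set, then reparse" idea could be sketched roughly as below. This is an assumption-laden illustration in C++, not the change that went into trunk: `sniff_meta_charset` and `needs_reparse` are hypothetical names, and a plain substring search stands in for real tag parsing.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Return the lower-cased charset named in the document, or "" if none is found.
// Simplification: looks for the first "charset=" anywhere in the markup.
std::string sniff_meta_charset(const std::string& html) {
    std::string lower(html);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    const std::string key = "charset=";
    std::string::size_type pos = lower.find(key);
    if (pos == std::string::npos) return std::string();
    pos += key.size();
    // Skip an opening quote, as in <meta charset="...">.
    if (pos < lower.size() && (lower[pos] == '"' || lower[pos] == '\'')) ++pos;
    std::string::size_type end = pos;
    while (end < lower.size() &&
           (std::isalnum(static_cast<unsigned char>(lower[end])) ||
            lower[end] == '-' || lower[end] == '_' || lower[end] == '.')) {
        ++end;
    }
    return lower.substr(pos, end - pos);
}

// True if the document declares a charset other than the one assumed so far
// (assumed must be given in lower case), i.e. a second parse is needed.
bool needs_reparse(const std::string& html, const std::string& assumed) {
    const std::string found = sniff_meta_charset(html);
    return !found.empty() && found != assumed;
}

// Self-check using the meta tag from the example document in this ticket.
void check_sniff_meta_charset() {
    const std::string doc =
        "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=windows-1251\">";
    assert(sniff_meta_charset(doc) == "windows-1251");
    assert(needs_reparse(doc, "utf-8"));
    assert(sniff_meta_charset("<html><body>no meta</body></html>").empty());
}
```

A caller would run the first parse with a default charset and, if `needs_reparse` returns true, convert the raw bytes once from the declared charset and parse again, which also avoids converting any string twice.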