Opened 10 years ago

Closed 10 years ago

Last modified 10 years ago

#9398 closed defect (fixed)

IE: Dijit.Editor International characters problem

Reported by: mpoiu Owned by: bill
Priority: high Milestone: 1.3.2
Component: Editor Version: 1.3.1
Keywords: IE Dijit Editor international characters charset Cc: mmpoiu@…, Jared Jurkiewicz
Blocked By: Blocking:

Description

Dijit.Editor under Internet Explorer (IE6, IE7, IE8) does not display international characters (e.g. Polish "ółńżźćąśę") correctly. The file is encoded in UTF-8.

Under FF and Opera everything is fine.

Tested on dojo 1.3.1 release and the latest SVN version.

Using testcase: dijit\tests\test_Editor.html (added meta charset=utf-8, and Polish text inside the editor)

Attachments (4)

test_Editor_charset.html (918 bytes) - added by mpoiu 10 years ago.
test_Editor_charset.2.html (1.1 KB) - added by bill 10 years ago.
augmented testcase to show how text should look vs. how it actually appears
Codepage.patch (1.8 KB) - added by Jared Jurkiewicz 10 years ago.
Patch to default the editor to the current page codepage if none is specified, falling back to UTF-8 if nothing else can be determined.
Codepage.2.patch (1.8 KB) - added by Jared Jurkiewicz 10 years ago.
Updated patch, simplifying constructor setting.

Change History (28)

comment:1 Changed 10 years ago by bill

Please attach the actual test case using the attach file button.

Changed 10 years ago by mpoiu

Attachment: test_Editor_charset.html added

comment:2 in reply to:  1 Changed 10 years ago by mpoiu

Replying to bill:

Please attach the actual test case using the attach file button.

Done, I made a simple test. Does that work for you?

comment:3 Changed 10 years ago by bill

Is that file actually in UTF-8? It doesn't look like it to me. (When you look at attachment:test_Editor_charset.html, which browser character set setting makes it show up correctly?)

comment:4 in reply to:  3 Changed 10 years ago by mpoiu

Replying to bill:

Is that file actually in UTF-8? It doesn't look like it to me. (When you look at attachment:test_Editor_charset.html, which browser character set setting makes it show up correctly?)

Yes it is in UTF-8. I think that Trac attachment view displays it incorrectly. Try to download it in the original format.

comment:5 Changed 10 years ago by bill

When you look at attachment:test_Editor_charset.html which browser character set setting makes it show up correctly?

comment:6 Changed 10 years ago by mpoiu

None; the Trac attachment view always shows incorrect characters. Please try downloading it using the "Original format" link at the bottom.

comment:7 Changed 10 years ago by bill

I see... OK I can reproduce it (on IE only). Maybe it's a problem with the iframe.

Changed 10 years ago by bill

Attachment: test_Editor_charset.2.html added

augmented testcase to show how text should look vs. how it actually appears

comment:8 Changed 10 years ago by bill

Milestone: changed from tbd to 1.4
Owner: changed from liucougar to bill
Status: changed from new to assigned

comment:9 Changed 10 years ago by bill

Resolution: fixed
Status: changed from assigned to closed

(In [17963]) Make non-ascii characters show up in the Editor correctly on IE. Fixes #9398 !strict.

comment:10 Changed 10 years ago by Jared Jurkiewicz

Bill,

Wouldn't it be better to look at the current page and use that charset if no user specified charset was provided (and only if not determinable, to default to UTF-8)?

Attaching a patch that does this...
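
The fallback order proposed in this comment (user-specified charset first, then the page's charset, then UTF-8) could be sketched roughly as below. This is only an illustration, not the contents of the attached patch: the helper name `resolveCharset` and its signature are hypothetical, and it assumes the page document exposes its encoding through the standard `characterSet` property or the legacy IE `charset` property.

```javascript
// Hypothetical helper (not from the patch): pick the charset for the
// editor iframe using the fallback order described in the comment.
function resolveCharset(userCharset, doc) {
  // 1. An explicitly supplied charset wins.
  if (userCharset) {
    return userCharset;
  }
  // 2. Otherwise use the charset of the containing page, if it can be
  //    determined. `characterSet` is the standard property; `charset`
  //    is the legacy name used by older IE versions.
  var pageCharset = doc && (doc.characterSet || doc.charset);
  // 3. Fall back to UTF-8 when nothing else is known.
  return pageCharset || "UTF-8";
}
```

In a browser this would be called as `resolveCharset(null, document)`; passing a plain object such as `{ charset: "shift_jis" }` in place of `document` is enough to try it outside a browser.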

Changed 10 years ago by Jared Jurkiewicz

Attachment: Codepage.patch added

Patch to default the editor to the current page codepage if none is specified, falling back to UTF-8 if nothing else can be determined.

Changed 10 years ago by Jared Jurkiewicz

Attachment: Codepage.2.patch added

Updated patch, simplifying constructor setting.

comment:11 Changed 10 years ago by bill

I'm not sure if that would be better or not; you'd need to try a test page in a different character set and see if what I checked in breaks in that case, and conversely should check what happens when the editor document contains characters that aren't representable in the main page's character set.

comment:12 Changed 10 years ago by Jared Jurkiewicz

(In [18383]) Improved fix of codepage setting for Editor. !strict refs #9398

comment:13 Changed 10 years ago by Jared Jurkiewicz

Bill,

Liu Cougar liked the solution, and it also defines a way for a user to override the codepage so that it differs from the main page: just define the charset attribute on the editor when you create it to set the encoding mode you want.

comment:14 Changed 10 years ago by Jared Jurkiewicz

In fact, I tried it. I took your test test_Editor_charset.html and converted it to Shift-JIS (using native2ascii and changing the meta tag for encoding). When I use the editor with forced UTF-8, the content is mangled in the iframe (when it wraps/consumes the Shift-JIS content from the page). With the fix above, the editor renders it correctly.

So, this way does work better. It also gives the user the ability to set the charset to UTF-8 if they want it anyway. They can change the editor to any codepage they want, regardless of the parent, if so desired.
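
As a general illustration of why a charset mismatch mangles text, the sketch below encodes a string as UTF-8 and then reads each byte back as if it were one ISO-8859-1 character. This is not the Shift-JIS/IE scenario from the test above, only the same class of problem in its simplest form; the function name `utf8BytesReadAsLatin1` is made up for the example, and it assumes an environment that provides `TextEncoder`.

```javascript
// Illustrative only: show what happens when UTF-8 bytes are interpreted
// with a single-byte charset (ISO-8859-1 here) instead of UTF-8.
function utf8BytesReadAsLatin1(text) {
  // Encode the text as UTF-8 bytes.
  var bytes = new TextEncoder().encode(text);
  var out = "";
  // Treat every byte as one character, as a Latin-1 decoder would.
  for (var i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}

// "ó" (U+00F3) is two bytes in UTF-8 (0xC3 0xB3), so it comes back as
// two unrelated characters; plain ASCII survives unchanged.
var mangled = utf8BytesReadAsLatin1("ó");
var untouched = utf8BytesReadAsLatin1("abc");
```

Non-ASCII characters are the only ones affected, which matches the symptom in this ticket: ASCII text looks fine while characters such as "ółń" are corrupted.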

comment:15 Changed 10 years ago by bill

(In [18405]) Codepage tests for editor, refs #9398.

comment:16 Changed 10 years ago by bill

(In [18406]) Fix svn:eol-style attribute, refs #9398.

comment:17 Changed 10 years ago by bill

Letting the user override the codepage of the iframe is not useful, as shown by the tests I checked in (particularly editor/nls_8859-2.html). That parameter can be removed.

But the editor code can be further simplified by making the editor set its initial content via setValue() (the same way that attr('value', ...) works) rather than supplying it as part of the _iframeSrc (the string referenced from the iframe's src attribute, javascript:parent.dijit.byId("'+this.id+'")._iframeSrc).

I'm going to check in that change.
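
The approach described here can be sketched with plain objects standing in for the real iframe. This is not the dijit implementation: `makeEditorSketch` and the fake body object are hypothetical, and the only point is that the initial content is written through `innerHTML` from script once the iframe has loaded, so it is never parsed as part of the iframe's source markup.

```javascript
// Hypothetical sketch of "set the initial content the same way as
// setValue()" using plain objects instead of a real iframe.
function makeEditorSketch(initialHtml) {
  return {
    editNode: null,        // stands in for the iframe's editable body
    value: initialHtml,    // content taken from the source node

    // Called once the (initially empty) iframe document is ready.
    onLoad: function (iframeBody) {
      this.editNode = iframeBody;
      // Initial content goes through the same path as later updates.
      this.setValue(this.value);
    },

    // Write the content as a script string into the live node.
    setValue: function (html) {
      this.value = html;
      this.editNode.innerHTML = html;
    }
  };
}

// Usage with a fake body object in place of a DOM node.
var fakeBody = { innerHTML: "" };
var editor = makeEditorSketch("<p>ółńżźćąśę</p>");
editor.onLoad(fakeBody);
```

Because the content is handed over as a JavaScript string rather than as bytes the iframe must decode, the iframe's own codepage should no longer affect it.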

comment:18 Changed 10 years ago by bill

(In [18432]) Refactor editor to set its initial content (taken from the srcNodeRef) the same way that attr('value', ...) works: by setting the innerHTML of this.editNode. Initial content is set in the onLoad() method.

This refactor eliminates the need to set the iframe's codepage on IE.

Refs #9398 !strict.

comment:19 Changed 10 years ago by Douglas Hays

Cc: Jared Jurkiewicz added

The public property charset is defined and set but never used now.

comment:20 Changed 10 years ago by bill

(In [18452]) Make FF2 show cursor on blank editors, regressed in [18432], refs #9398 !strict.

comment:21 Changed 10 years ago by bill

Milestone: changed from 1.4 to 1.3.2
Resolution: fixed
Status: changed from closed to reopened

Traced this back to [16560], so it's a new problem in 1.3. Although I think the best fix is the refactor in [18432], I don't want to make that change in the 1.3 branch until it has more testing... so I'll put a minimal fix into the 1.3 branch, based on [18383].

comment:22 Changed 10 years ago by bill

Resolution: fixed
Status: changed from reopened to closed

(In [18478]) Minimal fix to 1.3 branch so that editor can be initialized with non-ascii characters, fixes #9398 !strict.

comment:23 Changed 10 years ago by bill

(In [18822]) Fix another regression from [18432]; on FF, can't select then delete initial text in editor. Refs #9398, fixes #9552 !strict.

comment:24 Changed 10 years ago by bill

(In [19958]) Fix another regression from [18432]; on FF2, reloading test_Editor.html without clearing the cache causes "parent is not defined" errors. Apparently parent hasn't been defined yet when the iframe evaluates its src; maybe it's a race condition. I thought I saw this error on FF3.5/mac too but now I can't reproduce it, so just adding workaround code for FF2 for now. Refs #9398 !strict.