Opened 9 years ago

Closed 8 years ago

#11288 closed defect (wontfix)

[patch][ccla]dojo.toJson does not strip control characters as per JSON Spec

Reported by: mdknapp Owned by: Kris Zyp
Priority: high Milestone: tbd
Component: General Version: 1.3.2
Keywords: json control characters Cc:
Blocked By: Blocking:

Description

The JSON Spec specifies control characters are not allowed in JSON. Currently (as of the nightly) Dojo replaces all control characters with with applicable CEC (Character Escape Codes) but it does *not* strip control characters that do not have an equivalent CEC. This causes compatibility conflicts with other JSON parsers.

I have attached a diff (1.3.2) for review that will fix the issue and should be good up through the current nightly.

CLA is on file.

Test Case:

var t1 = { a:  '\x01\x02\x7F\n'};
console.log(t1.a.length);
var t1_json = dojo.toJson(t1);
var t2 = dojo.fromJson(t1_json);
console.log( (1==t2.a.length)?'success':'failure' );

Attachments (1)

json_control_code_fix.diff (1.4 KB) - added by mdknapp 9 years ago.
Diff with fix for ticket #11288

Download all attachments as: .zip

Change History (9)

Changed 9 years ago by mdknapp

Attachment: json_control_code_fix.diff added

Diff with fix for ticket #11288

comment:1 Changed 9 years ago by mdknapp

This becomes a larger issue when using JSON as the RPC protocol and passing in json encoded objects to the server. The JS layer can have control characters in the string they encode to send to the server via JSON over AJAX. The PHP (or other JSON decoder) layer then will fail to decode it if it adheres to the JSON spec (which the json_decode in PHP does).

comment:2 Changed 9 years ago by Adam Peller

Owner: changed from anonymous to Kris Zyp
Summary: dojo.toJson does not strip control characters as per JSON Spec[patch][ccla]dojo.toJson does not strip control characters as per JSON Spec

comment:3 Changed 9 years ago by Kris Zyp

Adam, do you think this is an acceptable size increase for dojo base?

comment:4 Changed 9 years ago by Adam Peller

/me shrugs. Probably worth posing to the list. It's not large, but every little bit counts. Perhaps it would be worth parameterizing the replace with a function and a switch instead of doing multiple passes with replace? Could be better in speed and size? Also, do we need the flag, or can we do this unconditionally?

comment:5 Changed 9 years ago by James Burke

We can pick up this change for base, the size when minified is likely not to be significant. More important is making sure it is correct. I am OK with the patch as-is (as far as structure), but Kris, if you want to do it slightly differently for correctness/speed concerns and is around the same size or smaller, go for it.

comment:6 Changed 9 years ago by Kris Zyp

I don't get it looks like your control characters are in the strings, not in the JSON. This doesn't seem to align with native browser behavior:

t1 = { a: '\x01\x02\x7F\n'} JSON.stringify(t1)

"{"a":"\u0001\u0002\n"}"

JSON.parse(JSON.stringify(t1)).a.length

4

comment:7 Changed 9 years ago by Kris Zyp

I am guessing that we should be escaping these characters (and we currently aren't)?

comment:8 Changed 8 years ago by Kris Zyp

Resolution: wontfix
Status: newclosed

Regardless of what the spec says, all browser vendors agree on not stripping: JSON.parse(JSON.stringify({ a: '\x01\x02\x7F\n'})).a.length -> 4

Note: See TracTickets for help on using tickets.