Opened 9 years ago

Closed 9 years ago

#10595 closed defect (fixed)

dojo.hash decodes hash unconditionally

Reported by: cvogt Owned by: Rob Retchless
Priority: high Milestone: 1.4.1
Component: General Version: 1.3.2
Keywords: Cc: miksago, Adam Peller
Blocked By: Blocking:

Description

Ticket #9969 delivered a patch to hash.js which no longer decodes the hash. There was also some points made as to why decoding the hash is unwanted. But then #7925 re-instated the decoding of the hash to fix a failing test case.

The following is a test resembling a scenario I have encountered where I set the hash similar to a query string where a param value is encoded name value pairs. Using dojo#queryToObject to parse the hash fails to give me the expected result.

I would assume that whatever value I set to the hash is the exact same value that I can retrieve from the hash.

{
    name: "Setting the hash to 'data=foo%3Dbar%26bar%3Dfoo&test=1'",
    setUp: function(){
        dojo.hash('data=foo%3Dbar%26bar%3Dfoo&test=1');
    },
    runTest: function(t){
        t.is('data=foo%3Dbar%26bar%3Dfoo&test=1', dojo.hash());
    },
    tearDown: function(){
        setHash();
    }
}
     _AssertFailure: doh._AssertFailure: assertEqual() failed:
 	expected
		data=foo%3Dbar%26bar%3Dfoo&test=1
	but got
		data=foo=bar&bar=foo&test=1

: assertEqual() failed:
	expected
		data=foo%3Dbar%26bar%3Dfoo&test=1
	but got
		data=foo=bar&bar=foo&test=1

 doh._AssertFailure: assertEqual() failed:
     expected
		data=foo%3Dbar%26bar%3Dfoo&test=1
	but got
		data=foo=bar&bar=foo&test=1

     ERROR IN:
 		 function (t){
				t.is('data=foo%3Dbar%26bar%3Dfoo&test=1', dojo.hash());
			}
 FAILED test: Setting the hash to 'data=foo%3Dbar%26bar%3Dfoo&test=1' 1 ms

I would also argue that the failing test case ("test with spaces"), where the hash is set to 'test%20with%20spaces' and the expected value is "test with spaces", is incorrect as the input and output values are different due to decoding.

Change History (8)

comment:1 Changed 9 years ago by Adam Peller

Cc: miksago added; Rob Retchless removed
Owner: changed from anonymous to Rob Retchless

comment:2 Changed 9 years ago by Rob Retchless

Darn. The decodeURIComponent call in the dojo.hash() getter was added at the last minute to normalize some cross-browser idiosyncrasies regarding automatic encoding of spaces, but has introduced the need to doubly encode values for cases such as this.

Also, a common use case is: dojo.queryToObject(dojo.hash()), both of which decode. So in this case, triple encoding is required to fully preserve encoded segments.

These idiosyncrasies are entirely due to spaces and are avoidable by ensuring that clients encode segments with spaces before setting (the way that dojo.objectToQuery does it). Here are a few examples:

  • dojo.hash(" test") followed by dojo.hash(): FF gets "%20test", IE/Webkit/Opera gets " test".
  • dojo.hash("test with spaces") followed by dojo.hash(): FF gets "test%20with%20spaces", IE/Webkit/Opera gets "test with spaces".

So I propose that we remove the explicit decodeURIComponent call from the dojo.hash() getter, remove the tests that set values with unencoded spaces in them, and then document the fact that users are responsible for encoding/decoding. This will require an adoption for some though.

Thoughts?

comment:3 Changed 9 years ago by James Burke

Milestone: tbd1.4.1

Want to see about getting this in for Dojo 1.4.1 so we can set the API correctly. Need a bit more time to study the issue, hopefully later tonight.

comment:4 Changed 9 years ago by James Burke

retchless: What about the following algorithm, just tried in consoles in IE, Safari and FF 3, so you should double-check it:

  • When setting, always set location.hash = '#' + encodeURIComponent(value);
  • When reading the value, always use location.href, and strip off the fragment ID/hash.
  • Get the stripped off hash from location.href and do a decodeURIComponent() on it.

At least for IE and Firefox and Safari I seemed to get consistent results. Just tried Opera 10 on its command line too.

It seems that setting via location.hash with the value encoded and starting with a # but then reading the location.href may preserve the right levels of encoding?

If that seems to work for you too, if you whip up a patch for the module, I am happy to then apply it so we can get it in Dojo 1.4.1.

comment:5 Changed 9 years ago by Mike Wilson

What's the reason for searching location.href instead of using location.hash directly?

(I agree totally that encode/decode must match, having just one of them is wrong. I don't have a strong opinion if this means having both or neither, as long as it's symmetric.)

comment:6 Changed 9 years ago by Rob Retchless

@mikewse, completely agree that symmetry is required.

@jburke, there are unfortunately some bad downsides to always encoding the input:

dojo.hash(dojo.objectToQuery({test:"test test"})) ends up getting doubly encoded: "#test%3Dtest%2520test" instead of "#test=test%20test".

Let's just remove the decode and document the fact that it's the responsibility of consumers to encode/decode their input/output. Agreed? If so, I've got a patch in hand.

comment:7 Changed 9 years ago by Rob Retchless

Oh, @mikewse the reason we search location.href instead of location.hash is due to an issue regarding question marks in the hash. http://trac.dojotoolkit.org/ticket/9969

comment:8 Changed 9 years ago by Adam Peller

Resolution: fixed
Status: newclosed

(In [21190]) Leave encoding/decoding up to user. Patch from retchless. Fixes #10595, Refs #7925 !strict

Note: See TracTickets for help on using tickets.