Opened 8 years ago
Closed 7 years ago
#17271 closed defect (fixed)
Build is adding wrapper declare if file uses UTF-8 encoding
Reported by: | Michael Schall | Owned by: | Michael Schall |
---|---|---|---|
Priority: | undecided | Milestone: | 1.10 |
Component: | BuildSystem | Version: | 1.9.0 |
Keywords: | Cc: | Rawld Gill | |
Blocked By: | Blocking: |
Description
If a file in my own package uses UTF-8 encoding, the build system puts wraps the output with an extra define.
I'm defining a grid widget with the extentions I want to have across my application. Here is the contents of the file:
define("foo/widget/OnDemandGrid", [ "dojo/_base/declare", "dgrid/OnDemandGrid", "dgrid/Keyboard", "dgrid/Selection", "dgrid/extensions/ColumnResizer", "dgrid/extensions/DijitRegistry"], function (declare, OnDemandGrid, Keyboard, Selection, ColumnResizer, DijitRegistry) { return declare("atg.widget.OnDemandGrid", [OnDemandGrid, Keyboard, Selection, ColumnResizer, DijitRegistry], { selectionMode: "single", cellNavigation: false }); });
When performing a build (on Windows if that matters), the built file looks like the following:
// wrapped by build app define("atg/widget/OnDemandGrid", ["dijit","dojo","dojox"], function(dijit,dojo,dojox){ define("atg/widget/OnDemandGrid", [ "dojo/_base/declare", "dgrid/OnDemandGrid", "dgrid/Keyboard", "dgrid/Selection", "dgrid/extensions/ColumnResizer", "dgrid/extensions/DijitRegistry"], function (declare, OnDemandGrid, Keyboard, Selection, ColumnResizer, DijitRegistry) { return declare("atg.widget.OnDemandGrid", [OnDemandGrid, Keyboard, Selection, ColumnResizer, DijitRegistry], { selectionMode: "single", cellNavigation: false }); }); });
If I edit the source file in Notepad++ and change the encoding to UTF-8 without BOM, the built file does not have the extra wrapper define.
Change History (10)
comment:1 Changed 8 years ago by
Owner: | set to Michael Schall |
---|---|
Status: | new → pending |
comment:2 Changed 8 years ago by
Status: | pending → new |
---|
I was not tagging the files as AMD as I did not have a package.js file in my package folder. It seems strange that file encoding would cause a file to be treated as "Not AMD". I believe most tools on Windows at least, create files with BOM so it would be nice to not have that factor into the AMD detection.
For others with this same issue, I added 2 files (package.js, package.json) to the root of my package. I basically took the files from dgrid as my template. I'll include my package.js here for future reference.
var miniExcludes = { "atg/package": 1 }, isTestRe = /\/test\//; var profile = { resourceTags: { test: function(filename, mid){ return isTestRe.test(filename); }, miniExclude: function(filename, mid){ return /\/(?:test|demos)\//.test(filename) || mid in miniExcludes; }, amd: function(filename, mid){ return /\.js$/.test(filename); } } };
comment:3 Changed 8 years ago by
Resolution: | → invalid |
---|---|
Status: | new → closed |
the problem is not that the encoding is treating your file as "Not AMD", it's that by you not telling the builder explicitly that a file is amd it makes the builder guess. unfortunately, guessing is going to be wrong sometimes (as was found in #17120) and the BOM is what makes it guess wrong in this case.
the guessing is based on a regular expression and the BOM causes the regular expression to fail to match. in #17120 it was a leading ;
that caused the regular expression to fail to match. as you can see, regular expressions are not the best way to guess - there's often going to be some valid case you miss.
this is why the build provides a mechanism for you to explicitly tell the build something is amd and then it doesn't need to make a guess.
also, in #17120 i mentioned that you can also inline resourceTags
in your build profile if you don't want to or can't add package.json and package.js to your codebase but for code you have control over, it's definitely recommended to do as you've done and add those files.
comment:4 Changed 8 years ago by
Cc: | Rawld Gill added |
---|
It's disappointing to see this marked as invalid. Windows is a very common platform, and as schallm said, the default way UTF-8 files are stored on Windows, for example from Notepad, is with a BOM.
It's unfortunate that NodeJS on Windows doesn't remove the BOM (in readFileSync()). However, we already have code in the builder to workaround this behavior, in optimizeCss.js, from #15236:
text = text.replace(/^\uFEFF/, ''); // remove BOM
It seems logical to [instead] put that code in util/build/fs.js.
comment:5 Changed 8 years ago by
If it is that easy to fix, I would give a huge +1 to add the line of code. Why not make it easy for people to build packages?
comment:6 Changed 7 years ago by
@schallm, I made a patch for this (so you wouldn't need your workaround), see https://github.com/wkeese/util/commit/45ba5346f08be2c36d12d8d141b5efe57f32309f
The problem is though that I tried to reproduce this problem (without applying my patch) on Windows 7 as follows, and it didn't reproduce for me:
- edit dojo\number.js in notepad, saving as UTF-8 and thus adding the BOM
- cd util\buildscripts
- build.bat action=clean,release profile=standard optimize=shrinksafe
I then looked at release\dojo\dojo\number.js, but it doesn't have a BOM or an extra define() wrapper. Is there something I missed? Do I really need to create my own package to see this problem? Or perhaps the problem happens with node but I'm using rhino (because that's what build.bat calls)?
comment:7 Changed 7 years ago by
If I do a build w/out any optimization then (without my patch) I can see the BOM appear at the beginning of number.js. Either
node ../../dojo.js load=build action=release profile=standard
or
build.bat action=release profile=standard
The patch fixes that problem, so I guess it's an improvement at least (and probably fixes the problem listed in this ticket).
comment:8 Changed 7 years ago by
Milestone: | tbd → 1.10 |
---|---|
Resolution: | invalid |
Status: | closed → reopened |
I filed https://github.com/dojo/util/pull/19. Probably worth checking in for 1.10 if no one sees any problems.
comment:10 Changed 7 years ago by
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
this sounds like you are not tagging this file as amd - see #17120 for a similar issue. if you tag the file as amd does it fix everything?
also, you don't need BOM with UTF-8 http://www.unicode.org/faq/utf_bom.html#bom5