Opened 10 years ago

Last modified 10 years ago

#12428 closed enhancement

Improve performance of docscripts

Reported by: liucougar
Owned by: liucougar
Priority: high
Milestone: 1.7
Component: Doc parser
Version: 1.6.0
Keywords:
Cc:
Blocked By:
Blocking:

Description

The generate.php script currently has two loops with O(n^2) complexity. By introducing a hash, they can be reduced to O(n) complexity.
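The general idea of replacing a nested scan with a hash lookup can be sketched as follows. This is an illustration in Python rather than the PHP from the patch, and the data shapes are assumptions: the ticket does not say what the two loops iterate over, so `symbols`, `references`, and both function names are hypothetical.

```python
# Hypothetical example: the ticket does not describe the real loops in
# generate.php, so the "symbols" and "references" below are made up.

def resolve_quadratic(symbols, references):
    """For each reference, scan the whole symbol list: O(n * m) comparisons."""
    resolved = []
    for ref in references:
        for sym in symbols:
            if sym["name"] == ref:
                resolved.append(sym)
                break  # first match wins
    return resolved


def resolve_with_hash(symbols, references):
    """Build a name -> symbol index once, then look up each reference: O(n + m)."""
    index = {}
    for sym in symbols:
        # setdefault keeps the first symbol seen for a name,
        # matching the early `break` in the nested version
        index.setdefault(sym["name"], sym)
    return [index[ref] for ref in references if ref in index]
```

Both functions return the same result; the second does a single pass over each list instead of rescanning `symbols` for every reference.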

In addition, the default file store is very inefficient: for each new row of information, it creates a new tmp file, copies over the entire contents of the current file, writes the new content, and then renames the tmp file over the current file.

By again using a hash instead of writing to files for every row, the performance can be dramatically improved: this involves only one file read and one file write, instead of file IO proportional to the number of symbols in the parsed code.
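The difference between the two storage strategies can be sketched like this. It is a Python illustration under stated assumptions, not the code from the attached patch: the class names `RewritingFileStore` and `HashStore`, the `add`/`flush` methods, and the JSON file format are all hypothetical.

```python
import json
import os
import tempfile


class RewritingFileStore:
    """Hypothetical store that rewrites the whole file for every new row."""

    def __init__(self, path):
        self.path = path
        open(path, "a").close()  # make sure the file exists

    def add(self, key, value):
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as tmp, open(self.path) as current:
            tmp.write(current.read())                   # copy everything written so far
            tmp.write(json.dumps([key, value]) + "\n")  # then append the new row
        os.replace(tmp_path, self.path)                 # rename tmp over the current file


class HashStore:
    """Hypothetical store that keeps rows in a dict and touches the file twice."""

    def __init__(self, path):
        self.path = path
        self.rows = {}
        if os.path.exists(path) and os.path.getsize(path) > 0:
            with open(path) as f:  # the single read
                self.rows = json.load(f)

    def add(self, key, value):
        self.rows[key] = value  # in memory only, no file IO

    def flush(self):
        with open(self.path, "w") as f:  # the single write
            json.dump(self.rows, f)
```

With the first class, adding n rows copies roughly n^2/2 rows in total; with the second, the cost of `add` does not depend on how much has already been stored, at the price of holding everything in memory until `flush` is called.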

Before the change:

time php generate.php --serialize=xml --store=file dojo dijit
real    3m20.817s
user    3m11.700s
sys     0m8.309s

After applying the attached patch:

time php generate.php --serialize=xml --store=hash dojo dijit
real    0m57.803s
user    0m56.968s
sys     0m0.396s

So with the patch, parsing dojo and dijit is about 3.5x faster than with the current version (real time drops from 3m20.8s to 57.8s). The improvement is much greater when parsing dojo, dijit and dojox together, because the current algorithm has two loops with O(n^2) complexity and file IO proportional to the number of symbols in the parsed content.
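For reference, the speedup can be worked out from the two `real` timings quoted above. The helper name `to_seconds` is just for this sketch.

```python
def to_seconds(minutes, seconds):
    """Convert a `time`-style XmY.YYYs reading to seconds."""
    return minutes * 60 + seconds


before = to_seconds(3, 20.817)  # real time with --store=file
after = to_seconds(0, 57.803)   # real time with --store=hash
speedup = before / after

print(f"{before:.3f}s -> {after:.3f}s, {speedup:.2f}x faster")
```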

Change History (1)

Changed 10 years ago by liucougar

Attachment: 12428.patch added