Bad Code Smells in WACT, a refactoring review
October 18th, 2004This post shows some bad design elements in WACT and discusses how to improve them. It lists some related bad code smells in the WACT template compiler. It is a non-trivial example of refactoring in an established application and an illustration of OO design issues.
related files:
treebuilder.inc.php
parserstate.inc.php
tagjudge.inc.php
tagdictionary.inc.php
input.tag.php
The TreeBuilder and ComponentParsingState classes may need to have code re-distributed between the two.
The TreeBuilder class performs two distinct tasks: building the tree (Managing the Component and ParentComponent fields) and constructing the component. This may be an example of a Large Class code smell. It is certainly an example of poor class cohesion.
The openBranch method of TreeBuilder has the job of both constructing the component and adding it to the tree. Breaking this method into two distinct functions would clarify the code and remove a Long Parameter List smell from openBranch.
The openBranch and closeBranch methods of TreeBuilder have boolean parameters. This makes these calls to TreeBuilder from the ComponentParsingState more confusing. (more on refactoring parameters and boolean parameters).
The closeBranch method performs two tasks. setting the hasClosingTag field is in this method for convenience. It is really a part of the construction of the component and probably belongs closer to openBranch. (conveniently eliminating the need for having a boolean parameter in closeBranch).
The Component construction methods on TreeBuilder pass along a Locator object through a deep calling tree. The Locator object is a field on the ComponentParsingState class. The many methods on TreeBuilder for constructing a component that are called from ComponentParsingState may be an example of the Feature Envy code smell. There seems to be some excessive coupling between ComponentParsingState and TreeBuilder. Perhaps the construction methods really belong to ComponentParsingState, not TreeBuilder?
The startElement, endElement, and emptyElement methods of ComponentParsingState show an example of the Innappropriate Intimacy smell by manipulating the component field of the TreeBuilder class ($this->TreeBuilder->Component->). They do this as part of constructing the component. This is probably further evidence that the component should be constructed inside the ComponentParsingState class and passed as a whole unit into the openBranch method of the TreeBuilder class for addition into the tree. Another example of excessive coupling.
The isServerTagComponentTag method of the TagJudge class instantiates a dummy component to examine a local variable containing information about the component and then throws the component away. The TagInfo records in the TagDictionary are supposed contain metadata about each component. The information returned by isServerTagComponentTag on the Component class should be moved into the TagInfo meta-data. Everytime isServerTagComponentTag is called, a TagInfo object should instead be available to call, thus elminating the isServerTagComponentTag method from the TagJudge Class. Eliminating the unnecessary object instantiation will probably result in a tiny performance improvement.
The calling of the isComponent method on TagJudge from ComponentParsingState is always a two stage process. First isComponent is called on TagJudge, then getTagInfo is called on TagDictionary. That combined with the call to getTagInfo inside the implementation of isComponent and the TagDictionary field of TagJudge, used only in isComponent, suggests that perhaps the isComponent method is really a sophisticated find method for tag information and should be a combined findComponent method on the TagDictionary class that returns NULL if no appropriate TagInfo class can be found.
Right now, the two stage component finding process in ComponentParsingState means that there is a one to one relationship between HTML tags and compile time component tags. However, the Switch Statement code smell in the InputTag suggests that a one to one relationship is not correct, and that we really need a one to many relationship. The code cannot support a one to many relationship as long as the finding of a TagInfo object is broken up into two stages and not encapsulated into the same place.
The fact that both methods on TagJudge can be moved to other classes, suggests that TagJudge is an example of the Lazy Class code smell. The TagJudge class can probably be eliminated.
The InputTagInfo class shows the Data Class code smell. instead of defining a unique class for each tag to be registered, a common TagInfo class should be available. An instance of this class could then be initialized and registered. Then the common TagInfo class would be available to move behavior to in further refactoring efforts. Additionally, this would allow the information in the TagDictionary to be loaded from configuration files, instead of requiring tag files to be included, a slow and unncessary process.
These refactoring issues are related to three WACT feature requests. Each of these three requests should largely be a change to the TagDictionary class and the (nonexistent) TagInfo class. The fact that theses features require broad changes across the template compiler is an example of the Shotgun Surgery code smell and an example of poor encapsulation around the TagInfo and TagDictionary classes.
Allow multiple components per tag
Convert TagInfo to XML based format
Lazy loading for Compiler Components
Note that implementing the lazy loading will have a significant impact on compilation times, illustrating that fully refactored normalized code is easier to performance tune.
Anyone else see any additional code smells here? Anyone up for a non-trivial refactoring learning exercise? I hope to be able to eventually post the refactored versions of these files.
October 19th, 2004 at 4:04 am
Maybe it’s a bit offtopic, yet i find the following feature very important and nice to have in WACT - custom tags loader.
Something like WACT :: registerTagsLoader($loader);
WACT could use some default tags loader if none is registered.
This way it’ll simplify extending WACT by 3d party software tools…
October 25th, 2004 at 11:31 am
The bit that I’ve never understood is, what is a DataSpace? Do you have a release date planned for the next version?
October 26th, 2004 at 5:01 am
Great write up. Should confess I’m guilty in most cases - probably Shotgun Surgery.
Another one to watch is the relationship between ExpressionFilterFindingParser and ExpressionFilterParser (expression.inc.php) and in the latter, the ways Args is used is dubious.
Off the specific topic of bad smells, some particular issues;
The way compileall.inc.php works also needs review I think. For a start seems to bypass various code in a way that makes it’s behaviour unpredictable plus that constant WACT_ERROR_CONTINUE is a hack.
Find Jason’s list / script bug unsettling. The new HTMLParser is generally a big improvement but it’s now got embedded knowledge of HTML (the script tag). Somehow think we’re going to need to have a second stage of parsing. Also what about other document formats (where there happens to be some kind of script tag)?
Wonder if we need to consider some kind of document detection at last. There’s still the issue with form tags and XHTML. And if a document does claim to be well formed XML, perhaps we can switch to a real XML parser?
July 27th, 2005 at 7:36 am
I am doing a project where i am anaylsing how bad code documenatation becomes a nightmare for new/future developer.
i would be obliged if u can help me in any form.
thanks……….