Destyling requires that I apply a homomorphism to the DOM tree. That’s just an elaborate way of saying, “ignore bold tags and the like.” Ignoring the bold tags is a bit tricky, because I need to calculate pre-images: given the result of removing the stylistic elements, where did this part of the tree come from?
To accomplish this feat, I clone the tree, then walk the clone and remove style nodes while simultaneously associating each node in the original tree with a node in the clone and vice versa (except, of course, that the reverse map is one-to-many). Complex pages usually have two to three thousand nodes, which isn’t bad except that I’m using JavaScript in the browser.
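Here is a minimal sketch of that mapping, not the actual code. It makes several assumptions: plain `{ tag, children }` objects stand in for real DOM nodes, the `STYLE_TAGS` list is illustrative, the root is assumed not to be a style node, and the names `node`, `destyle`, `toClone`, and `preImage` are made up for the example. It also builds the destyled copy in a single pass rather than cloning first and pruning afterwards, and it leans on the built-in `Map`, which browsers of that era did not have.

```javascript
// Tag names treated as purely stylistic (an illustrative list only).
const STYLE_TAGS = new Set(["b", "i", "em", "strong", "u", "font"]);

// Minimal stand-in for a DOM node. Text nodes use the tag "#text".
function node(tag, ...children) {
  return { tag, children };
}

// Build a destyled copy of `root` and record the mapping in both directions.
//   toClone:  original node -> the clone node it corresponds to (many-to-one)
//   preImage: clone node -> array of original nodes mapped to it (one-to-many)
function destyle(root) {
  const toClone = new Map();
  const preImage = new Map();

  function link(original, clone) {
    toClone.set(original, clone);
    if (!preImage.has(clone)) preImage.set(clone, []);
    preImage.get(clone).push(original);
  }

  // Copy `original` under `cloneParent`. A style node gets no clone of its
  // own: it maps to `cloneParent`, and its children are hoisted into it.
  function visit(original, cloneParent) {
    if (STYLE_TAGS.has(original.tag)) {
      link(original, cloneParent);
      for (const child of original.children) visit(child, cloneParent);
      return;
    }
    const clone = { tag: original.tag, children: [] };
    link(original, clone);
    cloneParent.children.push(clone);
    for (const child of original.children) visit(child, clone);
  }

  // Assumption: the root itself is never a style node.
  const cloneRoot = { tag: root.tag, children: [] };
  link(root, cloneRoot);
  for (const child of root.children) visit(child, cloneRoot);
  return { cloneRoot, toClone, preImage };
}
```

Calling `destyle(node("p", node("b", node("#text"))))` yields a clone in which the text node hangs directly off the paragraph, and `preImage.get(cloneRoot)` lists both the original paragraph and the bold node that was folded into it.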
Discovering the transformation was taking 20 seconds: completely unacceptable. So I pulled out Venkman, Mozilla’s good-as-gold debugger and profiler, and discovered that the problem was in maintaining the mapping. Not much of a surprise, since the class I use for maintaining it is a bandage for the fact that Mozilla doesn’t hash DOM nodes effectively.
A few tweaks later, the cost was down to two seconds: acceptable for our purposes. Without the profiler, I wouldn’t have known that the hashing was the problem.
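I won't reproduce the tweaks here, but to give a feel for what a node-keyed map can look like when the engine won't hash nodes for you, here is one hypothetical shape for such a class. Everything in it is an assumption for illustration: the name `NodeMap`, the `__uid` property stamped onto each node, and the plain-object storage. The idea is to give every node a unique id once, so that each lookup is a property access rather than a linear search through a list of nodes.

```javascript
// Hypothetical node-keyed map. Each node is stamped once with a unique
// numeric id (`__uid`, an invented property name), and values are stored in
// a plain object keyed by that id.
class NodeMap {
  constructor() {
    // No prototype, so ids can never collide with inherited property names.
    this.entries = Object.create(null);
  }

  // Return the node's id, assigning a fresh one on first use.
  static idOf(node) {
    if (node.__uid === undefined) node.__uid = NodeMap.nextId++;
    return node.__uid;
  }

  set(node, value) {
    this.entries[NodeMap.idOf(node)] = value;
  }

  // Reads never stamp a node; an unstamped node cannot be in any map.
  get(node) {
    return node.__uid === undefined ? undefined : this.entries[node.__uid];
  }

  has(node) {
    return node.__uid !== undefined && node.__uid in this.entries;
  }
}
NodeMap.nextId = 1;
```

In engines that provide the built-in `Map` or `WeakMap`, objects can be used as keys directly and a wrapper like this is unnecessary.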
Perhaps you’re cleverer than I am and can spot the weakest link by intuition. I can’t, hence the mantra: no optimization without profilation.