(picture)

June 15, 2003

Anagram kata

Here's my JavaScript approach to PragDave's Kata 6: Anagrams. Quite an eyeopener.

(Updated version with a threefold performance boost. But still slow!)
(Updated again with some more measurement numbers)

Performance is just awful. Seventy seconds (on my P3/700 laptop). Am I doing something really wrong? Is there an order-of-magnitude improvement to be had with a clever algorithm?

After mulling on it, I don't think so.
Each word from the wordlist needs to be added into a hashtable; the "base" word form, here being the word sorted by its characters, is really just a convenient hash key. I tried a few other types of keys (one literally counting the occurrences of A through Z, and a couple others trying to make a "weaker" but faster hash, thinking to resolve collisions later), and honestly it made no real difference. Slower.

In the end, I have to just conclude that JScript's hashtable implementation (which is fundamental to the language, since it's the basis for method dispatch) totally sucks. Some associated timing (spurred in part by Tim Bray's recent article): JScript "for( key in table )" turns out to be really really slow. Stunningly, incredibly slow:







RunsArray Sizesum1() mssum2() ms
100001007330387507
100010007992381829
100100007831377653
101000007972370773
110000008232370703
(code)

If you can build an array (keyed by number) instead of a hashtable (sparse), it's many orders of magnitude faster to count "for( x=0; x < array.length; x++ )".

In desperation, I turned to Google, and found that Alan Green has a close-equivalent Python implementation running near one second. It's a relief to see that I'm not obviously a dumb programmer. But is JScript performance really this awful? Please, someone, say I'm missing something...!