
September 27, 2002

Data path

Right now my work includes some hairy string handling. We're pulling text and XML from a web service, parsing it, serializing it (and chopping up the resulting strings), deserializing it, storing it on disk, and pushing it back out to web services...

That includes data conversions (dates as milliseconds since the Unix epoch, ISO 8601 text, locale text, and a few others; numbers, currencies, single- and multi-line text), XML entity encoding and decoding, URL encoding... you get the idea. I'm obsessive about doing this right, because if it's wrong it will break unpredictably. Not just with weird international stuff, but with regular users typing regular text.
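To give a feel for the conversions listed above, here is a minimal sketch using only the Python standard library. The choice of Python and every function name are assumptions made for illustration; the post doesn't say what language or helpers the project actually uses. The point is that each conversion comes as an encode/decode pair that should round-trip exactly.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote, unquote
from xml.sax.saxutils import escape, unescape

# All names below are hypothetical, chosen for this sketch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def millis_to_iso8601(ms: int) -> str:
    """Milliseconds since the Unix epoch -> ISO 8601 text in UTC."""
    # Integer timedelta arithmetic avoids float rounding on the milliseconds.
    return (EPOCH + timedelta(milliseconds=ms)).isoformat()


def iso8601_to_millis(text: str) -> int:
    """ISO 8601 text (with a UTC offset) -> milliseconds since the Unix epoch."""
    return (datetime.fromisoformat(text) - EPOCH) // timedelta(milliseconds=1)


def xml_encode(text: str) -> str:
    """Escape &, < and > for use in XML character data."""
    return escape(text)


def xml_decode(text: str) -> str:
    """Reverse xml_encode."""
    return unescape(text)


def url_encode(text: str) -> str:
    """Percent-encode everything except unreserved characters."""
    return quote(text, safe="")


def url_decode(text: str) -> str:
    """Reverse url_encode."""
    return unquote(text)
```

For example, `millis_to_iso8601(1033084800000)` gives `2002-09-27T00:00:00+00:00`, and `xml_encode("Tom & Jerry <3")` gives `Tom &amp; Jerry &lt;3`. Regular users typing regular text will produce ampersands and angle brackets sooner or later, which is why each layer needs its inverse applied exactly once.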

I'm a strong believer in the notion of a "clean data path". This may be influenced by having spent a lot of time in localization work, where it's really important to (a) know what character set you're dealing with, and (b) avoid unnecessary string operations: text strings are way too subtle and complex, and the cause of much pain. Any time you take apart a string, rearrange its contents, or reassemble a string, there's plenty of room for errors.
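One way to picture points (a) and (b) is to decode bytes exactly once where they enter the system, work on text in between, and encode exactly once on the way out. The sketch below is an illustration in Python under that assumption, with hypothetical function names; it is not the code from the project described here.

```python
def from_wire(raw: bytes, charset: str) -> str:
    """Decode at the boundary, using the charset the source declared.

    Strict decoding raises UnicodeDecodeError instead of guessing, so a
    wrong or missing charset shows up here rather than downstream.
    """
    return raw.decode(charset, errors="strict")


def to_wire(text: str, charset: str = "utf-8") -> bytes:
    """Encode once, at the point where the text leaves the system."""
    return text.encode(charset, errors="strict")


raw = "café".encode("utf-8")

# (a) Knowing the character set: the same bytes read as Latin-1 are garbage.
good = from_wire(raw, "utf-8")    # 'café'
bad = from_wire(raw, "latin-1")   # 'cafÃ©'

# (b) Avoiding unnecessary string operations: chopping the encoded form
# can split a multi-byte character in half.
try:
    from_wire(raw[:4], "utf-8")
except UnicodeDecodeError:
    chopped_ok = False
else:
    chopped_ok = True
```

Here `bad` decodes without any error, which is the unpredictable kind of breakage: nothing fails until a user sees the mangled text. The chopped slice at least fails loudly under strict decoding.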

The other influence here is electronic engineering, and the "signal path". What we're building, if it's right, will be a very close analog of another obsession of mine: the Quad 405 amplifier. It rocks.

Comments

Send three and fourpence, we're going to a dance. :-)
