~skeeto/public-inbox

Re: State machines are wonderful tools

Stefan Seering
Message ID
<trinity-3c82a9ef-1f78-4330-85a7-b7015619a0bd-1621335266692@3c-app-gmx-bs69>
Hi, is there a particular reason you wrote

  switch (c) {
  case 0x00: ...
  case 0x2e: ...
  case 0x2d: ...
  default: ...
  }

instead of the following?

  switch (c) {
  case 0x00: ...
  case '.': ...
  case '-': ...
  default: ...
  }

Re: State machines are wonderful tools

Message ID
<20210518123827.biinvqa6ppttolgp@nullprogram.com>
In-Reply-To
<trinity-3c82a9ef-1f78-4330-85a7-b7015619a0bd-1621335266692@3c-app-gmx-bs69>
Good question, Stefan. It's not a strong preference, but here is what
tipped the balance:

1. While writing this program I was already thinking in terms of concrete
values, considering whether there were any subtle ways I (or the compiler)
might exploit the specific ASCII values for dash and dot. Since conversion
between hex and binary is trivial, just looking at 0x2e and 0x2d I can 
tell that they differ in their last two bits. Perhaps this could be used 
for a particularly efficient jump table? Could I transform the input to 
unlock some new optimization?

2. This program works exactly the same (i.e. it always processes UTF-8)
regardless of locale. In C, the specific values of '-' and '.' depend on
the implementation's execution character set, so, for instance, a host
using EBCDIC will have different values for these constants. Even if one
wanted the function to automatically adapt to locale, my table is already
hardcoded to ASCII and so always returns ASCII results. In any case, I'd
like the program to behave identically everywhere, ignoring locale.

3. I like the way it lines up. :-)
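
To make points 1 and 2 concrete, here is a minimal sketch. The names
MORSE_DOT, MORSE_DASH, morse_is_dash, and charset_matches_ascii are
hypothetical, invented for illustration, and are not taken from the
original program. It checks the bit relationship between the two codes
and shows one possible way to test whether the character constants
happen to match the hex values on a given implementation.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum { MORSE_DOT = 0x2e, MORSE_DASH = 0x2d };

/* 0x2e = 0010 1110 and 0x2d = 0010 1101, so the XOR is 0000 0011:
 * the two codes differ only in their low two bits. */
_Static_assert((MORSE_DOT ^ MORSE_DASH) == 0x03,
               "dot and dash differ in the low two bits");

/* Bit 0 alone separates the two: 0 for dot, 1 for dash. This is only
 * meaningful when c is already known to be one of the two codes. */
static int morse_is_dash(uint8_t c)
{
    return c & 1;
}

/* Returns 1 when the execution character set agrees with ASCII for
 * these two characters. On an EBCDIC host this would return 0, while
 * the hex constants above keep the same values everywhere. */
static int charset_matches_ascii(void)
{
    return '.' == 0x2e && '-' == 0x2d;
}
```

Whether a compiler would actually turn this into a better jump table is
an open question here; the sketch only shows the relationship between
the values.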

Perhaps this is too obscure? I originally wrote this 6 months ago, and on
revisiting it I have no trouble understanding it, so it passes that test.
I don't have the ASCII table memorized, so I couldn't tell you off-hand
the codes for dash or dot, but my editor (Vim) can readily convert between
representations, or I could reference a table ("man 7 ascii"). Ultimately,
though, it doesn't matter which is which: the important concept is trie
navigation, not whether left/right is dash/dot.
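
As a rough illustration of trie navigation, here is a minimal sketch
over a hypothetical six-letter table. The table morse_trie, its layout,
and the function morse_decode_small are assumptions made up for this
example and are not the table or code from the original program. The
tree is stored implicitly in an array: node i has its dot child at
2*i+1 and its dash child at 2*i+2.

```c
#include <stddef.h>

/* Hypothetical table, ASCII codes written in hex so the results do not
 * depend on the execution character set. Index 0 is the root and holds
 * no letter. Layout: E (.), T (-), I (..), A (.-), N (-.), M (--). */
static const unsigned char morse_trie[] = {
    0x00, 0x45, 0x54, 0x49, 0x41, 0x4e, 0x4d
};

/* Decode one NUL-terminated Morse letter made of the bytes 0x2e (dot)
 * and 0x2d (dash). Returns the ASCII code of the letter, or 0 if the
 * input is empty, contains another byte, or runs off this small table. */
static unsigned char morse_decode_small(const unsigned char *s)
{
    size_t i = 0;
    for (;; s++) {
        switch (*s) {
        case 0x00: return morse_trie[i];   /* end of input */
        case 0x2e: i = 2*i + 1; break;     /* dot: one child */
        case 0x2d: i = 2*i + 2; break;     /* dash: the other child */
        default:   return 0;               /* not a dot or dash */
        }
        if (i >= sizeof(morse_trie)) {
            return 0;                      /* deeper than the table */
        }
    }
}
```

Swapping which child belongs to dot and which to dash only permutes the
table; the navigation itself stays the same.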