[hashcash] Re: new format tweak coming...
- From: Hubert Chan <hubert@xxxxxxxxx>
- To: hashcash@xxxxxxxxxxxxx
- Date: Sun, 14 Mar 2004 16:16:40 -0500
>>>>> "Ben" == Ben Laurie <ben@xxxxxxxxxxxxx> writes:
Justin> Oh good, you'd be happy to provide a regex that'll do the above
Justin> left-to-right scan to cope with
Justin> foo:"bar \\\\\\\\\\\\\\\":baz":blargh
Hubert> "([^\"]|\\.)*"
Ben> Doesn't work, either. You also haven't dealt with : inside the "s.
How are : inside the "s supposed to be parsed? I don't see anything
wrong with my regexp. As far as I can tell, a : inside "s should be
treated just like normal characters.
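For illustration, here is a minimal sketch of such a left-to-right scan, in Python rather than Perl. It assumes that a backslash escapes any single character inside double quotes and that a colon inside quotes is an ordinary character. The field pattern and the name `split_fields` are hypothetical and not taken from Adam's spec. One detail matters for input like Justin's: the class for ordinary characters excludes the backslash as well as the quote, so a backslash can only be consumed together with the character it escapes.

```python
import re

# One field: either a double-quoted string with backslash escapes,
# or a (possibly empty) run of characters containing no colon and no quote.
# Assumed grammar for illustration only.
FIELD = re.compile(r'"(?:[^"\\]|\\.)*"|[^:"]*')


def split_fields(line):
    """Split a colon-separated line left to right, honouring quoted fields."""
    fields = []
    pos = 0
    while True:
        # Always matches: the second alternative accepts the empty string.
        m = FIELD.match(line, pos)
        fields.append(m.group(0))
        pos = m.end()
        if pos == len(line):
            return fields
        if line[pos] != ":":
            # For example an unterminated quote or a quote in mid-field.
            raise ValueError(f"unexpected character at offset {pos}")
        pos += 1  # skip the separator
```

With an odd number of backslashes before the inner quote, that quote is escaped and the quoted field runs on through `:baz"`. With an even number, the quote closes the field and the later stray quote is rejected.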
Ben> Before you continue to pursue this line of thought, you might like
Ben> to consider why there are at least two CPAN modules for parsing
Ben> this format (well, CSV) if it's so easy to do?
I wouldn't know. I don't do a whole lot of Perl programming, so I
don't deal with CPAN much. (But I am a "Theory of Computing" TA, so I
know my regexps.) But multiple implementations generally tend to
indicate ease of implementation, rather than difficulty. (e.g. compare
the number of music playing programs vs. the number of database
programs.)
Ben> Alternatively, produce a _tested_ regex that parses the whole
Ben> format, not a subset.
Well, looking at Justin's example again, it isn't even in valid format,
according to the spec Adam gave, so testing is futile, since the
expected output is undefined.
Speaking of being regexp friendly, though, I would recommend that for
the resource field, we either disallow colons, or force them to be
escaped (or URL encoded). While allowing colons can be handled fine
using regexps, the underlying implementation would be inefficient.
(e.g. Perl would have to do a bit of backtracking, and Perl doesn't seem
to be very good at parsing from right-to-left.)
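To make the cost of unescaped colons concrete, here is a hedged sketch in Python of parsing such a stamp without a regexp. It assumes that only one field (the resource) may contain bare colons and that the number of fields on either side of it is fixed and known. The function name, the parameters and the example layout are hypothetical, not the format under discussion.

```python
def split_with_free_middle(stamp, n_before, n_after):
    """Split on ':' when only the middle field may contain bare colons.

    n_before and n_after are the (assumed fixed) numbers of fields
    to the left and to the right of the free-form field.
    """
    # Peel the leading fields off from the left.
    head = stamp.split(":", n_before)
    if len(head) <= n_before:
        raise ValueError("too few fields")
    rest = head.pop()
    # Peel the trailing fields off from the right.
    tail = rest.rsplit(":", n_after)
    if len(tail) <= n_after:
        raise ValueError("too few fields")
    middle = tail.pop(0)
    return head, middle, tail
```

For example, `split_with_free_middle("0:040314:http://example.com:80/:abc", 2, 1)` keeps `http://example.com:80/` together as the middle field. The parser has to work inward from both ends, which is the extra work that disallowing or encoding colons would avoid.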
And for escaping vs. URL encoding, I will admit that URL encoding is
easier to parse right-to-left than escaping. Escaping is still
possible to parse, but it takes a bit more thought.
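The difference can be sketched as follows, again in Python and under assumptions of my own: in the URL-encoded variant a colon inside a field is written `%3A`, and in the escaped variant it is written `\:` with `\\` for a literal backslash. The function names are hypothetical.

```python
from urllib.parse import unquote


def split_urlencoded(stamp):
    """Every bare ':' is a separator, so scan direction does not matter."""
    return [unquote(field) for field in stamp.split(":")]


def is_escaped(s, i):
    """True if s[i] is preceded by an odd number of backslashes."""
    n = 0
    while i - n - 1 >= 0 and s[i - n - 1] == "\\":
        n += 1
    return n % 2 == 1


def rsplit_escaped(stamp):
    """Split on unescaped ':' scanning right to left.

    Fields are returned in left-to-right order with escapes left in place.
    """
    fields = []
    end = len(stamp)
    for i in range(len(stamp) - 1, -1, -1):
        # Each colon needs a look-behind over the run of backslashes.
        if stamp[i] == ":" and not is_escaped(stamp, i):
            fields.append(stamp[i + 1:end])
            end = i
    fields.append(stamp[:end])
    fields.reverse()
    return fields
```

The URL-encoded version needs no look-behind at all, while the escaped version has to count the backslashes before every colon to decide whether it is a separator.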
--
Hubert Chan <hubert@xxxxxxxxx> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7 5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net. Encrypted e-mail preferred.
- Follow-Ups:
- [hashcash] Re: new format tweak coming...
- From: Adam Back
- References:
- [hashcash] Re: new format tweak coming...
- From: Hubert Chan
- [hashcash] Re: new format tweak coming...
- From: Justin Mason
- [hashcash] Re: new format tweak coming...
- From: Hubert Chan
- [hashcash] Re: new format tweak coming...
- From: Ben Laurie