[hashcash] Re: new format tweak coming...
- From: Hubert Chan <hubert@xxxxxxxxx>
- To: hashcash@xxxxxxxxxxxxx
- Date: Sun, 14 Mar 2004 19:38:06 -0500
>>>>> "Adam" == Adam Back <adam@xxxxxxxxxxxxxxx> writes:
Adam> How about we assume the simplification that '" are not special,
Adam> and are considered part of the string as any other alpha-num.
That sounds fine to me.
Adam> :;,= are special and must be quoted with \.
\ needs to be escaped too, of course.
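To make the quoting rule concrete, here is a minimal sketch of it in Python (the thread's own code is Perl; Python is used here only for illustration). The function names `escape_field` and `unescape_field` are hypothetical, and the set of special characters is taken directly from the lines above.

```python
import re

def escape_field(text: str) -> str:
    """Prefix each of : ; , = and \\ with a backslash."""
    return re.sub(r"([:;,=\\])", r"\\\1", text)

def unescape_field(text: str) -> str:
    """Drop the backslash in front of any escaped character."""
    return re.sub(r"\\(.)", r"\1", text, flags=re.DOTALL)

sample = r"a:b\c=d"
escaped = escape_field(sample)
print(escaped)                             # the escaped form of the sample
print(unescape_field(escaped) == sample)   # round trip check
```

Because the backslash is itself escaped, unescaping is unambiguous: every backslash in the escaped form starts a two-character pair.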
Adam> I think the following should do it, don't have time to test right
Adam> now.
Adam> It's not one regexp, perhaps more advanced use of regexps can
Adam> split this in one go by splitting on [^\\]: etc
Splitting on [^\\]: wouldn't be enough, since that wouldn't split the
string \\:\\:\\:\\:\\ properly. You would have to count the number of \
before each : and check whether it's odd. Even then, a plain split would
treat the character matched before the : as part of the delimiter, so it
would disappear from the field. I don't see any good way of just using
the split function (which I will admit is one advantage that url
encoding has).
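Both problems with the naive split can be seen in a short Python sketch (Python is used only for illustration; `is_delimiter` is a hypothetical helper, not something from the thread). It shows the split swallowing a character, failing to split the all-backslash example, and the backslash-counting check described above.

```python
import re

stamp = r"\\:\\:\\:\\:\\"  # five fields, each holding one escaped backslash

# The pattern needs a non-backslash character before the colon, so this
# string is not split at all, even though every colon is a real delimiter.
print(re.split(r"[^\\]:", stamp))

# When the pattern does match, the character before the colon is consumed
# as part of the delimiter and is lost from the field.
print(re.split(r"[^\\]:", "ab:cd"))

def is_delimiter(text: str, i: int) -> bool:
    """True if the colon at index i follows an even number of backslashes."""
    n = 0
    while i - n - 1 >= 0 and text[i - n - 1] == "\\":
        n += 1
    return n % 2 == 0

print(is_delimiter(stamp, 2))     # colon after two backslashes
print(is_delimiter(r"a\:b", 2))   # colon after one backslash
```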
I would do something like:
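# A field is any run of ordinary characters or backslash-escaped ones.
# qr// is used because in a single-quoted string \\ collapses to one \.
my $field_re = qr/(?:[^\\:]|\\.)*/;
my ($ver,$date,$stamp,$ext,$rand) =
/^($field_re):($field_re):($field_re):($field_re):($field_re)$/;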
(semi-tested) and then do some similar stuff to split $ext.
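For comparison, the same idea can be sketched in Python (used only for illustration). The names `split_stamp` and `split_escaped` are hypothetical, and the assumption that extensions inside the ext field are separated by `;` is mine; the thread only says that `;` is reserved.

```python
import re

# One field: any run of ordinary characters or backslash-escaped characters.
FIELD = r"(?:[^\\:]|\\.)*"
STAMP_RE = re.compile(":".join(["(" + FIELD + ")"] * 5), re.DOTALL)

def split_stamp(stamp: str):
    """Return the five raw (still escaped) fields, or None on mismatch."""
    m = STAMP_RE.fullmatch(stamp)
    return m.groups() if m else None

def split_escaped(text: str, sep: str):
    """Split text on unescaped occurrences of the single character sep."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape pair intact
            i += 2
        elif ch == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts

fields = split_stamp(r"1:040314:foo\:bar:a=b;c=d:rand")
print(fields)
print(split_escaped(fields[3], ";"))   # second pass over the ext field
print(split_stamp(r"\\:\\:\\:\\:\\"))  # the all-backslash example
```

The fields come back still escaped, so a second pass can split the ext field before any backslashes are removed.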
But AFAICT, your Perl code should work more-or-less. At least the
concept seems fine to me. I don't think it matters that it's not a
single regexp -- I don't think it's possible to do full
ver/date/stamp/ext/rand splitting, and split the extensions all at the
same time, all in a single regexp, no matter what encoding you use.
[...]
Adam> so no quotes needed (either " or '), reserved are :;=, and so we
Adam> use restricted url-encoding where only those are urlencoded.
If I understand correctly, you would only need url encoding within the
extension field, and only for non-printable characters, right? In that
case, I think that you could just leave it up to whoever creates an
extension to define what type of encoding he/she wants, but encourage
the use of url encoding.
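As a rough illustration of what a restricted url encoding could look like, here is a Python sketch (Python is used only for illustration). The name `restricted_quote` is hypothetical. Encoding `%` itself, the space, and everything outside printable ASCII are my assumptions; `%` has to be encoded for the result to decode unambiguously.

```python
from urllib.parse import unquote

def restricted_quote(text: str) -> str:
    """Percent-encode only ':', ';', '=', '%' and non-printable bytes."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        # Assumption: space and anything outside 0x21..0x7E count as
        # non-printable for the purposes of this sketch.
        if ch in ":;=%" or byte < 0x21 or byte > 0x7E:
            out.append("%%%02X" % byte)
        else:
            out.append(ch)
    return "".join(out)

encoded = restricted_quote("name=val;x:y")
print(encoded)
print(unquote(encoded))  # the standard decoder reverses it
```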
I'm not against using url encoding. I just think it's best to dictate
as little as possible (or as little as is reasonable). I would imagine
that someone
might create an extension that is pure binary data, and would want to
use base64 encoding instead of url encoding, since it's much more
compact.
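The size difference is easy to check with a small Python sketch (Python is used only for illustration). The 256-byte payload is an arbitrary stand-in for binary extension data.

```python
import base64
from urllib.parse import quote

payload = bytes(range(256))  # every byte value once, standing in for binary data

url_encoded = quote(payload, safe="")                    # %XX for each non-unreserved byte
b64_encoded = base64.b64encode(payload).decode("ascii")  # 4 characters per 3 bytes

print(len(payload), len(url_encoded), len(b64_encoded))
```

One caveat: standard base64 pads with `=`, which is a reserved character here, so an extension using it would presumably have to escape or strip the padding.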
[...]
Hubert> How are : inside the "s supposed to be parsed? I don't see
Hubert> anything wrong with my regexp. As far as I can tell, a : inside
Hubert> "s should be treated just as normal characters.
Adam> yes but the outer level parsing needs to split
Adam> ver:date:resource:ext:rand using the :s and if there are unquoted
Adam> :s it can't do that.
Jason's example wasn't in ver:date:resource:ext:rand format, so it is
unclear exactly what he wanted. In particular, it is unclear whether the
second colon was supposed to delimit two fields or just be part of the
string. Ben's comment seemed (to me) to imply that he understood it as
just being part of the string, which was also my initial understanding
after counting the number of backslashes. In that case (if " are treated
as special characters), I would think that the : should just be treated
as a normal character. All in all, the expected result for that example
is unclear.
Hubert> Speaking of being regexp friendly, though, I would recommend
Hubert> that for the resource field, we either disallow colons, or force
Hubert> them to be escaped (or URL encoded).
Adam> We should use the same mechanism. I think \ quoted : would be
Adam> enough.
Sounds good to me.
--
Hubert Chan <hubert@xxxxxxxxx> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7 5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net. Encrypted e-mail preferred.