accept "unwise" characters as valid.  rfc 2396 §2.4.3 says they
aren't valid, but they are used all the time in ads.  it probably
would be good to accept %n grudgingly, as this appears often.
(http://www.computerworld.com/s/article/9138007/Microsoft_No_TCP_IP_patches_for_you_XP)

Notes:
Tue Jan 19 17:06:52 EST 2010 geoff
	should | be escaped too?  if unwise chars appear primarily in
	ads, isn't that a good reason to reject them?

Reference: /n/sources/patch/maybe/webfsunwise
Date: Tue Sep 15 21:11:34 CES 2009
Signed-off-by: quanstro@quanstro.net
Reviewed-by: geoff

--- /sys/src/cmd/webfs/url.c Tue Sep 15 21:08:26 2009
+++ /sys/src/cmd/webfs/url.c Tue Sep 15 21:08:23 2009
@@ -111,11 +111,12 @@
  */
 
 /* RE character-class components -- these go in brackets */
+#define UNWISE "\\[\\]|\\\\^{}`"
 #define PUNCT "\\-_.!~*'()"
 #define RES ";/?:@&=+$,"
 #define ALNUM "a-zA-Z0-9"
 #define HEX "0-9a-fA-F"
-#define UNRES ALNUM PUNCT
+#define UNRES ALNUM PUNCT UNWISE
 
 /* RE components; _N => has N parenthesized subexpressions when expanded */
 #define ESCAPED_1 "(%[" HEX "][" HEX "])"
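
on geoff's question about escaping: a rough sketch of the alternative,
percent-encoding the unwise characters instead of accepting them raw.
this is not what the patch does and not code from webfs. isunwise and
escapeunwise are hypothetical names, and the sketch uses portable C
(string.h) rather than the Plan 9 libraries. the character set is the
"unwise" list from rfc 2396 §2.4.3.

```c
#include <stddef.h>
#include <string.h>

/* rfc 2396 section 2.4.3 "unwise" set: { } | \ ^ [ ] ` */
static const char unwise[] = "{}|\\^[]`";

/* return 1 if c is one of the rfc 2396 unwise characters, else 0 */
int
isunwise(int c)
{
	return c != '\0' && strchr(unwise, c) != NULL;
}

/*
 * hypothetical helper, not part of webfs: copy src to dst, replacing
 * each unwise character with its %XX escape.  returns the number of
 * bytes written (excluding the terminating NUL), or -1 if dst is too
 * small to hold the result.
 */
long
escapeunwise(char *dst, size_t ndst, const char *src)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t n = 0;

	for(; *src != '\0'; src++){
		unsigned char c = (unsigned char)*src;

		if(isunwise(c)){
			/* need three bytes plus room for the NUL */
			if(n + 3 >= ndst)
				return -1;
			dst[n++] = '%';
			dst[n++] = hex[c >> 4];
			dst[n++] = hex[c & 0xF];
		}else{
			/* need one byte plus room for the NUL */
			if(n + 1 >= ndst)
				return -1;
			dst[n++] = (char)c;
		}
	}
	if(n >= ndst)
		return -1;
	dst[n] = '\0';
	return (long)n;
}
```

with this sketch, "/a|b[1]" would come out as "/a%7Cb%5B1%5D", which the
existing ESCAPED_1 component already matches without widening UNRES.
whether rewriting urls this way is acceptable for webfs is a separate
question the patch does not address.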