KBD

Keith Devens .com

Friday, July 4, 2008

Tuesday, November 22, 2005

uriescape

I've been meaning to write a replacement for PHP's built-in urlencode so that characters that don't have to be escaped in the path part of a URI won't be. Here 'tis:

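<?php
function uriescape($uri){
    // Anything outside this set gets percent-escaped. The space is left
    // alone here so that it can be turned into '+' below.
    static $invalid_chars = '/([^-A-Za-z0-9_.!~*\'():@&=$, ])/e';
    // The /e modifier evaluates the replacement string as PHP code.
    return str_replace(' ', '+', preg_replace($invalid_chars, '\'%\'.dechex(ord(\'$1\'))', $uri));
}

$uri = ' -_.!~*\'():@&=+$,';
var_dump(urlencode($uri));
var_dump(uriescape($uri));
?>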

Gives:

string(43) "+-_.%21%7E%2A%27%28%29%3A%40%26%3D%2B%24%2C"
string(19) "+-_.!~*'():@&=%2b$,"
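
The /e modifier that uriescape relies on was deprecated in PHP 5.5 and removed in PHP 7, so here is a rough sketch of the same idea using preg_replace_callback. The name uriescape_cb is made up for illustration, and padding the hex value to two digits with sprintf is my own addition (dechex alone gives a single digit for bytes below 0x10). Treat it as a sketch under those assumptions rather than a drop-in replacement:

```php
<?php
// Hypothetical variant of uriescape(); the name uriescape_cb is an
// assumption for illustration and does not appear in the original code.
function uriescape_cb($uri){
    // Same character class as uriescape(): everything listed is left alone,
    // including the space, which is turned into '+' afterwards.
    static $invalid_chars = '/[^-A-Za-z0-9_.!~*\'():@&=$, ]/';
    $escaped = preg_replace_callback(
        $invalid_chars,
        function ($m) {
            // '%' followed by the byte value as two lowercase hex digits.
            return sprintf('%%%02x', ord($m[0]));
        },
        $uri
    );
    return str_replace(' ', '+', $escaped);
}

var_dump(uriescape_cb(' -_.!~*\'():@&=+$,'));
```

Since the pattern has no /u modifier it works byte by byte, so a multi-byte UTF-8 character comes out as one escape per byte.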

Note that this is only valid for the path component of the URI. In particular, it's not valid for a query string, since the extra characters ":", "@", "&", "+", ",", and "$" are reserved there. Though "+" is interpreted as a space anyway, so I don't know what it means to say that it's reserved in the query but not in a path segment. And I don't know why characters like "/" and "?" are reserved in the query as well.

Update: Fixed so that the '+' would still be escaped, as it needs to be, despite what the spec says, to not be interpreted as a space. Where did that convention of using a '+' to represent a space come from anyway? I didn't notice that specified anywhere in the URI spec.
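
To see the '+' convention in action, PHP's own paired built-ins are a handy illustration: urlencode and urldecode follow the '+'-for-space convention, while rawurlencode and rawurldecode do plain percent-encoding. A small sketch (the sample strings are arbitrary):

```php
<?php
// Encoding: only urlencode() turns the space into '+'.
var_dump(urlencode('a b+c'));      // a+b%2Bc
var_dump(rawurlencode('a b+c'));   // a%20b%2Bc

// Decoding: only urldecode() turns a literal '+' back into a space.
var_dump(urldecode('a+b%2Bc'));    // a b+c
var_dump(rawurldecode('a+b%2Bc')); // a+b+c
```

So a literal '+' only survives a round trip through urldecode if it was escaped as %2B first, which is why uriescape escapes it.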


Comments

Adam V. (http://adamv.com) wrote:

The HTML spec mentions turning spaces to +, and then escaping per the URI RFC. As far as I can tell it's a more or less arbitrary decision made back in the HTML 2 days, but I haven't found anything more specific yet.

∴ Adam V. | 22-Nov-2005 8:31pm est | http://adamv.com | #8724

Jim wrote:

Thank you very much, Keith, for sharing your code examples! I certainly appreciate them.

∴ Jim | 22-Nov-2005 10:44pm est | #8725

Keith (http://keithdevens.com/) wrote:

> ...arbitrary decision made back in the HTML 2 days...

A lot of this seems like I can do whatever I want and nothing along the way will care, as long as I simply comply with what browsers and PHP expect.

Keith | 23-Nov-2005 12:17am est | http://keithdevens.com/ | #8727
