Wednesday, November 11, 2009

Rewrite URL to HTML with PHP

When presenting an article or other information, it's convenient for readers to be able to click the links you refer to. With regular expressions, authors can write clickable links without knowing any HTML.

In the example below, the plain text is held in the variable $str.

Step one: which protocols are allowed? Any protocol you add to the array will be accepted and rewritten.
// Protocols
$protocols = array("http://", "https://", "ftp://");
// Join the protocols with "|" and escape the slashes so the result fits inside the regex
$protocols = str_replace("/", "\/", implode("|", $protocols));
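
To make the effect of step one concrete, the sketch below prints the alternation that ends up inside the pattern. It also shows preg_quote() as one possible alternative to the str_replace() call. That variant, and the variable names $alternation and $quoted, are illustrative assumptions and not part of the original code.

```php
<?php
$protocols = array("http://", "https://", "ftp://");

// Same approach as above: join with "|" and escape the "/" regex delimiter.
$alternation = str_replace("/", "\/", implode("|", $protocols));
echo $alternation, "\n"; // http:\/\/|https:\/\/|ftp:\/\/

// Assumed alternative: preg_quote() escapes the delimiter and any other
// regex metacharacters, which matters if a protocol ever contains "." or "+".
$quoted = implode("|", array_map(function ($p) {
    return preg_quote($p, "/");
}, $protocols));

// Both variants can be dropped into a pattern as a capture group.
var_dump(preg_match("/^({$alternation})/", "ftp://example.com")); // int(1)
var_dump(preg_match("/^({$quoted})/", "https://example.com"));    // int(1)
```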

Step two: add http:// to URLs written without a protocol, such as www.example.com
$str = preg_replace("/(?<=^|\(|\s)(www\.[A-Za-z0-9-_]+\.[A-Za-z\.]{2,6}(?:[\/\?].*)?)(?=[\.,!\?]*(?:\s|\(|\)|$))/U", "http://$1", $str);
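
The pattern in step two is dense, so here is an annotated reading of it. This is only a sketch: it assumes the x (extended) flag and a ~ delimiter, neither of which the original code uses, and $wwwPattern is a hypothetical name. The pattern itself is intended to be the same as the one above.

```php
<?php
// Step two's pattern, spread out with the "x" flag so each part can carry a comment.
$wwwPattern = '~
    (?<=^|\(|\s)              # preceded by start of text, "(" or whitespace
    (                         # group 1: the URL itself
        www\.                 #   literal "www."
        [A-Za-z0-9-_]+        #   domain name
        \.                    #   dot before the top-level part
        [A-Za-z\.]{2,6}       #   2 to 6 letters or dots, e.g. "com" or "co.uk"
        (?:[\/\?].*)?         #   optional path or query string
    )
    (?=[\.,!\?]*(?:\s|\(|\)|$))   # followed by optional punctuation, then whitespace, a bracket or end of text
~xU';

// The lookahead keeps trailing punctuation out of the match.
echo preg_replace($wwwPattern, 'http://$1', "Visit www.example.com!"), "\n";

// A URL that already has a protocol is left alone, because "www" follows a "/".
echo preg_replace($wwwPattern, 'http://$1', "http://www.example.com"), "\n";
```

The U flag makes every quantifier lazy, which is why the match stops at the first point where the lookahead succeeds instead of running to the end of the text.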

Step three: replace all accepted URLs with HTML links
$str = preg_replace("/(?<=^|\(|\s)({$protocols})((?:[A-Za-z0-9-_]+\.)?[A-Za-z0-9_-]+\.[A-Za-z\.]{2,6}(?:[\/\?].*)?)(?=[\.,!\?]*(?:\s|\(|\)|$))/U", "<a href=\"$1$2\">$2</a>", $str);

This code handles URLs written in the following ways:
www.example.com
http://www.example.com
http://sub.example.com
http://example.com

www.example.com/example.html
www.example.com/exempe/
www.example.com?id=1
www.example.com/?id=1
www.example.com?id=1#23
www.example.com/example/#23

(http://example.com)
(example: www.example.com)
www.example.com/example)
www.example.com,
www.example.com?
www.example.com!
http://example.com. - The dot won't be a part of the link
http://example.com/. - The dot won't be a part of the link
www.example.com?id=id=123.43. - The last dot won't be a part of the link

But not with bare domains that lack both a protocol and a leading www:
example.com
sub.example.com
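
The three steps can be wrapped in a single function and tried against some of the inputs listed above. This is a minimal sketch: the function name linkify() and the sample list are assumptions for illustration, and it assumes $str is trusted plain text, since nothing here escapes existing HTML.

```php
<?php
// Hypothetical helper combining steps one to three from this post.
function linkify($str, array $protocols = array("http://", "https://", "ftp://"))
{
    // Step one: build the protocol alternation for the regex.
    $alternation = str_replace("/", "\/", implode("|", $protocols));

    // Step two: prefix bare www. addresses with http://
    $str = preg_replace("/(?<=^|\(|\s)(www\.[A-Za-z0-9-_]+\.[A-Za-z\.]{2,6}(?:[\/\?].*)?)(?=[\.,!\?]*(?:\s|\(|\)|$))/U", "http://$1", $str);

    // Step three: wrap every accepted URL in an anchor tag.
    $str = preg_replace("/(?<=^|\(|\s)({$alternation})((?:[A-Za-z0-9-_]+\.)?[A-Za-z0-9_-]+\.[A-Za-z\.]{2,6}(?:[\/\?].*)?)(?=[\.,!\?]*(?:\s|\(|\)|$))/U", "<a href=\"$1$2\">$2</a>", $str);

    return $str;
}

// A few of the cases from the lists above.
$samples = array(
    "www.example.com",
    "http://example.com.",
    "(example: www.example.com)",
    "example.com",
);

foreach ($samples as $sample) {
    echo $sample, "  =>  ", linkify($sample), "\n";
}
```

With these patterns, "www.example.com" should come back as a link to http://www.example.com with the visible text www.example.com, while "example.com" should pass through unchanged.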
