In the example below, the plain text to process is held in the variable $str.
Step one: decide which protocols are allowed. Any protocol you add to the array will also be turned into a link.
// Protocols
$protocols = array("http://", "https://", "ftp://");
// Join the protocols with | and escape the slashes so the list fits inside the regex
$protocols = str_replace("/", "\/", implode("|", $protocols));
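The str_replace call above only escapes slashes, which is enough for the three protocols listed. As a rough alternative sketch, preg_quote escapes every regex metacharacter, which may be safer if other protocols are added later. The helper name buildProtocolPattern is hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper (not from the original post): builds the regex
// alternation from a protocol list, escaping each entry with preg_quote.
function buildProtocolPattern(array $protocols)
{
    $quoted = array();
    foreach ($protocols as $protocol) {
        // The second argument tells preg_quote to escape the "/" delimiter too
        $quoted[] = preg_quote($protocol, "/");
    }
    return implode("|", $quoted);
}

$pattern = buildProtocolPattern(array("http://", "https://", "ftp://"));

// The result can be dropped into a regex the same way as $protocols
var_dump(preg_match("/^(?:{$pattern})/", "ftp://example.com"));    // int(1)
var_dump(preg_match("/^(?:{$pattern})/", "mailto:me@example.com")); // int(0)
```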
Step two: add http:// to URLs written without a protocol, such as www.example.com.
$str = preg_replace("/(?<=^|\(|\s)(www\.[A-Za-z0-9-_]+\.[A-Za-z\.]{2,6}(?:[\/\?].*)?)(?=[\.,!\?]*(?:\s|\(|\)|$))/U", "http://$1", $str);
Step three: replace every accepted URL with an HTML link. The link text is the URL without its protocol.
$str = preg_replace("/(?<=^|\(|\s)({$protocols})((?:[A-Za-z0-9-_]+\.)?[A-Za-z0-9_-]+\.[A-Za-z\.]{2,6}(?:[\/\?].*)?)(?=[\.,!\?]*(?:\s|\(|\)|$))/U", "<a href=\"$1$2\">$2</a>", $str);
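Put together, the three steps can be wrapped in a single function. This is only a sketch: the function name autolink and the default argument are assumptions, while the two patterns are copied unchanged from the steps above.

```php
<?php
// Hypothetical wrapper around the three steps; the name autolink is an
// assumption and does not appear in the original post.
function autolink($str, array $protocols = array("http://", "https://", "ftp://"))
{
    // Step one: turn the protocol list into a regex alternation
    $protocols = str_replace("/", "\/", implode("|", $protocols));

    // Step two: prefix http:// to URLs that start with www.
    $str = preg_replace("/(?<=^|\(|\s)(www\.[A-Za-z0-9-_]+\.[A-Za-z\.]{2,6}(?:[\/\?].*)?)(?=[\.,!\?]*(?:\s|\(|\)|$))/U", "http://$1", $str);

    // Step three: wrap every accepted URL in an anchor tag
    $str = preg_replace("/(?<=^|\(|\s)({$protocols})((?:[A-Za-z0-9-_]+\.)?[A-Za-z0-9_-]+\.[A-Za-z\.]{2,6}(?:[\/\?].*)?)(?=[\.,!\?]*(?:\s|\(|\)|$))/U", "<a href=\"$1$2\">$2</a>", $str);

    return $str;
}

echo autolink("Visit www.example.com today"), "\n";
// Visit <a href="http://www.example.com">www.example.com</a> today
```

Note that the function inserts the matched text into the HTML as is, so it assumes $str really is plain text.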
This code handles URLs written in the following ways:
www.example.com
http://www.example.com
http://sub.example.com
http://example.com
www.example.com/example.html
www.example.com/example/
www.example.com?id=1
www.example.com/?id=1
www.example.com?id=1#23
www.example.com/example/#23
(http://example.com)
(example: www.example.com)
www.example.com/example)
www.example.com,
www.example.com?
www.example.com!
http://example.com. - The dot won't be a part of the link
http://example.com/. - The dot won't be a part of the link
www.example.com?id=id=123.43. - The last dot won't be a part of the link
But not with:
example.com
sub.example.com
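Both lists follow from the patterns. A URL is only picked up when it starts with www. or with one of the accepted protocols, which is why bare domains are skipped. Trailing punctuation stays outside the link because of the lookahead at the end of each pattern. The sketch below isolates that lookahead. It uses a simplified \S+? in place of the full URL pattern, and the function name stripTrailingPunctuation is hypothetical.

```php
<?php
// Simplified illustration only: \S+? stands in for the full URL pattern,
// and the function name is hypothetical.
function stripTrailingPunctuation($candidate)
{
    // Same lookahead as in the two patterns: optional punctuation,
    // then whitespace, a parenthesis, or the end of the string
    $lookahead = '(?=[\.,!\?]*(?:\s|\(|\)|$))';

    if (preg_match('/^(\S+?)' . $lookahead . '/', $candidate, $match)) {
        return $match[1];
    }
    return $candidate;
}

echo stripTrailingPunctuation("http://example.com."), "\n";        // http://example.com
echo stripTrailingPunctuation("http://example.com/."), "\n";       // http://example.com/
echo stripTrailingPunctuation("www.example.com?id=123.43."), "\n"; // www.example.com?id=123.43
```

The lazy match stops at the first position where only punctuation separates it from whitespace, a parenthesis, or the end of the text, so dots inside the URL are kept while the final one is left out.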