Valid characters in URLs

Hello everyone,

Now that URLs allow non-Latin characters, which characters are NOT valid in website URLs?

Even better, does anyone have a regex that validates website URLs under the new rules?

Thanks.

RFC 3986 http://tools.ietf.org/html/rfc3986
Uniform Resource Identifier (URI): Generic Syntax
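
To make the RFC a bit more concrete, here is a rough Python sketch that lists the characters in a string that RFC 3986 does not allow to appear literally. The character sets are the "unreserved", "gen-delims" and "sub-delims" sets from the RFC, plus "%" for percent-encoded bytes. The function name invalid_chars is made up for this example. It only checks individual characters, not whether the URL as a whole is well-formed, so treat it as a starting point rather than a full validator.

```python
import string

# Character sets as defined in RFC 3986, sections 2.2 and 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")

# "%" is included because it introduces a percent-encoded byte (e.g. %20).
ALLOWED = UNRESERVED | GEN_DELIMS | SUB_DELIMS | {"%"}


def invalid_chars(url):
    """Return, in order, the characters of url that may not appear literally.

    Hypothetical helper: it checks characters only and says nothing about
    whether the scheme, host, path, etc. are structurally valid.
    """
    return [ch for ch in url if ch not in ALLOWED]


if __name__ == "__main__":
    print(invalid_chars("http://example.com/a b"))
    print(invalid_chars("http://example.com/path?x=1&y=2#frag"))
```

By this check, spaces, quotes, angle brackets and any non-ASCII character count as invalid in a plain URI. Non-Latin characters are only allowed in the internationalized form and have to be encoded before they go on the wire.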

Take a look at urlencode().
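
The post above does not say which language's urlencode() is meant, so as an assumption here is the same idea using Python's standard urllib.parse module. quote() percent-encodes characters that cannot appear literally (non-ASCII text is encoded as UTF-8 first), and quote_plus() additionally turns spaces into "+", which is the form-style encoding.

```python
from urllib.parse import quote, quote_plus, unquote

# A path segment with a space and a non-Latin character.
raw = "a b/ü"

# quote() keeps "/" by default and percent-encodes everything unsafe.
encoded = quote(raw)

# quote_plus() is intended for query strings: spaces become "+",
# and "/" is encoded as well.
encoded_plus = quote_plus(raw)

# unquote() reverses the percent-encoding.
decoded = unquote(encoded)

if __name__ == "__main__":
    print(encoded)
    print(encoded_plus)
    print(decoded)
```

Encoding like this is usually easier than trying to reject "bad" characters with a regex: encode each component, then assemble the URL.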