MatchesXsAnyUri
Check that text conforms to the pattern of an xs:anyURI.
See: https://www.w3.org/TR/xmlschema-2/#anyURI and https://datatracker.ietf.org/doc/html/rfc2396 and https://datatracker.ietf.org/doc/html/rfc2732
Note that version 1.0 of the XSD specification defines xs:anyURI as "defined by RFC 2396, as amended by RFC 2732". Therefore, we use a pattern here that implements the amendments of RFC 2732. This pattern should not be confused with matches_RFC_2396, which does not include those amendments and is used in different parts of the specification.
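The most visible effect of the RFC 2732 amendment is that a host may be a literal IPv6 address enclosed in square brackets. The following is a minimal sketch of that one sub-pattern, reusing the definitions from the listing; the helper name `is_ipv6_reference` and the use of Python's `re.fullmatch` are assumptions made for illustration and are not part of the specification.

```python
import re

# Sub-patterns taken from the listing (the parts added for RFC 2732).
ipv4address = '[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}'
hex4 = '[0-9A-Fa-f]{1,4}'
hexseq = f'{hex4}(:{hex4})*'
hexpart = f'({hexseq}|{hexseq}::({hexseq})?|::({hexseq})?)'
ipv6address = f'{hexpart}(:{ipv4address})?'
ipv6reference = f'\\[{ipv6address}\\]'


def is_ipv6_reference(text: str) -> bool:
    """Hypothetical helper: check only the bracketed IPv6 host form."""
    return re.fullmatch(ipv6reference, text) is not None


print(is_ipv6_reference("[::1]"))                   # True
print(is_ipv6_reference("[::FFFF:129.144.52.38]"))  # True
print(is_ipv6_reference("::1"))                     # False, brackets are required
```

A pattern limited to RFC 2396 has no such alternative in its host rule, so a URI like `http://[::1]/` would be expected to fail there while passing here.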
Code
alphanum = '[a-zA-Z0-9]'
mark = (
"[\\-_.!~*'()]"
)
unreserved = (
f'({alphanum}|{mark})'
)
hex = (
'[0-9A-Fa-f]'
)
escaped = (
f'%{hex}{hex}'
)
pchar = (
f'({unreserved}|{escaped}|[:@&=+$,])'
)
param = (
f'({pchar})*'
)
segment = (
f'({pchar})*(;{param})*'
)
pathSegments = (
f'{segment}(/{segment})*'
)
absPath = (
f'/{pathSegments}'
)
scheme = (
'[a-zA-Z][a-zA-Z0-9+\\-.]*'
)
userinfo = (
f'({unreserved}|{escaped}|[;:&=+$,])*'
)
domainlabel = (
f'({alphanum}|{alphanum}({alphanum}|-)*{alphanum})'
)
toplabel = (
f'([a-zA-Z]|[a-zA-Z]({alphanum}|-)*{alphanum})'
)
hostname = (
f'({domainlabel}\\.)*{toplabel}(\\.)?'
)
ipv4address = (
'[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}'
)
hex4 = (
'[0-9A-Fa-f]{1,4}'
)
hexseq = (
f'{hex4}(:{hex4})*'
)
hexpart = (
f'({hexseq}|{hexseq}::({hexseq})?|::({hexseq})?)'
)
ipv6address = (
f'{hexpart}(:{ipv4address})?'
)
ipv6reference = (
f'\\[{ipv6address}\\]'
)
host = (
f'({hostname}|{ipv4address}|{ipv6reference})'
)
port = '[0-9]*'
hostport = (
f'{host}(:{port})?'
)
server = (
f'(({userinfo}@)?{hostport})?'
)
regName = (
f'({unreserved}|{escaped}|[$,;:@&=+])+'
)
authority = (
f'({server}|{regName})'
)
netPath = (
f'//{authority}({absPath})?'
)
reserved = (
'[;/?:@&=+$,\\[\\]]'
)
uric = (
f'({reserved}|{unreserved}|{escaped})'
)
query = (
f'({uric})*'
)
hierPart = (
f'({netPath}|{absPath})(\\?{query})?'
)
uricNoSlash = (
f'({unreserved}|{escaped}|[;?:@&=+$,])'
)
opaquePart = (
f'{uricNoSlash}({uric})*'
)
absoluteuri = (
f'{scheme}:({hierPart}|{opaquePart})'
)
fragment = (
f'({uric})*'
)
relSegment = (
f'({unreserved}|{escaped}|[;@&=+$,])+'
)
relPath = (
f'{relSegment}({absPath})?'
)
relativeuri = (
f'({netPath}|{absPath}|{relPath})(\\?{query})?'
)
uriReference = (
f'^({absoluteuri}|{relativeuri})?(\\#{fragment})?$'
)
return match(
uriReference,
text
) is not None
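To try the pattern outside the specification tooling, the listing can be wrapped in an ordinary Python function. The sketch below assumes that `match` behaves like Python's `re.match`; the function name `matches_xs_any_uri`, the snake_case variable names, and the renaming of `hex` to `hex_digit` (to avoid shadowing the built-in) are choices made here for illustration only.

```python
import re


def matches_xs_any_uri(text: str) -> bool:
    """Sketch: check text against the xs:anyURI pattern from the listing."""
    alphanum = '[a-zA-Z0-9]'
    mark = "[\\-_.!~*'()]"
    unreserved = f'({alphanum}|{mark})'
    hex_digit = '[0-9A-Fa-f]'
    escaped = f'%{hex_digit}{hex_digit}'
    pchar = f'({unreserved}|{escaped}|[:@&=+$,])'
    param = f'({pchar})*'
    segment = f'({pchar})*(;{param})*'
    path_segments = f'{segment}(/{segment})*'
    abs_path = f'/{path_segments}'
    scheme = '[a-zA-Z][a-zA-Z0-9+\\-.]*'
    userinfo = f'({unreserved}|{escaped}|[;:&=+$,])*'
    domainlabel = f'({alphanum}|{alphanum}({alphanum}|-)*{alphanum})'
    toplabel = f'([a-zA-Z]|[a-zA-Z]({alphanum}|-)*{alphanum})'
    hostname = f'({domainlabel}\\.)*{toplabel}(\\.)?'
    ipv4address = '[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}'
    hex4 = '[0-9A-Fa-f]{1,4}'
    hexseq = f'{hex4}(:{hex4})*'
    hexpart = f'({hexseq}|{hexseq}::({hexseq})?|::({hexseq})?)'
    ipv6address = f'{hexpart}(:{ipv4address})?'
    ipv6reference = f'\\[{ipv6address}\\]'
    host = f'({hostname}|{ipv4address}|{ipv6reference})'
    port = '[0-9]*'
    hostport = f'{host}(:{port})?'
    server = f'(({userinfo}@)?{hostport})?'
    reg_name = f'({unreserved}|{escaped}|[$,;:@&=+])+'
    authority = f'({server}|{reg_name})'
    net_path = f'//{authority}({abs_path})?'
    reserved = '[;/?:@&=+$,\\[\\]]'
    uric = f'({reserved}|{unreserved}|{escaped})'
    query = f'({uric})*'
    hier_part = f'({net_path}|{abs_path})(\\?{query})?'
    uric_no_slash = f'({unreserved}|{escaped}|[;?:@&=+$,])'
    opaque_part = f'{uric_no_slash}({uric})*'
    absolute_uri = f'{scheme}:({hier_part}|{opaque_part})'
    fragment = f'({uric})*'
    rel_segment = f'({unreserved}|{escaped}|[;@&=+$,])+'
    rel_path = f'{rel_segment}({abs_path})?'
    relative_uri = f'({net_path}|{abs_path}|{rel_path})(\\?{query})?'
    uri_reference = f'^({absolute_uri}|{relative_uri})?(\\#{fragment})?$'
    return re.match(uri_reference, text) is not None


# A few sample inputs (chosen here, not taken from the specification).
for candidate in [
    "http://example.com/path?query=1#frag",
    "http://[::1]:8080/",
    "http://example.com/a b",
    "%zz",
]:
    print(repr(candidate), matches_xs_any_uri(candidate))
```

Because every part of `uri_reference` is optional, the empty string is also accepted. A raw space or a malformed percent-escape such as `%zz` is rejected, since neither appears in any of the character classes.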