2024-03-15 20:55:33 -06:00
|
|
|
.\" generated by cd2nroff 0.1 from curl_url_get.md
|
2024-07-29 18:00:04 -06:00
|
|
|
.TH curl_url_get 3 "2024-07-29" libcurl
|
2024-03-15 20:55:33 -06:00
|
|
|
.SH NAME
|
|
|
|
curl_url_get \- extract a part from a URL
|
|
|
|
.SH SYNOPSIS
|
|
|
|
.nf
|
|
|
|
#include <curl/curl.h>
|
|
|
|
|
|
|
|
CURLUcode curl_url_get(const CURLU *url,
|
|
|
|
CURLUPart part,
|
|
|
|
char **content,
|
|
|
|
unsigned int flags);
|
|
|
|
.fi
|
|
|
|
.SH DESCRIPTION
|
|
|
|
Given a \fIurl\fP handle of a URL object, this function extracts an individual
|
|
|
|
piece or the full URL from it.
|
|
|
|
|
|
|
|
The \fIpart\fP argument specifies which part to extract (see list below) and
|
|
|
|
\fIcontent\fP points to a \(aqchar *\(aq to get updated to point to a newly
|
|
|
|
allocated string with the contents.
|
|
|
|
|
|
|
|
The \fIflags\fP argument is a bitmask with individual features.
|
|
|
|
|
|
|
|
The returned content pointer must be freed with \fIcurl_free(3)\fP after use.
|
|
|
|
.SH FLAGS
|
|
|
|
The flags argument is zero, one or more bits set in a bitmask.
|
|
|
|
.IP CURLU_DEFAULT_PORT
|
|
|
|
If the handle has no port stored, this option makes \fIcurl_url_get(3)\fP
|
|
|
|
return the default port for the used scheme.
|
|
|
|
.IP CURLU_DEFAULT_SCHEME
|
|
|
|
If the handle has no scheme stored, this option makes \fIcurl_url_get(3)\fP
|
|
|
|
return the default scheme instead of error.
|
|
|
|
.IP CURLU_NO_DEFAULT_PORT
|
|
|
|
Instructs \fIcurl_url_get(3)\fP to not return a port number if it matches the
|
|
|
|
default port for the scheme.
|
|
|
|
.IP CURLU_URLDECODE
|
|
|
|
Asks \fIcurl_url_get(3)\fP to URL decode the contents before returning it. It
|
|
|
|
does not decode the scheme, the port number or the full URL.
|
|
|
|
|
|
|
|
The query component also gets plus\-to\-space conversion as a bonus when this
|
|
|
|
bit is set.
|
|
|
|
|
|
|
|
Note that this URL decoding is charset unaware and you get a zero terminated
|
|
|
|
string back with data that could be intended for a particular encoding.
|
|
|
|
|
|
|
|
If there are byte values lower than 32 in the decoded string, the get
|
|
|
|
operation returns an error instead.
|
|
|
|
.IP CURLU_URLENCODE
|
2024-03-30 12:28:04 -06:00
|
|
|
If set, \fIcurl_url_get(3)\fP URL encodes the hostname part when a full URL is
|
|
|
|
retrieved. If not set (default), libcurl returns the URL with the hostname raw
|
|
|
|
to support IDN names to appear as\-is. IDN hostnames are typically using
|
2024-03-15 20:55:33 -06:00
|
|
|
non\-ASCII bytes that otherwise gets percent\-encoded.
|
|
|
|
|
|
|
|
Note that even when not asking for URL encoding, the \(aq%\(aq (byte 37) is URL
|
|
|
|
encoded to make sure the hostname remains valid.
|
|
|
|
.IP CURLU_PUNYCODE
|
|
|
|
If set and \fICURLU_URLENCODE\fP is not set, and asked to retrieve the
|
|
|
|
\fBCURLUPART_HOST\fP or \fBCURLUPART_URL\fP parts, libcurl returns the host
|
|
|
|
name in its punycode version if it contains any non\-ASCII octets (and is an
|
|
|
|
IDN name).
|
|
|
|
|
|
|
|
If libcurl is built without IDN capabilities, using this bit makes
|
|
|
|
\fIcurl_url_get(3)\fP return \fICURLUE_LACKS_IDN\fP if the hostname contains
|
|
|
|
anything outside the ASCII range.
|
|
|
|
|
|
|
|
(Added in curl 7.88.0)
|
|
|
|
.IP CURLU_PUNY2IDN
|
|
|
|
If set and asked to retrieve the \fBCURLUPART_HOST\fP or \fBCURLUPART_URL\fP
|
|
|
|
parts, libcurl returns the hostname in its IDN (International Domain Name)
|
|
|
|
UTF\-8 version if it otherwise is a punycode version. If the punycode name
|
|
|
|
cannot be converted to IDN correctly, libcurl returns
|
|
|
|
\fICURLUE_BAD_HOSTNAME\fP.
|
|
|
|
|
|
|
|
If libcurl is built without IDN capabilities, using this bit makes
|
|
|
|
\fIcurl_url_get(3)\fP return \fICURLUE_LACKS_IDN\fP if the hostname is using
|
|
|
|
punycode.
|
|
|
|
|
|
|
|
(Added in curl 8.3.0)
|
2024-06-01 14:49:19 -06:00
|
|
|
.IP CURLU_GET_EMPTY
|
|
|
|
When this flag is used in curl_url_get(), it makes the function return empty
|
|
|
|
query and fragments parts or when used in the full URL. By default, libcurl
|
|
|
|
otherwise considers empty parts non\-existing.
|
|
|
|
|
|
|
|
An empty query part is one where this is nothing following the question mark
|
|
|
|
(before the possible fragment). An empty fragments part is one where there is
|
|
|
|
nothing following the hash sign.
|
|
|
|
|
|
|
|
(Added in curl 8.8.0)
|
2024-07-29 18:00:04 -06:00
|
|
|
.IP CURLU_NO_GUESS_SCHEME
|
|
|
|
When this flag is used in curl_url_get(), it treats the scheme as non\-existing
|
|
|
|
if it was set as a result of a previous guess; when CURLU_GUESS_SCHEME was
|
|
|
|
used parsing a URL.
|
|
|
|
|
|
|
|
Using this flag when getting CURLUPART_SCHEME if the scheme was set as the
|
|
|
|
result of a guess makes curl_url_get() return CURLUE_NO_SCHEME.
|
|
|
|
|
|
|
|
Using this flag when getting CURLUPART_URL if the scheme was set as the result
|
|
|
|
of a guess makes curl_url_get() return the full URL without the scheme
|
|
|
|
component. Such a URL can then only be parsed with curl_url_set() if
|
|
|
|
CURLU_GUESS_SCHEME is used.
|
|
|
|
|
|
|
|
(Added in curl 8.9.0)
|
2024-03-15 20:55:33 -06:00
|
|
|
.SH PARTS
|
|
|
|
.IP CURLUPART_URL
|
2024-06-01 14:49:19 -06:00
|
|
|
When asked to return the full URL, \fIcurl_url_get(3)\fP returns a normalized and
|
|
|
|
possibly cleaned up version using all available URL parts.
|
|
|
|
|
|
|
|
We advise using the \fICURLU_PUNYCODE\fP option to get the URL as "normalized" as
|
|
|
|
possible since IDN allows hostnames to be written in many different ways that
|
|
|
|
still end up the same punycode version.
|
2024-03-15 20:55:33 -06:00
|
|
|
|
2024-06-01 14:49:19 -06:00
|
|
|
Zero\-length queries and fragments are excluded from the URL unless
|
|
|
|
CURLU_GET_EMPTY is set.
|
2024-03-15 20:55:33 -06:00
|
|
|
.IP CURLUPART_SCHEME
|
|
|
|
Scheme cannot be URL decoded on get.
|
|
|
|
.IP CURLUPART_USER
|
|
|
|
.IP CURLUPART_PASSWORD
|
|
|
|
.IP CURLUPART_OPTIONS
|
|
|
|
The options field is an optional field that might follow the password in the
|
|
|
|
userinfo part. It is only recognized/used when parsing URLs for the following
|
|
|
|
schemes: pop3, smtp and imap. The URL API still allows users to set and get
|
|
|
|
this field independently of scheme when not parsing full URLs.
|
|
|
|
.IP CURLUPART_HOST
|
|
|
|
The hostname. If it is an IPv6 numeric address, the zone id is not part of it
|
|
|
|
but is provided separately in \fICURLUPART_ZONEID\fP. IPv6 numerical addresses
|
|
|
|
are returned within brackets ([]).
|
|
|
|
|
|
|
|
IPv6 names are normalized when set, which should make them as short as
|
|
|
|
possible while maintaining correct syntax.
|
|
|
|
.IP CURLUPART_ZONEID
|
|
|
|
If the hostname is a numeric IPv6 address, this field might also be set.
|
|
|
|
.IP CURLUPART_PORT
|
|
|
|
A port cannot be URL decoded on get. This number is returned in a string just
|
|
|
|
like all other parts. That string is guaranteed to hold a valid port number in
|
|
|
|
ASCII using base 10.
|
|
|
|
.IP CURLUPART_PATH
|
|
|
|
The \fIpart\fP is always at least a slash (\(aq/\(aq) even if no path was supplied
|
|
|
|
in the URL. A URL path always starts with a slash.
|
|
|
|
.IP CURLUPART_QUERY
|
|
|
|
The initial question mark that denotes the beginning of the query part is a
|
|
|
|
delimiter only. It is not part of the query contents.
|
|
|
|
|
|
|
|
A not\-present query returns \fIpart\fP set to NULL.
|
2024-06-01 14:49:19 -06:00
|
|
|
|
|
|
|
A zero\-length query returns \fIpart\fP as NULL unless CURLU_GET_EMPTY is set.
|
2024-03-15 20:55:33 -06:00
|
|
|
|
|
|
|
The query part gets pluses converted to space when asked to URL decode on get
|
|
|
|
with the CURLU_URLDECODE bit.
|
|
|
|
.IP CURLUPART_FRAGMENT
|
|
|
|
The initial hash sign that denotes the beginning of the fragment is a
|
|
|
|
delimiter only. It is not part of the fragment contents.
|
2024-06-01 14:49:19 -06:00
|
|
|
|
|
|
|
A not\-present fragment returns \fIpart\fP set to NULL.
|
|
|
|
|
|
|
|
A zero\-length fragment returns \fIpart\fP as NULL unless CURLU_GET_EMPTY is set.
|
2024-03-30 12:28:04 -06:00
|
|
|
.SH PROTOCOLS
|
2024-07-29 18:00:04 -06:00
|
|
|
This functionality affects all supported protocols
|
2024-03-15 20:55:33 -06:00
|
|
|
.SH EXAMPLE
|
|
|
|
.nf
|
|
|
|
int main(void)
|
|
|
|
{
|
|
|
|
CURLUcode rc;
|
|
|
|
CURLU *url = curl_url();
|
|
|
|
rc = curl_url_set(url, CURLUPART_URL, "https://example.com", 0);
|
|
|
|
if(!rc) {
|
|
|
|
char *scheme;
|
|
|
|
rc = curl_url_get(url, CURLUPART_SCHEME, &scheme, 0);
|
|
|
|
if(!rc) {
|
|
|
|
printf("the scheme is %s\\n", scheme);
|
|
|
|
curl_free(scheme);
|
|
|
|
}
|
|
|
|
curl_url_cleanup(url);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
.fi
|
|
|
|
.SH AVAILABILITY
|
2024-07-29 18:00:04 -06:00
|
|
|
Added in curl 7.62.0
|
2024-03-15 20:55:33 -06:00
|
|
|
.SH RETURN VALUE
|
|
|
|
Returns a CURLUcode error value, which is CURLUE_OK (0) if everything went
|
|
|
|
fine. See the \fIlibcurl\-errors(3)\fP man page for the full list with
|
|
|
|
descriptions.
|
|
|
|
|
|
|
|
If this function returns an error, no URL part is returned.
|
|
|
|
.SH SEE ALSO
|
|
|
|
.BR CURLOPT_CURLU (3),
|
|
|
|
.BR curl_url (3),
|
|
|
|
.BR curl_url_cleanup (3),
|
|
|
|
.BR curl_url_dup (3),
|
|
|
|
.BR curl_url_set (3),
|
|
|
|
.BR curl_url_strerror (3)
|