Percent Encoding
Encoding
The encode
can be used to percent-encode strings with the specified CharSet.
std::string s = encode("hello world!", unreserved_chars);
assert(s == "hello%20world%21");
A few parameters, such as encoding spaces as plus (+
), can be adjusted with encode_opts
:
encoding_opts opt;
opt.space_as_plus = true;
std::string s = encode("msg=hello world", pchars, opt);
assert(s == "msg=hello+world");
The result type of the function can also be specified via a StringToken so that strings can be reused or appended.
std::string s;
encode("hello ", pchars, {}, string_token::assign_to(s));
encode("world", pchars, {}, string_token::append_to(s));
assert(s == "hello%20world");
We can also use encoded_size
to determine the required size before attempting to encode:
boost::core::string_view e = "hello world";
std::string s;
s.reserve(encoded_size(e, pchars));
encode(e, pchars, {}, string_token::assign_to(s));
assert(s == "hello%20world");
In other scenarios, strings can also be directly encoded into buffers:
boost::core::string_view e = "hello world";
std::string s;
s.resize(encoded_size(e, pchars));
encode(&s[0], s.size(), e, pchars);
assert(s == "hello%20world");
Validating
The class pct_string_view
represents a reference percent-encoded strings:
pct_string_view sv = "hello%20world";
assert(sv == "hello%20world");
pct_string_view
is analogous to string_view
, with the main difference that the percent-encoding of the underlying buffer is always validated.
Attempting to directly construct a pct_string_view
from an invalid string throws an exception.
To simply validate a string without recurring to exceptions, a result
can be returned with the
make_pct_string_view
:
boost::system::result<pct_string_view> rs =
make_pct_string_view("hello%20world");
assert(rs.has_value());
pct_string_view sv = rs.value();
assert(sv == "hello%20world");
This means make_pct_string_view
can also be used to validate strings and keep that information for future use.
The modifying functions in classes such as url
expect instances of
pct_string_view
that have already been validated.
This completely removes the responsibility of revalidating this information or throwing exceptions from these functions:
pct_string_view s = "path/to/file";
url u;
u.set_encoded_path(s);
assert(u.buffer() == "path/to/file");
When exceptions are acceptable, a common pattern is to let a literal string or other type convertible to string_view
be implicitly converted to
pct_string_view
.
url u;
u.set_encoded_path("path/to/file");
assert(u.buffer() == "path/to/file");
If the input is invalid, note that an exception is thrown while the
pct_string_view
is implicitly constructed and not from the modifying function.
Reusing the validation guarantee is particularly useful when the
pct_string_view
comes from another source where the data is also ensured to be validated:
url_view uv("path/to/file");
url u;
u.set_encoded_path(uv.encoded_path());
assert(u.buffer() == "path/to/file");
In the example above,
set_encoded_path
does not to revalidate any information from
encoded_path
because these references are passed as pct_string_view
.
Decode
The class pct_string_view
represents a reference percent-encoded strings.
decode_view
is analogous to pct_string_view
, with the main difference that the underlying buffer always dereferences to decoded characters.
pct_string_view es("hello%20world");
assert(es == "hello%20world");
decode_view dv("hello%20world");
assert(dv == "hello world");
A decode_view
can also be created from a pct_string_view
with the
operator*
.
The also gives us an opportunity to validate external strings:
boost::system::result<pct_string_view> rs =
make_pct_string_view("hello%20world");
assert(rs.has_value());
pct_string_view s = rs.value();
decode_view dv = *s;
assert(dv == "hello world");
This is particularly useful when the decoded string need to be accessed for comparisons with no necessity to explicitly decoding the string into a buffer:
url_view u =
parse_relative_ref("user/john%20doe/profile%20photo.jpg").value();
std::vector<std::string> route =
{"user", "john doe", "profile photo.jpg"};
auto segs = u.encoded_segments();
auto it0 = segs.begin();
auto end0 = segs.end();
auto it1 = route.begin();
auto end1 = route.end();
while (
it0 != end0 &&
it1 != end1)
{
pct_string_view seg0 = *it0;
decode_view dseg0 = *seg0;
boost::core::string_view seg1 = *it1;
if (dseg0 == seg1)
{
++it0;
++it1;
}
else
{
break;
}
}
bool route_match = it0 == end0 && it1 == end1;
assert(route_match);
The member function
pct_string_view::decode
can be used to decode the data into a buffer.
Like the free-function
encode
, decoding options and the string token can be customized.
pct_string_view s = "user/john%20doe/profile%20photo.jpg";
std::string buf;
buf.resize(s.decoded_size());
s.decode({}, string_token::assign_to(buf));
assert(buf == "user/john doe/profile photo.jpg");