Percent Encoding

Encoding

The function encode percent-encodes a string, escaping every character that is not in the specified CharSet.

std::string s = encode("hello world!", unreserved_chars);
assert(s == "hello%20world%21");
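To make the rule concrete, here is a minimal standalone sketch of what percent-encoding against a character set involves. It is not Boost.URL's implementation: the names sketch_is_unreserved and sketch_encode are hypothetical, the allowed set is hard-coded to the RFC 3986 unreserved characters, and the choice of uppercase hexadecimal digits is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper, not part of Boost.URL: true for the RFC 3986
// "unreserved" characters (ALPHA / DIGIT / "-" / "." / "_" / "~").
inline bool sketch_is_unreserved(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

// Hypothetical helper: copy allowed characters unchanged and replace
// every other byte with '%' followed by two hexadecimal digits.
inline std::string sketch_encode(std::string_view in)
{
    static char const hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in)
    {
        if (sketch_is_unreserved(c))
        {
            out += static_cast<char>(c);
        }
        else
        {
            out += '%';
            out += hex[c >> 4];    // high nibble
            out += hex[c & 0x0F];  // low nibble
        }
    }
    return out;
}
```

With this sketch, sketch_encode("hello world!") produces "hello%20world%21", the same string as in the example above.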

A few parameters, such as encoding spaces as plus (+), can be adjusted with encoding_opts:

encoding_opts opt;
opt.space_as_plus = true;
std::string s = encode("msg=hello world", pchars, opt);
assert(s == "msg=hello+world");

The result type of the function can also be specified via a StringToken, so that an existing string can be reused or appended to.

std::string s;
encode("hello ", pchars, {}, string_token::assign_to(s));
encode("world", pchars, {}, string_token::append_to(s));
assert(s == "hello%20world");

We can also use encoded_size to determine the required size before attempting to encode:

boost::core::string_view e = "hello world";
std::string s;
s.reserve(encoded_size(e, pchars));
encode(e, pchars, {}, string_token::assign_to(s));
assert(s == "hello%20world");

In other scenarios, strings can also be directly encoded into buffers:

boost::core::string_view e = "hello world";
std::string s;
s.resize(encoded_size(e, pchars));
encode(&s[0], s.size(), e, pchars);
assert(s == "hello%20world");
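The measure-then-write pattern shown above can be sketched in standalone C++ as follows. This is only an illustration under stated assumptions, not the library's code: sketch_allowed, sketch_encoded_size, and sketch_encode_into are hypothetical names, the allowed set is reduced to ASCII letters and digits, and the decision to stop before an escape that does not fit is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical allowed set for this sketch: ASCII letters and digits only.
inline bool sketch_allowed(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9');
}

// Pass 1: an allowed character needs 1 output char, an escape needs 3.
inline std::size_t sketch_encoded_size(std::string_view in)
{
    std::size_t n = 0;
    for (unsigned char c : in)
        n += sketch_allowed(c) ? 1 : 3;
    return n;
}

// Pass 2: write into a caller-provided buffer of capacity `cap`.
// Stops before any character that does not fit completely and
// returns the number of chars written.
inline std::size_t sketch_encode_into(
    char* dest, std::size_t cap, std::string_view in)
{
    static char const hex[] = "0123456789ABCDEF";
    assert(dest != nullptr || cap == 0);
    std::size_t n = 0;
    for (unsigned char c : in)
    {
        std::size_t const need = sketch_allowed(c) ? 1 : 3;
        if (cap - n < need)
            break;  // never emit a partial escape
        if (need == 1)
        {
            dest[n++] = static_cast<char>(c);
        }
        else
        {
            dest[n++] = '%';
            dest[n++] = hex[c >> 4];
            dest[n++] = hex[c & 0x0F];
        }
    }
    return n;
}
```

Sizing the buffer with the first pass means the second pass never has to reallocate or truncate.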

Validating

The class pct_string_view represents a reference to a percent-encoded string:

pct_string_view sv = "hello%20world";
assert(sv == "hello%20world");

pct_string_view is analogous to string_view, with the main difference that the percent-encoding of the underlying buffer is always validated. Attempting to directly construct a pct_string_view from an invalid string throws an exception.

To validate a string without resorting to exceptions, use the function make_pct_string_view, which returns a result:

boost::system::result<pct_string_view> rs =
    make_pct_string_view("hello%20world");
assert(rs.has_value());
pct_string_view sv = rs.value();
assert(sv == "hello%20world");
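As a rough illustration of what validating a percent-encoded string means, the standalone sketch below checks that every '%' is followed by exactly two hexadecimal digits and reports the outcome as a bool instead of throwing. The names sketch_is_hex_digit and sketch_is_valid_pct are hypothetical, and the sketch deliberately checks nothing beyond the escapes themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: true for 0-9, a-f, A-F.
inline bool sketch_is_hex_digit(char c)
{
    return (c >= '0' && c <= '9') ||
           (c >= 'a' && c <= 'f') ||
           (c >= 'A' && c <= 'F');
}

// Hypothetical validator: every '%' must start a complete "%XX" escape.
inline bool sketch_is_valid_pct(std::string_view s)
{
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        if (s[i] != '%')
            continue;
        // Need two more characters, both hexadecimal digits.
        if (s.size() - i < 3 ||
            !sketch_is_hex_digit(s[i + 1]) ||
            !sketch_is_hex_digit(s[i + 2]))
        {
            return false;
        }
        i += 2;  // skip the two digits just checked
    }
    return true;
}
```

In this sketch "hello%20world" is accepted, while "100%" and "%zz" are rejected because the escape is truncated or not hexadecimal.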

This means make_pct_string_view can also be used to validate a string once and keep that guarantee for future use. The modifying functions in classes such as url expect instances of pct_string_view, which have already been validated. This relieves these functions of the responsibility to revalidate the input or to throw exceptions:

pct_string_view s = "path/to/file";
url u;
u.set_encoded_path(s);
assert(u.buffer() == "path/to/file");

When exceptions are acceptable, a common pattern is to let a literal string or other type convertible to string_view be implicitly converted to pct_string_view.

url u;
u.set_encoded_path("path/to/file");
assert(u.buffer() == "path/to/file");

Note that if the input is invalid, the exception is thrown while the pct_string_view is implicitly constructed, not by the modifying function.

Reusing the validation guarantee is particularly useful when the pct_string_view comes from another source where the data is also ensured to be validated:

url_view uv("path/to/file");
url u;
u.set_encoded_path(uv.encoded_path());
assert(u.buffer() == "path/to/file");

In the example above, set_encoded_path does not need to revalidate any information from encoded_path because these references are passed as pct_string_view.
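The design idea of carrying the validation guarantee in the type can be sketched in standalone C++ as follows. Everything here is hypothetical and only mirrors the pattern described in this section: sketch_checked_view validates in its constructor and throws std::invalid_argument (an assumption of the sketch, not the library's exception type), sketch_try_make is a non-throwing factory, and sketch_set_path is a consumer that never validates.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical check: every '%' is followed by two hexadecimal digits.
inline bool sketch_escapes_ok(std::string_view s)
{
    auto is_hex = [](char c) {
        return (c >= '0' && c <= '9') ||
               (c >= 'a' && c <= 'f') ||
               (c >= 'A' && c <= 'F');
    };
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        if (s[i] != '%')
            continue;
        if (s.size() - i < 3 || !is_hex(s[i + 1]) || !is_hex(s[i + 2]))
            return false;
        i += 2;
    }
    return true;
}

// Hypothetical wrapper: an instance can only exist for valid input,
// so the type itself carries the "already validated" guarantee.
class sketch_checked_view
{
    std::string_view s_;

public:
    // Implicit on purpose, so plain strings convert at the call site.
    sketch_checked_view(std::string_view s)
        : s_(s)
    {
        if (!sketch_escapes_ok(s))
            throw std::invalid_argument("invalid percent-encoding");
    }

    sketch_checked_view(char const* s)
        : sketch_checked_view(std::string_view(s))
    {
    }

    std::string_view get() const noexcept { return s_; }
};

// Non-throwing factory: reports failure through an empty optional.
inline std::optional<sketch_checked_view>
sketch_try_make(std::string_view s)
{
    if (!sketch_escapes_ok(s))
        return std::nullopt;
    return sketch_checked_view(s);
}

// Consumer: accepts only the checked type, so it neither revalidates
// nor throws because of malformed escapes.
inline void sketch_set_path(std::string& target, sketch_checked_view v)
{
    target.assign(v.get().data(), v.get().size());
}
```

Calling sketch_set_path(path, "bad%zz") throws while the argument is being converted, before the function body runs, which is the same division of responsibility described above.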

Decoding

The class pct_string_view represents a reference to a percent-encoded string. decode_view is analogous to pct_string_view, with the main difference that the underlying buffer always dereferences to decoded characters.

pct_string_view es("hello%20world");
assert(es == "hello%20world");

decode_view dv("hello%20world");
assert(dv == "hello world");

A decode_view can also be created from a pct_string_view with operator*. This also gives us an opportunity to validate external strings:

boost::system::result<pct_string_view> rs =
    make_pct_string_view("hello%20world");
assert(rs.has_value());
pct_string_view s = rs.value();
decode_view dv = *s;
assert(dv == "hello world");

This is particularly useful when the decoded string needs to be accessed for comparisons, without having to explicitly decode it into a buffer:

url_view u =
    parse_relative_ref("user/john%20doe/profile%20photo.jpg").value();
std::vector<std::string> route =
    {"user", "john doe", "profile photo.jpg"};
auto segs = u.encoded_segments();
auto it0 = segs.begin();
auto end0 = segs.end();
auto it1 = route.begin();
auto end1 = route.end();
while (
    it0 != end0 &&
    it1 != end1)
{
    pct_string_view seg0 = *it0;
    decode_view dseg0 = *seg0;
    boost::core::string_view seg1 = *it1;
    if (dseg0 == seg1)
    {
        ++it0;
        ++it1;
    }
    else
    {
        break;
    }
}
bool route_match = it0 == end0 && it1 == end1;
assert(route_match);
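The idea of comparing against decoded characters without materializing a decoded copy can be sketched in standalone C++ like this. It is a simplified illustration, not how decode_view is implemented: cmp_hex_value and sketch_decoded_equals are hypothetical names, and the sketch returns false on malformed escapes rather than relying on prior validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: value of a hexadecimal digit, or -1 if not one.
inline int cmp_hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical comparison: walk both strings in lockstep, decoding each
// "%XX" escape of `encoded` on the fly. No decoded buffer is allocated.
inline bool sketch_decoded_equals(
    std::string_view encoded, std::string_view plain)
{
    std::size_t i = 0;  // position in encoded
    std::size_t j = 0;  // position in plain
    while (i < encoded.size() && j < plain.size())
    {
        char c = encoded[i];
        if (c == '%')
        {
            if (encoded.size() - i < 3)
                return false;  // truncated escape
            int const hi = cmp_hex_value(encoded[i + 1]);
            int const lo = cmp_hex_value(encoded[i + 2]);
            if (hi < 0 || lo < 0)
                return false;  // not hexadecimal
            c = static_cast<char>(hi * 16 + lo);
            i += 3;
        }
        else
        {
            ++i;
        }
        if (c != plain[j])
            return false;
        ++j;
    }
    // Equal only if both strings were consumed completely.
    return i == encoded.size() && j == plain.size();
}
```

For a route match like the one above, each segment comparison touches only the characters it needs and stops at the first mismatch.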

The member function pct_string_view::decode can be used to decode the data into a buffer. As with the free function encode, the decoding options and the string token can be customized.

pct_string_view s = "user/john%20doe/profile%20photo.jpg";
std::string buf;
buf.resize(s.decoded_size());
s.decode({}, string_token::assign_to(buf));
assert(buf == "user/john doe/profile photo.jpg");
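For reference, the size computation and the decoding step can be sketched in standalone C++ as below. The names dec_hex_value, sketch_decoded_size, and sketch_decode are hypothetical and the code is not the library's implementation; unlike a validated pct_string_view, the sketch has to handle malformed input itself and does so by returning an empty optional.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: value of a hexadecimal digit, or -1 if not one.
inline int dec_hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Each "%XX" escape (3 chars) decodes to 1 char; everything else is 1:1.
inline std::size_t sketch_decoded_size(std::string_view s)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++n)
        i += (s[i] == '%') ? 3 : 1;
    return n;
}

// Decode into a new string, or return nullopt on a malformed escape.
inline std::optional<std::string> sketch_decode(std::string_view s)
{
    std::string out;
    out.reserve(sketch_decoded_size(s));
    for (std::size_t i = 0; i < s.size();)
    {
        if (s[i] != '%')
        {
            out += s[i++];
            continue;
        }
        if (s.size() - i < 3)
            return std::nullopt;  // truncated escape
        int const hi = dec_hex_value(s[i + 1]);
        int const lo = dec_hex_value(s[i + 2]);
        if (hi < 0 || lo < 0)
            return std::nullopt;  // not hexadecimal
        out += static_cast<char>(hi * 16 + lo);
        i += 3;
    }
    return out;
}
```

Because the input to pct_string_view::decode is already validated, the library version has no need for the error paths this sketch carries.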