RICCILAB

Validating Containers with C++26 Reflection


The earlier post closed with a validator whose recursion rule was three words long: recurse iff aggregate. std::is_aggregate_v<MT> told the walker whether the current member was a plain struct of fields, and if it was, we descended into those fields. Scalars, std::string, anything else — leaves, handled by whatever annotations were attached or skipped entirely.

That rule has a blind spot I didn’t test at the time.

struct Address {
    [[=av::MinLength{2}]]    std::string street;
    [[=av::Range{1, 99999}]] int         zip_code;
};
 
struct User {
    std::optional<Address> address;      // Address is aggregate + annotated
    std::vector<Address>   past;         // same
};
 
User u{Address{"", 0}, {Address{"", 0}, Address{"", 0}}};
auto errors = av::collect(u);
//  → empty.

Neither std::optional nor std::vector is an aggregate. is_aggregate_v<MT> is false for both, so the walker classifies them as leaves and never looks inside. An invalid Address sitting inside an optional or a vector produces no errors at all. The earlier post never hit this because the demo didn’t have containers. The lesson is that silence is the worst failure mode a validator can have, and the aggregate rule produced exactly that.
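The classification can be checked without any reflection. The sketch below uses only standard type traits; old_rule_recurses is a hypothetical name for what "recurse iff aggregate" decides, not something from the header.

```cpp
#include <optional>
#include <string>
#include <type_traits>
#include <vector>

struct Address {
    std::string street;
    int         zip_code;
};

// Address is a plain struct of fields, so the old rule descends into it.
static_assert(std::is_aggregate_v<Address>);

// The wrappers are class types with their own constructors, so the old rule
// files them under "leaf" and never sees the Address inside.
static_assert(!std::is_aggregate_v<std::optional<Address>>);
static_assert(!std::is_aggregate_v<std::vector<Address>>);
static_assert(!std::is_aggregate_v<std::string>);

// Hypothetical helper: the decision "recurse iff aggregate" makes for a type.
template <typename T>
constexpr bool old_rule_recurses =
    std::is_aggregate_v<std::remove_cv_t<std::remove_reference_t<T>>>;
```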

Three stages fix it:

  • Stage 13: std::optional<T>. Type-driven recursion when the member is optional, has_value() check before descending, NotNullopt{} annotation for “must be present.”
  • Stage 14: std::vector<T>. The path type has to change. A path_stack of std::vector<std::string> was enough until indices showed up; now paths need to look like past[0].street, which flat string segments can’t express without lossy baking.
  • Stage 15: container-level annotations (MinSize, MaxSize) on the container itself, layered on top of the Stage 14 recursion.

The starting point is the header as released at the end of the previous post, unmodified. Everything new attaches to the type-driven branch of validate_impl.

Stage 13 — std::optional<T>

Commit: 48bc245

Two pieces to add:

  1. A way to recognise std::optional<T> at the type level inside validate_impl, so the walker knows to unwrap instead of treating the whole std::optional as a scalar.
  2. An annotation for “this optional field is required.”

Recognising std::optional

Plain trait, nothing reflective:

template <typename T>
struct is_optional : std::false_type {};
 
template <typename T>
struct is_optional<std::optional<T>> : std::true_type {};
 
template <typename T>
constexpr bool is_optional_v = is_optional<T>::value;

I briefly considered doing this through reflection — std::meta::template_of(type_of(member)) == ^^std::optional — mostly out of consistency with the Regex<N> dispatch. It works, but it buys nothing here. The primary-template match is a type-level question and the trait answers it directly; reflection is the right tool when the annotation type is a template family keyed on reflection data, which isn’t the case for std::optional.
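One detail worth seeing in isolation: the partial specialisation matches the exact type only, which is why the walker strips cv and reference qualifiers before asking. A small sketch, where bare_t is a hypothetical C++17 spelling of std::remove_cvref_t:

```cpp
#include <optional>
#include <type_traits>

template <typename T>
struct is_optional : std::false_type {};

template <typename T>
struct is_optional<std::optional<T>> : std::true_type {};

template <typename T>
constexpr bool is_optional_v = is_optional<T>::value;

static_assert(is_optional_v<std::optional<int>>);
static_assert(!is_optional_v<int>);

// A const-qualified or reference type falls through to the primary template.
static_assert(!is_optional_v<const std::optional<int>>);
static_assert(!is_optional_v<std::optional<int>&>);

// Stripping cv/ref first restores the match.
template <typename T>
using bare_t = std::remove_cv_t<std::remove_reference_t<T>>;

static_assert(is_optional_v<bare_t<const std::optional<int>&>>);
```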

The recursion branch

The earlier walker had one type-driven branch at the end of each member iteration:

using MT = std::remove_cvref_t<decltype(obj.[:member:])>;
if constexpr (std::is_aggregate_v<MT>) {
    validate_impl(obj.[:member:], ctx);
}

It grows to two:

if constexpr (is_optional_v<MT>) {
    using Inner = typename MT::value_type;
    if constexpr (std::is_aggregate_v<Inner>) {
        if (obj.[:member:].has_value()) {
            validate_impl(*obj.[:member:], ctx);
        }
    }
} else if constexpr (std::is_aggregate_v<MT>) {
    validate_impl(obj.[:member:], ctx);
}

The optional branch comes first because of the shape of the dispatch: if MT is std::optional<Address>, neither is_aggregate_v<MT> nor a plain recurse would do the right thing — the former is false, and the latter would try to iterate the members of std::optional itself. Once we’ve matched is_optional_v<MT>, the aggregate check shifts one level down, to Inner.

The runtime has_value() gate is the part that wasn’t there before. validate_impl used to operate on plain references; now it has a precondition: check before dereferencing. Dereferencing an empty optional is undefined behaviour, and buried inside the recursion it would be a hard bug to trace.
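Stripped of reflection, the gate has the following shape. This is a hand-written analogue for a single type, not the header's code; validate_address and validate_optional_address are hypothetical names and the error strings are placeholders.

```cpp
#include <optional>
#include <string>
#include <vector>

struct Address {
    std::string street;
    int         zip_code;
};

// Hypothetical stand-in for the reflective walker: checks one Address by hand.
inline void validate_address(const Address& a, std::vector<std::string>& errors) {
    if (a.street.size() < 2)                  errors.push_back("street too short");
    if (a.zip_code < 1 || a.zip_code > 99999) errors.push_back("zip_code out of range");
}

// The optional branch: check has_value() first, dereference second.
inline void validate_optional_address(const std::optional<Address>& opt,
                                      std::vector<std::string>& errors) {
    if (opt.has_value()) {
        validate_address(*opt, errors);  // safe: a value is known to exist
    }
    // An empty optional is not an error here; "must be present" is a separate rule.
}
```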

NotNullopt{}

A marker annotation, shaped like the earlier NotEmpty{}:

struct NotNullopt {};

And a branch in the annotation dispatch:

} else if constexpr (std::meta::type_of(ann) == ^^NotNullopt) {
    if constexpr (requires { obj.[:member:].has_value(); }) {
        if (!obj.[:member:].has_value()) {
            ctx.errors.push_back({
                ctx.current_path(),
                "must not be nullopt",
                "NotNullopt"
            });
        }
    }
}

The requires guard is cheap insurance. NotNullopt is only meaningful on optional-shaped members; if someone attaches it to a scalar the guard discards the branch and the compiler stays quiet. Same pattern as MinLength using .size() as its guard in the earlier post.

On the naming — NotNullopt over Required. Required is the word JSON Schema and TypeScript use, but in C++ “required” is ambiguous (the member is there by virtue of being declared; what’s required is that its value be non-empty). NotNullopt matches the naming already in the library (NotEmpty) and names the exact state it forbids. std::nullopt_t is the standard vocabulary for this concept; Required would be the translation.

The scalar-inside-optional trap

While writing Stage 13 I wanted to support [[=Range{0, 150}]] std::optional<int> — the natural reading is “if there’s a value, it has to be in range.” The annotation dispatch already handles Range for int fields, so conceptually this is one line of additional plumbing.

It took a subtle wrong turn first. The existing Range guard looks like this:

if constexpr (std::meta::type_of(ann) == ^^Range) {
    if constexpr (requires { obj.[:member:] < 0LL; }) {
        // ...
        if (v < r.min || v > r.max) { /* error */ }
    }
}

The requires clause is there to screen out types that don’t support comparison to a long long. int passes. std::string fails, so the branch is discarded for string fields. The guard gave me the impression I could leave the body alone and std::optional<int> would either fall through or work correctly.

It does not fall through. std::optional has a heterogeneous operator< against T, so std::optional<int>{std::nullopt} < 0LL compiles, and it evaluates to true, because an empty optional compares less than every value. With Range{0, 150}, a nullopt would report “must be in [0, 150], got …”: wrong condition, wrong value. Formatting an optional with std::format is not well-formed in the first place, so the fix isn’t even to change the message; the whole branch has no business firing.
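The comparison behaviour is easy to reproduce in isolation. In this sketch below_minimum and above_maximum are hypothetical names for the two halves of the Range test:

```cpp
#include <optional>

// An empty optional compares less than any value of the wrapped type,
// so the lower-bound half of a Range check reports a violation for it.
inline bool below_minimum(const std::optional<int>& v, long long min) {
    return v < min;  // heterogeneous operator<(const optional<T>&, const U&)
}

// The upper-bound half never fires for an empty optional.
inline bool above_maximum(const std::optional<int>& v, long long max) {
    return v > max;
}
```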

The right behaviour is:

  • Range, MinLength, and the other scalar annotations shouldn’t fire on an std::optional member at all from the outer dispatch.
  • If the optional has a value and the annotation applies, dispatch on the unwrapped value.

The second half needs the annotation dispatch pulled out into a helper that takes a value reference, because right now the dispatch only exists as an inline block inside validate_impl’s member loop. That refactor rewires enough code that it doesn’t belong in the same stage as “add is_optional_v and one branch.” So Stage 13 stops at the first half: an explicit !is_optional_v<MT> conjunct on the Range and MinLength guards keeps them from misfiring:

if constexpr (std::meta::type_of(ann) == ^^Range) {
    if constexpr (requires { obj.[:member:] < 0LL; }
               && !is_optional_v<MT>) {
        // ...
    }
}

NotNullopt is untouched — its guard (has_value()) already implies optional.

I left this in the stage comment and will pick it up when the dispatch refactor is the point of the stage, rather than a side-effect of it.

Four cases

struct Address {
    [[=MinLength{2}]]    std::string street;
    [[=Range{1, 99999}]] int         zip_code;
};
 
struct User {
    [[=Range{0, 150}]]  int                     age;
    [[=NotNullopt{}]]   std::optional<Address>  address;
    std::optional<Address>                      prev_address;
    [[=NotNullopt{}]]   std::optional<int>      session_id;
};

Case 1 — required optional present but invalid:

age: must be in [0, 150], got 200 (Range)
address.street: length must be >= 2, got 1 (MinLength)
address.zip_code: must be in [1, 99999], got 0 (Range)

The address field’s Address is reached through *opt, the aggregate recursion fires on its two annotated members, and the paths are built from the same path_stack push/pop that existed before. No new path logic — the optional wrapper doesn’t need a segment of its own.

Case 2 — required fields empty:

address: must not be nullopt (NotNullopt)
session_id: must not be nullopt (NotNullopt)

NotNullopt fires, recursion doesn’t (there’s no value to recurse into), and no follow-up errors mask the real problem. prev_address (no NotNullopt) is silent, which is the intended default.

Case 3 — everything valid. No output.

Case 4 — optional present without NotNullopt, inner invalid:

prev_address.street: length must be >= 2, got 0 (MinLength)
prev_address.zip_code: must be in [1, 99999], got 0 (Range)

This is the case that would have been silently empty before Stage 13. The whole point.

What this stage didn’t change

The path type is still std::vector<std::string>. Optional is a lifetime wrapper, not a path wrapper — the field name address is the right segment whether the value is present or not. That changes in the next stage, because std::vector<T> introduces indices, and indices aren’t field names.


Stage 14 — std::vector<T> and path indexing

Commit: 4cf6353

The shape of the new branch looks almost identical to the optional one:

} else if constexpr (is_vector_v<MT>) {
    using Elem = typename MT::value_type;
    if constexpr (std::is_aggregate_v<Elem>) {
        const auto& vec = obj.[:member:];
        for (std::size_t i = 0; i < vec.size(); ++i) {
            ctx.path_stack.push_back(i);
            validate_impl(vec[i], ctx);
            ctx.path_stack.pop_back();
        }
    }
}

One trait, one branch, a loop around the same validate_impl call. The interesting change is in the one line that looks trivial — ctx.path_stack.push_back(i).
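The post does not list is_vector_v itself. Presumably it mirrors is_optional; the following is a guess at its shape rather than the header's definition. The one difference from the optional trait is the second template parameter, so that vectors with a custom allocator match too.

```cpp
#include <string>
#include <type_traits>
#include <vector>

template <typename T>
struct is_vector : std::false_type {};

// Two parameters so that std::vector<T, CustomAlloc> is recognised as well.
template <typename T, typename Alloc>
struct is_vector<std::vector<T, Alloc>> : std::true_type {};

template <typename T>
constexpr bool is_vector_v = is_vector<T>::value;

static_assert(is_vector_v<std::vector<int>>);
static_assert(!is_vector_v<std::string>);  // has .size(), still not a vector
static_assert(!is_vector_v<int>);
```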

The path type has to change

Up to Stage 13 path_stack has been std::vector<std::string>, and current_path() has been for (i...) result += (i ? "." : "") + path_stack[i]. Dotted field names, address.street, address.zip_code. That’s fine when every segment is a C++ identifier — one character joins them all and nothing is lost.

Indices break the invariant. emails[0].value is not three identifiers joined with dots; it’s a field name, an index, a field name, and the separators differ by kind ([ before the index, ] after, . before the next field). If I stuff "[0]" into path_stack at push time, the string works for printing but hides the segment’s kind from everything downstream:

  • Filter ValidationErrors by path prefix. "users" should match "users[0].name" but not "users_count". That is trivial with a segment-aware comparison; on a baked string it is a matching hack that breaks as soon as someone names a field "users_archived".
  • Render to JSON Pointer (/users/0/name). Once the separator is baked as "[0]", walking it back out is a parse.
  • Localise to something human ("첫 번째 유저의 이름", Korean for “the first user’s name”). Same problem: the renderer needs to know that 0 is an index, not part of the field name.

Baking segment kinds into a string is a one-way compression. Keep the structure:

using PathSegment = std::variant<std::string, std::size_t>;
 
struct ValidationContext {
    std::vector<ValidationError>   errors;
    std::vector<PathSegment>       path_stack;
    // ...
};

current_path() becomes a small renderer that dispatches on the alternative:

std::string current_path() const {
    std::string r;
    for (const auto& seg : path_stack) {
        if (std::holds_alternative<std::size_t>(seg)) {
            r += std::format("[{}]", std::get<std::size_t>(seg));
        } else {
            const auto& name = std::get<std::string>(seg);
            if (!r.empty()) r += '.';
            r += name;
        }
    }
    return r;
}

[N] has no separator of its own; it attaches directly to whatever preceded it. Field names get a leading . iff the output so far is non-empty, which also covers the first segment without a separate special case.
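With the segments kept apart, the other consumers listed earlier become short functions over the same data. The sketch below is illustrative only: Path, to_json_pointer and has_prefix are hypothetical names, and the JSON Pointer escaping of '~' and '/' is left out.

```cpp
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

using PathSegment = std::variant<std::string, std::size_t>;
using Path        = std::vector<PathSegment>;

// Hypothetical renderer: JSON Pointer style, every segment prefixed by '/'.
// Escaping of '~' and '/' inside names is omitted from this sketch.
inline std::string to_json_pointer(const Path& path) {
    std::string r;
    for (const auto& seg : path) {
        r += '/';
        if (const auto* idx = std::get_if<std::size_t>(&seg)) {
            r += std::to_string(*idx);
        } else {
            r += std::get<std::string>(seg);
        }
    }
    return r;
}

// Hypothetical filter: true when `prefix` equals the leading segments of `path`.
// Comparison is per segment, so {"users"} never matches {"users_count"}.
inline bool has_prefix(const Path& path, const Path& prefix) {
    if (prefix.size() > path.size()) return false;
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        if (path[i] != prefix[i]) return false;
    }
    return true;
}
```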

Push sites stay close to what they were:

ctx.path_stack.push_back(std::string{std::meta::identifier_of(member)});  // field
ctx.path_stack.push_back(i);                                              // index

The push for a field is already building a std::string from identifier_of, and the variant silently accepts either alternative, so neither call site got more complicated.

MinLength’s guard, again

Stage 13 added !is_optional_v<MT> to the Range and MinLength guards so that neither fires on an optional member. std::vector<T> raises a similar problem for MinLength: the .size() check that qualifies strings qualifies vectors too. A bare guard would accept std::vector<EmailEntry> and report “length must be >= N” when the vector is too short, which is meaningful but the wrong spelling. MinLength means string characters; MinSize (next stage) means container elements. Same .size() call, different semantics. I want the dispatch to pick one deliberately, not accidentally.

if constexpr (requires { obj.[:member:].size(); }
           && !is_optional_v<MT>
           && !is_vector_v<MT>) {
    // MinLength body
}

Same pattern as Stage 13, one more conjunct.

Cases

struct EmailEntry {
    [[=MinLength{3}]] std::string value;
};
 
struct Address {
    [[=MinLength{2}]]    std::string street;
    [[=Range{1, 99999}]] int         zip_code;
};
 
struct User {
    [[=Range{0, 150}]]  int                      age;
    std::vector<EmailEntry>                      emails;
    std::vector<Address>                         past_addresses;
    [[=NotNullopt{}]]   std::optional<Address>   current_address;
};

Case 1 — mixed failures across two vectors:

emails[1].value: length must be >= 3, got 1 (MinLength)
emails[3].value: length must be >= 3, got 0 (MinLength)
past_addresses[1].street: length must be >= 2, got 0 (MinLength)
past_addresses[1].zip_code: must be in [1, 99999], got 0 (Range)

Four errors, four correct paths. Elements [0] and [2] of emails were valid and produce nothing; the walker reports only what failed.

Case 2 — required optional missing while vectors are fine:

current_address: must not be nullopt (NotNullopt)

The vector branch and the optional branch coexist with no coordination — one fires per member iteration, decided by if constexpr on MT.

Case 3 — everything valid. No output.

Case 4 — empty vectors:

(no errors)

And this is the shape of what Stage 15 has to solve. Element-level annotations apply to whatever elements exist; when the vector is empty there are no elements, so there’s nothing for the walker to look at. That’s correct as far as it goes, but it means “emails must be non-empty” is not expressible yet. Container-level annotations fix that.

What the new path type enables even before Stage 15

The printed form emails[0].value is the visible change; the structural one sits underneath. ValidationError::path in this stage is still a string, same as before, because current_path() renders on every error push. The path_stack behind it is now richer than what ValidationError exposes. A future revision that wants to surface structured paths to the caller (e.g. std::vector<PathSegment> alongside the printed form in ValidationError) has the data. That isn’t this stage’s job, but it was one of the reasons for choosing the variant over the flat-string-with-brackets shortcut.


Stage 15 — Container-level annotations

Commit: 3a64156

Stage 14 can validate things inside a vector but not statements about the vector — the “must have at least one element” shape of MinSize{1}, or “no more than five tags” of MaxSize{5}. These are pure annotation extensions, no new recursion path, no path-stack change. Two types:

struct MinSize {
    std::size_t value;
    constexpr MinSize(std::size_t v) : value(v) {}
};
 
struct MaxSize {
    std::size_t value;
    constexpr MaxSize(std::size_t v) : value(v) {}
};

Two branches in the annotation dispatch:

} else if constexpr (std::meta::type_of(ann) == ^^MinSize) {
    if constexpr (is_vector_v<MT>) {
        constexpr auto r = std::meta::extract<MinSize>(ann);
        const auto& v = obj.[:member:];
        if (v.size() < r.value) {
            ctx.errors.push_back({
                ctx.current_path(),
                std::format("size must be >= {}, got {}", r.value, v.size()),
                "MinSize"
            });
        }
    }
}

The guard is is_vector_v<MT> rather than the looser requires { .size(); }. The looser form would silently accept strings (which have .size()) and let MinSize work as a synonym for MinLength. That is the mirror image of the accident Stages 13 and 14 closed from the other direction, where MinLength would have picked up vectors. I want each annotation to mean one thing.

MinSize{1} on an optional is not the right spelling either. The vocabulary for “must be present” is NotNullopt, and MinSize is gated out of that path by the trait.
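Only the MinSize branch is listed above; MaxSize is assumed here to be its mirror image. As a sketch outside the walker, the two checks can be written as overloads of a hypothetical check function that accept std::vector only, which gives the same narrowing as the is_vector_v guard:

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

struct MinSize {
    std::size_t value;
    constexpr MinSize(std::size_t v) : value(v) {}
};

struct MaxSize {
    std::size_t value;
    constexpr MaxSize(std::size_t v) : value(v) {}
};

// Hypothetical free functions standing in for the two dispatch branches.
// Each returns an error message, or nullopt when the bound holds.
template <typename T>
std::optional<std::string> check(MinSize r, const std::vector<T>& v) {
    if (v.size() < r.value) {
        return "size must be >= " + std::to_string(r.value) +
               ", got " + std::to_string(v.size());
    }
    return std::nullopt;
}

template <typename T>
std::optional<std::string> check(MaxSize r, const std::vector<T>& v) {
    if (v.size() > r.value) {  // strict comparison, so the bound is inclusive
        return "size must be <= " + std::to_string(r.value) +
               ", got " + std::to_string(v.size());
    }
    return std::nullopt;
}
```

Passing a std::string or a std::optional to either overload fails to compile, which is stricter than the walker's behaviour of quietly discarding the branch.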

Output shape

struct User {
    [[=Range{0, 150}]]            int                       age;
    [[=MinSize{1}, =MaxSize{5}]]  std::vector<EmailEntry>   emails;
    [[=MaxSize{3}]]               std::vector<std::string>  tags;
};

The tags field is an interesting case on its own — its element type is std::string, which isn’t aggregate, so the Stage 14 element-recursion branch doesn’t fire for it. Container-level annotations apply anyway, because they’re decided at annotation dispatch and don’t depend on whether the element branch runs.

---- case 1: emails empty, tags too long ----
  emails: size must be >= 1, got 0 (MinSize)
  tags: size must be <= 3, got 4 (MaxSize)

Case 2 demonstrates the interaction with Stage 14’s recursion:

---- case 2: emails oversized + invalid elements ----
  emails: size must be <= 5, got 6 (MaxSize)
  emails[1].value: length must be >= 3, got 1 (MinLength)
  emails[5].value: length must be >= 3, got 0 (MinLength)

Container-level annotations fire first (same iteration, annotation dispatch precedes type-driven recursion), then the walker descends through every element regardless — the collector does not short-circuit just because the container itself is invalid. That’s the “core always collects” principle from the earlier post; MaxSize failing is not a reason to stop looking at emails[5].

Case 4 confirms that the bounds are inclusive:

---- case 4: size bounds inclusive ----
  (no errors — [>= 1] and [<= 3] are inclusive)

emails is size 1 against MinSize{1} and MaxSize{5}; tags is size 3 against MaxSize{3}. Both pass.

Scope — container taxonomy

MinSize and MaxSize are gated on is_vector_v, not on a “has .size()” trait. That means std::array<T, N> (statically sized, probably redundant), std::map, std::unordered_map, std::set — all silently unsupported by the dispatch right now. Extending to each is a one-line addition to the trait family (is_array, is_map, …) plus an adjustment to the guard. The decision I’m making here is not “these containers aren’t worth validating,” but “one container per stage, expand the taxonomy when there’s a concrete reason.”
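If the taxonomy were widened later, one possible shape is a trait per container plus a combined guard. Nothing below exists in the header: is_map, is_set and is_sized_container_v are hypothetical names, and the choice of containers is only an example.

```cpp
#include <map>
#include <set>
#include <string>
#include <type_traits>
#include <vector>

template <typename T> struct is_vector : std::false_type {};
template <typename T, typename A>
struct is_vector<std::vector<T, A>> : std::true_type {};

template <typename T> struct is_map : std::false_type {};
template <typename K, typename V, typename C, typename A>
struct is_map<std::map<K, V, C, A>> : std::true_type {};

template <typename T> struct is_set : std::false_type {};
template <typename K, typename C, typename A>
struct is_set<std::set<K, C, A>> : std::true_type {};

// Hypothetical combined guard for MinSize / MaxSize: an explicit allow-list,
// so std::string stays out even though it has .size().
template <typename T>
constexpr bool is_sized_container_v =
    is_vector<T>::value || is_map<T>::value || is_set<T>::value;

static_assert(is_sized_container_v<std::vector<int>>);
static_assert(is_sized_container_v<std::map<std::string, int>>);
static_assert(is_sized_container_v<std::set<int>>);
static_assert(!is_sized_container_v<std::string>);
```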


Where this leaves the walker

The type-driven recursion in the walker now branches three ways:

if      constexpr (is_optional_v<MT>)        { /* has_value() then recurse if aggregate */ }
else if constexpr (is_vector_v<MT>)          { /* push index, recurse per element if aggregate */ }
else if constexpr (std::is_aggregate_v<MT>)  { /* recurse directly */ }

Each branch is two or three lines. The three conditions are mutually exclusive, since std::optional and std::vector are not aggregates, so the cascade order does not change which branch runs. Listing the wrappers first and is_aggregate_v last is for readability: the generic aggregate case reads as the default when nothing more specific applied.
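The cascade can be exercised on its own by turning each branch into a tag. A sketch using standard traits only; Walk and classify are hypothetical names, and Leaf stands for the implicit "no branch taken" case.

```cpp
#include <optional>
#include <string>
#include <type_traits>
#include <vector>

template <typename T> struct is_optional : std::false_type {};
template <typename T> struct is_optional<std::optional<T>> : std::true_type {};
template <typename T> constexpr bool is_optional_v = is_optional<T>::value;

template <typename T> struct is_vector : std::false_type {};
template <typename T, typename A>
struct is_vector<std::vector<T, A>> : std::true_type {};
template <typename T> constexpr bool is_vector_v = is_vector<T>::value;

struct Address {
    std::string street;
    int         zip_code;
};

enum class Walk { Optional, Vector, Aggregate, Leaf };

// Hypothetical classifier with the same three-way cascade as the walker.
template <typename T>
constexpr Walk classify() {
    using MT = std::remove_cv_t<std::remove_reference_t<T>>;
    if constexpr (is_optional_v<MT>)            return Walk::Optional;
    else if constexpr (is_vector_v<MT>)         return Walk::Vector;
    else if constexpr (std::is_aggregate_v<MT>) return Walk::Aggregate;
    else                                        return Walk::Leaf;
}
```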

The annotation dispatch gained two kinds of guards along the way:

  • !is_optional_v<MT> / !is_vector_v<MT> on Range and MinLength, preventing their requires probes from silently accepting wrapper types.
  • is_vector_v<MT> on MinSize / MaxSize, keeping them out of string fields and optionals.

Either direction of guard is a deliberate narrowing. The walker could be more permissive (letting MinLength apply to vectors under a “length means size” reading, letting MinSize{1} stand in for NotNullopt) and the compile would still go through. The choice to keep annotations distinct is about error messages and intent clarity, not a C++-level constraint.

There are two deferred items from this series worth naming explicitly:

  1. Annotation dispatch as a value-taking helper. std::optional<int> with [[=Range{0, 150}]] should validate *opt when has_value(), and std::vector<int> with [[=Range{0, 150}]] should validate each element. Both require pulling the current inline dispatch out into a function that takes a value reference, at which point the wrapper branches can call back into it. That refactor is the point of a future stage.
  2. Nested composition. std::vector<std::optional<Address>> and std::optional<std::vector<Address>> don’t recurse correctly today because validate_impl<T> is written around nonstatic_data_members_of(^^T), which assumes T has struct-like members. Fixing this probably means splitting validate_impl into “walk the members of an aggregate” and “dispatch a value through the type-driven branches.” Same refactor as (1), more or less.

Both are scoped out of this post. The shape is clear; the work isn’t done.
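To make the walk/dispatch split concrete, here is a non-reflective sketch of one way it could look. It is not the refactor from the follow-up post: walk_members is written by hand for a single type, dispatch is a hypothetical overload set, and paths are flat strings only to keep the example short.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

struct Address {
    std::string street;
    int         zip_code;
};

using Errors = std::vector<std::string>;

// "Walk the members of an aggregate": hand-written here for Address only;
// in the real header this is the part driven by reflection.
inline void walk_members(const Address& a, const std::string& path, Errors& errors) {
    if (a.street.size() < 2)                  errors.push_back(path + ".street");
    if (a.zip_code < 1 || a.zip_code > 99999) errors.push_back(path + ".zip_code");
}

// "Dispatch a value": declared up front so the overloads can call each other.
template <typename T> void dispatch(const T&, const std::string&, Errors&);
template <typename T> void dispatch(const std::optional<T>&, const std::string&, Errors&);
template <typename T> void dispatch(const std::vector<T>&, const std::string&, Errors&);

// Optional: unwrap if present, then dispatch whatever was inside.
template <typename T>
void dispatch(const std::optional<T>& opt, const std::string& path, Errors& errors) {
    if (opt.has_value()) dispatch(*opt, path, errors);
}

// Vector: add an index to the path, then dispatch each element.
template <typename T>
void dispatch(const std::vector<T>& vec, const std::string& path, Errors& errors) {
    for (std::size_t i = 0; i < vec.size(); ++i) {
        dispatch(vec[i], path + "[" + std::to_string(i) + "]", errors);
    }
}

// Fallback: anything else is treated as an aggregate (only Address compiles here).
template <typename T>
void dispatch(const T& value, const std::string& path, Errors& errors) {
    walk_members(value, path, errors);
}
```

Because each wrapper overload calls back into dispatch rather than into the member walk, the two nested shapes compose without a dedicated case for either.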

Update: resolved in One Refactor, Three Payoffs. The value-taking helper from (1) and the walk/dispatch split from (2) turned out to be the same refactor, and it lands both cases plus a third — signature-selected predicate scope on the same field.

EOF — 2026-04-18