
Annotation IS the Validator


The previous post closed by naming two things Stage 17 left on the table:

Predicate diagnostics. Still "custom predicate failed", hardcoded. […]

User-defined validator protocols. Predicate<F> is the cheapest form of user extension: the annotation is a wrapper, the callable is the payload, the contract is fixed. A richer form would let the user write their own annotation class with a validate(v, ctx) member, and the walker would call into it […]

I had expected these to be two separate stages. They turned out to be the same work — once the walker learns “ask the annotation to validate itself” as a dispatch shape, Predicate<F> becomes a specialization of it (and gains its message field as a drop-in), and every other closed-set annotation follows. The whole type_of(ann) == ^^Range | ^^MinLength | ^^MinSize | … ladder collapses into one branch.


The symptom, restated

Stage 17’s ladder looks like this (abbreviated):

template for (constexpr auto ann :
              std::define_static_array(
                  std::meta::annotations_of(Member)))
{
    if constexpr (std::meta::type_of(ann) == ^^Range) {
        if constexpr (requires { v < 0LL; }
                   && !is_optional_v<V>
                   && !is_vector_v<V>) {
            constexpr auto r = std::meta::extract<Range>(ann);
            if (v < r.min || v > r.max) { /* push error */ }
        }
    }
    else if constexpr (std::meta::type_of(ann) == ^^MinLength) { /* ... */ }
    else if constexpr (std::meta::type_of(ann) == ^^MinSize)   { /* ... */ }
    else if constexpr (std::meta::type_of(ann) == ^^MaxSize)   { /* ... */ }
    else if constexpr (std::meta::type_of(ann) == ^^NotNullopt){ /* ... */ }
    else if constexpr (std::meta::template_of(
                           std::meta::type_of(ann)) == ^^Predicate) { /* ... */ }
}

Six if constexpr branches, each hardcoded against a specific annotation type. The library author has to add a branch here every time they introduce an annotation. A user who wants “starts with uppercase” or “IPv4-looking” can’t add a branch here without forking the library — Stage 16’s Predicate<F> was the escape hatch for that, and Stage 17 made it compose through wrappers.

But the escape hatch still goes through Predicate<F>. If I want my own annotation that takes a two-argument message + predicate, or carries an enum severity, or does anything that doesn’t fit the F f; shape, I’m back to forking. And the library’s own branches are still hardcoded — the user’s escape hatch doesn’t help the library author shrink the ladder.

The fix, stated as a question: can the dispatch be ann.validate(v, ctx) — period, no type switch? That is, can every annotation carry its own validation logic and the walker just ask?


This is a design where the runtime shape looks easy (dispatching through a member function) but the reflection shape has several places it could break. Writing the walker first and finding out later which rule rejects it would produce a stage where the failure is buried three layers deep. Three short probes first.

Probe — template member function on an annotation struct as NTTP

Commit: d952c2b

The structural-type rule (C++26) admits literal classes whose bases and non-static data members are public and non-mutable; member functions don’t participate in the equality/equivalence definition. So in principle a template member function should be transparent to the NTTP rule. The question is whether clang-p2996 agrees.

struct StartsWithUppercase {
    template <typename V, typename Ctx>
    constexpr void validate(const V& v, Ctx& ctx) const {
        if (v.empty() || v[0] < 'A' || v[0] > 'Z') {
            ctx.errors.push_back("must start with uppercase");
        }
    }
};
 
struct Probe {
    [[=StartsWithUppercase{}]] std::string name;
};

Compile, run, invoke the member through reflection. It works. StartsWithUppercase{} rides [[=...]] just like Range{0,150} does, and the walker can call its template validate against p.[:member:] without complaint. Call this question Q2 (Q1 and Q3 belong to the next probe); it passes.

Probe — requires { a.validate(v, ctx); } as a dispatch guard

Commit: d952c2b

Two smaller worries about using a requires expression as the dispatch condition:

  • Q1. ctx is mutable — validate pushes errors into its vector, so the reference is non-const. The requires clause has to typecheck the call against the actual call site, which means instantiating validate with ValidationContext&. If that triggers some rule about constant-evaluation contexts, the guard fails for the wrong reason.
  • Q3. Stage 5 (long ago) observed that template for with reflection-dependent if constexpr branches can fail to discard on non-matching members: the body of a wrong-typed branch gets typechecked against a member that doesn’t satisfy its guard and blows up. If that recurs here, I need a two-level if constexpr (requires { … }) shell.

A minimal probe mixes protocol-style annotations with a legacy value carrier:

struct MustBePositive {
    template <typename V, typename Ctx>
    constexpr void validate(const V& v, Ctx& ctx) const {
        if (v <= 0) { ctx.errors.push_back(std::format(
            "must be positive, got {}", v)); }
    }
};
 
struct RangeLegacy { long long min, max; };  // no validate member
 
struct Probe {
    [[=StartsWithUppercase{}]] std::string name;
    [[=MustBePositive{}]]      int         count;
    [[=RangeLegacy{0,150}]]   int         age;
};

And the dispatch body:

if constexpr (requires { a.validate(p.[:member:], ctx); }) {
    a.validate(p.[:member:], ctx);
}
else if constexpr (std::meta::type_of(ann) == ^^RangeLegacy) {
    // explicit fallback
}

The requires clause accepts StartsWithUppercase against the string member and MustBePositive against the int member. Each annotation is only ever dispatched against the member it is attached to, so neither validate is instantiated against the other field’s type. The RangeLegacy annotation has no validate member, fails the guard, and falls through to the explicit branch. No typecheck pollution, no spurious errors, no Stage-5-style discard failure. Q1 and Q3 both pass.

(The protocol is robust enough that the fallback branch isn’t even needed in the real file — every annotation gets a validate member. The fallback was in the probe to verify that a mixed scenario compiles, in case of future hybrid use.)

Probe — identifier_of on a template specialization

Commit: d952c2b

The one probe that didn’t work the way I’d hoped. I wanted the walker to auto-extract the annotation’s name so the "Range" / "Predicate" / "MinLength" strings in error messages would come from reflection instead of being hardcoded.

using A = [:std::meta::type_of(ann):];
constexpr auto name = std::meta::identifier_of(^^A);   // ??

For Range (a plain class), identifier_of(^^Range) gives "Range". For Predicate<SomeClosure> (a class-template specialization), it fails:

error: call to consteval function 'std::meta::identifier_of'
       is not a constant expression
note: names of template specializations are not identifiers

std::meta::template_of lets you peel a specialization to its primary, so in principle you could guard on “is this a specialization” and take the primary’s identifier. The problem: the probe for “is this a specialization” itself isn’t cheap to express. template_of is consteval and rejects at evaluation time when the reflection isn’t a specialization, and a requires clause doesn’t catch that (it checks only that the expression is well-formed; it never evaluates it).

The right fix is probably a std::meta::is_class_template_specialization_of metafunction that returns a plain bool. clang-p2996 has some queries along those lines, but I’d rather not chase the specific spelling for what’s ultimately a polish feature. Stage 17’s pattern — each annotation embeds its own name in the error — stays, and the validate body just hardcodes the string:

ctx.errors.push_back({
    ctx.current_path(),
    std::format("must be in [{}, {}], got {}", min, max, v),
    "Range"
});

Users writing custom annotations do the same. It’s boilerplate, not a boundary.
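To make the “plain bool” idea concrete, here is the pre-reflection trait idiom for the same question. It is a sketch, not a std::meta facility: is_specialization_of_v and name_source are hypothetical names, the trait only handles templates shaped like <typename, std::size_t>, and the two structs are reduced to their data members.

```cpp
#include <cstddef>
#include <string_view>

// Annotation shapes from the post, reduced to their data members.
struct Range { long long min, max; };

template <typename F, std::size_t N = 24>
struct Predicate {
    F f;
    char message[N] = "custom predicate failed";
};

// Plain-bool query: is T a specialization of Tmpl? Limited to templates
// shaped like <typename, std::size_t>; this is not a std::meta function.
template <typename T, template <typename, std::size_t> class Tmpl>
inline constexpr bool is_specialization_of_v = false;

template <template <typename, std::size_t> class Tmpl, typename F, std::size_t N>
inline constexpr bool is_specialization_of_v<Tmpl<F, N>, Tmpl> = true;

// Because the query is an ordinary bool, if constexpr can branch on it
// instead of tripping over a failed consteval evaluation.
template <typename A>
constexpr std::string_view name_source() {
    if constexpr (is_specialization_of_v<A, Predicate>) {
        return "primary template";  // would ask the primary for its identifier
    } else {
        return "class itself";      // would ask the class for its identifier
    }
}

static_assert(is_specialization_of_v<Predicate<int (*)(int)>, Predicate>);
static_assert(!is_specialization_of_v<Range, Predicate>);
static_assert(name_source<Range>() == "class itself");
```

A reflection-level equivalent would do the same branching on std::meta::info values; the point is only that the guard has to be a value if constexpr can test, not an expression that fails during evaluation.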


The refactor

Commit: d952c2b

Every annotation gets a validate(v, ctx) template member. The walker’s annotation ladder becomes one branch.

struct Range {
    long long min, max;
    constexpr Range(long long lo, long long hi) : min(lo), max(hi) {}
 
    template <typename V, typename Ctx>
    constexpr void validate(const V& v, Ctx& ctx) const {
        if constexpr (requires { v < 0LL; }
                   && !is_optional_v<V>
                   && !is_vector_v<V>) {
            if (v < min || v > max) {
                ctx.errors.push_back({
                    ctx.current_path(),
                    std::format("must be in [{}, {}], got {}", min, max, v),
                    "Range"
                });
            }
        }
    }
};

The requires { v < 0LL; } && !is_optional_v<V> && !is_vector_v<V> is the same guard that was in Stage 17’s Range branch — just now living inside Range itself, rather than at the top-level dispatch. The annotation owns its applicability. MinLength, MinSize, MaxSize, NotNullopt all migrate the same way: take their guard from the old ladder, put it inside their own validate, done.

Predicate gets the message field it was promised, using Stage 12’s Regex<N> NTTP pattern:

template <typename F, std::size_t N = 24>
struct Predicate {
    F f;
    char message[N] = "custom predicate failed";
 
    template <typename V, typename Ctx>
    constexpr void validate(const V& v, Ctx& ctx) const {
        if constexpr (requires { { f(v) } -> std::same_as<bool>; }) {
            if (!f(v)) {
                ctx.errors.push_back({
                    ctx.current_path(),
                    std::string{message},
                    "Predicate"
                });
            }
        }
    }
};
 
template <typename F>
Predicate(F) -> Predicate<F, 24>;
 
template <typename F, std::size_t N>
Predicate(F, const char (&)[N]) -> Predicate<F, N>;

The two deduction guides let Predicate{f} keep its default-message behavior and Predicate{f, "custom message"} pick up the literal’s length as N. 24 is enough for "custom predicate failed" plus its null terminator.
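The guide selection can be checked in isolation. The sketch below assumes nothing beyond C++20: it keeps only Predicate’s data members and the two guides, and the names is_even, default_msg and custom_msg are hypothetical. The custom literal is deliberately longer than the default so the example does not depend on how a compiler treats the unused default initializer.

```cpp
#include <cstddef>
#include <string_view>

// Data members and guides only; validate is omitted for brevity.
template <typename F, std::size_t N = 24>
struct Predicate {
    F f;
    char message[N] = "custom predicate failed";
};

template <typename F>
Predicate(F) -> Predicate<F, 24>;

template <typename F, std::size_t N>
Predicate(F, const char (&)[N]) -> Predicate<F, N>;

inline constexpr auto is_even = [](int x) { return x % 2 == 0; };

// One argument: first guide, N = 24, message comes from the default initializer.
inline constexpr Predicate default_msg{is_even};

// Two arguments: second guide, N is the literal's size including the null.
inline constexpr Predicate custom_msg{is_even, "value must be an even number"};

static_assert(sizeof(default_msg.message) == 24);
static_assert(sizeof(custom_msg.message) == 29);  // 28 characters + null
static_assert(std::string_view{default_msg.message} == "custom predicate failed");
static_assert(std::string_view{custom_msg.message} == "value must be an even number");
```

Both objects are constexpr aggregates, which is the precondition for using them as annotation values.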

And the walker:

template <std::meta::info Member, typename V>
void dispatch_value(const V& v, ValidationContext& ctx) {
    template for (constexpr auto ann :
                  std::define_static_array(
                      std::meta::annotations_of(Member)))
    {
        using A = [:std::meta::type_of(ann):];
        constexpr auto a = std::meta::extract<A>(ann);
        if constexpr (requires { a.validate(v, ctx); }) {
            a.validate(v, ctx);
        }
    }
 
    if constexpr (is_optional_v<V>) {
        if (v.has_value()) dispatch_value<Member>(*v, ctx);
    } else if constexpr (is_vector_v<V>) {
        for (std::size_t i = 0; i < v.size(); ++i) {
            ctx.path_stack.push_back(i);
            dispatch_value<Member>(v[i], ctx);
            ctx.path_stack.pop_back();
        }
    } else if constexpr (std::is_aggregate_v<V>) {
        walk_members(v, ctx);
    }
}

Six branches in the ladder, gone. One branch — the protocol call — in their place. The wrapper-piercing recursion from Stage 17 is unchanged; that machinery was right, and it’s decoupled from which annotations exist. walk_members is unchanged.

The diff from Stage 17 is: every annotation gained a validate member, the ladder went from 6 branches to 1, and a pair of deduction guides got added to Predicate. That’s the whole structural change.


Payoff 1 — User-defined annotations

The thing Stage 16 was trying to open, now fully open. A user who wants “starts with uppercase” writes:

struct StartsWithUppercase {
    template <typename V, typename Ctx>
    constexpr void validate(const V& v, Ctx& ctx) const {
        if constexpr (requires {
                          { v.empty() } -> std::convertible_to<bool>;
                          v[0];
                      }) {
            if (v.empty() || v[0] < 'A' || v[0] > 'Z') {
                ctx.errors.push_back({
                    ctx.current_path(),
                    "must start with an uppercase letter",
                    "StartsWithUppercase"
                });
            }
        }
    }
};

and attaches it:

struct P1User {
    [[=StartsWithUppercase{}, =MinLength{3}]] std::string name;
    [[=MustBePositive{}, =Range{0, 150}]]     int         age;
};

StartsWithUppercase and MustBePositive are pure user types. Neither appears anywhere in the walker, in the dispatch, or in any trait. The library doesn’t know they exist. But they share the same annotation site with the built-in MinLength and Range, and they all fire through the same one-branch dispatch:

---- P1: user-defined + built-in on same field ----
  name: must start with an uppercase letter (StartsWithUppercase)
  name: length must be >= 3, got 2 (MinLength)
  age: must be in [0, 150], got 200 (Range)

Walk through name = "al" specifically. dispatch_value<member_name>("al", ctx):

  • Ladder iteration 1: ann is StartsWithUppercase{}. using A = StartsWithUppercase;, a = extract<A>(ann). requires { a.validate("al", ctx); } passes — A::validate is a template that instantiates against std::string just fine. Body runs: v.empty() is false, v[0] = 'a' fails the uppercase check, error pushed.
  • Ladder iteration 2: ann is MinLength{3}. Same pattern: requires { a.validate(...); } passes (MinLength::validate is a template), body runs, v.size() == 2 < 3, error pushed.

For age = 200: MustBePositive{} passes (200 > 0, body runs but no error), Range{0, 150} fails (200 > 150, error pushed). Two annotations, only the failing one produces output.

This is what Stage 16 set out to make possible, Stage 17 made compositional, and Stage 18 makes symmetric. Every annotation — user-written or library-provided — looks exactly the same to the dispatch.


Payoff 2 — Predicate gains its message

Stage 17 ended with "custom predicate failed" as the only message Predicate could emit. The protocol migration turns this from a library limitation into a field:

struct P2User {
    [[=Predicate{[](int x) { return x % 2 == 0; }}]]
    int default_msg;
 
    [[=Predicate{[](int x) { return x > 0; }, "count must be positive"}]]
    int custom_msg;
};
 
P2User u{.default_msg = 3, .custom_msg = -5};

Output:

default_msg: custom predicate failed (Predicate)
custom_msg: count must be positive (Predicate)

Predicate{f} hits the Predicate(F) -> Predicate<F, 24> deduction guide, gets message = "custom predicate failed" from the default member initializer (23 chars + null, exactly fills the array). Predicate{f, "count must be positive"} hits the Predicate(F, const char (&)[N]) -> Predicate<F, N> guide with N = 23 (the literal’s size including null), and message is initialized from the literal. Both forms are valid NTTP values (structural — char arrays in non-mutable fields), so both ride [[=…]] without special handling.

There was nothing to refactor in the dispatch for this. Once every annotation is protocol-based, Predicate is just one more annotation whose validate pulls from its own fields — f and message — and the walker is unaware.

The capacity bound (N = 24 default, or the literal’s exact length for custom messages) is the honest cost. If a user wants a 100-character message they write the literal and the deduction guide picks N = 101. There’s no runtime string in the annotation value, by design: std::string isn’t a structural type (its data members are private and it owns dynamic storage), so it can’t appear in an NTTP-valued annotation. Char arrays can, and do.


Payoff 3 — Protocol carries through wrappers

Stage 17’s whole point was that the annotation ladder runs at every level of the walk. The ladder is now one branch long, but the “runs at every level” property is unchanged — the wrapper-piercing recursion is the same code. So user-defined annotations inherit wrapper-traversal automatically:

struct P3User {
    [[=StartsWithUppercase{}]] std::optional<std::string> title;
    [[=MustBePositive{}]]      std::vector<int>           scores;
};
 
P3User u{
    .title  = std::string{"lowercase title"},
    .scores = {3, -1, 0, 7},
};

Output:

title: must start with an uppercase letter (StartsWithUppercase)
scores[1]: must be positive, got -1 (MustBePositive)
scores[2]: must be positive, got 0 (MustBePositive)

title’s walk:

  • dispatch_value<member_title>(optional<string>{...}, ctx).
  • Ladder: a = StartsWithUppercase{}. requires { a.validate(optional<string>{...}, ctx); } — template instantiates, body runs, but the inner guard requires { v.empty(); v[0]; } fails on optional<string>. Body is empty. No error at this level.
  • Recursion: is_optional_v<V> true, has_value() true, recurse with V = std::string.
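  • Ladder: a.validate(string, ctx). The inner guard now passes, the body fires, "lowercase title"[0] = 'l', error pushed with path title.

Same story on scores: the outer vector<int> level skips the MustBePositive body (inner guard fails), the vector recursion iterates elements, each element enters the ladder with V = int, the guard passes, and the bodies fire on the negative and zero cases. This relies on MustBePositive carrying an inner guard along the lines of requires { v <= 0; }; the unguarded probe version shown earlier would be a hard error once its body is instantiated against vector<int>.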

This is Stage 17’s “annotation traverses wrappers” property operating on user-defined annotations without any new machinery. The protocol migration didn’t break it, and it didn’t make user-written validate() bodies treat wrappers specially — the inner guard and the outer recursion take care of it.


Payoff 4 — Nested composition

Stage 17’s Q2 was vector<optional<Address>> with aggregate recursion inside the optional, composed from the same dispatch. Stage 18 inherits it unchanged:

struct Address {
    [[=MinLength{2}]]    std::string street;
    [[=Range{1,99999}]] int         zip_code;
};
 
struct P4User {
    std::vector<std::optional<Address>> past_addresses;
};
 
P4User u{.past_addresses = {
    Address{.street = "X",  .zip_code = 0},       // both fail
    std::nullopt,                                  // skip
    Address{.street = "OK", .zip_code = 12345},    // pass
    Address{.street = "Y",  .zip_code = 100000},   // both fail
}};

Output:

past_addresses[0].street: length must be >= 2, got 1 (MinLength)
past_addresses[0].zip_code: must be in [1, 99999], got 0 (Range)
past_addresses[3].street: length must be >= 2, got 1 (MinLength)
past_addresses[3].zip_code: must be in [1, 99999], got 100000 (Range)

The MinLength and Range here are the migrated ones (same name, same shape, but now protocol-based with validate members). They fire on street and zip_code respectively, through vector indexing → optional unwrap → aggregate walk → member iteration → dispatch_value on the scalar. Nothing in that chain is aware that Range used to be in a dedicated branch of the ladder; the protocol call is interchangeable with the type-switch for the annotations that existed before.


Payoff 5 — Signature-selected scope (still)

Stage 17’s Q3 was the money shot — two Predicate annotations on the same vector<int> field, one taking const vector<int>&, one taking int, and the dispatch firing them at container-level and element-level respectively because the requires guard inside Predicate selected the scope. With the protocol-based Predicate (which now also carries a message), the same property holds, and the messages come out distinct:

struct P5User {
    [[=Predicate{[](const std::vector<int>& v) { return !v.empty(); },
                 "list must be non-empty"},
      =Predicate{[](int x) { return x > 0; },
                 "element must be positive"}]]
    std::vector<int> entries;
};

Three cases:

---- P5a: empty — container predicate fires ----
  entries: list must be non-empty (Predicate)

---- P5b: mixed — element predicate fires per-index ----
  entries[1]: element must be positive (Predicate)
  entries[3]: element must be positive (Predicate)

---- P5c: all positive, non-empty — no errors ----
  (no errors)

The dispatch for P5b, concretely. At the container level (V = std::vector<int>):

  • Predicate #1 with F1 = [](const vector<int>&) { … }, message = "list must be non-empty". Ladder iteration enters — requires { a.validate(vec, ctx); } passes. Inside validate, the inner guard requires { { f(v) } -> std::same_as<bool>; } passes (F1(vector<int>) returns bool). f(vec) returns true (vector is non-empty). No error.

  • Predicate #2 with F2 = [](int) { … }, message = "element must be positive". Ladder iteration enters: requires { a.validate(vec, ctx); } passes (Predicate::validate is a template). Inside validate, the inner guard requires { { f(v) } -> std::same_as<bool>; } fails, because F2(vector<int>) isn’t valid. Body empty.

Then the vector recursion runs the ladder per-element. At element level (V = int):

  • Predicate #1: outer guard passes. Inside validate, the inner guard requires { { f(v) } -> std::same_as<bool>; } fails, because F1(int) is invalid. Body empty.

  • Predicate #2: inner guard now passes. f(element) runs. For element = -1, returns false, error pushed with path entries[1]. For element = 0, same (path entries[3]). For 3 and 7, passes silently.

The outer guard (requires { a.validate(v, ctx); }) uniformly passes for both predicates at both levels, because validate itself is a template and accepts any V. It’s the inner guard that does the scope selection. That inner guard is local to Predicate::validate’s body; it’s not part of the library dispatch.

So the scope-selection mechanism is portable. A user who writes their own wrapper annotation can do the same pattern: outer signature accepts any V, inner guard decides which Vs are meaningful.


Where this leaves the validator

The ladder has one branch. Adding a new annotation is writing a struct with validate(const V&, Ctx&) and putting its requires guards inside. The user and the library author do the same work, write the same shape, get the same treatment from the walker.

What the protocol migration doesn’t change:

  • std::string in annotation values. Still not a structural type, still can’t be used as an NTTP. The char[N] pattern used by Regex<N> and now Predicate<F, N> is the workaround, and it’ll keep being the workaround for any annotation that wants a runtime-flexible string. Fixed-size char arrays are a real capacity constraint but a small one in practice.
  • The path type. Still vector<variant<string, size_t>> from Stage 14. Walker still pushes string segments for fields and size_t segments for vector indices. Optional unwrapping doesn’t push. Same as Stage 17.
  • The three entry points. validate / check / collect from Stage 7 are still thin wrappers over the context; the protocol migration is entirely below them. The header-only library’s public API didn’t move.

Things that are now scope cuts or plausibly-next stages:

  1. A proper is_class_template_specialization_of<T, Template> query, to make the identifier_of auto-extraction work. Would let "Range" / "Predicate" etc. come from reflection instead of being hardcoded inside each validate body. Polish; not blocking.
  2. Wrapper-type extension. std::expected<T, E> is the obvious next wrapper — success-or-error semantics, annotation should validate the success value the same way it validates an optional’s present value. One new trait (is_expected_v), one new branch in the recursion section. The protocol side needs nothing.
  3. Constexpr validation. Validating a struct User u{…}; at compile time against its annotations, producing a static_assert failure if any annotation rejects. Most of the pieces are already in place: the validate members are declared constexpr, and while ValidationContext isn’t constexpr-friendly, it could be replaced by a collector that is. Worth exploring; would let “wrong data” be a compile error for literal structs.
  4. Message interpolation. Range{0, 150} currently pushes "must be in [0, 150], got 200" from std::format inside validate. A hypothetical i18n layer would want to push a template instead ("out_of_range" + args), and render later. That’s a context change, not a dispatch change — the protocol already makes it easy for different annotations to push different shapes.
  5. include/validator.hpp migration. The header-only release is still at Stage 8’s closed-set shape. Moving the Stage 18 protocol into it is mostly mechanical: every annotation gains a validate member, the dispatch collapses, the entry points don’t change. Worth doing before the next stage cares about it.

The arc that’s been running since Stage 16 (“can the annotation set be opened?”) is now complete. Stage 16 opened it with one wrapper branch. Stage 17 made that branch compose through wrapper types. Stage 18 made every branch into that one shape, and the library stopped having an annotation list at all. The walker asks the annotation to validate. The annotation decides what to do. The rest is the same recursion Stage 17 wrote.
EOF — 2026-04-20