Opening a Closed Annotation Set with Structural Lambdas

The validator across the earlier posts ships six annotation types: Range, MinLength, MaxLength, MinSize, MaxSize, NotNullopt. The dispatch switch at the top of validate_impl enumerates exactly those. A user who wants “name starts with an uppercase letter” or “age is not thirteen” has no way in.

struct User {
    [[=av::Range{0,150}]]  int         age;
    [[=av::MinLength{3}]]   std::string name;
    // I want: "name starts with an uppercase letter."
    // There is no annotation for that.
};

Extending the library by adding another annotation type — StartsWithUppercase, say — gets you one validation rule at the cost of one branch in the dispatch switch, and it only works for the one rule. The interesting question isn’t “which specific rules should the library ship” but “can the dispatch be opened at all, so that users bring their own predicate at the annotation site.” In other words: can a callable ride through [[=...]] syntax?

This isn’t as immediate as “just store a function pointer.” It sits at the intersection of three C++ rules that pull against each other: annotation-value constant-evaluation, NTTP structural-type, and template-parameter equivalence. Running the callable, once the annotation has been successfully parsed, is trivial. Getting it past the attribute is the work.

Three sub-questions stake out the terrain. Each has a clean pass/fail under clang-p2996:

Q1 (baseline). Can a captureless lambda’s closure type be used as an NTTP? The C++26 draft says yes — captureless closure types are structural. The test is whether clang-p2996 has caught up to that rule.
Q2 (the expected fail). What happens with a capturing lambda? Capturing closure types are not structural, so the annotation declaration should be rejected. The goal is to confirm that the rejection comes specifically from the structural rule, not from some earlier check (variable storage duration, parsing, whatever).
Q3 (identity). Two fields, two textually identical captureless lambdas. Do they share a closure type the way two identical Regex<N> annotations share an instantiation under the value-NTTP fold? The language answer is no — every lambda-expression gets its own closure type regardless of spelling — but it’s worth probing directly because the previous post’s fold result conditioned the intuition.

The wrapper annotation

Commit: 02cc12c

There are two shapes a callable annotation could take:

// (A) Naked callable as annotation value.
[[=[](const std::string& s) { return !s.empty() && s[0] >= 'A'; }]]
std::string name;

// (B) Wrapper annotation carrying the callable.
[[=Predicate{[](const std::string& s) { return !s.empty() && s[0] >= 'A'; }}]]
std::string name;

(A) is the more arresting syntax. (B) is the one I’m using, for four reasons.

Dispatch has a handle. The existing switch is type_of(ann) == ^^Range, type_of(ann) == ^^MinLength, and so on. A wrapper lets custom predicates sit on the same switch with template_of(type_of(ann)) == ^^Predicate, keyed on the wrapper, not on a raw callable’s closure type. With (A) the dispatch would have to be “this annotation is not one of the known types, so it must be a callable” — a negative identification, easy to misfire.
Room for metadata. A first-class wrapper can grow fields. A later version of Predicate can carry name, message, code — all the bits needed for good diagnostics. The naked callable has nowhere to put them.
Separation of concerns. (B) keeps “is this a predicate annotation” and “what’s the callable it carries” as two distinct questions. (A) fuses them. When something fails — NTTP rejection, extraction failure, whatever — (B) tells you which layer, (A) doesn’t.
Consistency with the existing annotations. Range is a struct, MinLength is a struct, MinSize is a struct. Predicate<F> fits the same mold. No stylistic exception for “when the payload is callable.”

The wrapper itself is a single-member template:

template <typename F>
struct Predicate {
    F f;
};

F is expected to be some structural type invocable as bool f(const T&) for a T matching the annotated field. The canonical F is a captureless lambda’s closure type, but we’ll see in a moment that a named functor lands on the same path at zero extra cost.

Call contract is fixed for this stage: return type is exactly bool, failure message is the literal string "custom predicate failed". No message customization, no richer return type. Shape experiments and diagnostics design are different problems, and bundling them produces a post where every failure has two possible causes. The contract gets richer in a later stage once the wrapper is stable.
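The call contract can be checked in isolation in plain C++17, with no reflection involved. This is a minimal sketch under assumptions: the trait name meets_contract and the ReturnsInt functor are hypothetical, and the deduction guide is an addition so that Predicate{callable} also deduces F on compilers without C++20 aggregate CTAD.

```cpp
#include <string>
#include <type_traits>

template <typename F>
struct Predicate {
    F f;
};

// Assumption: explicit guide, redundant where C++20 aggregate CTAD exists.
template <typename F>
Predicate(F) -> Predicate<F>;

// Hypothetical trait: true only if a const F can be called with const T&
// and the type of the call is exactly bool (not int, not bool&).
template <typename F, typename T, typename = void>
struct meets_contract : std::false_type {};

template <typename F, typename T>
struct meets_contract<F, T,
                      std::void_t<std::invoke_result_t<const F&, const T&>>>
    : std::is_same<std::invoke_result_t<const F&, const T&>, bool> {};

struct IsPositive {
    constexpr bool operator()(int v) const { return v > 0; }
};

// Hypothetical counterexample: callable, but the result type is int.
struct ReturnsInt {
    constexpr int operator()(int v) const { return v; }
};

static_assert(meets_contract<IsPositive, int>::value);
static_assert(!meets_contract<IsPositive, std::string>::value);  // not callable
static_assert(!meets_contract<ReturnsInt, int>::value);          // not exactly bool

// The wrapper stores the callable and invokes it like any other member.
constexpr Predicate positive{IsPositive{}};
static_assert(positive.f(1) && !positive.f(-1));
```

Note that "callable with the field's type" includes implicit conversions: IsPositive also meets the contract for a long or char field, which may or may not be what a stricter contract would want.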

Dispatch

The existing switch in validate_impl picks annotations by exact type comparison:

if constexpr (std::meta::type_of(ann) == ^^Range) { ... }
else if constexpr (std::meta::type_of(ann) == ^^MinLength) { ... }
else if constexpr (std::meta::type_of(ann) == ^^MinSize) { ... }

For Predicate the comparison is one level up — the template, not the instantiated type, because Predicate<ClosureTypeA> and Predicate<ClosureTypeB> are different instantiations but both want the same branch:

else if constexpr (std::meta::template_of(
                       std::meta::type_of(ann)) == ^^Predicate) {
    constexpr auto targs = std::define_static_array(
        std::meta::template_arguments_of(
            std::meta::type_of(ann)));
    using F = [:targs[0]:];
    if constexpr (requires(F g) {
                      { g(obj.[:member:]) } -> std::same_as<bool>;
                  }) {
        constexpr auto p = std::meta::extract<Predicate<F>>(ann);
        if (!p.f(obj.[:member:])) {
            ctx.errors.push_back({
                ctx.current_path(),
                "custom predicate failed",
                "Predicate"
            });
        }
    }
}

The recovery recipe mirrors the Regex<N> path from the previous post — template_arguments_of to get the single parameter, then extract to pull the annotation value. One small difference: Regex<N>’s parameter was a size_t, recovered as a value via extract<std::size_t>(args[0]). Predicate<F>’s parameter is a type, so it gets recovered as a using-alias via [:targs[0]:].

That splice is worth flagging because the first post in this regex arc hit a splice wall: extract<[:type_of(ann):]>(ann) failed because a reflection held in a template for loop variable couldn’t be spliced into a template-argument position. The fix there was indirection — pass ann as an NTTP into a helper function template where, inside the function body, the reflection is no longer a loop variable.

The using F = [:targs[0]:]; above goes through the loop variable too (targs is derived from type_of(ann)), but into a using-alias position, not a template-argument position. And it compiles. That’s a narrower read of the splice wall than I had before: it isn’t “splice from a loop variable is bad,” it’s “splice from a loop variable as a template argument is bad.” Using-alias declarations get to splice a type out of a loop-variable-derived reflection just fine. Good to know.

The inner requires(F g) { { g(obj.[:member:]) } -> std::same_as<bool>; } check silently skips the predicate if F isn’t callable with the field’s type, or if the call doesn’t return exactly bool. A Predicate whose callable takes an int does nothing on a std::string field, just as Range{0, 150} does nothing on a std::string field. That is symmetric with the rest of the dispatch.
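The shape of this branch can be mimicked in plain C++17, without reflection, to see its three decisions in isolation: match the wrapper template, recover F, and skip silently when the call contract doesn’t hold. This is a sketch under assumptions, not the validator’s code: predicate_traits, returns_bool, Errors, check, validate_age and validate_name are hypothetical names, a partial specialization stands in for template_of(...) == ^^Predicate, and the Range message is shortened.

```cpp
#include <array>
#include <cstddef>
#include <string_view>
#include <type_traits>

struct Range { int lo; int hi; };

template <typename F>
struct Predicate { F f; };
template <typename F>
Predicate(F) -> Predicate<F>;  // assumption: guide for pre-C++20 CTAD

// Stand-in for template_of(type_of(ann)) == ^^Predicate: match the
// template rather than one instantiation, and expose its type argument.
template <typename A>
struct predicate_traits : std::false_type {};
template <typename F>
struct predicate_traits<Predicate<F>> : std::true_type { using callable = F; };

// True only if a const F called with const T& yields exactly bool.
template <typename F, typename T, typename = void>
struct returns_bool : std::false_type {};
template <typename F, typename T>
struct returns_bool<F, T, std::void_t<std::invoke_result_t<const F&, const T&>>>
    : std::is_same<std::invoke_result_t<const F&, const T&>, bool> {};

struct Error { std::string_view path, message, annotation; };

// Fixed-capacity error list so the whole walk can run at compile time.
struct Errors {
    std::array<Error, 8> items{};
    std::size_t count = 0;
    constexpr void push(Error e) {
        if (count < items.size()) items[count++] = e;
    }
};

template <typename A, typename T>
constexpr void check(const A& ann, const T& value,
                     std::string_view path, Errors& errors) {
    if constexpr (std::is_same_v<A, Range>) {          // closed-set branch
        if constexpr (std::is_arithmetic_v<T>) {
            if (value < ann.lo || value > ann.hi)
                errors.push({path, "out of range", "Range"});
        }
    } else if constexpr (predicate_traits<A>::value) {  // open branch
        using F = typename predicate_traits<A>::callable;
        if constexpr (returns_bool<F, T>::value) {
            if (!ann.f(value))
                errors.push({path, "custom predicate failed", "Predicate"});
        }
        // Otherwise: not callable with T, or not returning bool. Skipped.
    }
}

struct NotThirteen {
    constexpr bool operator()(int v) const { return v != 13; }
};

// Both annotations are evaluated independently against the same value.
constexpr Errors validate_age(int age) {
    Errors errors{};
    check(Range{0, 150}, age, "age", errors);
    check(Predicate{NotThirteen{}}, age, "age", errors);
    return errors;
}

// An int predicate on a string-like field is skipped, not rejected.
constexpr Errors validate_name(std::string_view name) {
    Errors errors{};
    check(Predicate{NotThirteen{}}, name, "name", errors);
    return errors;
}

static_assert(validate_age(200).count == 1);
static_assert(validate_age(200).items[0].annotation == "Range");
static_assert(validate_age(13).items[0].annotation == "Predicate");
static_assert(validate_name("alice").count == 0);
```

The if constexpr chain keeps the type alias inside the discarded branch for non-Predicate annotations, so predicate_traits<Range>::callable is never named.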

Q1 — Captureless lambdas

The baseline test has four annotation sites on one struct:

struct IsPositive {
    constexpr bool operator()(int v) const { return v > 0; }
};
 
struct User {
    [[=Predicate{[](const std::string& s) {
        return !s.empty() && s[0] >= 'A' && s[0] <= 'Z';
    }}]]
    std::string name;

    [[=Predicate{[](int v) { return v % 2 == 0; }}]]
    int even_number;

    [[=Predicate{IsPositive{}}]]
    int positive;

    [[=Range{0, 150}, =Predicate{[](int v) { return v != 13; }}]]
    int age;
};

Three captureless lambdas and one named functor. The age field carries both Range and Predicate to make sure the closed-set branch and the open branch don’t step on each other.

It compiles. Running against a deliberately failing record:

---- case 1: predicates failing ----
  name: custom predicate failed (Predicate)
  even_number: custom predicate failed (Predicate)
  positive: custom predicate failed (Predicate)
  age: custom predicate failed (Predicate)

All four sites fired. Q1 passes. The C++26 structural-closure rule is implemented in clang-p2996, captureless lambdas ride through [[=Predicate{...}]] without incident, and the dispatch branch extracts and invokes them.

The Range + Predicate coexistence case is the other thing worth showing:

---- case 3: Range and Predicate coexist ----
  age: must be in [0, 150], got 200 (Range)

Here age = 200 fails Range{0, 150} and passes Predicate{[](int v) { return v != 13; }}. Both branches ran, each against the same field value, and only the failing one produced an error. The core-always-collects invariant from earlier posts extends straight through — an open predicate branch doesn’t change the rule that every annotation gets evaluated independently.

Named functors fall out for free

One observation on the positive field above. IsPositive is an empty aggregate with a constexpr operator():

struct IsPositive {
    constexpr bool operator()(int v) const { return v > 0; }
};

The Predicate<IsPositive> instantiation works for exactly the same reason the captureless-lambda version does: IsPositive is a structural literal class (no base classes and no data members, so there is nothing private or mutable to disqualify it), so the wrapper is structural, so the annotation value is accepted. The dispatch treats it identically to a closure type: template_of == ^^Predicate, targs[0] names the functor type, extract<Predicate<IsPositive>> produces the value, p.f(v) invokes it.

I’d originally sketched named functors as a fallback path (“if lambdas don’t work, try this”) before Q1 results came in. They end up being not a fallback but a parallel option: any structural callable works, and IsPositive{} is just the case where the author wanted a named, reusable validator instead of an inline closure. No extra branch, no extra machinery, no overloads. The same Predicate<F> path carries both.

Q2 — Capturing lambdas

Commit: 02cc12c

The hypothesis for Q2 is that capturing closure types are not structural, so Predicate<CapturingClosure> would fail the annotation-value structural check, so the annotation declaration would be rejected.

The first reproducer I tried was:

constexpr int threshold = 10;
 
struct BadField {
    [[=Predicate{[threshold](int x) { return x > threshold; }}]]
    int value;
};

This got rejected, but not for the reason I wanted:

error: 'threshold' cannot be captured because it does not have
       automatic storage duration

Capturing by name requires the source variable to have automatic storage, and constexpr int threshold at namespace scope has static storage. The rejection is earlier in the process than the structural-type check, and doesn’t tell us anything about the NTTP story. The isolation I need is “capturing lambda, but not a capturing-lambda that’s already broken for other reasons.”

An init-capture bypasses the automatic-storage requirement — it creates a fresh closure member by copy from the right-hand expression, regardless of what that expression was:

constexpr int threshold = 10;
 
struct BadField {
    [[=Predicate{[t = threshold](int x) { return x > t; }}]]
    int value;
};

The closure now has a captured member, the capture source isn’t a problem, and the remaining rejection (if any) has to be about the closure type itself. clang-p2996’s response:

error: C++26 annotation attribute requires a value of structural type

This is the result the hypothesis wanted. The rejection is at the annotation-value layer, and it cites structural type as the reason. What makes the message clean is that it isn’t Predicate-specific — the attribute rule applies to every annotation value, and the capturing closure’s non-structural-ness propagates through Predicate<F> to reach the check. If I rewrote Predicate tomorrow to be shaped differently, this would still fail the same way for the same reason.

Practically, this means the validator’s open predicate branch accepts captureless lambdas, function pointers, and structural functors, and rejects everything else at compile time with a message that points at the actual rule. Runtime fallbacks (type-erased std::function-style) are a different design, and they stop being annotation-native — they require somewhere to store the erased callable, which [[=...]] isn’t. The structural-type wall is a feature here.
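The difference between the two closure shapes is visible in ordinary C++17, independent of annotations. A small sketch, with all names (above_ten, above_threshold, FnSite, is_even) chosen for illustration: it shows that the init-capture adds storage to the closure object and removes the function-pointer conversion, and that a plain function pointer is usable as a template-argument value.

```cpp
#include <type_traits>

constexpr int threshold = 10;

// Captureless: no captured state, convertible to a plain function pointer.
constexpr auto above_ten = [](int x) { return x > 10; };

// Init-capture: copies threshold into a fresh closure member named t.
constexpr auto above_threshold = [t = threshold](int x) { return x > t; };

using Captureless = std::remove_const_t<decltype(above_ten)>;
using Capturing = std::remove_const_t<decltype(above_threshold)>;

// Both are callable in constant expressions.
static_assert(above_ten(11) && above_threshold(11));
static_assert(!above_ten(10) && !above_threshold(10));

// Only the captureless closure converts to a function pointer.
static_assert(std::is_convertible_v<Captureless, bool (*)(int)>);
static_assert(!std::is_convertible_v<Capturing, bool (*)(int)>);

// The capture shows up as storage inside the closure object.
static_assert(sizeof(Capturing) >= sizeof(int));

// A function pointer is itself a valid template-argument value.
template <bool (*Fn)(int)>
struct FnSite {
    static constexpr bool run(int v) { return Fn(v); }
};

constexpr bool is_even(int v) { return v % 2 == 0; }
static_assert(FnSite<is_even>::run(4));
```

Nothing here says anything about the annotation check itself; it only isolates the property the diagnostic is reacting to, namely that the capturing closure carries a data member.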

Q3 — Site identity

The last experiment is the one most directly informed by the previous post. Regex<N> annotations with textually identical patterns folded to a single function template instantiation, because Regex<N> is a value NTTP and two identical char[N] values compare equal under template-parameter equivalence.

Lambdas are different. The rule at the language level is that every lambda-expression produces its own distinct closure type, regardless of the source text. Two lambdas that look identical are not the same type, and it follows that two Predicate<F> annotations built from textually identical lambdas should not fold. But it’s worth probing directly, because the regex fold set up an expectation, and C++ is the kind of language that makes you verify.

The probe:

struct TwoSites {
    [[=Predicate{[](int x) { return x > 0; }}]] int a;
    [[=Predicate{[](int x) { return x > 0; }}]] int b;
};
 
template <typename T>
void probe_predicate_types(const T&) {
    template for (constexpr auto member : ...) {
        template for (constexpr auto ann : ...) {
            if constexpr (std::meta::template_of(
                              std::meta::type_of(ann)) == ^^Predicate) {
                constexpr auto targs = std::define_static_array(
                    std::meta::template_arguments_of(
                        std::meta::type_of(ann)));
                using F = [:targs[0]:];
                std::cout << "  field "
                          << std::meta::identifier_of(member)
                          << "   sizeof(F)=" << sizeof(F)
                          << "   typeid(F)=" << typeid(F).name()
                          << '\n';
            }
        }
    }
}

Output:

---- Q3: site-identity probe ----
(two fields, textually identical captureless lambdas)
  field a   sizeof(F)=1   typeid(F)=N8TwoSites3$_3E
  field b   sizeof(F)=1   typeid(F)=N8TwoSites3$_4E

Two different closure types. $_3 and $_4 are the mangled names clang assigns to the unnamed closure types, and they differ per lambda-expression. Size is 1 (empty closure), typeid differs, and the Predicate<F> instantiations are distinct. No fold.

The contrast with Stage 12 is worth naming directly:

| | Stage 12 (`Regex<N>`) | Stage 16 (`Predicate<F>`) |
|---|---|---|
| Parameter kind | non-type, `char[N]` value | type, closure-or-functor |
| Identity rule | template-parameter equivalence (value-based) | type identity (per lambda-expression) |
| Two identical sites | Fold to one instantiation | Two distinct instantiations |
| Cache granularity | per unique value | per unique expression |

Neither is a bug. Both are what C++ says to do. But it does mean “I have a cache keyed on annotation identity” behaves differently for the two cases: Regex<N> caches collapse identical patterns naturally, Predicate<F> caches do not. If you really want two sites with identical predicate text to share compiled code (say, to keep binary size down), you need a named functor like IsPositive from Q1, not a lambda. IsPositive declared once and used at three sites is a single type, hence a single Predicate<IsPositive>, hence one compiled instance.
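The identity rules in the table can be checked directly with type traits in plain C++17. In this sketch the names site_a, site_b and MaxLen are hypothetical; MaxLen is only a small value-parameterized stand-in, not the Regex<N> from the earlier post.

```cpp
#include <type_traits>

template <typename F>
struct Predicate {
    F f;
};

// Two lambda-expressions with identical text...
constexpr auto site_a = [](int x) { return x > 0; };
constexpr auto site_b = [](int x) { return x > 0; };

using ClosureA = std::remove_const_t<decltype(site_a)>;
using ClosureB = std::remove_const_t<decltype(site_b)>;

// ...are two distinct closure types, hence two distinct instantiations.
static_assert(!std::is_same_v<ClosureA, ClosureB>);
static_assert(!std::is_same_v<Predicate<ClosureA>, Predicate<ClosureB>>);

// A named functor is one type no matter how many sites use it.
struct IsPositive {
    constexpr bool operator()(int v) const { return v > 0; }
};

using SiteC = Predicate<IsPositive>;
using SiteD = Predicate<IsPositive>;
static_assert(std::is_same_v<SiteC, SiteD>);

// Value parameters follow value equivalence: equal values, however they
// are spelled, name the same specialization.
template <int N>
struct MaxLen {};

static_assert(std::is_same_v<MaxLen<40 + 2>, MaxLen<42>>);
```

Any cache keyed on the instantiated type inherits exactly these rules, which is why the two lambda sites end up with separate entries and the functor sites share one.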

Once I started thinking of the two options as ends of a spectrum, the whole thing clicked into place. Captureless lambdas get maximum site-level readability (the rule lives at the annotation) at the cost of per-site codegen. Named functors get codegen deduplication at the cost of one top-level type declaration per reusable rule. The validator lets users pick, because the structural-callable path in Predicate<F> is indifferent to which one is in there.

Where this leaves the validator

The dispatch switch grows exactly one branch:

else if constexpr (std::meta::template_of(
                       std::meta::type_of(ann)) == ^^Predicate) {
    // recover F, extract value, invoke.
}

One branch at the end of the closed-set chain opens the annotation set. A user who wants “starts with uppercase” writes a lambda at the field, the walker picks it up through the generic Predicate branch, and neither the library’s header nor the user’s struct declaration has to change anywhere else. The closed set keeps its own dispatch — Range, MinLength, and the rest still have their individually typed branches — and the Predicate branch is what happens when that switch runs out of specific matches.

Two things this stage doesn’t deliver:

Message customization. The failure string is hardcoded "custom predicate failed". A user who wanted a field-specific message — "name must start with an uppercase letter" — has no way to pass it in. That’s a deliberate scope cut, and the fix is straightforward: give Predicate a second field for a message string (or a richer result type). It wasn’t included here because diagnostics design is its own problem, and it would have mixed with the structural-type / identity experiments in a way that made every issue two-variable.
Predicates on scalar fields inside containers. [[=Predicate{...}]] std::vector<int> doesn’t validate element-wise: the current dispatch treats the annotation as applying to the container, and requires { predicate(container) } is probably false, so it silently skips. Fixing this means splitting validate_impl into “walk the members of an aggregate” and “dispatch a value through the type-driven branches,” so that the vector branch can send each element through the dispatch pipeline too. This is the same refactor the earlier post deferred for optional<Scalar> + Range and vector<Scalar> + Range. One refactor, three payoffs.

Both are named and out of scope.

Update: the refactor is in One Refactor, Three Payoffs. It carries scalar annotations through optional and vector wrappers, and a signature-selected predicate scope on the same field falls out as a third property. Message customization is still deferred.
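For the element-wise gap, one possible shape of a signature-selected dispatch can be sketched in plain C++17. This is an illustration under assumptions, not the refactor from the follow-up post: count_failures, is_iterable, returns_bool, IsEven and AtMostThree are hypothetical names, and std::array is used instead of std::vector only so the checks can run at compile time.

```cpp
#include <array>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <utility>

// True only if a const F called with const T& yields exactly bool.
template <typename F, typename T, typename = void>
struct returns_bool : std::false_type {};
template <typename F, typename T>
struct returns_bool<F, T, std::void_t<std::invoke_result_t<const F&, const T&>>>
    : std::is_same<std::invoke_result_t<const F&, const T&>, bool> {};

// True if std::begin/std::end work on a const T.
template <typename T, typename = void>
struct is_iterable : std::false_type {};
template <typename T>
struct is_iterable<T, std::void_t<decltype(std::begin(std::declval<const T&>())),
                                  decltype(std::end(std::declval<const T&>()))>>
    : std::true_type {};

// The predicate's signature selects the scope: callable on the whole
// value first, otherwise applied to each element, otherwise skipped.
template <typename F, typename T>
constexpr std::size_t count_failures(const F& pred, const T& value) {
    if constexpr (returns_bool<F, T>::value) {
        return pred(value) ? 0 : 1;
    } else if constexpr (is_iterable<T>::value) {
        std::size_t failures = 0;
        for (const auto& element : value)
            failures += count_failures(pred, element);
        return failures;
    } else {
        return 0;  // silent skip, as in the existing dispatch
    }
}

struct IsEven {
    constexpr bool operator()(int v) const { return v % 2 == 0; }
};

struct AtMostThree {
    constexpr bool operator()(const std::array<int, 4>& a) const {
        return a.size() <= 3;
    }
};

constexpr std::array<int, 4> values{2, 3, 4, 5};

static_assert(count_failures(IsEven{}, values) == 2);       // element scope
static_assert(count_failures(AtMostThree{}, values) == 1);  // container scope
static_assert(count_failures(AtMostThree{}, 7) == 0);       // skipped
```

One caveat of this particular sketch: because std::string is itself iterable, an int predicate on a string field would be applied per character through implicit conversion, which a real implementation would likely want to rule out.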