RICCILAB

Caching Regex with C++26 Reflection

_DEV_C++26

A Declarative Validator in C++26 closed with a promise: std::regex isn’t a literal class, so it can’t sit inside an annotation directly. The workaround is to store the pattern as a std::string_view on the annotation and construct the std::regex at call time. Straightforward in principle — the interesting part is what “at call time” should mean, because constructing a regex is, by reputation, around two orders of magnitude slower than matching one.

Four stages, one moving target (the cache):

  • 09 — Attach a Regex{pattern} annotation and read it back via reflection. No validation yet.
  • 10 — Construct std::regex on every call. Measure how bad that is.
  • 11 — Cache through a function-local static map. The safe, boring answer.
  • 12 — Cache through template <std::meta::info Pattern> instantiation. The interesting answer, if it works.

The validator from that post (av::collect / av::check / av::validate over Range / MinLength / MaxLength / NotEmpty) is the starting point, unchanged. Everything new lives in the Regex branch.

Current head: v1.0 (tag [v1.0](https://github.com/Ricci-curvature/reflecting-cpp26/releases/tag/v1.0)) — header-only include/validator.hpp.


Stage 9 — Reading a Regex annotation

Commit: 643d6ee

The plan from that closing paragraph looked like one line of work. Store the pattern as a std::string_view on the annotation; let the validator construct a std::regex later. One new annotation type, one read path through reflection, done.

struct Regex {
    std::string_view pattern;
    constexpr Regex(std::string_view p) : pattern(p) {}
};
 
struct Sample {
    [[=Regex{"^[a-z]+$"}]] std::string name;
};

Compile:

error: C++26 annotation attribute requires an expression usable as
       a template argument
note: pointer to subobject of string literal is not allowed in a
      template argument

Why string_view is rejected

P3394 annotations have to be valid as non-type template arguments. The NTTP rules forbid a pointer that holds the address of a string literal or of any subobject of one, and that’s exactly what std::string_view{"^[a-z]+$"} produces. The view stores a const char* pointing at the first char of the string literal’s char[N] array, which is a subobject of the literal. Pointer into a string literal → not NTTP-legal → annotation rejected before reflection ever sees it.

This isn’t a clang-p2996 quirk. It’s the interaction between two standard rules, and it kills the obvious approach.

Inlining the pattern into the type

Standard fix for NTTP strings: put the characters in the type via a fixed-size array. CTAD then deduces the size from the literal.

template <std::size_t N>
struct Regex {
    char pattern[N];
    constexpr Regex(const char (&src)[N]) {
        for (std::size_t i = 0; i < N; ++i) pattern[i] = src[i];
    }
};
 
struct Sample {
    [[=Regex{"^[a-z]+$"}]]    std::string name;
    [[=Regex{"^\\d+$"}]]      std::string digits;
    [[=Regex{"^\\S+@\\S+$"}]] std::string email;
    std::string               no_annotation;
};

Each pattern’s characters now live inside the annotation value, and its length (counting the terminating NUL) becomes part of the annotation’s type: Regex<9>, Regex<6>, Regex<10>. No external storage, no subobject pointer, no NTTP issue. The downside, one instantiation per distinct pattern length, is exactly what we’ll want in Stage 12 anyway.

Dispatch across a template family

The earlier walker did if constexpr (std::meta::type_of(ann) == ^^Range). That doesn’t work anymore because Regex isn’t a type; it’s a template that produces types. ^^Regex reflects the primary template, and std::meta::type_of(ann) reflects a concrete instantiation like Regex<9>. They never compare equal.

Fix: std::meta::template_of(type_of(ann)) == ^^Regex. template_of takes an instantiation reflection and hands back a reflection of its template, which does match.
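The same type-versus-template distinction exists without reflection. As a rough analogy only, here is a plain C++17 sketch with simplified stand-ins for the annotation types and a hypothetical is_regex trait: comparing against one instantiation matches only that size, while a partial specialization over N matches the whole family, which is the job template_of does for reflections.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Simplified stand-ins for the annotation types (assumed shapes).
template <std::size_t N>
struct Regex { char pattern[N]{}; };
struct Range { long min; long max; };

// Hypothetical trait: true for every Regex<N>, false for anything else.
template <class T>
struct is_regex : std::false_type {};
template <std::size_t N>
struct is_regex<Regex<N>> : std::true_type {};

// Two instantiations are different types, so a single equality test
// against one of them cannot cover the family.
static_assert(!std::is_same_v<Regex<9>, Regex<6>>);

// Matching on the template covers every instantiation.
static_assert(is_regex<Regex<9>>::value);
static_assert(is_regex<Regex<6>>::value);
static_assert(!is_regex<Range>::value);
```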

The splice-from-template-for problem, again

First attempt at reading the pattern:

if constexpr (std::meta::template_of(
                  std::meta::type_of(ann)) == ^^Regex) {
    constexpr auto t = std::meta::type_of(ann);
    constexpr auto r = std::meta::extract<[:t:]>(ann);
    std::cout << r.pattern << '\n';
}

Compile:

error: unparenthesized splice expression cannot be used as a template argument

Parenthesize:

constexpr auto r = std::meta::extract<([:t:])>(ann);

Compile:

error: reflection not usable in a splice expression

This is the Stage 2 [:ann:] failure from the earlier post reappearing, now for a type splice. The rule underneath: a reflection held in a template for loop variable — even when the loop variable is constexpr — isn’t considered a constant expression for the purposes of splicing. Values failed there with [:ann:]; types fail here with [:t:]. Same root cause, different surface.

The workaround: extract N first, spell Regex<N> directly

std::meta::template_arguments_of(type_of(ann)) returns the template argument list as a vector of reflections. For Regex<9> that’s a single entry reflecting the std::size_t value 9. Pull it out as a plain std::size_t, then write Regex<N> as a concrete type — no splice required.

if constexpr (std::meta::template_of(
                  std::meta::type_of(ann)) == ^^Regex) {
    constexpr auto args = std::define_static_array(
        std::meta::template_arguments_of(
            std::meta::type_of(ann)));
    constexpr std::size_t N =
        std::meta::extract<std::size_t>(args[0]);
    constexpr auto r = std::meta::extract<Regex<N>>(ann);
    std::cout << "  annotation: Regex { pattern=\""
              << r.pattern << "\" }\n";
}

N is a constexpr std::size_t — an ordinary value, not a reflection. Regex<N> is a concrete type name. extract<Regex<N>> has no splice anywhere. Compiles.

Output

field: name
  annotation: Regex { pattern="^[a-z]+$" }
field: digits
  annotation: Regex { pattern="^\d+$" }
field: email
  annotation: Regex { pattern="^\S+@\S+$" }
field: no_annotation
  (no annotations)

Three Regex annotations read back with their patterns intact. The unannotated field is silently skipped — same behaviour as before.

What Stage 9 actually bought

Two things, both useful for the rest of this post:

  1. Patterns are type-level. Regex<9> for "^[a-z]+$" and Regex<6> for "^\\d+$" are distinct types. This is a constraint right now (we need template_arguments_of gymnastics to read the pattern), but it’s the same property Stage 12 will exploit: if the pattern is part of the type, a template <std::meta::info Ann> cache can key on it for free.
  2. The splice-from-template-for rule applies to types, not just values. The earlier post noted it for [:ann:]. Stage 9 confirms it for [:type_of_ann:]. The universal answer is the same: use extract<ConcreteType> with the concrete type spelled out, or pull the ingredients out as plain values first.

The actual regex matching hasn’t happened yet. Stage 10 builds a std::regex at validation time and measures how much that costs.

Stage 10 — The naive path

Commit: 1c35c6e

Stage 9 read the pattern back but did nothing with it. Stage 10 wires it in: for every Regex<N> annotation, build a std::regex from the inlined char[N] and run std::regex_match against the field. No caching anywhere — every validation call rebuilds every regex from scratch. The goal is a baseline number, not a fast number.

template for (constexpr auto ann :
              std::define_static_array(
                  std::meta::annotations_of(member)))
{
    if constexpr (std::meta::template_of(
                      std::meta::type_of(ann)) == ^^Regex) {
        constexpr auto args = std::define_static_array(
            std::meta::template_arguments_of(
                std::meta::type_of(ann)));
        constexpr std::size_t N =
            std::meta::extract<std::size_t>(args[0]);
        constexpr auto r = std::meta::extract<Regex<N>>(ann);
 
        std::regex re{r.pattern};                       // rebuild every call
        const auto& v = obj.[:member:];
        if (!std::regex_match(v, re)) {
            errors.push_back({
                std::string{std::meta::identifier_of(member)},
                std::format("does not match /{}/", r.pattern),
                "Regex"
            });
        }
    }
}

The splice obj.[:member:] works here because member is a constexpr auto loop variable over non-static data members — a value splice for a non-type-parameter context, which is fine. Only the splice-into-template-argument path (Stage 9’s [:t:] under extract<...>) hits the “reflection not usable in a splice expression” wall.

Correctness first. A passing sample reports no errors; a failing one gives back one error per mismatched field:

---- good sample ----
  (no errors)
---- bad sample ----
  name: does not match /^[a-z]+$/ (Regex)
  digits: does not match /^\d+$/ (Regex)
  email: does not match /^\S+@\S+$/ (Regex)

The measurement

50,000 iterations of validate_all(good, errs) — three regex operations per call, each a fresh construct + match. clang-p2996 + libc++ at -O2, 5-run median (every per-op number in this post is reported the same way):

  • per validate(): 1752 ns (3 regex constructions + 3 matches)
  • per regex op (construct + match): 584 ns
  • per match (pre-built regex, reference ceiling): 310 ns
  • construction alone: ~0.88× match cost
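The post does not show its benchmark harness, so the following is only a sketch of how the two per-op figures could be measured in standard C++. The names ns_per_op, Timings, and measure are hypothetical, and the loop is simpler than a 5-run median.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <regex>
#include <string>

// Average nanoseconds per call of `op` over `iters` iterations.
template <class Op>
double ns_per_op(std::size_t iters, Op op) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::size_t i = 0; i < iters; ++i) op();
    const std::chrono::duration<double, std::nano> elapsed = clock::now() - start;
    return elapsed.count() / static_cast<double>(iters);
}

struct Timings {
    double construct_and_match_ns;  // naive path: rebuild, then match
    double match_only_ns;           // reference: pre-built regex
    std::size_t matches;            // total successful matches across both loops
};

inline Timings measure(const char* pattern, const std::string& input,
                       std::size_t iters) {
    std::size_t matches = 0;  // accumulated so the work has an observable result
    const double naive_ns = ns_per_op(iters, [&] {
        std::regex re(pattern);                  // rebuilt on every iteration
        matches += std::regex_match(input, re);
    });
    const std::regex prebuilt(pattern);          // built once, outside the loop
    const double match_ns = ns_per_op(iters, [&] {
        matches += std::regex_match(input, prebuilt);
    });
    return {naive_ns, match_ns, matches};
}
```

Subtracting match_only_ns from construct_and_match_ns gives an estimate of construction cost alone, which is how the last bullet relates to the two before it.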

What this actually tells us

I went in expecting the folk number — “std::regex construction is 100× slower than matching.” For these patterns, on libc++, that’s not what shows up. ^[a-z]+$, ^\d+$, ^\S+@\S+$ are all short and simple. Subtracting the match-only figure from the construct + match figure leaves 584 − 310 = 274 ns for construction alone, about 0.88× the 310 ns match. Construction costs about as much as a match, not orders of magnitude more.

Which reframes the cache question. The naive path spends ~584 ns per regex op; the theoretical floor (pre-built regex, match only) is ~310 ns. A cache can close roughly half of that — a ~1.9× speedup on validate() for this pattern shape. Not the dramatic headline I was expecting, but still clearly worth having, and the ratio grows with pattern complexity: longer alternations, backrefs, and Unicode classes push construction cost up while match cost stays roughly constant.

The more interesting thing Stage 10 surfaces isn’t the headline number — it’s that every call throws away a std::regex the previous call just built. Three std::regex objects, constructed, matched, destructed, per validation. For a validator that’s meant to live on a request path, that’s the waste we want gone. Stages 11 and 12 are about where the cache lives and what keys it — the same regex object, one construction, reused forever.

Stage 11 does the safe, boring thing: function-local static map keyed by pattern string. Stage 12 does the interesting thing: template <std::meta::info Pattern> so the cache key is the annotation itself, at the type level, with no map and no lock.


Stage 11 — Function-local static cache

Commit: 2df52e4

Stage 10 throws away a std::regex on every call. Stage 11 keeps them. The shape is the obvious one — a function-local static std::unordered_map<std::string, std::regex>, keyed by the pattern string, constructed lazily the first time a given pattern is seen:

inline const std::regex& get_cached_regex_locked(std::string_view pattern) {
    static std::unordered_map<std::string, std::regex> cache;
    static std::mutex mtx;
    std::lock_guard lock(mtx);
    auto it = cache.find(std::string{pattern});
    if (it == cache.end()) {
        it = cache.emplace(
            std::string{pattern},
            std::regex{std::string{pattern}}
        ).first;
    }
    return it->second;
}

Both the map and the mutex are function-local statics — initialization happens once, by C++11 magic-static rules, on the first call. The mutex is there because the map is shared mutable state: a first-seen pattern writes while a later thread may already be reading. The call site changes from std::regex{r.pattern} to one function call:

const std::regex& re = fetch(r.pattern);
const auto& v = obj.[:member:];
if (!std::regex_match(v, re)) { /* error */ }

where fetch is a function pointer into the cache. Same shape as Stage 10, with one std::regex construction per pattern instead of one per call.

The measurement

Same 50k-iteration loop as Stage 10, after a 16-call warm-up to make sure every measured call hits the cache. clang-p2996 + libc++ at -O2, 5-run median:

  • locked (mutex + map lookup): 1185 ns/validate, 395 ns/regex op
  • unlocked (map lookup only, reference ceiling): 1137 ns/validate, 379 ns/regex op
  • lock cost: 48 ns/validate, 16 ns/regex op

Against Stage 10:

| path | per validate | per regex op |
| --- | --- | --- |
| Stage 10 naive (rebuild each call) | 1752 ns | 584 ns |
| Stage 11 locked | 1185 ns | 395 ns |
| Stage 11 unlocked (reference) | 1137 ns | 379 ns |
| pre-built regex, match only (floor) | 930 ns | 310 ns |

What the numbers actually say

Three things worth unpacking.

The cache got us about two-thirds of the way to the floor. Naive was 822 ns above the pre-built floor (1752 − 930); locked is 255 ns above (1185 − 930). So the cache closes ~69% of that gap. Not 100% — which is the interesting part.

Mutex overhead is real and measurable. The unlocked variant runs 48 ns/call faster (~16 ns per regex op) — the cost of one lock_guard construction and destruction on an uncontended std::mutex. That’s small in absolute terms, but once you’re measuring against a 255 ns/call above-floor budget, 48 ns is ~19% of what’s left to shave. On Linux libc++, std::mutex is a pthread mutex; a more aggressive design could use std::shared_mutex with std::shared_lock for the read path, or a read-copy-update scheme, or just accept a thread-local cache. For a validator that’s meant to live on a request path, the mutex is the first thing that looks disposable.
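For concreteness, the std::shared_mutex variant mentioned above might look like the sketch below. It is not part of the measured code and the post reports no numbers for it, so whether it beats the plain mutex when uncontended is an open question. The name get_cached_regex_shared is hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <regex>
#include <shared_mutex>
#include <string>
#include <string_view>
#include <unordered_map>

inline const std::regex& get_cached_regex_shared(std::string_view pattern) {
    static std::unordered_map<std::string, std::regex> cache;
    static std::shared_mutex mtx;
    const std::string key{pattern};
    {
        // Read path: any number of threads may hold the shared lock at once.
        std::shared_lock read_lock(mtx);
        if (auto it = cache.find(key); it != cache.end()) return it->second;
    }
    // Write path: exclusive lock, taken only when a pattern is first seen.
    std::unique_lock write_lock(mtx);
    // try_emplace looks the key up again under the exclusive lock, so if
    // another thread inserted it in the meantime we return that entry.
    return cache.try_emplace(key, key).first->second;
}
```

Returning a reference after the lock is released relies on two properties: unordered_map never moves its elements on rehash, and this cache never erases entries.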

Even unlocked, we’re still ~207 ns above the floor per validate. That’s ~69 ns per regex op, entirely spent in std::string{pattern} (allocating a temporary key) plus the unordered_map hash + comparison. The std::regex itself is sitting ready to run; we just have to find it through a runtime key, every call, forever. In a system where the set of patterns is known at compile time — and with reflection, it is — paying a hash lookup at runtime to find a regex whose identity the compiler already knew is the definition of leaving performance on the floor.

That’s the pitch for Stage 12. The pattern length is part of the annotation type (Regex<9>, Regex<6>, Regex<10>), the characters are part of the annotation value, and the annotation reflection is a std::meta::info constant. If we can hand that reflection to a function template as a non-type template parameter, each distinct pattern gets its own function instantiation with its own function-local static std::regex. No map. No lock. No runtime key at all — the compiler resolves the cache slot at compile time. The question is whether P2996 and clang-p2996 actually let us do that.


Stage 12 — Template-parameter caching

Commit: df3f034

Stage 11’s static std::unordered_map + std::mutex burned ~85 ns per regex op on runtime bookkeeping — the lock was ~16 ns of that, the map lookup the rest. Stage 12 tries to get rid of both, at once, by moving the cache key from a runtime string into a template parameter. One function template instantiation per distinct key; the std::regex lives as a function-local static inside that instantiation. No map, no lock, no runtime key at all.

Three questions, not two

The first draft of this post had a “two questions” framing — “can std::meta::info be an NTTP, and can we get the pattern out of the body without splicing?” That framing was off. The NTTP question is largely settled by P2996: std::meta::info is a scalar reflection type, and non-type template arguments of that type are allowed; the paper even shows specialization on reflections as a worked example. So the first question isn’t “does the declaration compile,” it’s “is the reflection value a stable cache key?” — which is about template-parameter equivalence, instantiation identity, and (when the reflected entity is TU-local) linkage. Different question.

Reframed:

  1. Is the reflection value a stable cache key? Two annotations with the same pattern at different sites — do they fold to one template instantiation under template-parameter equivalence, or split into two? Same-address check inside the instantiation tells us what this implementation + setup does; it doesn’t speak to the full cross-TU story, which the experiment can’t reach.
  2. Can the body recover the pattern without the Stage 9 splice wall? Stage 9 showed that a reflection held in a template for loop variable can’t be spliced into a template argument — extract<[:t:]>(ann) fails. A reflection held in a template parameter is a different kind of constant source; if the Stage 9 recipe (template_arguments_of(type_of(...))[0] → N, then extract<Regex<N>>(...)) works there without any [:…:] syntax, the restriction was about the loop-variable path specifically.
  3. What are we trading away? Stage 11 paid at runtime: one mutex + one hash lookup per op. Stage 12 pays at codegen: one function instantiation and one function-local static std::regex per distinct template argument value. If Q1 folds, the instantiation count equals the unique-pattern count. If it doesn’t, it equals the annotation-site count.

To keep the narrative honest I wrote the abort criteria down before running the compiler:

  • A1 — template <std::meta::info Pattern> itself rejected → give up on Design A, fall back to template <auto Ann> over the Regex<N> value.
  • A2 — declaration accepted but body can’t recover the pattern → give up on A, include the exact error in the post.
  • A3 — everything compiles, but the Q1 probe shows per-site granularity → keep A, write it up as “works but not the cache key we wanted.”

The code

template <std::meta::info Pattern>
const std::regex& get_regex_for() {
    constexpr auto args = std::define_static_array(
        std::meta::template_arguments_of(
            std::meta::type_of(Pattern)));
    constexpr std::size_t N =
        std::meta::extract<std::size_t>(args[0]);
    constexpr auto r = std::meta::extract<Regex<N>>(Pattern);
    static const std::regex re{r.pattern};
    return re;
}

No splices anywhere. type_of, template_arguments_of, and extract are plain metafunction calls taking reflection values and returning them; N comes out as a constexpr std::size_t, and Regex<N> is a concrete type name. The call site in the validator:

template for (constexpr auto ann :
              std::define_static_array(
                  std::meta::annotations_of(member)))
{
    if constexpr (std::meta::template_of(
                      std::meta::type_of(ann)) == ^^Regex) {
        const std::regex& re = get_regex_for<ann>();
        /* regex_match, record error, same as before */
    }
}

ann is a constexpr auto over a template for range — a reflection value of type std::meta::info. Passing it as a template argument of NTTP type std::meta::info isn’t a splice; it’s the ordinary NTTP pass. Whether that works when the same expression is illegal under splice-into-template-argument is exactly Q2.

What compiled

A1 cleared — the declaration was accepted without complaint. A2 cleared — the body compiled with the Stage 9 recipe reused verbatim, no error at any extract site. So the splice wall in Stage 9 really was about the loop-variable path specifically, not about reflection values as template arguments in general. Under clang-p2996 + libc++, template <std::meta::info Pattern> is a first-class way to plumb annotation reflections into function templates.

Q1 — the probe

The sample now has four Regex annotations, two of which share a pattern:

struct Sample {
    [[=Regex{"^[a-z]+$"}]]    std::string name;
    [[=Regex{"^\\d+$"}]]      std::string digits;
    [[=Regex{"^\\S+@\\S+$"}]] std::string email;
    [[=Regex{"^[a-z]+$"}]]    std::string nickname;   // same pattern as name
    std::string               no_annotation;
};

Walk reflection, call get_regex_for<ann>() for each, print the address of the returned static const std::regex:

name      pattern="^[a-z]+$"    &regex = 0x5eeb2dabb1d8
digits    pattern="^\d+$"       &regex = 0x5eeb2dabb220
email     pattern="^\S+@\S+$"   &regex = 0x5eeb2dabb268
nickname  pattern="^[a-z]+$"    &regex = 0x5eeb2dabb1d8

name and nickname land on the same address. Under this implementation and setup, two std::meta::info values reflecting the same Regex<N> annotation are template-parameter-equivalent — one instantiation, one cache slot, regardless of how many sites use the same pattern. This is the “best case” answer for Q1, better than I had predicted going in.

The usual caveat: this is what clang-p2996 does for these annotation values in this translation unit. It doesn’t tell you about cross-TU mangling, about reflections of TU-local entities, or about equivalence rules that may still be settling in the standard. But for the purpose of “is the reflection a cache key for a single-TU validator,” the answer here is yes.

The linker agrees. nm --demangle on the binary finds three get_regex_for instantiations, not four:

_ZZ13get_regex_forIMatl5RegexILm9EELA9_KcE...E2re   # Regex<9>  — name + nickname
_ZZ13get_regex_forIMatl5RegexILm6EELA6_KcE...E2re   # Regex<6>  — digits
_ZZ13get_regex_forIMatl5RegexILm10EELA10_KcE...E2re # Regex<10> — email

Four annotation sites → three static slots. The fold survived codegen.

Q2 — the splice wall didn’t reappear

The recovery path inside get_regex_for<Pattern> is exactly Stage 9’s recipe:

constexpr auto args = std::define_static_array(
    std::meta::template_arguments_of(std::meta::type_of(Pattern)));
constexpr std::size_t N = std::meta::extract<std::size_t>(args[0]);
constexpr auto r = std::meta::extract<Regex<N>>(Pattern);

No [:Pattern:] anywhere. And no ^^Pattern either — Pattern is already a reflection; applying ^^ to a reflection variable isn’t meaningful (and the recent direction of the P2996 wording narrows rather than broadens those operators), so there’s no reason to try it. The fact that this body compiles, where the equivalent extract<[:t:]>(ann) in Stage 9 did not, is the direct confirmation: template-parameter reflection and template for loop-variable reflection look the same at the source level but sit in different constant-expression categories as far as splice legality goes.

Q3 — what we traded

Benchmark (5-run median, -O2, clang-p2996 + libc++, 50k iterations; Stage 12’s Sample has 4 Regex annotations, so per-op is the apples-to-apples metric):

| path | per regex op (median) | vs floor |
| --- | --- | --- |
| Stage 10 naive (rebuild each call) | 584 ns | +274 |
| Stage 11 locked cache | 395 ns | +85 |
| Stage 11 unlocked cache (ref) | 379 ns | +69 |
| Stage 12 template cache | 313 ns | +3 |
| pre-built regex, match only (floor) | 310 ns | 0 |

Stage 12 lands inside measurement noise of the theoretical floor. Single runs vary over 300–320 ns; the floor (Stage 10’s pre-built match measurement) ranged over 292–329 ns across its own runs. There is no daylight between “validator with Stage 12 cache” and “regex built once, matched many times, no validator around it” — the runtime overhead of the caching layer, in this setup, has effectively disappeared into measurement noise.

Binary size, which is the thing we were supposedly trading runtime cost for:

stage 10 naive        : 224,880 bytes
stage 11 static cache : 236,888 bytes   (+12,008 over stage 10)
stage 12 template cache: 225,944 bytes   (+1,064 over stage 10, −10,944 vs stage 11)

Stage 12 is not only faster than Stage 11; it’s smaller. Three function instantiations plus three std::regex statics cost less than the std::unordered_map + std::mutex + std::string key infrastructure that Stage 11 pulled in. The codegen trade-off we were worried about — “fine for three patterns, might explode for three hundred” — is a real axis, but in the range where the regex count is anything like what a real validator would see, the static-map approach is the one carrying the extra binary, not the template one.

What Stage 12 actually showed

Three findings worth remembering:

  1. template <std::meta::info Pattern> folds by value, not by annotation-site identity, under clang-p2996. Same-pattern annotations at different sites collapse to one instantiation — one static regex, one symbol, one cache slot. The cache key we wanted is the cache key we got.
  2. The Stage 9 splice wall was path-specific, not category-specific. Reflections held in template-for loop variables can’t be spliced into template arguments — extract<[:t:]>(ann) fails. Reflections held in template parameters can be handed straight to type_of, template_arguments_of, and extract as ordinary metafunction inputs, no splice required. The restriction was on the splice path, not on passing a reflection as an NTTP and then using ordinary metafunctions inside the body. Those two positions look similar at the source level but the compiler treats them very differently — worth internalising for anyone doing reflection-driven templates in anger.
  3. The “codegen cost” framing for template caches is less scary than it looks, at least at typical validator scale. On every axis we measured — per-op time, per-validate time, binary size — Stage 12 came in at or below Stage 11. The runtime machinery needed to make a std::unordered_map<std::string, std::regex> thread-safe is genuinely expensive, and a cache that doesn’t need any of it wins cleanly.

The earlier post ended with Regex listed as “explicitly out of scope — a separate post’s worth of design.” Four stages in, it turned into: annotation patterns live in the type (Stage 9), the runtime cost of reconstructing them is real but not astronomical (Stage 10), the obvious caching strategy works and costs a mutex (Stage 11), and the P2996-native caching strategy works better, costs less, and drops into the same call site unchanged (Stage 12).

The next post is about containers — std::vector<T> and std::optional<T> as validated fields, and what the aggregate-based dispatch has to look like when “the field” is no longer a scalar.
EOF — 2026-04-18