ADR-0009: Support PostgreSQL native partitioning in MVP¶
Status¶
Accepted — 2026-05-29
Context¶
PostgreSQL has shipped declarative (native) partitioning since 10
(PARTITION BY RANGE / LIST / HASH). In 2026, it is the default
choice for high-volume event, audit, and time-series tables in
real customer schemas. Customers expecting PrivaCI to "just clone
my Postgres database" will almost always have at least one
partitioned table — typically the events table they need most.
Deferring partitioning support to v1.x means:
- The flagship MedicalHelpDesk Corp test fixture (which includes
public.raw_eventspartitioned monthly andclinical.patient_visitspartitioned by region) would not represent a real customer database. - Early Marketplace customers with partitioned event tables would hit a hard fail on day 1, generating support tickets and refund requests.
- The retrofitting cost later (when the engine has shipped) is higher than building it in now: streaming, checkpointing, and config-validation all have partition-aware code paths.
The decision space is: do we model partitioned tables as a single virtual table at the streaming layer, or as N independent tables that share a parent's masking config?
Alternatives considered:
- Stream from parent only. Simplest. Postgres routes inserts to the right partition on the target. But checkpointing granularity is the whole table, so a 24-partition crash at partition 23 restarts all 24. Unacceptable for the resumability promise.
- Treat partitioned parents as one virtual table everywhere. Hides reality. The catalog already shows the structure; pretending otherwise leaks complexity into many places (e.g., the audit log would lie about which "table" was being processed).
- Per-child streaming, parent supplies masking config. Each partition child is a streaming unit. The masking pipeline resolves config from the parent. Best resumability, honest about catalog reality, modest extra code. Accepted.
Sub-partitioning (a partition that is itself partitioned) is a PostgreSQL feature but is rare in practice. Supporting it correctly requires recursive resolution of bound expressions and config inheritance. Deferred to v1.x.
Decision¶
Native partitioning is a first-class MVP capability.
Catalog layer¶
- Each partitioned-parent table records
is_partitioned = true, the strategy (RANGE,LIST,HASH), the partition-key column list, and the ordered children. - Each partition child records its parent ID and its bound expression as a verbatim string.
- Multi-level partitioning is detected at pre-flight and refused
with exit
2and a clear error message.
Schema replication¶
- The partitioned parent is created on the target first, with its
original
PARTITION BY <strategy> (<keys>)clause. - Each partition child is created with its
PARTITION OF <parent> <bound>clause. - Per-partition
strategyconfig is rejected at config validation with exit3. The user configures the parent; all children inherit.
Streaming¶
- Partition children are independent streaming units. The parent itself has no rows and is not streamed.
- Target-side writes go directly into the child via
COPY <child> FROM STDIN ..., bypassing the parent's tuple routing. This is faster (no per-row routing overhead) and guarantees we write to the partition we expect. - The masking pipeline resolves per-column actions from the parent's config.
State¶
- Each partition child has its own row in
_privaci.table_checkpoints. - Resumability is per-partition. A 24-partition crash at partition 23 resumes 23 and starts 24, leaving 1-22 untouched.
- A new partition added to the source between runs is treated as
a new table on the next run and emits a
new_tableaudit event.
Consequences¶
What this enables¶
- Realistic test coverage on the MedicalHelpDesk Corp fixture (24
monthly partitions on
raw_events, 4 list partitions onpatient_visits). - "Just point me at your Postgres" actually works for real customer schemas in 2026.
- Future per-partition parallelism: because partitions are independent streaming units, post-MVP parallelism work just schedules N children concurrently in the topo-sort layer.
What this rules out¶
- Per-partition config overrides in MVP. Customers who need partition-specific masking (rare) must configure the parent.
- Sub-partitioning in MVP. Detected, refused, documented as a future change.
- Cross-partition resume semantics. A partition that has partially completed but whose checkpoint is corrupted forces a restart of only that partition, not the whole table.
Operational implications¶
- The audit log is slightly chattier on partitioned tables (one
table.start/table.endevent per partition child). Documented; customers who don't want it can grep the audit log by partition-parent name. - A "new partition appeared mid-run" race is impossible because introspection happens at run start; new partitions appearing between runs are picked up cleanly on the next run.
Risks¶
- Bound-expression replication fidelity.
FOR VALUES FROM (...) TO (...)andFOR VALUES IN (...)cover the common cases; expression bounds (e.g., a function call) are rare but possible and may require special handling. Integration-tested. - Per-child streaming = N COPY commands. For tables with hundreds of small partitions, the per-COPY overhead may dominate. Mitigation: introspection can group small adjacent partitions into bulk COPYs in a v1.x optimization if it ever matters in production.
Decision review triggers¶
- If sub-partitioning becomes common in customer support tickets, revisit deferral.
- If per-COPY overhead on hundred-partition tables becomes a real bottleneck, implement child-grouping.
- If customers ask for per-partition config (e.g., "the 2024 partition is older data, mask it differently"), reconsider.