Test Fixtures — MedicalHelpDesk Corp¶

This document defines the reference source database used to test PrivaCI end-to-end. It is the test-side counterpart to docs/local-development.md.

The fixture is a fictional SaaS company called MedicalHelpDesk Corp — a multi-tenant support tool sold to medical practices, hospitals, and clinics. The choice is deliberate:

PHI density — patients, providers, prescriptions, visits all carry HIPAA-relevant identifiers (MRN, NPI, DEA, diagnosis codes, insurance member IDs). This exercises the future HIPAA vertical config pack on a realistic schema.
Free-text NER bait — support ticket messages and visit notes contain narrative prose mentioning patient names and dates, perfect for Level 2 SpaCy NER testing.
Multi-tenancy — every meaningful table has an org_id, which exercises FK consistency across tables.
Naturally cyclic FKs — every SaaS schema with "primary contact" and "owner" pointers has cycles. We use this organically rather than contriving one.
Two tenants worth of data in one schema — the SaaS-product side (public) and the patient-data side (clinical) live in separate schemas, exercising cross-schema FK behavior.

The full schema and generator scripts live at tests/fixtures/demo_corp/.

0. Implementation status¶

What exists today: the mini tier — all major tables from the schema below with CI-friendly row counts — generated by tests/fixtures/demo_corp/ and committed at tests/fixtures/sql/demo-corp/. Regenerate with make fixtures-generate (or python -m tests.fixtures.demo_corp.generate --tier mini). The full demo tier (millions of rows, ~300–500 MB loaded) uses the same generator with --tier demo and is not committed.

Loading the mini dataset¶

With compose.dev.yml up, load it into the source database and inspect it. Use the project venv (activate it first, or prefix commands with .venv/bin/):

source .venv/bin/activate   # once per shell

SOURCE_DB_URL=postgresql://postgres:dev@127.0.0.1:55432/privaci_source \
    python -m scripts.load_sample_data

SOURCE_DB_URL=postgresql://postgres:dev@127.0.0.1:55432/privaci_source \
    privaci catalog inspect

catalog inspect should report 20+ tables, partitioned raw_events and patient_visits, deferred FK edges breaking the org↔user cycle, a polymorphic_fk_warning on comments, and implied-FK warnings on patient_documents.

Heads-up: the loader runs DROP SCHEMA ... CASCADE for public, clinical, auth, and audit_internal before recreating them, so it replaces the Week-1 spike tables in the source database's public schema. Re-create those by recreating the compose volume if you need them back.

1. Three-Tier Strategy¶

   ┌────────────────────────────────────────────────────────────┐
   │  TIER 1 — Mini fixtures (programmatic, per-test)            │
   │  Built in-memory by pytest fixtures.                        │
   │  1-100 rows. Each test picks the minimum schema subset it   │
   │  needs.                                                      │
   │  Run-time per test: < 1 second.                              │
   └────────────────────────────────────────────────────────────┘
                                │
   ┌────────────────────────────────────────────────────────────┐
   │  TIER 2 — Demo Corp (committed)                             │
   │  Faker-seeded generator. Cached SQL committed to repo.       │
   │  ~500 orgs / 10k users / 100k patients / 1M ticket msgs.    │
   │  ~300-500 MB loaded into Postgres.                          │
   │  ~50-80 MB compressed on-disk (gzipped binary COPY).        │
   │  Loads into a fresh Postgres via docker compose in ~60s.    │
   │  Used by: `compose.yml` smoke, E2E integration tests,       │
   │           release CI smoke gate, demos.                      │
   └────────────────────────────────────────────────────────────┘
                                │
   ┌────────────────────────────────────────────────────────────┐
   │  TIER 3 — Stress (generated on demand, NOT committed)       │
   │  Same schema, 10x to 100x scale.                            │
   │  Up to ~50 GB loaded.                                        │
   │  Used by: performance benchmarks, week-1 throughput spike.   │
   │  Constant-memory streaming is validated separately by the   │
   │  server-side 1 GB load test (see section 8).                 │
   └────────────────────────────────────────────────────────────┘

Determinism rule: every tier uses Faker.seed_instance(42) and random.seed(42). Same seed, same rows, byte-for-byte. This is what makes "the masked email for john.doe@… is always X" testable.

2. The Schema¶

══════════════════════════════════════════════════════════════════════
public schema   (the SaaS-product side)
══════════════════════════════════════════════════════════════════════

  ┌─────────────────────┐         ┌──────────────────────┐
  │   organizations     │◀────────│   subscriptions      │
  │  id IDENTITY        │         │  id IDENTITY         │
  │  name               │         │  org_id FK           │
  │  ein                │         │  plan                │
  │  billing_email UQ   │         │  started_at          │
  │  owner_user_id  ┐   │         │  billing_address_id  │
  │  primary_user_id│ ──┼──┐      └──────────┬───────────┘
  └─────────┬───────┘   │  │                 │
            │           │  │ ⤴ CYCLE         ▼
            ▼           │  │         ┌──────────────────────┐
  ┌─────────────────────┐ │ │         │     invoices         │
  │       users         │◀┘ │         │  id IDENTITY          │
  │  id IDENTITY        │   │         │  subscription_id FK  │
  │  email UQ           │◀──┘         │  period_start         │
  │  phone              │             │  total_cents          │
  │  first_name         │             └──────────┬───────────┘
  │  last_name          │                        │
  │  dob                │                        ▼
  │  ssn                │             ┌──────────────────────┐
  │  password_hash      │             │ invoice_line_items   │
  │  role               │             │  invoice_id FK       │
  │  org_id  FK ────────┘             │  line_no              │
  │  manager_id (self) ──┐            │  PRIMARY KEY         │
  │  last_login_ip       │            │   (invoice_id,       │
  │                  ←───┘            │    line_no)          │
  └────┬─────────────────┘            └──────────────────────┘
       │
       ├──────────────────────┐
       ▼                      ▼
  ┌───────────────────┐  ┌──────────────────────────┐
  │ user_addresses    │  │ user_payment_methods     │
  │ id IDENTITY       │  │ id IDENTITY              │
  │ user_id FK        │  │ user_id FK               │
  │ street/city/...   │  │ card_number / cvv /      │
  │                   │  │ expiry / cardholder_name │
  └───────────────────┘  └──────────────────────────┘

  ┌─────────────────────┐
  │     employees       │   practice-side staff (separate from `users`)
  │  id IDENTITY        │
  │  org_id FK          │
  │  manager_id (self)  │   self-referential
  │  full_name          │
  │  ssn                │
  │  dob                │
  │  hire_date          │
  │  salary             │
  │  npi (some only)    │   some employees are also `providers`
  └─────────────────────┘

  ┌─────────────────────┐         ┌──────────────────────┐
  │      tickets        │         │   ticket_messages    │
  │  id IDENTITY        │◀────────│  id IDENTITY         │
  │  org_id FK          │         │  ticket_id FK        │
  │  reporter_user_id   │         │  author_user_id FK   │
  │  subject            │         │  body (LARGE TEXT)   │ ← L2 NER bait
  │  status             │         │  posted_at           │
  │  priority           │         └──────────────────────┘
  │  assigned_to_user   │
  └─────────────────────┘

  ┌──────────────────────────────────────────────────────┐
  │                    comments                          │
  │  id IDENTITY                                         │
  │  commentable_type TEXT  ← 'Ticket' | 'Patient' |    │
  │  commentable_id  BIGINT                              │
  │  author_user_id  FK→users                            │
  │  body                                                │
  │                                                      │
  │  POLYMORPHIC FK — no catalog edge.                   │
  │  Engine emits a `polymorphic_fk_warning`.            │
  └──────────────────────────────────────────────────────┘

  ┌──────────────────────────────────────────────────────┐
  │                  raw_events                          │
  │  PARTITIONED BY RANGE (event_at) — 24 monthly parts │
  │                                                      │
  │  raw_events  (parent)                                │
  │   ├─ raw_events_2024_01 ... raw_events_2025_12       │
  │                                                      │
  │  60+ columns including jsonb payload.                │
  │  Wide row → exercises batch auto-tune.               │
  └──────────────────────────────────────────────────────┘

  ┌──────────────────────────────────────────────────────┐
  │              geo_locations                           │
  │  id, name, region_path ltree                         │
  │  Contains UNSUPPORTED type (ltree) → exercises       │
  │  per-table text-mode COPY fallback.                  │
  └──────────────────────────────────────────────────────┘

  ┌──────────────────────────────────────────────────────┐
  │              active_clinics_v   (VIEW)                │
  │  Plain view. Engine SHALL skip + audit log.          │
  │                                                      │
  │              monthly_revenue_v  (VIEW)                │
  │  Aggregating view. Engine SHALL skip + audit log.    │
  │                                                      │
  │              tickets_open_mv    (MATERIALIZED VIEW)   │
  │  Engine SHALL skip + audit log.                      │
  └──────────────────────────────────────────────────────┘

══════════════════════════════════════════════════════════════════════
clinical schema   (the patient-data side — HIPAA territory)
══════════════════════════════════════════════════════════════════════

  ┌─────────────────────┐         ┌──────────────────────┐
  │     providers       │         │      patients        │
  │  id IDENTITY        │◀────┐   │  id IDENTITY         │
  │  npi UQ             │     │   │  mrn UQ              │
  │  dea_number         │     │   │  first_name          │
  │  first_name         │     │   │  last_name           │
  │  last_name          │     │   │  full_name           │  ← composite
  │  email UQ           │     │   │  dob                 │
  │  specialty          │     │   │  ssn                 │
  │  org_id FK ────────────┐  │   │  phone               │
  │     (CROSS-SCHEMA →    │  └───│  primary_provider_id │
  │      public.orgs)     │      │  org_id FK ──────┐   │
  └───────────────────────│      │     (CROSS-SCHEMA)│   │
                          │      │  insurance_member_id│  │
                          │      │  insurance_group   │  │
                          │      └─────────┬──────────┘  │
                          │                │             │
                          └────────────────┼─────────────┘
                                           │
                                           ▼
  ┌──────────────────────────────────────────────────────┐
  │                 patient_visits                       │
  │  PARTITIONED BY LIST (region_code) —                 │
  │    visits_us_east / us_west / us_central / intl      │
  │                                                      │
  │  id IDENTITY (composite with region_code)            │
  │  patient_id FK                                       │
  │  provider_id FK                                      │
  │  visit_date                                          │
  │  visit_type                                          │
  │  region_code                                         │
  │  diagnosis_code                                      │
  │  chief_complaint TEXT                                │
  │  visit_notes TEXT (LARGE — L2 NER bait, narrative   │
  │   prose mentioning patient names and dates)          │
  └──────────────────────────────────────────────────────┘

  ┌─────────────────────┐
  │   prescriptions     │
  │  id IDENTITY        │
  │  patient_id FK      │
  │  provider_id FK     │
  │  drug_name          │
  │  dosage             │
  │  refills            │
  │  prescribed_at      │
  └─────────────────────┘

  ┌──────────────────────────────────────────────────────┐
  │              patient_documents                       │
  │  id IDENTITY                                         │
  │  patient_id FK→patients                              │
  │  referring_provider_email TEXT  ← IMPLIED FK,        │
  │   informally references clinical.providers.email     │
  │   (no FK constraint declared)                        │
  │  document_blob bytea                                 │
  │  uploaded_by_user_email TEXT  ← IMPLIED FK to       │
  │   public.users.email                                 │
  │                                                      │
  │  Engine emits an `implied_fk_warning` for both.      │
  │  User config can mitigate via `seed_alias` to make   │
  │  faked values consistent.                            │
  └──────────────────────────────────────────────────────┘

══════════════════════════════════════════════════════════════════════
auth schema   (cross-schema FKs)
══════════════════════════════════════════════════════════════════════

  ┌─────────────────────┐         ┌──────────────────────┐
  │      sessions       │         │      api_keys        │
  │  id IDENTITY        │         │  id IDENTITY         │
  │  user_id FK ───────────┐      │  user_id FK ───────┐ │
  │   (→ public.users)    │      │   (→ public.users) │ │
  │  token UUID UQ         │      │  key_hash           │ │
  │  ip_address           │      │  last_used_at       │ │
  │  user_agent           │      │  last_used_ip       │ │
  │  created_at           │      │  scopes TEXT[]      │ │
  │  expires_at           │      └─────────────────────┘ │
  └───────────────────────┘                              │
                                                         │

══════════════════════════════════════════════════════════════════════
audit_internal schema   (exclude-strategy test)
══════════════════════════════════════════════════════════════════════

  ┌──────────────────────────────────────────────────────┐
  │             audit_log_events                         │
  │  occurred_at, actor_user_id, action, target,         │
  │  payload jsonb                                       │
  │                                                      │
  │  NO PRIMARY KEY — exercises:                         │
  │   • table-level checkpoint fallback                  │
  │   • the `strategy: exclude` config path (default     │
  │     test config excludes this table)                 │
  └──────────────────────────────────────────────────────┘

What every quirky table teaches¶

Table / object	Lesson exercised
`organizations` ↔ `users` (`owner_user_id`, `primary_user_id`)	2-table FK cycle, `DEFERRABLE` constraint, cycle-break heuristic
`users.manager_id`	Self-referential FK
`employees.manager_id`	Self-referential FK in a non-user table
`invoice_line_items`	Composite PK — keyset-pagination + UNIQUE composite faker
`audit_internal.audit_log_events`	No PK → table-level checkpoint fallback; `exclude` strategy in default config
`comments` (`commentable_type`, `commentable_id`)	Polymorphic FK → warning, not failure
`auth.sessions` / `auth.api_keys`	Cross-schema FK
`clinical.providers.org_id`	Cross-schema FK between non-`public` schemas
`raw_events` (24 partitions, monthly)	Range-partitioned table — clone parent + children, per-partition checkpoint
`clinical.patient_visits` (4 list partitions)	List-partitioned table — same as above, list variant
`raw_events` (60+ columns)	Wide row → batch auto-tune
`ticket_messages.body`, `patient_visits.visit_notes`	Free-text → SpaCy NER (L2)
`users`	Every PII pattern: email (UNIQUE), phone, ssn, dob, password, name, address
`user_payment_methods`	PCI-DSS pattern coverage (card_number / cvv / expiry)
`clinical.patients`	HIPAA pattern coverage (mrn, dob, ssn, insurance_member_id)
`clinical.providers.npi`, `dea_number`	HIPAA provider identifiers
`clinical.patients.full_name` + UNIQUE on `(first_name, last_name, dob)`	Composite UNIQUE → faker uniqueness suffix
`clinical.patient_documents.*_email`	Implied (soft) FK → warning + `seed_alias` mitigation
`geo_locations.region_path ltree`	Unsupported type → text-mode COPY fallback
`active_clinics_v`, `monthly_revenue_v`	Plain views — engine skips + audit-log entry
`tickets_open_mv`	Materialized view — engine skips + audit-log entry
`raw_events.id` (SERIAL) + most others (IDENTITY)	Both legacy `SERIAL` and modern `IDENTITY` sequences

Edge-case fixtures (separate small SQL files)¶

These don't belong in the main Demo Corp because they're failure-mode tests. Each lives at tests/fixtures/sql/edge-cases/<name>.sql and is loaded only by the tests that need it.

File	Failure mode tested
`non-deferrable-cycle.sql`	Cycle with `NOT DEFERRABLE` constraint → pre-flight exit `2` with SQL hint
`no-primary-key.sql`	Table with no PK → table-level checkpoint fallback
`unsupported-types.sql`	PostGIS-like custom domain → text-mode fallback
`permission-denied.sql`	Role lacks `CREATE SCHEMA` → pre-flight exit `2`
`dangling-fk-exclude.sql`	`exclude`'d table is FK target → config validation exit `3`
`composite-pk-only.sql`	Only composite PKs, no monotonic single-column key
`adversarial-types.sql`	Deep FK chain + 4-byte UTF-8, large jsonb, max numeric, arrays, bytea, inet → byte-for-byte round-trip

3. Generator Layout¶

tests/fixtures/demo_corp/
├── README.md
├── __init__.py
├── schema.py                  ← DDL as Python — single source of truth
├── generate.py                ← CLI: produce SQL files at given tier
├── seed.py                    ← Faker seeding + deterministic helpers
├── providers/                 ← per-domain row generators
│   ├── __init__.py
│   ├── orgs.py
│   ├── users.py
│   ├── patients.py
│   ├── visits.py
│   ├── tickets.py
│   ├── messages.py            ← long-form narrative w/ PHI mentions
│   └── events.py
└── tiers.py                   ← scale parameters per tier

tests/fixtures/sql/
├── demo-corp/                 ← Tier 2 output (committed, ~50-80 MB)
│   ├── 00-schemas.sql
│   ├── 01-tables-public.sql
│   ├── 02-tables-clinical.sql
│   ├── 03-tables-auth.sql
│   ├── 04-tables-audit-internal.sql
│   ├── 05-partitions.sql
│   ├── 06-constraints.sql
│   ├── 07-indexes.sql
│   ├── 08-views.sql
│   └── 09-data/               ← gzipped binary COPY files per table
│       ├── public.organizations.copy.gz
│       ├── public.users.copy.gz
│       └── ...
└── edge-cases/
    ├── non-deferrable-cycle.sql
    └── ...

Generation invocation¶

# Regenerate Tier 2 (committed Demo Corp dataset)
python -m tests.fixtures.demo_corp.generate --tier demo \
    --out tests/fixtures/sql/demo-corp

# Generate Tier 3 stress dataset (not committed)
python -m tests.fixtures.demo_corp.generate --tier stress \
    --scale 10 \
    --out /tmp/stress

# Tier 1 (mini) is built in-memory by pytest fixtures; no script
# invocation needed.

The generator is reproducible: make fixtures-verify regenerates Tier 2 into a temp dir and diffs against the committed files. CI fails if the committed SQL drifts from what the code would produce.

Tier scale parameters (Tier 2 / Demo)¶

Entity	Count
organizations	500
users	10,000
employees	5,000
providers	1,000
patients	100,000
patient_visits	500,000 (across 4 list partitions)
prescriptions	250,000
patient_documents	25,000
tickets	50,000
ticket_messages	1,000,000 (free-text; the bulk of disk)
invoices	25,000
invoice_line_items	100,000
raw_events	5,000,000 (across 24 monthly partitions)
comments (polymorphic)	50,000

Loaded into Postgres: ~300-500 MB. Compressed COPY files on disk: ~50-80 MB. docker compose up → masked target ready in under 60 seconds on a modern laptop.

4. Reference Configs¶

The fixtures ship with three reference configs at tests/fixtures/configs/:

Config	Purpose
`demo-corp.yaml`	The "happy path" config — every table addressed explicitly, sane defaults
`minimal.yaml`	Only required fields. Relies entirely on auto-detect. Tests the zero-config story
`strict.yaml`	`strict_autodetect: true` + explicit passthrough for known-safe columns. Tests the compliance-mode path

Snippet from demo-corp.yaml (excerpt — full file in repo):

version: "1.0"
global_salt: env://PRIVACI_TEST_SALT
on_existing_data: fail
strict_autodetect: false

tables:
  public.organizations:
    strategy: transform
    columns:
      name:           { action: fake, provider: company }
      ein:            { action: regex_mask, pattern: '^\d{2}-\d{7}$', replace: "00-0000000" }
      billing_email:  { action: fake, provider: email }

  public.users:
    strategy: transform
    columns:
      email:          { action: fake, provider: email }
      phone:          { action: fake, provider: phone }
      first_name:     { action: fake, provider: first_name }
      last_name:      { action: fake, provider: last_name }
      dob:            { action: fake, provider: dob }
      ssn:            { action: fake, provider: ssn }
      password_hash:  { action: static, value: "privaci-test-pw" }
      last_login_ip:  { action: fake, provider: ip_address }

  clinical.patients:
    strategy: transform
    columns:
      mrn:                  { action: hash }
      first_name:           { action: fake, provider: first_name }
      last_name:            { action: fake, provider: last_name }
      full_name:            { action: fake, provider: full_name }
      dob:                  { action: fake, provider: dob }
      ssn:                  { action: fake, provider: ssn }
      phone:                { action: fake, provider: phone }
      email:                { action: fake, provider: email }
      insurance_member_id:  { action: hash }

  clinical.patient_documents:
    strategy: transform
    columns:
      # Aliases make implied FKs deterministic across tables
      referring_provider_email:
        action: fake
        provider: email
        seed_alias: clinical.providers.email
      uploaded_by_user_email:
        action: fake
        provider: email
        seed_alias: public.users.email
      document_blob: { action: passthrough }   # synthetic blob, no PII

  public.ticket_messages:
    strategy: transform
    columns:
      body: { action: ner_mask }              # L2 SpaCy

  clinical.patient_visits:
    strategy: transform
    columns:
      visit_notes: { action: ner_mask }

  audit_internal.audit_log_events:
    strategy: exclude

5. Assertions Every Integration Test Must Make¶

When a test runs the engine against Demo Corp (full or subset), the test SHOULD verify, at minimum:

# Original PII values do not appear anywhere in target
assert "john.doe@acme-clinic.example" not in serialized_target_rows

# FK integrity preserved
assert all_fks_valid(target_conn)

# Deterministic faker consistency
masked_email_in_users = pick_one(target_conn, "public.users", "email")
masked_email_in_messages_authored_by_same_user = ...
assert they_match_via_seed()

# UNIQUE constraints satisfied after masking
assert no_unique_violations(target_conn)

# Audit log written
assert (
    audit_count(target_conn, event_type="column.masked") > 0
)

# Skipped views recorded
assert "active_clinics_v" in skipped_objects(target_conn)
assert "tickets_open_mv" in skipped_objects(target_conn)

# Polymorphic + implied FKs warned
assert (
    audit_count(target_conn, event_type="polymorphic_fk_warning") == 1
)
assert (
    audit_count(target_conn, event_type="implied_fk_warning") >= 2
)

# Partitions all loaded
assert partition_count(target_conn, "public.raw_events") == 24
assert partition_count(target_conn, "clinical.patient_visits") == 4

# No raw PII in audit log
assert no_pii_in_audit_log(target_conn)

The helper module tests/integration/assertions.py provides these as composable functions so every integration test reads identically.

6. What This Fixture Does NOT Cover¶

Honesty matters. Things this fixture intentionally does not cover yet — captured as future work:

Multi-region partitioning at scale (hundreds of partitions). Tier 2 covers 24 (range) and 4 (list); v1.x test additions can expand.
Triggers / rules — current spec already says "skip + audit log"; Demo Corp doesn't ship any. Edge-case fixture with-triggers.sql can be added when needed.
Custom domains and user-defined types beyond ltree.
Mongo / non-relational sources — out of MVP scope.
Sharded databases / Citus — out of MVP scope.
Logical replication slots / publications — engine doesn't use them; Demo Corp doesn't need them.

7. Operational Notes¶

PHI must never be real. Faker generates all patient and provider data with a fixed seed. The seed.py module includes a hard assertion that prevents the generator from running if it detects any obvious real-PII patterns in the input (e.g., a name dictionary imported from a real source).
Faker locale. en_US for predictability. Vertical packs can later add en_GB, de_DE, etc.
Long-text quality. ticket_messages.body and patient_visits.visit_notes use a templated narrative generator that produces realistic prose mentioning Faker names, dates, and Faker-generated MRN-shaped strings. This is what makes L2 NER tests meaningful.
Regeneration determinism. Any change to generate.py, schema.py, seed.py, or any provider module that affects output requires a regenerated committed SQL set in the same PR. The pre-commit hook verify-fixtures enforces this.
Storage cost. ~80 MB of compressed SQL in the repo is acceptable. If it grows past ~200 MB we'll move to Git LFS or fetch-on-demand from a release artifact.

8. Large-scale load test (§18.3)¶

The beta release gate requires proof that the engine streams a multi-gigabyte table in constant memory. This is validated by a dedicated, opt-in test rather than the Tier 3 Python generator.

Why not --tier stress? The Python generators materialise rows in process memory before writing SQL, so asking them for a 1 GB dataset balloons RAM. The load test instead builds the dataset inside PostgreSQL with INSERT ... SELECT FROM generate_series(...) (see tests/integration/loadtest_data.py): the database streams generated rows straight to its own heap/TOAST storage while Python holds nothing larger than the SQL statement.

What it asserts¶

tests/integration/test_loadtest_1gb.py runs the full masking pipeline against the generated load_test.customers table and verifies:

Completeness — every source row reaches the target (row parity).
Integrity — no foreign key is left orphaned.
No leakage — the probe email customer1@example.test never survives.
Bounded memory — peak process RSS growth during the run stays under a ceiling and below half the dataset size, proving the engine never buffers the whole table.

Running it¶

The test is marked slow + integration, so it is excluded from the default suite. With compose.dev.yml up:

source .venv/bin/activate

# Default 64 MiB — a fast confidence check.
pytest tests/integration/test_loadtest_1gb.py -m "integration and slow"

# Full 1 GiB target (validated locally at 1.15 GB / 900k rows in ~40 s).
PRIVACI_LOADTEST_BYTES=1073741824 \
    pytest tests/integration/test_loadtest_1gb.py -m "integration and slow"

Environment variable	Default	Purpose
`PRIVACI_LOADTEST_BYTES`	`67108864` (64 MiB)	Minimum `pg_total_relation_size` of the customers table to build.
`PRIVACI_LOADTEST_MEM_CEILING_MB`	`512`	Maximum allowed peak RSS growth during the streamed run.