Frequently Asked Questions
Common questions about how ScryData works, how it compares to other tools, and what it costs to run in production.
What is shadow database testing?
Shadow database testing means applying your proposed schema change to a copy of your database that's receiving real production traffic — before you deploy. ScryData proxies queries between your application and PostgreSQL, captures every query, and replays it against the shadow. If a query that ran in 3ms now takes 200ms after your migration, you know before your users do.
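As a rough illustration of the comparison described above, the sketch below flags queries whose shadow latency is much worse than their production latency. The QueryTiming shape, the find_regressions name, and both thresholds are assumptions made for this example, not ScryData's actual API or defaults.

```python
from dataclasses import dataclass


@dataclass
class QueryTiming:
    """Latency of one captured query on production and on the shadow (hypothetical shape)."""
    sql: str
    prod_ms: float
    shadow_ms: float


def find_regressions(timings, ratio_threshold=2.0, min_delta_ms=5.0):
    """Return timings where the shadow is slower both relatively and absolutely.

    Both thresholds are illustrative assumptions; requiring both avoids
    flagging a 0.1ms -> 0.3ms change as a regression.
    """
    flagged = []
    for t in timings:
        slower_by_ratio = t.shadow_ms >= t.prod_ms * ratio_threshold
        slower_by_delta = (t.shadow_ms - t.prod_ms) >= min_delta_ms
        if slower_by_ratio and slower_by_delta:
            flagged.append(t)
    return flagged


timings = [
    QueryTiming("SELECT * FROM orders WHERE user_id = $1", 3.0, 200.0),
    QueryTiming("SELECT count(*) FROM users", 12.0, 12.5),
]
regressions = find_regressions(timings)
for t in regressions:
    print(f"regression: {t.sql} ({t.prod_ms}ms -> {t.shadow_ms}ms)")
```

Here only the first query is flagged: it goes from 3ms to 200ms, while the second changes by half a millisecond.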
How does ScryData compare to pt-upgrade?
pt-upgrade is a Percona Toolkit utility that replays MySQL slow-log files against two instances to validate version upgrades. ScryData is built primarily for schema migrations — validating that your index additions, column changes, or table restructures don't regress query performance before you deploy. It also works for upgrade testing (new database version, same schema). The key technical differences: ScryData captures live traffic continuously via a wire-protocol proxy rather than replaying static log files, and uses CDC replication to keep the shadow database current with production data, so you're always testing against real data distribution and volume.
What is the production overhead?
Less than 1ms per query at p50, benchmarked on commodity hardware against PgBouncer as the baseline. At 100 concurrent connections, scry-proxy adds roughly 1% latency. Event publishing to the shadow replay pipeline is fully async and best-effort — if the buffer fills, oldest events are dropped rather than slowing down queries. Observability never sits in the query path.
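The drop-oldest behavior can be pictured with a small bounded buffer. This is a minimal sketch in Python, assuming a fixed-capacity in-memory queue; the BestEffortBuffer class and its method names are hypothetical and do not describe scry-proxy's internals.

```python
from collections import deque


class BestEffortBuffer:
    """Bounded event buffer that evicts the oldest event instead of blocking (illustrative)."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards from the left when appended to on the right.
        self._events = deque(maxlen=capacity)
        self.dropped = 0

    def publish(self, event):
        """Add an event without ever waiting; count an eviction if the buffer is full."""
        if len(self._events) == self._events.maxlen:
            self.dropped += 1
        self._events.append(event)

    def drain(self):
        """Hand all buffered events to the consumer and empty the buffer."""
        items = list(self._events)
        self._events.clear()
        return items


buf = BestEffortBuffer(capacity=3)
for i in range(5):
    buf.publish(f"query-{i}")
dropped_count = buf.dropped
remaining = buf.drain()
print(remaining, dropped_count)
```

With a capacity of 3 and five published events, the two oldest events are evicted and the three newest remain. The publisher never waits on the consumer, which is the property that keeps capture out of the query path.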
Does ScryData replace PgBouncer?
Yes. scry-proxy is a PgBouncer replacement, not an additional layer. It provides connection pooling (deadpool-based, with comparable benchmark numbers) plus query capture, circuit breaking, and health monitoring, which PgBouncer doesn't have. Topology: App → scry-proxy → PostgreSQL. At 100 concurrent connections, scry-proxy beats PgBouncer at p99 (13.3ms vs 14.1ms).
How are non-deterministic queries handled?
The replay engine uses AST-based SQL transformation before executing against the shadow. NOW() and CURRENT_TIMESTAMP are replaced with a fixed timestamp; SET, SHOW, and EXPLAIN are filtered out. The primary regression signal is latency, not result-set comparison — which sidesteps most non-determinism concerns. A query that returns different random values but runs in the same time is not a regression.
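To make the transformation concrete, here is a deliberately simplified sketch. It uses regular expressions rather than an AST, so it is a stand-in for the idea and not the replay engine's method: unlike an AST-based rewrite, it would also alter matching text inside string literals or comments. The prepare_for_replay function and the fixed timestamp value are assumptions for illustration.

```python
import re

# Assumed fixed value; any constant timestamp would serve the same purpose.
FIXED_TIMESTAMP = "TIMESTAMPTZ '2024-01-01 00:00:00+00'"

# Statements that are filtered out rather than replayed.
_SKIPPED = re.compile(r"^\s*(SET|SHOW|EXPLAIN)\b", re.IGNORECASE)

# NOW() with optional whitespace, or the CURRENT_TIMESTAMP keyword.
_TIME_CALLS = re.compile(r"\bNOW\s*\(\s*\)|\bCURRENT_TIMESTAMP\b", re.IGNORECASE)


def prepare_for_replay(sql):
    """Return the SQL to run against the shadow, or None if it should be skipped."""
    if _SKIPPED.match(sql):
        return None
    return _TIME_CALLS.sub(FIXED_TIMESTAMP, sql)


print(prepare_for_replay("SELECT * FROM events WHERE created_at < now()"))
print(prepare_for_replay("SHOW server_version"))
```

A query with no time function passes through unchanged, so its latency on the shadow can be compared directly with production.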
Is ScryData open source?
scry-proxy (the PostgreSQL proxy) is open source under Apache 2.0. The full platform — shadow management, replay engine, regression analysis — is not open source yet. The scry demo command runs the entire stack locally so you can evaluate it end-to-end without a cloud account.