If you’re running PostgreSQL in production, there’s a good chance PgBouncer is sitting between your application and your database. It’s been the go-to connection pooler for over a decade, and for good reason—it’s lightweight, stable, and does exactly what it says on the tin.

But connection pooling alone doesn’t help you understand what’s happening with your queries. When latency spikes or a background job starts hammering the database, PgBouncer has no way to surface that information.

We built scry-proxy to fill that gap, and we’re releasing it as open source under the Apache 2.0 license.

The Problem with “Good Enough”

PgBouncer excels at connection pooling. It’ll happily sit in front of your database, manage a pool of connections, and keep your application from overwhelming PostgreSQL with too many connections. Mission accomplished.

But when something goes wrong—and in production, something always goes wrong—PgBouncer leaves you flying blind.

After executing 70+ production PostgreSQL migrations, we’ve seen this scenario play out more than once:

A team notices their API latency creeping up. They check their application metrics—nothing obvious. They check PgBouncer—connections look fine, pool isn’t saturated. They check PostgreSQL—CPU is spiking, but why?

After an hour of digging through pg_stat_statements, correlating timestamps, and guessing at query patterns, they finally discover that a single query (run by a background job that got misconfigured) is scanning a 50-million-row table every 30 seconds.

PgBouncer saw every one of those queries. It just had no way to tell anyone.

With scry-proxy, that query shows up immediately in your metrics—fingerprinted, timed, and ranked by impact. The investigation that took an hour becomes a 30-second dashboard check.

But it’s not just about debugging. PgBouncer’s limitations show up in other ways too:

  • No per-query metrics. You know connections are being pooled, but which queries are slow? Which ones are hammering hot tables? PgBouncer can’t tell you.
  • No circuit breaking. When your database starts struggling, PgBouncer keeps sending traffic. It’ll faithfully forward requests right into a cascading failure.
  • No query intelligence. Every query is just bytes on a wire. There’s no understanding of what’s actually being asked of your database.
  • Limited metrics. You get connection counts and pool statistics via SHOW STATS, but correlating that with application-level behavior requires additional tooling.

For years, teams have worked around these limitations by bolting on additional tools—query logging, APM agents, custom middleware. It works, but each layer adds operational complexity and none of them have the full picture that the proxy does.

scry-proxy: Drop-In Compatible, Production-Grade

We built scry-proxy to be a direct replacement for PgBouncer. If you’re already running PgBouncer, migration is straightforward:

# scry-proxy can read your existing PgBouncer config
# or use its own format with familiar concepts
listen_addr: "0.0.0.0:6432"
database_url: "postgres://user:pass@localhost:5432/mydb"

pool:
  mode: transaction  # Same modes as PgBouncer: session, transaction
  size: 100

Your application doesn’t need to change. Same port, same PostgreSQL wire protocol, same connection strings. scry-proxy can also import your existing PgBouncer configuration directly via --pgbouncer-config.

What’s supported: session and transaction pooling modes, SCRAM-SHA-256/MD5/trust/certificate auth, TLS on both sides, COPY protocol, simple and extended query protocols, cancel request forwarding.

What’s not supported (yet): LISTEN/NOTIFY passthrough, PgBouncer’s SHOW admin commands (scry-proxy has its own admin interface), auth_query for dynamic auth lookups, and HBA-style auth file format. See the migration guide for the full compatibility matrix.

What You Get That PgBouncer Can’t Offer

Hybrid Connection Pooling

If you’ve used PgBouncer, you know the tradeoff: transaction mode gives you efficient connection sharing, but breaks if your app uses prepared statements, session variables, or temp tables. Session mode works with everything, but you lose most of the pooling benefits.

scry-proxy’s hybrid mode gives you both.

pool:
  mode: hybrid  # The default - transaction efficiency with session compatibility

Hybrid mode tracks session state automatically. Stateless connections go back to the pool after each transaction—just like transaction mode. But when your application uses session-level features (SET, PREPARE, temp tables, cursors, advisory locks), scry-proxy pins that connection until the state is cleared or can be replayed. No application changes required. See Under the Hood for the implementation details.

Per-Query Observability

Every query that flows through scry-proxy is instrumented. Not with expensive tracing that adds milliseconds of latency, but with lightweight, asynchronous capture designed for production workloads. Event publishing never blocks the query path—queries complete and events are batched and flushed in the background.

You get:

  • Query fingerprinting using Blake3 hashing—group similar queries together even when parameters differ
  • Timeline breakdowns showing queue time, pool acquisition, backend execution, and publishing overhead
  • Hot data detection using probabilistic data structures (Count-Min Sketch + Top-K heap) to identify your most accessed tables and rows

This isn’t sampling. It’s every query, in production. See the benchmarks below for measured overhead.
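As a rough illustration, fingerprinting boils down to stripping literals and hashing the resulting template. This is only a sketch, not scry-proxy's implementation: the proxy uses Blake3, while this dependency-free version substitutes the standard library's DefaultHasher, and the literal stripping here is deliberately naive (it also collapses digits inside identifiers).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Replace string and numeric literals with a `?` placeholder so queries
/// differing only in parameters normalize to the same template.
/// (Naive sketch: no escape handling, and digits inside identifiers
/// also collapse.)
fn normalize(sql: &str) -> String {
    let mut out = String::with_capacity(sql.len());
    let mut chars = sql.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\'' => {
                // Skip a single-quoted string literal.
                for c in chars.by_ref() {
                    if c == '\'' {
                        break;
                    }
                }
                out.push('?');
            }
            '0'..='9' => {
                // Collapse a run of digits (and dots) into one placeholder.
                while let Some(&c) = chars.peek() {
                    if c.is_ascii_digit() || c == '.' {
                        chars.next();
                    } else {
                        break;
                    }
                }
                out.push('?');
            }
            _ => out.push(c.to_ascii_lowercase()),
        }
    }
    out
}

/// Fingerprint = hash of the normalized template. scry-proxy uses Blake3;
/// DefaultHasher stands in here to keep the sketch dependency-free.
fn fingerprint(sql: &str) -> u64 {
    let mut h = DefaultHasher::new();
    normalize(sql).hash(&mut h);
    h.finish()
}

fn main() {
    let a = fingerprint("SELECT * FROM users WHERE id = 42");
    let b = fingerprint("SELECT * FROM users WHERE id = 97");
    let c = fingerprint("SELECT * FROM orders WHERE id = 42");
    assert_eq!(a, b); // same template, different parameter
    assert_ne!(a, c); // different table, different fingerprint
}
```

The key property is that two queries with the same shape but different parameters land in the same bucket, so per-fingerprint latency stats stay meaningful.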

Lock-Free Circuit Breaker

When your database starts showing signs of distress, scry-proxy can protect it:

circuit_breaker:
  failure_threshold: 5
  success_threshold: 3
  timeout_seconds: 30

The circuit breaker is lock-free—state is managed with atomic operations (AtomicU8 for state, AtomicU32 for failure counters, AtomicU64 for timestamps), so there’s zero contention on the critical path. When failures pile up past your configured threshold, it trips open and gives your database breathing room. When health returns, it transitions to half-open and cautiously lets traffic back through.

Combined with our health monitor, the circuit breaker can even open predictively based on latency spikes and error rate anomalies, before a full failure cascade develops.
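To make the lock-free design concrete, here is a minimal sketch of the atomic state machine described above, using the same atomic types the post mentions. The structure and names are illustrative, not scry-proxy's actual code, and the predictive health-monitor integration is omitted.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, AtomicU8, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

const CLOSED: u8 = 0;
const OPEN: u8 = 1;
const HALF_OPEN: u8 = 2;

/// Lock-free circuit breaker: all state lives in atomics, so recording
/// outcomes on the hot path never takes a lock.
struct CircuitBreaker {
    state: AtomicU8,
    failures: AtomicU32,
    successes: AtomicU32,
    opened_at: AtomicU64, // unix seconds when the breaker tripped
    failure_threshold: u32,
    success_threshold: u32,
    timeout_secs: u64,
}

impl CircuitBreaker {
    fn new(failure_threshold: u32, success_threshold: u32, timeout_secs: u64) -> Self {
        Self {
            state: AtomicU8::new(CLOSED),
            failures: AtomicU32::new(0),
            successes: AtomicU32::new(0),
            opened_at: AtomicU64::new(0),
            failure_threshold,
            success_threshold,
            timeout_secs,
        }
    }

    fn now() -> u64 {
        SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs()
    }

    /// Should this request be allowed through?
    fn allow(&self) -> bool {
        match self.state.load(Ordering::Acquire) {
            OPEN => {
                // After the timeout, transition to half-open and probe.
                if Self::now() - self.opened_at.load(Ordering::Acquire) >= self.timeout_secs {
                    self.state.store(HALF_OPEN, Ordering::Release);
                    true
                } else {
                    false
                }
            }
            _ => true,
        }
    }

    fn record_success(&self) {
        if self.state.load(Ordering::Acquire) == HALF_OPEN {
            // Enough probe successes close the breaker again.
            if self.successes.fetch_add(1, Ordering::AcqRel) + 1 >= self.success_threshold {
                self.state.store(CLOSED, Ordering::Release);
                self.failures.store(0, Ordering::Release);
                self.successes.store(0, Ordering::Release);
            }
        } else {
            self.failures.store(0, Ordering::Release);
        }
    }

    fn record_failure(&self) {
        if self.failures.fetch_add(1, Ordering::AcqRel) + 1 >= self.failure_threshold {
            self.state.store(OPEN, Ordering::Release);
            self.opened_at.store(Self::now(), Ordering::Release);
            self.successes.store(0, Ordering::Release);
        }
    }
}

fn main() {
    let cb = CircuitBreaker::new(5, 3, 30);
    assert!(cb.allow());
    for _ in 0..5 {
        cb.record_failure();
    }
    assert!(!cb.allow()); // tripped open after 5 consecutive failures
}
```

Because every transition is a single atomic load or store, concurrent query tasks can check `allow()` on every request with no shared-lock contention.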

Privacy-Preserving Query Anonymization

Need to analyze production queries but can’t expose PII? scry-proxy can anonymize sensitive values while preserving query structure:

-- Original query
SELECT * FROM users WHERE email = 'alice@example.com' AND ssn = '123-45-6789'

-- Anonymized (deterministic fingerprints)
SELECT * FROM users WHERE email = 'a3x9...' AND ssn = 'b7k2...'

The fingerprints are deterministic—the same value always produces the same hash—so you can still do analysis and grouping without exposing the underlying data. This is designed to support GDPR, HIPAA, PCI DSS, and SOC 2 requirements around query logging.
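The determinism property is easy to sketch. This illustrative version uses std's DefaultHasher and a made-up `v_` token format; scry-proxy itself hashes with Blake3 (which, unlike DefaultHasher, also gives stable tokens across processes and versions).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Replace a sensitive literal with a deterministic, irreversible token.
/// Sketch only: scry-proxy uses Blake3; DefaultHasher stands in here, and
/// the `v_` prefix is a hypothetical token format.
fn anonymize(value: &str) -> String {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    format!("v_{:012x}", h.finish() & 0xffff_ffff_ffff)
}

fn main() {
    // Deterministic: the same value always yields the same token, so
    // grouping and frequency analysis still work on anonymized logs.
    assert_eq!(anonymize("alice@example.com"), anonymize("alice@example.com"));
    // Irreversible and distinct per value.
    assert_ne!(anonymize("alice@example.com"), anonymize("bob@example.com"));
}
```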

Comprehensive Prometheus Metrics

scry-proxy exports Prometheus metrics at the query level, not just the pool level:

  • Aggregate query latency histograms (p50, p95, p99)
  • Connection pool saturation and wait times
  • Circuit breaker state transitions
  • Health monitor anomaly detections
  • Event publishing throughput and backpressure

Plug it into your existing Grafana dashboards for real-time visibility into query behavior.

For per-query analysis with fingerprint breakdowns, scry-proxy publishes detailed events to your analytics backend—keeping Prometheus cardinality under control while still giving you the full picture when you need to drill down.

Under the Hood: How Hybrid Mode Works

The hybrid pooling mode is the most architecturally interesting part of scry-proxy, so it’s worth explaining how it actually works.

scry-proxy inspects the PostgreSQL wire protocol at the message level—it understands the framing of Query (‘Q’), Parse (‘P’), CommandComplete (‘C’), and ErrorResponse (‘E’) messages without fully parsing SQL. This is the same layer PgBouncer operates at, so the overhead characteristics are comparable.
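The framing itself is simple, which is why message-level inspection is cheap: every frontend/backend message after startup is one type byte followed by a big-endian i32 length (which counts the length field itself but not the type byte), then the payload. A sketch of that parsing step, based on the documented PostgreSQL protocol (function name and return shape are illustrative):

```rust
/// Split one PostgreSQL message frame off the front of `buf`.
/// Returns (type byte, payload, remaining bytes), or None if the
/// buffer doesn't yet hold a complete frame.
fn parse_frame(buf: &[u8]) -> Option<(u8, &[u8], &[u8])> {
    if buf.len() < 5 {
        return None; // need type byte + 4-byte length
    }
    let msg_type = buf[0];
    // Length is big-endian and includes itself, but not the type byte.
    let len = i32::from_be_bytes(buf[1..5].try_into().unwrap()) as usize;
    if buf.len() < 1 + len {
        return None; // incomplete frame
    }
    let payload = &buf[5..1 + len];
    let rest = &buf[1 + len..];
    Some((msg_type, payload, rest))
}

fn main() {
    // Build a simple Query ('Q') message: "SELECT 1" + NUL terminator.
    let sql = b"SELECT 1\0";
    let mut frame = vec![b'Q'];
    frame.extend_from_slice(&((4 + sql.len()) as i32).to_be_bytes());
    frame.extend_from_slice(sql);

    let (msg_type, payload, rest) = parse_frame(&frame).unwrap();
    assert_eq!(msg_type, b'Q');
    assert_eq!(payload, sql);
    assert!(rest.is_empty());
}
```

Reading the type byte is enough to classify traffic (Query vs. Parse vs. CommandComplete) without ever running a SQL parser on the hot path.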

The key difference is what scry-proxy does with that information. It maintains a per-client state machine that tracks whether a connection is stateless or pinned:

  1. Stateless (default): The connection behaves like transaction mode. After each transaction completes, the backend connection is returned to the pool via DISCARD ALL and is available for other clients.

  2. Pinned: When scry-proxy detects a state-changing operation, it pins the backend connection to that client. What triggers pinning:

    • SET / SET LOCAL — session variables
    • PREPARE — prepared statements
    • CREATE TEMP TABLE — temporary objects
    • DECLARE CURSOR — open cursors
    • pg_advisory_lock() — advisory locks

Not all pinned state is equal. Prepared statements and session variables can be replayed—if a pinned connection is reclaimed after sitting idle (default: 5 minutes), scry-proxy records the PREPARE and SET commands and replays them against a fresh connection from the pool when the client sends its next query. The client never sees a difference.

Temp tables, cursors, and advisory locks can’t be replayed, so those connections stay pinned until the client explicitly drops them or disconnects.

This gives you a practical middle ground: most of your workload gets transaction-mode connection sharing, and the subset that needs session state gets it—automatically, without application changes or mode-switching.
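The classification at the heart of this state machine can be illustrated with simplified keyword matching. The real proxy works at the wire-protocol message level rather than on SQL strings, so treat this as a sketch of the decision, not the mechanism:

```rust
/// Per-client pooling state in hybrid mode.
#[derive(Debug, PartialEq)]
enum SessionState {
    /// Backend returned to the pool after each transaction.
    Stateless,
    /// Backend pinned, but the state (SET / PREPARE) can be replayed
    /// onto a fresh connection if this one is reclaimed.
    PinnedReplayable,
    /// Backend pinned until the client drops the state or disconnects
    /// (temp tables, cursors, advisory locks).
    PinnedSticky,
}

/// Decide whether a statement creates session state, and if so,
/// whether that state is replayable. (Naive keyword matching for
/// illustration only.)
fn classify(sql: &str) -> SessionState {
    let s = sql.trim().to_ascii_uppercase();
    if s.starts_with("SET ") || s.starts_with("PREPARE ") {
        SessionState::PinnedReplayable
    } else if s.starts_with("CREATE TEMP")
        || s.starts_with("DECLARE ")
        || s.contains("PG_ADVISORY_LOCK")
    {
        SessionState::PinnedSticky
    } else {
        SessionState::Stateless
    }
}

fn main() {
    assert_eq!(classify("SELECT * FROM users"), SessionState::Stateless);
    assert_eq!(classify("SET search_path TO app"), SessionState::PinnedReplayable);
    assert_eq!(classify("CREATE TEMP TABLE t (id int)"), SessionState::PinnedSticky);
}
```

The three-way split is the whole trick: only the `PinnedSticky` bucket permanently costs you a pooled connection; everything else stays shareable.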

How Hot Data Detection Works

scry-proxy uses a Count-Min Sketch paired with a Top-K heap to identify your most frequently accessed values. The Count-Min Sketch is a probabilistic frequency estimator—it uses a fixed amount of memory (regardless of how many distinct values it sees) and can overcount but never undercount. The Top-K heap maintains the k most frequent items seen so far.

As queries flow through the proxy, value fingerprints (Blake3 hashes—so no raw data is stored) are fed into the sketch. The result is a continuously updated view of your hottest access patterns: which rows are read most often, which lookup values appear in the most queries, which tables are under the most pressure.
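A Count-Min Sketch is compact enough to show in full. This is a generic textbook implementation, not scry-proxy's (which hashes Blake3 fingerprints; here DefaultHasher with a per-row seed stands in), but it demonstrates the two properties the post relies on: fixed memory and never undercounting.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Count-Min Sketch: `depth` hash rows of `width` counters. Memory is
/// fixed regardless of how many distinct items are seen; estimates can
/// only overcount (hash collisions), never undercount.
struct CountMinSketch {
    rows: Vec<Vec<u64>>,
    width: usize,
}

impl CountMinSketch {
    fn new(depth: usize, width: usize) -> Self {
        Self { rows: vec![vec![0; width]; depth], width }
    }

    fn index(&self, item: &str, row: usize) -> usize {
        let mut h = DefaultHasher::new();
        // Seed each row differently so the hash functions differ.
        row.hash(&mut h);
        item.hash(&mut h);
        (h.finish() as usize) % self.width
    }

    fn add(&mut self, item: &str) {
        for r in 0..self.rows.len() {
            let i = self.index(item, r);
            self.rows[r][i] += 1;
        }
    }

    /// The minimum over the rows is the tightest upper bound on the
    /// item's true count.
    fn estimate(&self, item: &str) -> u64 {
        (0..self.rows.len())
            .map(|r| self.rows[r][self.index(item, r)])
            .min()
            .unwrap_or(0)
    }
}

fn main() {
    let mut sketch = CountMinSketch::new(4, 1024);
    for _ in 0..500 {
        sketch.add("users:42"); // a hot row
    }
    sketch.add("users:7"); // a cold one
    assert!(sketch.estimate("users:42") >= 500); // never undercounts
    assert!(sketch.estimate("users:7") >= 1);
}
```

In the proxy, the Top-K heap sits alongside this: each `add` also updates a small heap of the current highest estimates, so "what's hot right now" is an O(k) read rather than a scan.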

This data is available via a debug endpoint (/debug/hot_data) and published as part of the event stream. Practical uses include identifying candidates for caching, spotting anomalous access patterns (e.g., credential stuffing), and understanding read amplification before a migration.

Event Pipeline

Query events are published asynchronously and never block the query path. The pipeline works as follows:

  1. After a query completes, a lightweight event struct (~100 bytes) is pushed onto a bounded ring buffer (default capacity: 10,000 events).
  2. A background task flushes the buffer in batches (default: 100 events or every 1 second, whichever comes first).
  3. Events are serialized using FlexBuffers (a schema-less binary format from the FlatBuffers family) and optionally gzip-compressed before publishing—typically an 84% size reduction vs. JSON.
  4. If the consumer is slow, the ring buffer overwrites the oldest events rather than applying backpressure to queries. Observability is best-effort; query latency is not.
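The "drop oldest, never block" policy in step 4 can be sketched with a bounded buffer. This uses a VecDeque (itself ring-buffer-backed) purely for illustration; names and the dropped-event counter are assumptions, not scry-proxy's actual types.

```rust
use std::collections::VecDeque;

/// Bounded event buffer that drops the oldest events under pressure:
/// observability is best-effort, query latency is not.
struct EventBuffer<T> {
    queue: VecDeque<T>,
    capacity: usize,
    dropped: u64,
}

impl<T> EventBuffer<T> {
    fn new(capacity: usize) -> Self {
        Self { queue: VecDeque::with_capacity(capacity), capacity, dropped: 0 }
    }

    /// Push never blocks and never fails; when full, the oldest event is
    /// overwritten (and counted, so the loss shows up in metrics).
    fn push(&mut self, event: T) {
        if self.queue.len() == self.capacity {
            self.queue.pop_front();
            self.dropped += 1;
        }
        self.queue.push_back(event);
    }

    /// The background flusher drains up to `batch` events at a time.
    fn drain_batch(&mut self, batch: usize) -> Vec<T> {
        let n = batch.min(self.queue.len());
        self.queue.drain(..n).collect()
    }
}

fn main() {
    let mut buf = EventBuffer::new(10_000);
    for i in 0..10_100u32 {
        buf.push(i); // 100 more events than capacity
    }
    assert_eq!(buf.dropped, 100); // oldest 100 were overwritten
    let batch = buf.drain_batch(100);
    assert_eq!(batch.first(), Some(&100)); // events 0..100 are gone
}
```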

How It Compares

| Feature | scry-proxy | PgBouncer | pgcat | Odyssey |
| --- | --- | --- | --- | --- |
| Connection pooling | ✓ | ✓ | ✓ | ✓ |
| Transaction mode | ✓ | ✓ | ✓ | ✓ |
| Session mode | ✓ | ✓ | ✓ | ✓ |
| Hybrid mode (auto state tracking) | ✓ | — | — | — |
| Per-query metrics | ✓ (fingerprinted, per-query events) | Limited (SHOW STATS) | Limited (SHOW QUERIES) | Limited (per-route stats) |
| Query fingerprinting (Blake3) | ✓ | — | — | — |
| Circuit breaker | ✓ (lock-free, predictive) | — | — | — |
| Hot data detection | ✓ | — | — | — |
| Query anonymization | ✓ | — | — | — |
| Health-based routing | ✓ | — | — | — |
| Prometheus metrics | ✓ (query-level histograms) | — | ✓ (pool-level stats) | — |
| Multi-database sharding | — | — | ✓ | — |
| Read replica load balancing | — | — | ✓ | — |
| Open source | ✓ (Apache 2.0) | ✓ (ISC) | ✓ (MIT) | ✓ (BSD) |

A few notes on this table: PgBouncer’s SHOW STATS provides aggregate timing and byte counts per pool, and pgcat exposes per-query statistics through its admin interface. Where scry-proxy differs is in the granularity—every query is fingerprinted, timed, and published as a structured event with full timeline breakdowns, rather than aggregated at the pool level.

scry-proxy isn’t trying to be everything to everyone. If you need multi-database sharding or read replica load balancing, pgcat is purpose-built for that. But if you want deep, per-query observability—without bolting on a separate monitoring stack—that’s where scry-proxy is focused.

Who Is scry-proxy For?

  • Teams running PostgreSQL in production who’ve been surprised by slow queries they didn’t know existed
  • Platform engineers managing shared databases across multiple services who need centralized visibility
  • Organizations planning database migrations who need to understand their actual query patterns before making changes
  • Companies with compliance requirements (GDPR, HIPAA, SOC 2) who need query logging without exposing sensitive data

Why Rust and Tokio

PgBouncer is written in C, optimized for minimal overhead, and intentionally limited in scope. That design has served the PostgreSQL community well for years.

scry-proxy is written in async Rust using Tokio. The choice was driven by what we needed from a proxy that does more than pooling:

  • Memory efficiency: Async tasks share a thread pool rather than allocating an OS thread per connection, so memory usage stays flat as connection counts grow
  • Safe concurrency: Rust’s ownership model eliminates data races at compile time—no segfaults, no use-after-free under load
  • Async I/O: Tokio’s work-stealing scheduler means throughput scales linearly with CPU cores

Benchmarks

We benchmarked scry-proxy against PgBouncer, pgcat, and direct PostgreSQL connections using a realistic OLTP workload. The benchmark source is in the scry-proxy repo so you can reproduce these results yourself.

Methodology:

  • Custom Rust benchmark runner using tokio-postgres with simple query protocol
  • Simulated e-commerce workload: product browsing (42%), product detail (26%), text search (16%), order history (11%), order details with JOINs (5%)
  • ~1,000 users, ~1,000 products, ~10,000 orders in PostgreSQL 16
  • 100,000 queries per run, all proxies in transaction pooling mode with pool size 250
  • All services in Docker containers on the same network
  • scry-proxy benchmarked with event publishing and anonymization disabled for a fair pooling-only comparison

Results:

| Connections | Proxy | p50 (μs) | p95 (μs) | p99 (μs) | Throughput (qps) |
| --- | --- | --- | --- | --- | --- |
| 1 | direct | 242 | 633 | 1,019 | 3,416 |
| 1 | pgbouncer | 271 | 679 | 1,063 | 3,072 |
| 1 | scry-proxy | 276 | 667 | 1,045 | 3,074 |
| 1 | pgcat | 401 | 1,089 | 1,234 | 2,155 |
| 10 | direct | 442 | 1,155 | 1,496 | 18,834 |
| 10 | pgbouncer | 397 | 1,085 | 1,307 | 21,175 |
| 10 | scry-proxy | 492 | 1,176 | 1,466 | 17,498 |
| 10 | pgcat | 541 | 1,256 | 1,638 | 15,858 |
| 50 | direct | 1,444 | 3,559 | 5,139 | 29,951 |
| 50 | pgbouncer | 1,794 | 3,739 | 5,183 | 25,357 |
| 50 | scry-proxy | 1,809 | 3,913 | 5,483 | 24,699 |
| 50 | pgcat | 1,901 | 4,139 | 5,975 | 23,161 |
| 100 | direct | 2,347 | 6,939 | 12,375 | 33,019 |
| 100 | pgbouncer | 2,741 | 7,499 | 14,127 | 29,205 |
| 100 | scry-proxy | 2,981 | 7,827 | 13,319 | 26,889 |
| 100 | pgcat | 3,177 | 8,263 | 13,887 | 25,850 |
Throughput vs Connection Count — scry-proxy tracks PgBouncer closely, both ahead of pgcat

At 1 connection, scry-proxy and PgBouncer are within measurement noise of each other (~3,074 vs ~3,072 qps). At higher concurrency, scry-proxy runs behind PgBouncer in throughput—about 17% at 10 connections, 3% at 50, and 8% at 100. The gap is widest at moderate concurrency because PgBouncer’s connection reuse is particularly efficient there (it actually beats direct connections at 10 clients: 21,175 vs 18,834 qps). This is the cost of the additional wire protocol inspection that enables hybrid mode and query fingerprinting. Both proxies are consistently faster than pgcat across all connection counts.

p99 Latency vs Connection Count — scry-proxy and PgBouncer track closely, with scry-proxy lower at 100 connections

The p99 latencies tell a similar story: scry-proxy tracks PgBouncer closely. At 100 connections, scry-proxy’s p99 (13,319μs) is lower than PgBouncer’s (14,127μs).

These results are with scry-proxy’s observability features disabled. Enabling event publishing and query anonymization adds overhead; we’ll publish those numbers separately.

If you’re happy with PgBouncer and don’t need query-level visibility, that’s a reasonable choice—it’s battle-tested software with a slight throughput edge. But if you want your proxy to tell you why your database is struggling, the performance cost of scry-proxy’s additional capabilities is modest.

Getting Started

If you’re running PgBouncer today, trying scry-proxy is low-risk. Check out our installation guide and migration guide for the full walkthrough, but the gist is:

  1. Install scry-proxy alongside your existing PgBouncer
  2. Point a test application at scry-proxy instead
  3. Compare behavior and explore the additional metrics
  4. Gradually migrate as confidence builds

We’ve designed the migration path to be reversible at every step. No big bang cutover required.

Frequently Asked Questions

What PostgreSQL versions are supported? scry-proxy supports PostgreSQL 12 and later, including cloud variants like Amazon RDS, Aurora, Supabase, and Neon.

Does it support TLS? Yes. scry-proxy supports TLS for both client connections and backend database connections, including certificate-based authentication.

What authentication methods are supported? Trust, MD5, SCRAM-SHA-256, and certificate authentication are all supported.

What happens if scry-proxy crashes? Your applications will see connection failures and need to reconnect—same as if PgBouncer crashed. For high availability, run multiple scry-proxy instances behind a load balancer. Each instance is stateless (except for pinned connections in hybrid mode), so failover is straightforward.

Can I run multiple instances? Yes. scry-proxy instances are independent—no coordination required. Put them behind a TCP load balancer for horizontal scaling and failover.

How does hybrid mode handle prepared statements? When a client creates a prepared statement, scry-proxy pins that connection and tracks the statement. If the connection needs to be released (idle timeout), scry-proxy can replay the PREPARE commands to a fresh connection from the pool, so your application doesn’t notice.

For more details, see the full documentation.


Try It Now

scry-proxy is fully open source under the Apache 2.0 license. No signup, no license keys, no strings attached.

# Build from source
cargo install scry-proxy

# Or with Docker
docker run -p 6432:6432 ghcr.io/scrydata/scry-proxy

Full installation guide →

Check out the quick start guide to have scry-proxy running in front of your PostgreSQL instance in under 5 minutes. You’ll see query metrics flowing immediately.

Star us on GitHub: github.com/scrydata/scry-proxy — we’d love your feedback, issues, and contributions.


Beyond the proxy: scry-proxy is fully open source and standalone—use it as-is, fork it, build on it. If you also need to replay production traffic against a shadow database before a migration, compare query results across database versions, or visualize query patterns over time, those capabilities are available through the ScryData platform. The proxy doesn’t require it.