Is it safe to set statement_timeout to 0 for an index build?

Yes, but only in an isolated maintenance session dedicated to the build. Disabling the wall-clock cancel lets a long build finish, but application sessions should keep a bounded statement_timeout. The durable production answer is to pair a bounded ceiling with a non-blocking CREATE INDEX CONCURRENTLY build so no live client ever waits on the build.

My build times out but PostgreSQL logs show no cancel — why?

A connection pooler or client socket timeout is cancelling it, not PostgreSQL. If a backend disappears at a fixed interval and pg_stat_activity shows no statement_timeout cancel, raise the pooler's query_timeout and server_idle_timeout, or run the build on a direct connection that bypasses PgBouncer or RDS Proxy.

How high should maintenance_work_mem go to stop timeouts?

High enough to hold the graph or centroid working set without spilling to disk, but low enough that maintenance_work_mem times (1 + max_parallel_maintenance_workers) stays under free RAM. Setting it blindly to 50% of RAM with several parallel workers can trigger an out-of-memory kill that looks like a different failure.

Does adding parallel workers speed up an IVFFlat build?

Not the centroid phase. IVFFlat's k-means centroid initialization is single-threaded, so max_parallel_maintenance_workers does not accelerate it. Parallel workers do help HNSW layer construction. Because extra workers cannot shorten the centroid pass, large IVFFlat rebuilds often move to a shadow-table pattern where a blocking build runs uncontended instead.

My timed-out build left an index behind — is it usable?

Probably not. A failed CREATE INDEX CONCURRENTLY leaves an index with indisvalid = false that silently forces sequential scans. Check pg_index for indisvalid and indisready; if invalid, DROP INDEX CONCURRENTLY and rerun the build, because a concurrent build never cleans up after itself.

Resolving pgvector Index Build Timeout Errors

A CREATE INDEX on a large embedding column frequently dies with ERROR: canceling statement due to statement_timeout or a dropped connection long before the graph finishes materializing. This page gives a step-by-step recovery procedure: triage the true cause from pg_stat_activity and pg_stat_progress_create_index, recalibrate the memory and timeout settings that actually govern build duration, and move the build off the synchronous path so it can no longer time the client out.

Up: Asynchronous Index Build Strategies

A build timeout is rarely a single failure — it is a symptom that can trace back to a hard statement_timeout, a proxy idle cutoff, memory-starved external sorting, or lock contention with concurrent DML. Fixing it blindly by disabling every timeout hides the real bottleneck and invites an out-of-memory kill instead. The procedure below isolates which of those is firing, then applies the matching fix, and finally shifts the build to the non-blocking pattern documented in the parent asynchronous index build strategies guide so production traffic never depends on the build completing inside a session window.

Prerequisites

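pgvector 0.5+ for HNSW (0.6+ for parallel HNSW builds, 0.7+ if you build on halfvec to halve the working set). Check with SELECT extversion FROM pg_extension WHERE extname = 'vector';
PostgreSQL 15+ recommended. pg_stat_progress_create_index phase reporting has been available since PostgreSQL 12 and max_parallel_maintenance_workers since PostgreSQL 11, so older releases can follow the same procedure.
The table-owner role (or a superuser) to run CREATE INDEX. A session-level SET maintenance_work_mem needs no special privilege; ALTER SYSTEM requires superuser unless the privilege has been granted explicitly (possible from PostgreSQL 15).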
Headroom on RAM: at least maintenance_work_mem × (1 + max_parallel_maintenance_workers) free, plus the OS page cache, before you raise the ceiling.
Access to any pooler in the path (PgBouncer, RDS Proxy) — its server_idle_timeout and query_timeout can cancel a build the database itself would have finished.
A decided algorithm and parameters before you start; build cost is driven by the choice covered in HNSW vs IVFFlat algorithm selection and the knobs in optimizing m and ef_construction parameters.
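
The version and role prerequisites can be checked in one pass before a long build. The following is a minimal sketch, assuming the document_chunks table used later on this page; substitute your own table name.

```sql
-- Installed pgvector version (HNSW needs 0.5+, halfvec needs 0.7+).
SELECT extversion FROM pg_extension WHERE extname = 'vector';

-- Server version, to confirm the progress view is available.
SHOW server_version;

-- Is the current role a superuser, and can it act as the table owner?
-- document_chunks is the example table from this page.
SELECT r.rolsuper AS is_superuser,
       pg_get_userbyid(c.relowner) AS table_owner,
       pg_has_role(current_user, c.relowner, 'USAGE') AS can_act_as_owner
FROM pg_roles AS r
CROSS JOIN pg_class AS c
WHERE r.rolname = current_user
  AND c.oid = 'document_chunks'::regclass;
```

If both is_superuser and can_act_as_owner come back false, CREATE INDEX will fail on permissions regardless of any timeout setting.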

Step-by-step procedure

1. Triage the true cause before changing anything

Distinguish a client-side disconnect, a server-side statement limit, and lock contention, because each needs a different fix. While the build is stalled, inspect the backend from a second session:

SQL

SELECT pid, state, wait_event_type, wait_event,
       now() - query_start AS elapsed,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE query ILIKE 'CREATE INDEX%';

An active state with wait_event_type of IO or LWLock means disk saturation or checkpoint pressure, not a hard timeout — the fix is memory and I/O (step 3), not raising the timeout. idle in transaction points at a pooler or an uncommitted ORM session holding the transaction open. A backend that vanishes entirely at a fixed interval is a proxy or client socket timeout, not PostgreSQL at all.
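
The reading rules above can be folded into the query itself so the likely cause appears next to each backend. This is only a sketch: the likely_cause labels are illustrative text invented here, and the mapping is a first guess to be confirmed in step 2, not a diagnosis.

```sql
-- Annotate each CREATE INDEX backend with the most likely stall cause.
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       now() - query_start AS elapsed,
       CASE
         WHEN state = 'idle in transaction'
           THEN 'open transaction: check pooler or ORM session'
         WHEN wait_event_type = 'Lock'
           THEN 'lock contention: find the blocking session'
         WHEN wait_event_type IN ('IO', 'LWLock')
           THEN 'I/O or checkpoint pressure: see step 3'
         WHEN state = 'active' AND wait_event_type IS NULL
           THEN 'running on CPU: not stalled'
         ELSE 'inspect manually'
       END AS likely_cause
FROM pg_stat_activity
WHERE query ILIKE 'CREATE INDEX%';
```

An empty result while the client still believes the build is running matches the vanished-backend case: the connection was cut outside PostgreSQL.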

2. Read the build phase to locate the bottleneck

pg_stat_progress_create_index tells you exactly where the time is going. Poll it every few seconds:

SQL

SELECT phase, blocks_done, blocks_total,
       tuples_done, tuples_total,
       round(100.0 * blocks_done / NULLIF(blocks_total, 0), 1) AS pct
FROM pg_stat_progress_create_index;

When phase sits at building index: loading tuples (HNSW) or sorting tuples and tuples_done crawls, the build is memory-bound and spilling to disk. When blocks_done moves steadily but slowly, it is I/O-bound. A concurrent build stuck in one of the waiting phases (for example waiting for writers before build or waiting for old snapshots) confirms contention from step 1. Enable SET log_min_duration_statement = 0 in the build session (a superuser-only setting by default) to capture the exact cancellation timestamp for cross-referencing.
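
For a concurrent build, the progress view also reports how many lock holders the build is waiting on and which one it is waiting for right now. A sketch that joins the view to pg_stat_activity to add elapsed time; in psql, ending the statement with \watch 5 instead of a semicolon re-runs it every five seconds.

```sql
-- Build progress plus the session the build is currently waiting on.
SELECT p.pid,
       p.relid::regclass      AS table_name,
       p.command,
       p.phase,
       p.lockers_done,
       p.lockers_total,
       p.current_locker_pid,  -- 0 when the build is not waiting on anyone
       p.tuples_done,
       now() - a.query_start  AS elapsed
FROM pg_stat_progress_create_index AS p
JOIN pg_stat_activity AS a USING (pid);
```

A current_locker_pid that stays the same across several polls points to a single long transaction holding up the build.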

3. Recalibrate memory and parallelism

Default maintenance_work_mem (64 MB) cannot hold an HNSW graph working set or an IVFFlat centroid table, so the build spills and slows until it trips a timeout. Raise it for the build session only:

SQL

SET maintenance_work_mem = '8GB';        -- size to the graph working set, not blindly
SET max_parallel_maintenance_workers = 4; -- HNSW scales here; IVFFlat centroid pass does not

Keep the ceiling below what the OOM killer will tolerate: total build memory is roughly maintenance_work_mem × (1 + workers). Size it against the index footprint estimated in pgvector storage overhead analysis rather than reflexively setting 50% of RAM. HNSW distributes layer construction across the parallel workers; IVFFlat’s k-means centroid pass is single-threaded, so extra workers do nothing for the lists-tuning path in tuning IVFFlat lists for high-throughput similarity search.
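
The budget arithmetic can be run against the session's current settings before starting the build. The sketch below treats the maintenance_work_mem × (1 + workers) rule of thumb as a worst-case bound and compares it against a rough estimate of the raw vector payload. It assumes the document_chunks table and embedding column used elsewhere on this page.

```sql
-- Worst-case build memory under the rule of thumb above:
-- maintenance_work_mem x (1 + max_parallel_maintenance_workers).
SELECT current_setting('maintenance_work_mem') AS maintenance_work_mem,
       current_setting('max_parallel_maintenance_workers')::int AS workers,
       pg_size_pretty(
         pg_size_bytes(current_setting('maintenance_work_mem'))
         * (1 + current_setting('max_parallel_maintenance_workers')::int)
       ) AS worst_case_build_memory;

-- Rough lower bound on the vector payload the index must hold.
-- pgvector stores a vector in 4 * dimensions + 8 bytes; HNSW neighbor
-- links come on top of this. reltuples is a planner estimate and is
-- only meaningful after ANALYZE (it can be -1 on a never-analyzed table).
SELECT c.reltuples::bigint AS approx_rows,
       d.dims,
       pg_size_pretty(
         (c.reltuples::numeric * (4 * d.dims + 8))::bigint
       ) AS raw_vector_size
FROM pg_class AS c
CROSS JOIN (
  SELECT vector_dims(embedding) AS dims
  FROM document_chunks
  WHERE embedding IS NOT NULL
  LIMIT 1
) AS d
WHERE c.oid = 'document_chunks'::regclass;
```

If raw_vector_size already exceeds maintenance_work_mem, the build will spill regardless of worker count, and the choice is between more memory, a smaller representation such as halfvec, or accepting a slower build.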

4. Set explicit timeouts for the build session

Disabling statement_timeout is correct inside an isolated maintenance session, but production still needs a bounded ceiling paired with an async build (step 5). In the dedicated session:

SQL

SET statement_timeout = 0;                       -- no wall-clock cancel during the build
SET lock_timeout = '120s';                        -- fail fast if a blocker won't clear
SET idle_in_transaction_session_timeout = '300s'; -- don't leak a half-open build txn

If a pooler sits in the path, raise its query_timeout/server_idle_timeout too, or run the build on a direct connection that bypasses the pooler — otherwise the proxy cancels the statement the database would have completed.
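
One way to keep the two policies apart is to attach them to roles instead of retyping SET in every session. This is a sketch under assumptions: index_maintenance and app_rw are hypothetical role names, and the 30s application ceiling is a placeholder value, not a recommendation from this guide. Role-level settings apply to sessions opened after the change.

```sql
-- Hypothetical maintenance role: no wall-clock cancel, but fail fast on locks.
ALTER ROLE index_maintenance SET statement_timeout = 0;
ALTER ROLE index_maintenance SET lock_timeout = '120s';
ALTER ROLE index_maintenance SET idle_in_transaction_session_timeout = '300s';

-- Hypothetical application role: keeps a bounded ceiling (placeholder value).
ALTER ROLE app_rw SET statement_timeout = '30s';

-- From a fresh session, confirm what is in effect and where it came from.
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('statement_timeout',
               'lock_timeout',
               'idle_in_transaction_session_timeout');
```

The source column distinguishes a value inherited from the role or configuration file from one set ad hoc in the session, which helps when a build is cancelled despite an apparent statement_timeout = 0.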

5. Move the build off the synchronous path

The durable fix is to stop making a live client wait on the build at all. Use CREATE INDEX CONCURRENTLY, which takes a SHARE UPDATE EXCLUSIVE lock instead of the write-blocking SHARE lock of a plain CREATE INDEX, so reads and writes continue during the build. Run it from a standalone autocommit connection so no migration transaction wraps it:

SQL

CREATE INDEX CONCURRENTLY idx_chunks_embedding_hnsw
  ON document_chunks USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 128);

Note that m and ef_construction are build-time storage parameters and cannot be altered in place — changing them means DROP INDEX + CREATE INDEX, so size them deliberately before a large build. For very large tables where even a concurrent build’s catch-up phase is too costly, stage the build on a shadow table or replica and swap it in, as detailed in the parent asynchronous index build strategies guide. In Python orchestration, wrap the build with asyncpg/psycopg retry logic and an off-peak scheduler rather than a synchronous blocking call.
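
When m or ef_construction must change on a table that is already serving queries, the drop-and-create sequence can be reordered so a valid index exists at every moment. The sketch below assumes enough disk for two copies of the index; the _new suffix and the parameter values are illustrative. Each statement runs on its own outside a transaction block.

```sql
-- 1. Build the replacement under a temporary name (illustrative parameters).
CREATE INDEX CONCURRENTLY idx_chunks_embedding_hnsw_new
  ON document_chunks USING hnsw (embedding vector_cosine_ops)
  WITH (m = 24, ef_construction = 200);

-- 2. Continue only if this returns true.
SELECT indisvalid
FROM pg_index
WHERE indexrelid = 'idx_chunks_embedding_hnsw_new'::regclass;

-- 3. Retire the old index and take over its name.
DROP INDEX CONCURRENTLY idx_chunks_embedding_hnsw;
ALTER INDEX idx_chunks_embedding_hnsw_new
  RENAME TO idx_chunks_embedding_hnsw;
```

If step 1 times out, the old index is untouched and only the invalid _new index needs to be dropped before retrying.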

Parameter reference

Name	Type	Default	Production recommendation	Notes
`maintenance_work_mem`	memory	`64MB`	`2GB`–`16GB`	Sized to the graph/centroid working set; too low forces disk spills that stall the build.
`max_parallel_maintenance_workers`	int	`2`	`≤ physical cores`	Speeds HNSW layer construction; no effect on the IVFFlat centroid pass.
`statement_timeout`	ms	`0`	`0` in the build session; bounded elsewhere	Set to `0` only in an isolated maintenance session; keep a ceiling for app traffic.
`lock_timeout`	ms	`0`	`60s`–`120s`	Fail fast on contention instead of blocking indefinitely behind DDL or `VACUUM FULL`.
`idle_in_transaction_session_timeout`	ms	`0`	`120s`–`300s`	Prevents a half-open build transaction from pinning MVCC bloat.
`max_wal_size`	memory	`1GB`	`8GB`–`32GB`	Fewer checkpoints during a large build; low values cause the `IO` waits seen in step 1.
`m` (HNSW)	int	`16`	`16`–`32`	Build-time only; raising it grows build time and cannot be changed without a rebuild.
`ef_construction` (HNSW)	int	`64`	`128`–`256`	Dominant build-cost knob; excessive values trigger the timeout directly.

Verification

After the build, confirm the index is actually valid and in service — a timed-out build can leave an invalid index behind that silently forces sequential scans:

SQL

SELECT c.relname AS index_name,
       i.indisvalid, i.indisready
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE c.relname = 'idx_chunks_embedding_hnsw';
-- expect: indisvalid = t, indisready = t

Then refresh planner statistics and confirm the planner actually uses the index rather than falling back to a scan:

SQL

ANALYZE document_chunks;
EXPLAIN (ANALYZE, BUFFERS)
SELECT id FROM document_chunks
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'::vector
LIMIT 10;
-- expect an "Index Scan using idx_chunks_embedding_hnsw", not "Seq Scan"

If indisvalid is f, drop the failed index (DROP INDEX CONCURRENTLY idx_chunks_embedding_hnsw;) and rerun the build — a CONCURRENTLY failure never cleans up after itself.
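
Repeated timeouts can leave more than one invalid index behind, sometimes under names nobody remembers. A sketch that lists every invalid or not-ready index in the current database, without assuming any index name:

```sql
-- All indexes left invalid or not ready, with their tables and sizes.
SELECT n.nspname                                   AS schema_name,
       c.relname                                   AS index_name,
       i.indrelid::regclass                        AS table_name,
       i.indisvalid,
       i.indisready,
       pg_size_pretty(pg_relation_size(c.oid))     AS index_size
FROM pg_index AS i
JOIN pg_class AS c     ON c.oid = i.indexrelid
JOIN pg_namespace AS n ON n.oid = c.relnamespace
WHERE NOT i.indisvalid
   OR NOT i.indisready
ORDER BY pg_relation_size(c.oid) DESC;
```

An invalid index is never used by queries, and one that is still marked ready continues to be maintained on every write, so dropping leftovers frees disk and can reduce write overhead.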

Troubleshooting

ERROR: canceling statement due to statement_timeout. The wall-clock ceiling fired mid-build. Confirm with the log timestamp from step 2, then SET statement_timeout = 0 in the isolated build session and move to CREATE INDEX CONCURRENTLY; if the cancel recurs at a fixed interval, the pooler’s query_timeout is the real cutoff.
ERROR: out of memory or could not extend file. maintenance_work_mem × (1 + workers) exceeded free RAM, or the sort spilled a full disk. Lower maintenance_work_mem or worker count, provision faster NVMe, or partition the table to shrink single-build pressure. Classification of these post-failure states is covered in index validation error categorization.
Build stuck in waiting / ERROR: deadlock detected. A long-running query or uncommitted ORM transaction holds a conflicting lock. SELECT pid, query FROM pg_stat_activity WHERE wait_event_type = 'Lock'; lists the sessions that are waiting, including the build; pass the build's pid to pg_blocking_pids(pid) to get the sessions actually holding the lock. Terminate the blocker with pg_terminate_backend(pid) and rerun during an off-peak window.
Build “finished” but queries are slow. The index is likely invalid from a prior timeout, or the planner has stale stats. Check indisvalid (verification block) and run ANALYZE; a Seq Scan in EXPLAIN with a healthy index usually means missing statistics or an operator-class mismatch.
CREATE INDEX CONCURRENTLY cannot run inside a transaction block. A migration framework wrapped the build in BEGIN/COMMIT. Run it as a standalone autocommit statement, or enable the framework’s non-transactional/disable-DDL-transaction flag for that step.
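
Filtering pg_stat_activity on wait_event_type = 'Lock' shows the sessions that wait, not the ones they wait for. A sketch that resolves each waiting CREATE INDEX backend to the sessions blocking it, using pg_blocking_pids:

```sql
-- For every waiting CREATE INDEX backend, list the sessions blocking it.
SELECT waiting.pid                    AS waiting_pid,
       left(waiting.query, 60)        AS waiting_query,
       blocker.pid                    AS blocking_pid,
       blocker.state                  AS blocking_state,
       now() - blocker.xact_start     AS blocking_xact_age,
       left(blocker.query, 60)        AS blocking_query
FROM pg_stat_activity AS waiting
CROSS JOIN LATERAL unnest(pg_blocking_pids(waiting.pid)) AS b(pid)
JOIN pg_stat_activity AS blocker ON blocker.pid = b.pid
WHERE waiting.query ILIKE 'CREATE INDEX%';

-- Try a cancel first; terminate only if the session ignores it.
-- SELECT pg_cancel_backend(<blocking_pid>);
-- SELECT pg_terminate_backend(<blocking_pid>);
```

A blocker in state idle in transaction with a large blocking_xact_age is the uncommitted-session case described above; cancelling has no effect on an idle session, so that case needs pg_terminate_backend.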

Related

Index validation error categorization: classify the invalid and degraded states a timed-out build leaves behind
Tuning IVFFlat lists for high-throughput similarity search: size the lists parameter that drives the single-threaded centroid pass
Step-by-step HNSW index creation for production workloads: the full build procedure this recovery guide restarts
pgvector storage overhead analysis: size maintenance_work_mem against the real graph working set