PostgreSQL Batch Delete with Progress Tracking
PostgreSQL Batch Delete with Progress Tracking
When deleting large amounts of data from PostgreSQL tables, batch processing prevents lock contention and provides visibility into the operation’s progress. This approach is essential for production environments.
The Batch Delete Pattern
DO $$
DECLARE
batch_size INTEGER := 10;
total_processed INTEGER := 0;
current_batch INTEGER;
BEGIN
WHILE TRUE LOOP
-- Delete batch
WITH batch_to_delete AS (
SELECT id
FROM jobs
WHERE 1 = 1 -- Add your conditions here
ORDER BY id
LIMIT batch_size
)
DELETE FROM jobs
WHERE id IN (SELECT id FROM batch_to_delete);
GET DIAGNOSTICS current_batch = ROW_COUNT;
total_processed := total_processed + current_batch;
-- Exit if no rows were deleted
EXIT WHEN current_batch = 0;
-- Optional: Log progress
RAISE NOTICE 'Processed % rows so far', total_processed;
END LOOP;
RAISE NOTICE 'Total rows deleted: %', total_processed;
END $$;
Why Batch Deletes Matter
Performance Benefits:
- Prevents long-running transactions that block other operations
- Reduces transaction log bloat
- Allows VACUUM and other maintenance operations between batches
- Provides checkpoints for recovery if something goes wrong
Operational Benefits:
- Progress visibility for long-running operations
- Ability to pause/resume by adjusting conditions
- Reduced impact on concurrent applications
- Better resource utilization