Avoiding SQLite Database Locks in Production

This is a Rails stack, using Solid Queue, and SQLite
My SQLite configuration
SQLite misconfiguration in deploy.yml
Recurring Jobs
Avoid Batching / Find Each in Jobs
- Opencode skill to fix this
Post your SQLite tips

While working on Wenmar Pro, a POS for auto shops, I’ve encountered quite a few SQLite3::BusyException exceptions and it’s taken me months to figure out the cause. This happens because sqlite is single writer, when one process holds the lock, any other process must wait for it to be released, but often it’s not in time and thus throws this exception. The cause can be hard to track down, so I’ll share what I have learned so far.

It comes down to:

Better error tracking (I personally use AppSignal for it’s good Rails integration)
SQLite configuration
Background job recurring jobs
Optimize SQLite batch jobs

This is a Rails stack, using Solid Queue, and SQLite

For some context, I started this app 3 or so months ago and I felt I had nothing to lose by trying a new db. I’ve heard many good things about SQLite and it’s advancements in performance, esp for Rails. So I decided to give it a shot. Otherwise I would use Postgresql. A big factor was not having to manage an other process and because this software has a low amount of users, I felt this could be a good fit. As well as, worst case a migration from sqlite to postgres is fairly simple. So I took a chance.

So far it’s been great but it’s become clear you need to spend a little more time optimizing for sqlite. It’s single writer locking needs to be taken into account, and a lot of AI agents do not. I should note, I am keeping my eyes on Turso, it’s compatible with sqlite but offers multiple writers, which is awesome but it has no rails driver atm.

My SQLite configuration

I’m using rails and I’ll just share my configuration. One important thing is, the busy_timeout and timeout values should be the same. If you’re getting a lot of random lock exceptions trying making them the same value.

Other notes:

wal should be on.
the other custom values were suggested additions, cannot confirm if they help.

For testing:

busy_timeout is higher to help with rails test parallel testing
other values are for quicker db writing, which does help

default_pragmas: &default_pragmas
  journal_mode: wal
  synchronous: normal
  temp_store: memory
  mmap_size: 134217728
  cache_size: -20000
  busy_timeout: 5000
  wal_autocheckpoint: 10000

default: &default
  adapter: sqlite3
  max_connections: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  timeout: 5000
  pragmas:
    <<: *default_pragmas

development:
  primary:
    <<: *default
    database: storage/development.sqlite3
  queue:
    <<: *default
    database: storage/development_queue.sqlite3
    migrations_paths: db/queue_migrate
  cable:
    <<: *default
    database: storage/development_cable.sqlite3
    migrations_paths: db/cable_migrate

# Warning: The database defined as "test" will be erased and
# re-generated from your development database when you run "rake".
# Do not set this db to the same as development or production.
test:
  <<: *default
  database: storage/test<%= ENV['TEST_ENV_NUMBER'] || ENV['TEST_WORKER_ID'] || '' %>.sqlite3
  timeout: 20000
  pragmas:
    <<: *default_pragmas
    mmap_size: 268435456
    synchronous: "OFF"
    busy_timeout: 20000
    wal_autocheckpoint: 0

# Store production database in the storage/ directory, which by default
# is mounted as a persistent Docker volume in config/deploy.yml.
production:
  primary:
    <<: *default
    database: storage/production.sqlite3
  cache:
    <<: *default
    database: storage/production_cache.sqlite3
    migrations_paths: db/cache_migrate
  queue:
    <<: *default
    database: storage/production_queue.sqlite3
    migrations_paths: db/queue_migrate
  cable:
    <<: *default
    database: storage/production_cable.sqlite3
    migrations_paths: db/cable_migrate

SQLite misconfiguration in deploy.yml

I’m using Kamal for deployments, which is very simple and I recommend it. But recently I found out one reason why db locks kept happening in my background jobs. I accidently added an env var for SOLID_QUEUE_IN_PUMA. This caused solid queue to run twice, once in my web container and once in my job container.

deploy.yml:

env:
  secret:
    - RAILS_MASTER_KEY
    - LITESTREAM_ACCESS_KEY_ID
    - LITESTREAM_SECRET_ACCESS_KEY
  clear:
    RAILS_ENV: production
    RAILS_LOG_LEVEL: info
    SOLID_QUEUE_IN_PUMA: false # this actually activates it

After months I figured out what happened.

puma.rb:

# Run the Solid Queue supervisor inside of Puma for single-server deployments.
plugin :solid_queue if ENV["SOLID_QUEUE_IN_PUMA"]

See the issue? Any value in SOLID_QUEUE_IN_PUMA would then activate this plugin. I can’t blame myself, this was done at initial project setup and I just never noticed.

Recurring Jobs

I’m using solid queue and it’s built in recurring.yml, I make sure all recurring jobs will never run at the same time. This is a great method for finding which job caused the lock. So if a new lock exception gets logged, the first thing I check is the time it was logged at. From there I can track down which job caused it. Otherwise I would have no idea and would be guessing. This app has over 30 jobs, so it’s no easy debugging feat.

Avoid Batching / Find Each in Jobs

The next cause of locks was the way my AI agent wrote these background jobs. It would do a .find_each and then inside the look send emails etc. This would block other jobs / web from running because the find_each call holds the write lock until it’s released. So the fix is to pluck the ids of wanted records, and then run any long process.

Example:

# BAD — read cursor stays open while writing
Vehicle.where(stale: true).find_each { |v| v.update!(checked_at: Time.current) }

# GOOD — cursor closes instantly, writes happen in short transactions
Vehicle.where(stale: true).pluck(:id).each_slice(50) do |batch_ids|
  Vehicle.where(id: batch_ids).each { |v| v.update!(checked_at: Time.current) }
end

Opencode skill to fix this

I wrote an opencode skill to make sure all future ai agents know to avoid this. So I wrote a skill (with the help of an llm). Feel free to use it.

---
name: sqlite-lock-prevention
description: >
  Use when writing loops, batch updates, or ActiveJob code that iterates over
  a collection and writes to the same table. Prevents SQLite3::BusyException
  from holding a read cursor open during writes.
---

# SQLite Lock Prevention

## 1. Don't write while a read cursor is open
SQLite can't safely write to a table while a read cursor (`find_each`, `each`)
is still open against it on the same connection.

**Fix:** pluck IDs to close the cursor immediately, then write in small,
isolated batches.

```ruby
# BAD — read cursor stays open while writing
Vehicle.where(stale: true).find_each { |v| v.update!(checked_at: Time.current) }

# GOOD — cursor closes instantly, writes happen in short transactions
Vehicle.where(stale: true).pluck(:id).each_slice(50) do |batch_ids|
  Vehicle.where(id: batch_ids).each { |v| v.update!(checked_at: Time.current) }
end
```

## 2. Batch writes in one transaction
Wrap multi-row writes in a single transaction instead of one commit per row —
shortens the total time the write lock is held.

```ruby
Model.transaction { records.each(&:save!) }
```

Post your SQLite tips

Please share your tips for sqlite if you have any.

Bendangelo