Bootstrap database file instead of pulling every WAL frame#2017

Merged
penberg merged 2 commits into tursodatabase:main from avinassh:export on Apr 8, 2025

@avinassh avinassh commented Apr 7, 2025

Currently, the sync code pulls every generation and every frame one by one. This is very inefficient: for databases with many generations, the initial sync can take hours.

This patch adds an optimisation: it pulls the latest generation by calling the `export` endpoint and bootstraps the db file from it quickly.

My testing shows huge improvements. For a db with 30 generations and about 100 frames in each generation, the initial sync dropped from 1 hour to 1 minute.

Fixes #1991
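
To illustrate the idea, here is a minimal sketch of bootstrapping from a single exported image instead of replaying frames. It is not the code from this patch: the `Remote` trait, `export_latest`, `FakeRemote`, and `bootstrap_db` are hypothetical names, and the assumption that the export call returns the generation number together with the full database image is mine. Writing to a temporary file and renaming it is also an illustrative choice, not something the PR states.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for the remote sync server (not the real client API).
trait Remote {
    /// Assumed behaviour: returns the latest generation number and a full
    /// database image for that generation in one call.
    fn export_latest(&self) -> io::Result<(u32, Vec<u8>)>;
}

/// In-memory fake used only to exercise the sketch.
struct FakeRemote {
    generation: u32,
    image: Vec<u8>,
}

impl Remote for FakeRemote {
    fn export_latest(&self) -> io::Result<(u32, Vec<u8>)> {
        Ok((self.generation, self.image.clone()))
    }
}

/// Writes the exported image to `db_path` and returns the generation it belongs to.
fn bootstrap_db(remote: &dyn Remote, db_path: &Path) -> io::Result<u32> {
    let (generation, image) = remote.export_latest()?;
    // Write to a temporary sibling first, then rename, so a failure while
    // writing does not leave a half-written file at `db_path`.
    let tmp = db_path.with_extension("tmp");
    fs::write(&tmp, &image)?;
    fs::rename(&tmp, db_path)?;
    Ok(generation)
}
```

With this shape, the cost of the initial sync depends on the size of the latest database image rather than on how many generations and frames preceded it.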

When syncctx is initialised, we init the generation to 0 and
let sync_db update it to the new value.
@penberg penberg changed the title from "Performance: optimise offline sync" to "Bootstrap database file instead of pulling every WAL frame" Apr 8, 2025
@penberg penberg added this pull request to the merge queue Apr 8, 2025
Merged via the queue into tursodatabase:main with commit 9a65514 Apr 8, 2025
19 checks passed
@avinassh avinassh deleted the export branch April 8, 2025 15:43
github-merge-queue Bot pushed a commit that referenced this pull request Apr 10, 2025
This patch is a follow-up to #2017. It changes the heuristics for
bootstrapping the DB.

One more change: the bootstrap now also happens before any connection to
the local SQLite database is created.

Earlier, we bootstrapped whenever we noticed a change in the generation.
This patch changes that behaviour and bootstraps only if the local files
don't exist:

1. If neither the db file nor the metadata file exists, the user is
   starting from scratch and we do the sync.
2. If the db file exists but the metadata file does not (or the other way
   around), the local db is in an incorrect state. We stop and return an
   error.
3. If both the db file and the metadata file exist, we don't need to do
   the sync.
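
The three cases above can be sketched as a small decision function. This is an illustration only, not the code from the follow-up commit: `BootstrapDecision`, `InconsistentLocalState`, and `decide_bootstrap` are hypothetical names, and the two booleans stand in for whatever file-existence checks the real code performs.

```rust
/// What to do before opening the local database (hypothetical type).
#[derive(Debug, PartialEq)]
enum BootstrapDecision {
    /// Neither file exists: start from scratch and bootstrap.
    Bootstrap,
    /// Both files exist: nothing to bootstrap.
    Skip,
}

/// Error for the case where only one of the two files exists (hypothetical type).
#[derive(Debug, PartialEq)]
struct InconsistentLocalState;

/// Maps the presence of the db file and the metadata file to a decision.
fn decide_bootstrap(
    db_exists: bool,
    metadata_exists: bool,
) -> Result<BootstrapDecision, InconsistentLocalState> {
    match (db_exists, metadata_exists) {
        (false, false) => Ok(BootstrapDecision::Bootstrap),
        (true, true) => Ok(BootstrapDecision::Skip),
        // Exactly one file is present: the local state cannot be trusted.
        _ => Err(InconsistentLocalState),
    }
}
```

A caller could feed it `Path::exists()` results for the two files and run this check before creating any local SQLite connection, which matches the ordering described above.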
Development

Successfully merging this pull request may close these issues.

Bootstrap from empty database
