MDEV-39092 Copy Aria data and logs as part of backup by mariadb-andrzejjarzabek · Pull Request #4971 · MariaDB/server

mariadb-andrzejjarzabek · 2026-04-22T08:28:29Z

An interim solution with some room for optimization:

All DDL is blocked while Aria is being backed up
All table caches purged when Aria backup starts (including for non-Aria tables)
Writes to non-transactional tables (but not to transactional tables) are blocked while table files are being backed up
All commits blocked when Aria log files are being backed up

CLAassistant · 2026-04-22T08:28:38Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

This introduces a basic driver Sql_cmd_backup, storage engine interfaces, and basic copying of InnoDB data files. On Windows, we pass a target directory name; elsewhere, we pass a target directory handle.

backup_target: A structured data type to represent a directory or a stream. On Microsoft Windows, we must use directory paths because there is no variant of CopyFileEx() that would work on file handles.

copy_entire_file(): A file copying service for POSIX systems.

copy_file(): A sparse file-copying service for all systems.

backup_context: An InnoDB backup context, attached to trx->lock.backup so that context can exist between InnoDB_backup::end(), which is releasing all locks, and InnoDB_backup::fini() in the same thread, which is expected to finalize the backup without modifying files in the server data directory.

fil_space_t::write_or_backup: Keep track of in-flight page writes and pending backup operation. We must not allow them concurrently, because that could lead to torn pages in the backup.

fil_space_t::backup_end: The first page number that is not being backed up (by default 0, to indicate that no backup is in progress).

fil_space_t::BACKUP_BATCH_SIZE: The number of preceding pages that will be covered by fil_space_t::backup_end. This is the unit of "page range locking" during InnoDB backup.

TRX_STATE_BACKUP: A special InnoDB transaction state indicating association with BACKUP SERVER, which allows us to pass some context in trx_t from innodb_backup_end() to innodb_backup_finalize().

log_t::backup: Whether BACKUP SERVER is in progress. The purpose of this is to make BACKUP SERVER prevent the concurrent execution of SET GLOBAL innodb_log_archive=OFF or SET GLOBAL innodb_log_file_size when innodb_log_archive=OFF.

log_sys.archived_checkpoint: Keep track of the earliest available checkpoint, corresponding to log_sys.archived_lsn. This reflects SET GLOBAL innodb_log_recovery_start (which is settable now), for incremental backup.

buf_flush_list_space(): Check for concurrent backup before writing each page. This is inefficient, but this function may be invoked from multiple threads concurrently, and it cannot be changed easily, especially for fil_crypt_thread().
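
To illustrate the "page range locking" described for fil_space_t::backup_end and fil_space_t::BACKUP_BATCH_SIZE, here is a minimal sketch. The names space_sketch, begin_batch(), finish() and is_being_backed_up(), as well as the batch size of 64, are assumptions made for illustration only. The actual fil_space_t members are not shown in this conversation, and the in-flight write accounting of write_or_backup is left out entirely.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the backup-related part of fil_space_t.
// This is NOT the MariaDB definition; it only models the description above.
struct space_sketch
{
  // Assumed value; the real BACKUP_BATCH_SIZE is not stated in this PR.
  static constexpr uint32_t BACKUP_BATCH_SIZE= 64;

  // First page number that is not being backed up.
  // 0 means that no backup is in progress.
  std::atomic<uint32_t> backup_end{0};

  // Publish the end of the batch that is about to be copied.
  void begin_batch(uint32_t end)
  {
    assert(end > 0);
    backup_end.store(end, std::memory_order_release);
  }

  // Indicate that no page range is being backed up any more.
  void finish() { backup_end.store(0, std::memory_order_release); }

  // Whether page_no lies in the BACKUP_BATCH_SIZE pages preceding backup_end,
  // that is, in the range a page writer would have to stay away from.
  bool is_being_backed_up(uint32_t page_no) const
  {
    const uint32_t end= backup_end.load(std::memory_order_acquire);
    if (!end)
      return false;
    const uint32_t start= end > BACKUP_BATCH_SIZE ? end - BACKUP_BATCH_SIZE : 0;
    return page_no >= start && page_no < end;
  }
};
```

In this sketch, a page writer would test is_being_backed_up() before writing and defer the write while it returns true. A real implementation would also have to make the check and the write atomic with respect to the backup thread, which is presumably the role of write_or_backup.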

dr-m · 2026-06-09T07:01:04Z

    backup_target target;
 #ifndef _WIN32
    const int datadir_fd;
 #endif


Why is there a declaration of datadir_fd in the first place? It turns out that it is never being assigned, only passed to openat(2). This seems to rely on zero-initialization and some undefined behaviour. For example, in Linux, AT_FDCWD is defined as -100.

I noticed later that this data member is being initialized from a call to open(…, O_DIRECTORY) in the constructor. I think that this is bad practice; we should allow the singleton object to be initialized statically.

It is unclear when, if ever, the directory handle would be closed. I suspect that we would hold the handle open until the server is shut down.

datadir_fd is initialized in the Aria_backup constructor.

Is it a reasonable trade-off to keep a descriptor permanently open for backups, for the purpose of just one plugin? Should each plugin that performs the backup keep an open descriptor to the data directory? When another bit of functionality (not backup) is added to any part of the server or a plugin, with its own class/module/translation unit, and it needs to open files in the data directory, should it also maintain a file descriptor to the data directory? There may be a case for defining an API for the server to maintain a single descriptor and make it available to other code, but given that there isn't one, I would argue for opening a descriptor for the duration of the backup and closing it when the backup ends. A backup isn't a fine-grained operation that we expect to be performed multiple times per second (or even per minute); rather, we expect that many hours may pass between successive backups. And every backup will typically open hundreds or thousands of files, so saving the time and I/O of one open/close is not significant, and it's not clear that this saving justifies keeping an additional descriptor open the whole time the server is running.

In any case I propose putting that discussion off for a possible future improvement, where we could discuss defining an API to make a data directory descriptor available to the whole server.
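
For reference, the "open for the duration of the backup, close on backup end" approach argued for above could look roughly like the following POSIX-only sketch. The class dir_handle and its members are hypothetical names, not code from this PR. Note also that a zero-initialized descriptor would not mean "current directory": 0 is an ordinary descriptor number (normally standard input), while AT_FDCWD is a distinct negative constant.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical RAII wrapper around a directory descriptor (POSIX only).
// The descriptor lives exactly as long as the object, for example for
// the duration of one backup.
class dir_handle
{
  int fd;
public:
  // Open the directory; on failure the handle is left closed (fd < 0).
  explicit dir_handle(const char *path)
    : fd(::open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC)) {}

  // Close the descriptor when the backup ends.
  ~dir_handle() { if (fd >= 0) ::close(fd); }

  dir_handle(const dir_handle&)= delete;
  dir_handle &operator=(const dir_handle&)= delete;

  bool is_open() const { return fd >= 0; }

  // Open a file relative to the directory. Returns -1 on failure.
  int open_file(const char *name, int flags) const
  {
    assert(fd >= 0); // never pass an unopened descriptor to openat(2)
    return ::openat(fd, name, flags | O_CLOEXEC);
  }
};
```

With a wrapper like this, the descriptor cannot outlive the backup, and a failed open() is visible through is_open() instead of being passed on to openat(2).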


dr-m · 2026-06-17T08:24:52Z

+  case BACKUP_PHASE_NO_DML_NON_TRANS:
+    /* FIXME: Would be better to selectively purge only the tables we need. */
+    tc_purge();
+    tdc_purge(true);
+    return 0;


The comment is missing a reference to a ticket where this performance problem will be fixed. As far as I understand, we only need such cleanup for ENGINE=Aria and ENGINE=MyISAM. We don’t want unnecessary disruption of ENGINE=InnoDB. The BACKUP_PHASE_NO_COMMIT that will be executed a couple of steps after this one will be disruptive enough.
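
As a rough illustration of what "selectively purge only the tables we need" could mean, here is a self-contained sketch that evicts only entries of non-transactional engines from a cache. Everything in it (the engine enum, table_cache_sketch, needs_purge(), purge_selected()) is hypothetical; it does not use or reflect the real tc_purge()/tdc_purge() interfaces or the server's table cache data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical model: table name -> storage engine.
enum class engine { InnoDB, Aria, MyISAM };
using table_cache_sketch= std::unordered_map<std::string, engine>;

// Assumption taken from the review comment: only Aria and MyISAM
// tables need to be evicted before their files are copied.
inline bool needs_purge(engine e)
{
  return e == engine::Aria || e == engine::MyISAM;
}

// Evict only the entries that need it; return how many were evicted.
inline std::size_t purge_selected(table_cache_sketch &cache)
{
  const std::size_t before= cache.size();
  std::size_t purged= 0;
  for (auto it= cache.begin(); it != cache.end(); )
  {
    if (needs_purge(it->second))
    {
      it= cache.erase(it); // erase() returns the next valid iterator
      purged++;
    }
    else
      ++it;
  }
  assert(cache.size() + purged == before);
  return purged;
}
```

In this model, ENGINE=InnoDB entries stay cached, which is the disruption the comment above wants to avoid.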

mariadb-andrzejjarzabek requested a review from dr-m April 22, 2026 08:28

gkodinov added the MariaDB Corporation label Apr 23, 2026

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch 6 times, most recently from 4ef94be to c143ff2 on April 25, 2026 19:02

dr-m reviewed Apr 28, 2026


mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch from efbc62b to c77278b on May 4, 2026 14:00

dr-m force-pushed the MDEV-14992 branch from bcbda03 to e0d850e on May 5, 2026 12:06

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch 2 times, most recently from e89625e to 06fd556 on May 17, 2026 19:19

dr-m force-pushed the MDEV-14992 branch from e0d850e to 0c52540 on May 18, 2026 09:40

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch 2 times, most recently from 4d6a19c to d86a6a1 on May 20, 2026 05:50

dr-m force-pushed the MDEV-14992 branch from bdec600 to 45e7902 on May 20, 2026 06:52

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch 2 times, most recently from c6f5119 to 9ebb23e on May 20, 2026 20:24

dr-m reviewed May 21, 2026


Comment thread sql/sql_backup.cc Outdated

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch 2 times, most recently from 8565956 to 04e3bc2 on May 21, 2026 10:53

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch 2 times, most recently from 14ca552 to c9429eb on May 29, 2026 09:47

dr-m force-pushed the MDEV-14992 branch from d595ce0 to c0e48fc on May 29, 2026 14:17

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch 2 times, most recently from d35cd47 to 824afeb on June 1, 2026 15:15

mariadb-andrzejjarzabek marked this pull request as ready for review June 1, 2026 15:16

mariadb-andrzejjarzabek marked this pull request as draft June 1, 2026 15:17

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch from 23ac784 to 2dfc033 on June 5, 2026 20:07

dr-m added 2 commits June 8, 2026 12:31

WIP MDEV-39092, and back up non-InnoDB files

10539aa

dr-m force-pushed the MDEV-14992 branch from b98be03 to 10539aa on June 8, 2026 09:35

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch from 2dfc033 to 01a069c on June 9, 2026 06:47

dr-m reviewed Jun 9, 2026


dr-m added 7 commits June 9, 2026 11:19

fixup! 10f3acb

ef91938

Simplify the backup_target

fixup! ef91938

dec5d45

backup_config_append(): C compatible API

fixup! dec5d45

b868d24

fixup! b868d24

75ff32f

squash! 10f3acb

f257584

MDEV-39101: BACKUP STAGE compatible locking handlerton::backup_end: Includes backup_finalize

Revert "fixup! b868d24"

bd50329

This reverts commit 75ff32f.

fixup! f257584

7cd70f4

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch from 6905bda to b6549cc on June 9, 2026 14:29

fixup! f257584

7f48935

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch from 4b496a7 to 9debff9 on June 10, 2026 08:04

squash! 10539aa

6c1d7e0

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch from 0cfc392 to 0de6368 on June 10, 2026 11:54

dr-m and others added 8 commits June 11, 2026 13:43

Multi-threaded backup and stub of streaming backup

8077134

fixup! 8077134

17ce1be

Implement table and file copy using backup_step

e454e30

Fix Aria "crashed table" message on restore. Fix Windows build issue.

ece2446

Incorporate changes from code review.

c02b69b

Fix Windows crash. Enable backup_server_restore for Windows.

a59ddcd

Make the backup singleton reside in statically allocated memory.

828b9ed

Fix concurrent BACKUP_PHASE_FINISH issue.

Add a test for Aria concurrent backup with many tables

8b1c81b

mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch from 0de6368 to 8b1c81b on June 12, 2026 06:11

dr-m reviewed Jun 17, 2026


mariadb-andrzejjarzabek marked this pull request as ready for review June 17, 2026 08:44

dr-m force-pushed the MDEV-14992 branch from e81af7f to 6c8a37f on June 18, 2026 09:59
