After this patch set is applied, an instance will try to detect at
startup whether it has fallen too far behind its peers in the cluster
and therefore needs to be rebootstrapped. If so, it will skip local
recovery and bootstrap from a remote master instead. Old files (xlog,
snap) are not deleted during rebootstrap; they will be removed by gc
as usual.
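
To illustrate the kind of check described above, here is a minimal
sketch, not the code from this series. It assumes a simplified vector
clock (a fixed array of LSNs) and assumes that "fell behind" means the
instance lacks rows the master can no longer replay. All names
(sketch_vclock, sketch_vclock_covers, sketch_needs_rebootstrap) are
hypothetical and do not exist in the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified vector clock: one LSN per replica id. */
enum { SKETCH_VCLOCK_MAX = 4 };

struct sketch_vclock {
	int64_t lsn[SKETCH_VCLOCK_MAX];
};

/*
 * Return true if every component of @a is greater than or equal to
 * the matching component of @b, i.e. @a has seen everything @b has.
 */
static bool
sketch_vclock_covers(const struct sketch_vclock *a,
		     const struct sketch_vclock *b)
{
	assert(a != NULL && b != NULL);
	for (int i = 0; i < SKETCH_VCLOCK_MAX; i++) {
		if (a->lsn[i] < b->lsn[i])
			return false;
	}
	return true;
}

/*
 * Assumed rule: @local is the vclock the instance would reach after
 * local recovery, @master_oldest is the oldest vclock the master can
 * still replay rows from. If @local does not cover @master_oldest,
 * some rows the instance needs are already gone on the master, so
 * following the master is impossible and a rebootstrap is required.
 */
static bool
sketch_needs_rebootstrap(const struct sketch_vclock *local,
			 const struct sketch_vclock *master_oldest)
{
	return !sketch_vclock_covers(local, master_oldest);
}
```

With a local vclock of {0, 10, 5, 0} and a master's oldest vclock of
{0, 12, 3, 0}, the sketch reports that a rebootstrap is needed, because
rows 11 and 12 of the second component are no longer available. The
real series may apply additional conditions that this sketch omits.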
TODO: write a test checking that garbage collection works as expected.
https://github.com/tarantool/tarantool/issues/461
https://github.com/tarantool/tarantool/commits/gh-461-replica-rejoin
Changes in v2:
- Implement rebootstrap support for the vinyl engine.
- Call recover_remaining_wals() explicitly after recovery_stop_local(),
  as suggested by @kostja.
- Add a comment to memtx_engine_new() explaining why we need to init
  INSTANCE_UUID before proceeding to local recovery.
v1:
https://www.freelists.org/post/tarantool-patches/RFC-PATCH-0012-Replica-rejoin
Vladimir Davydov (11):
  box: retrieve instance uuid before starting local recovery
  box: refactor hot standby recovery
  box: retrieve end vclock before starting local recovery
  box: open the port before starting local recovery
  box: connect to remote peers before starting local recovery
  box: factor out local recovery function
  applier: inquire oldest vclock on connect
  replication: rebootstrap instance on startup if it fell behind
  vinyl: simplify vylog recovery from backup
  vinyl: pass flags to vy_recovery_new
  vinyl: implement rebootstrap support
 src/box/applier.cc                       |  15 ++
 src/box/applier.h                        |   2 +
 src/box/box.cc                           | 312 +++++++++++++++++--------------
 src/box/box.h                            |   4 +-
 src/box/iproto.cc                        |  30 ++-
 src/box/iproto.h                         |   5 +-
 src/box/iproto_constants.h               |   2 +
 src/box/lua/cfg.cc                       |   1 -
 src/box/memtx_engine.c                   |  21 ++-
 src/box/recovery.cc                      |  34 +++-
 src/box/recovery.h                       |   5 +-
 src/box/replication.cc                   |  15 ++
 src/box/replication.h                    |   9 +
 src/box/vinyl.c                          |   8 +-
 src/box/vy_log.c                         | 208 +++++++++++++++------
 src/box/vy_log.h                         |  50 ++++-
 src/box/xrow.c                           |  36 ++++
 src/box/xrow.h                           |  31 +++
 test/replication/replica_rejoin.result   | 201 ++++++++++++++++++++
 test/replication/replica_rejoin.test.lua |  75 ++++++++
 20 files changed, 832 insertions(+), 232 deletions(-)
 create mode 100644 test/replication/replica_rejoin.result
 create mode 100644 test/replication/replica_rejoin.test.lua
--
2.11.0