Database Syncer Controller

Understanding the database syncer controller.

Another big module in db-controller is the DbSyncer Controller. In the Goten repository, see the runtime/db_syncing_ctrl module. It is responsible for:

  • Maintaining the metadata.syncing field when the corresponding MultiRegionPolicy changes.
  • Syncing resources from other Deployments of the same Service into the current local database (read copies).
  • Syncing resources from other Deployments and the current Deployment into Search storage.
  • Upgrading the database of the local Deployment.

It mixes multi-version and multi-region features, but the reason is that they share many common structures and patterns around db-syncing. Version syncing is still copying from one database to another, even if it is a bit special, since we need to “modify” the resources we are copying along the way.

This module is interested in dynamic Deployment updates, but only for the current Service. See the node_manager.go file. We utilize EnvRegistry to get the current setup. Normally we initiate the inner node manager when we get SyncEvent, but we also support dynamic updates via DeploymentSetEvent and DeploymentRemovedEvent. We just need to verify that the Deployment belongs to our Service. If it does, something changed there and we should refresh. We could perhaps diff against the “previous” state, but a NOOP refresh is fine too. Either way, we need to ensure that the Node is aware of all foreign Deployments, because those are potential candidates to sync from. Now let’s dive into a single Node instance.
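
Before that, here is a minimal sketch of how such event handling could look. Everything in it (envEvent, nodeManagerSketch, the handle method) is a hypothetical illustration, not the actual node_manager.go code:

```go
// Hypothetical sketch: reacting to EnvRegistry events in the node
// manager; names here are illustrative, not the actual Goten API.
package dbsyncingctrl

import "log"

type eventKind int

const (
	syncEvent eventKind = iota
	deploymentSetEvent
	deploymentRemovedEvent
)

type envEvent struct {
	kind       eventKind
	serviceRef string // Service the Deployment belongs to
	deployment string // Deployment name
}

type nodeManagerSketch struct {
	ownService string
}

// handle decides whether an EnvRegistry event requires touching the node.
func (m *nodeManagerSketch) handle(ev envEvent) {
	switch ev.kind {
	case syncEvent:
		// Initial setup: create the inner node with the current env.
		log.Println("initializing inner node")
	case deploymentSetEvent, deploymentRemovedEvent:
		// Only Deployments of our own Service matter; they are
		// potential sync sources for the local database.
		if ev.serviceRef != m.ownService {
			return
		}
		// A NOOP refresh is acceptable: recompute the set of foreign
		// Deployments and rebuild per-shard nodes if anything changed.
		log.Println("refreshing node, deployment changed:", ev.deployment)
	}
}
```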

Now, DbSyncingCtrl can be quite complex, even though all it does is copy resource instances across databases. First, check the ControllerNode struct in the controller_node.go file, which represents a single Node responsible for copying data. A basic breakdown (with a struct sketch after the list):

  • It may have two instances of VersionedStorage: one for the older API version, one for the newer. Generally, we support only the last two versions for DbSyncer. More should not be needed, and it would make an already complex structure even more difficult. This is necessary for database upgrades.
  • We have two instances of syncingMetaSet, one per versioned storage. They contain SyncingMeta objects per multi-region policy-holder and resource type pair. An instance of syncingMetaSet is used by localDataSyncingNode instances. To be honest, if ControllerNode had just one localDataSyncingNode object, not many, then syncingMetaSet would simply be part of it!
  • Then we have the rangedLocalDataNodes and rangedRemoteDataNodes maps.
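
Purely as an illustration, such a struct could be laid out roughly like this; field names and types below are guesses for the sake of the sketch, not copied from controller_node.go:

```go
// Illustrative layout of a ControllerNode-like struct; the real field
// names and types live in controller_node.go.
package dbsyncingctrl

type shardRange struct{ lo, hi int } // hypothetical key for a shard sub-range

type controllerNodeSketch struct {
	// Up to two versioned storages: the currently active API version
	// and the one we upgrade to (nil when no upgrade is in progress).
	activeVsStorage  *versionedStorage
	syncingVsStorage *versionedStorage

	// One syncingMetaSet per versioned storage, shared by all
	// localDataSyncingNode instances (one per shard sub-range).
	activeVsSyncMetaSet  *syncingMetaSet
	syncingVsSyncMetaSet *syncingMetaSet

	// Syncing nodes keyed by shard sub-ranges of at most ten shards.
	rangedLocalDataNodes  map[shardRange]*localDataSyncingNode
	rangedRemoteDataNodes map[shardRange][]*remoteDataSyncingNode // one per foreign Deployment
}

// Placeholder types, so the sketch is self-contained.
type (
	versionedStorage      struct{}
	syncingMetaSet        struct{}
	localDataSyncingNode  struct{}
	remoteDataSyncingNode struct{}
)
```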

Now, object localDataSyncingNode is responsible for:

  • Maintaining metadata.syncing; it must use the passed syncingMetaSet instance for real-time updates.
  • Syncing local resources to Search storage (read copies).
  • Upgrading the local database.

Then, remoteDataSyncingNode is responsible for:

  • Syncing resources from other Deployments in the same Service for the current local database (read copies).
  • Syncing resources from other Deployments for Search storage.

For each foreign Deployment, we will have a separate remoteDataSyncingNode instance.

It is worth asking why we have a map of syncing nodes (local and remote) keyed by shard ranges. The reason is that we split them so each node handles at most ten shards. Often we still end up with maps of just one sub-shard range. Why ten? Because in Firestore, which is a supported database, we can pass a maximum of ten shard numbers in a single request (filter)! Therefore, we need to make separate watch queries, and it is easier to separate nodes then. This way we can guarantee that a single local/remote node is able to send its query successfully to the backend. However, because of this split, we needed to pull syncingMetaSet out of localDataSyncingNode and put it directly in ControllerNode.
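
As an illustration of the ten-shard constraint, here is a small, hypothetical helper that splits a shard range into sub-ranges of at most ten shards, one per node; the actual module may do this differently:

```go
// Hypothetical helper: split a [lo, hi) shard range into chunks of at
// most ten shards, so each node's watch query can pass all its shard
// numbers in a single Firestore filter.
package dbsyncingctrl

func splitShardRange(lo, hi int) [][2]int {
	const maxShardsPerNode = 10
	var subRanges [][2]int
	for start := lo; start < hi; start += maxShardsPerNode {
		end := start + maxShardsPerNode
		if end > hi {
			end = hi
		}
		subRanges = append(subRanges, [2]int{start, end})
	}
	return subRanges
}

// Example: splitShardRange(0, 16) yields [[0 10] [10 16]], i.e. two
// nodes, each able to send its query to the backend in one request.
```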

Since we have syncingMetaSet separated, let’s describe what it does first: Basically, it observes all multi-region policy-holders a Service uses and computes SyncingMeta objects per policy-holder/resource type pair. For example, Service iam.edgelq.com has resources belonging to Service, Organization, and Project, so it watches these 3 resource types. Service devices.edgelq.com only uses Project, so it watches Project instances, and so on. It uses the ServiceDescriptor passed in the constructor to detect all policy-holders.

When syncingMetaSet runs, it collects the first snapshot of all SyncingMeta instances and then maintains it. It sends events to subscribers in real-time (see ConnectSyncingMetaUpdatesListener). This module is not responsible for updating the metadata.syncing field yet, but it is an important first step. It triggers localDataSyncingNode when a new SyncingMeta is detected, so that it can run its updates.
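
A minimal sketch of what that subscription could look like is below; the listener interface and types are assumptions for illustration, not the actual Goten definitions:

```go
// Hypothetical shape of subscribing to SyncingMeta updates; the real
// signature of ConnectSyncingMetaUpdatesListener may differ.
package dbsyncingctrl

// syncingMetaKey identifies a policy-holder / resource-type pair, e.g.
// a specific Project combined with the Device resource type.
type syncingMetaKey struct {
	policyHolder string
	resourceType string
}

// syncingMetaSketch stands in for the computed SyncingMeta object.
type syncingMetaSketch struct {
	regions       []string
	defaultRegion string
}

type syncingMetaListener interface {
	OnSyncingMetaUpdate(key syncingMetaKey, meta syncingMetaSketch)
}

type syncingMetaSetSketch struct {
	snapshot  map[syncingMetaKey]syncingMetaSketch
	listeners []syncingMetaListener
}

// ConnectSyncingMetaUpdatesListener registers a subscriber, such as the
// syncingMetaUpdater of a localDataSyncingNode.
func (s *syncingMetaSetSketch) ConnectSyncingMetaUpdatesListener(l syncingMetaListener) {
	s.listeners = append(s.listeners, l)
}

// publish updates the snapshot and notifies subscribers in real time.
func (s *syncingMetaSetSketch) publish(key syncingMetaKey, meta syncingMetaSketch) {
	s.snapshot[key] = meta
	for _, l := range s.listeners {
		l.OnSyncingMetaUpdate(key, meta)
	}
}
```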

The next important module is the resVersionsSet object, defined in file res_versions_set.go. It is a central component in both local and remote nodes, so perhaps it is worth explaining how it works.

This set contains all resource names with their versions in a tree structure. By version, I don’t mean the API version of the resource, but the literal resource version; we have a field in metadata for that, metadata.resource_version. This value is a string, but it can only contain an integer that increments with every update. This is the basis for comparing resources across databases. How so? If we have the “main” database owning a resource, we know it contains the newest version; metadata.resource_version is highest there. However, we have other databases. For example, the search database may be separate, like Algolia; there, metadata.resource_version may be lower. We also have a syncing database (for example, across regions). The database in another region, which gets just read-only copies, can also at best match the origin database. resVersionsSet has important functions:

  • SetSourceDbRes and DelSourceDbRes are called by the original database owning the resource.
  • SetSearchRes and DelSearchRes are called by the search database.
  • SetSyncDbRes and DelSyncDbRes are called by the syncing database (for example, cross-region syncing).
  • CollectMatchingResources collects all resource names matched by a prefix. This is used for metadata.syncing updates: when a policy-holder resource updates its MultiRegionPolicy, we need to collect all resources subject to it!
  • CheckSourceDbSize is necessary for Firestore, which is known to occasionally “lose” some deletions. If the size is incorrect, we need to reset the source DB (original) and provide a snapshot.
  • SetSourceDbSyncFlag is used by the original DB to signal that it has supplied all updates to resVersionsSet and now continues with real-time updates only.
  • Run: resVersionsSet is used in a multi-threaded environment, so it runs on a separate goroutine and uses Go channels for synchronization, with callbacks where necessary.

resVersionsSet also supports listeners where necessary. It triggers them when the source DB updates/deletes a resource, or when the syncing database reaches equivalence with the original database. We don’t provide similar signals for the search DB, simply because we don’t need them, but we do for the syncing DB. We will explain later.
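
To keep the above functions in one view, here is a hypothetical Go interface summarizing them; the real signatures in res_versions_set.go will differ:

```go
// Hypothetical summary of the resVersionsSet surface described above;
// signatures are illustrative, not copied from res_versions_set.go.
package dbsyncingctrl

type resVersionsSetAPI interface {
	// Called by the original database owning the resources.
	SetSourceDbRes(name string, version int64)
	DelSourceDbRes(name string)

	// Called by the search storage copy (for example, Algolia).
	SetSearchRes(name string, version int64)
	DelSearchRes(name string)

	// Called by the syncing database (cross-region or cross-version copy).
	SetSyncDbRes(name string, version int64)
	DelSyncDbRes(name string)

	// Collects all resource names under a name prefix; used when a
	// policy-holder changes its MultiRegionPolicy and metadata.syncing
	// must be recomputed for every resource subject to it.
	CollectMatchingResources(namePrefix string) []string

	// Firestore-specific consistency check; a size mismatch forces a
	// source DB reset and a fresh snapshot.
	CheckSourceDbSize(expectedSize int) bool

	// The source DB signals that the snapshot is complete and only
	// real-time updates follow.
	SetSourceDbSyncFlag()

	// Runs the internal goroutine; interactions go through channels.
	Run(stopCh <-chan struct{})
}

// Listeners described above: source DB changes and syncing-DB catch-up.
type resVersionsListener interface {
	OnSourceDbResSet(name string, version int64)
	OnSourceDbResDeleted(name string)
	OnSyncDbCaughtUp()
}
```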

Now let’s talk about local and remote nodes, starting with local.

See the local_data_syncing_node.go file, which constructs all the modules responsible for the mentioned tasks. First, analyze the newShardRangedLocalDataSyncingNode constructor up to the if needsVersioning condition, where we create modules for database versioning. Before this condition, we create the modules for Search DB syncing and metadata.syncing maintenance. Note how we use the activeVsResVSet object (of type resVersionsSet): we connect it to the search syncer and syncing meta updater modules. For each resource type, we create an instance of the source db watcher, which gets access to the resource version set. It should be clear now: the source DB, which belongs to our local Deployment, keeps updating activeVsResVSet, which in turn passes updates to activeVsSS and activeVsMU. We also connect activeVsMU to activeVsSyncMS, so we have the two necessary signal sources for maintaining the metadata.syncing object.
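
A rough, hypothetical sketch of that wiring, with stub types standing in for the real ones, could look like this:

```go
// Hypothetical wiring of the pre-versioning part of the constructor
// described above; names echo the prose, but this is not the actual code.
package dbsyncingctrl

type resVersionsSetStub struct{}
type searchSyncerStub struct{ set *resVersionsSetStub }
type syncingMetaUpdaterStub struct{ set *resVersionsSetStub }
type syncingMetaSetStub struct{ subscribers []*syncingMetaUpdaterStub }
type sourceDbWatcherStub struct {
	resourceType string
	set          *resVersionsSetStub
}

// ConnectSyncingMetaUpdatesListener registers the updater for SyncingMeta changes.
func (s *syncingMetaSetStub) ConnectSyncingMetaUpdatesListener(u *syncingMetaUpdaterStub) {
	s.subscribers = append(s.subscribers, u)
}

// buildLocalNodeSketch mirrors the wiring described above for one shard range.
func buildLocalNodeSketch(resourceTypes []string, activeVsSyncMS *syncingMetaSetStub) {
	// Shared name/version tree for the active API version.
	activeVsResVSet := &resVersionsSetStub{}

	// Consumers of the set: Search DB syncer and metadata.syncing updater.
	activeVsSS := &searchSyncerStub{set: activeVsResVSet}
	activeVsMU := &syncingMetaUpdaterStub{set: activeVsResVSet}

	// The updater needs a second signal source: SyncingMeta changes.
	activeVsSyncMS.ConnectSyncingMetaUpdatesListener(activeVsMU)

	// One source DB watcher per resource type, all feeding the same set,
	// which in turn drives activeVsSS and activeVsMU.
	for _, rt := range resourceTypes {
		_ = &sourceDbWatcherStub{resourceType: rt, set: activeVsResVSet}
	}

	_ = activeVsSS
}
```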

So, you should know now that:

  • search_syncer.go

    It is used to synchronize the Search database, for local resources in this case.

  • syncing_meta_updater.go

    It is used to synchronize the metadata.syncing field for all local resources.

  • base_syncer.go

    It is the common base implementation behind search_syncer.go, but it is not limited to it.

Let’s dive deeper and explain the synchronization protocol between source and destination. Maybe you noticed: why does sourceDbWatcher contain two watchers, one for live data and one for a snapshot? Also, why is there a wait before running the snapshot? Did you see that in the OnInitialized function of localDataSyncingNode, we run the snapshot only once we have received a sync signal? There are reasons for all of that. Let’s discuss the design here.

When a DbSyncingCtrl node instance is initiated for the first time, or when the shard range changes, we need to re-download all resources from the current or foreign database, compare them with the synced database, and execute any necessary creations, updates, and deletions. Moreover, we need to ask for a snapshot of the data in the destination database. This may take time; we don’t know how much, but downloading potentially millions of items is probably not the fastest operation. It means that whenever nodes change (upscaling, downscaling, reboots, whatever), we would need to suspend database syncing, possibly for a while: maybe a minute, maybe more, and is there even an upper limit? If we don’t sync fast, this lag becomes quite visible for users. It is better to start separate watchers for live data directly. Then we sync from the live database to the destination (like the search db), providing almost immediate sync most of the time. In the meantime, we collect the snapshot of data from the destination database. See the base_syncer.go file and the function synchronizeInitialData. When we are done with initialization, we trigger a signal that notifies the relevant instance (local or remote syncing node). In the file local_data_syncing_node.go, in the function OnInitialized, we check if all components are ready, and then we run RunOrResetSnapshot for our source db watchers. This is when the full snapshot is done, and if there are any “missing” updates from the handover, we execute them. Ideally, we won’t have any; the live watcher goes back by one minute when it starts watching, so some updates may even be repeated! But it is still necessary to provide some guarantees, of course. I hope this explains the protocol (sketched in code after the list):

  • Live updates are immediately copied from the source to the destination database…
  • In the meantime, we collect a snapshot of the destination database…
  • When that snapshot is collected, we start the snapshot from the source database…
  • We execute anything missing and continue with live data only.
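
That handover, with hypothetical names, could be condensed into the following sketch; the actual implementation is in base_syncer.go and local_data_syncing_node.go:

```go
// Condensed, hypothetical sketch of the handover described above; the
// real flow lives in base_syncer.go and local_data_syncing_node.go.
package dbsyncingctrl

import "time"

// The live watcher starts about one minute in the past, so a few
// updates may arrive twice; repeats are harmless because resource
// version comparison turns them into no-ops.
const liveWatchLookBehind = time.Minute

type baseSyncerSketch struct {
	initialized chan struct{}
}

// synchronizeInitialData collects the destination-side snapshot while
// the live watcher is already copying fresh updates to the destination.
func (s *baseSyncerSketch) synchronizeInitialData(collectDestSnapshot func()) {
	collectDestSnapshot()
	close(s.initialized) // notify the owning local/remote node
}

// onInitialized runs once all components of the node reported readiness;
// only then do we run the full source snapshot and replay anything the
// live stream might have missed during the handover.
func (s *baseSyncerSketch) onInitialized(runOrResetSnapshot func()) {
	<-s.initialized
	runOrResetSnapshot()
}
```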

Another reason for this design, and for using QueryWatcher instances (rather than Watchers), is simple: RAM. DbSyncingCtrl needs to watch practically all database updates and needs full resource bodies. Note that we also use access.QueryWatcher instances in sourceDbWatcher. QueryWatcher is a lower-level object compared to plain Watcher. It cannot support multiple queries, and it does not handle resets or snapshot size checks (Firestore only). This is also a reason why ControllerNode has a map of localDataSyncingNode instances per shard range… a full Watcher would be able to split queries and hide this complexity. But QueryWatcher has benefits:

  • It does not store watched resources in its internal memory!

Imagine millions of resources whose whole bodies are kept in RAM by a Watcher instance. That goes in the wrong direction; DbSyncingCtrl is supposed to be slim. In resVersionsSet we only keep version numbers and resource names, in tree form. We also try to compress all syncer modules into one place, so syncingMetaUpdater and searchUpdater live together: if there is some update, we don’t need to split it further and increase pressure on the infrastructure.

This concludes the discussion of MultiRegion replication and Search db syncing for LOCAL nodes. We will describe remote data syncing nodes later in this doc. First, however, let’s continue with the local data syncing node and talk about its other task: database upgrades.

Object localDataSyncingNode now needs to consider up to four databases:

  1. Local database for API Version currently active (1)
  2. Local database for API Version to which we sync to (2)
  3. Local Search database for API Version currently active (3)
  4. Local Search database for API Version to which we sync to (4)

Let’s introduce the terms: Active database and Syncing database. When we are upgrading to a new API Version, the Active database contains old data and the Syncing database contains new data. When we synchronize in the other direction, for rollback purposes (just in case?), the Active database contains new data and the Syncing database contains old data.

And extra SyncingMetaUpdaters:

  • syncingMetaUpdater for the currently active version (5)
  • syncingMetaUpdater for synced version (6)

We need the following sync connections (also sketched in code after the list):

  • Point 1 to Point 2 (This is most important for database upgrade)
  • Point 1 to Point 3
  • Point 2 to Point 4
  • Point 1 to Point 5 (plus extra signal input from syncingMetaSet active instance)
  • Point 2 to Point 6 (plus extra signal input from the syncingMetaSet syncing instance)
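
Expressed as data, with purely illustrative names, the connections are:

```go
// Purely illustrative listing of the sync connections enumerated above.
package dbsyncingctrl

type syncEndpoint string

const (
	activeDb        syncEndpoint = "local DB, active version"    // (1)
	syncingDb       syncEndpoint = "local DB, syncing version"   // (2)
	activeSearchDb  syncEndpoint = "search DB, active version"   // (3)
	syncingSearchDb syncEndpoint = "search DB, syncing version"  // (4)
	activeMetaUpdr  syncEndpoint = "syncingMetaUpdater, active"  // (5)
	syncingMetaUpdr syncEndpoint = "syncingMetaUpdater, syncing" // (6)
)

// syncConnections lists source -> destination pairs maintained by
// localDataSyncingNode during a database upgrade.
var syncConnections = [][2]syncEndpoint{
	{activeDb, syncingDb},        // the database upgrade itself
	{activeDb, activeSearchDb},   // keep the active search store in sync
	{syncingDb, syncingSearchDb}, // pre-populate the future search store
	{activeDb, activeMetaUpdr},   // plus signals from the active syncingMetaSet
	{syncingDb, syncingMetaUpdr}, // plus signals from the syncing syncingMetaSet
}
```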

This is insane and needs careful code writing, which is sometimes lacking here. We will need to carefully add some tests and try to clean up the code, but the deadline was the deadline.

Go back to the function newShardRangedLocalDataSyncingNode in local_data_syncing_node.go, and look at the line with if needsVersioning and below. This constructs the extra elements. First, note that we create a syncingVsResVSet object, another resVersionsSet. This set is responsible for syncing between the syncing database and the search store. It is also used to keep signaling the syncing version’s syncingMetaUpdater. But I see now this was a mistake, because we don’t need this element: it is enough for the Active database to keep running its syncingMetaUpdater. We know those updates will be reflected in the syncing database, because we already sync in that direction! We do, however, need to keep the second, additional Search database syncing. When we finish upgrading the database to the new version, we don’t want to start with an empty search store from the first moment! That would not go unnoticed. Therefore, we have this Search database syncing for the “Syncing database” too.

But let’s focus on the most important bit: the actual database upgrade, from the Active to the Syncing local main storage. Find the function called newResourceVerioningSyncer and see where it is called. It receives access to the syncing database, and it gets access to the node.activeVsResVSet object, which contains resources from the active database. This is the object responsible for upgrading resources: resourceVersioningSyncer, in the file resource_versioning_syncer.go. It works like the other “syncers” and inherits from the base syncer, but it also needs to transform resources. It uses transformers from versioning packages. When it uses resVersionsSet, it calls SetSyncDbRes and DelSyncDbRes, to compare with the original database. We can safely require that metadata.resource_version stays the same between old and new resource instances; transformation cannot change it. Because syncDb and searchDb are different, we are fine with the search syncer and the versioning syncer using the same resource versions set.
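
A hypothetical sketch of a single upgrade step, under the assumption that resource versions are stringified integers as described earlier, might look like this:

```go
// Hypothetical core of an upgrade step in a versioning syncer: transform
// a resource to the new API version, require that it keeps the same
// metadata.resource_version, and record it in the shared versions set.
package dbsyncingctrl

import (
	"fmt"
	"strconv"
)

type resourceSketch struct {
	Name            string
	ResourceVersion string // metadata.resource_version, a stringified integer
	Body            map[string]interface{}
}

// versionTransformer stands in for the transformers from the versioning packages.
type versionTransformer interface {
	Transform(old resourceSketch) (resourceSketch, error)
}

// syncDbRecorder is the small slice of resVersionsSet needed here.
type syncDbRecorder interface {
	SetSyncDbRes(name string, version int64)
}

func upgradeOne(old resourceSketch, tr versionTransformer, set syncDbRecorder) error {
	upgraded, err := tr.Transform(old)
	if err != nil {
		return err
	}
	// Transformation must not change the resource version; this is what
	// makes comparison between the active and syncing databases meaningful.
	if upgraded.ResourceVersion != old.ResourceVersion {
		return fmt.Errorf("transform changed resource_version of %s", old.Name)
	}
	version, err := strconv.ParseInt(upgraded.ResourceVersion, 10, 64)
	if err != nil {
		return err
	}
	// ... persist `upgraded` in the syncing database here ...
	set.SetSyncDbRes(upgraded.Name, version)
	return nil
}
```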

Object resourceVersioningSyncer also makes extra ResourceShadow upgrades: transformed resources MAY have different references after the changes, so we need to refresh them! This makes the syncer even more special.

However, we have a little issue with ResourceShadow instances: they don’t have a metadata.syncing field, and they are only partially covered by resourceVersioningSyncer, which does not populate some fields, like back reference sources. As this is special, we need shadowsSyncer, defined in the file shadows_versioning_syncer.go. It also synchronizes ResourceShadow instances, but only the fields that cannot be populated by resourceVersioningSyncer.

During database version syncing, localDataSyncingNode receives signals (per resource type) when there is a synchronization event between the source database and the syncing database. See the ConnectSyncReadyListener method in resVersionsSet: this is how syncDb (here, the syncing database!) notifies us when the two databases match. localDataSyncingNode uses this to coordinate Deployment version switches. See the function runDbVersionSwitcher for the full procedure. This is basically the place where a Deployment can switch from one version to another. When this happens, all backend services flip their instances.
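
A minimal sketch of that coordination, with assumed names, could look like this:

```go
// Hypothetical coordination of the version switch: wait until every
// resource type reports that the syncing DB matches the source DB, then
// flip the Deployment. The real procedure is in runDbVersionSwitcher.
package dbsyncingctrl

type versionSwitcherSketch struct {
	pending map[string]bool // resource types not yet fully synced
	ready   chan string     // fed by ConnectSyncReadyListener-style callbacks
}

// onSyncReady would be registered via ConnectSyncReadyListener.
func (v *versionSwitcherSketch) onSyncReady(resourceType string) {
	v.ready <- resourceType
}

// run blocks until all resource types are in sync, then switches the
// Deployment version, which makes all backend instances flip as well.
func (v *versionSwitcherSketch) run(switchDeploymentVersion func()) {
	for len(v.pending) > 0 {
		delete(v.pending, <-v.ready)
	}
	switchDeploymentVersion()
}
```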

This is all about local data syncing nodes. Let us switch to remote nodes: a remote node (object remoteDataSyncingNode, file remote_data_syncing_node.go) syncs between the local database and a foreign regional one. It is at least simpler than the local node. It synchronizes:

  • From remote database to local database
  • From remote database to local search database

If there are two API Versions, it is assumed that both regions may be in the middle of upgrading. Then, we have 2 extra syncs:

  • From the remote database in the other version to the local database
  • From remote database in the other version to local search database

When we are upgrading, we are required to deploy new images to the first region, then the second, third, and so on, till the last region gets new images. However, we must not switch the version of any region until all regions have new images. While switching and deploying can be done one by one, those two stages need separation. This is required for these nodes to work correctly. Also, if we switched the Deployment version in one region before upgrading images in other regions, there is a high chance users would use the new API and see significant gaps in resources. Therefore, the versioning upgrade needs to be considered in multi-region setups too.

Again, we may be operating on four local databases and two remote APIs in total, but at least this is symmetric. Remote syncing nodes also don’t deal with Mixins, so there is no ResourceShadow cross-db syncing. If you study newShardRangedRemoteDataSyncingNode, you can see that it uses searchSyncer and dbSyncer (db_syncer.go).
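
To wrap up, an illustrative layout of such a remote node, with assumed field names, could look like this:

```go
// Illustrative layout of a remote data syncing node, matching the
// description above; not the actual newShardRangedRemoteDataSyncingNode code.
package dbsyncingctrl

type remoteNodeSketch struct {
	foreignDeployment string

	// Syncs from the foreign Deployment's active API version.
	dbSyncer     *dbSyncerSketch     // remote DB -> local DB (read copies)
	searchSyncer *searchSyncerSketch // remote DB -> local search DB

	// Extra syncs used only while two API versions are in play.
	otherVsDbSyncer     *dbSyncerSketch
	otherVsSearchSyncer *searchSyncerSketch
}

// Placeholder types for the syncer implementations (db_syncer.go,
// search_syncer.go).
type (
	dbSyncerSketch     struct{}
	searchSyncerSketch struct{}
)
```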