Multi-Region Sync Flow

Understanding the multi-region synchronization flow.

First, each Deployment must keep metadata.syncing up to date for all resources it owns. To watch the resources it owns, it issues:

  • WATCH <Resource> WHERE metadata.syncing.owningRegion = <SELF>.

    It then receives updates in real time.

The API Server already ensures that metadata.syncing is refreshed whenever a resource is updated. The problem arises when a MultiRegionPolicy object changes: the Deployment must then asynchronously update all resources that are subject to that policy-holder. It must therefore send Watch requests for ALL resources that can be policy-holders. For example, a Deployment of iam.edgelq.com needs three watches:

  1. Watch Projects WHERE multi_region_policy.enabled_regions CONTAINS <MyRegion>

    by iam.edgelq.com service.

  2. Watch Organizations WHERE multi_region_policy.enabled_regions CONTAINS <MyRegion>

    by iam.edgelq.com service.

  3. Watch Services WHERE multi_region_policy.enabled_regions CONTAINS <MyRegion>

    by meta.goten.com service.

Simpler services such as devices.edgelq.com need to watch only Projects, because they have no resources subject to other kinds of policy-holder.

A Deployment only needs to watch policy-holders that are relevant to its region.

The flow is then the following:

  • When a Deployment is notified that a MultiRegionPolicy has been updated, it needs to collect all resources subject to this policy.
  • It then sends an Update request for each of them; the API Server ensures that metadata.syncing is updated accordingly.

The procedure above keeps metadata.syncing up to date.
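
The two-step flow above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual implementation: handle_policy_change, the "policy_holder" key, and the send_update callback are hypothetical names, and the Update request is represented by a plain callable. Recomputing metadata.syncing is left to the API Server, as described above.

```python
def handle_policy_change(holder_name, owned_resources, send_update):
    """Re-send an Update for every owned resource governed by a changed policy-holder.

    holder_name:     name of the policy-holder whose MultiRegionPolicy changed
    owned_resources: iterable of dicts with hypothetical keys "name" and
                     "policy_holder" (resources owned by this Deployment)
    send_update:     callable standing in for the Update request; the API
                     Server is expected to refresh metadata.syncing on receipt
    Returns the names of the resources for which an Update was sent.
    """
    # Step 1: collect all resources subject to the changed policy.
    subjects = [r for r in owned_resources if r["policy_holder"] == holder_name]

    # Step 2: send one Update request per resource.
    for resource in subjects:
        send_update(resource)

    return [r["name"] for r in subjects]
```

Passing the Update call in as a parameter keeps the sketch independent of any particular client library.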

The next part is the actual multi-region syncing. Here, each Deployment of a Service MUST keep one active watch on every other Deployment from the same family. For example, if iam.edgelq.com runs in regions japaneast, eastus2, and us-west2, then the following watches must be maintained:

The Deployment of iam.edgelq.com in us-west2 has two active watches, one sent to the japaneast region and the other to eastus2:

  • WATCH <Resources> WHERE metadata.syncing.owningRegion = japaneast AND metadata.syncing.regions CONTAINS us-west2
  • WATCH <Resources> WHERE metadata.syncing.owningRegion = eastus2 AND metadata.syncing.regions CONTAINS us-west2

Deployments in japaneast and eastus2 each hold two analogous watches, so the Deployments form a full mesh of connections.
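
To make the full mesh concrete, the sketch below derives the watch filters each Deployment would hold. It is only an illustration: peer_watches and full_mesh are hypothetical helper names, and the filters are built as plain strings in the same pseudo-query form used above rather than as real Watch requests.

```python
def peer_watches(self_region, all_regions):
    """Map each peer region to the filter of the watch sent to it by self_region."""
    return {
        peer: (
            f"metadata.syncing.owningRegion = {peer} "
            f"AND metadata.syncing.regions CONTAINS {self_region}"
        )
        for peer in all_regions
        if peer != self_region  # a Deployment does not sync from itself
    }


def full_mesh(all_regions):
    """Map every region to the watches its Deployment must keep active."""
    return {region: peer_watches(region, all_regions) for region in all_regions}
```

With N regions, each Deployment holds N - 1 watches, so the mesh contains N * (N - 1) watches in total; for the three regions in the example that is six.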

When a resource is created in us-west2 with metadata.syncing.regions = [eastus2, japaneast], one copy is sent to each of these regions. Each receiving region therefore has to process incoming updates more or less continuously.

On startup, a Deployment must additionally follow this procedure:

  • The Deployment should list all resources it currently holds that are owned by other regions but syncable locally.
  • It should then grab a snapshot of these resources from the other regions and check whether anything is missing locally or whether it holds too much (a missed deletion). If so, it should execute the missing actions to bring the system back in sync.
  • While the initial snapshot comparison is running, it should keep copying real-time updates from the other regions, because the snapshot may take some time to complete.
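
The snapshot comparison in the startup procedure can be sketched as a simple diff. This is a hedged illustration, not the actual implementation: Action and reconcile are hypothetical names, resources are modelled as name-to-payload dictionaries, and treating a differing payload as an "update" is an assumption added here (the text above names only missing resources and missed deletions). Applying real-time updates while the snapshot is in progress is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str  # "create", "update", or "delete"
    name: str  # resource name the action applies to


def reconcile(local, remote):
    """Compute the actions that bring the local copies in line with a snapshot.

    local:  dict of resource name -> payload currently held locally
    remote: dict of resource name -> payload in the owning region's snapshot
    Both are assumed to contain only resources owned by that one peer region
    and syncable to the local region.
    """
    actions = []
    for name, payload in remote.items():
        if name not in local:
            actions.append(Action("create", name))  # missing locally
        elif local[name] != payload:
            actions.append(Action("update", name))  # stale copy (assumption)
    for name in local:
        if name not in remote:
            actions.append(Action("delete", name))  # missed deletion
    return actions
```

Running one such comparison per peer region keeps each diff scoped to a single owning region, which matches the one-watch-per-peer layout described earlier.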