Goten Protocol Flows

Understanding the Goten protocol flows.

Design decisions include:

  1. Services are isolated, but they can use/import only services on lower levels, and they can support only a subset of the regions available in those used/imported services.
  2. Deployments within a Service must be isolated in the context of versioning. Therefore, they don’t need to point to the same primary API version, and each Service version may import different services in different versions.
  3. References may point across services only if the referencing Service imports the other service. References across regions are fine; it is assumed that regions of the same Service trust each other, at least for now.
  4. All references must carry region, version, and service information to maintain a fully global environment.
  5. We have schema references and meta owner references. Schema references define the region by name, with version and service implied by context. Meta references have separate fields for region, service, and version.
  6. Schema references may be of blocking type, trigger cascade deletion, or be unset on deletion.
  7. Meta references must trigger cascade deletion if all owners disappear.
  8. Each Deployment (a Service + Region pair) is responsible for maintaining the metadata.syncing fields of the resources it owns.
  9. Each Deployment is responsible for catching up on read copies from other regions available to it.
  10. Each Deployment is responsible for the local database schema and its upgrades.
  11. Each Deployment is responsible for Meta owner references in all service regions if they point to that Deployment (via the Kind and Region fields!).
  12. Whenever a cross-region/service reference is established, the other side may reject the relationship.

We have several components in API servers and db controllers for maintaining order in this graph. Points one to three are enforced by the Meta service and the EnvRegistry components. EnvRegistry uses generated descriptors from the Goten specification to populate the Meta service. If someone is “cheating”, point twelve applies: the other side may reject the relationship.
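
To make points four to seven more tangible, here is a minimal sketch of the two reference shapes in Go; these types are illustrative only and are not Goten’s actual API.

```go
// Illustrative reference shapes only; these are NOT Goten's actual types.
package refs

// SchemaRef: the region is identified by name, while the target service and
// API version follow from the context of the field the reference sits in
// (design points 4-6).
type SchemaRef struct {
	Region string // name of the region owning the referenced resource
	Name   string // referenced resource name
}

// DeletionReaction mirrors the allowed behaviors of schema references when
// the referenced resource is deleted: block the deletion, cascade-delete the
// referrer, or unset the referring field (design point 6).
type DeletionReaction int

const (
	Block DeletionReaction = iota
	CascadeDelete
	Unset
)

// MetaOwnerRef is an item of metadata.owner_references: region, service, and
// version are carried as separate, explicit fields (design points 5, 7, 11).
type MetaOwnerRef struct {
	Kind    string
	Region  string
	Service string
	Version string
	Name    string
}
```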

1 - API Server Flow

Understanding the API server flow.

To enforce general schema consistency, we must first properly handle requests coming from users, especially write requests.

The following rules are executed when API servers get a write call:

  • when a write request is sent to the server, the multi-region routing middleware must inspect the request and ensure that all resources being written to (or deleted) are owned by the current region. It must store the MultiRegionPolicy object in the context associated with the current call.
  • a write request can only update resources under a single multi-region policy. This means that writing across, say, two projects is not allowed (writing to global resources is allowed, though). If there is an attempt to write to multiple resources across different policy holders in a single transaction, the Store object must reject the write.
  • the Store object must populate the metadata.syncing field when saving, using the MultiRegionPolicy from the context.
  • when the server calls the Save or Delete function on the store interface (for any Service resource), the following happens:
    • if this is a creation/update and the new resource has schema references that were not there before, then the Store is responsible for connecting to those Services and ensuring that the referenced resources exist, that the relationship is established, and that establishing the references is allowed at all. For references to local resources, it needs to perform the same checks.
    • if this is a deletion, the Store is obliged to check whether there are any blocking back-references. It needs to connect with the Deployments where references may exist, including itself. Local synchronous cascade deletions & unsets must be executed as part of the request.
    • when a Deployment connects with others, it must respect the API versions they use.
  • meta owner references are not checked, because it is assumed the owners may be created later. They are checked asynchronously by the system after the request is completed. A minimal sketch of this write path follows this list.
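
Pulling these rules together, below is a minimal sketch of the Save path, assuming hypothetical types and helpers (PolicyFromCtx, ConfirmRef, validateSave); Goten’s real Store and middleware APIs differ, and the single-policy-per-transaction check is omitted for brevity.

```go
// Hypothetical sketch only: Goten's real Store and middleware APIs differ.
package server

import (
	"context"
	"errors"
)

type MultiRegionPolicy struct {
	Name           string
	EnabledRegions []string
}

// Syncing stands in for the metadata.syncing field.
type Syncing struct {
	OwningRegion string
	Regions      []string
}

type SchemaRef struct{ Region, Name string }

type Resource struct {
	Name       string
	Syncing    Syncing
	SchemaRefs []SchemaRef
}

type policyKey struct{}

// PolicyFromCtx returns the MultiRegionPolicy that the multi-region routing
// middleware attached to the context of the current call.
func PolicyFromCtx(ctx context.Context) (*MultiRegionPolicy, bool) {
	p, ok := ctx.Value(policyKey{}).(*MultiRegionPolicy)
	return p, ok
}

// ConfirmRef asks the Deployment owning the referenced resource (possibly the
// local one) whether the reference may be established; it may reject it.
type ConfirmRef func(ctx context.Context, ref SchemaRef, referrer string) error

// validateSave sketches what happens on Save: metadata.syncing is derived from
// the policy in the context, and newly added schema references are confirmed
// with their target Deployments. Meta owner references are NOT checked here;
// they are validated asynchronously after the request completes.
func validateSave(ctx context.Context, owningRegion string, old, updated *Resource, confirm ConfirmRef) error {
	policy, ok := PolicyFromCtx(ctx)
	if !ok {
		return errors.New("routing middleware did not attach a MultiRegionPolicy")
	}
	updated.Syncing = Syncing{OwningRegion: owningRegion, Regions: policy.EnabledRegions}

	existing := map[SchemaRef]bool{}
	if old != nil {
		for _, r := range old.SchemaRefs {
			existing[r] = true
		}
	}
	for _, r := range updated.SchemaRefs {
		if !existing[r] {
			if err := confirm(ctx, r, updated.Name); err != nil {
				return err // the other side rejected the relationship
			}
		}
	}
	return nil
}
```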

This is the designed flow for API Servers, but we have a couple more flows regarding schema consistency. First, let’s define some corner cases for blocking references across regions/services. Scenario:

  • Deployment D1 gets a write (creation) of resource R1 and establishes a SNAPSHOT transaction.
  • R1 has a blocking reference to R2 in Deployment D2; therefore, on the Save call, D1 must ensure everything is valid.
  • Deployment D1 sends a request to D2 to establish a blocking reference from R1 to R2. D2 can see that R2 exists.
  • D2 blocks resource R2 in its own SNAPSHOT transaction, then signals to D1 that all is good.

Two things can happen:

  • D1 may fail to save R1 because its local transaction fails. Resource R2 would be left with a dangling blockade.
  • There is a small chance that, after the successful blockade on R2, D2 gets a request to delete R2 while R1 still does not exist, because D1 has not finished its transaction yet. If D2 asks D1 about R1, D1 will say it does not exist. R2 would be deleted, and then R1 may appear.

Therefore, when D2 blocks resource R2, it places a special tentative blockade with a timeout of up to 5 minutes (if I recall the value correctly). This is far more than enough, since transactions are configured to time out after one minute. It means it will not be possible to delete R2 during this period. The protocol then continues:

  • If D1’s transaction fails, D2 is responsible for asynchronously removing the tentative blockade from R2.
  • If D1’s transaction succeeds, then D1 is responsible for asynchronously informing D2 that the tentative blockade on R2 is confirmed (see the sketch below).
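
Below is a rough sketch of how the D2 side might track such tentative blockades; the names and the in-memory map are invented for illustration and do not reflect Goten’s actual implementation.

```go
package blockade

import (
	"sync"
	"time"
)

// tentativeTTL: the flow above mentions a timeout of roughly 5 minutes,
// well above the 1-minute transaction timeout.
const tentativeTTL = 5 * time.Minute

// Blockades tracks tentative blocking back-references placed on local
// resources (R2 in D2) while the remote transaction (in D1) is in flight.
type Blockades struct {
	mu      sync.Mutex
	pending map[string]time.Time // resource name -> expiry of tentative blockade
}

// Block marks the resource as tentatively blocked; Delete must refuse to
// remove it until the blockade is confirmed, released, or expired.
func (b *Blockades) Block(resource string, now time.Time) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.pending == nil {
		b.pending = map[string]time.Time{}
	}
	b.pending[resource] = now.Add(tentativeTTL)
}

// Confirm is called when D1 reports that its transaction committed: the
// blockade becomes a regular blocking back-reference (not modeled here).
func (b *Blockades) Confirm(resource string) { b.drop(resource) }

// Expire drops blockades whose owning transaction never confirmed in time.
func (b *Blockades) Expire(now time.Time) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for res, deadline := range b.pending {
		if now.After(deadline) {
			delete(b.pending, res)
		}
	}
}

func (b *Blockades) drop(resource string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	delete(b.pending, resource)
}
```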

2 - Meta Owner Flow

Understanding the meta owner flow.

Let’s define some terminology:

  • Meta Owner

    A resource that is pointed to by a Meta owner reference object.

  • Meta Ownee

    A resource that points to another resource via the metadata.owner_references field.

  • Meta Owner Deployment

    The Deployment to which the Meta Owner belongs.

  • Meta Ownee Deployment

    The Deployment to which the Meta Ownee belongs.

  • Meta Owner Reference

    An item in the metadata.owner_references array field.

We have three known cases where action is required:

  1. The API Server calls the Save method of the Store, and the saved resource has non-empty meta owner references. The API Server must schedule asynchronous tasks to be executed after the resource is saved locally (we trust that the meta owner refs are valid). Then, asynchronously:

    • the Deployment owning the meta ownee resource must periodically check whether the meta owners exist in the target Deployments.
    • if after some timeout it is detected that a meta owner reference is not valid, then it must be removed. If this empties the meta owner refs array, the whole resource must be deleted.
    • if a meta owner reference is valid, the Deployment with the meta ownee resource is responsible for sending a notification to the Deployment with the meta owner resource.
    • if the Deployment with the meta ownee detects (during validation) that the version of a meta owner reference is too old, then it must upgrade it.

    Note that in this flow the Deployment with the meta ownee resource is the actor initiating the action; it must ask the Deployments with the meta owners whether the references of its meta ownee are valid.

  2. The API Server calls the Save method of the Store, and the saved resource is known to be the meta owner of some resources in various Deployments. In this case, the meta owner Deployment is responsible for the following asynchronous actions:

    • it must iterate over the Deployments where meta ownees may be, and verify whether they are affected by the latest save. If not, no action is needed. Why might meta ownees be affected? See the points below:
    • sometimes a meta owner reference has a flag saying that the meta owner must have a schema reference to the meta ownee resource. If this is the case, and we see that the meta owner lost its reference to a meta ownee, the meta ownee must be forced to clean up its meta owner refs. This may trigger its deletion.
    • if there was a Meta Owner Deployment version upgrade, this Deployment is responsible for updating all meta ownee resources: meta ownees must have meta owner references using the current version of the target Deployment.
  3. The API Server calls the Delete method of the Store, and the deleted resource is KNOWN to be the meta owner of some resources in various Deployments. The Deployment owning the deleted meta owner resource is responsible for the following asynchronous actions:

    • it must iterate over the Deployments where meta ownees may exist, and list them.
    • for each meta ownee, the Meta Owner Deployment must notify the Meta Ownee Deployment about the deletion.
    • the API Server of the meta ownee Deployment is responsible for removing the meta owner reference from the array. This may trigger the deletion of the meta ownee if there are no more meta owner references.

Note that all these flows are pretty much asynchronous, but they still ensure the consistency of meta owner references. In some cases it is the meta owner Deployment reaching out, in others the other way around; it depends on which resource was updated last. A sketch of the first case follows.
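
As an illustration of the first case (the meta ownee Deployment verifying its owners), here is a hedged sketch; the Checker interface and all helper functions are hypothetical, not Goten’s actual API.

```go
package metaowner

import (
	"context"
	"time"
)

// OwnerRef is an illustrative meta owner reference.
type OwnerRef struct{ Region, Service, Version, Kind, Name string }

// Checker abstracts calls to the Deployments that own the referenced resources.
type Checker interface {
	OwnerExists(ctx context.Context, ref OwnerRef) (bool, error)
	NotifyOwner(ctx context.Context, ref OwnerRef, owneeName string) error
}

// reconcileOwnee sketches case 1 above: the meta-ownee Deployment asks the
// meta-owner Deployments whether the owners still exist, drops stale
// references after a grace period, and deletes the ownee when the
// owner_references array becomes empty.
func reconcileOwnee(ctx context.Context, c Checker, owneeName string,
	refs []OwnerRef, firstSeen time.Time, grace time.Duration,
	update func(remaining []OwnerRef) error, del func() error) error {

	var remaining []OwnerRef
	for _, ref := range refs {
		ok, err := c.OwnerExists(ctx, ref)
		if err != nil {
			return err // retry later
		}
		switch {
		case ok:
			remaining = append(remaining, ref)
			// Valid reference: notify the owner Deployment about this ownee.
			if err := c.NotifyOwner(ctx, ref, owneeName); err != nil {
				return err
			}
		case time.Since(firstSeen) < grace:
			// The owner may simply not exist *yet*; keep the ref until timeout.
			remaining = append(remaining, ref)
		default:
			// Timed out: drop the invalid reference.
		}
	}
	if len(remaining) == 0 {
		return del() // no owners left: the ownee itself must be deleted
	}
	if len(remaining) != len(refs) {
		return update(remaining)
	}
	return nil
}
```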

3 - Cascade Deletion Flow

Understanding the cascade deletion flow.

When some resource is deleted and the API Server accepts the deletion, it means there are no blocking references anywhere; this is ensured. However, there may be resources pointing to the deleted one with asynchronous cascade deletion (or unset) behavior.

In these flows we talk only about schema references; meta owner references are fully covered already.

When a Deployment deletes some resource, all Deployments affected by this deletion must take asynchronous action. It means that if Deployment D0-1 from Service S0 imports Services S1 and S2, and S1 + S2 have Deployments D1-1, D1-2, D2-1, and D2-2, then D0-1 must maintain four real-time watches asking for any deletions it needs to handle! In some cases, I remember a service importing five others; with 50 regions, that would mean 250 watch instances, but such a deployment would be very large, with sufficient resources for the goroutines.
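
A sketch of this watch fan-out, assuming an invented Watcher interface rather than Goten’s actual APIs:

```go
package cascadewatch

import "context"

// Deployment identifies a Service + Region pair; illustrative only.
type Deployment struct{ Service, Region string }

type Watcher interface {
	// WatchDeletions streams names of resources deleted in a remote
	// Deployment that may affect local resources (cascade delete or unset).
	WatchDeletions(ctx context.Context, target Deployment) (<-chan string, error)
}

// openDeletionWatches opens one real-time watch per Deployment of every
// imported Service; e.g. importing S1 and S2 with two regions each yields
// four watches, as in the example above.
func openDeletionWatches(ctx context.Context, w Watcher, imported []Deployment) (map[Deployment]<-chan string, error) {
	watches := make(map[Deployment]<-chan string, len(imported))
	for _, dep := range imported {
		ch, err := w.WatchDeletions(ctx, dep)
		if err != nil {
			return nil, err
		}
		watches[dep] = ch
	}
	return watches, nil
}
```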

Suppose that D1-1 had some resource RX that was deleted. The following happens:

  • D1-1 must notify all interested Deployments that RX is deleted, by inspecting the back-reference sources.
  • Suppose that RX had some back-references in Deployment D0-1; Deployment D1-1 can see that.
  • After notifying D0-1, D1-1 periodically checks whether there are still active back-references from D0-1.
  • Deployment D0-1, which points to D1-1 as an importer, is notified about the deleted resource.
  • D0-1 grabs all local resources that need cascade deletion or unset. For unsets, it needs to execute regular updates. For deletions, it needs to delete (or mark for deletion if there are still some other back-references pointing, which may be blocking).
  • Once D0-1 deals with all local resources pointing to RX, it is done; it has no more work.
  • At some point, D1-1 will ask D0-1 whether RX still has back-references. If not, D0-1 will confirm that all is clear, and D1-1 will finally clean up what remains of RX. A sketch of D0-1’s side of this flow follows.
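
Here is a sketch of the importing Deployment’s (D0-1) reaction to such a deletion, again with invented types; mark-for-deletion handling is folded into the Delete call for brevity.

```go
package cascade

import "context"

// ReactionKind mirrors the non-blocking schema reference behaviors.
type ReactionKind int

const (
	Unset ReactionKind = iota
	CascadeDelete
)

// BackRef is a local resource in D0-1 that pointed at the deleted RX.
type BackRef struct {
	Name     string
	Reaction ReactionKind
}

// Local abstracts the local store of the importing Deployment (D0-1).
type Local interface {
	BackRefsTo(ctx context.Context, deleted string) ([]BackRef, error)
	ClearRef(ctx context.Context, name, deleted string) error // regular update
	Delete(ctx context.Context, name string) error            // or mark-for-deletion
}

// handleRemoteDeletion sketches D0-1's reaction to RX being deleted in D1-1.
// Once it returns without error, D0-1 can answer D1-1's periodic question
// "does RX still have back-references?" with "no", letting D1-1 clean up.
func handleRemoteDeletion(ctx context.Context, l Local, deleted string) error {
	refs, err := l.BackRefsTo(ctx, deleted)
	if err != nil {
		return err
	}
	for _, r := range refs {
		switch r.Reaction {
		case Unset:
			err = l.ClearRef(ctx, r.Name, deleted)
		case CascadeDelete:
			err = l.Delete(ctx, r.Name)
		}
		if err != nil {
			return err
		}
	}
	return nil
}
```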

Note that:

  • This deletion spree may go deep for large object deletions, like projects; it may involve multiple levels of Deployments and Services.

  • If there is an error in the schema, some pending deletion may be stuck forever. By error in the schema, we mean situations like:

    • Resource A is deleted, and is back-referenced from B and C (async cascade delete).
    • Normally B and C should be deleted, but there may be a problem if C is, say, blocked by D, and D has no relationship with A, so it will never be deleted. In this case, B is deleted, but C is stuck, blocked by D. Unfortunately, as of now Goten does not detect schema errors like this; perhaps it would be a good idea, although it is not clear whether it is possible.
    • It will be the service developers’ responsibility to fix schema errors.
  • In the flow above, D0-1 imports the Service to which D1-1 belongs. Therefore, we know that D0-1 knows the full service schema of D1-1, but not the other way around. We need to consider this when D1-1 asks D0-1 whether RX still has back-references.

4 - Multi-Region Sync Flow

Understanding the multi-region synchronization flow.

First, each Deployment must keep updating metadata.syncing for all resources it owns. To watch owned resources, it must:

  • WATCH <Resource> WHERE metadata.syncing.owningRegion = <SELF>.

    It will be getting updates in real-time.

The API Server already ensures that a resource has its metadata.syncing field synced on update! However, we have an issue when a MultiRegionPolicy object changes. This is where the Deployment must asynchronously update all resources that are subject to this policyholder. It must therefore send Watch requests for ALL resources that can be policyholders. For example, a Deployment of iam.edgelq.com will need to have three watches:

  1. Watch Projects WHERE multi_region_policy.enabled_regions CONTAINS <MyRegion>

    by iam.edgelq.com service.

  2. Watch Organizations WHERE multi_region_policy.enabled_regions CONTAINS <MyRegion>

    by iam.edgelq.com service.

  3. Watch Services WHERE multi_region_policy.enabled_regions CONTAINS <MyRegion>

    by meta.goten.com service.

A simpler service like devices.edgelq.com would need to watch only Projects, because it has no other resources subject to such policies.

A Deployment needs to watch only the policyholders relevant to its region.

The flow is now the following:

  • When a Deployment gets a notification about an update of a MultiRegionPolicy, it needs to accumulate all resources subject to this policy.
  • Then it needs to send an Update request for each of them; the API Server ensures that metadata.syncing is updated accordingly.

The above ensures that metadata.syncing stays up-to-date (see the sketch below).
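
A sketch of this asynchronous reaction to a MultiRegionPolicy change, where ListUnderPolicy and Touch are hypothetical helpers rather than Goten APIs:

```go
package syncing

import "context"

// PolicyHolder is an illustrative stand-in for a resource holding a
// MultiRegionPolicy (Project, Organization, Service).
type PolicyHolder struct {
	Name           string
	EnabledRegions []string
}

type Store interface {
	// ListUnderPolicy returns names of all resources governed by the given
	// policy holder (e.g. all resources under a Project).
	ListUnderPolicy(ctx context.Context, holder string) ([]string, error)
	// Touch issues a regular Update; the API Server recomputes
	// metadata.syncing from the current MultiRegionPolicy on save.
	Touch(ctx context.Context, name string) error
}

// onPolicyChange is the asynchronous reaction to a MultiRegionPolicy update:
// every resource subject to the policy holder gets re-saved so that its
// metadata.syncing field is refreshed.
func onPolicyChange(ctx context.Context, s Store, holder PolicyHolder) error {
	names, err := s.ListUnderPolicy(ctx, holder.Name)
	if err != nil {
		return err
	}
	for _, name := range names {
		if err := s.Touch(ctx, name); err != nil {
			return err
		}
	}
	return nil
}
```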

The next part is the actual multi-region syncing. In this case, Deployments of each Service MUST have one active watch on all other Deployments from the same family. For example, if we have iam.edgelq.com in regions japaneast, eastus2, and us-west2, then the following watches must be maintained:

Deployment of iam.edgelq.com in us-west2 has two active watches, one sent to the japaneast region, the other to eastus2:

  • WATCH <Resources> WHERE metadata.syncing.owningRegion = japaneast AND metadata.syncing.regions CONTAINS us-west2
  • WATCH <Resources> WHERE metadata.syncing.owningRegion = eastus2 AND metadata.syncing.regions CONTAINS us-west2

Deployments in japaneast and eastus2 will each have two similar watches. We have a full mesh of connections.

Then, when some resource in us-west2 gets created with metadata.syncing.regions = [eastus2, japaneast], one copy will be sent to each of these regions. Those regions must be executing this work pretty much continuously.

Now, on startup, it is necessary to mention the following procedure:

  • The Deployment should list all currently held resources that are owned by other regions but syncable locally.
  • It should grab a snapshot of these resources from the other regions and check whether anything is missing, or whether it holds too much (a missed deletion). If so, it should execute the missing actions to bring the system back in sync.
  • During the initial snapshot comparison, it is still valuable to keep copying real-time updates from other regions; it may take some time for the snapshot to be completed (see the reconciliation sketch below).
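
The startup reconciliation could be sketched as follows; the revision field is an illustrative stand-in for whatever the real implementation compares:

```go
package regionsync

import "context"

// Resource is a minimal stand-in for a synced resource copy.
type Resource struct {
	Name     string
	Revision int64 // illustrative monotonic revision
}

// reconcileSnapshot sketches the startup procedure: local read copies owned by
// another region are compared against a snapshot fetched from that region;
// missing or outdated copies are saved, and leftovers (missed deletions) are
// removed. Real-time watch events keep flowing while this runs.
func reconcileSnapshot(ctx context.Context, local, remote []Resource,
	save func(context.Context, Resource) error,
	del func(context.Context, string) error) error {

	remoteByName := make(map[string]Resource, len(remote))
	for _, r := range remote {
		remoteByName[r.Name] = r
	}
	localByName := make(map[string]Resource, len(local))
	for _, r := range local {
		localByName[r.Name] = r
	}
	for name, r := range remoteByName {
		if have, ok := localByName[name]; !ok || have.Revision < r.Revision {
			if err := save(ctx, r); err != nil { // missing or outdated copy
				return err
			}
		}
	}
	for name := range localByName {
		if _, ok := remoteByName[name]; !ok {
			if err := del(ctx, name); err != nil { // missed deletion
				return err
			}
		}
	}
	return nil
}
```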

5 - Database Migration Flow

Understanding the database migration flow.

When a Deployment boots up after an image upgrade, it will detect that the currently active version is lower than the version it can support. In that case, the API Server will serve the older version normally, but the new API version will become available in read-only mode. The Deployment is responsible for asynchronously syncing the higher-version database with the current-version database in the background. Clients are expected to use the older version anyway, so they won’t necessarily see the incomplete higher version. Besides, this is fine, because what matters is the current version pointed to by the Deployment.
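
A rough sketch of this boot-time decision, with the Deployment and Syncer shapes assumed purely for illustration:

```go
package migration

import "context"

// Deployment versions as seen after an image upgrade; illustrative only.
type Deployment struct {
	ActiveVersion    string // version clients should use, e.g. "v1"
	SupportedVersion string // highest version the new image can serve, e.g. "v2"
}

type Syncer interface {
	// CopyForward keeps the higher-version database in sync with the
	// currently active one, in the background.
	CopyForward(ctx context.Context, from, to string) error
}

// onBoot sketches the decision made after an image upgrade: keep serving the
// active version read-write, expose the higher version read-only, and start
// the silent background copy.
func onBoot(ctx context.Context, d Deployment, s Syncer, setReadOnly func(version string)) error {
	if d.SupportedVersion == d.ActiveVersion {
		return nil // nothing to migrate
	}
	setReadOnly(d.SupportedVersion) // new API version is visible, but read-only
	return s.CopyForward(ctx, d.ActiveVersion, d.SupportedVersion)
}
```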

It is expected that all Deployments will get the new images before we start switching to the next versions. Each Deployment will be responsible for the silent copying.

For the multi-region case, when multiple Deployments of the same service are on version v1 but run on images that can support version v2, they will still be synced with each other, on both versions: v1 and v2. While images are being deployed region by region (Deployment by Deployment), they may experience Unimplemented error messages, but only until images are updated in all regions. We may improve this and try to detect “available” versions first, before making cross-region watches.

Anyway, it will be required that new images are deployed to all regions before the upgrade procedure is triggered on any Regional deployment.

The upgrade can then be done one Deployment at a time, using the procedure described in the migration section of the developer guide.

When one Deployment is officially upgraded to the new version while the others still primarily use the old version, all Deployments still watch each other on both versions, for the sake of multi-region syncing. However, a Deployment using the newer version may already opt out of pulling older-API resources from other Deployments at this point.

Meta owner references are owned by the Deployment they point to. It means that they are upgraded asynchronously after that Deployment switches to the newer version.