Goten Design

Understanding the core concept of the Goten design.

The Goten framework is designed for services that:

  • span across multiple clusters in different regions
  • run with different versions at the same time

which means, Goten is aware of:

  • multi-services
  • multi-regions
  • multi-versions

We must think of it as a protocol between those entities.

Protocol, because there must be some established communication that enforces database schema stability despite operating in this three-dimensional environment. We can’t rely on any database features; even global databases with regional replication would not work, because services are not even guaranteed to run on the same database backend. Goten was already shown to be a kind of language on top of protobuf because of its extended types. Now we see it also needs a protocol on top of gRPC, to ensure global correctness.

Since this is all integrated, we will also describe how multi-region design works from an implementation point of view. We assume you have a basic knowledge of the multi-region design as explained in the developer guide, which describes:

  1. regional resources
  2. MultiRegionPolicy object
  3. the region information included in the name field

The developer guide also explains the multi-version concept in the migration section.

With that knowledge in place, we will discuss four important concepts:

  1. Meta service as the service registry service
  2. EnvRegistry as the service discovery object
  3. the resource metadata for the service synchronization
  4. the multi-region policy store

with the protocol call flows and the actual implementation.

1 - Goten Design Concepts

Understanding the Goten design concepts.

1.1 - Meta Service as Service Registry

Understanding the role of Meta service as a service registry.

To build a multi-service framework, we first need a special service that acts as a service registry. Using it, we must be able to discover:

  • List of existing Regions
  • List of existing Services
  • List of existing Resources per Service
  • List of existing regional Deployments per Service.

This is provided by the meta.goten.com Service, in the Goten repository, directory meta-service. It follows the typical structure of any service, but has no cmd directory or fixtures, as Goten provides only the basic parts. The final implementation is in the edgelq repository, see directory meta. The SPEKTRA Edge version of meta contains an old version of the service, v1alpha2, which is obsolete and irrelevant to this document; ignore the v1alpha2 elements.

Still, the resource model for the Meta service resides in the Goten repository; see its protobuf files. For Goten, we made the following design decisions, which are reflected in the fields defined in those protobuf files (you can and should review them).

  • The list of regions in the Meta service must show all possible regions where services can be deployed, not necessarily where they are deployed.
  • Each Service must be fairly independent. It must be able to specify its global network endpoint where it is reachable. It must display a list of API versions it has. For each API version, it must tell which services it imports, and which versions of them. It must tell what services it would like to use as a client too (but not import).
  • Every Deployment describes an instance of a service in a region. It must be able to specify its regional network endpoint and tell which service version it operates on (current maximum version). It is assumed it can support lower versions too. Deployments for a single service do not need to upgrade at once to the new version, but it’s recommended to not wait too long.
  • Deployments can be added to a Service dynamically, meaning, service owners can expand by just adding new Deployment in Meta service.
  • Each Service manages its multi-region setup. Meaning: each Service decides which region is “primary” for it, and the list of Deployment resources describes which regions are available.
  • Each region manages its network endpoints, but it is recommended to have the same domain for global and regional endpoints, and each regional endpoint has a region ID as part of a subdomain, before the main part.
  • For Service A to import Service B, we require that Service B is available in all regions where Service A is deployed. This should be the only limitation Services must follow for multi-region setup.

All those design decisions are reflected in the protobuf files and in the server implementation (custom middlewares); see the custom middlewares under meta-service/server/v1/ in the goten repository, they are fairly simple.
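
As an illustration only, the sketch below mirrors these design decisions with hand-written Go structs. The real types are generated from the Meta service protobuf files in the goten repository; the field names and the example endpoint here are hypothetical.

package main

import "fmt"

// Hypothetical mirrors of the Meta registry resources described above.
// Region lists a location where services may (but do not have to) be deployed.
type Region struct {
	ID string // e.g. "us-west2"
}

// Service describes a single service: its global endpoint, its primary region,
// and, per API version, which services it imports or merely uses as a client.
type Service struct {
	Name           string // e.g. "custom.edgelq.com"
	GlobalEndpoint string
	PrimaryRegion  string
	Versions       []ServiceVersion
}

type ServiceVersion struct {
	Version          string   // e.g. "v1"
	ImportedServices []string // imported services (and implicitly their versions)
	UsedServices     []string // used as a client only, not imported
}

// Deployment is an instance of a Service in one region; it declares its
// regional endpoint and the current (maximum) API version it operates on.
type Deployment struct {
	Service          string
	Region           string
	RegionalEndpoint string
	CurrentVersion   string
}

func main() {
	d := Deployment{
		Service:          "custom.edgelq.com",
		Region:           "us-west2",
		RegionalEndpoint: "custom.us-west2.apis.example.com", // region ID as a subdomain
		CurrentVersion:   "v1",
	}
	fmt.Printf("%s runs in %s at %s (version %s)\n", d.Service, d.Region, d.RegionalEndpoint, d.CurrentVersion)
}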

For SPEKTRA Edge, design decisions are that:

  • All core SPEKTRA Edge services (iam, meta adaptation, audit, monitoring, etc.) are always deployed to all regions and are deployed together.
  • It means that 3rd party services can always import any SPEKTRA Edge core service, because it is guaranteed to be in all regions needed by the 3rd party.
  • All core SPEKTRA Edge services will point to the same primary region.
  • All core SPEKTRA Edge services will have the same network domain: iam.apis.edgelq.com, monitoring.apis.edgelq.com, etc. Replacing the first word (the service name) with another yields another valid endpoint.
  • If core SPEKTRA Edge services are upgraded in some regions, then they will be upgraded at once.
  • All core SPEKTRA Edge services will be public: Anyone authenticated will be able to read its roles, permissions, and plans, or be able to import them.
  • All 3rd party services will be assumed to be users of core SPEKTRA Edge services (no cost if no actual use).
  • Service resources can be created by a ServiceAccount only. It is assumed that it will be managing this Service.
  • A Service will belong to the Project to which the ServiceAccount that created it belongs.

Users may think of core edgelq services as a service bundle. Most of these SPEKTRA Edge rules are declarations, but I believe deployment workflows are enforcing this anyway. The decision that all 3rd parties are considered users of all core SPEKTRA Edge services, and that each Service must belong to some project, is reflected in additional custom middleware we have for the meta service in the edgelq repository, see file meta/server/v1/service/service_service.go. In this extra middleware, executed before the custom middleware in the goten repository (meta-service/server/v1/service/service_service.go), we add the core SPEKTRA Edge services to the used services array. We also assign a project owning the Service; this is where its ServiceAccounts are managed and where usage metrics will go.

This concludes how the Meta service works: it is where we can find information about services and the relationships between them.

1.2 - EnvRegistry as Service Discovery

Understanding the role of EnvRegistry module in Meta service.

Meta service provides an API allowing inspection of the global environment, but we also need a companion library, called EnvRegistry:

  • It must allow a Deployment to register itself in a Meta service, so others can see it.
  • It must allow the discovery of other services with their deployments and resources.
  • It must provide a way to obtain real-time updates of what is happening in the environment.

Those three items above are the responsibilities of EnvRegistry module.

In the goten repo, this module is defined in the runtime/env_registry/env_registry.go file.

As of now, it can only be used by server, controller, and db-controller runtimes. It may someday be beneficial for client runtimes too, but then we would opt out of the “registration” responsibility, because a client is not part of the backend and cannot self-register in the Meta service.

One of the design decisions regarding EnvRegistry is that it must block till initialization is completed, meaning:

  • User of EnvRegistry instance must complete self-registration in Meta Service.
  • EnvRegistry must obtain the current state of services and deployments.

Note that no backend service works in isolation; as part of the Goten design, it is essential that:

  • any backend runtime knows its surroundings before executing its tasks.
  • all backend runtimes must be able to see other services and deployments, which are relevant for them.
  • all backend runtimes must initialize and run the EnvRegistry component and it must be one of the first things to do in the main.go file.

This means that a backend service that cannot successfully pass initialization is blocked from doing any useful work. If you check all run functions in EnvRegistry, you should see they lead to the runInBackground function. It runs several goroutines and then waits for a signal showing all is fine. After this, EnvRegistry can be safely used to find other services and deployments, and to make networking connections.
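
Below is a minimal sketch of that startup pattern, assuming a hypothetical constructor and a trimmed-down interface; the real EnvRegistry in runtime/env_registry/env_registry.go has a richer API. It only illustrates the blocking initialization that must happen before any other backend work.

package main

import (
	"context"
	"log"
	"time"
)

type Deployment struct{ Service, Region string }
type Service struct{ Name string }

// EnvRegistry here is a trimmed-down, hypothetical view: the only property we
// rely on is that construction blocks until self-registration in Meta and the
// initial snapshot of Services and Deployments are complete.
type EnvRegistry interface {
	MyDeployment() Deployment
	MyService() Service
}

// newEnvRegistry stands in for the real constructor; assume it self-registers
// the current Deployment and returns only after initialization succeeded.
func newEnvRegistry(ctx context.Context, svc, region string) (EnvRegistry, error) {
	return stubRegistry{Deployment{svc, region}}, nil
}

type stubRegistry struct{ d Deployment }

func (s stubRegistry) MyDeployment() Deployment { return s.d }
func (s stubRegistry) MyService() Service       { return Service{s.d.Service} }

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// One of the first things in main.go: block until EnvRegistry is ready.
	// If this fails, the server/controller must not proceed with other work.
	envReg, err := newEnvRegistry(ctx, "custom.edgelq.com", "us-west2")
	if err != nil {
		log.Fatalf("env registry initialization failed: %v", err)
	}
	log.Printf("running as %+v", envReg.MyDeployment())

	// ... only now build stores, servers, and controllers that need discovery ...
}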

This also guarantees that Meta service contains relevant records for services, in other words, EnvRegistry registration initializes regions, services, deployments, and resources. Note, however:

  • The region resources can be created/updated by the meta.goten.com service only. Since meta is the first service, it is responsible for initializing this resource.
  • The service resource is created by the first deployment of a given service. So, if we release custom.edgelq.com for the first time, in the first region, it will send a CreateService request. The next deployment of the same service, in the next region, will just send UpdateService. This update must carry a new MultiRegionPolicy, whose enabled regions field contains the new region ID.
  • Each deployment is responsible for its deployment resource in Meta.
  • All deployments for a given service are responsible for Resource instances. If a new service is deployed with the server, controller, and db-controller pods, then they may initially be sending clashing create requests. We are fine with those minor races there, since transactions in Meta service, coupled with CAS requests made by EnvRegistry, ensure eventual consistency.

Visit the runInit function, which is one of the goroutines of EnvRegistry executed by runInBackground. It contains the procedure for registering Meta resources and finishes after a successful run.

From this process, another emerging design property of EnvRegistry is that it is aware of its context: it knows which Service and Deployment it is associated with. Therefore, it has getters for its own Deployment and Service.

Let’s stay for a while in this run process, as it shows other goroutines that are run forever:

  • One goroutine keeps running runDeploymentsWatch
  • Second goroutine keeps running runServicesWatch
  • The final goroutine is the main one, runMainSync

We don’t need real-time watch updates of regions and resources; we need services and their regional deployments only. Normally a watch requires a separate goroutine, and it is the same case here. To synchronize event processing across multiple real-time updates, we need a “main synchronization loop” that unites all Go channels (a sketch of this loop follows the list below).

In the main sync goroutine, we:

  • Process changes detected by runServicesWatch.
  • Process changes detected by runDeploymentsWatch.
  • Catch initialization signal from the runInit function, which guarantees information about our service is stored in Meta.
  • Attachment of new real-time subscribers. When they attach, they must get a snapshot of past events.
  • Detachment of real-time subscribers.
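
A minimal sketch of such a main synchronization loop is shown below. The channel and type names are hypothetical rather than the real EnvRegistry internals; it only illustrates how a single goroutine serializes watch events, the init signal, and subscriber attach/detach.

package main

import (
	"context"
	"fmt"
)

type event struct{ kind, name string }

type registry struct {
	serviceEvents    chan event        // changes found by runServicesWatch
	deploymentEvents chan event        // changes found by runDeploymentsWatch
	initDone         chan struct{}     // our own records are stored in Meta
	attach           chan chan []event // new subscriber; receives a snapshot of past events
	detach           chan chan []event
}

func (r *registry) runMainSync(ctx context.Context) {
	var snapshot []event
	subscribers := map[chan []event]bool{}
	for {
		select {
		case e := <-r.serviceEvents:
			snapshot = append(snapshot, e)
		case e := <-r.deploymentEvents:
			snapshot = append(snapshot, e)
		case <-r.initDone:
			fmt.Println("initialization completed")
		case sub := <-r.attach:
			subscribers[sub] = true
			sub <- append([]event(nil), snapshot...) // past events first
		case sub := <-r.detach:
			delete(subscribers, sub)
		case <-ctx.Done():
			return
		}
	}
}

func main() {
	r := &registry{
		serviceEvents:    make(chan event),
		deploymentEvents: make(chan event),
		initDone:         make(chan struct{}, 1),
		attach:           make(chan chan []event),
		detach:           make(chan chan []event),
	}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	go r.runMainSync(ctx)
	r.initDone <- struct{}{} // signal from runInit
}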

As an additional note: since EnvRegistry is self-aware, it gets only the Services and Deployments that are relevant. Those are:

  • Services and Deployments of its Service (obviously)
  • Services and Deployments that are used/imported by the current Service
  • Services and Deployments that are using the current Service

The last two points are important: they mean that EnvRegistry for the top service (like meta.goten.com) is aware of all Services and Deployments. Services at higher levels will see all those below or above them, but they won’t be able to see their “neighbors”. The higher up the tree, the fewer services there are above and the more below, but the proportion of neighbors will be higher and higher.

It should not be a problem, though, unless we reach the scale of thousands of Services; core SPEKTRA Edge services will, however, be under more pressure than all upstream ones, for various reasons.

In the context of SPEKTRA Edge, we made additional implementation decisions, when it comes to SPEKTRA Edge platform deployments:

  • Each service, except meta.goten.com itself, must connect to the regional meta service in its EnvRegistry.

    For example, iam.edgelq.com in us-west2, must connect to Meta service in us-west2. Service custom.edgelq.com in eastus2 must connect to Meta service in eastus2.

  • The server instance of meta.goten.com must use a local-mode EnvRegistry. The reason is that it can’t connect to itself via its API, especially since it must succeed in EnvRegistry initialization before running its API server.

  • The DbController instance of meta.goten.com is special, and shows the asymmetric nature of SPEKTRA Edge core services regarding regions. As a whole, core SPEKTRA Edge services point to the same primary region; any other region is secondary. Therefore, the DbController instance of meta.goten.com must:

    • In the primary region, connect to the API server of meta.goten.com in the primary region (intra-region)
    • In the secondary region, connect to the API server of meta.goten.com in the primary region (the secondary region connects to the primary).

Therefore, when we add a new region, the meta-db-controller in the secondary region registers itself in the primary region’s Meta service. This way the primary region becomes aware of the new region’s creation. There is more to the choice of meta-db-controller for this responsibility: meta-db-controller will also be responsible for syncing the secondary region’s meta database from the primary one. This will be discussed in a following section of this guide. For now, we have just described the conventions for where EnvRegistry must source its information from.

1.3 - Resource Metadata

Understanding the resource metadata for the service synchronization

As a protocol, Goten needs to have protocol-like properties. One of them is the requirement that the resource types of all Services managed by Goten must contain metadata objects. It was already mentioned multiple times, but let’s link the Meta object again: https://github.com/cloudwan/goten/blob/main/types/meta.proto.

A resource type managed by Goten must satisfy these interface methods (you can see them in the Resource interface defined in the runtime/resource/resource.go file):

GetMetadata() *meta.Meta
EnsureMetadata() *meta.Meta

There is, of course, the option to opt out: the Descriptor interface has the method SupportsMetadata() bool. If it returns false, the resource type is not managed by Goten and is omitted from the Goten design! However, it is important to be able to recognize whether a resource type is subject to this design, including programmatically.
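
The sketch below shows this programmatic check with trimmed-down, hypothetical mirrors of the Resource and Descriptor interfaces (only the methods discussed here); the real definitions live in runtime/resource/resource.go.

package main

import "fmt"

// Meta stands in for the object defined in types/meta.proto (syncing,
// services, owner references, and so on are omitted here).
type Meta struct{}

type Resource interface {
	GetMetadata() *Meta    // may return nil if metadata was never set
	EnsureMetadata() *Meta // lazily creates and returns the metadata object
}

type Descriptor interface {
	SupportsMetadata() bool // false => the type opted out of the Goten design
}

// touchMetadata only works on resource types that are subject to this design.
func touchMetadata(d Descriptor, r Resource) {
	if !d.SupportsMetadata() {
		fmt.Println("resource type is not managed by Goten, skipping")
		return
	}
	meta := r.EnsureMetadata()
	fmt.Printf("metadata present: %v\n", meta != nil) // populate syncing/services here
}

type exampleDescriptor struct{ withMetadata bool }

func (d exampleDescriptor) SupportsMetadata() bool { return d.withMetadata }

type exampleResource struct{ meta *Meta }

func (r *exampleResource) GetMetadata() *Meta { return r.meta }
func (r *exampleResource) EnsureMetadata() *Meta {
	if r.meta == nil {
		r.meta = &Meta{}
	}
	return r.meta
}

func main() {
	touchMetadata(exampleDescriptor{withMetadata: true}, &exampleResource{})
	touchMetadata(exampleDescriptor{withMetadata: false}, &exampleResource{})
}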

To summarize, as protocol, Goten requires resources to satisfy this interface. It is important to note what information is stored in resource metadata in the context of the Goten design:

  • The field syncing of type SyncingMeta must always describe which region owns a resource and which regions have a read copy of it. SyncingMeta must always be populated for each resource, regardless of type.

  • The field services of type ServicesInfo must tell us which service owns a given resource, and the list of services for which this resource is relevant. Unlike syncing, services may not necessarily be populated; the Service defining the resource type is responsible for explaining how it works in that case. In the future this may slightly change:

    If services is not populated at the moment of resource save, it will point to the current service as owning, and allowed services will be a one-element array containing the current service too. This in fact should be assumed by default, but it is not enforced globally, which we will explain now.

First, service meta.goten.com always ensures that the services field is populated for the following cases:

  • Instances of meta.goten.com/Service must have ServicesInfo where:
    • Field owning_service is equal to the current service itself.
    • Field allowed_services contains the current service, all imported/used services, AND all services using/importing this service! Note that this may be dynamically changing: if a new service is deployed, it will update the ServicesInfo fields of all services it uses/imports.
  • Instances of meta.goten.com/Deployment and meta.goten.com/Resource must have their ServicesInfo synchronized with parent meta.goten.com/Service instance.
  • Instances of meta.goten.com/Region do not have ServicesInfo typically populated. However, in the SPEKTRA Edge context, we have a public RoleBinding that allows all users to read from this collection (but never write). Because of this private/public nature, there was no need to populate service information there.

Note that this implies that service meta.goten.com is responsible for syncing ServicesInfo of meta.goten.com/Deployment and meta.goten.com/Resource instances. It is done by a controller implemented in the Goten repository: meta-service/controller directory. It is relatively simple.

However, while meta.goten.com can detect what ServicesInfo should be populated, this is often not the case elsewhere. For example, when the service iam.edgelq.com receives a CreateServiceAccount request, it does not necessarily know whom this ServiceAccount is for. Multiple services may own ServiceAccount resources, yet the resource type itself does not have a dedicated “service” field in its schema. The only way services can annotate ServiceAccount resources is by providing the necessary metadata information. Furthermore, if some custom service wants to make a ServiceAccount instance visible to other services, it may need to provide multiple items in the allowed_services array. This should explain why service information must be determined at the business logic level. For this reason, it is allowed to have empty service information, but in many cases SPEKTRA Edge will enforce its presence where business logic requires it.

Then, the situation for the other meta field, syncing, is much easier: its value can be determined on the schema level. Instructions are already given in the multi-region design section of the developer guide.

The regions setup can always be determined from the resource name alone:

  • If it is a regional resource (has a region/ segment in the name), the name strictly tells which region owns it. The list of regions that get a read-only copy is decided by the resource name properties described below.
  • If it contains a well-known policy-holder in the name, then the policy-holder defines what regions get a read copy. If the resource is non-regional, then MultiRegionPolicy also tells what region owns it (default control region).
  • If the resource is not subject to MultiRegionPolicy (like Region, or User in iam.edgelq.com), then it is a subject of MultiRegionPolicy defined in the relevant meta.goten.com/Service instance (for this service).

Now the trick is: All policy-holder resources are well-known. Although we try not to hardcode anything anywhere, Goten provides utility functions for detecting if a resource contains a MultiRegionPolicy field in its schema. This also must be defined in the Goten specification. By detecting what resource types are policy-holders, Goten can provide components that can easily extract regional information from a given resource by its name only.
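
The sketch below illustrates that decision order only; it is not the real Goten utility, which is driven by the service specification rather than by a hard-coded set of policy-holder segments.

package main

import (
	"fmt"
	"strings"
)

// Hypothetical set of well-known policy-holder name segments.
var policyHolderSegments = map[string]bool{
	"projects":      true,
	"organizations": true,
	"services":      true,
}

// owningRegion follows the rules described above: an explicit region segment
// in the name wins; otherwise the policy-holder named in the path decides
// (via its MultiRegionPolicy); otherwise the Service-level policy applies.
func owningRegion(name string, policyControlRegion func(policyHolder string) string, serviceDefaultRegion string) string {
	segs := strings.Split(name, "/")
	for i := 0; i+1 < len(segs); i += 2 {
		if segs[i] == "regions" {
			return segs[i+1] // regional resource: region is part of the name
		}
	}
	for i := 0; i+1 < len(segs); i += 2 {
		if policyHolderSegments[segs[i]] {
			return policyControlRegion(segs[i] + "/" + segs[i+1]) // default control region of the policy
		}
	}
	return serviceDefaultRegion // not subject to any MultiRegionPolicy
}

func main() {
	lookup := func(holder string) string { return "us-west2" } // pretend policy lookup
	fmt.Println(owningRegion("projects/p1/regions/eastus2/secrets/s1", lookup, "japaneast")) // eastus2
	fmt.Println(owningRegion("projects/p1/devices/d1", lookup, "japaneast"))                 // us-west2
	fmt.Println(owningRegion("users/u1", lookup, "japaneast"))                               // japaneast
}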

Versioning information does not need to be specified in the resource body. Given an instance, it is easy to get the Descriptor instance and check the API version. All schema references are clear in this regard too: if resource A has a reference field to resource B, then from the reference object we can get the Descriptor instance of B and get the version. The only place where this is not possible is meta owner references. Therefore, each instance in the metadata.owner_references field must contain the name, owning service, API version, and region (in case it is not provided in the name field). When talking about meta references, it is important to mention other differences compared to schema-level references:

  • schema references are owned by a Service that owns resources with references.
  • meta owner references are owned by a Service to which references are pointing!

This ownership has an implication: when Deployment D1 in Service S1 upgrades from v1 to v2 (for example), and there is some resource X in Deployment D2 from Service S2, and this X has a meta owner reference to some resource owned by D1, then D1 is responsible for sending an Update request to D2 so that the meta owner reference is updated.

1.4 - Multi-Region Policy Store

Understanding the design of the multi-region policy store.

We mentioned MultiRegion policy-holder resources and their importance when it comes to evaluating region syncing information based on the resource name. There is a need for a MultiRegion PolicyStore object that, for any given resource name, returns the managing MultiRegionPolicy object. This object is defined in the Goten repository, file runtime/multi_region/policy_store.go. This file is important for this design and worth remembering. As of now, it returns a nil object for global resources though; in this case the caller should take the MultiRegionPolicy from the EnvRegistry component of the relevant Service.

It uses a cache that accumulates policy objects, so we should normally not use any IO operations, only initially. We have watch-based invalidation, which allows us to have a long-lived cache.

We have some code-generation that provides us functions needed to initialize PolicyStore for a given Service in a given version, but the caller is responsible for remembering to include them (All those main.go files for server runtimes!).

In this file, you can also see functions that set/get a MultiRegionPolicy from a context object. In the multi-region design, server code is required to store the MultiRegionPolicy object in the context if there will be updates to the database!
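
A minimal usage sketch, with hypothetical types and helper names (the real PolicyStore and context helpers live in runtime/multi_region/policy_store.go), could look as follows.

package main

import (
	"context"
	"fmt"
)

type MultiRegionPolicy struct {
	DefaultControlRegion string
	EnabledRegions       []string
}

// PolicyStore returns the governing policy for a resource name; nil means a
// global resource, and the caller falls back to the Service-level policy
// obtained from EnvRegistry.
type PolicyStore interface {
	GetPolicyForName(ctx context.Context, resourceName string) (*MultiRegionPolicy, error)
}

type policyCtxKey struct{}

func withPolicy(ctx context.Context, p *MultiRegionPolicy) context.Context {
	return context.WithValue(ctx, policyCtxKey{}, p)
}

func policyFromContext(ctx context.Context) *MultiRegionPolicy {
	p, _ := ctx.Value(policyCtxKey{}).(*MultiRegionPolicy)
	return p
}

// handleWrite sketches the server-side requirement: resolve the policy for the
// written resource and store it in the context before any database updates,
// so the Store can later populate metadata.syncing.
func handleWrite(ctx context.Context, ps PolicyStore, resourceName string) (context.Context, error) {
	policy, err := ps.GetPolicyForName(ctx, resourceName)
	if err != nil {
		return ctx, err
	}
	return withPolicy(ctx, policy), nil
}

type fakeStore struct{}

func (fakeStore) GetPolicyForName(ctx context.Context, name string) (*MultiRegionPolicy, error) {
	return &MultiRegionPolicy{DefaultControlRegion: "us-west2", EnabledRegions: []string{"us-west2", "eastus2"}}, nil
}

func main() {
	ctx, err := handleWrite(context.Background(), fakeStore{}, "projects/p1/devices/d1")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", policyFromContext(ctx))
}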

2 - Goten Protocol Flows

Understanding the Goten protocol flows.

The design decisions include:

  1. services are isolated, but they can use/import services on lower levels only, and they can support only a subset of regions available from these used/imported services.
  2. deployments within the Service must be isolated in the context of versioning. Therefore, they don’t need to point to the same primary API version and each Service version may import different services in different versions.
  3. references may point across services only if the Service imports another service. References across regions are fine, it is assumed regions for the same Service trust each other, at least for now.
  4. all references must carry region, version, and service information to maintain full global env.
  5. We have schema and meta owner references. Schema refs define a region by name, version, and service by context. Meta refs have separate fields for region, service, and version.
  6. Schema references may be of blocking type, use cascade deletion, or unset.
  7. Meta references must trigger cascade deletion if all owners disappear.
  8. Each Deployment (a Service + Region pair) is responsible for maintaining the metadata.syncing fields of resources it owns.
  9. Each Deployment is responsible for catching up with read-copies from other regions available for them.
  10. Each Deployment is responsible for local database schema and upgrades.
  11. Each Deployment is responsible for Meta owner references in all service regions if they point to the Deployment (via Kind and Region fields!).
  12. Every time cross-region/service references are established, the other side may reject this relationship.

We have several components in API servers and db controllers for maintaining order in this graph. Points one to three are enforced by the Meta service and EnvRegistry components. EnvRegistry uses generated descriptors from the Goten specification to populate the Meta service. If someone is “cheating”, then, per point twelve, the other side may reject it.

2.1 - API Server Flow

Understanding the API server flow.

To enforce general schema consistency, we must first properly handle requests coming from users, especially writing ones.

The following rules are executed when API servers get a write call:

  • when a writing request is sent to the server, multi-region routing middleware must inspect the request, and ensure that all resources that will be written to (or deleted), are owned by the current region. It must store the MultiRegionPolicy object in the context associated with the current call.
  • a write request can only execute updates for resources under a single multi-region policy! It means that writing across, let’s say, two projects is not allowed. Writing operations on global resources are allowed though. If there is an attempt to write to multiple resources across different policy-holders in a single transaction, the Store object must reject the write.
  • Store object must populate the metadata.syncing field when saving. It should use MultiRegionPolicy from context.
  • When the server calls the Save or Delete function on the store interface (for whatever Service resource), the following things happen:
    • If this is a creation/update, and the new resource has schema references that were not there before, then the Store is responsible for connecting to those Services and ensuring that resources exist, the relationship is established, and it is allowed to establish references in general. For references to local resources, it also needs to check if all is fine.
    • If this is deletion, the Store is obliged to check if there are any blocking back-references. It needs to connect with Deployments where references may exist, including self. For local synchronous cascade deletion & unset, it must execute them.
    • When Deployment connects with others, it must respect their API versions used.
  • Meta owner references are not checked, because it is assumed they may be created later. Meta-owner references are asynchronously checked by the system after the request is completed.
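
The reference-checking part of this flow can be sketched as below. The types and the establishReferences signature are hypothetical stand-ins for the schema-mixin call described later; the real code also carries API versions so that shadow names can be translated between them.

package main

import (
	"context"
	"fmt"
)

type reference struct {
	targetName    string
	targetService string
	targetRegion  string
	blocking      bool
}

type deploymentKey struct{ service, region string }

// establishReferences stands in for the EstablishReferences call made to a
// foreign Deployment; it may reject the relationship.
type establishReferences func(ctx context.Context, target deploymentKey, refs []reference) error

// confirmNewReferences groups references that were not present before by the
// Deployment owning the referenced resource, and asks each foreign Deployment
// to establish (and temporarily block) the relationship. Local references are
// skipped here; they are checked within the same transaction at pre-commit.
func confirmNewReferences(ctx context.Context, newRefs []reference, call establishReferences) error {
	buckets := map[deploymentKey][]reference{}
	for _, r := range newRefs {
		k := deploymentKey{r.targetService, r.targetRegion}
		buckets[k] = append(buckets[k], r)
	}
	for target, refs := range buckets {
		if err := call(ctx, target, refs); err != nil {
			return fmt.Errorf("deployment %v rejected the relationship: %w", target, err)
		}
	}
	return nil
}

func main() {
	call := func(ctx context.Context, t deploymentKey, refs []reference) error {
		fmt.Printf("EstablishReferences -> %v (%d refs)\n", t, len(refs))
		return nil
	}
	_ = confirmNewReferences(context.Background(), []reference{
		{targetName: "projects/p1/secrets/s1", targetService: "secrets.edgelq.com", targetRegion: "us-west2", blocking: true},
	}, call)
}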

This is a designed flow for API Servers, but we have a couple more flows regarding schema consistency. First, let’s define some corner cases when it comes to blocking references across regions/services. Scenario:

  • Deployment D1 gets a write (Creation) to resource R1. Establishes SNAPSHOT transaction.
  • R1 references (blocking) R2 in Deployment D2, therefore, on the Save call, D1 must ensure everything is valid.
  • Deployment D1 sends a request to establish a blocking reference to R2 for R1. D2 can see R2 is here.
  • D2 blocks resource R2 in its SNAPSHOT transaction. Then sends a signal to D1 that all is good.

Two things can happen:

  • D1 may fail to save R1 because of the failure of its local transaction. Resource R2 may be left with some blockade.
  • There is a small chance that, after a successful blockade on R2, D2 gets a request to delete R2 while R1 still does not exist, because D1 has not finished its transaction yet. If D2 asks D1 about R1, D1 will say nothing exists. R2 would be deleted, but then R1 may appear.

Therefore, when D2 blocks resource R2, it is a special tentative blockade with a timeout of up to 5 minutes, if I recall the amount correctly. This is way more than enough, since transactions are configured to time out after one minute. It means R2 cannot be deleted for this period. Then the protocol continues:

  • If D1 fails the transaction, D2 is responsible for asynchronously removing the tentative blockade from R2.
  • If D1 succeeds with the transaction, then D1 is responsible for asynchronously informing D2 that the tentative blockade on R2 is confirmed.
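
A small sketch of the tentative-blockade bookkeeping, with hypothetical types (the real state is kept in the referenced resource's ResourceShadow, described later in this guide):

package main

import (
	"fmt"
	"time"
)

// A tentative blockade prevents deletion of the referenced resource until it
// is either confirmed (the transaction on the other side succeeded) or expires.
type tentativeBlockade struct {
	sourceDeployment string // Deployment that asked to establish the reference
	createdAt        time.Time
	confirmed        bool
}

const tentativeBlockadeTTL = 5 * time.Minute // well above the ~1 minute transaction timeout

func (b tentativeBlockade) blocksDeletion(now time.Time) bool {
	if b.confirmed {
		return true // confirmed blocking back-reference: blocks until explicitly removed
	}
	return now.Sub(b.createdAt) < tentativeBlockadeTTL // unconfirmed: only until it expires
}

func main() {
	b := tentativeBlockade{sourceDeployment: "custom.edgelq.com/us-west2", createdAt: time.Now()}
	fmt.Println(b.blocksDeletion(time.Now()))                       // true: still within the TTL
	fmt.Println(b.blocksDeletion(time.Now().Add(10 * time.Minute))) // false: expired, safe to clean up
}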

2.2 - Meta Owner Flow

Understanding the meta owner flow.

Let’s define some terminologies:

  • Meta Owner

    It is a resource that is pointed at by a Meta owner reference object.

  • Meta Ownee

    It is a resource that points to another resource by the metadata.owner_references field.

  • Meta Owner Deployment

    Deployment to which Meta Owner belongs.

  • Meta Ownee Deployment

    Deployment to which Meta Ownee belongs.

  • Meta Owner Reference

    It is an item in metadata.owner_references array field.

We have three known cases where action is required:

  1. API Server calls Save method of Store, and saved resource has non-empty meta owner refs. API Server must schedule asynchronous tasks to be executed after the resource is saved locally (We trust meta owner refs are valid). Then asynchronously:

    • deployment owning meta ownee resource must periodically check if meta owners exist in target Deployments.
    • if after some timeout it is detected that the meta owner reference is not valid, then it must be removed. If it empties all meta owner refs array, the whole resource must be deleted.
    • if meta owner reference is valid, Deployment with meta ownee resource is responsible for sending notifications to Deployment with meta owner resource. If the reference is valid, it will be successful.
    • if Deployment with meta ownee detects that version of meta owner reference is too old (during validation), then it must upgrade it.

    Note that in this flow the Deployment with the meta ownee resource is the actor initiating the action; it must ask the Deployments with the meta owners whether its meta ownee is valid.

  2. API Server calls the Save method of Store, and the saved resource is known to be the meta-owner of some resources in various Deployments. In this case, it is the meta owner Deployment that is responsible for actions, asynchronously:

    • it must iterate over Deployments where meta ownees may be, and verify whether they are affected by the latest save. If not, no action is needed. Why, however, may meta ownees be affected? Let’s list the cases below…
    • sometimes, meta owner reference has a flag telling that the meta owner must have a schema reference to the meta ownee resource. If this is the case, and we see that the meta owner lost the reference to a meta ownee, the meta ownee must be forced to clean up its meta owner refs. It may trigger its deletion.
    • If there was a Meta Owner Deployment version upgrade, this Deployment is responsible for updating all Meta ownee resources. Meta ownees must have meta owner references using the current version of the target Deployment.
  3. API Server calls Delete method of Store, and deleted resource is KNOWN to be meta-owner of some resources in various Deployments. Deployment owning deleted meta owner resource is responsible for the following asynchronous actions:

    • It must iterate over Deployments where meta ownees may exist, and list them.
    • For each meta ownee, the Meta Owner Deployment must notify the Meta Ownee Deployment about the deletion.
    • API Server of meta ownee deployment is responsible for removing meta owner reference from the array list. It may trigger the deletion of meta ownee if there are no more meta owner references.

Note that all flows are pretty much asynchronous, but still ensure consistency of meta owner references. In some cases though it is meta owner Deployment reaching out, sometimes the other way around. It depends on which resource was updated last.

2.3 - Cascade Deletion Flow

Understanding the cascade deletion flow.

When some resource is deleted, and the API Server accepts deletion, it means there are no blocking references anywhere. This is ensured. However, there may be resources pointing to deleted ones with asynchronous deletion (or unset).

In these flows we talk only about schema references; meta references are fully covered already.

When a Deployment deletes some resource, then all Deployments affected by this deletion must take an asynchronous action. It means that if Deployment D0-1 from Service S0 imports Services S1 and S2, and S1 + S2 have deployments D1-1, D1-2, D2-1, D2-2, then D0-1 must make four real-time watches asking for any deletions that it needs to handle! In some cases, I remember a service importing five others. If there were 50 regions, it would mean 250 watch instances, but that would be a very large deployment with sufficient resources for goroutines.

Suppose that D1-1 had some resource RX that was deleted. The following happens:

  • D1-1 must notify all interested deployments that RX is deleted by inspecting back reference sources.
  • Suppose that RX had some back-references in Deployment D0-1, Deployment D1-1 can see that.
  • D1-1, after notifying D0-1, periodically checks if there are still active back-references from D0-1.
  • Deployment D0-1, which points to D1-1 as an importer, is notified about the deleted resource.
  • D0-1 grabs all local resources that need cascade deletion or unset. For unsets, it needs to execute regular updates. For deletions, it needs to delete (or mark for deletion if there are still some other back-references pointing, which may be blocking).
  • Once D0-1 deals with all local resources pointing to RX, it is done, it has no work anymore.
  • At some point, D0-1 will be asked by D1-1 if RX no longer has back refs. If this is the case, then D0-1 will confirm all is clear and D1-1 will finally clean up what remains of RX.

Note that:

  • This deletion spree may be deep for large object deletions, like projects. It may involve multiple levels of Deployments and Services.

  • If there is an error in the schema, some pending deletion may be stuck forever. By error in the schema, we mean situations like:

    • Resource A is deleted, and is back referenced from B and C (async cascade delete).
    • Normally B and C should be deleted, but it may be a problem if C is, let’s say, blocked by D, and D has no relationship with A, so it will never be deleted. In this case, B is deleted, but C is stuck, blocked by D. Unfortunately, as of now, Goten does not detect weird schema errors like this; perhaps it would be a good idea, although it is not certain it is possible.
    • It will be the service developers’ responsibility to fix schema errors.
  • In the flow, D0-1 imports Service to which D1-1 belongs. Therefore, we know that D0-1 knows the full-service schema of D1-1, but not the other way around. We need to consider this in the situation when D1-1 asks D0-1 if RX no longer has back refs.

2.4 - Multi-Region Sync Flow

Understanding the multi-region synchronization flow.

First, each Deployment must keep updating metadata.syncing for all resources it owns. To watch owned resources, it must:

  • WATCH <Resource> WHERE metadata.syncing.owningRegion = <SELF>.

    It will be getting updates in real-time.

The API Server already ensures that a resource has the metadata.syncing field synced on update! However, we have an issue when a MultiRegionPolicy object changes. This is where the Deployment must asynchronously update all resources that are subject to this policy-holder. It must therefore send Watch requests for ALL resources that can be policy-holders. For example, a Deployment of iam.edgelq.com will need to have three watches:

  1. Watch Projects WHERE multi_region_policy.enabled_regions CONTAINS <MyRegion>

    by iam.edgelq.com service.

  2. Watch Organizations WHERE multi_region_policy.enabled_regions CONTAINS <MyRegion>

    by iam.edgelq.com service.

  3. Watch Services WHERE multi_region_policy.enabled_regions CONTAINS <MyRegion>

    by meta.goten.com service.

Simpler services like devices.edgelq.com would need to watch only Projects, because they do not have other resources subject to this.

Deployment needs to watch policyholders that are relevant in its region.

Flow is now the following:

  • When Deployment gets a notification about the update of MultiRegionPolicy, it needs to accumulate all resources subject to this policy.
  • Then it needs to send an Update request for each, API server ensures that metadata.syncing is updated accordingly.

The above description ensures that metadata.syncing is up-to-date.

The next part is the actual multi-region syncing. In this case, the Deployments of each Service MUST have one active watch on all other Deployments from the same family. For example, if we have iam.edgelq.com in regions japaneast, eastus2, and us-west2, then the following watches must be maintained:

Deployment of iam.edgelq.com in us-west2 has two active watches, one sent to the japaneast region, the other to eastus2:

  • WATCH <Resources> WHERE metadata.syncing.owningRegion = japaneast AND metadata.syncing.regions CONTAINS us-west2
  • WATCH <Resources> WHERE metadata.syncing.owningRegion = eastus2 AND metadata.syncing.regions CONTAINS us-west2

Deployments in japaneast and eastus2 will also have similar two watches. We have a full mesh of connections.

Then, when some resource in us-west2 gets created with metadata.syncing.regions = [eastus2, japaneast], one copy will be sent to each of these regions. Those regions must be executing pretty much continuous work.

Now, on the startup, it is necessary to mention the following procedure:

  • Deployment should check all lists of currently held resources owned by other regions, but syncable locally.
  • Grab a snapshot of these resources from other regions, and compare whether anything is missing, or whether we have extra copies (a missed deletion). If this is the case, it should execute the missing actions to bring the system into sync (a reconciliation sketch follows this list).
  • During the initial snapshot comparison, it is still valuable to keep copying real-time updates from other regions. It may take some time for the snapshot to be completed.
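
A sketch of this startup reconciliation, with hypothetical types (the real code operates on resources and uses the cross-region list/watch APIs):

package main

import "fmt"

// reconcile compares local read copies owned by another region with a snapshot
// fetched from that region: anything missing or stale must be saved locally,
// anything that no longer exists remotely corresponds to a missed deletion.
func reconcile(local, remote map[string]string) (toSave, toDelete []string) {
	for name, remoteVersion := range remote {
		if localVersion, ok := local[name]; !ok || localVersion != remoteVersion {
			toSave = append(toSave, name)
		}
	}
	for name := range local {
		if _, ok := remote[name]; !ok {
			toDelete = append(toDelete, name)
		}
	}
	return toSave, toDelete
}

func main() {
	local := map[string]string{"projects/p1/devices/d1": "rev5", "projects/p1/devices/d2": "rev1"}
	remote := map[string]string{"projects/p1/devices/d1": "rev6", "projects/p1/devices/d3": "rev1"}
	save, del := reconcile(local, remote)
	fmt.Println("save:", save, "delete:", del)
	// Real-time updates from the other region keep flowing while this snapshot
	// comparison runs, so reconciliation only needs to fill the gaps.
}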

2.5 - Database Migration Flow

Understanding the database migration flow.

When a Deployment boots up after an image upgrade, it will detect that the currently active version is lower than the version it can support. In that case, the API Server will serve the older version normally, but the new version API will become available in read-only mode. The Deployment is responsible for asynchronous, background syncing of the higher-version database with the current-version database. Clients are expected to use the older version anyway, so they won’t necessarily see the incomplete higher version. Besides, it’s fine, because what matters is the current version pointed to by the Deployment.

It is expected that all Deployments will get new images first before we start switching to the next versions. Each Deployment will be responsible for silent copying.

For the MultiRegion case, when multiple deployments of the same service are on version v1, but they run on images that can support version v2, they will still be synced with each other, but on both versions: v1 and v2. When images are being deployed region by region (Deployment by Deployment), they may experience Unimplemented error messages, but only until images are updated in all regions. We may improve this and try to detect “available” versions first, before making cross-region watches.

Anyway, it will be required that new images are deployed to all regions before the upgrade procedure is triggered on any Regional deployment.

The upgrade can then be done one Deployment at a time, using the procedure described in the migration section of the developer guide.

When one Deployment is officially upgraded to the new version, but still uses primarily the old version, then all deployments still watch each other for both versions, for the sake of multi-region syncing. However, Deployment using a newer version may already opt-out from pulling older API resources from other Deployments at this point.

Meta owner references are owned by the Deployment they point to. It means that they are upgraded asynchronously after the Deployment switches its version to the newer one.

3 - Goten Flow Implementation

Understanding the Goten flow implementation.

All components for described flows are implemented in the Goten repository, we have several places where implementation can be found:

  • In runtime/schema-mixin we have a mixin service directory, which must be part of all services using Goten.
  • In runtime/store/constraint we have another “middleware” for Store, which is aware of the cross-service and cross-region nature of schemas. This middleware must be used in all services.
  • In runtime/db_constraint_ctrl we have a controller that handles asynchronous schema-related tasks like asynchronous cascade deletions, meta owner references management, etc.
  • In runtime/db_syncing_ctrl we have a controller that handles all tasks related to DB syncing: Cross-region syncing, metadata.syncing updates, database upgrades, and search database syncing as well.

3.1 - Schema Mixin

Understanding the schema mixin implementation.

Mixins are special kinds of services that are supposed to be mixed/blended with proper services. Like any service, they have an api-skeleton, protobuf files, resources, and server handlers. What they don’t get is an independent deployment. They don’t exist in the Meta service registry. Instead, their resources and API groups are mixed into the proper services.

Moreover, for schema mixins, we are not validating references to other resources, they are excluded from this mechanism, and it’s up to the developer to keep them valid.

The Goten repository provides the schema mixin under runtime/schema-mixin. If you look at this mixin service, you will see that it has the ResourceShadow resource. By mixing the schema mixin with, let’s say, the Meta service, which formally has four resource types and four API groups, we get in total a Meta service with:

  • Resources: Region, Service, Deployment, Resource, ResourceShadow
  • API Groups: Region, Service, Deployment, Resource, ResourceShadow (CRUD plus custom actions).

If you inspect the Meta service database, you will have five collections (unless there are more mixins).

See api-skeleton: https://github.com/cloudwan/goten/blob/main/runtime/schema-mixin/proto/api-skeleton-v1.yaml.

By requiring that ALL services attach the schema-mixin to themselves, we can guarantee that all services can access each other via the schema-mixin. This is one of the key ingredients of Goten’s protocol. Some common service is always needed because, to enable circular communication between two services that can’t possibly know each other’s schemas, they need some kind of common protocol.

Take a look at the resource_shadow.proto file. Just a note: you can ignore target_delete_behavior, it is more for informative purposes; for mixins, Goten does not provide schema management. ResourceShadow is a very special kind of resource, and it exists for every other resource in a deployment (except other mixins). To illustrate, let’s take a look at the list of resources that may exist in the Deployment of the Meta service in region us-west2, like:

  • regions/us-west2 (Kind: meta.goten.com/Region)
  • services/meta.goten.com (Kind: meta.goten.com/Service)
  • services/meta.goten.com/resources/Region (Kind: meta.goten.com/Resource)
  • services/meta.goten.com/resources/Deployment (Kind: meta.goten.com/Resource)
  • services/meta.goten.com/resources/Service (Kind: meta.goten.com/Resource)
  • services/meta.goten.com/resources/Resource (Kind: meta.goten.com/Resource)
  • services/meta.goten.com/deployments/us-west2 (Kind: meta.goten.com/Deployment)

If those resources exist in the database for meta.goten.com in us-west2, then collection ResourceShadow will have the following resources:

  • resourceShadows/regions/us-west2
  • resourceShadows/services/meta.goten.com
  • resourceShadows/services/meta.goten.com/resources/Region
  • resourceShadows/services/meta.goten.com/resources/Deployment
  • resourceShadows/services/meta.goten.com/resources/Service
  • resourceShadows/services/meta.goten.com/resources/Resource
  • resourceShadows/services/meta.goten.com/deployments/us-west2

Basically it’s a one-to-one mapping, with the following exceptions:

  • if there are other mixin resources, they don’t get ResourceShadows.
  • synced read-only copies from other regions do not get ResourceShadows. For example, resource regions/us-west2 will exist in region us-west2, and resourceShadows/regions/us-west2 will also exist in us-west2. But, if regions/us-west2 is copied to other regions, like eastus2, then resourceShadows/regions/us-west2 WILL NOT exist in eastus2.

This makes Resource shadows rather “closed” within their Deployment.

ResourceShadow instances are created/updated along with the resource they represent, during each transaction. This ensures that they are always in sync with the resource. They contain all references to other resources and all back-reference source deployments. The reason we store back-reference deployments, not an exact list, is that the full list would have been massive; imagine a Project instance and 10000 Devices pointing to it. Instead, if, let’s say, those devices are spread across four regions, the ResourceShadow for the Project will have 4 back-reference sources, which is more manageable.

Now, with ResourceShadows, we can provide the abstraction needed to facilitate communication between services. However, note that we don’t use the standard CRUD at all (for shadows). It was used in the past, but the problem with CRUD requests is that they don’t contain the “API Version” field.

For example, we have the secrets.edgelq.com service in versions v1alpha2 and v1. In the older version, we have a Secret resource with the name pattern projects/{project}/secrets/{secret}. Now, with v1 upgrade, name pattern changed to projects/{project}/regions/{region}/secrets/{secret}. Note that this means, that the ResourceShadow name changes too!

Suppose there are services S1 and S2. S1 imports secrets in v1alpha2, and S2 imports secrets in v1. Suppose both S1 and S2 want to create resources concerning some Secret instance. In this case, they would try to use schema-mixin API, and they would give conflicting resource shadow names, but this conflict arises from a different version, not because of a bug. S1 would try to establish a reference to shadow for projects/{project}/secrets/{secret}, and S2 would use the version with region.

This problem repeats for the whole CRUD of ResourceShadow, so we don’t use it. Instead, we developed a bunch of custom actions you can see in the api-skeleton of schema-mixin, like EstablishReferences, ConfirmBlockades, etc. All those requests contain a version field, and the API Server can use versioning transformers to convert names between versions.

Now, coming back to the custom actions for ResourceShadows, see the api-skeleton along with the protobuf files containing the request objects!

We described a flow for how references are established when API Servers handle writing requests; this is where the schema-mixin API is in use.

EstablishReferences is used by Store modules in API Servers when they save resources with cross-region/service references. It is called DURING the Store transaction in the API Server. It ensures that referenced resources will not be deleted for the next few minutes; it creates tentative blockades in ResourceShadow instances on the other side. You may check the implementation in the goten repo, file runtime/schema-mixin/server/v1/resource_shadow/resource_shadow_service.go. When the transaction concludes, the Deployment will asynchronously send ConfirmBlockades to remove the tentative blockade from the referenced ResourceShadow in the target Service. It will leave a back-reference source in place though!

For deletion requests, the API Server must call CheckIfResourceIsBlocked before proceeding with resource deletion. It must also block deletion if there are tentative blockades in ResourceShadow.

We also described Meta owner flows with three cases. When Meta Ownee Deployment tries to confirm the meta owner, it must use the ConfirmMetaOwner call to a Meta Owner Deployment instance. If all is fine, then we will get a successful response. If there is a version mismatch, Meta Ownee Deployment will send UpgradeMetaOwnerVersion request to itself (its API Server), so the meta owner reference is finally in the desired state. If ConfirmMetaOwner discovers the Meta Owner does not confirm ownership, then Meta Ownee Deployment should use the RemoveMetaOwnerReference call.

When it is Meta Owner Deployment that needs to initiate actions (cases two and three), it needs to use ListMetaOwnees to get meta ownees. When relevant, it will need to call UpgradeMetaOwnerVersion or RemoveMetaOwnerReference, depending on the context of why we are iterating meta ownees.

For the asynchronous deletion handling we described, the most important schema-mixin API action is WatchImportedServiceDeletions. This is a real-time watch subscription with versioning support. For example, if we have Services S1 and S2 importing secrets.edgelq.com in versions v1alpha2 and v1, and some Secret is deleted (with a name pattern containing the region in v1 only), then a separate WatchImportedServiceDeletionsResponse is sent to the S1 and S2 Deployments, containing the shadow ID of the Secret in the version each Service desires.

When it comes to the deletion flow, we also use CheckIfHasMetaOwnee, and CheckIfResourceHasDeletionSubscriber. These methods are used when waiting for back-references to be deleted generally.

Since the schema-mixin server is mixed with the proper service, we can also access the original resources from the Store interface! In total, the schema-mixin is a powerful utility for Goten-as-a-protocol use cases.

We still need CRUD in ResourceShadows, because:

  • Update, Delete, and Watch functions are used within Deployment itself (where we know all runtimes use the same version).
  • debugging purposes. Developers can use read requests when some bug needs investigation.

3.2 - Metadata Syncing Decorator

Understanding the metadata synchronization decorator.

As we said, when the resource is saved in the Store, the metadata.syncing field is refreshed according to the MultiRegionPolicy. See the decorator component in the Goten repository: runtime/multi_region/syncing_decorator.go. This is wrapped up by a store plugin, runtime/store/store_plugins/multiregion_syncing_decorator.go.

This plugin is added to all stores for all API Servers. It can be opted out of only if multi-region features are not used at all. When the Deployment sees that metadata.syncing is not up-to-date with the MultiRegionPolicy, an empty update can fix this. Thanks to this, we could annotate this field as output-only (in the protobuf file), so users cannot make any mistakes there.
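
A sketch of what such a decorator does on save, with hypothetical types (the real plugin lives in runtime/store/store_plugins/multiregion_syncing_decorator.go; among other things it also handles regional resources, whose owning region comes from the name, which is omitted here):

package main

import (
	"context"
	"fmt"
)

type SyncingMeta struct {
	OwningRegion string
	Regions      []string // regions entitled to a read-only copy
}

type MultiRegionPolicy struct {
	DefaultControlRegion string
	EnabledRegions       []string
}

type resource struct {
	Name    string
	Syncing SyncingMeta
}

type policyCtxKey struct{}

// decorateOnSave rewrites metadata.syncing from the MultiRegionPolicy stored
// in the context by the server code, so users never set this output-only field.
func decorateOnSave(ctx context.Context, res *resource, selfRegion string) {
	policy, _ := ctx.Value(policyCtxKey{}).(*MultiRegionPolicy)
	if policy == nil {
		// Global resource: fall back to the Service-level policy (EnvRegistry).
		res.Syncing = SyncingMeta{OwningRegion: selfRegion}
		return
	}
	readCopies := make([]string, 0, len(policy.EnabledRegions))
	for _, r := range policy.EnabledRegions {
		if r != policy.DefaultControlRegion {
			readCopies = append(readCopies, r)
		}
	}
	res.Syncing = SyncingMeta{OwningRegion: policy.DefaultControlRegion, Regions: readCopies}
}

func main() {
	ctx := context.WithValue(context.Background(), policyCtxKey{},
		&MultiRegionPolicy{DefaultControlRegion: "us-west2", EnabledRegions: []string{"us-west2", "eastus2"}})
	r := &resource{Name: "projects/p1/devices/d1"}
	decorateOnSave(ctx, r, "us-west2")
	fmt.Printf("%+v\n", r.Syncing)
}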

3.3 - Constraint Store

Understanding the constraint store.

As was said, the Store is a series of middlewares, like the Server, but the base document in the Contributor guide only showed the core and cache layers. An additional layer is Constraints; you can see it in the Goten repo, runtime/store/constraints/constraint_store.go.

It focuses mostly on decorating the Save/Delete methods. When saving, it grabs the current ResourceShadow instance for the saved resource. Then it ensures references are up-to-date. Note that it calls the processUpdate function, which repopulates the shadow instance. For each new reference that was not there before, it will need to connect with the relevant Deployment and confirm the relationship. All new references are grouped into Service & Region buckets. For each foreign Service or Region, it will need to send an EstablishReferences call. It will need to consider versioning too, because shadow names may change.

Note that we have a “Lifecycle” object, where we store any flags indicating if asynchronous tasks are pending on the resource. State PENDING shows that there are some asynchronous tasks to execute.

The EstablishReferences method is not called for local references. Instead, at the end of the transaction, preCommitExec is called to connect with local resources in a single transaction. This is the most optimal, and the only possible, option. Imagine that in a single transaction we create resources A and B, where A has a reference to B. If we used EstablishReferences, it would fail because B does not exist yet. By skipping this call for local resources, we fix this problem.

When deleting, the Constraint store layer uses processDeletion, where we need to check if the resource is not blocked. We also may need to iterate over other back reference sources (foreign Deployments). When we do it, we must verify versioning, because other Deployments may use a lower version of our API, resulting in different resource shadow names.

For deletion, we also may trigger synchronous cascade deletions (or unsets).

Also, note that there is something additional about deletions: they may delete the actual resource instance (unless we have a case like the async deletion annotation), but they won’t delete the ResourceShadow instance. Instead, they will set the deletion time and put the Lifecycle into the DELETING state. This is a special signal that will be distributed to all Deployments that have resources with references pointing at the deleted resource. This is how they will execute any cascade deletions (or unsets). Only when all back-references are cleared is the remaining ResourceShadow finally removed.

This is the last layer in the Store objects, along with cache and core; now you should see in full how the Store actually works, what it does, and what it interacts with (the actual database, the local cache, AND other Deployments). Using the schema-mixin API, it achieves a “global” database across services, regions, and versions.

3.4 - Database Constraint Controller

Understanding the database constraint controller.

Each db-controller instance consists mainly of two Node manager modules. One is the DbConstraint Controller. Its tasks include the execution of all asynchronous tasks related to the local database (Deployment). There are three groups of tasks:

  • Handling of owned (by Deployment) resources in PENDING state (Lifecycle)
  • Handling of owned (by Deployment) resources in DELETING state (Lifecycle)
  • Handling of all subscribed (from current and each foreign Deployment) resources in the DELETING state (Lifecycle)

The module is found in the Goten repository, module runtime/db_constraint_ctrl. As with any other controller, it uses a Node Manager instance. This Node Manager, apart from running Nodes, must also keep a map of deployments of interest! What does this mean? We know that iam.edgelq.com imports meta.goten.com. Suppose we have regions us-west2 and eastus2. In that case, the Deployment of iam.edgelq.com in the us-west2 region will need to remember four Deployment instances:

  1. meta.goten.com in us-west2
  2. meta.goten.com in eastus2
  3. iam.edgelq.com in us-west2
  4. iam.edgelq.com in eastus2

This map is useful for the third task group: handling of subscribed resources in the DELETING state. As IAM imports meta and no other service, and also because IAM resources can reference each other, we can deduce the following: resources of iam.edgelq.com in region us-west2 can only reference resources from meta.goten.com and iam.edgelq.com, and only from regions us-west2 and eastus2. If we need to handle the cascade deletions (or unsets), then we need to watch these deployments. See the file node_manager.go in db_constraint_ctrl; we utilize EnvRegistry to get dynamic updates about the deployments of interest. In the function createAndRunInnerMgr we use the ServiceDescriptor instance to get information about the Services we import; this is how we know which deployments we need to watch.

As you can see, we utilize EnvRegistry to initiate DbConstraintCtrl correctly in the first place, and then we maintain it. We also handle version switches. If this happens, we stop the current inner node manager and deploy a new one.

When we watch other deployments, we are interested only in schema references, not meta. Meta references are more difficult to predict because services don’t need to import each other. For this reason, responsibility for managing meta owner references is split between Deployments on both sides: Meta Owner and Meta Ownee, as described by the flows.

The most important files in runtime/db_constraint_ctrl/node directory are:

  • owned_deleting_handler.go
  • owned_pending_handler.go
  • subscribed_deleting_handler.go

Those files are handling all asynchronous tasks as described by many of the flows, regarding the establishment of references to other resources (confirming/removing expired tentative blockades), meta owner references management, cascade deletions, or unsets. I was trying to document the steps they do and why, so refer to the code for more information.

For other notable elements in this module:

  • For subscribed deleting resource shadows, we have wrapped watcher, which uses a different method than standard WatchResourceShadows. The reason is, that other Deployments may vary between API versions they support. We use the dedicated method by schema mixin API, WatchImportedServiceDeletions.
  • Subscribed deleting resource shadow events are sent to a common channel (in the controller_node.go file), but they are still grouped per Deployment (along with tasks).

Note that this module is also responsible for upgrading meta owner references after a Deployment upgrades its current version field! This is an asynchronous process, executed by owned_pending_handler.go, function executeCheckMetaOwnees.
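As a rough, heavily simplified illustration of what such an upgrade step conceptually does (all types and names below are hypothetical, not the real executeCheckMetaOwnees signature), references still expressed in the old API version get rewritten to the Deployment’s new current version:

```go
package main

import "fmt"

// Hypothetical, heavily simplified shapes; the real Goten types are richer.
type metaOwnerRef struct {
	OwnerName  string // owning resource in another (or the same) service
	APIVersion string // API version the owner name is expressed in
}

// upgradeMetaOwnerRefs sketches the concept only: once the owning Deployment
// reports a new current version, references still expressed in the old API
// version are rewritten (owner names may need transformation too).
func upgradeMetaOwnerRefs(refs []metaOwnerRef, newVersion string,
	transformName func(name, fromVer, toVer string) string) []metaOwnerRef {
	out := make([]metaOwnerRef, 0, len(refs))
	for _, r := range refs {
		if r.APIVersion != newVersion {
			r.OwnerName = transformName(r.OwnerName, r.APIVersion, newVersion)
			r.APIVersion = newVersion
		}
		out = append(out, r)
	}
	return out
}

func main() {
	refs := []metaOwnerRef{{
		OwnerName:  "projects/p0/devices/d0",
		APIVersion: "v1alpha2",
	}}
	upgraded := upgradeMetaOwnerRefs(refs, "v1",
		func(name, from, to string) string { return name /* identity, for brevity */ })
	fmt.Println(upgraded[0].APIVersion) // prints: v1
}
```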

3.5 - Database Syncer Controller

Understanding the database syncer controller.

The other big db-controller module is the DbSyncer Controller. In the Goten repository, see the runtime/db_syncing_ctrl module. It is responsible for:

  • Maintaining the metadata.syncing field when the corresponding MultiRegionPolicy changes.
  • Syncing resources from other Deployments in the same Service for the current local database (read copies).
  • Syncing resources from other Deployments and current Deployment for Search storage.
  • Database upgrade of the local Deployment.

It mixes multi-version and multi-region features, but the reason is that they share many common structures and patterns regarding db-syncing. Version syncing is still copying from one database to another, even if it is a bit special, since we need to “modify” the resources we are copying.

This module is interested in dynamic Deployment updates, but only for the current Service. See the node_manager.go file. We utilize EnvRegistry to get the current setup. Normally we initiate the inner node manager when we get a SyncEvent, but then we support dynamic updates via DeploymentSetEvent and DeploymentRemovedEvent. We just need to verify that the Deployment belongs to our Service. If it does, something changed there and we should refresh. Perhaps we could diff against the “previous” state, but it is fine to make a NOOP refresh too. Either way, we need to ensure that the Node is aware of all foreign Deployments, because those are potential candidates to sync from. Now let’s dive into a single Node instance.

Now, DbSyncingCtrl can be quite complex, even though it only copies resource instances across databases. First, check the ControllerNode struct in the controller_node.go file, which represents a single Node responsible for copying data. Breaking it down (a rough sketch follows this list):

  • It may have two instances of VersionedStorage, one for the older and one for the newer API. Generally, we support only the last two versions for DbSyncer. More should not be needed, and it would make the already complex structure even more difficult. This is necessary for database upgrades.
  • We have two instances of syncingMetaSet, one per versioned storage. Those contain SyncingMeta objects per multi-region policy-holder and resource type pair. An instance of syncingMetaSet is used by localDataSyncingNode instances. To be honest, if ControllerNode had just one localDataSyncingNode object, not many, then syncingMetaSet would be part of it!
  • Finally, we have the rangedLocalDataNodes and rangedRemoteDataNodes maps.
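A rough sketch of this layout, with illustrative field names only (see controller_node.go for the real struct), could look like this:

```go
// Package sketch outlines the shape only; see controller_node.go in
// runtime/db_syncing_ctrl for the real struct.
package sketch

type shardRange struct{ Begin, End int64 }

// Placeholders so the sketch is self-contained.
type versionedStorage struct{}
type syncingMetaSet struct{}
type localDataSyncingNode struct{}
type remoteDataSyncingNode struct{}

type controllerNodeSketch struct {
	// At most two API versions are handled at a time: the currently active
	// one and the one being synced to (during upgrade, or back during rollback).
	activeVsStorage  *versionedStorage
	syncingVsStorage *versionedStorage // nil when no upgrade is in progress

	// One syncingMetaSet per versioned storage, kept here (not inside the
	// local nodes) because the node is split per shard sub-range.
	activeVsSyncingMetas  *syncingMetaSet
	syncingVsSyncingMetas *syncingMetaSet

	// Shard sub-ranges of at most ten shards each (Firestore filter limit);
	// remote nodes exist additionally per foreign Deployment.
	rangedLocalDataNodes  map[shardRange]*localDataSyncingNode
	rangedRemoteDataNodes map[shardRange]map[string]*remoteDataSyncingNode
}
```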

Now, object localDataSyncingNode is responsible for:

  • Maintaining metadata.syncing; it must use the passed syncingMetaSet instance for real-time updates.
  • Syncing local resources to Search storage (read copies).
  • Upgrading local database.

Then, remoteDataSyncingNode is responsible for:

  • Syncing resources from other Deployments in the same Service for the current local database (read copies).
  • Syncing resources from other Deployments for Search storage.

For each foreign Deployment, we will have separate remoteDataSyncingNode instances.

It is worth asking why we have a map of syncing nodes (local and remote) per shard range. The reason is that we split them so each node has at most ten shards. Often we may still end up with a map of just one sub-shard range. Why ten? Because in Firestore, which is a supported database, we can pass at most ten shard numbers in a single request (filter)! Therefore, we need to make separate watch queries, and it’s easier to separate the nodes then. Now we can guarantee that a single local/remote node will be able to send its query successfully to the backend. However, because of this split, we needed to move syncingMetaSet out of localDataSyncingNode and put it directly in ControllerNode.
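A tiny sketch of the splitting idea, assuming a hypothetical helper name, is shown below; each resulting chunk corresponds to one local/remote syncing node and one watch query:

```go
package main

import "fmt"

// splitShards illustrates why DbSyncingCtrl keeps a map of syncing nodes per
// shard sub-range: a supported backend (Firestore) accepts at most ten values
// in a single filter, so a node's shard set is cut into chunks of <= 10,
// each served by its own local/remote syncing node and watch query.
func splitShards(shards []int64, maxPerQuery int) [][]int64 {
	var chunks [][]int64
	for len(shards) > 0 {
		n := maxPerQuery
		if len(shards) < n {
			n = len(shards)
		}
		chunks = append(chunks, shards[:n])
		shards = shards[n:]
	}
	return chunks
}

func main() {
	// A node assigned 16 shards ends up with two sub-range nodes: 10 + 6.
	shards := make([]int64, 16)
	for i := range shards {
		shards[i] = int64(i)
	}
	for _, c := range splitShards(shards, 10) {
		fmt.Println(c)
	}
}
```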

Since we have syncingMetaSet separated, let’s describe what it does first. Basically, it observes all multi-region policy-holders a Service uses and computes SyncingMeta objects per policy-holder/resource type pair. For example, Service iam.edgelq.com has resources belonging to Service, Organization, and Project, so it watches these 3 resource types. Service devices.edgelq.com only uses Project, so it watches Project instances only, and so on. It uses the ServiceDescriptor passed in the constructor to detect all policy-holders.

When syncingMetaSet runs, it collects the first snapshot of all SyncingMeta instances and then maintains it. It sends events to subscribers in real-time (see ConnectSyncingMetaUpdatesListener). This module is not responsible for updating the metadata.syncing field yet, but it is an important first step. It triggers localDataSyncingNode when a new SyncingMeta is detected, so the node can run its updates.

The next important module is the resVersionsSet object, defined in file res_versions_set.go. It is a central component in both local and remote nodes, so perhaps it is worth explaining how it works.

This set contains all resource names with their versions in a tree structure. By version, I don’t mean the API version of the resource, I mean the literal resource version; we have a field in metadata for that, metadata.resource_version. This value is a string, but it can contain only an integer that increments with every update. This is the basis for comparing resources across databases. How do we know that? Well, the “main” database owning a resource contains the newest version; the field metadata.resource_version is the highest there. However, we have other databases. For example, the search database may be separate, like Algolia; in that case, metadata.resource_version may be lower. We also have a syncing database (for example across regions): the database in another region, which gets just read-only copies, also can at best match the origin database. resVersionsSet has important functions:

  • SetSourceDbRes and DelSourceDbRes are called by original database owning resource.
  • SetSearchRes and DelSearchRes are called by the search database.
  • SetSyncDbRes and DelSyncDbRes are called by the syncing database (for example cross-region syncing).
  • CollectMatchingResources collects all resource names matched by prefix. This is used by metadata.syncing updates. When policy-holder resource updates its MultiRegionPolicy, we will need to collect all resources subject to it!
  • CheckSourceDbSize is necessary for Firestore, which is known to be able to “lose” some deletions. If the size is incorrect, we will need to reset the source DB (original) and provide a snapshot.
  • SetSourceDbSyncFlag is used by the original DB to signal that it supplied all updates to resVersionsSet and now continues with real-time updates only.
  • Run: resVersionsSet is used in a multi-threaded environment, so it runs on a separate goroutine and uses Go channels for synchronization, with callbacks where necessary.

resVersionsSet also supports listeners where necessary: it triggers when the source DB updates/deletes a resource, or when the syncing database reaches equivalence with the original database. We don’t provide similar signals for the search DB, simply because we don’t need them, but we do for the syncing DB. We will explain why later.
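To illustrate the comparison idea only (the real resVersionsSet is tree-based, sharded, and channel-driven; the map-based shape below is a simplification, though the Set/Del method names come from the list above), a minimal sketch could look like this:

```go
package main

import "fmt"

// A minimal sketch of the idea behind resVersionsSet: keep only names and
// integer resource versions (metadata.resource_version), compare the copies
// against the owning database, and report when the syncing copy has caught up.
type resVersions struct {
	sourceDb map[string]int64 // owning database (highest versions)
	syncDb   map[string]int64 // read-only copy in another database/region
}

func newResVersions() *resVersions {
	return &resVersions{sourceDb: map[string]int64{}, syncDb: map[string]int64{}}
}

func (s *resVersions) SetSourceDbRes(name string, ver int64) { s.sourceDb[name] = ver }
func (s *resVersions) DelSourceDbRes(name string)            { delete(s.sourceDb, name) }
func (s *resVersions) SetSyncDbRes(name string, ver int64)   { s.syncDb[name] = ver }
func (s *resVersions) DelSyncDbRes(name string)              { delete(s.syncDb, name) }

// syncDbInSync reports whether the syncing database matches the source:
// same set of names, and no copy older than the original.
func (s *resVersions) syncDbInSync() bool {
	if len(s.sourceDb) != len(s.syncDb) {
		return false
	}
	for name, srcVer := range s.sourceDb {
		if s.syncDb[name] < srcVer {
			return false
		}
	}
	return true
}

func main() {
	set := newResVersions()
	set.SetSourceDbRes("projects/p0/devices/d0", 3)
	set.SetSyncDbRes("projects/p0/devices/d0", 2) // stale copy
	fmt.Println(set.syncDbInSync())               // false
	set.SetSyncDbRes("projects/p0/devices/d0", 3)
	fmt.Println(set.syncDbInSync()) // true
}
```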

Now let’s talk about local and remote nodes, starting with local.

See the local_data_syncing_node.go file, which constructs all modules responsible for the mentioned tasks. First, analyze the newShardRangedLocalDataSyncingNode constructor up to the if needsVersioning condition, where we create modules for database versioning. Before this condition, we create modules for Search DB syncing and metadata.syncing maintenance. Note how we are using the activeVsResVSet object (of type resVersionsSet). We connect it to the search syncer and syncing meta updater modules. For each resource type, we create an instance of a source db watcher, which gets access to the resource versions set. It should be clear now: the source DB, which belongs to our local Deployment, keeps updating activeVsResVSet, which in turn passes updates to activeVsSS and activeVsMU. For activeVsMU, we also connect it to activeVsSyncMS, so we have the two necessary signal sources for maintaining the metadata.syncing object.

So, you should know now that:

  • search_syncer.go

    It is used to synchronize the Search database, for local resources in this case.

  • syncing_meta_updater.go

    It is used to synchronize the metadata.syncing field for all local resources.

  • base_syncer.go

    It is the common implementation underlying search_syncer.go, but it is not limited to it.

Let’s dive deeper and explain the synchronization protocol between source and destination. Maybe you noticed: why does sourceDbWatcher contain two watchers, one for live data and one for a snapshot? Also, why is there a wait before running the snapshot? Did you see that in the OnInitialized function of localDataSyncingNode, we run a snapshot only once a sync signal has been received? There are reasons for all of that. Let’s discuss the design.

When the DbSyncingCtrl node instance is initiated for the first time, or when the shard range changes, we need to re-download all resources from the current or foreign database, compare them with the synced database, and execute the necessary creations, updates, and deletions. Moreover, we need to ask for a snapshot of data on the destination database. This may take time, we don’t know how much, but downloading potentially millions of items may not be the fastest operation. It means that whenever there are changes in nodes (upscaling, downscaling, reboots, whatever), we would need to suspend database syncing, possibly for a while: maybe a minute, maybe more, and is there even an upper limit? If we don’t sync fast, this lag becomes too visible for users. It is better to start separate watchers for live data directly. Then we sync from the live database to the destination (like the search db), providing almost immediate sync most of the time. In the meantime, we collect a snapshot of data from the destination database. See the base_syncer.go file, function synchronizeInitialData. When we are done with initialization, we trigger a signal that notifies the relevant instance (local or remote syncing node). In the file local_data_syncing_node.go, function OnInitialized, we check if all components are ready, then we run RunOrResetSnapshot for our source db watchers. This is when the full snapshot is done, and if there are any “missing” updates during the handover, we execute them. Ideally, we won’t have any; the live watcher goes back by one minute when it starts watching, so some updates may even be repeated! But it’s still necessary to provide guarantees, of course. I hope this explains the protocol (a condensed sketch follows the list):

  • Live data immediately is copying records from source to destination database…
  • In the meantime, the destination database collects snapshots…
  • And when the snapshot is collected, we start the snapshot from the source database…
  • We execute anything missing and continue with live data only.
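Here is a condensed, hypothetical sketch of this handover (names and shapes are made up; the real logic lives in base_syncer.go and the source db watchers):

```go
package main

import "fmt"

// Live changes are applied immediately while the destination snapshot is
// collected; only then is the source snapshot run to patch anything missed.
type change struct {
	name    string
	version int64
	deleted bool
}

type destinationDb struct{ data map[string]int64 }

func (d *destinationDb) apply(c change) {
	if c.deleted {
		delete(d.data, c.name)
		return
	}
	if d.data[c.name] < c.version { // never downgrade a copy
		d.data[c.name] = c.version
	}
}

func main() {
	dst := &destinationDb{data: map[string]int64{"old/resource": 1}}

	// 1. Live watcher starts first: changes are copied with minimal lag.
	live := []change{{name: "projects/p0/devices/d0", version: 5}}
	for _, c := range live {
		dst.apply(c)
	}

	// 2. Meanwhile the destination snapshot is collected (here: dst.data).
	// 3. Once ready, the source snapshot is taken and anything missing is
	//    reconciled: orphaned destination entries are removed, missing or
	//    stale ones are created/updated.
	sourceSnapshot := map[string]int64{"projects/p0/devices/d0": 5}
	for name := range dst.data {
		if _, ok := sourceSnapshot[name]; !ok {
			dst.apply(change{name: name, deleted: true})
		}
	}
	for name, ver := range sourceSnapshot {
		dst.apply(change{name: name, version: ver})
	}

	// 4. From here on, only live updates continue.
	fmt.Println(dst.data) // map[projects/p0/devices/d0:5]
}
```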

Another reason why we have this design, and why we use QueryWatcher instances (and not Watchers), is simple: RAM. DbSyncingCtrl needs to watch practically all database updates and needs the full resource bodies. Note we are also using access.QueryWatcher instances in sourceDbWatcher. QueryWatcher is a lower-level object compared to the plain Watcher. It means that it can’t support multiple queries, and it does not handle resets or snapshot size checks (Firestore only). This is also the reason why in ControllerNode we have a map of localDataSyncingNode instances per shard range; a Watcher would be able to split queries and hide this complexity. But QueryWatcher has a benefit:

  • It does not store watched resources in its internal memory!

Imagine millions of resources whose full resource bodies are kept in RAM by a Watcher instance. That goes in the wrong direction; DbSyncingCtrl is supposed to be slim. In resVersionsSet we only keep version numbers and resource names in tree form. We also try to compress all syncer modules into one place, so syncingMetaUpdater and searchUpdater live together. If there is some update, we don’t need to split it further and increase pressure on the infrastructure.

This concludes the local data syncing node discussion in terms of multi-region replication and Search db syncing for LOCAL nodes. We will describe remote data syncing nodes later in this doc. First, let’s continue with the local data syncing node and talk about its other task: database upgrades.

Object localDataSyncingNode needs to consider now actually four databases (at maximum):

  1. Local database for API Version currently active (1)
  2. Local database for API Version to which we sync to (2)
  3. Local Search database for API Version currently active (3)
  4. Local Search database for API Version to which we sync to (4)

Let’s introduce the terms: Active database and Syncing database. When we are upgrading to a new API version, the Active database contains old data and the Syncing database contains new data. When we are synchronizing in the other direction, for rollback purposes (just in case), the Active database contains new data and the Syncing database contains old data.

And extra SyncingMetaUpdaters:

  • syncingMetaUpdater for the currently active version (5)
  • syncingMetaUpdater for synced version (6)

We need sync connections:

  • Point 1 to Point 2 (This is most important for database upgrade)
  • Point 1 to Point 3
  • Point 2 to Point 4
  • Point 1 to Point 5 (plus extra signal input from syncingMetaSet active instance)
  • Point 2 to Point 6 (plus extra signal input from the syncingMetaSet syncing instance)

This is complex and needs careful code writing, which is sometimes lacking here. We still need to add tests and clean up the code, but deadlines forced compromises.

Go back to function newShardRangedLocalDataSyncingNode in local_data_syncing_node.go, and see the line with if needsVersioning and below. This constructs the extra elements. First, note we create a syncingVsResVSet object, another resVersionsSet. This set is responsible for syncing between the Syncing database and the search store. It is also used to keep signaling the syncing version to syncingMetaUpdater. In retrospect this was a mistake, because we don’t need this element: it is enough for the Active database to keep running its syncingMetaUpdater. We know those updates will be reflected in the Syncing database because we already sync in that direction! We do, however, need to keep the second, additional Search database syncing. When we finish upgrading the database to the new version, we don’t want to start with an empty search store! That would not go unnoticed. Therefore, we have search syncing for the “Syncing database” too.

But let’s focus on the most important bit: the actual database upgrade, from the Active to the Syncing local main storage. Find the function newResourceVersioningSyncer and see where it is called. It receives access to the Syncing database, and it gets access to the node.activeVsResVSet object, which contains resources from the Active database. This is the object responsible for upgrading resources: resourceVersioningSyncer, in file resource_versioning_syncer.go. It works like the other “syncers” and inherits from the base syncer, but it also needs to transform resources. It uses transformers from the versioning packages. When it uses resVersionsSet, it calls SetSyncDbRes and DelSyncDbRes, to compare with the original database. We can safely require that metadata.resource_version must be the same between old and new resource instances; the transformation cannot change it. Because syncDb and searchDb are different, we are fine with having the search syncer and the versioning syncer use the same resource versions set.
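A simplified illustration of that invariant, with made-up resource types and transformer (not the real versioning packages), is shown below: the shape and references may change between API versions, but metadata.resource_version is carried over untouched:

```go
package main

import "fmt"

// Hypothetical resource shapes for two API versions of the same resource.
type resourceV1 struct {
	Name            string
	ResourceVersion int64
	DeviceRef       string
}

type resourceV2 struct {
	Name            string
	ResourceVersion int64
	DeviceRefs      []string // example of a schema change between versions
}

// upgradeToV2 transforms the resource but must not touch ResourceVersion,
// so the versions set can compare the old and new databases entry by entry.
func upgradeToV2(old resourceV1) resourceV2 {
	return resourceV2{
		Name:            old.Name, // name transformation omitted for brevity
		ResourceVersion: old.ResourceVersion, // MUST stay identical
		DeviceRefs:      []string{old.DeviceRef},
	}
}

func main() {
	old := resourceV1{
		Name:            "projects/p0/things/t0",
		ResourceVersion: 7,
		DeviceRef:       "projects/p0/devices/d0",
	}
	fmt.Println(upgradeToV2(old).ResourceVersion == old.ResourceVersion) // true
}
```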

Object resourceVersioningSyncer also performs extra ResourceShadow upgrades: transformed resources MAY have different references after the changes, therefore we need to refresh them! This makes this syncer even more special.

However, we have a small issue with ResourceShadow instances: they don’t have a metadata.syncing field, and they are only partially covered by resourceVersioningSyncer, which does not populate some fields, like back reference sources. As this is special, we need shadowsSyncer, defined in file shadows_versioning_syncer.go. It also synchronizes ResourceShadow instances, but only the fields that cannot be populated by resourceVersioningSyncer.

During database version syncing, localDataSyncingNode receives signals (per resource type) when there is a synchronization event between the source database and the syncing database. See that we have the ConnectSyncReadyListener method in resVersionsSet. This is how syncDb (here, the Syncing database!) notifies when there is a match between the two databases. This is used by localDataSyncingNode to coordinate Deployment version switches. See function runDbVersionSwitcher for the full procedure. This is basically the place where a Deployment can switch from one version to another. When this happens, all backend services flip their instances.
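A minimal sketch of the coordination idea (hypothetical shapes, not the real runDbVersionSwitcher): the switch only happens once every resource type has reported that the Syncing database caught up with the Active one:

```go
package main

import "fmt"

// versionSwitcher tracks, per resource type, whether the syncing database has
// caught up with the active one; only when every type is ready does it allow
// the Deployment version switch.
type versionSwitcher struct {
	pending map[string]bool // resource type -> still waiting for sync-ready
}

func newVersionSwitcher(resourceTypes []string) *versionSwitcher {
	p := make(map[string]bool, len(resourceTypes))
	for _, rt := range resourceTypes {
		p[rt] = true
	}
	return &versionSwitcher{pending: p}
}

// onSyncReady is what a ConnectSyncReadyListener-style callback would feed.
func (v *versionSwitcher) onSyncReady(resourceType string) (switchNow bool) {
	delete(v.pending, resourceType)
	return len(v.pending) == 0
}

func main() {
	sw := newVersionSwitcher([]string{"Device", "Project", "RoleBinding"})
	fmt.Println(sw.onSyncReady("Device"))      // false
	fmt.Println(sw.onSyncReady("Project"))     // false
	fmt.Println(sw.onSyncReady("RoleBinding")) // true -> switch Deployment version
}
```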

This is all about local data syncing nodes. Let us switch to remote nodes: a remote node (object remoteDataSyncingNode, file remote_data_syncing_node.go) syncs between the local database and a foreign regional one. It is simpler than the local one, at least. It synchronizes:

  • From remote database to local database
  • From remote database to local search database

If there are two API Versions, it is assumed that both regions may be updating. Then, we have 2 extra syncs:

  • From the remote database in the other version to the local database
  • From remote database in the other version to local search database

When we are upgrading, it is required to deploy new images to the first region, then the second, third, and so on, until the last region gets the new images. However, we must not switch the version of any region until all regions have the new images. While switching and deploying can be done one by one, those two stages need separation. This is required for these nodes to work correctly. Also, if we switch the Deployment version in one region before we upgrade images in the other regions, there is a high chance users would use the new API and see significant gaps in resources. Therefore, the versioning upgrade needs to be considered in the multi-region context too.

Again, we may be operating on four local databases and two remote APIs in total, but at least this is symmetric. Remote syncing nodes also don’t deal with Mixins, so there is no ResourceShadow cross-db syncing. If you study newShardRangedRemoteDataSyncingNode, you can see that it uses searchSyncer and dbSyncer (db_syncer.go).