Goten Design Concepts

Understanding the Goten design concepts.

1 - Meta Service as Service Registry

Understanding the role of Meta service as a service registry.

To build a multi-service framework, we first need a special service that acts as a service registry. Using it, we must be able to discover:

  • List of existing Regions
  • List of existing Services
  • List of existing Resources per Service
  • List of existing regional Deployments per Service.

This is provided by the meta.goten.com Service, found in the meta-service directory of the Goten repository. It follows the typical structure of any service but has no cmd directory or fixtures, as Goten provides only the basic parts. The final implementation is in the edgelq repository, in the meta directory. The SPEKTRA Edge version of meta also contains an old version of the service, v1alpha2, which is obsolete and irrelevant to this document, so ignore any v1alpha2 elements.

Still, the resource model for the Meta service resides in the Goten repository, in regular protobuf files. For Goten, we made the following design decisions, which are reflected in the fields of those protobuf files (you can, and should, read them).

  • The list of regions in the Meta service must show all possible regions where services can be deployed, not necessarily where they are deployed.
  • Each Service must be fairly independent. It must be able to specify its global network endpoint where it is reachable. It must display a list of API versions it has. For each API version, it must tell which services it imports, and which versions of them. It must tell what services it would like to use as a client too (but not import).
  • Every Deployment describes an instance of a service in a region. It must be able to specify its regional network endpoint and tell which service version it operates on (current maximum version). It is assumed it can support lower versions too. Deployments for a single service do not need to upgrade at once to the new version, but it’s recommended to not wait too long.
  • Deployments can be added to a Service dynamically, meaning that service owners can expand by just adding a new Deployment in the Meta service.
  • Each Service manages its multi-region setup. This means that each Service decides which region is “primary” for it. The list of Deployment resources then describes which regions are available.
  • Each region manages its network endpoints, but it is recommended to have the same domain for global and regional endpoints, and each regional endpoint has a region ID as part of a subdomain, before the main part.
  • For Service A to import Service B, we require that Service B is available in all regions where Service A is deployed. This should be the only limitation Services must follow for multi-region setup.
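
The import rule in the last item can be checked mechanically. The following is a minimal sketch in Go, not code from the Goten repository: the helper name missingRegions and the use of plain strings for region IDs are assumptions made for this illustration.

```go
package main

import "fmt"

// missingRegions returns the regions where the importing service is deployed
// but the imported service is not. An empty result means the import is allowed.
// Hypothetical helper, not part of Goten.
func missingRegions(importerRegions, importedRegions []string) []string {
	available := make(map[string]bool, len(importedRegions))
	for _, r := range importedRegions {
		available[r] = true
	}
	var missing []string
	for _, r := range importerRegions {
		if !available[r] {
			missing = append(missing, r)
		}
	}
	return missing
}

func main() {
	serviceA := []string{"us-west2", "eastus2"} // regions of the importing service
	serviceB := []string{"us-west2"}            // regions of the imported service
	fmt.Println(missingRegions(serviceA, serviceB)) // prints [eastus2]
}
```

Here Service A could not import Service B until B gets a Deployment in eastus2.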

All those design decisions are reflected in the protobuf files and in the server implementation. See the custom middlewares under meta-service/server/v1/ in the goten repository; they are fairly simple.

For SPEKTRA Edge, design decisions are that:

  • All core SPEKTRA Edge services (iam, meta adaptation, audit, monitoring, etc.) are always deployed to all regions and are deployed together.
  • This means that 3rd party services can always import any SPEKTRA Edge core service, because it is guaranteed to be in all regions needed by the 3rd party.
  • All core SPEKTRA Edge services will point to the same primary region.
  • All core SPEKTRA Edge services will have the same network domain: iam.apis.edgelq.com, monitoring.apis.edgelq.com, etc. If you replace the first word with another, it will be valid.
  • If core SPEKTRA Edge services are upgraded in some region, then they are all upgraded at once.
  • All core SPEKTRA Edge services will be public: anyone authenticated will be able to read their roles, permissions, and plans, or be able to import them.
  • All 3rd party services will be assumed to be users of core SPEKTRA Edge services (no cost if no actual use).
  • Service resources can be created by a ServiceAccount only. It is assumed that this ServiceAccount will be managing the Service.
  • A Service will belong to the Project to which the ServiceAccount that created it belongs.

Users may think of core edgelq services as a service bundle. Most of these SPEKTRA Edge rules are declarations, but we believe deployment workflows enforce them anyway. Two decisions, that all 3rd parties are considered users of all core SPEKTRA Edge services and that each Service must belong to some project, are reflected in an additional custom middleware for the meta service in the edgelq repository; see the file meta/server/v1/service/service_service.go. This extra middleware is executed before the custom middleware in the goten repository (meta-service/server/v1/service/service_service.go). In it, we add the core SPEKTRA Edge services to the used services array and assign the project that owns the Service. This project is where the ServiceAccounts are managed and where usage metrics will go.

This concludes the description of the Meta service, where we can find information about services and the relationships between them.

2 - EnvRegistry as Service Discovery

Understanding the role of EnvRegistry module in Meta service.

The Meta service provides an API for inspecting the global environment, but we also need a side library, called EnvRegistry:

  • It must allow a Deployment to register itself in a Meta service, so others can see it.
  • It must allow the discovery of other services with their deployments and resources.
  • It must provide a way to obtain real-time updates of what is happening in the environment.

The three items above are the responsibilities of the EnvRegistry module.

In the goten repo, this module is defined in the runtime/env_registry/env_registry.go file.

As of now, it can only be used by server, controller, and db-controller runtimes. It may become useful for client runtimes someday, but without the “registration” responsibility: a client is not part of the backend, so it cannot self-register in the Meta service.

One of the design decisions regarding EnvRegistry is that it must block till initialization is completed, meaning:

  • The user of the EnvRegistry instance must complete self-registration in the Meta service.
  • EnvRegistry must obtain the current state of services and deployments.

Note that no backend service works in isolation. As part of the Goten design, it is essential that:

  • any backend runtime knows its surroundings before executing its tasks.
  • all backend runtimes are able to see the other services and deployments that are relevant to them.
  • all backend runtimes initialize and run the EnvRegistry component, as one of the first things done in the main.go file.

This means that a backend service that cannot successfully pass initialization is blocked from doing any useful work. If you check all run functions in EnvRegistry, you should see that they lead to the runInBackground function. It starts several goroutines and then waits for a signal showing that all is fine. After this, EnvRegistry can be safely used to find other services and deployments and to make network connections.

This also guarantees that the Meta service contains the relevant records for services. In other words, EnvRegistry registration initializes regions, services, deployments, and resources. Note, however:

  • The region resources can be created/updated by the meta.goten.com service only. Since meta is the first service, it is responsible for initializing this resource.
  • The service resource is created by the first deployment of a given service. So, if we release custom.edgelq.com for the first time, in the first region, it will send a CreateService request. The next deployment of the same service, in the next region, will just send UpdateService. This update must carry a new MultiRegionPolicy, where the enabled regions field contains the new region ID.
  • Each deployment is responsible for its deployment resource in Meta.
  • All deployments for a given service are responsible for Resource instances. If a new service is deployed with the server, controller, and db-controller pods, then they may initially be sending clashing create requests. We are fine with those minor races there, since transactions in Meta service, coupled with CAS requests made by EnvRegistry, ensure eventual consistency.
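
The create-then-update flow with compare-and-swap can be sketched as below. It is a simplified model, not the EnvRegistry implementation: the in-memory store, the version counter used for CAS, and the register helper are all assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errConflict = errors.New("conflict")

// service is a simplified stand-in for the Service resource in Meta.
type service struct {
	version int      // bumped on every write and compared on update (CAS)
	regions []string // enabled regions of the multi-region policy
}

// store is a hypothetical in-memory substitute for the Meta service.
type store struct {
	mu       sync.Mutex
	services map[string]service
}

func (s *store) get(name string) (service, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	svc, ok := s.services[name]
	return svc, ok
}

// create fails with errConflict when the service already exists.
func (s *store) create(name, region string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.services[name]; ok {
		return errConflict
	}
	s.services[name] = service{version: 1, regions: []string{region}}
	return nil
}

// update succeeds only if the caller has seen the latest version.
func (s *store) update(name string, seenVersion int, regions []string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.services[name].version != seenVersion {
		return errConflict
	}
	s.services[name] = service{version: seenVersion + 1, regions: regions}
	return nil
}

// register creates the service or adds the region, retrying after lost races.
func register(s *store, name, region string) {
	for {
		cur, ok := s.get(name)
		if !ok {
			if s.create(name, region) == nil {
				return
			}
			continue // another runtime created it first, retry as an update
		}
		for _, r := range cur.regions {
			if r == region {
				return // already registered
			}
		}
		next := append(append([]string{}, cur.regions...), region)
		if s.update(name, cur.version, next) == nil {
			return
		}
	}
}

func main() {
	s := &store{services: map[string]service{}}
	var wg sync.WaitGroup
	// Two runtimes per region race to register the same service.
	for _, region := range []string{"us-west2", "us-west2", "eastus2", "eastus2"} {
		wg.Add(1)
		go func(r string) {
			defer wg.Done()
			register(s, "custom.edgelq.com", r)
		}(region)
	}
	wg.Wait()
	svc, _ := s.get("custom.edgelq.com")
	fmt.Println(len(svc.regions), "regions enabled") // prints 2 regions enabled
}
```

Whichever runtime loses a race simply re-reads the resource and tries again, so the final state is the same regardless of ordering.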

Visit the runInit function, which is one of the goroutines of EnvRegistry executed by runInBackground. It contains the procedures for registering Meta resources and finishes after a successful run.

Another design property of EnvRegistry emerges from this process: it is aware of its context and knows which Service and Deployment it is associated with. Therefore, it has getters for its own Deployment and Service.

Let’s stay for a while in this run process, as it shows other goroutines that are run forever:

  • One goroutine keeps running runDeploymentsWatch
  • Second goroutine keeps running runServicesWatch
  • The final goroutine is the main one, runMainSync

We don’t need real-time watch updates of regions and resources; we need them only for services and their regional deployments. A watch normally requires a separate goroutine, and that is the case here too. To synchronize event processing across multiple real-time streams, we need a “main synchronization loop”, which unites all Go channels.

In the main sync goroutine, we:

  • Process changes detected by runServicesWatch.
  • Process changes detected by runDeploymentsWatch.
  • Catch the initialization signal from the runInit function, which guarantees that information about our service is stored in Meta.
  • Attach new real-time subscribers. When they attach, they must get a snapshot of past events.
  • Detach real-time subscribers.
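
A main synchronization loop of this kind is usually a single select over all channels. The sketch below illustrates the idea only; the event and subscriber types and the channel layout are assumptions, and the initialization signal from runInit is left out to keep it short.

```go
package main

import "fmt"

// event is a simplified change notification. Hypothetical type.
type event struct {
	kind string // "service" or "deployment"
	name string
}

// subscriber receives a snapshot of past events, then live ones.
type subscriber chan event

// mainSync is the only goroutine that touches history and subs,
// so neither needs a lock.
func mainSync(services, deployments <-chan event, attach, detach <-chan subscriber, done <-chan struct{}) {
	var history []event
	subs := map[subscriber]bool{}
	handle := func(e event) {
		history = append(history, e)
		for s := range subs {
			s <- e
		}
	}
	for {
		select {
		case e := <-services:
			handle(e)
		case e := <-deployments:
			handle(e)
		case s := <-attach:
			for _, e := range history {
				s <- e // snapshot of everything seen so far
			}
			subs[s] = true
		case s := <-detach:
			delete(subs, s)
		case <-done:
			return
		}
	}
}

func main() {
	services := make(chan event)
	deployments := make(chan event)
	attach := make(chan subscriber)
	detach := make(chan subscriber)
	done := make(chan struct{})
	go mainSync(services, deployments, attach, detach, done)

	services <- event{"service", "iam.edgelq.com"}
	sub := make(subscriber, 8) // buffered so the loop never blocks on us
	attach <- sub
	deployments <- event{"deployment", "iam.edgelq.com/us-west2"}
	detach <- sub
	close(done)
	fmt.Println(<-sub, <-sub) // snapshot event first, then the live one
}
```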

As an additional note: since EnvRegistry is self-aware, it gets only the Services and Deployments that are relevant. Those are:

  • Services and Deployments of its Service (obviously)
  • Services and Deployments that are used/imported by the current Service
  • Services and Deployments that are using the current Service
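
The three rules above amount to a simple filter over the "uses/imports" relation. Here is a minimal sketch, assuming that relation is available as a map; the function name relevantServices and the service another.edgelq.com are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// relevantServices returns the services an EnvRegistry for self would track:
// itself, the services it uses or imports, and the services that use or
// import it. uses maps a service name to the services it uses/imports.
func relevantServices(self string, uses map[string][]string) []string {
	set := map[string]bool{self: true}
	for _, dep := range uses[self] {
		set[dep] = true
	}
	for svc, deps := range uses {
		for _, dep := range deps {
			if dep == self {
				set[svc] = true
			}
		}
	}
	out := make([]string, 0, len(set))
	for s := range set {
		out = append(out, s)
	}
	sort.Strings(out)
	return out
}

func main() {
	uses := map[string][]string{
		"iam.edgelq.com":     {"meta.goten.com"},
		"custom.edgelq.com":  {"meta.goten.com", "iam.edgelq.com"},
		"another.edgelq.com": {"meta.goten.com"},
	}
	// another.edgelq.com is a "neighbor" of iam.edgelq.com and is not listed.
	fmt.Println(relevantServices("iam.edgelq.com", uses))
	// meta.goten.com is used by everyone, so it sees every service.
	fmt.Println(relevantServices("meta.goten.com", uses))
}
```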

The last two parts are important: they mean that the EnvRegistry of a top service (like meta.goten.com) is aware of all Services and Deployments. Services at higher levels will see all those below or above them, but they won’t be able to see “neighbors”. The higher a service sits in the tree, the fewer services there are above it and the more below it, but the proportion of neighbors keeps growing.

This should not be a problem unless we reach the scale of thousands of Services. Core SPEKTRA Edge services will, however, be under more pressure than all upstream ones for various reasons.

In the context of SPEKTRA Edge, we made additional implementation decisions for platform deployments:

  • Each service, except meta.goten.com itself, must connect to the regional meta service in its EnvRegistry.

    For example, iam.edgelq.com in us-west2 must connect to the Meta service in us-west2. Service custom.edgelq.com in eastus2 must connect to the Meta service in eastus2.

  • Server instance of meta.goten.com must use local-mode EnvRegistry. The reason is that it can’t connect to itself via the API, especially since it must complete EnvRegistry initialization before running its API server.

  • DbController instance of meta.goten.com is special and shows the asymmetric nature of SPEKTRA Edge core services regarding regions. As a whole, core SPEKTRA Edge services point to the same primary region, and any other region is secondary. Therefore, the DbController instance of meta.goten.com must:

    • In the primary region, connect to the API server of meta.goten.com in the primary region (intra-region)
    • In the secondary region, connect to the API server of meta.goten.com in the primary region (the secondary region connects to the primary).

Therefore, when we add a new region, the meta-db-controller in the secondary region registers itself in the primary region meta-service. This way, the primary region becomes aware of the new region’s creation. There is a further reason to give meta-db-controller this responsibility: it is also responsible for syncing the secondary region meta database from the primary one. This will be discussed in the following section of this guide. For now, we have only described the conventions for where EnvRegistry must source its information from.
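
These sourcing conventions can be summarized in a small decision function. The sketch below is illustrative only: metaTarget and selectMetaTarget are hypothetical names, runtime kinds are plain strings, and the fallback for any other meta.goten.com runtime is an assumption, since the text above does not cover it.

```go
package main

import "fmt"

const metaService = "meta.goten.com"

// metaTarget describes where an EnvRegistry sources its data from.
// local=true means local mode, with no API connection at all.
type metaTarget struct {
	local  bool
	region string
}

// selectMetaTarget applies the SPEKTRA Edge conventions described above.
// runtime is "server", "controller" or "db-controller".
func selectMetaTarget(service, runtime, ownRegion, primaryRegion string) metaTarget {
	if service != metaService {
		return metaTarget{region: ownRegion} // regional Meta service
	}
	switch runtime {
	case "server":
		return metaTarget{local: true} // cannot call its own API before it starts
	case "db-controller":
		return metaTarget{region: primaryRegion} // always the primary region
	}
	return metaTarget{region: ownRegion} // assumption: not specified in the text
}

func main() {
	const primary = "us-west2"
	fmt.Printf("%+v\n", selectMetaTarget("iam.edgelq.com", "server", "eastus2", primary))
	fmt.Printf("%+v\n", selectMetaTarget(metaService, "server", "eastus2", primary))
	fmt.Printf("%+v\n", selectMetaTarget(metaService, "db-controller", "eastus2", primary))
}
```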

3 - Resource Metadata

Understanding the resource metadata for the service synchronization.

As a protocol, Goten needs to have protocol-like properties. One of them is the requirement that resource types of all Services managed by Goten must contain metadata objects. This was already mentioned multiple times, but here is the link to the Meta object again: https://github.com/cloudwan/goten/blob/main/types/meta.proto.

A resource type managed by Goten must implement the following methods (see the Resource interface defined in the runtime/resource/resource.go file):

GetMetadata() *meta.Meta
EnsureMetadata() *meta.Meta

There is, of course, an option to opt out: the Descriptor interface has the method SupportsMetadata() bool. If it returns false, the resource type is not managed by Goten and is excluded from the Goten design. It is therefore important to be able to recognize, including programmatically, whether a resource type is subject to this design.
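
To make the contract concrete, here is a minimal sketch of a resource type implementing the two metadata methods. The Meta struct is reduced to a single field and the Device type is hypothetical. The behaviour shown for EnsureMetadata (allocating the object when it is missing) is an assumption inferred from the method name, not taken from the Goten sources.

```go
package main

import "fmt"

// Meta is a reduced stand-in for the Goten Meta object.
type Meta struct {
	OwningRegion string
}

// Resource lists only the two metadata methods named above.
type Resource interface {
	GetMetadata() *Meta
	EnsureMetadata() *Meta
}

// Device is a hypothetical resource type.
type Device struct {
	Name     string
	Metadata *Meta
}

// GetMetadata returns the metadata as is, possibly nil.
func (d *Device) GetMetadata() *Meta { return d.Metadata }

// EnsureMetadata allocates metadata on first use so callers can write to it.
func (d *Device) EnsureMetadata() *Meta {
	if d.Metadata == nil {
		d.Metadata = &Meta{}
	}
	return d.Metadata
}

func main() {
	var r Resource = &Device{Name: "devices/d1"}
	fmt.Println(r.GetMetadata() == nil) // true: nothing allocated yet
	r.EnsureMetadata().OwningRegion = "us-west2"
	fmt.Println(r.GetMetadata().OwningRegion) // us-west2
}
```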

To summarize, as a protocol, Goten requires resources to satisfy this interface. It is important to note what information is stored in resource metadata in the context of the Goten design:

  • Field syncing of type SyncingMeta must always describe which region owns a resource, and which regions have a read copy of it. SyncingMeta must always be populated for each resource, regardless of type.

  • Field services of type ServicesInfo must tell us which service owns a given resource, and list the services for which this resource is relevant. Unlike syncing, services is not necessarily populated, meaning that the Service defining the resource type is responsible for explaining how it works in this case. This may change slightly in the future:

    If services is not populated at the moment of resource save, it will point to the current service as owning, and allowed services will be a one-element array containing the current service too. This in fact should be assumed by default, but it is not enforced globally, which we will explain now.
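
The defaulting rule described above could look roughly like this. The sketch is hypothetical: ServicesInfo is reduced to the two fields discussed, and withDefaults is not a Goten function.

```go
package main

import "fmt"

// ServicesInfo mirrors the two fields discussed above in simplified form.
type ServicesInfo struct {
	OwningService   string
	AllowedServices []string
}

// withDefaults returns info unchanged when it is populated. Otherwise the
// current service becomes both the owner and the only allowed service.
func withDefaults(info *ServicesInfo, currentService string) *ServicesInfo {
	if info != nil && info.OwningService != "" {
		return info
	}
	return &ServicesInfo{
		OwningService:   currentService,
		AllowedServices: []string{currentService},
	}
}

func main() {
	fmt.Printf("%+v\n", *withDefaults(nil, "iam.edgelq.com"))
	explicit := &ServicesInfo{
		OwningService:   "custom.edgelq.com",
		AllowedServices: []string{"custom.edgelq.com", "iam.edgelq.com"},
	}
	fmt.Printf("%+v\n", *withDefaults(explicit, "iam.edgelq.com"))
}
```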

First, service meta.goten.com always ensures that the services field is populated for the following cases:

  • Instances of meta.goten.com/Service must have ServicesInfo where:
    • Field owning_service is equal to the current service itself.
    • Field allowed_services contains the current service, all imported/used services, AND all services using/importing this service! Note that this may change dynamically: if a new service is deployed, it will update the ServicesInfo fields of all services it uses/imports.
  • Instances of meta.goten.com/Deployment and meta.goten.com/Resource must have their ServicesInfo synchronized with parent meta.goten.com/Service instance.
  • Instances of meta.goten.com/Region do not have ServicesInfo typically populated. However, in the SPEKTRA Edge context, we have a public RoleBinding that allows all users to read from this collection (but never write). Because of this private/public nature, there was no need to populate service information there.

Note that this implies that service meta.goten.com is responsible for syncing ServicesInfo of meta.goten.com/Deployment and meta.goten.com/Resource instances. It is done by a controller implemented in the Goten repository: meta-service/controller directory. It is relatively simple.

However, while meta.goten.com can determine what ServicesInfo should be populated for its own resources, this is often not the case elsewhere. For example, when service iam.edgelq.com receives a CreateServiceAccount request, it does not necessarily know for whom this ServiceAccount is. Multiple services may own ServiceAccount resources, but the resource type itself does not have a dedicated “service” field in its schema. The only way services can annotate ServiceAccount resources is by providing the necessary metadata information. Furthermore, if some custom service wants to make a ServiceAccount instance visible to other services, it may need to provide multiple items in the allowed_services array. This shows that service information must be determined at the business logic level. For this reason, empty service information is allowed, but in many cases SPEKTRA Edge will enforce its presence where business logic requires it.

The situation for the other meta field, syncing, is much easier: its value can be determined at the schema level. Instructions for this already exist in the multi-region design section of the developer guide.

The regions setup can always be derived from the resource name alone:

  • If it is a regional resource (has a region/ segment in the name), the name strictly tells which region owns it. The list of regions that get a read-only copy is decided by the resource name properties below.
  • If it contains a well-known policy-holder in the name, then the policy-holder defines which regions get a read copy. If the resource is non-regional, then the MultiRegionPolicy also tells which region owns it (the default control region).
  • If the resource is not subject to the MultiRegionPolicy of any policy-holder (like Region, or User in iam.edgelq.com), then it is subject to the MultiRegionPolicy defined in the relevant meta.goten.com/Service instance (for this service).
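
The three rules above can be sketched as a pure function over the resource name. Everything here is an assumption made for illustration: the reduced policy and syncing types, the "regions/<id>" name segment, the resource names in main, and the choice to treat every enabled region other than the owner as a read-copy region.

```go
package main

import (
	"fmt"
	"strings"
)

// policy is a reduced MultiRegionPolicy.
type policy struct {
	defaultControlRegion string
	enabledRegions       []string
}

// syncing is a reduced SyncingMeta: the owner plus regions with read copies.
type syncing struct {
	owningRegion string
	readRegions  []string
}

// resolveSyncing derives syncing information from a resource name.
// holders maps policy-holder names (e.g. "projects/p1") to their policies;
// servicePolicy is the fallback taken from the Service resource.
func resolveSyncing(name string, holders map[string]policy, servicePolicy policy) syncing {
	p, best := servicePolicy, ""
	for prefix, hp := range holders {
		// The longest matching policy-holder prefix wins.
		if strings.HasPrefix(name, prefix+"/") && len(prefix) > len(best) {
			p, best = hp, prefix
		}
	}
	owner := p.defaultControlRegion
	parts := strings.Split(name, "/")
	for i := 0; i+1 < len(parts); i++ {
		if parts[i] == "regions" {
			owner = parts[i+1] // regional resource: the name dictates the owner
		}
	}
	var readers []string
	for _, r := range p.enabledRegions {
		if r != owner {
			readers = append(readers, r)
		}
	}
	return syncing{owningRegion: owner, readRegions: readers}
}

func main() {
	holders := map[string]policy{
		"projects/p1": {defaultControlRegion: "us-west2", enabledRegions: []string{"us-west2", "eastus2"}},
	}
	servicePolicy := policy{defaultControlRegion: "us-west2", enabledRegions: []string{"us-west2"}}
	fmt.Printf("%+v\n", resolveSyncing("projects/p1/regions/eastus2/devices/d1", holders, servicePolicy))
	fmt.Printf("%+v\n", resolveSyncing("projects/p1/roles/r1", holders, servicePolicy))
	fmt.Printf("%+v\n", resolveSyncing("users/u1", holders, servicePolicy))
}
```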

Now the trick is: All policy-holder resources are well-known. Although we try not to hardcode anything anywhere, Goten provides utility functions for detecting if a resource contains a MultiRegionPolicy field in its schema. This also must be defined in the Goten specification. By detecting what resource types are policy-holders, Goten can provide components that can easily extract regional information from a given resource by its name only.

Versioning information does not need to be specified in the resource body. Given a resource instance, it is easy to get its Descriptor instance and check the API version. All schema references are clear in this regard too: if resource A has a reference field to resource B, then from the reference object we can get the Descriptor instance of B and get the version. The only place where this is not possible is meta owner references. Therefore, each item in the field metadata.owner_references must contain the name, owning service, API version, and region (in case it is not provided in the name field). When talking about meta references, it is important to mention other differences compared to schema-level references:

  • schema references are owned by a Service that owns resources with references.
  • meta owner references are owned by a Service to which references are pointing!

This ownership has an implication: when Deployment D1 in Service S1 upgrades from v1 to v2 (for example), and some resource X in Deployment D2 of Service S2 has a meta owner reference to a resource owned by D1, then D1 is responsible for sending an Update request to D2 so that the meta owner reference is updated.
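
As an illustration of that upgrade step, the sketch below rewrites the API version on owner references that point to the upgraded service. The ownerReference struct only carries the four fields listed earlier, and both it and upgradeOwnerRefs are hypothetical; the service name s1.example.com is a placeholder for S1.

```go
package main

import "fmt"

// ownerReference carries the four fields listed above. Simplified type.
type ownerReference struct {
	Name          string
	OwningService string
	APIVersion    string
	Region        string
}

// upgradeOwnerRefs returns a copy of refs in which every reference owned by
// service points at newVersion, and reports whether anything changed.
func upgradeOwnerRefs(refs []ownerReference, service, newVersion string) ([]ownerReference, bool) {
	out := make([]ownerReference, len(refs))
	changed := false
	for i, ref := range refs {
		if ref.OwningService == service && ref.APIVersion != newVersion {
			ref.APIVersion = newVersion
			changed = true
		}
		out[i] = ref
	}
	return out, changed
}

func main() {
	// Resource X in service S2 holds an owner reference to a resource of S1.
	refs := []ownerReference{{
		Name:          "projects/p1",
		OwningService: "s1.example.com",
		APIVersion:    "v1",
		Region:        "us-west2",
	}}
	updated, changed := upgradeOwnerRefs(refs, "s1.example.com", "v2")
	fmt.Println(changed, updated[0].APIVersion) // true v2
}
```

In the flow described above, D1 would send the Update request to D2 only when the second return value is true.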

4 - Multi-Region Policy Store

Understanding the design of the multi-region policy store.

We mentioned MultiRegion policy-holder resources and their importance for evaluating region syncing information from a resource name. We need a MultiRegion PolicyStore object that, for any given resource name, returns the managing MultiRegionPolicy object. This object is defined in the Goten repository, in the file runtime/multi_region/policy_store.go. This file is important for this design and worth remembering. As of now, though, it returns a nil object for global resources. In this case, the caller should take the MultiRegionPolicy of the relevant Service from the EnvRegistry component.

It uses a cache that accumulates policy objects, so IO operations are normally needed only initially. Watch-based invalidation allows us to keep a long-lived cache.
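
The caching behaviour can be sketched as a small read-through cache with explicit invalidation. This is not the code from policy_store.go: the policyStore type, its load callback, and the invalidate method are assumptions that only mirror the idea of "load once, drop on watch event".

```go
package main

import (
	"fmt"
	"sync"
)

// policy is a reduced MultiRegionPolicy.
type policy struct{ defaultControlRegion string }

// policyStore caches policies by policy-holder name. Hypothetical sketch.
type policyStore struct {
	mu    sync.RWMutex
	cache map[string]*policy
	load  func(holder string) (*policy, error) // the only IO path
}

// get serves from the cache and falls back to load on a miss.
func (s *policyStore) get(holder string) (*policy, error) {
	s.mu.RLock()
	p, ok := s.cache[holder]
	s.mu.RUnlock()
	if ok {
		return p, nil
	}
	p, err := s.load(holder)
	if err != nil {
		return nil, err
	}
	s.mu.Lock()
	s.cache[holder] = p
	s.mu.Unlock()
	return p, nil
}

// invalidate is what a watch event handler would call on a policy change.
func (s *policyStore) invalidate(holder string) {
	s.mu.Lock()
	delete(s.cache, holder)
	s.mu.Unlock()
}

func main() {
	loads := 0
	s := &policyStore{
		cache: map[string]*policy{},
		load: func(holder string) (*policy, error) {
			loads++ // stands in for a database or API read
			return &policy{defaultControlRegion: "us-west2"}, nil
		},
	}
	s.get("projects/p1")
	s.get("projects/p1") // served from the cache
	fmt.Println("loads after two gets:", loads) // 1
	s.invalidate("projects/p1")
	s.get("projects/p1")
	fmt.Println("loads after invalidation:", loads) // 2
}
```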

We have some code generation that provides the functions needed to initialize a PolicyStore for a given Service in a given version, but the caller is responsible for remembering to include them (in all those main.go files for server runtimes!).

In this file, you can also see functions that set and get a MultiRegionPolicy on a context object. In the multi-region design, server code is required to store the MultiRegionPolicy object in the context if there will be updates to the database!