Developing the Sample Service

Let’s develop the sample service.

When writing code for your service, it is important to know some Goten/SPEKTRA Edge-specific components and how to use them. This part contains notable examples and advice.

Some examples here apply to edge runtimes too, as they often describe methods of accessing service backends.

Basic CRUD functionality

Unit tests are often a good way to show what Goten/SPEKTRA Edge can do. While the example service implementation shows something more “real” and “full”, individual use cases are better represented in a shorter form by tests. In Goten, we have a CRUD test: https://github.com/cloudwan/goten/blob/main/example/library/integration_tests/crud_test.go and a pagination test: https://github.com/cloudwan/goten/blob/main/example/library/integration_tests/pagination_test.go

Client modules will always be used by edge applications, and often by servers too, since the backend, on top of storage access, will always need some access to other services/regions.

Using field paths/masks generated by Goten

Goten generates plenty of code related to field masks and paths. Those can be used for various techniques.

import (
	// Imaginary resource, but you can still use example
	resmodel "github.com/cloudwan/some-repo/resources/v1/some_resource"
)

func DemoExampleFieldPathsUsage() {
	// Construct a field mask from two field paths.
	fieldMaskObject := &resmodel.SomeResource_FieldMask{Paths: []resmodel.SomeResource_FieldPath{
		resmodel.NewSomeResourceFieldPathBuilder().SomeField().FieldPath(),
		resmodel.NewSomeResourceFieldPathBuilder().OtherField().NestedField().FieldPath(),
	}}
	_ = fieldMaskObject // Use the mask in requests, projections, etc.

	// We can also set a value on an object. If any item on the path is nil,
	// it is allocated on the way.
	res := &resmodel.SomeResource{}
	resmodel.NewSomeResourceFieldPathBuilder().OtherField().NestedField().WithValue("SomeValue").SetTo(&res)
	resmodel.NewSomeResourceFieldPathBuilder().IntArrayField().WithValue([]int32{4, 3, 2, 1}).SetTo(&res)

	// You can read items through a field path, even if there is an array on
	// the path. The result is untyped, so this time we need to cast.
	for _, iitem := range resmodel.NewSomeResourceFieldPathBuilder().ObjectField().ArrayOfObjectsField().ItemFieldOfStringType().Get(res) {
		item := iitem.(string) // Safe if we know that "item_field_of_string_type" is a string.
		_ = item               // Do something with item here...
	}
}

It is worth looking at the FieldMask and FieldPath interfaces in the github.com/cloudwan/goten/runtime/object module. Those interfaces are implemented for all resource-related objects. Many of their methods have strongly typed equivalents.

With field path objects you can:

  • Set a value on a resource
  • Extract a value (or values) from a resource
  • Compare a value with the one in a resource
  • Clear a value from a resource
  • Get the default value for a field path (you may need reflection though)

With field masks, you can:

  • Project a resource (shallow copy for selected paths)
  • Merge resources with field mask…
  • Copy selected field paths from one resource to another
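The field mask operations listed above can be illustrated with a minimal, self-contained sketch. It does not use the Goten-generated code: Resource, FieldPath, FieldMask, and the Path* variables are hypothetical stand-ins, defined here only to show what projecting and copying by mask mean.

```go
package main

import "fmt"

// Resource is a hypothetical stand-in for a Goten-generated resource type.
type Resource struct {
	Name        string
	DisplayName string
	Labels      map[string]string
}

// FieldPath is a simplified, hypothetical field path: a name plus a function
// that copies exactly one field from src to dst.
type FieldPath struct {
	Name string
	Copy func(src, dst *Resource)
}

// Hypothetical paths for the three fields of Resource.
var (
	PathName        = FieldPath{"name", func(s, d *Resource) { d.Name = s.Name }}
	PathDisplayName = FieldPath{"display_name", func(s, d *Resource) { d.DisplayName = s.DisplayName }}
	PathLabels      = FieldPath{"labels", func(s, d *Resource) { d.Labels = s.Labels }}
)

// FieldMask is a list of field paths.
type FieldMask struct{ Paths []FieldPath }

// CopyTo copies only the masked fields from src to dst; other fields of dst
// are left untouched.
func (m FieldMask) CopyTo(src, dst *Resource) {
	for _, p := range m.Paths {
		p.Copy(src, dst)
	}
}

// Project returns a shallow copy of src that contains only the masked fields.
func (m FieldMask) Project(src *Resource) *Resource {
	dst := &Resource{}
	m.CopyTo(src, dst)
	return dst
}

func main() {
	src := &Resource{Name: "resources/a", DisplayName: "A", Labels: map[string]string{"env": "dev"}}
	mask := FieldMask{Paths: []FieldPath{PathName, PathLabels}}
	projected := mask.Project(src)
	fmt.Printf("%q %q %v\n", projected.Name, projected.DisplayName, projected.Labels)
}
```

The generated Goten types offer much richer, strongly typed variants of these operations; the sketch only shows the idea that a mask selects which paths are touched.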

You can also explore some examples in unit tests: https://github.com/cloudwan/goten/blob/main/runtime/object/fieldmask_test.go https://github.com/cloudwan/goten/blob/main/runtime/object/object_test.go

Tests for objects also show more possibilities related to field paths: we can use those modules for general deep cloning, diffing, or merging.

Creating resources with meta-owner references

The inventory-manager example contains a concrete case of creating a resource with a meta owner reference; see the custom CreateDeviceOrder implementation. Before the DeviceOrder resource is created, we connect to the secrets.edgelq.com service and create a Secret resource. We create it with a populated metadata.ownerReferences value: we pass a meta OwnerReference object, which contains the name of the DeviceOrder being created, along with the ID of the region where it is being created.

This is the file with the code we describe: https://github.com/cloudwan/inventory-manager-example/blob/master/server/v1/device_order/device_order_service.go.

Find the implementation of the CreateDeviceOrder method there.

Meta owner references are a different kind of reference than those defined in the schema. Mainly:

  • They are considered “soft”, and can never block the resources they point to.
  • Unfortunately, you cannot filter by them.
  • During creation (or when making an Update request with a new meta owner), a meta owner reference does not need to point to an existing resource (yet - see below).
  • They have specific deletion behavior (see below).

We call the resource pointed to by a meta owner reference the “meta owner”; the pointing one is the “meta ownee”.

Meta owner references have the following deletion properties:

  • When the meta owner resource is deleted, the meta owner reference is unset asynchronously.
  • If the meta owner resource does not exist, then after some time (minutes), the meta owner reference is removed from the meta ownee.
  • If the field metadata.ownerReferences becomes an empty array due to the removal of the last meta owner, the meta ownee resource is automatically deleted!

Therefore, you can think of the meta ownee as having a specific ASYNC_CASCADE_DELETE behavior, except that it is deleted only once all of its meta owners are gone.

When possible, it is much better to use schema references, declared in the protobuf files. However, that is not always possible, as here, because the InventoryManager service imports secrets.edgelq.com, not the other way around. The Secrets service cannot possibly know about the InventoryManager resource model, therefore the Secret resource cannot have any reference to DeviceOrder. Instead, when we want to create a Secret resource and tie it to the lifecycle of a DeviceOrder (we want the Secret to be garbage collected), meta ownership is precisely what we should use.

This way, we can ensure that “child” resources from lower-level services like secrets are automatically cleaned up. This also happens if, after successful Secret creation, we fail to create the DeviceOrder (let’s say something happened and the database rejected our transaction without a retry option). This is because a meta owner reference times out when the meta owner still does not exist a couple of minutes after the reference was attached.

There is one extreme corner case though. It is possible that the Secret resource is created successfully, but the transaction saving the DeviceOrder fails with an Aborted code, which is an error type that can be retried. As a result, the whole transaction is repeated, including another CreateSecret call. After the second attempt, we have two Secrets pointing to the same DeviceOrder, but the DeviceOrder references only one of them. The other is stale. This particular case is handled by the WithRequiresOwnerReference option passed to the meta owner reference: it means that the meta owner reference is also removed from the meta ownee when the parent resource has no “hard” reference pointing at the meta ownee. In this case, one of the Secrets would not be pointed to by the DeviceOrder and would be automatically cleaned up asynchronously.

It is advised to always use a meta owner reference with the WithRequiresOwnerReference option if the parent resource can have a schema reference to the meta ownee, as in this case, where DeviceOrder has a reference to a Secret. It follows the principle that the owner has a reference to the ownee. Note that we are creating a kind of reference loop here, but it is allowed in this case.
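The retried-transaction corner case can be sketched as follows. This is a simplified model under stated assumptions, not the real cleanup logic: Resource, OwnerRef, refHolds, and StaleOwnees are hypothetical names, timing is ignored, and the only rule modeled is that a reference flagged with RequiresOwnerReference holds only if the owner has a schema reference back to the ownee.

```go
package main

import (
	"fmt"
	"sort"
)

// OwnerRef is a hypothetical, simplified meta owner reference.
type OwnerRef struct {
	Owner                  string
	RequiresOwnerReference bool
}

// Resource is a hypothetical resource with both kinds of references.
type Resource struct {
	Name       string
	SchemaRefs []string   // "hard" references declared in the schema
	OwnerRefs  []OwnerRef // metadata.ownerReferences
}

// refHolds reports whether a meta owner reference should be kept.
func refHolds(ownee *Resource, ref OwnerRef, store map[string]*Resource) bool {
	owner, ok := store[ref.Owner]
	if !ok {
		return false // owner does not exist
	}
	if !ref.RequiresOwnerReference {
		return true
	}
	// The owner must point back at the ownee with a schema reference.
	for _, target := range owner.SchemaRefs {
		if target == ownee.Name {
			return true
		}
	}
	return false
}

// StaleOwnees returns the sorted names of resources that carry meta owner
// references, none of which hold. Those would be cleaned up.
func StaleOwnees(store map[string]*Resource) []string {
	var stale []string
	for _, res := range store {
		if len(res.OwnerRefs) == 0 {
			continue
		}
		holds := false
		for _, ref := range res.OwnerRefs {
			if refHolds(res, ref, store) {
				holds = true
				break
			}
		}
		if !holds {
			stale = append(stale, res.Name)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	ref := OwnerRef{Owner: "deviceOrders/o1", RequiresOwnerReference: true}
	store := map[string]*Resource{
		// The order was saved on the second attempt and references only the second secret.
		"deviceOrders/o1": {Name: "deviceOrders/o1", SchemaRefs: []string{"secrets/second"}},
		"secrets/first":   {Name: "secrets/first", OwnerRefs: []OwnerRef{ref}},
		"secrets/second":  {Name: "secrets/second", OwnerRefs: []OwnerRef{ref}},
	}
	fmt.Println(StaleOwnees(store))
}
```

Without the RequiresOwnerReference flag, both secrets in this model would be kept for as long as the order exists, which is exactly the leak the option is meant to prevent.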

Creating resources from a 3rd party service

Any 3rd party service can create resources in SPEKTRA Edge core services; however, there is a condition attached: it must mark those resources with service ownership information.

In the CreateDeviceOrder method from https://github.com/cloudwan/inventory-manager-example/blob/master/server/v1/device_order/device_order_service.go, look again at the CreateSecret call and see the metadata.services field of the Secret being created. We need to pass the following information:

  • Which service owns this particular resource

    We must point to our service.

  • List of allowed services that can read this resource

    We should point to our service, but we may optionally include other services too if needed.

Setting this field is a common requirement when a 3rd party service needs to create a resource owned by it.

It is assumed that a Service should not have full access to the project. Users, however, can create resources without this restriction.
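As a rough illustration of the two pieces of information above, the sketch below builds and checks a value shaped like metadata.services. ServicesInfo here is a hypothetical mirror of that field, not the generated type; NewServicesInfo and Check are illustrative helpers, and the service domain used in main is made up.

```go
package main

import (
	"errors"
	"fmt"
)

// ServicesInfo is a hypothetical mirror of the metadata.services field.
type ServicesInfo struct {
	OwningService   string
	AllowedServices []string
}

// NewServicesInfo builds the field for a resource created by the service
// `self`. The creating service is always the owner and the first allowed
// reader; extra readers are appended without duplicates.
func NewServicesInfo(self string, extraReaders ...string) ServicesInfo {
	allowed := []string{self}
	seen := map[string]bool{self: true}
	for _, s := range extraReaders {
		if !seen[s] {
			seen[s] = true
			allowed = append(allowed, s)
		}
	}
	return ServicesInfo{OwningService: self, AllowedServices: allowed}
}

// Check verifies the two rules described above from the creator's side.
func Check(self string, info ServicesInfo) error {
	if info.OwningService != self {
		return errors.New("owning service must be the creating service")
	}
	for _, s := range info.AllowedServices {
		if s == self {
			return nil
		}
	}
	return errors.New("allowed services should include the creating service")
}

func main() {
	// "inventory-manager.example.com" is a made-up service name.
	info := NewServicesInfo("inventory-manager.example.com", "devices.edgelq.com")
	fmt.Println(info.OwningService, info.AllowedServices, Check("inventory-manager.example.com", info))
}
```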

Accessing service from the client

Services on SPEKTRA Edge typically have Edge clients: devices/applications running with a ServiceAccount registered in IAM, connecting to SPEKTRA Edge or a third-party service via API.

An example is provided with inventory-manager here: https://github.com/cloudwan/inventory-manager-example/blob/master/cmd/simple-agent-simulator/dialer.go

Note that you can skip WithPerRPCCredentials to have anonymous access. The Authenticator will then classify the principal as Anonymous, and the Authorizer will likely reject the request with a PermissionDenied code. Anonymous access may still be useful, for example during activation, when a service account is being created and credential keys are allocated. In that case the backend needs to allow anonymous access and provide custom security. See Edge agent activation in this doc.

You can wrap the created gRPC connection with the client interfaces generated in the client packages for your service (or any other SPEKTRA Edge-based service).

Edge agent activation

A SPEKTRA Edge-based service typically has human users (represented by the User resource in iam.edgelq.com), or agents running on the edge (represented by the ServiceAccount resource in iam.edgelq.com). Users typically access SPEKTRA Edge via a web browser or CLI and get access to the service via invitation.

A common problem with Edge devices is that, during the first startup, they typically don’t have credentials yet.

If you have an agent runtime running on the edge, and it needs to self-activate by connecting to the backend and requesting credentials, this part is for you.

Activation can be done with a token: the client establishes a gRPC connection without per-RPC credentials and then calls a special API method for activation. During activation, it sends a token for identification. In this exchange, credentials are created and returned by the server. There is a plan to have a generic Activation module in the SPEKTRA Edge framework, but it is not ready yet.

For the inventory manager, we have:

It is a fairly complex example though, which is why a generic Activation module is planned for the future.

The token for activation is created by DeviceOrderService when an order for edge devices is created. We store the token value using the secrets service, so that the value itself is not stored in any database, just in case. This token is then needed during the Activation stream.

The activation method is bidi-streaming, as seen in the api-skeleton. The client initializes activation with the first request, which contains the token value. The server responds with credentials, but to complete activation, the client needs to send an additional confirmation. Because both sides exchange multiple messages, this call had to be a streaming one.
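The message order of this handshake can be sketched with two goroutines and a pair of channels standing in for the bidirectional stream. This is only a model of the exchange described above: ClientMsg, ServerMsg, serve, and Activate are hypothetical names, the credential value is a placeholder, and no real gRPC or token validation is involved.

```go
package main

import (
	"errors"
	"fmt"
)

// ClientMsg and ServerMsg are hypothetical stream messages.
type ClientMsg struct {
	Token   string // set in the first message
	Confirm bool   // set in the second message
}

type ServerMsg struct {
	Credentials string
	Activated   bool
	Err         string
}

// serve plays the server side: validate the token, hand out credentials,
// then wait for the client's confirmation before finishing activation.
func serve(in <-chan ClientMsg, out chan<- ServerMsg, validToken string) {
	defer close(out)
	first, ok := <-in
	if !ok || first.Token != validToken {
		out <- ServerMsg{Err: "invalid token"}
		return
	}
	out <- ServerMsg{Credentials: "placeholder-credentials"}
	second, ok := <-in
	if !ok || !second.Confirm {
		out <- ServerMsg{Err: "activation not confirmed"}
		return
	}
	out <- ServerMsg{Activated: true}
}

// Activate plays the client side and returns the received credentials.
func Activate(token, validToken string) (string, error) {
	in := make(chan ClientMsg)
	out := make(chan ServerMsg)
	go serve(in, out, validToken)
	defer close(in)

	in <- ClientMsg{Token: token}
	resp := <-out
	if resp.Err != "" {
		return "", errors.New(resp.Err)
	}
	in <- ClientMsg{Confirm: true}
	final := <-out
	if final.Err != "" || !final.Activated {
		return "", errors.New("activation did not complete")
	}
	return resp.Credentials, nil
}

func main() {
	creds, err := Activate("token-1", "token-1")
	fmt.Println(creds, err)
	_, err = Activate("wrong", "token-1")
	fmt.Println(err)
}
```

The two-step shape is the reason a unary call would not be enough: the server must learn that the client actually received and stored the credentials before it commits the activation.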

Implementing activation raises another issue: the ActivationRequest sent by the client carries no region ID information. If there are multiple regions for a given service and the agent connects to the wrong region, the backend will have issues during execution. The region ID is, however, encoded in the token itself. As of now, code-generated multi-region routing does not support methods where the region ID is encoded in some field of the request. For now, it is necessary to disable multi-region routing here and implement a custom method, as shown in the example file.

In the proper implementation of Activation (examine the example file activation_service.go), we:

  • Use the secrets service to validate the token first.

  • Open a transaction to create an initial record for the agent object. This part may be more service-specific.

    In this case, we associate an agent with a device from a different project, which is not typical! More likely, you would associate the agent with a device from the same project.

  • Create several resources for our agent: a logging bucket, a metrics bucket, and finally a service account with a key and a role binding.

  • Ask the client to confirm activation; if all is fine, we save the agent in another transaction to associate it with the created objects (buckets and service account).

This activation example is, however, good at showing how to implement custom middleware, interact with other services, and create resources there.

Notable elements:

  • A newly created ServiceAccount is not usable at the beginning: you also need to create a ServiceAccountKey, along with a RoleBinding, so that this ServiceAccount can do anything useful. We will discuss this example more in the document about IAM integration.
  • Note that the ServiceAccount object has a meta owner reference set, pointing to the agent resource. It also gets the WithRequiresOwnerReference() attribute. It is highly advisable to create resources this way. The ServiceAccount is then bound to the agent resource: when the agent is deleted, the ServiceAccount is deleted too. Also, if Activation fails after the ServiceAccount was created, the ServiceAccount will be cleaned up, along with the ServiceAccountKey and RoleBinding. We talked about this when describing meta-owner references.
  • Logging and metrics buckets are also created with meta owner references; if an agent record is deleted, they are cleaned up automatically. Using per-agent buckets is required to ensure that agents cannot read data owned by others. This topic will be covered more in a document describing SPEKTRA Edge integration. If logging and/or metrics are not needed by the agent, they can be skipped.
  • All resources in SPEKTRA Edge created by Activation require the metadata.services field to be populated.

EnvRegistry usage and accessing other services/regions from the server backend

The EnvRegistry component is used for connecting the current runtime with other services/regions. It can also provide real-time updates on changes (like the dynamic deployment of a service in a new region). Although such events are rare, dynamic updates help when they happen: we should not need to redeploy clusters in existing regions when adding a new deployment in a new region.

EnvRegistry can be used to find regional deployments and services.

It is worth recalling the difference between Deployment and Service: while a Service represents the service as a whole, with a public domain, a Deployment is a regional instance of a Service (a specific cluster).

The interface of EnvRegistry can be found here: https://github.com/cloudwan/goten/blob/main/runtime/env_registry/env_registry.go

You will encounter EnvRegistry usage throughout the examples; it is always constructed in the main file.

A notable thing about EnvRegistry is that all dial functions also have “fctx” equivalents (like DialServiceInRegion and DialServiceInRegionFCtx). FCtx stands for Forward Context: various headers from the previous call, like authorization or call ID, are passed over to the next one. Usually, it is called from MultiRegion middleware, when headers need to be passed to the new call (especially Authorization). It has some restrictions though: since services do not necessarily trust each other, forwarded authorization may be rejected by another service. MultiRegion routing is a different case, because a request is routed between different regions of the same service, meaning that the service being called stays the same.
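The idea of a forward context can be illustrated with a tiny sketch that copies a fixed set of headers from an incoming call to an outgoing one. It does not show how the FCtx dial functions work internally: Headers, forwardedKeys, and ForwardHeaders are hypothetical, and the header names are assumptions chosen for illustration.

```go
package main

import "fmt"

// Headers is a stand-in for per-call metadata (for example, gRPC metadata).
type Headers map[string][]string

// forwardedKeys lists the headers copied to the next call. The exact names
// are assumptions made for this sketch.
var forwardedKeys = []string{"authorization", "x-call-id"}

// ForwardHeaders returns a new header set containing only the forwarded
// keys. Values are copied so that the outgoing call cannot mutate the
// incoming metadata.
func ForwardHeaders(incoming Headers) Headers {
	out := Headers{}
	for _, key := range forwardedKeys {
		if vals, ok := incoming[key]; ok {
			out[key] = append([]string(nil), vals...)
		}
	}
	return out
}

func main() {
	incoming := Headers{
		"authorization": {"Bearer abc"},
		"x-call-id":     {"call-42"},
		"user-agent":    {"some-client"},
	}
	fmt.Println(len(ForwardHeaders(incoming)))
}
```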

As of now, EnvRegistry is available only for backend services. It may be enhanced in the future, so that clients can pass just a bootstrap endpoint (the meta.goten.com service) and all other endpoints are discovered.

Store usage (database)

In the main.go files for servers you will see a call to NewStoreBuilder. We typically add a cache and a constraints layer. Then we must add plugins (this list is for server runtimes):

  • Mandatory: MetaStorePlugin, various sharding plugins (for every sharding in use)
  • Highly recommended: AuditStorePlugin and UsageStorePlugin.
  • Mandatory if multi-region features are used: SyncingDecoratorStorePlugin
  • Mandatory if you use Limits service integration: V1ResourceAllocatorStorePlugin for the v1 limits version.

Such a constructed store handle already has all the functionality: Get, Search, Query, Save, Delete, List, Watch… However, it does not have type-safe equivalents for individual resources, like SaveRoleBinding, DeleteRoleBinding, etc. To have a nice wrapper, we have a set of As<ServiceShortName>Store functions that decorate a given store handle. Note that all collections must exist within a specified namespace.

You need to call the WithStoreHandleOpts function on the Store interface before you can access the database. Typically, you should use one of the following: a snapshot transaction, or a cache-enabled session without a transaction:

import (
	"context"
	
	gotenstore "github.com/cloudwan/goten/runtime/store"
)

func withSnapshotTransaction(
  ctx context.Context,
  sh gotenstore.Store,
) error {
  return sh.WithStoreHandleOpts(ctx, func (ctx context.Context) error {
    var err error
    //
    // Here we use all Get, List, Save, Delete etc.
    //
    return err
  }, gotenstore.WithTransactionLevel(gotenstore.TransactionSnapshot))
}

func withNoTransaction(
  ctx context.Context,
  sh gotenstore.Store,
) error {
  return sh.WithStoreHandleOpts(ctx, func (ctx context.Context) error {
    var err error
    //
    // Here we use all Get, List etc.
    //
    return err
  }, gotenstore.WithReadOnly(), gotenstore.WithTransactionLevel(gotenstore.NoTransaction), gotenstore.WithCacheEnabled(true))
}

If you look at any transaction middleware, like here: https://github.com/cloudwan/inventory-manager-example/blob/master/server/v1/site/site_service.pb.middleware.tx.go, you should note that typically a transaction is already set for each call. It may be different if, in the api-skeleton file, you set the MANUAL type:

actions:
- name: SomeActionName
  withStoreHandle:
    transaction: MANUAL

In this case, the transaction middleware does not set anything, and you need to call WithStoreHandleOpts yourself. The MANUAL type is useful if you plan to have multiple micro transactions.

Notes:

  • All Watch calls (singular and for collection) do NOT require WithStoreHandleOpts calls. They do not provide any transaction properties at all.
  • All read calls (Get, List, BatchGet, Search) must NOT be executed after ANY write (Save or Delete). You need to always collect all reads before making any writes.
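The reads-before-writes rule can be made concrete with a toy transaction wrapper. This is not the Goten store: Tx, WithTx, and ErrReadAfterWrite are hypothetical, and returning an error on a late read is an assumption of this sketch, used only to make the ordering rule visible.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrReadAfterWrite is returned by this sketch when a read follows a write.
var ErrReadAfterWrite = errors.New("read after write in the same transaction")

// Tx is a hypothetical in-memory transaction, not the Goten store handle.
type Tx struct {
	committed map[string]string
	pending   map[string]string
	wrote     bool
}

// Get reads committed data; it is refused once any write was made.
func (t *Tx) Get(key string) (string, error) {
	if t.wrote {
		return "", ErrReadAfterWrite
	}
	return t.committed[key], nil
}

// Save buffers a write until the transaction function returns.
func (t *Tx) Save(key, value string) {
	t.wrote = true
	t.pending[key] = value
}

// WithTx runs fn and applies the buffered writes only when fn returns nil.
func WithTx(store map[string]string, fn func(tx *Tx) error) error {
	tx := &Tx{committed: store, pending: map[string]string{}}
	if err := fn(tx); err != nil {
		return err
	}
	for k, v := range tx.pending {
		store[k] = v
	}
	return nil
}

func main() {
	store := map[string]string{"a": "1", "b": "2"}

	// Correct: collect all reads first, then write.
	err := WithTx(store, func(tx *Tx) error {
		a, err := tx.Get("a")
		if err != nil {
			return err
		}
		b, err := tx.Get("b")
		if err != nil {
			return err
		}
		tx.Save("sum", a+"+"+b)
		return nil
	})
	fmt.Println(err, store["sum"])

	// Incorrect: a read after a write; nothing is committed.
	err = WithTx(store, func(tx *Tx) error {
		tx.Save("a", "changed")
		_, err := tx.Get("b")
		return err
	})
	fmt.Println(err, store["a"])
}
```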

Example usages can be found in https://github.com/cloudwan/inventory-manager-example/blob/master/server/v1/activation/activation_service.go

Note that the Activation service uses the MANUAL type, so the middleware does not set the transaction.

Watching real-time updates

SPEKTRA Edge-based services make heavy use of the real-time watch functionality offered by Goten. There are 3 types of watches:

  • Single resource watch

    The client picks a specific resource by name and subscribes for real-time updates of it. Initially, it gets the current data object, then it gets an update whenever there is a change to it.

  • Stateful watch

    A stateful watch is used to watch a specific PAGE of resources in a given collection (ORDER BY + PAGE SIZE + CURSOR), where CURSOR typically plays the role of an offset from the beginning (but is more performant). It is most useful for web applications that need to show real-time updates of the page the user is currently on. It is possible to specify filter objects.

  • Stateless watch

    It is used to watch ALL resources matching an optional filter object. It is not possible to specify order or paging. Note that this may overload the client with a large changeset if the filter is not carefully set.

For each resource, if you look at the <resource_name>_service.proto files, the API offers Watch<Single> and Watch<Collection>. The first one is for a single resource watch and is relatively simple to use. The collection watch requires you to specify a type param: STATELESS or STATEFUL. We recommend STATEFUL for web-type applications because of its paging features. STATELESS is recommended for edge applications that need to watch some sub-collection of resources. However, we do not recommend using the direct API in this particular case. A STATELESS watch, while powerful, may require clients to handle cases like resets or snapshot size checks. To hide this level of complexity, it is recommended to use the Watcher modules in access packages; each resource has a type-safe generated class.
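To give a feel for what a stateless watch client has to do, here is a minimal sketch of a local cache that applies incoming change batches and handles a reset. The message shapes are hypothetical (Change, Batch, and Cache are not the generated Goten types), and snapshot size checks are left out.

```go
package main

import "fmt"

// ChangeKind describes what happened to a single resource.
type ChangeKind int

const (
	Added ChangeKind = iota
	Modified
	Removed
)

// Change is a hypothetical single-resource change.
type Change struct {
	Kind  ChangeKind
	Name  string
	Value string
}

// Batch is a hypothetical watch response. IsReset tells the client to
// discard its local state before applying the changes.
type Batch struct {
	IsReset bool
	Changes []Change
}

// Cache keeps the client's current view of the watched collection.
type Cache struct{ items map[string]string }

// NewCache returns an empty cache.
func NewCache() *Cache { return &Cache{items: map[string]string{}} }

// Apply updates the local view with one batch from the stream.
func (c *Cache) Apply(b Batch) {
	if b.IsReset {
		c.items = map[string]string{}
	}
	for _, ch := range b.Changes {
		switch ch.Kind {
		case Added, Modified:
			c.items[ch.Name] = ch.Value
		case Removed:
			delete(c.items, ch.Name)
		}
	}
}

// Len returns the number of resources currently in the view.
func (c *Cache) Len() int { return len(c.items) }

func main() {
	cache := NewCache()
	cache.Apply(Batch{Changes: []Change{{Added, "authors/a", "v1"}, {Added, "authors/b", "v1"}}})
	cache.Apply(Batch{Changes: []Change{{Modified, "authors/a", "v2"}, {Removed, "authors/b", ""}}})
	fmt.Println(cache.Len())
	// A reset replaces the whole view with the new snapshot.
	cache.Apply(Batch{IsReset: true, Changes: []Change{{Added, "authors/c", "v1"}}})
	fmt.Println(cache.Len())
}
```

The generated Watcher modules take care of this bookkeeping, which is why they are the recommended entry point.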

This is reflected in tests from https://github.com/cloudwan/goten/blob/main/example/library/integration_tests/crud_test.go

There are 3 unit tests for various watches, and TestStatelessWatchAuthorsWithWatcher shows usage with the watcher.

Multi-Region development advice (for server AND clients)

Most of the multi-region features and instructions were discussed with api-skeleton functionality. If you stick to cases mentioned in the api-skeleton, then typically code-generated multi-region routing will handle all the quirks. Similarly, db-controller and MultiRegionPolicy objects will handle all cross-region synchronization.

Common advice for servers:

  • The easiest actions for multi-region routing are those where isCollection and isPlural are both false.
  • Cases where isPlural is true and isCollection is false are not supported. We have built-in support for BatchGet, but custom methods will not fit. It is advised to avoid them, if possible.
  • Plural and collection requests are somewhat supported: we do support Watch, List, and Search requests, and customizations based on them are the easiest to support. You can look at an example like the ListPublicDevices method in the devices.edgelq.com service. However, there are certain conditions. The request object needs standard fields like parent and filter; the code-generation tool looks for these to implement multi-region routing. Pagination fields are optional. The response must include an array of returned resources, and in the api-skeleton it is necessary to provide responsePaths pointing to the path where this list of resources is. If those conditions are met, you can implement various List variations yourself.
  • For streaming calls, you must allow multi-region routing using the first request from the client.

Links for ListPublicDevices:

Common advice for clients:

It is also advisable to avoid queries that will be routed or, worse, split & merged across multiple regions. Such queries should be the exception, not the rule. One easy way to avoid split & merge is to query for resources within a single policy-holder resource (Service, Organization, or Project). For example, if you query for Distributions in a specific project, they will likely be synced across all project regions; if not, they will at least reside in the primary region of the project. This way, one or more regions will be able to execute the request fully.

If you query (with filter) across projects/organizations/services, you can:

  • For resources attached to regions (like Device resource in devices.edgelq.com service), you can query just specific region across projects: ListDevices WHERE parent = "projects/-/regions/us-west2/devices/-". Note that the project is a wildcard, but the region is specific.
  • Each resource has a syncing object within its metadata. You can find it here: https://github.com/cloudwan/goten/blob/main/types/meta.proto. See the SyncingMeta object and its description. If you filter by owningRegion, then regardless of the resource type, and regardless of whether it is regional or not, a request with a metadata.syncing.owningRegion condition will be routed to that specific region. Similarly, with a metadata.syncing.regions CONTAINS condition, you can also ensure requests will be routed to a specific region. A query with the CONTAINS condition ensures that the client will see resources that the region can see anyway. A filter on owningRegion takes precedence over regions and CONTAINS.
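The precedence rule from the last point can be written down as a tiny decision function. This is a sketch of the described behavior only: Filter and TargetRegion are hypothetical, and real filters are parsed expressions rather than two string fields.

```go
package main

import "fmt"

// Filter holds only the two conditions relevant for routing in this sketch.
type Filter struct {
	OwningRegion    string // metadata.syncing.owningRegion = "<region>"
	RegionsContains string // metadata.syncing.regions CONTAINS "<region>"
}

// TargetRegion returns the single region a request would be routed to, or
// false when the filter does not pin the request to one region.
func TargetRegion(f Filter) (string, bool) {
	if f.OwningRegion != "" {
		return f.OwningRegion, true // owningRegion takes precedence
	}
	if f.RegionsContains != "" {
		return f.RegionsContains, true
	}
	return "", false
}

func main() {
	fmt.Println(TargetRegion(Filter{OwningRegion: "us-west2", RegionsContains: "eu-west1"}))
	fmt.Println(TargetRegion(Filter{RegionsContains: "eu-west1"}))
	fmt.Println(TargetRegion(Filter{}))
}
```

When neither condition is present, the request is a candidate for the split & merge path that the advice above tries to avoid.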