
SPEKTRA Edge IAM Service Design

Understanding the SPEKTRA Edge IAM service design.

The iam.edgelq.com service is one of the central parts of the SPEKTRA Edge platform. It is in charge of authentication and authorization, and it enables multi-tenant and multi-service environments. Goten does not come with authentication and authorization features by default; IAM fills this gap and allows services to work with each other without mutual trust.

You should already know IAM concepts, actors, role management, and tenant management from the user guide, as well as customizations and basic usage of the Authenticator, Authorizer, and Authorization middleware from the developer guide. This document goes into more detail about what happens in the unique components provided by the IAM service. It assumes that the reader understands the general, standard structure of the IAM codebase, located in the iam directory of the SPEKTRA Edge repository.

The focus here is less on the IAM service itself and more on what IAM provides. Authenticator and Authorizer are modules provided by IAM, but they are linked in during each server compilation. Therefore, every API server of any backend service has them in its runtime.

1 - SPEKTRA Edge IAM Principals

Understanding the SPEKTRA Edge IAM principals.

In IAM, we identify two types of principals that can be uniquely identified and authenticated:

  • Users
  • ServiceAccounts

They have very different authentication methods. First, IAM does not manage users much: a third party is responsible for the actual user management. When a user sends a request to a service, the request carries an authorization token with some claims. API servers must use the jwks endpoint, which provides a JSON Web Key Set, to verify the signature of the access token. Verification ensures that we can trust the claims stored in the token. The claims contain further details, such as the unique User identifier, which we can use to look up the User resource in IAM.

As of now, we use Auth0, a third-party service, for users. It creates and signs the access tokens that SPEKTRA Edge receives, and it gives us a jwks endpoint from which we get the public keys for verification. Token signing, rotation, and user list management are all handled by Auth0, although IAM has several methods that connect to Auth0 for management purposes.
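The signature check described above can be sketched with the Go standard library alone. This is a minimal illustration, not the code used in iam/auth: it assumes RS256 tokens and RSA keys, the names jwk, verifyRS256, and demoToken are hypothetical, and the demo signs its own token with a throwaway key instead of calling a real jwks endpoint. A production verifier would also check claims such as expiry and would normally rely on a maintained JWT library.

```go
package main

import (
	"crypto"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"math/big"
	"strings"
)

// jwk holds the subset of JSON Web Key fields needed for an RSA public key.
type jwk struct {
	Kid string `json:"kid"`
	N   string `json:"n"` // modulus, base64url without padding
	E   string `json:"e"` // public exponent, base64url without padding
}

var b64 = base64.RawURLEncoding

// verifyRS256 finds the key named in the token header, checks the RS256
// signature, and returns the claims only if the signature is valid.
func verifyRS256(token string, keys []jwk) (map[string]interface{}, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, errors.New("malformed token")
	}
	var header struct {
		Alg string `json:"alg"`
		Kid string `json:"kid"`
	}
	raw, err := b64.DecodeString(parts[0])
	if err != nil {
		return nil, err
	}
	if err := json.Unmarshal(raw, &header); err != nil || header.Alg != "RS256" {
		return nil, errors.New("unsupported token header")
	}
	sig, err := b64.DecodeString(parts[2])
	if err != nil {
		return nil, err
	}
	digest := sha256.Sum256([]byte(parts[0] + "." + parts[1]))
	for _, k := range keys {
		if k.Kid != header.Kid {
			continue
		}
		n, errN := b64.DecodeString(k.N)
		e, errE := b64.DecodeString(k.E)
		if errN != nil || errE != nil {
			return nil, errors.New("malformed key")
		}
		pub := &rsa.PublicKey{N: new(big.Int).SetBytes(n), E: int(new(big.Int).SetBytes(e).Int64())}
		if err := rsa.VerifyPKCS1v15(pub, crypto.SHA256, digest[:], sig); err != nil {
			return nil, err
		}
		payload, err := b64.DecodeString(parts[1])
		if err != nil {
			return nil, err
		}
		claims := map[string]interface{}{}
		if err := json.Unmarshal(payload, &claims); err != nil {
			return nil, err
		}
		return claims, nil
	}
	return nil, errors.New("unknown key id")
}

// demoToken signs the given claims with a throwaway key and returns the token
// with the matching key set. It stands in for the token issuer in this demo.
func demoToken(claimsJSON string) (string, []jwk, error) {
	priv, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return "", nil, err
	}
	signed := b64.EncodeToString([]byte(`{"alg":"RS256","kid":"demo"}`)) + "." + b64.EncodeToString([]byte(claimsJSON))
	digest := sha256.Sum256([]byte(signed))
	sig, err := rsa.SignPKCS1v15(rand.Reader, priv, crypto.SHA256, digest[:])
	if err != nil {
		return "", nil, err
	}
	key := jwk{Kid: "demo", N: b64.EncodeToString(priv.N.Bytes()), E: b64.EncodeToString(big.NewInt(int64(priv.E)).Bytes())}
	return signed + "." + b64.EncodeToString(sig), []jwk{key}, nil
}

func main() {
	token, keys, err := demoToken(`{"sub":"user-1"}`)
	if err != nil {
		panic(err)
	}
	claims, err := verifyRS256(token, keys)
	fmt.Println(claims["sub"], err)
}
```

The claims are returned only after the signature check succeeds, which mirrors the point made above: claims are trustworthy only once the token is verified against the published key set.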

When a user joins the system, IAM is not the first to be notified. The request goes to Auth0, where the data is created. The record in IAM is created later, when the user starts interacting with SPEKTRA Edge. User resources may get created in IAM during the first authentication. They may also be saved or updated when IAM receives RefreshUserFromIdToken.

ServiceAccounts, on the other hand, are typically managed by SPEKTRA Edge, or by any other entity that creates ServiceAccounts in the IAM service. The process is as follows:

  • A ServiceAccount is created by a client. Without a ServiceAccountKey, though, it is not usable.

  • A ServiceAccountKey is created by a client. For the public-private key pair, there are two options:

    • The client generates both the private and the public key. It sends CreateServiceAccountKey with the public key only, so IAM never sees the private key. The client is fully responsible for securing it. This is the recommended procedure.

    • It is possible to ask IAM to create the public-private key pair during ServiceAccountKey creation. In this case, IAM saves only the public key in the database and returns the private key in the response. The client is still fully responsible for securing it. This method only saves the client the key generation step.

In summary, ServiceAccounts are the responsibility of the clients, who need to secure private keys. Those private keys are then later used to create access tokens. During authentication, the backend service will grab the public key from IAM.
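As an illustration of the recommended option, a client could generate the key pair locally and send only the public half. The sketch below assumes an RSA key and PEM encoding, which is an assumption and not something this document specifies; generateKeyPair is a hypothetical helper, and the actual CreateServiceAccountKey call is left out.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"encoding/pem"
	"fmt"
)

// generateKeyPair creates an RSA key pair and returns both halves PEM-encoded.
// Only the public part is meant to leave the client.
func generateKeyPair(bits int) (privatePEM, publicPEM []byte, err error) {
	key, err := rsa.GenerateKey(rand.Reader, bits)
	if err != nil {
		return nil, nil, err
	}
	privatePEM = pem.EncodeToMemory(&pem.Block{
		Type:  "RSA PRIVATE KEY",
		Bytes: x509.MarshalPKCS1PrivateKey(key),
	})
	pubDER, err := x509.MarshalPKIXPublicKey(&key.PublicKey)
	if err != nil {
		return nil, nil, err
	}
	publicPEM = pem.EncodeToMemory(&pem.Block{Type: "PUBLIC KEY", Bytes: pubDER})
	return privatePEM, publicPEM, nil
}

func main() {
	privatePEM, publicPEM, err := generateKeyPair(2048)
	if err != nil {
		panic(err)
	}
	fmt.Printf("private key stays with the client (%d bytes)\n", len(privatePEM))
	fmt.Printf("public key to send with the key creation request:\n%s", publicPEM)
}
```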

2 - SPEKTRA Edge IAM Authentication

Understanding the SPEKTRA Edge IAM authentication.

The module for authentication is in the SPEKTRA Edge repository, in the file iam/auth/authenticator.go. It also provides an authentication function (grpc_auth.AuthFunc) that is passed to the grpc server. If you look at the main.go of any API Server runtime, you should find code like:

grpcserver.NewGrpcServer(
  authenticator.AuthFunc(),
  commonCfg.GetGrpcServer(),
  log,
)

This function is used by grpc interceptors, so authentication is done outside any server middleware. As a result, the context object associated with the request will contain the AuthToken object, as defined in iam/auth/types/auth_token.go, before any server processes the call.

Authenticator Tasks

Authenticator uses HTTP headers to find out who is making a call. The primary header in use is authorization. It should contain a value of the form Bearer <AccessToken>, and we want the token part.

For now, ignore x-goten-original-auth and additional access tokens; they are described in the Distributed Authorization Process section.

Usually, we expect just a single token in authorization, based on which authentication happens.
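Extracting the token part from the header value can be sketched as follows. The helper name bearerToken is hypothetical and is not taken from iam/auth/authenticator.go; it only illustrates the expected header shape.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// bearerToken returns the token part of a "Bearer <AccessToken>" header value.
// The scheme name is matched case-insensitively.
func bearerToken(headerValue string) (string, error) {
	const prefix = "bearer "
	if len(headerValue) < len(prefix) || !strings.EqualFold(headerValue[:len(prefix)], prefix) {
		return "", errors.New("authorization header is not a Bearer token")
	}
	token := strings.TrimSpace(headerValue[len(prefix):])
	if token == "" {
		return "", errors.New("empty bearer token")
	}
	return token, nil
}

func main() {
	token, err := bearerToken("Bearer abc.def.ghi")
	fmt.Println(token, err)
}
```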

Authenticator delegates access token identification to a module called AuthInfoProvider. It returns the Principal object, as defined in iam/auth/types/principal.go. Under the hood, it can be ServiceAccount, User, or Anonymous.

We will dive into it in the AuthInfoProvider for authentication section below, but for now let’s touch another important topic regarding authentication.

When AuthInfoProvider returns principal data, Authenticator also needs to validate that all claims are as expected; see addKeySpecificClaims. Of these claims, the most important one is the audience. We need to protect against cases where someone gets an access token and tries to pose as the given user in some other service. For this reason, look at the expected audience for User:

claims.Audience = []string{a.cfg.AccessTokenAudience}

This AccessTokenAudience is equal to the audience specific to SPEKTRA Edge, for example https://apis.edgelq.com, as configured for some of our services. If the expected audience does not match what is actually in the claims, it may be because someone obtained this token and is using it in a different service. Consider our own position: we receive access tokens from users, so if we knew where else a user has an account, we could try to log in there with the same token. To prevent this, we check the audience. However, note that the audience is one global value for the whole SPEKTRA Edge platform. One service on SPEKTRA Edge can therefore connect to another service on SPEKTRA Edge and use this access token, successfully posing as the user. As long as services on SPEKTRA Edge can trust each other, this is not an issue. For untrusted third-party services, it may be a problem: if a user sends a request to them, they can potentially take the token and use it in other SPEKTRA Edge services. In the future, we may provide additional claims, but that means the user will need to ask the jwks provider for an access token for a specific SPEKTRA Edge service, perhaps for each of them, or for groups of them.

In API Server config, see Authenticator settings, field accessTokenAudience.

The situation is a bit easier with ServiceAccounts: when they send a request to a service, the audience contains the endpoint of the specific service they call. Therefore, if they send requests to devices, devices won’t be able to reuse those tokens to send requests to, say, applications. API keys may be a problem, because they are global for the whole SPEKTRA Edge, but it is the user’s choice to use this method, which was requested as the less “complicated” option. It should be fine as long as the API key is kept under strict control.

In API Server config, see Authenticator settings, field serviceAccountIdTokenAudiencePrefixes. This is a list of prefixes from which the audience can start.
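The two audience rules can be contrasted in a short sketch: an exact match against one platform-wide value for users, and a prefix match for ServiceAccounts. The function names and the example.test URL are hypothetical, and the real validation in the Authenticator may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
)

// userAudienceOK reports whether any audience in the token equals the single
// platform-wide audience expected for users (accessTokenAudience).
func userAudienceOK(tokenAud []string, expected string) bool {
	for _, aud := range tokenAud {
		if aud == expected {
			return true
		}
	}
	return false
}

// serviceAccountAudienceOK reports whether any audience in the token starts
// with one of the configured prefixes (serviceAccountIdTokenAudiencePrefixes).
func serviceAccountAudienceOK(tokenAud, prefixes []string) bool {
	for _, aud := range tokenAud {
		for _, prefix := range prefixes {
			if strings.HasPrefix(aud, prefix) {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(userAudienceOK([]string{"https://apis.edgelq.com"}, "https://apis.edgelq.com"))
	fmt.Println(serviceAccountAudienceOK(
		[]string{"https://devices.example.test/some/path"},
		[]string{"https://devices.example.test"},
	))
}
```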

AuthInfoProvider for Authentication

AuthInfoProvider is a common module for both the authenticator and the authorizer. You can see it in the SPEKTRA Edge repo, in the file iam/auth/auth_info_provider.go. For the authenticator, only one method matters: GetPrincipal.

Inside GetPrincipal of AuthInfoProvider we still don’t get the full principal. The reason is that getting the principal is tricky: if AuthInfoProvider is running on the IAM Server, then it may use the local database. If it is part of a different server, then it needs to ask the IAM Server for the principal data. Since it can’t fully get the principal on its own, it does what it can:

  • First, we check the ID of the key from the authorization token.
  • If the ID is equal to the ServiceAccountKey ID that the Server instance itself uses, the service is calling itself. Perhaps it is a controller trying to connect to the Server instance. In this case, we just return “us”. This is a helpful trick when a service is bootstrapping for the first time: the API Server may not be listening on the port yet, or the database may have missing records.
  • Mostly, however, AuthInfoProvider is one giant cache object and this includes storage of principals. Caching principals locally, with some long-term cache, significantly lowers pressure on IAM and reduces latencies.

AuthInfoProvider uses the PrincipalProvider interface to get actual instances. There are two providers:

  • LocalPrincipalProvider in iam/auth/internal/local_principal_provider.go
  • RemotePrincipalProvider in iam/auth/internal/remote_principal_provider.go

The local provider must be used only by IAM Servers; all others must use the remote option. Let’s start with the remote one. If you check GetPrincipal of RemotePrincipalProvider, you can see that it just connects to the IAM service and uses the GetPrincipal method, which is defined in the API skeleton file. For the ServiceAccount type, however, it first needs to fetch the project resource to figure out in which regions the ServiceAccountKey is available.

It is worth mentioning that services are not supposed to trust each other. This also means that IAM does not necessarily trust services requesting access to user information, even if they have a token. Perhaps access to the authorization token should be enough for IAM to return user information, but in GetPrincipalRequest we also require information about which service is asking. IAM will validate whether this service is allowed to see the given principal.

Now jump to a different part of the code and see the GetPrincipal implementation in the file iam/server/v1/authorization/authorization_service.go. The file name may be a bit misleading, but this service has actions used for both authentication and authorization. It may be worth moving the action to a different API group and deprecating the current action declaration, but that is a side note for a future correction.

Implementation of GetPrincipal will stay, so you should see what happens under the hood.

In the IAM Server, GetPrincipal uses the PrincipalProvider that it gets from AuthInfoProvider! Therefore, AuthInfoProvider on a server other than IAM will first try to use its cache; in case of a miss, it will ask the remote PrincipalProvider. RemotePrincipalProvider will send GetPrincipalRequest to IAM, which then checks LocalPrincipalProvider, so we land in LocalPrincipalProvider anyway.
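The lookup order described so far (own key first, then cache, then the underlying provider) can be sketched as below. All types here are hypothetical simplifications: the real PrincipalProvider interface and Principal type live in iam/auth and have different signatures, and the real cache has expiry and invalidation that this sketch omits.

```go
package main

import (
	"fmt"
	"sync"
)

// Principal is a simplified stand-in for the principal type.
type Principal struct{ Name string }

// PrincipalProvider is a simplified stand-in for the local/remote providers.
type PrincipalProvider interface {
	GetPrincipal(keyID string) (*Principal, error)
}

// cachingProvider answers from memory when it can and asks next otherwise.
type cachingProvider struct {
	mu        sync.Mutex
	cache     map[string]*Principal
	selfKeyID string     // key ID used by this server instance itself
	self      *Principal // returned when the server is calling itself
	next      PrincipalProvider
}

func newCachingProvider(selfKeyID string, next PrincipalProvider) *cachingProvider {
	return &cachingProvider{
		cache:     map[string]*Principal{},
		selfKeyID: selfKeyID,
		self:      &Principal{Name: "self"},
		next:      next,
	}
}

func (c *cachingProvider) GetPrincipal(keyID string) (*Principal, error) {
	if keyID == c.selfKeyID {
		return c.self, nil // bootstrapping shortcut: no I/O at all
	}
	// For brevity the lock is held during the provider call; a real cache
	// would avoid blocking other lookups while waiting on the network.
	c.mu.Lock()
	defer c.mu.Unlock()
	if p, ok := c.cache[keyID]; ok {
		return p, nil
	}
	p, err := c.next.GetPrincipal(keyID)
	if err != nil {
		return nil, err
	}
	c.cache[keyID] = p
	return p, nil
}

// countingProvider stands in for the remote provider and counts its calls.
type countingProvider struct{ calls int }

func (p *countingProvider) GetPrincipal(keyID string) (*Principal, error) {
	p.calls++
	return &Principal{Name: "principal-for-" + keyID}, nil
}

func main() {
	backend := &countingProvider{}
	provider := newCachingProvider("own-key", backend)
	provider.GetPrincipal("key-1")   // miss: asks the backend
	provider.GetPrincipal("key-1")   // hit: served from the cache
	provider.GetPrincipal("own-key") // self: never reaches the backend
	fmt.Println(backend.calls)       // prints 1
}
```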

Before jumping into LocalPrincipalProvider, see the rest of the GetPrincipal server implementation. Inside, we check the User or ServiceAccount data and iterate over the metadata.services.allowed_services slice. If it contains the service provided in GetPrincipalRequest, this service is allowed to see the given principal, so we can return it safely. This field is automatically updated when a User/ServiceAccount gets access to a new service (or has access revoked). We work on this principle: if a User/ServiceAccount is a participant in a service, the two must be able to see each other.

Now, you can jump into the GetPrincipal code for LocalPrincipalProvider. It has separate paths for users and service accounts, but generally, it is similar. We are getting User or ServiceAccount from the local database if possible (not all regions may have ServiceAccount). If we need to make any “Save” (users!), it has to be on the primary region, because this is where users are supposed to be saved.

Distributed Authorization Process

Imagine User X asks devices.edgelq.com to assign a ServiceAccount to some Device. The user sends a request to the Devices service, with the authorization header containing the User X access token. Devices will successfully authenticate and authorize the user. However, ServiceAccount belongs to IAM, so devices.edgelq.com will need to ask iam.edgelq.com to provide the ServiceAccount. When devices.edgelq.com sends a request to iam.edgelq.com, the authorization header will not hold the access token of the user. It will hold an access token of the ServiceAccount that devices.edgelq.com uses. This is always true: the authorization header must contain the token of the entity sending the current request. However, devices.edgelq.com may store the original access token in the x-goten-original-auth header, which is an array of tokens. In theory, the authorization header could also carry many tokens, but that does not work on all Ingresses.

In EnvRegistry we have Dial* methods with and without the FCtx suffix. Those with the suffix copy all HTTP headers with the x- prefix. They also copy authorization into x-goten-original-auth. If the latter is already present, the value is appended. The current authorization is cleared to make room for the new one.
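The header handling of the FCtx variants can be sketched as a pure function over a header map. This is an illustration only: forwardHeaders is a hypothetical name, the map type is a simplification of gRPC metadata, and header names are assumed to be lowercase already.

```go
package main

import (
	"fmt"
	"strings"
)

const originalAuthHeader = "x-goten-original-auth"

// forwardHeaders builds the headers for an outgoing call from the incoming
// ones: x- headers are copied, the incoming authorization value is appended to
// x-goten-original-auth, and authorization itself is left unset so that the
// caller can attach its own token.
func forwardHeaders(incoming map[string][]string) map[string][]string {
	outgoing := map[string][]string{}
	for name, values := range incoming {
		if strings.HasPrefix(name, "x-") {
			outgoing[name] = append([]string(nil), values...)
		}
	}
	if auth := incoming["authorization"]; len(auth) > 0 {
		outgoing[originalAuthHeader] = append(outgoing[originalAuthHeader], auth...)
	}
	return outgoing
}

func main() {
	out := forwardHeaders(map[string][]string{
		"authorization":    {"Bearer user-token"},
		originalAuthHeader: {"Bearer earlier-token"},
		"content-type":     {"application/grpc"},
	})
	fmt.Println(out[originalAuthHeader])
}
```

Tokens accumulate in x-goten-original-auth in the order the request travelled, so the receiving service can still see the original caller.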

It is up to the service to decide whether it wants to forward HTTP headers or not. EnvRegistry still needs some work, though: the caller should be able to customize which headers are passed from the current context to the next, and how. For current needs, it is sufficient.

Authorization in this context has an issue with the audience claim, though: when we forward authorization tokens to an entirely different service, the audience may not be the one that service expects.

By default, we use plain Dial without FCtx. There are two known cases where the FCtx variant is used:

  • MultiRegion routing

    It is when requests need to be provided to other regions or split across many regions.

  • When Constraint Store sends EstablishReferences to another service

    This is because saved resources hold references to resources in other services. The problem here is that we assume the Service may not be allowed to establish references (it may lack attach permissions). The user may have attach permissions, though, so we send two authorization tokens.

Of the two cases above, authorization and audience validation work well for the first one, because we forward within the same service. EstablishReferences is a more difficult topic: we will probably need to ensure that the Service always has attach permissions, without relying on the user. However, we will need to refactor attach permissions so that there is just one per resource type. With this, we also need to fix conditions so that they can apply to attach checks. Right now, they simply don’t work.

3 - SPEKTRA Edge IAM Authorization

Understanding the SPEKTRA Edge IAM authorization.

Authorization happens in its dedicated server middleware, see any generated one, like https://github.com/cloudwan/edgelq/blob/main/devices/server/v1/device/device_service.pb.middleware.authorization.go.

As Authorization middleware is assumed to be after multi-region routing, we can assume that IAM Service from local region holds all resources required to execute authorization locally, specifically: RoleBindings, Roles, Conditions.

Note that IAM itself does not execute any authorization; the middleware is generated for each service. We have an Authorizer module that is compiled into all API servers of all services. What IAM provides is a list of Roles, RoleBindings, and Conditions. Other services are allowed to get them, but evaluation happens on the side of the respective server.

The authorizer rarely needs to ask IAM for any data; whenever possible, it works without I/O. It relies on a RAM cache to store IAM resources internally, so checks typically evaluate fast. Resource field conditions are more problematic: if we have them, we need to get the current resources from the database. For attach permissions, we may need to fetch them from other services.

Authorization middleware is generated for each method, but the pattern is always the same:

  • We create a BulkPermissionCheck object, where we collect all permissions we want to check for this action. It is defined in the iam/auth/types/bulk_permission_check.go file.
  • The Authorizer module, defined in the iam/auth/authorizer.go file, checks whether the passed BulkPermissionCheck is satisfied, that is, whether the authenticated user is authorized for the requested permissions. Some checks may be optional, like read checks for specific fields.

When we collect permissions for BulkPermissionCheck, we add:

  • Main permission for a method. Resource name (or parent) is taken from the request object, as indicated by requestPaths in the specification, or customized via proto file.
  • If we have a writing request (like Create or Update) and we are setting references to other resources, we need to add attach permission checks. Resource names are taken from the referenced objects, not from the referencing resource that the user tries to write to.
  • Optional read/set permissions if some resource fields are restricted. For authorization object strings we pass either collection names or specific resources.

Every permission must be accompanied by some resource or collection name (parent). Refer to the IAM user specification. In this document, we map specifications to code and explain details.

Within the Authorizer module, defined in iam/auth/authorizer.go, we split all checks by the main IAM scopes it recognizes: Service, Organization, Project, or System. Next, we delegate permission checks to AuthInfoProvider. It generates a list of PermissionGrantResult instances relevant for all PermissionCheck instances. The relationship between these two types is many-to-many. A single Grant (assigned via RoleBinding) can hold multiple permissions, and a user may have many RoleBindings, each with different Grants, so more than one Grant may give access to the same permission.

If AuthInfoProvider notices that some PermissionCheck has an unconditional PermissionGrantResult, it skips the rest. However, if there are conditions attached, some may fail while others succeed. This is why we need multiple PermissionGrantResult instances per single PermissionCheck: if at least one is successful, the PermissionCheck passes. It works like an OR operator. Within a single PermissionGrantResult, however, all conditions must evaluate positively.
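The OR-across-grants, AND-within-a-grant rule can be written down in a few lines. The types below are hypothetical simplifications of PermissionGrantResult and its conditions, reduced to what the rule needs.

```go
package main

import "fmt"

// Condition is a simplified stand-in for an evaluated grant condition.
type Condition func() bool

// GrantResult is a simplified stand-in for PermissionGrantResult.
type GrantResult struct{ Conditions []Condition }

// grantApplies is true when every condition of the grant holds (AND).
// A grant without conditions applies unconditionally.
func grantApplies(g GrantResult) bool {
	for _, cond := range g.Conditions {
		if !cond() {
			return false
		}
	}
	return true
}

// checkPasses is true when at least one grant applies (OR).
func checkPasses(results []GrantResult) bool {
	for _, g := range results {
		if grantApplies(g) {
			return true
		}
	}
	return false
}

func main() {
	yes := func() bool { return true }
	no := func() bool { return false }
	fmt.Println(checkPasses([]GrantResult{
		{Conditions: []Condition{yes, no}}, // fails: one condition is false
		{Conditions: []Condition{yes}},     // succeeds, so the check passes
	}))
}
```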

Therefore, once AuthInfoProvider has matched PermissionGrantResult instances with PermissionCheck instances, we must evaluate the conditions (if any). One popular condition type is ResourceFieldCondition. To evaluate this kind, we fetch resources from the local database, other services, and other regions. To make this check as efficient as possible, the authorizer iterates through all possible conditions and collects all the resources it needs to fetch. It fetches them in bulk, connecting to other services if necessary (the attach permission cases). For this reason, the PermissionCheck object has a reference field that holds the resolved resources, so all conditions have easy access to them when needed. If the service receives a PermissionDenied error when querying other services, PermissionDenied is forwarded to the user with the information that the service itself cannot see the resources. This may indicate a problem with the metadata.services.allowed_services field, such as a missing entry.

On their own, conditions are simple, they execute fast, without any I/O work. We just check requests/resolved resources and verify whether specified conditions apply, according to IAM Role Grant conditions.

AuthInfoProvider for Authorizer

AuthInfoProvider receives only a set of checks grouped by IAM scope (a project, an organization, a service, or the system if none of the former apply). As per the IAM specification, the service scope inherits all RoleBindings from the project that owns the service. If we need to validate permissions in the project scope, we must also accept RoleBindings from the parent organization (if set) and from the full ancestry path. RoleBindings in the system scope are valid in all scopes. Moreover, even a single principal may have multiple member IDs (the native one with email, then domain, then allAuthenticatedUsers and allUsers). This creates lots of potential RoleBindings to check. Furthermore, we should be aware that the Authorizer is part of all API servers. As SPEKTRA Edge provides a framework for building third-party services, these services can’t trust each other. Therefore, AuthInfoProvider, whichever service it runs on, can only ask for RoleBindings that this service is allowed to see (according to metadata.services.allowed_services).

The IAM Controller copies organization-level RoleBindings to child sub-organizations and projects, but we don’t copy (at least not yet) RoleBindings from a service project to a service. We also don’t copy system-level RoleBindings to all existing projects and organizations. It should typically stay that way, because system-level role bindings are rather internal and should not leak to organization/project admins. The module for copying RoleBindings is in the file iam/controller/v1/iam_scope/org_rbs_copier.go. It also handles changes in the parent organization field.

During authorization, AuthInfoProvider must list and fetch all RoleBindings for each memberId/IAM scope combination. It must also fetch only role bindings relevant to the current service. We first try the local cache and, in case of a miss, ask IAM. This is why CheckPermissions grabs all possible RoleBindings; we filter them by subScope or role ID later on. We try to strip all unnecessary fields so that AuthInfoProvider can hold as much data as possible (it is a RAM-based cache). Additionally, we try to use integer identifiers for roles and permission names.

To hold the RoleBindings of one member ID, we may need around two KiB of data on average; if we also cache the principal, let’s say four. With one MiB we could then hold data for 256 principals, and 256 MiB can hold about 65K principals. Let’s divide by two for a safety margin. As a result, we can expect 256 MiB to hold tens of thousands of active users. This is why AuthInfoProvider caches all RoleBindings a principal can have in each scope. We fetch data from IAM only when the cache expires, for new principals, or when the server starts up for the first time. This is also why GetAssignments (a method of the RoleBindings store) looks the way it does.
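The capacity estimate above is simple integer arithmetic and can be reproduced directly. The four KiB per principal is the rough figure from the text, not a measured value, and principalsThatFit is a hypothetical helper.

```go
package main

import "fmt"

const (
	kib = 1024
	mib = 1024 * kib
)

// principalsThatFit estimates how many principals fit in a cache of the given
// size, dividing by safetyDivisor to leave headroom.
func principalsThatFit(cacheBytes, bytesPerPrincipal, safetyDivisor int) int {
	return cacheBytes / bytesPerPrincipal / safetyDivisor
}

func main() {
	perPrincipal := 4 * kib // RoleBindings plus the cached principal, estimated

	fmt.Println(principalsThatFit(1*mib, perPrincipal, 1))   // 256
	fmt.Println(principalsThatFit(256*mib, perPrincipal, 1)) // 65536
	fmt.Println(principalsThatFit(256*mib, perPrincipal, 2)) // 32768
}
```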

When we have all RoleBindings for relevant members and relevant IAM scope, then we can iterate PermissionCheck (object + permission) against all assignments. If many assignments match the given PermissionCheck, then PermissionCheck will have multiple Results (variable).

RoleBindings (converted to RoleAssignment for slimmer RAM usage) are matched with permissions if:

  • they have owned_objects that match the object name in the PermissionCheck.
  • if the above fails, we check whether the Role pointed to by the RoleBinding has any Grants containing the permission specified in the PermissionCheck.
  • if there are such Grants, we need to check whether the subScope matches (if it is specified). PermissionCheck contains an IAM scope and a sub-scope, which together form a full object name. This gives us granularity down to specific resources.
  • if we find a Grant matching the PermissionCheck, we store it in Results. Note that a Grant can carry conditions, but we have not evaluated them yet.
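The matching steps listed above can be sketched as follows. Everything here is an assumption made for illustration: the struct shapes, the exact-equality test for owned objects, the prefix rule for sub-scopes, and the permission names are hypothetical and are not taken from the AuthInfoProvider code.

```go
package main

import (
	"fmt"
	"strings"
)

// Grant is a simplified role grant.
type Grant struct {
	Permissions []string
	SubScope    string   // empty means the whole IAM scope
	Conditions  []string // names only; they are evaluated in a later step
}

// RoleAssignment is a slimmed-down RoleBinding with its role's grants inlined.
type RoleAssignment struct {
	OwnedObjects []string
	Grants       []Grant
}

// PermissionCheck asks for one permission on one object.
type PermissionCheck struct {
	Scope, SubScope, Permission string
}

// Object returns the full object name formed by scope and sub-scope.
func (c PermissionCheck) Object() string {
	if c.SubScope == "" {
		return c.Scope
	}
	return c.Scope + "/" + c.SubScope
}

// Result records one matching grant and the conditions still to be evaluated.
type Result struct{ Conditions []string }

func contains(items []string, want string) bool {
	for _, item := range items {
		if item == want {
			return true
		}
	}
	return false
}

// subScopeMatches assumes a grant sub-scope covers itself and anything below it.
func subScopeMatches(grantSub, checkSub string) bool {
	return grantSub == "" || checkSub == grantSub || strings.HasPrefix(checkSub, grantSub+"/")
}

// match collects all results for a check; conditions are not evaluated here.
func match(check PermissionCheck, assignments []RoleAssignment) []Result {
	var results []Result
	for _, a := range assignments {
		if contains(a.OwnedObjects, check.Object()) {
			results = append(results, Result{}) // ownership carries no conditions
			continue
		}
		for _, g := range a.Grants {
			if contains(g.Permissions, check.Permission) && subScopeMatches(g.SubScope, check.SubScope) {
				results = append(results, Result{Conditions: g.Conditions})
			}
		}
	}
	return results
}

func main() {
	assignments := []RoleAssignment{
		{OwnedObjects: []string{"projects/p1/devices/d1"}},
		{Grants: []Grant{{Permissions: []string{"devices.get"}, SubScope: "devices", Conditions: []string{"cond-a"}}}},
	}
	check := PermissionCheck{Scope: "projects/p1", SubScope: "devices/d1", Permission: "devices.get"}
	fmt.Println(len(match(check, assignments)))
}
```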

Thanks to the cache, I/O work by AuthInfoProvider is practically non-existent; typically, it can quickly provide the list of assigned permissions along with their conditions.

ConditionChecker for Authorizer

Each PermissionCheck can have multiple results, which can contribute to allowed Permissions. If the result item has no conditions, then we can assume permissions are granted. If it has, then all conditions must be evaluated successfully, so we iterate in the Authorizer code.

ConditionChecker is implemented in file iam/auth/internal/condition_checker.go. We have 3 condition types:

  1. checking by resource field, function checkByResourceField
  2. checking by request field, function checkByRequestField
  3. checking by CEL condition, function checkByCELCondition (will be retired though).

Resource conditions are the most popular, and for good reason: they are simple and can handle at least full CRUD, and often custom functions too. For example, suppose we want to give certain users access to devices if the field path satisfies metadata.annotations.key = value:

  • CreateDeviceRequest will be forbidden if this field path with a given value is not specified in the resource body.
  • UpdateDeviceRequest will be forbidden if we are trying to update this field path to a different value or if the current resource stored in the database does not match.
  • DeleteDeviceRequest checks if the Device in the database matches.
  • Get/BatchGetDevice(s): resources are extracted from the database and the condition is checked.
  • WatchDevice is also checked when the stream starts: we grab the resource from the database and evaluate it.
  • ListDevices and WatchDevices have a Filter field, so we don’t need to grab anything from DB.
  • If there are custom methods, we can still get resources from DB and check if the field path is fine.
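The create, update, and delete rules from the list above can be sketched over a flattened resource. The Resource map, FieldCondition type, and allow* helpers are hypothetical; the real checkByResourceField works on typed resources and field paths.

```go
package main

import "fmt"

// Resource is a flattened view of a resource: field path -> value.
type Resource map[string]string

// FieldCondition requires a field path to hold a specific value.
type FieldCondition struct{ Path, Value string }

// matches reports whether the resource has the required value at the path.
func (c FieldCondition) matches(r Resource) bool {
	v, ok := r[c.Path]
	return ok && v == c.Value
}

// allowCreate: the request body itself must satisfy the condition.
func allowCreate(c FieldCondition, body Resource) bool { return c.matches(body) }

// allowUpdate: both the stored state and the resulting state must satisfy it,
// so the field can neither be changed away nor taken over from another value.
func allowUpdate(c FieldCondition, stored, updated Resource) bool {
	return c.matches(stored) && c.matches(updated)
}

// allowDelete: the stored state must satisfy the condition.
func allowDelete(c FieldCondition, stored Resource) bool { return c.matches(stored) }

func main() {
	cond := FieldCondition{Path: "metadata.annotations.key", Value: "value"}
	stored := Resource{"metadata.annotations.key": "value"}
	changed := Resource{"metadata.annotations.key": "other"}

	fmt.Println(allowCreate(cond, stored))          // true
	fmt.Println(allowUpdate(cond, stored, changed)) // false
	fmt.Println(allowDelete(cond, changed))         // false
}
```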

We also support attach permissions with resource field conditions, if necessary, we fetch resources from other services. Fetching is done before condition evaluations.

A minor weakness is the need for extra database reads. The object may be stored in Redis, which perhaps gives a faster answer, but the request still goes through the network stack. Perhaps another RAM-based cache could be used for storage, but invalidation may be a problem if we want to include List queries. For resource updates, we need to invalidate both the previous and the new state, and the Firestore watch shows us only the new state. Mongo may be more beneficial in this case, especially if we consider the fact that it has active watches for all collections (!!!). It may work especially well for collections that are updated infrequently.

Checks by request are simpler and aimed at custom methods typically.

Checks by CEL condition are used less and less in v1, but they may still have some special use cases if the yaml (protobuf) declaration is not enough. They use conditions with bodies specified in the iam.edgelq.com/Condition resource. ConditionChecker uses AuthInfoProvider to grab Conditions from IAM.

4 - SPEKTRA Edge IAM Cache Invalidation

Understanding the SPEKTRA Edge IAM cache invalidation.

AuthInfoProvider relies on RAM cache for low latency processing. The problem is with invalidation. To achieve a long-living cache, we need real-time invalidation straight from the database.

This is why each “store” module in AuthInfoProvider has one or more goroutines using a real-time watch. When some object is updated, we may need to update or invalidate the cache. In case of a prolonged loss of access to IAM, the store invalidates the whole cache and retries.
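The watch-driven invalidation can be sketched with a goroutine that consumes change events from a channel. The event and store types are hypothetical; in the real stores the events come from IAM watch streams, and reconnecting after a lost watch is not shown here.

```go
package main

import (
	"fmt"
	"sync"
)

// event is a simplified change notification from a watch stream.
type event struct {
	Key  string // name of the changed object
	Lost bool   // true when the watch was broken for too long
}

// store is a simplified RAM cache guarded by a mutex.
type store struct {
	mu    sync.Mutex
	items map[string]string
}

func (s *store) put(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[key] = value
}

func (s *store) get(key string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	v, ok := s.items[key]
	return v, ok
}

// runInvalidation drops single entries on change events and the whole cache
// when the watch was lost. It returns when the events channel is closed.
func (s *store) runInvalidation(events <-chan event) {
	for ev := range events {
		s.mu.Lock()
		if ev.Lost {
			s.items = map[string]string{}
		} else {
			delete(s.items, ev.Key)
		}
		s.mu.Unlock()
	}
}

func main() {
	s := &store{items: map[string]string{}}
	s.put("a", "1")
	s.put("b", "2")

	events := make(chan event)
	done := make(chan struct{})
	go func() {
		s.runInvalidation(events)
		close(done)
	}()

	events <- event{Key: "a"} // object "a" changed in IAM
	close(events)
	<-done

	_, okA := s.get("a")
	_, okB := s.get("b")
	fmt.Println(okA, okB) // false true
}
```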

Invalidation of principals is done using the WatchPrincipals method. This allows IAM to ensure that only selected (allowed) principals are seen by a service.

5 - SPEKTRA Edge Multi-Service Authorization

Understanding the SPEKTRA Edge multi-service authorization.

The main authorization happens when the user sends a request to a service; it is located at the front. However, sometimes a service executing a request needs to send further requests to other services. One common example is the EstablishReferences call in the Schema Mixin service. It is assumed that services don’t trust each other, and it shows here too. Even if, say, the devices service allows UpdateDevice, IAM needs to check on its own whether UpdateDevice can update the reference in the field spec.service_account (a field in the Device resource, pointing at a ServiceAccount from IAM). We rely on the fact that establishing cross-region and cross-service references requires a call to EstablishReferences.

We even have special authorization for that: see the file mixins/schema/server/v1/resource_shadow/resource_shadow_service_custom_auth.go. In this file, we check the referenced resources and verify whether the reference is allowed for the service making the call, or for the user who originally made the request. In the future, we may drop the original user and require that the service itself has access to the referenced resources.

That should typically be the case anyway: a ServiceAccount pointed to by a Device should be owned by devices (metadata.services.owning_service). The same goes for logging or monitoring buckets. We may first need proper attach permission checks for resources, and support for resource field conditions.

Other than that, subsequent service-to-service calls are treated separately, with one service verifying the other.

6 - SPEKTRA Edge E-mail Sender

Understanding the SPEKTRA Edge e-mail sender system.

Another third-party service we use, apart from Auth0, is Sendgrid. You should see its config in iam/config/apiserver.proto. It is our second service for emails: Auth0 itself also sends emails, such as account verification messages. After all, Users are stored in the Auth0 service, and IAM just gets copies.

However, invitations (ProjectInvitations and OrganizationInvitations) are sent using Sendgrid. See iam/invitationpublisher directory.

7 - SPEKTRA Edge Multi-Service Environment Safety

Understanding the SPEKTRA Edge multi-service environment safety.

IAM needs to ensure safety not only between tenants (Organizations, Projects) but also between Services. For this reason, RoleBindings are also scoped per Service. There are, however, several service-related problems we need to solve:

  • Organizations and Projects can enable services they use, and if they do, they should be able to use these Services. IAM must ensure that the organization/project admin cannot enable services if they don’t have permission to. IAM must ensure the service does not have access to organizations/projects which don’t enable a particular service.
  • If the Organization or Project enables Service, then the Service should be able to access the Project/Organization.
  • Ideally, the Service should be able to freely access all resources a Project or Organization has, as long as those resources are defined by that service. For example, service devices.edgelq.com must always be able to access any instance of devices.edgelq.com/Device. However, other services should have limited access to devices.edgelq.com/Device collection. It’s called limited access to Projects/Organizations: Project/org admins should be able to regulate which resources are accessible.
  • Some Services may be “private”.
  • If services are using/importing each other, they need some limited access to each other.

Private services are protected by attach permissions, so project/org admins can’t just enable any Service: enabling one requires updating the list of references to Services, after all.

These requirements are addressed in the IAM fixtures; see iam/fixtures/v1/iam_roles.yaml.

First, see this role: services/iam.edgelq.com/roles/service-user. This gives access to Service data and the ability to attach it. If the project admin is granted this role in a service scope, they can enable that service.

Next, a Service should be able to access all relevant resources that projects or organizations have, without specifying the exact instance. This is why we have the IAM role services/iam.edgelq.com/roles/base-edgelq-service, which grants access to all resources across orgs/projects, as long as certain conditions are met. Note that we don’t give any create permissions. That would be wrong, because the Service could start creating resources with the proper metadata.services field without checking whether the project/org even uses the service. This is not an issue for non-creating permissions. To allow services to create project- or organization-scoped resources, we have the services/iam.edgelq.com/roles/service-to-project-access and services/iam.edgelq.com/roles/service-to-org-access roles. RoleBindings for these roles are created dynamically by the IAM controller when a Project/Organization enables some service. This code is located in iam/controller/v1/iam_scope.

We also need to regulate service-to-service access. By default, this is not allowed. However, if one service imports or uses another, we enable their access to each other. Roles for these scenarios are in iam/fixtures/v1/per_service_roles.yaml. Roles with ID importing-service-access and imported-service-access are granted to the importing and the imported service respectively. Note that the two roles are not symmetrical, and they do not need to be. For example, if one service imports another, then EstablishReferences is only needed in one direction. Roles with ID service-to-service-std-access are used for minimal standard access.

All those RoleBindings regulating access between services, and between services with projects/organizations, are called “Service RoleBindings”. They are dynamically created by the IAM Controller when a service is created/updated, or when an organization/project enables some service. The module responsible for these RoleBindings is in file iam/controller/v1/iam_scope/service_rbs_syncer.go:

  • makeDesiredRbsForSvc computes desired Service RoleBindings per each Service.
  • makeDesiredRbsForOrg computes desired Service RoleBindings per each Organization.
  • makeDesiredRbsForProject computes desired Service RoleBindings per each Project.
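
The "compute desired, then sync" pattern behind these functions can be illustrated with a minimal sketch. It is not the code from service_rbs_syncer.go: the roleBinding struct, the desiredProjectBindings and diffRoleBindings helpers, and the "service:" member notation are all hypothetical simplifications. Only the role name comes from the fixtures mentioned above.

```go
package main

import "fmt"

// roleBinding is a simplified, hypothetical stand-in for an IAM RoleBinding.
type roleBinding struct {
	Scope  string // e.g. "projects/p1"
	Role   string
	Member string
}

// key identifies a binding by its full content.
func (rb roleBinding) key() string {
	return rb.Scope + "|" + rb.Role + "|" + rb.Member
}

const projectAccessRole = "services/iam.edgelq.com/roles/service-to-project-access"

// desiredProjectBindings returns one binding per service enabled in a project.
// The "service:" member notation is an assumption made for this sketch.
func desiredProjectBindings(projectID string, enabledServices []string) []roleBinding {
	desired := make([]roleBinding, 0, len(enabledServices))
	for _, svc := range enabledServices {
		desired = append(desired, roleBinding{
			Scope:  "projects/" + projectID,
			Role:   projectAccessRole,
			Member: "service:" + svc,
		})
	}
	return desired
}

// diffRoleBindings returns the bindings that must be created and deleted
// so that the existing set matches the desired set.
func diffRoleBindings(desired, existing []roleBinding) (toCreate, toDelete []roleBinding) {
	desiredKeys := make(map[string]bool, len(desired))
	for _, rb := range desired {
		desiredKeys[rb.key()] = true
	}
	existingKeys := make(map[string]bool, len(existing))
	for _, rb := range existing {
		existingKeys[rb.key()] = true
	}
	for _, rb := range desired {
		if !existingKeys[rb.key()] {
			toCreate = append(toCreate, rb)
		}
	}
	for _, rb := range existing {
		if !desiredKeys[rb.key()] {
			toDelete = append(toDelete, rb)
		}
	}
	return toCreate, toDelete
}

func main() {
	desired := desiredProjectBindings("p1", []string{"devices.edgelq.com", "secrets.edgelq.com"})
	existing := []roleBinding{desired[0]}
	toCreate, toDelete := diffRoleBindings(desired, existing)
	fmt.Println("create:", len(toCreate), "delete:", len(toDelete))
}
```

Computing the full desired set first and then diffing it against the current state keeps the syncer idempotent: running it again after a partial failure produces only the remaining creates and deletes.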

Note the convention for mixin services: each service has its own copy of the mixin permissions, such as services/<service>/permissions/resourceShadows.listMetaOwnees. This is because every service has its own schema mixins. RoleBindings for those “per service roles” are located on the root scope; see function makeDesiredRbsForSvc in file iam/controller/v1/iam_scope/service_rbs_syncer.go. The reason is that ResourceShadow is a “root” resource (its name pattern is resourceShadows/{resourceShadow}, not something like services/{service}/resourceShadows/{resourceShadow}). It could perhaps have been service-scoped, but the root pattern is kept for continuity with v1alpha2, and CLI commands would otherwise become less intuitive. To enable per-service access, therefore, the permissions themselves are per-service. If we create services/<service>/permissions/resourceShadows.listMetaOwnees for each service, and create a root-scope RoleBinding containing this permission, then in effect it is granted for that specific service only, not for all.
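
To illustrate why a root-scope RoleBinding with a per-service permission stays limited to one service, here is a minimal sketch. The rootBinding type and the allows check are hypothetical and much simpler than the real Authorizer; only the permission name pattern is taken from the text above.

```go
package main

import "fmt"

// perServicePermission builds a permission name following the pattern
// services/<service>/permissions/<resource>.<verb> described above.
func perServicePermission(service, resource, verb string) string {
	return fmt.Sprintf("services/%s/permissions/%s.%s", service, resource, verb)
}

// rootBinding is a hypothetical root-scope binding holding granted permission names.
type rootBinding struct {
	Permissions map[string]bool
}

// allows reports whether the binding grants the verb on the resource
// in the context of the given service.
func (b rootBinding) allows(service, resource, verb string) bool {
	return b.Permissions[perServicePermission(service, resource, verb)]
}

func main() {
	perm := perServicePermission("devices.edgelq.com", "resourceShadows", "listMetaOwnees")
	b := rootBinding{Permissions: map[string]bool{perm: true}}

	// Granted for devices.edgelq.com only, although the binding sits on the root scope.
	fmt.Println(b.allows("devices.edgelq.com", "resourceShadows", "listMetaOwnees"))
	fmt.Println(b.allows("secrets.edgelq.com", "resourceShadows", "listMetaOwnees"))
}
```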

8 - SPEKTRA Edge IAM Principal Tracking

Understanding the SPEKTRA Edge principal tracking.

ServiceAccounts are project-scoped resources, but in theory, they can be granted roles in other projects and organizations too. Users are, in IAM terms, global resources, not necessarily bound to any organizational entity. They can however join any project or organization.

Members (Users or ServiceAccounts) are associated with projects/organizations via RoleBinding resources. Organizational role bindings are copied to downstream child projects/organizations by the IAM Controller (iam/controller/v1/iam_scope/org_rbs_copier.go).

If you visit iam/server/v1/role_binding/role_binding_service.go, you should note that for each written or deleted RoleBinding we manage a MemberAssignment resource. See iam/proto/v1/member_assignment.proto for more details; its role is described there.

Generally, though, one instance of MemberAssignment is created per each scope/member combination. This internal resource facilitates tracking of members in organizational entities.
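
The relation between RoleBindings and MemberAssignments can be sketched as follows. This is a simplified illustration, not the server code: the struct fields, the assignmentsFor helper, and the "user:" member notation are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// roleBinding is a simplified, hypothetical stand-in for an IAM RoleBinding.
type roleBinding struct {
	Scope  string // e.g. "projects/p1" or "organizations/o1"
	Member string // hypothetical notation, e.g. "user:alice"
	Role   string
}

// memberAssignment models one scope/member combination.
type memberAssignment struct {
	Scope  string
	Member string
}

// assignmentsFor derives the set of assignments implied by a list of bindings:
// several bindings of the same member in the same scope collapse into one entry.
func assignmentsFor(bindings []roleBinding) []memberAssignment {
	seen := make(map[memberAssignment]struct{})
	for _, rb := range bindings {
		seen[memberAssignment{Scope: rb.Scope, Member: rb.Member}] = struct{}{}
	}
	out := make([]memberAssignment, 0, len(seen))
	for ma := range seen {
		out = append(out, ma)
	}
	// Sort for a deterministic result.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Scope != out[j].Scope {
			return out[i].Scope < out[j].Scope
		}
		return out[i].Member < out[j].Member
	})
	return out
}

func main() {
	bindings := []roleBinding{
		{Scope: "projects/p1", Member: "user:alice", Role: "viewer"},
		{Scope: "projects/p1", Member: "user:alice", Role: "editor"},
		{Scope: "organizations/o1", Member: "user:alice", Role: "viewer"},
	}
	for _, ma := range assignmentsFor(bindings) {
		fmt.Println(ma.Scope, ma.Member)
	}
}
```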

Members can see a list of their projects/organizations via ListMyProjects/ListMyOrganization calls. To make such calls possible, we use the MemberAssignment helper collection, and we also copy many project/organization fields directly to MemberAssignment instances. Therefore, project/organization filter/orderBy/fieldMask/cursor objects can mostly be translated to MemberAssignment ones. To make this work, MemberAssignment is a regional but globally synced resource: its copies are spread through all IAM regions. Regional status ensures that each region is responsible for tracking members in its local organizations/projects. IamDbController syncs all created copies across all regions, so each region knows the full list of projects/organizations in which a given member participates.

When project/organization fields change (like the title), the IAM Controller is responsible for propagating the change to all MemberAssignment instances. The implementation is in file iam/controller/v1/mem_assignments_meta_updater.go.

9 - SPEKTRA Edge Principal Service Access

Understanding the SPEKTRA Edge principal service access.

A RoleBinding does not only bind a User/ServiceAccount to a project/organization. It also binds a member to a service. For example, a devices manager Role in project X would bind a given member not only to project X but also to the devices.edgelq.com service (and applications & secrets, since those are related). Each Role has a populated field metadata.services, which points to the services where the Role is relevant. RoleBinding also has metadata.services populated, and it contains the combined services from the Role and the parent object (organization, project, or service).
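
As a rough illustration of how the combined list could be formed, the sketch below takes the union of the Role services and the parent services. Treating "combined" as a plain set union is an assumption, and the combinedServices helper and the example service lists are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// combinedServices returns the sorted union of the services listed on a Role
// and the services listed on the parent object of the RoleBinding.
func combinedServices(roleServices, parentServices []string) []string {
	set := make(map[string]struct{}, len(roleServices)+len(parentServices))
	for _, s := range roleServices {
		set[s] = struct{}{}
	}
	for _, s := range parentServices {
		set[s] = struct{}{}
	}
	out := make([]string, 0, len(set))
	for s := range set {
		out = append(out, s)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical example values.
	roleServices := []string{"devices.edgelq.com", "applications.edgelq.com", "secrets.edgelq.com"}
	parentServices := []string{"iam.edgelq.com", "devices.edgelq.com"}
	fmt.Println(combinedServices(roleServices, parentServices))
}
```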

When RoleBinding is created, IAM internally creates a MemberAssignment instance per each unique combination of member/scope, and this MemberAssignment will have a “scope” field pointing to the project or organization. However, there is something more to it. IAM will also create additional MemberAssignment objects where the “scope” field points to a Service! Those Service-level MemberAssignment instances are used to track in which services the given Member (User or ServiceAccount) is participating.

The IAM Controller has a dedicated module (iam/controller/v1/iam_scope/service_users_updater.go), which ensures that the field metadata.services is in sync (for Users, ServiceAccounts, and ServiceAccountKeys). It does so by watching MemberAssignment changes and building a summary of the services in use. If it notices that some user/service account has inaccurate data, it will issue an update request to IAM. Each region of IAM is responsible for watching local members, but they will access all MemberAssignment instances since those are synced globally.

Keeping the field metadata.services of all Users/ServiceAccounts in sync serves two purposes:
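
The summarizing step can be pictured with the following sketch. It is not the code from service_users_updater.go: the memberAssignment struct, the servicesInUse and needsUpdate helpers, and the "user:" member notation are hypothetical. It only assumes that service-level assignments have a scope of the form services/<service>.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// memberAssignment is a simplified, hypothetical model of a MemberAssignment.
type memberAssignment struct {
	Scope  string // "projects/...", "organizations/..." or "services/..."
	Member string
}

const servicePrefix = "services/"

// servicesInUse summarizes which services a member participates in,
// based on service-scoped assignments only.
func servicesInUse(member string, assignments []memberAssignment) []string {
	set := make(map[string]struct{})
	for _, ma := range assignments {
		if ma.Member != member || !strings.HasPrefix(ma.Scope, servicePrefix) {
			continue
		}
		set[strings.TrimPrefix(ma.Scope, servicePrefix)] = struct{}{}
	}
	out := make([]string, 0, len(set))
	for s := range set {
		out = append(out, s)
	}
	sort.Strings(out)
	return out
}

// needsUpdate reports whether the stored metadata.services list differs
// from the computed summary, ignoring order.
func needsUpdate(current, summary []string) bool {
	if len(current) != len(summary) {
		return true
	}
	a := append([]string(nil), current...)
	b := append([]string(nil), summary...)
	sort.Strings(a)
	sort.Strings(b)
	for i := range a {
		if a[i] != b[i] {
			return true
		}
	}
	return false
}

func main() {
	assignments := []memberAssignment{
		{Scope: "projects/p1", Member: "user:alice"},
		{Scope: "services/devices.edgelq.com", Member: "user:alice"},
		{Scope: "services/secrets.edgelq.com", Member: "user:alice"},
	}
	summary := servicesInUse("user:alice", assignments)
	fmt.Println(summary, needsUpdate([]string{"devices.edgelq.com"}, summary))
}
```

Only when needsUpdate returns true would an update request be worth sending, which keeps the controller quiet while the data is already accurate.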

  • It ensures that a given member can access Service-related data.
  • It ensures that the given Service can access member data (via GetPrincipal).

If you check the file iam/fixtures/v1/iam_role_bindings.yaml, you should notice the special RoleBinding roleBindings/services-participant. It is a root-level RoleBinding given to all authenticated members, granting the services/iam.edgelq.com/roles/selected-service-user role. This role is a multi-service one. If you look at its contents in iam/fixtures/v1/iam_roles.yaml, you should see that it gives many read/attach permissions to a holder in a specified list of services. In the RoleBinding yaml definition, note that this list of services comes from the principal metadata object. This is how principals get automatic access to a Service.

Role services/iam.edgelq.com/roles/selected-service-user is similar to services/iam.edgelq.com/roles/selected-user. The latter one should be used on the service level to provide access to that single service, to someone who has no other access there. The former has an internal purpose, gives access to many services at once, and will only be assigned to members who already have some access to specified services. It just ensures they can access service meta-information.