SPEKTRA Edge ActivityLogs and ResourceChangeLogs
The audit store for logs is implemented in the audit/logs_store/v1/store.go file. This interface also uses the WideColumn store under the hood. Each ActivityLog and ResourceChangeLog is mapped to KeyedValues.
Activity Logs
For ActivityLog, the mapping is the following:
- A lot of fields are used to create uid.Key/uid.SKey: scope, request_id, authentication, request_metadata, request_routing, authorization, service, method, resource.name, resource.difference.fields, category, labels.
- The name field contains the scope and the binary key uid.Key. However, when logs are saved, this is often not present in the request (the value is allocated on the server side).
- Each event in the events array is mapped to a single dataentry.ColumnValue. Then, we have two additional special column values: the fields resource.difference.before and resource.difference.after.
An ActivityLog typically has three events: client message, server message, and exit code. For streaming calls, the list of events may be much longer.
This is how an ActivityLog Event maps to ColumnValue in WideColumn:
- The whole event is marshaled into binary data and passed to ColumnValue.
- The Family field is always equal to one static value, ALStoreColumnType. It should be noted that all ActivityLogs use one single ActivityLog column family!
- We extract the time value from the ActivityLog Event and use it as the Time in ColumnValue. Note that in the V2 format, WideColumn only uses second precision for times!
- The additional column key contains the event type (client msg, server msg…) and the nanoseconds part of the timestamp. This is important if the ActivityLog contains streaming messages and we have more than one event of a single type within a second! This is how the column key type protects against overwrites.
The pre and post object diffs from ActivityLogs are mapped to ColumnValue as follows:
- The whole object is marshaled into binary data and passed to ColumnValue.
- The column family is equal to the constant value ALStoreColumnType.
- The timestamp is equal to that of the first event from the ActivityLog.
- The column key contains the type (before OR after field values) with nanoseconds from the timestamp. The nanoseconds are not strictly necessary here; they are included only to keep the format similar to that of the events.
Resource Change Logs
ResourceChangeLogs have the following mapping to KeyedValues:
- A lot of fields are used to create uid.Key/uid.SKey: scope, request_id, authentication, service, resource.type, resource.name, resource.pre.labels, resource.post.labels.
- The name field contains the scope and the binary key uid.Key. However, when logs are saved, this is often not present in the request (the value is allocated on the server side).
- ResourceChangeLogs are a bit unique: we marshal each of them whole into binary data, and that data forms the ColumnValue.
Each ResourceChangeLog typically has two ColumnValues because we save it twice: first before the transaction concludes (so we have a chance to protest before allowing the commit), and then again after the transaction concludes.
In summary, ColumnValue is formed this way:
- Binary data contains the whole marshaled log.
- The column family is set to the constant value StoreColumnType.
- Time is extracted from the request time (first client message received).
- The column key is also used: it is StoreColumnPreCommitType when the transaction.state field of the ResourceChangeLog is equal to PRE_COMMITTED; otherwise, it is StoreColumnFinalizedType.
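The column key selection can be illustrated with a short Go sketch. It rests on several assumptions: the string values of the two column key constants are invented, the row type is a toy model of a KeyedValues row, COMMITTED is used only as an example of a state other than PRE_COMMITTED, and preferring the finalized entry when reading is an assumed policy, not something confirmed by store.go.

```go
package main

import "fmt"

// Assumed values; the real constants live in audit/logs_store/v1/store.go.
const (
	storeColumnPreCommitType = "pre"
	storeColumnFinalizedType = "fin"
)

// preCommitted is the only transaction.state value named in the text.
const preCommitted = "PRE_COMMITTED"

// columnKeyFor picks the column key from the transaction state.
func columnKeyFor(state string) string {
	if state == preCommitted {
		return storeColumnPreCommitType
	}
	return storeColumnFinalizedType
}

// row is a toy model of one KeyedValues row: column key -> marshaled log.
type row map[string][]byte

// save stores the marshaled log under the key derived from its state.
func (r row) save(state string, marshaled []byte) {
	r[columnKeyFor(state)] = marshaled
}

// latest prefers the finalized entry and falls back to the pre-commit one
// (an assumed read policy for this sketch).
func (r row) latest() ([]byte, bool) {
	if v, ok := r[storeColumnFinalizedType]; ok {
		return v, true
	}
	v, ok := r[storeColumnPreCommitType]
	return v, ok
}

func main() {
	r := row{}
	r.save(preCommitted, []byte("log@pre-commit"))
	r.save("COMMITTED", []byte("log@final")) // illustrative non-PRE_COMMITTED state
	v, _ := r.latest()
	fmt.Println(len(r), string(v))
}
```

Because the two saves use different column keys, the second save does not overwrite the first, which is why a log typically ends up with two column values.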
If you check the NewStore constructor in the audit/logs_store/v1/store.go file, you will notice that, unlike in the monitoring store, we have quite big uid.KeyFieldsDescriptor instances for Resource change and Activity logs, and a ready set of indices, not just a common prefix.
If you have already analyzed how monitoring time series are written and queried, then checking the same for Audit logs will generally be simpler. They follow the same principles, with some differences:
- In monitoring, we had two descriptors per TimeSeries; in Audit, we have one descriptor for activity logs and another for resource change logs.
- Specific to resource change logs: we call SaveResourceChangeLogsCommitState, which internally again uses SaveResourceChangeLogs, the function used for saving logs in the PRE COMMIT state.
- For both Activity and ResourceChange logs, we don’t require descriptors. Labels are usually empty sets anyway, we already have large sets of promoted indices and labels, and this is useful for bootstrap processing, when descriptors may not be there yet.
- When we query resource change logs, we don’t need to resolve any integer keys, because whole logs were saved (see the onBatch function). We only need to handle the separate entries for the commit state.
- When we query logs, we get all logs up to second precision. This means that even if we have a very large number of logs in a single second, we cannot split them: continuation tokens (next page tokens) must use second precision, as required by the V2 storage format.
- Because logs are sorted by timestamp, but only with second precision, we need to re-sort them anyway.
Things could be improved with the v3 SPEKTRA Edge wide-column version.