SPEKTRA Edge Monitoring Time Series
You can see the proper TimeSeries storage interface, `TimeSeriesStore`, in the file `monitoring/ts_store/v4/store.go`. Here we will study `QueryTimeSeries` and `SaveTimeSeries`, which are the most important for our case.
TimeSeries are the most special of all widecolumn types: the same stored time series can look completely different depending on the query. When group-by is in use (which is almost always the case!), stored time series are merged, dynamically forming new ones that differ from what was submitted. This should already be familiar from the user guide, though.
As per the user documentation, it should be clear that we use `MonitoredResourceDescriptor` and `MetricDescriptor` instances to create the final identifiers of TimeSeries (keys), and a list of promoted indices for faster queries. Before we explain saving/deleting, it is worth checking the `initTsData` function in the `monitoring/ts_store/v4/store.go` file. This is where we initialize:
- `tsKeyDesc`, which contains the common part of the descriptor for all time series, regardless of labels in metric or resource descriptors.
- The common index prefix, consisting of the `project` and `metric.type` fields from TimeSerie.
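As a rough sketch of what this initialization sets up, the snippet below models a common key descriptor and index prefix. All types and names here are hypothetical stand-ins, not the real `uid` package API:

```go
package main

import "fmt"

// keyFieldDesc is a hypothetical stand-in for a single field entry in the
// common time-series key descriptor (the real type lives in the uid package).
type keyFieldDesc struct {
	Name     string
	Required bool
}

// commonTsKeyDesc returns the field descriptors shared by every time series,
// regardless of the labels declared by metric/resource descriptors.
func commonTsKeyDesc() []keyFieldDesc {
	return []keyFieldDesc{
		{Name: "project", Required: true},
		{Name: "region", Required: true},
		{Name: "metric.type", Required: true},
		{Name: "resource.type", Required: true},
	}
}

// commonIndexPrefix mirrors the common index prefix built from the
// project and metric.type fields of a TimeSerie.
func commonIndexPrefix(project, metricType string) string {
	return fmt.Sprintf("%s/%s", project, metricType)
}
```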
Additional notes:
- TimeSeries storage has no concept of regional replication; each region contains only its own data.
- This document describes only TimeSeries storage, but not the full monitoring design, which is described separately.
Overall, the `TimeSerie` object is mapped into `KeyedValues` from WideColumn, but note it is a somewhat higher-level object. Each mapping requires some level of design. This is done in the following way:
- TimeSerie fields `project`, `region`, `metric`, and `resource` are mapped to various String key-values, together forming a `uid.SKey`. This SKey is then mapped to some `uid.Key`, which is a short integer representation. Note that `uid.Key` is the identifier of `KeyedValues`.
- The TimeSerie field `key` is a binary-encoded `uid.Key`, which holds the "compressed" project, region, metric, and resource fields.
- The TimeSerie field `points`, which is a repeated array, is converted into widecolumn `dataentry.ColumnValue` objects, one by one.
A single Point in TimeSerie is mapped to `dataentry.ColumnValue` like this:
- TypedValue, which holds the actual value, is mapped to binary data (bytes). We will need to marshal it.
- The timestamp of the point naturally maps to the time in `dataentry.ColumnValue`. Note that AlignmentPeriod, apart from the 0 value, is divisible by 60 seconds. This means that the timestamp will be some multiple of the AP value.
- The Family in `dataentry.ColumnValue` will contain the AlignmentPeriod (in string format). It means that values from a single AP will be stored in separate column families (tables)!
- The `dataentry.ColumnValue` interface also has an optional key; in Monitoring, we store the perSeriesAligner value from the Point Aggregation there. For example, this allows us to save `ALIGN_RATE` (some double) and `ALIGN_SUMMARY` (some distribution) with the same key and same timestamp, in the same column family, but next to each other.
In monitoring, we will not save raw points (alignment period = 0 seconds) in the database. We will store only aligned values.
Saving
The writing of time series is implemented fully in the `monitoring/ts_store/v4/store_writing.go` file.
There is a possibility, when we save time series, that the `TimeSerie` object already contains a `key` field value, in which case we don't need to resolve it ourselves. However, we will still need to verify that it is correct! The client may choose to submit TimeSeries with binary keys to make the final request a bit smaller (they can drop the project, region, metric, and resource fields).
Moreover, from TimeSerie we can and must get the metric and resource descriptors, from which we can finally compute indices. This way, we can wrap `KeyedValues` into `IndexedKeyedValues`, as required by widecolumn storage. Note that those two descriptor resources describe what fields are possible in a TimeSerie object in general; they regulate `metric.labels` and `resource.labels`. If we mapped MonitoredResourceDescriptor/MetricDescriptor to widecolumn store types, they would map to `uid.KeyFieldsDescriptor`!
In the implementation, we wrap each `TimeSerie` into `CreateTssOperationResult`, a type defined in the `monitoring/ts_store/v4/store.go` file; see it there. It contains the params of `KeyedValues`, along with the associated descriptor resources. Then, the metric and resource descriptor resources are wrapped together with `uid.KeyFieldsDescriptor` types, to which they de facto map.
When we save time series, we map `TimeSerie` into `CreateTssOperationResult` already in `initiateCreateOperations`. Inside this function, we validate basic properties of the TimeSerie object: the project, metric, and resource type fields. We use the field descriptor `tsKeyDesc`, which was initialized in the store constructor. At this initial stage, we don't know the exact metric and resource descriptor types, so we validate only basic properties! If the binary key is provided, we initialize a `descKey` instance, otherwise a `descSKey`. The former is better for performance, but not always possible. Note that at this stage we have described keys and validated base properties, but the descriptors still have work to do.
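The binary-key versus string-key decision at this initial stage can be sketched roughly as below. The `timeSerie` and `describedKey` types are hypothetical simplifications, not the real `descKey`/`descSKey` types:

```go
package main

// timeSerie is a hypothetical, flattened stand-in for the TimeSerie object.
type timeSerie struct {
	Key          []byte // optional pre-resolved binary key submitted by the client
	Project      string
	Region       string
	MetricType   string
	ResourceType string
}

// describedKey is a stand-in for the descKey/descSKey pair: exactly one
// of the two representations is populated at this stage.
type describedKey struct {
	Binary []byte // binary key path: faster, no string-to-int resolution needed
	SKey   string // string key path: resolved to an integer key later
}

// describeKey picks the key representation for a saved TimeSerie.
func describeKey(ts timeSerie) describedKey {
	if len(ts.Key) > 0 {
		// The client-sent binary key still has to be verified against
		// the descriptors later, but it skips resolution entirely.
		return describedKey{Binary: ts.Key}
	}
	return describedKey{SKey: ts.Project + "/" + ts.Region + "/" + ts.MetricType + "/" + ts.ResourceType}
}
```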
In the next stage, we grab descriptor references; see `getWrappedDescriptors`. It does not make any resolutions yet: the same descriptors may be used across many TimeSerie objects, so we don't want to do more resolutions than necessary. With Goten resource descriptors wrapped with `uid.KeyFieldsDescriptor`, we resolve in the `resolveAndPopulateDescriptors` function, where we finally get field descriptors in the required `uid` format. This allows us to execute the final, proper validation, and to compute indices for widecolumn.
Proper validation is done in the functions `defaultAndValidateHeaders` and `defaultAndValidatePoints`. In the second function, we also generate the final column values used by the widecolumn storage interface! However, note a "trap" in `defaultAndValidatePoints`:
```go
if ap == 0 && !md.GetStorageConfig().GetStoreRawPoints() {
	continue
}
```
Raw, unaligned points that clients send, with AP equal to 0 seconds, are skipped and not saved in the database; the second condition pretty much always evaluates to true! This is explained further in the monitoring design doc.
With the data validated and the output columns populated, we can now ensure that the output raw key in `CreateTssOperationResult` is present. If the binary key was not submitted when saving the TimeSerie (field `key`), then we need to use the resolver to allocate a string-to-integer pair. The mapping is of course saved in widecolumn storage. See the `ensureOutputKeysAreAllocated` function.
Next, with column values and raw keys, we need to wrap `KeyedValues` into indexed ones, `IndexedKeyedValues`. This is what we finally pass to widecolumn storage. Inside, keyed values are duplicated per each index and saved in the underlying storage.
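The wrapping and the resulting write amplification can be sketched as below. The struct names echo the real `KeyedValues`/`IndexedKeyedValues` but their shapes are hypothetical:

```go
package main

// keyedValues is a simplified stand-in for the widecolumn KeyedValues.
type keyedValues struct {
	Key    uint64   // integer uid.Key identifying the series
	Values []string // simplified stand-in for the column values
}

// indexedKeyedValues attaches the promoted indices required by widecolumn.
type indexedKeyedValues struct {
	keyedValues
	Indices []string // indices computed from the metric/resource descriptors
}

// wrapWithIndices builds the object finally passed to widecolumn storage.
func wrapWithIndices(kv keyedValues, indices []string) indexedKeyedValues {
	return indexedKeyedValues{keyedValues: kv, Indices: indices}
}

// writeAmplification shows how many physical copies one logical write
// produces: the keyed values are duplicated once per index.
func writeAmplification(ikv indexedKeyedValues) int {
	return len(ikv.Indices)
}
```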
Querying
When we query time series, we need to:
- Convert the time series query params into a WideColumn query object (mapping).
- Create a batch processor that maps KeyedValues from widecolumn storage into TimeSerie objects.

This implementation is fully in `monitoring/ts_store/v4/store_querying.go`.
The widecolumn store, unlike the regular Goten document one, supports OR groups, though this is more like executing multiple queries at once. Each query group represents a filter with a set of AND conditions, and each can be executed on different indices. We need to deal with this specific trait of the WideColumn interface, which requires us to specify indices when saving and querying.
When executing a query, we gather all input parameters and convert them into a `tssQuery` object, with `tssQueryGroup` as one OR group. This is not the format required by widecolumn, but an intermediary. See the function `createTssQuery`.
We support two types of queries for TimeSeries:
- With a filter specifying one or more binary keys (`WHERE key = "..."` or `WHERE key IN [...]`). Each key forms one "OR" group, with just a long list of AND conditions.
- With a filter specifying a set of metric/resource conditions (`WHERE metric.type = "..." AND resource.type = "..." ...`). We also support IN conditions for those types. Resource types may optionally be omitted (but defaults are assumed then). For each combination of metric + resource type, we create one OR group.
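The expansion of IN conditions into one OR group per metric/resource pair, as done in `createTssQuery`, can be sketched like this (the `tssQueryGroup` shape here is a hypothetical simplification):

```go
package main

// tssQueryGroup is a simplified stand-in for one OR group of the
// intermediary tssQuery object: it pins exactly one metric type and
// one resource type, since each pair has its own promoted indices.
type tssQueryGroup struct {
	MetricType   string
	ResourceType string
}

// expandGroups turns IN lists of metric and resource types into the
// cross product of (metric, resource) pairs, one OR group per pair.
func expandGroups(metricTypes, resourceTypes []string) []tssQueryGroup {
	groups := make([]tssQueryGroup, 0, len(metricTypes)*len(resourceTypes))
	for _, mt := range metricTypes {
		for _, rt := range resourceTypes {
			groups = append(groups, tssQueryGroup{MetricType: mt, ResourceType: rt})
		}
	}
	return groups
}
```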
One group query must specify exactly one metric and one resource type, because each combined pair defines its own set of promoted indices; we MUST NOT combine them! This is reflected in `createTssQuery`.
For each OR group, we grab the descriptors, with which we can finally verify whether the conditions in the filter are defined correctly and whether group-by specifies existing fields, and we can compute the indices we know we can query.
Then, we map the query to a widecolumn one. Notable elements:
- We pass a reducing mask for fields; it makes the output `uid.Key` have some fields "reduced"!
- Most of the time we ignore the perSeriesAligner passed by the user and switch to `ALIGN_SUMMARY`. This is because when we save, we use almost exclusively `ALIGN_SUMMARY`. Other types can almost always be derived from the summary, so there is no point in maintaining all of them.
Finally, we execute the query with our batch processor. Thanks to the reducing mask, the field Key of each `KeyedValues` already has a reduced list of key values; the reduced keys are in the rest key set. This is how we implement group-by in monitoring. Each entry in the `resultEntries` map field of `queryResultProcessor` represents a final TimeSerie object. However, this is a very CPU- and RAM-intensive task, because widecolumn storage returns values as we saved them, all individual instances! If, for a single timestamp, we have thousands of entries sharing the same key, then we will merge thousands of points into one point, and we repeat this for each timestamp.
At least widecolumn guarantees that, when querying, results are returned only with an increasing `seq` value. If we query with AP = 300s, we will get points for, let's say, noon, then 12:05, and so on. When we see the sequence jump, we know we can add the final point to the increasing list.
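The merge-on-seq-jump loop in the batch processor can be sketched as below. The stream element type and the merge operation (a plain sum) are hypothetical simplifications of the real summary-value merging:

```go
package main

// rawValue is a simplified stand-in for one value returned by widecolumn:
// seq is the aligned timestamp, increasing across the stream.
type rawValue struct {
	Seq   int64
	Value float64 // simplified stand-in for a marshaled summary value
}

// mergePoints folds all raw values sharing a seq into one output point
// (here the merge is a sum for simplicity). Because the stream arrives in
// increasing seq order, a seq jump means the previous point is complete.
func mergePoints(stream []rawValue) []rawValue {
	var out []rawValue
	for _, rv := range stream {
		if n := len(out); n > 0 && out[n-1].Seq == rv.Seq {
			out[n-1].Value += rv.Value // same timestamp: merge in place
		} else {
			out = append(out, rv) // seq jumped: start the next point
		}
	}
	return out
}
```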
Still, this is a serious scaling issue if we try to merge large collections into a single reduced key.
Since the perSeriesAligner we passed to widecolumn differs from what the user requested, when we convert a ColumnValue into a TimeSerie point (function `buildPointFromWCColumnValue`), we need to extract the proper value from the summary object.
Once the TimeSeries are obtained, we resolve all integer keys into strings in bulk, and we can form the output time series.
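The bulk resolution step can be sketched as below; the `resolver` type is a hypothetical stand-in for the real string-to-integer key resolver:

```go
package main

// resolver is a hypothetical stand-in for the key resolver backed by
// widecolumn storage, holding the integer-to-string key mapping.
type resolver struct {
	byInt map[uint64]string
}

// resolveBulk translates all integer keys in one pass, rather than one
// resolver call per output time series; unresolved keys are reported back.
func (r *resolver) resolveBulk(keys []uint64) (map[uint64]string, []uint64) {
	resolved := make(map[uint64]string, len(keys))
	var missing []uint64
	for _, k := range keys {
		if s, ok := r.byInt[k]; ok {
			resolved[k] = s
		} else {
			missing = append(missing, k)
		}
	}
	return resolved, missing
}
```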