Monitoring Pipeline Data Transformation
Understanding the SPEKTRA Edge monitoring pipeline data transformation.
With the data layout of the wide-column data store explained in the previous wide-column store page, let’s talk about the monitoring pipeline aspect of the SPEKTRA Edge monitoring system.
Unlike in the Logging or Audit services, the use of WideColumn is not the only distinguishing trait of the Monitoring TimeSerie resource.
When a client submits a TimeSerie object, the point values match the value type declared in the metric descriptor. For example, if we have a metric descriptor like:
- name: ...
  type: ...
  displayName: ...
  metricKind: GAUGE
  valueType: INT64
  # Other fields...
Then a given TimeSerie will have points like the following from a single writer (let’s assume that it sends one point every 30 seconds):
points:
- interval:
    endTime: 12:04:24
  value:
    int64Value: 123
- interval:
    endTime: 12:04:54
  value:
    int64Value: 98
- interval:
    endTime: 12:05:24
  value:
    int64Value: 121
- interval:
    endTime: 12:05:54
  value:
    int64Value: 103
- interval:
    endTime: 12:06:24
  value:
    int64Value: 105
- interval:
    endTime: 12:06:54
  value:
    int64Value: 106
However, unlike logs, querying will not return the same data points. In fact, it is likely not possible at all unless raw (unaligned) storage is enabled. A QueryTimeSeries request typically requires an aggregation field, with alignmentPeriod ranging from one minute to one day and perSeriesAligner set to a supported value such as ALIGN_SUMMARY or ALIGN_MEAN. For example, if we query the monitoring service with cuttle like:
cuttle monitoring query time-serie \
--project '...' \
--filter '...' \
--interval '...' \
--aggregation '{"alignmentPeriod":"60s","perSeriesAligner":"ALIGN_SUMMARY"}' \
-o json | jq .
Then, for these points, we should expect output like:
points:
- interval:
    endTime: 12:05:00
  value:
    distributionValue:
      count: 2
      mean: 110.5
      sumOfSquaredDeviation: 312.5
      range:
        min: 98
        max: 123
      bucketOptions:
        dynamicBuckets:
          compression: 100.0
          means: [98, 123]
      bucketCounts: [1, 1]
- interval:
    endTime: 12:06:00
  value:
    distributionValue:
      count: 2
      mean: 112
      sumOfSquaredDeviation: 162
      range:
        min: 103
        max: 121
      bucketOptions:
        dynamicBuckets:
          compression: 100.0
          means: [103, 121]
      bucketCounts: [1, 1]
- interval:
    endTime: 12:07:00
  value:
    distributionValue:
      count: 2
      mean: 105.5
      sumOfSquaredDeviation: 0.5
      range:
        min: 105
        max: 106
      bucketOptions:
        dynamicBuckets:
          compression: 100.0
          means: [105, 106]
      bucketCounts: [1, 1]
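The arithmetic behind these aligned values can be reproduced with a short sketch. It is not the Monitoring service's implementation: the types `Point` and `Summary`, the helpers `alignEnd`, `summarize`, and `demoSummaries`, and the calendar date are all hypothetical, and the rule that a point is assigned to the window ending at the next alignment boundary is an assumption inferred from the example timestamps. Dynamic buckets are left out.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Point is a hypothetical, simplified stand-in for a raw TimeSerie point.
type Point struct {
	EndTime time.Time
	Value   int64
}

// Summary holds a subset of the distributionValue fields (no buckets).
type Summary struct {
	EndTime               time.Time
	Count                 int64
	Mean                  float64
	SumOfSquaredDeviation float64
	Min, Max              int64
}

// alignEnd rounds t up to the next multiple of ap. A point that sits exactly
// on a boundary stays in the window ending there (assumption).
func alignEnd(t time.Time, ap time.Duration) time.Time {
	tr := t.Truncate(ap)
	if tr.Equal(t) {
		return tr
	}
	return tr.Add(ap)
}

// summarize groups points by aligned end time and computes per-window stats.
func summarize(points []Point, ap time.Duration) []Summary {
	groups := map[int64][]int64{} // aligned end (Unix seconds) -> values
	for _, p := range points {
		end := alignEnd(p.EndTime, ap).Unix()
		groups[end] = append(groups[end], p.Value)
	}
	out := make([]Summary, 0, len(groups))
	for end, vals := range groups {
		s := Summary{
			EndTime: time.Unix(end, 0).UTC(),
			Count:   int64(len(vals)),
			Min:     vals[0],
			Max:     vals[0],
		}
		var sum float64
		for _, v := range vals {
			sum += float64(v)
			if v < s.Min {
				s.Min = v
			}
			if v > s.Max {
				s.Max = v
			}
		}
		s.Mean = sum / float64(len(vals))
		for _, v := range vals {
			d := float64(v) - s.Mean
			s.SumOfSquaredDeviation += d * d
		}
		out = append(out, s)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].EndTime.Before(out[j].EndTime) })
	return out
}

// demoSummaries feeds the six raw points from the example through summarize.
// The date is arbitrary; only the time of day comes from the example.
func demoSummaries() []Summary {
	at := func(h, m, s int) time.Time { return time.Date(2024, 1, 1, h, m, s, 0, time.UTC) }
	points := []Point{
		{at(12, 4, 24), 123}, {at(12, 4, 54), 98},
		{at(12, 5, 24), 121}, {at(12, 5, 54), 103},
		{at(12, 6, 24), 105}, {at(12, 6, 54), 106},
	}
	return summarize(points, time.Minute)
}

func main() {
	for _, s := range demoSummaries() {
		fmt.Printf("%s count=%d mean=%g ssd=%g min=%d max=%d\n",
			s.EndTime.Format("15:04:05"), s.Count, s.Mean,
			s.SumOfSquaredDeviation, s.Min, s.Max)
	}
}
```

Running it prints one line per aligned minute, with count, mean, sumOfSquaredDeviation, min, and max equal to the values in the output above.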
Note that if we used ALIGN_MEAN instead, we would get doubleValue instead of distributionValue, with mean values only.
If you check the file monitoring/ts_store/v4/store_writing.go, you should note that:
Each AlignmentPeriod has its own Column Family:
ap := dp.GetAggregation().GetAlignmentPeriod().AsDuration()
cf := tss.columnFamiliesByAp[ap]
We typically don’t store raw data points (AP = 0):
if ap == 0 && !md.GetStorageConfig().GetStoreRawPoints() {
    continue
}
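The two write-path rules above can be illustrated with a small standalone sketch. It is not the code from store_writing.go: the column family names, the set of alignment periods in the map, the `alignedPoint` type, and the helpers `planWrites` and `demoWrites` are all hypothetical, chosen only to show the routing and the raw-point skip.

```go
package main

import (
	"fmt"
	"time"
)

// columnFamiliesByAp is a hypothetical mapping. The real column family names
// and the full list of supported alignment periods are not shown on this page.
var columnFamiliesByAp = map[time.Duration]string{
	0:              "raw",
	time.Minute:    "ap1m",
	24 * time.Hour: "ap1d",
}

// alignedPoint is a simplified stand-in for a data point that already
// carries the alignment period it was aggregated for.
type alignedPoint struct {
	AlignmentPeriod time.Duration
	Value           float64
}

// planWrites groups points by their target column family. Raw points
// (alignment period 0) are skipped unless storeRawPoints is true.
func planWrites(points []alignedPoint, storeRawPoints bool) map[string][]alignedPoint {
	out := map[string][]alignedPoint{}
	for _, p := range points {
		if p.AlignmentPeriod == 0 && !storeRawPoints {
			continue
		}
		cf, ok := columnFamiliesByAp[p.AlignmentPeriod]
		if !ok {
			continue // unsupported period: ignored in this sketch
		}
		out[cf] = append(out[cf], p)
	}
	return out
}

// demoWrites returns how many points would land in each column family.
func demoWrites(storeRawPoints bool) map[string]int {
	points := []alignedPoint{
		{AlignmentPeriod: 0, Value: 123},
		{AlignmentPeriod: time.Minute, Value: 110.5},
		{AlignmentPeriod: 24 * time.Hour, Value: 109.3},
	}
	counts := map[string]int{}
	for cf, ps := range planWrites(points, storeRawPoints) {
		counts[cf] = len(ps)
	}
	return counts
}

func main() {
	fmt.Println("raw storage off:", demoWrites(false))
	fmt.Println("raw storage on: ", demoWrites(true))
}
```

With raw storage off, the point with alignment period 0 never reaches any column family, which is why a query cannot return the originally submitted points.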
Now, when we query (monitoring/ts_store/v4/store_querying.go), we query a specific column family:
colFamily := tss.columnFamiliesByAp[
    q.aggregation.GetAlignmentPeriod().AsDuration(),
]
if colFamily == "" {
    return nil, status.Errorf(
        codes.Unimplemented,
        "unsupported alignment period %s",
        q.aggregation.GetAlignmentPeriod(),
    )
}
When we query, we change the per-series aligner from the one given in the query to a different type used in storage (see the function createWCQuery).
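To make the idea of swapping aligners more concrete, here is a purely illustrative sketch. The actual mapping lives in createWCQuery and is not shown on this page, so everything below is an assumption: the `Aligner` and `storedSummary` types, the helpers `storageAligner` and `project`, and in particular the guess that an ALIGN_MEAN query could be served from a stored summary by reading its mean.

```go
package main

import "fmt"

// Aligner is a hypothetical string type for per-series aligner names.
type Aligner string

const (
	AlignSummary Aligner = "ALIGN_SUMMARY"
	AlignMean    Aligner = "ALIGN_MEAN"
)

// storedSummary is a simplified stand-in for a stored distribution value.
type storedSummary struct {
	Count int64
	Mean  float64
}

// storageAligner maps the aligner requested in a query to the aligner
// assumed to be used in storage. The mapping is invented for illustration.
func storageAligner(q Aligner) (Aligner, error) {
	switch q {
	case AlignSummary, AlignMean:
		return AlignSummary, nil
	default:
		return "", fmt.Errorf("aligner %q is not covered by this sketch", q)
	}
}

// project converts a stored summary into the value shape the query asked for.
func project(q Aligner, s storedSummary) (string, error) {
	switch q {
	case AlignMean:
		return fmt.Sprintf("doubleValue: %g", s.Mean), nil
	case AlignSummary:
		return fmt.Sprintf("distributionValue: {count: %d, mean: %g}", s.Count, s.Mean), nil
	default:
		return "", fmt.Errorf("aligner %q is not covered by this sketch", q)
	}
}

func main() {
	stored := storedSummary{Count: 2, Mean: 110.5}
	for _, q := range []Aligner{AlignMean, AlignSummary} {
		sa, err := storageAligner(q)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		v, _ := project(q, stored)
		fmt.Printf("query %s -> storage %s -> %s\n", q, sa, v)
	}
}
```

Under these assumptions, one stored representation can answer several query aligners, which would explain why the aligner in the query and the aligner in storage do not have to match.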
To summarize, the data that we query and the data that the client submits are not the same, and this document describes what is going on in the Monitoring service.