Manage alerts
Let’s learn how to manage alerts on SPEKTRA Edge.
SPEKTRA Edge offers a powerful built-in alerting system that monitors most of the system managed by the platform, including devices and applications. The platform even allows you to extend the standard alerting system by defining brand-new alerts to meet your needs.
On this page, we will learn how to create two standard alerts, the device connection status alert and the CPU utilization alert, to get familiar with the SPEKTRA Edge alerting system.
What you need
You need the following to set up alerts on SPEKTRA Edge.
- access to the SPEKTRA Edge dashboard
- an active project
- a device provisioned under the project
Configure alerts
Let’s first understand how the alerting system is orchestrated in SPEKTRA Edge.
There are three main components in the SPEKTRA Edge alerting system.
The alerting policy is the top component to organize both the alerting conditions and the notification channels. It also contains the triggered alerts so that you can observe the historic alerts for the particular policy.
The alerting condition lets you express the condition under which the alert is raised, for example, the device loses its connection or the CPU utilization passes a certain threshold.
The notification channel expresses how and where the alert is sent, for example, through email, Slack, or a webhook.
Please take a look at the following diagram to understand the relationship of those three components.
Multiple alerting conditions and notification channels
Multiple conditions and channels are allowed under an alerting policy, as shown in the diagram above, though they are not required.
You can create the alerting policy with one alerting condition and one notification channel. You can create a policy even without a notification channel.
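To make the relationship concrete, here is a minimal Python sketch that models the three components as plain data classes. The class and field names are hypothetical and chosen only for illustration; they are not the SPEKTRA Edge API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NotificationChannel:
    """How and where an alert is sent (hypothetical model)."""
    name: str
    kind: str  # for example "email", "slack", or "webhook"


@dataclass
class AlertingCondition:
    """The condition under which an alert is raised (hypothetical model)."""
    name: str
    description: str


@dataclass
class AlertingPolicy:
    """Top-level component that groups conditions and channels."""
    name: str
    conditions: List[AlertingCondition] = field(default_factory=list)
    channels: List[NotificationChannel] = field(default_factory=list)


# A policy may hold several conditions and channels...
ops_channel = NotificationChannel("ops-email", "email")
policy = AlertingPolicy(
    name="Device connection status",
    conditions=[AlertingCondition("device-offline", "offline for five minutes")],
    channels=[ops_channel],
)

# ...and a policy without any notification channel is also valid.
silent_policy = AlertingPolicy(name="No channel yet")
```

The empty default lists mirror the point above: a policy can start with no conditions or channels and have them attached later.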
Let’s look at each of them in more detail with an actual example.
Alerting policies
An alerting policy is the top-level component that organizes the alerting conditions, the notification channels, and the actual triggered alerts.
Select the Alerting policies option of the Alerts pull-down menu to go to the Alerting overview page.
Selecting the Alerting policies option from the Alerts pull-down menu.
Once you’re on the Alerting overview page, click the Create alerting policy button located at the top right corner.
Click the Create alerting policy button on the Alerting overview page.
Fill in the alerting policy name and click Create. We’ll fill in the notification channels at a later stage.
Create the alerting policy by clicking the Create button on the Create alerting policy page.
That’s it. Let’s move on to the alerting condition next.
Alerting conditions
Let’s create the alerting condition to detect the device connection status.
An alerting condition, as the name suggests, defines the condition that triggers alerts. It is organized into three sections, and we’ll go over them one by one below.
But first, let’s open the Create alerting condition page by clicking the policy condition plus sign on the Alerting overview page.
Create the alerting condition by clicking the plus sign on the Alerting overview page.
Alert metrics
The alert metric section is the first thing to set on the Create alerting condition page. It lists all the alerting options supported by the system. You can browse them to understand what the SPEKTRA Edge alerting system covers.
Let’s select the Device connected alert metric type for the device connection status alerting condition.
Select the Device connected alert metric option as the Alert metric value.
You don’t need to touch the resource filter section, which is set automatically by the system, unless you need additional filtering.
Threshold conditions
The threshold conditions section is where you configure the alerting condition.
Here is the threshold condition for the device connection status alerting condition, which fires alerts when the device is offline for more than five minutes.
The threshold condition for a device that is offline for more than five minutes.
Since the Online status is treated as one and Offline as zero, we use the Less than operator against the Online status to detect the device offline event. Setting the duration time to five minutes tells the system to trigger alerts when the device is offline for more than five minutes.
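The following Python sketch illustrates the idea of a "Less than 1 for five minutes" condition on a stream of connection samples. It is only an illustration under stated assumptions: the function name is hypothetical, the samples are assumed to arrive once per minute, and the actual evaluation inside SPEKTRA Edge may differ in detail.

```python
def offline_long_enough(samples, threshold=1, duration_s=300):
    """Return True once value < threshold has held for at least duration_s.

    samples: list of (timestamp_seconds, value) pairs sorted by time,
             where 1 means Online and 0 means Offline.
    """
    breach_start = None  # time at which the current run of breaches began
    for ts, value in samples:
        if value < threshold:
            if breach_start is None:
                breach_start = ts
            if ts - breach_start >= duration_s:
                return True
        else:
            breach_start = None  # the device came back; reset the run
    return False


# One sample per minute: online at t=0, then offline from t=60 onwards.
outage = [(0, 1)] + [(60 * i, 0) for i in range(1, 8)]
# A short blip: offline for two minutes only, then back online.
blip = [(0, 1), (60, 0), (120, 0), (180, 1), (240, 1)]

print(offline_long_enough(outage))  # True
print(offline_long_enough(blip))    # False
```

The duration acts as a debounce: a short disconnection resets the run and never reaches the five-minute mark.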
Time series configurations
The time series configuration section expresses how to aggregate data points of the targeted time series data. There are two time series aggregation functions here.
- the alignment period with the per series aligner
- the time-series grouping with the cross series reducer
Let’s take a look at the actual example to understand those two functionalities.
The time series configuration to aggregate time series data points.
Here is the detailed description.
- an alignment period of one minute with the Max per series aligner
- grouping by resource.labels.device_id with the Min cross series reducer
The first aggregation is for noise reduction. It treats the device as offline only when it is offline for the entire minute.
The second aggregation treats each device under the project separately, which is the grouping part. Since there is only one connection state per device, the reducer has no effect here. We’ll look at another example later and explain the use of the reducer there.
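The two aggregation steps described above can be sketched in Python as follows. The function names and the sample data are hypothetical; the sketch only mimics the Max aligner and the Min reducer grouped by device ID, and is not how the monitoring service is implemented.

```python
from collections import defaultdict


def align_max(points, period_s=60):
    """Per-series aligner: keep the maximum value in each alignment period.

    points: list of (timestamp_seconds, value) pairs for one time series.
    Returns {period_start: max_value}.
    """
    aligned = {}
    for ts, value in points:
        bucket = ts - ts % period_s
        aligned[bucket] = max(value, aligned.get(bucket, value))
    return aligned


def reduce_min_by_device(series, period_s=60):
    """Cross-series reducer: the minimum of aligned points per device.

    series: list of (device_id, points) pairs; several series may share
            a device_id.
    Returns {device_id: {period_start: min_of_aligned_values}}.
    """
    grouped = defaultdict(dict)
    for device_id, points in series:
        for bucket, value in align_max(points, period_s).items():
            current = grouped[device_id].get(bucket, value)
            grouped[device_id][bucket] = min(current, value)
    return dict(grouped)


series = [
    # dev-a reports Offline (0) once in the first minute, then recovers,
    # so the Max aligner still treats that minute as Online (1).
    ("dev-a", [(0, 1), (20, 0), (40, 1), (60, 1), (80, 1)]),
    # dev-b is Offline for the whole second minute.
    ("dev-b", [(0, 1), (30, 1), (60, 0), (90, 0)]),
]
result = reduce_min_by_device(series)
print(result["dev-a"])  # {0: 1, 60: 1}
print(result["dev-b"])  # {0: 1, 60: 0}
```

Only dev-b produces an aligned value below 1, and each device keeps its own series because of the grouping by device ID.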
Click Save now that you have completed the device connection status alerting condition.
Click Save to finish the device connection status alerting condition.
Here is a brief description of the typical aligners and reducers for your reference.
Aligner name | The aligned data point |
---|---|
None | No alignment; all the time-series data points are kept. |
Mean | The average or arithmetic mean of the data points in the alignment period. |
Min | The minimum value of the data points in the alignment period. |
Max | The maximum value of the data points in the alignment period. |
Count | The count of the data points in the alignment period. |
Sum | The sum of the data points in the alignment period. |
Stddev | The standard deviation of the data points in the alignment period. |
Percentile 99 | The 99th percentile of the data points in the alignment period. |
Percentile 95 | The 95th percentile of the data points in the alignment period. |
Percentile 50 | The 50th percentile of the data points in the alignment period. |
Percentile 5 | The 5th percentile of the data points in the alignment period. |
Reducer name | The reduced data point |
---|---|
None | No cross time-series reduction. |
Mean | The mean across the aligned data points of the multiple time series. |
Min | The minimum of the aligned data points of the multiple time series. |
Max | The maximum of the aligned data points of the multiple time series. |
Sum | The sum of the aligned data points of the multiple time series. |
Stddev | The standard deviation of the aligned data points of the multiple time series. |
Count | The count of the aligned data points of the multiple time series. |
Percentile 99 | The 99th percentile of the aligned data points of the multiple time series. |
Percentile 95 | The 95th percentile of the aligned data points of the multiple time series. |
Percentile 50 | The 50th percentile of the aligned data points of the multiple time series. |
Percentile 5 | The 5th percentile of the aligned data points of the multiple time series. |
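As a rough illustration of the tables above, the Python snippet below applies a few of the aggregation functions to the data points of a single alignment period. The data is made up, the percentile functions are left out, and the use of the population standard deviation is an assumption; the service may define Stddev differently. The reducers compute the same kinds of values, but across the aligned points of multiple time series instead of within one series.

```python
import statistics

# Values observed within a single alignment period (hypothetical data).
window = [10.0, 20.0, 30.0, 40.0]

aligned = {
    "Mean": statistics.mean(window),
    "Min": min(window),
    "Max": max(window),
    "Count": len(window),
    "Sum": sum(window),
    # Assumption: population standard deviation.
    "Stddev": statistics.pstdev(window),
}

for name, value in aligned.items():
    print(f"{name}: {value}")
```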
Monitoring API document
Please take a look at the monitoring service API document for the technical details of the aligners and the reducers.
- Aggregation.Aligner enumeration type
- Aggregation.Reducer enumeration type
Great. Let’s move on to the notification channels next.
Notification channels
A notification channel defines how and where alert notifications are delivered.
There are three types of notification channels supported by SPEKTRA Edge.
- Email
- Slack
- Webhook
Each channel is created separately and must be tied to an alerting policy to be operational. A single channel can be shared by multiple alerting policies.
We will go over how to create all three channels below and link those to the alerting policy we’ve created in the previous step.
Select the Notification channels option of the Alerts pull-down menu to open the Alerting overview page.
Select the Notification channels option from the Alerts pull-down menu.
Click the Create notification channel button to create a notification channel.
Click the Create notification channel button on the Alerting overview page.
To create the email notification channel, you need to
- select Email in the type field
- fill in the email address(es) in the Emails field
and click the Create button.
Fill in the email address(es) and click the Create button.
Click the Send button to send the test email notification to verify the configuration.
Clicking the Send button to send the test email notification.
Enable the notification channel by clicking the Enabled switch once you verify receiving the test email notification from SPEKTRA Edge.
Enabling the email notification channel.
You need a webhook endpoint to configure the Slack notification channel on SPEKTRA Edge.
Go to the official Slack API page and create your Slack app, if you don’t have one yet, by clicking the Create your Slack app button.
Clicking the Create your Slack app button on the Slack API page.
Once you have your Slack app, go to the Incoming webhooks section and activate the incoming webhooks by toggling the Incoming Webhooks switch On.
Activating the incoming webhooks by turning the switch On.
Create a new webhook endpoint by adding the new webhook to your Slack workspace.
Getting the new webhook by clicking the Add New Webhook to Workspace button.
Copy the webhook URL by clicking the Copy button of the newly created webhook URL.
Copying the URL of the newly created webhook.
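If you would like to confirm that the webhook works on its own before handing it to SPEKTRA Edge, a short Python sketch such as the one below can post a test message. Slack incoming webhooks accept a JSON body with a text field; the helper name and the URL shown here are placeholders, and the actual send is left commented out so that nothing is posted by accident.

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Build a POST request carrying a plain-text Slack message."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL: replace it with the webhook URL you copied from Slack.
request = build_slack_request(
    "https://hooks.slack.com/services/REPLACE/WITH/YOURS",
    "Test message before wiring up SPEKTRA Edge",
)

# Uncomment to actually send the message:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```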
Now, go back to the SPEKTRA Edge dashboard and, after setting the notification type to Slack, paste the webhook URL you just copied on the Create notification channel page.
Paste the webhook URL you copied above and click Create on the Create notification channel page.
Once the Slack notification channel is created, let’s verify it by sending a test Slack notification.
Go to the Notification channel overview page and click the Send button to send the test Slack notification.
Clicking the Send button to send the test Slack notification.
Enable the notification channel by clicking the Enabled switch once you verify receiving the test Slack notification from SPEKTRA Edge.
Enabling the Slack notification channel.
To create the webhook notification channel, you need to
- select Webhook in the type field
- provide the webhook endpoint in the Webhook field
- add the Content-Type: application/json header in the Add headers field
and click the Create button.
Fill in the webhook endpoint and click the Create button.
Click the Send button to send the test webhook notification to verify the configuration.
Clicking the Send button to send the test webhook notification.
Enable the notification channel by clicking the Enabled switch once you verify receiving the test webhook notification from SPEKTRA Edge.
Enabling the webhook notification channel.
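If you do not have a webhook endpoint yet, a small receiver like the following Python sketch can be used to try the channel out. It assumes the notification arrives as an HTTP POST with a JSON object in the body, which matches the Content-Type header configured above but is otherwise an assumption. The handler name, the port, and the logging are hypothetical choices; a production endpoint would also need TLS and authentication.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_body(raw):
    """Validate a webhook body; return (HTTP status, parsed payload or None)."""
    try:
        payload = json.loads(raw)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, None
    if not isinstance(payload, dict):
        return 400, None
    return 200, payload


class AlertWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the client.
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_body(self.rfile.read(length))
        if payload is not None:
            print("received payload with keys:", sorted(payload))
        self.send_response(status)
        self.end_headers()


def serve(port=8080):
    """Run the receiver until interrupted (port 8080 is an arbitrary choice)."""
    HTTPServer(("", port), AlertWebhookHandler).serve_forever()


# Call serve() to start listening; it is not started automatically here.
```

Keeping the validation in handle_body, separate from the HTTP handler, makes it easy to check the logic without starting a server.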
Here is the sample alert JSON data sent over to the webhook endpoint for the device connection status alerting condition.
{
  "project": {
    "name": "projects/your-project",
    "title": "Your Project"
  },
  "events": [
    {
      "alertingCondition": {
        "name": "projects/your-project/regions/us-west2/alertingPolicies/device-connection-status-ny5lzm/alertingConditions/device-connection-condition-qv3w6g",
        "displayName": "Device connection condition",
        "spec": {
          "timeSeries": {
            "query": {
              "filter": "(resource.type = devices.edgelq.com/device AND metric.type = devices.edgelq.com/device/connected)",
              "selector": {
                "metric": {
                  "types": [
                    "devices.edgelq.com/device/connected"
                  ]
                },
                "resource": {
                  "types": [
                    "devices.edgelq.com/device"
                  ]
                }
              },
              "aggregation": {
                "alignmentPeriod": "60s",
                "perSeriesAligner": "ALIGN_MAX",
                "crossSeriesReducer": "REDUCE_MIN",
                "groupByFields": [
                  "resource.labels.device_id"
                ]
              }
            },
            "threshold": {
              "compare": "LT",
              "value": 1
            },
            "duration": "300s"
          },
          "trigger": {}
        }
      },
      "metricDescriptor": {
        "name": "projects/your-project/metricDescriptors/devices.edgelq.com/device/connected",
        "type": "devices.edgelq.com/device/connected",
        "metricKind": "GAUGE",
        "valueType": "INT64",
        "unit": "1",
        "displayName": "Device connected"
      },
      "alerts": [
        {
          "name": "projects/your-project/regions/us-west2/alertingPolicies/device-connection-status-ny5lzm/alertingConditions/device-connection-condition-qv3w6g/alerts/2024-11-20T01:33:00Z-5ftg31",
          "displayName": "Device connection condition devices.edgelq.com/device {device_id:pp-quick-202410-bjr55hvm22jfhu}",
          "info": {
            "timeSerie": {
              "key": "BQHPAQoCGrEEHaQheAECGXc=",
              "metric": {
                "type": "devices.edgelq.com/device/connected"
              },
              "monitoredResource": {
                "type": "devices.edgelq.com/device",
                "labels": {
                  "device_id": "pp-quick-202410-bjr55hvm22jfhu"
                },
                "reducedLabels": [
                  "project_id",
                  "region_id"
                ]
              }
            },
            "observedValues": {}
          },
          "state": {
            "isFiring": true,
            "lifetime": {
              "startTime": "2024-11-20T01:33:00Z"
            },
            "needsNotification": true,
            "notificationCreated": true
          }
        }
      ]
    }
  ]
}
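A webhook consumer usually only needs a few fields from this payload. The Python sketch below pulls the device ID and the start time out of every firing alert. The function name is hypothetical, and the field paths are taken from the sample payload above, so the sketch assumes that other alerts follow the same layout.

```python
import json


def firing_devices(payload):
    """Return (device_id, start_time) for every firing alert in the payload."""
    found = []
    for event in payload.get("events", []):
        for alert in event.get("alerts", []):
            state = alert.get("state", {})
            if not state.get("isFiring"):
                continue
            resource = alert["info"]["timeSerie"]["monitoredResource"]
            found.append(
                (resource["labels"]["device_id"], state["lifetime"]["startTime"])
            )
    return found


# A trimmed copy of the sample payload, keeping only the fields read above.
sample = json.loads("""
{
  "events": [
    {
      "alerts": [
        {
          "info": {
            "timeSerie": {
              "monitoredResource": {
                "type": "devices.edgelq.com/device",
                "labels": {"device_id": "pp-quick-202410-bjr55hvm22jfhu"}
              }
            }
          },
          "state": {
            "isFiring": true,
            "lifetime": {"startTime": "2024-11-20T01:33:00Z"}
          }
        }
      ]
    }
  ]
}
""")

print(firing_devices(sample))
# [('pp-quick-202410-bjr55hvm22jfhu', '2024-11-20T01:33:00Z')]
```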
Enable alerting policies
With all three components configured, we’re ready to enable the alerting policy to monitor the device connection status for all the devices under the project.
Go to the Alerting overview page by selecting the Alerting policies option of the Alerts pull-down menu.
Selecting the Alerting policies option from the Alerts pull-down menu.
Enable the alerting policy you created, named Device connection status, by sliding its Enabled switch on.
Enabling the alerting policy by sliding the Enabled switch.
Also, let’s link the notification channels to the alerting policy so that we get notified whenever the alerting status changes. Select the Edit details option of the alerting policy menu and provide the notification channels in the Notification channels field.
Great!
You’ve configured the alerting policy to detect the device offline status on SPEKTRA Edge.
Let’s simulate an offline connection and observe what kind of information you can get from the SPEKTRA Edge alerting system.
Monitor alerts
Let’s pull the cable from one of your devices and see what the alert looks like.
After waiting for five minutes, you should be able to see the alert raised on the sidebar of the dashboard page.
The circled alert count right next to the Alerts section of the sidebar.
Why five minutes?
The SPEKTRA Edge alerting system triggers alerts when the condition is true for the specified duration of time. Since we configured the alerting condition with a five-minute duration time, you need to wait for roughly five minutes after pulling the cable.
Set a shorter duration time and you will see the alert fire much sooner.
Go to the Alerts page by selecting the Alerts option of the Alerts pull-down menu. You will see the Firing alert of the Device connection status alerting policy with the ongoing alert duration and the device information.
The Firing alert on the Alerting overview page.
Click the start time of the firing alert to get its detailed information. You can observe much more about the firing alert, including a link to the Device overview page of the disconnected device.
Device information of the firing alert.
Example: CPU utilization alerting condition
Before wrapping up, let’s take a look at another example to understand how to configure an alerting condition on SPEKTRA Edge.
Here is the CPU utilization alerting condition, which triggers an alert whenever the average CPU utilization is more than 50% for half an hour.
The CPU utilization alerting condition example for your reference.
Here are some of the highlights:
- Threshold condition
  - Greater than is used as the comparison operator
  - 50% as the threshold value
  - 30 minutes as the duration time
- Time series configuration
  - five-minute alignment period with the Mean per-series aligner
  - grouping by device_id with the Mean cross-series reducer
Here is the summary of the time series configuration parameters.
- using the Mean per-series aligner to get the average CPU utilization over each five-minute period
- using the Mean cross-series reducer to average the multiple CPU time series into a single CPU utilization data point for the device
With those two aggregations, the system compares the aggregated data point against the threshold condition, more than 50%, and raises an alert when the condition stays true for more than the duration time, 30 minutes.
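The whole CPU utilization condition can be sketched end to end in Python as follows. This is a simplified illustration, not the service implementation: the function names are hypothetical, the "multiple CPU time series" are assumed to be one series per core, and "true for the duration time" is interpreted as six consecutive five-minute points above the threshold.

```python
from collections import defaultdict
from statistics import mean


def align_mean(points, period_s=300):
    """Per-series aligner: average the values in each alignment period."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % period_s].append(value)
    return {start: mean(values) for start, values in buckets.items()}


def device_cpu(cpu_series, period_s=300):
    """Cross-series reducer: average the aligned points of all CPU series
    belonging to one device. Returns {period_start: utilization}."""
    per_period = defaultdict(list)
    for points in cpu_series:
        for start, value in align_mean(points, period_s).items():
            per_period[start].append(value)
    return {start: mean(values) for start, values in sorted(per_period.items())}


def should_alert(utilization, threshold=50.0, duration_s=1800, period_s=300):
    """True if utilization stays above threshold for duration_s in a row."""
    needed = duration_s // period_s  # consecutive aligned points required
    streak = 0
    for _, value in sorted(utilization.items()):
        streak = streak + 1 if value > threshold else 0
        if streak >= needed:
            return True
    return False


# Two CPU cores on one device, sampled once a minute for 30 minutes.
core0 = [(60 * i, 80.0) for i in range(30)]
core1 = [(60 * i, 40.0) for i in range(30)]
utilization = device_cpu([core0, core1])

print(utilization[0])             # 60.0
print(should_alert(utilization))  # True
```

Unlike the device connection example, the reducer matters here: one busy core alone would exceed the threshold, but the alert is based on the average across all the CPU series of the device.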
Next step
Congratulations on creating and monitoring alerts on SPEKTRA Edge. It was a rather long explanation, but we hope you now understand the SPEKTRA Edge alerting system and are ready to create your own alerting policies and conditions.
Let’s learn account management next on the path to SPEKTRA Edge mastery.
Onwards.