Skip to main content

How to Create PagerDuty Alerts

Fix Inventory constantly monitors your infrastructure, and can alert you to any detected issues. PagerDuty is the de-facto standard to escalate alerts. In this guide, we will configure Fix Inventory to send alerts to PagerDuty with a custom command.

Prerequisites​

This guide assumes that you have already installed and configured Fix Inventory to collect your cloud resources.

You will also need a valid routing key for your PagerDuty account.

Directions​

  1. Open the relevant service in PagerDuty and click Integrations. Then, click the Add new integration button.

  2. Expand Events API V2 and copy the revealed integration key:

    PagerDuty Integration

    note

    We will refer to this key as the "routing key" for the remainder of these instructions.

  3. Open the fix.core.commands configuration:

    > config edit fix.core.commands
  4. Add the routing key copied in step 2 as the default value of the routing_key parameter in the pagerduty section. This will allow you to execute the pagerduty command without specifying the routing key parameter each time.

    info

    The pagerduty command has the following parameters, all of which are required:

    ParameterDescriptionDefault Value
    summaryAlert summary
    routing_keyEvents API V2 integration key
    dedup_keyString identifier that PagerDuty will use to ensure that only a single alert is active at a time
    sourceAlert sourceFix Inventory
    severityAlert severity (critical, error, warning, or info)warning
    sourceLocation of the affected system (preferably a hostname or FQDN)Fix Inventory
    event_actionAlert action (trigger, acknowledge, resolve or assign)trigger
    clientName of the monitoring client submitting the eventFix Inventory
    client_urlURL to the monitoring clienthttps://fixinventory.org/
    webhook_urlPagerDuty events API URL endpointhttps://events.pagerduty.com/v2/enqueue
  5. Define the search criteria that will trigger an alert. For example, let's say we want to send alerts whenever we find a Kubernetes Pod updated in the last hour with a restart count greater than 20:

    > search is(kubernetes_pod) and pod_status.container_statuses[*].restart_count > 20 and last_update<1h
    ​kind=kubernetes_pod, name=db-operator-mcd4g, restart_count=[42], age=2mo5d, last_update=23m, cloud=k8s, account=prod, region=kube-system
  6. Now that we've defined the alert trigger, we will simply pipe the result of the search query to the pagerduty command, replacing the name with your desired alert name:

    > search is(kubernetes_pod) and pod_status.container_statuses[*].restart_count > 20 and last_update<1h | pagerduty summary="Pods are restarting too often!" dedup_key="Fix Inventory::PodRestartedTooOften"

    If the defined condition is currently true, you should see a new alert in PagerDuty.

  7. Finally, we want to automate checking of the defined alert trigger and send alerts to PagerDuty whenever the result is true. We can accomplish this by creating a job:

    > jobs add --id alert_on_pod_failure--wait-for-event post_collect 'search is(kubernetes_pod) and pod_status.container_statuses[*].restart_count > 20 and last_update<1h | pagerduty summary="Pods are restarting too often!" dedup_key="Fix Inventory::PodRestartedTooOften"

Further Reading​