Nagios XI
Monitor Nagios XI Attached Infrastructure
This source has been deprecated
observIQ is in the process of transitioning a subset of BindPlane's monitoring capabilities to the observIQ OpenTelemetry Collector. As a result, this Source is no longer publicly available in BindPlane. If you need access to this Source, please reach out to our support via chat or via [email protected].
Not Included in the BindPlane with Google Stackdriver offering
All of the Google Cloud Platform sources listed within this documentation are not included with the BindPlane with Google Stackdriver offering.
Data Collection Setup
Metrics are collected via the REST API from Nagios XI monitoring systems.
Network Requirements
Port: 443 (TCP) HTTPS to Nagios XI REST API
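The network requirement above can be verified before configuring the source. The following is a minimal sketch, not part of BindPlane, that only checks whether a TCP connection to the port can be opened; the function name `can_reach` and the default timeout are assumptions made for illustration.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example usage (the hostname is a placeholder):
#   can_reach("nagios.example.com")
```

A successful result only shows that the port is open. It does not confirm that the Nagios XI REST API is answering on it or that the TLS certificate is valid.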
Least Privileged User
BindPlane requires a least-privileged user (LPU) account with read-only access.
Finding the API Key
API keys in Nagios XI are bound to a user account. Generate and view the key in that user's account information.
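To illustrate how a user-bound API key is typically supplied, here is a hedged sketch that builds a request URL. The `/nagiosxi/api/v1/` path prefix, the `objects/hoststatus` endpoint, and the `apikey` query parameter reflect the usual Nagios XI API layout but are not stated on this page, so treat them as assumptions and confirm them against your installation. The helper name `build_api_url` is hypothetical.

```python
from urllib.parse import urlencode, urlunsplit


def build_api_url(host: str, api_key: str,
                  endpoint: str = "objects/hoststatus",
                  port: int = 443) -> str:
    """Build an HTTPS URL for a Nagios XI REST API endpoint (assumed layout)."""
    # Omit the port from the URL when it is the HTTPS default.
    netloc = host if port == 443 else f"{host}:{port}"
    # Assumed API prefix; verify against your Nagios XI version.
    path = f"/nagiosxi/api/v1/{endpoint.strip('/')}"
    # urlencode escapes any special characters in the key.
    query = urlencode({"apikey": api_key})
    return urlunsplit(("https", netloc, path, query, ""))


# Example usage (placeholder host and key):
#   build_api_url("nagios.example.com", "abc123")
```

Because the key travels in the query string, it can end up in proxy and web server logs; using the key of the read-only account limits the impact if it leaks.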
Admin Required for Certain Metrics
A Nagios Admin account is required in order to collect the following Nagios System Status and User Configuration metrics:
- Active Host Checks Enabled
- Active Service Checks Enabled
- Event Handlers Enabled
- Is Currently Running
- Notifications Enabled
- Passive Host Checks Enabled
- Passive Service Checks Enabled
Supported Versions
Nagios XI: 5.3.x+
Connection Parameters
Name | Required? | Description |
---|---|---|
Host | Required | |
Port | | |
Username | Required | |
API Key | Required | API key for the Nagios system. |
SSL Configuration | | |
Collect Events | | Whether or not to collect events. |
Minimum Event Severity | | Minimum severity of events to collect. |
Services Whitelist | | Comma-separated list of services to collect. If empty, all services are collected. |
Timeout In Seconds | | Timeout (in seconds) for requests to the API. |
Max Threads | | Maximum number of simultaneous requests. |
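The parameters above can be pictured as a small configuration object. The sketch below is illustrative only: the class name `NagiosXIConnection`, the field names, and every default value are assumptions (this page does not list defaults), and SSL Configuration and Minimum Event Severity are left out because their accepted values are not documented here. It mainly shows how a comma-separated Services Whitelist could be interpreted, with an empty value meaning all services.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NagiosXIConnection:
    host: str
    username: str
    api_key: str
    port: int = 443                # assumed default (HTTPS)
    collect_events: bool = False   # assumed default
    services_whitelist: str = ""   # comma-separated; empty means all services
    timeout_seconds: int = 30      # assumed default
    max_threads: int = 4           # assumed default

    def whitelist(self) -> Optional[List[str]]:
        """Return the whitelisted service names, or None when all are allowed."""
        names = [s.strip() for s in self.services_whitelist.split(",") if s.strip()]
        return names or None

    def wants_service(self, name: str) -> bool:
        """Return True if the named service should be collected."""
        allowed = self.whitelist()
        return allowed is None or name in allowed


# Example usage (placeholder values):
#   conn = NagiosXIConnection("nagios.example.com", "bindplane", "abc123",
#                             services_whitelist="PING, HTTP")
#   conn.wants_service("PING")
```

Matching here is exact and case-sensitive, which is also an assumption; the source may compare service names differently.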
Metrics
Host
Name | Description |
---|---|
Active Checks Enabled | TODO |
Address | The address of the Host resource. |
Alias | TODO |
Check Type | TODO |
Idle CPU (%) | The aggregated value of 'Idle' metrics from child Service resources. |
IO Wait (%) | The aggregated value of 'IO Wait' metrics from child Service resources. |
System CPU Use (%) | The aggregated value of 'System' metrics from child Service resources. |
Used CPU (%) | The aggregated value of User, System, and IO Wait metrics from child Service resources. |
User CPU Use (%) | The aggregated value of 'User' metrics from child Service resources. |
Current Attempt | TODO |
Current State | TODO |
Disk Free (Megabytes) | The aggregated value of 'Free' metrics from child Service resources. |
Disk Size (Megabytes) | The aggregated value of 'Total' metrics from child Service resources. |
Disk Used (Megabytes) | The aggregated value of 'Used' metrics from child Service resources. |
Duration | TODO |
Event Handler Enabled | TODO |
Execution Time (Seconds) | TODO |
Flap Detection Enabled | TODO |
Host Name | The name of the Host resource. |
Last Check | TODO |
Last Notification | TODO |
Last State Change | TODO |
Latency (Seconds) | TODO |
Buffers and Cached Memory (Megabytes) | The aggregated value of 'Buffers and Cached' metrics from child Service resources. |
Free Memory (Megabytes) | The aggregated value of 'Free' metrics from child Service resources. |
Shared Memory (Megabytes) | The aggregated value of 'Shared' metrics from child Service resources. |
Total Memory (Megabytes) | The aggregated value of 'Total' metrics from child Service resources. |
Used Memory (Megabytes) | The aggregated value of 'Used' metrics from child Service resources. |
Next Check | TODO |
Notifications Enabled | TODO |
Obsession Enabled | TODO |
Parent System | The parent system of the host resource. |
Passive Checks Enabled | TODO |
Problem Acknowledged | Whether or not the problem has been acknowledged. |
Performance Data Processing Enabled | TODO |
Service Problem Count | The total number of Service descendants whose status is not 'OK'. |
Services Count | The number of Service descendants of this resource. |
Status Information | TODO |
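Several Host metrics in the table above are derived from child Service resources. As a rough sketch of two of them, the code below computes Used CPU (%) and Service Problem Count. Combining User, System, and IO Wait by summation is an assumption (the table only says "aggregated"), and the function names and the `status` key are hypothetical.

```python
from typing import Iterable, Mapping


def used_cpu_percent(user: float, system: float, io_wait: float) -> float:
    """Used CPU (%) taken as User + System + IO Wait (assumed combination)."""
    return user + system + io_wait


def service_problem_count(services: Iterable[Mapping[str, str]]) -> int:
    """Count child services whose status is anything other than 'OK'."""
    return sum(1 for svc in services if svc.get("status") != "OK")


# Hypothetical child services of a single host.
sample_services = [
    {"name": "PING", "status": "OK"},
    {"name": "HTTP", "status": "CRITICAL"},
    {"name": "Disk /", "status": "WARNING"},
]

print(used_cpu_percent(12.5, 3.0, 0.5))        # 16.0
print(service_problem_count(sample_services))  # 2
```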
Host Group
Name | Description |
---|---|
Alias | The alias of the Host Group resource. |
Average CPU Use (from Host descendants) (%) | The average CPU used across all Host descendants. |
Maximum CPU Use (from Host descendants) (%) | The maximum CPU used across all Host descendants. |
Minimum CPU Use (from Host descendants) (%) | The minimum CPU used across all Host descendants. |
Average Disk Use (from Host descendants) (Megabytes) | The average disk used across all Host descendants. |
Maximum Disk Use (from Host descendants) (Megabytes) | The maximum disk used across all Host descendants. |
Minimum Disk Use (from Host descendants) (Megabytes) | The minimum disk used across all Host descendants. |
Total Disk Space (from Host descendants) (Megabytes) | The total disk space across all Host descendants. |
Used Total Disk Use (from Host descendants) (Megabytes) | The total disk used across all Host descendants. |
Host Problem Count | The total number of Host descendants whose status is not 'OK'. |
Host Unhandled Problem Count | The total number of Host descendants whose status is not 'OK' and have an unacknowledged problem. |
Host Group Name | The name of the Host Group resource. |
Host Count | The total number of Host descendants. |
Host Down Count | The total number of Host descendants where status is down. |
Host Pending Count | The total number of Host descendants where status is pending. |
Host Unreachable Count | The total number of Host descendants where status is unreachable. |
Host Up Count | The total number of Host descendants where status is up. |
Average Memory Use (from Host descendants) (Megabytes) | The average memory used across all Host descendants. |
Maximum Memory Use (from Host descendants) (Megabytes) | The maximum memory used across all Host descendants. |
Minimum Memory Use (from Host descendants) (Megabytes) | The minimum memory used across all Host descendants. |
Total Memory Capacity (from Host descendants) (Megabytes) | The total memory across all Host descendants. |
Used Total Memory Use (from Host descendants) (Megabytes) | The total memory used across all Host descendants. |
Parent System | The parent system of the host group resource. |
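The average, maximum, minimum, and total metrics in the Host Group table are rollups over the group's Host descendants. The sketch below shows one way such a rollup could be computed from per-host values; the function name `rollup` and the returned key names are hypothetical, and how BindPlane handles hosts with missing values is not documented here.

```python
from statistics import mean
from typing import Dict, Optional, Sequence


def rollup(values: Sequence[float]) -> Optional[Dict[str, float]]:
    """Summarize one per-host metric across all hosts in a group."""
    if not values:
        # No hosts reported the metric, so there is nothing to summarize.
        return None
    return {
        "average": mean(values),
        "maximum": max(values),
        "minimum": min(values),
        "total": sum(values),
    }


# Hypothetical used-memory values (Megabytes) for three hosts in a group.
print(rollup([100.0, 300.0, 200.0]))
```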
Service
Name | Description |
---|---|
Check Latency (Milliseconds) | The latency of this Service on the associated Host resource. |
State | The current state of the Service resource. |
Execution Time (Seconds) | The execution time of the latest status check for this Service on the associated Host resource. |
Host Name | The name of the Host associated with the Service resource. |
Last Check | The time of the last check of the Service resource. |
Parent System | The parent system of the service resource. |
Performance Data | The performance data returned from the latest status check for this Service on the associated Host resource. |
Problem Acknowledged | If there is a problem, whether or not it is acknowledged. |
Service Description | The description of the Service resource. |
Status Text | The status text of the latest status check for this Service resource on the associated Host resource. |
Service Group
Name | Description |
---|---|
Alias | The alias of the Service Group. |
Host Problem Count | The total number of host problems in the Service Group. |
Host Count | The total number of hosts in the Service Group. |
Host Down Count | The total number of hosts in the Service Group where status is down. |
Host Pending Count | The total number of hosts in the Service Group where status is pending. |
Host Unhandled Problems Count | The total number of host problems in the Service Group that are unhandled. |
Host Unreachable Count | The total number of hosts in the Service Group where status is unreachable. |
Host Up Count | The total number of hosts in the Service Group where status is up. |
Parent System | The parent system of the service group resource. |
Service Group Name | The group name of the Service Group. |
Service Count | The total number of services in the Service Group. |
Service Critical Count | The total number of services in the Service Group where status is critical. |
Service OK Count | The total number of services in the Service Group where status is okay. |
Service Pending Count | The total number of services in the Service Group where status is pending. |
Service Problem Count | The total number of service problems in the Service Group. |
Service Unhandled Problem Count | The total number of service problems in the Service Group that are unhandled. |
Service Unknown Count | The total number of services in the Service Group where status is unknown. |
Service Warning Count | The total number of services in the Service Group where status is warning. |
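The service counts in the table above can be thought of as a tally over service states. The sketch below uses the standard Nagios service state codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Treating a service that has not yet been checked as pending, counting only warning, critical, and unknown as problems, and treating an unacknowledged problem as unhandled are assumptions, as are the `state`, `has_been_checked`, and `acknowledged` key names.

```python
from typing import Dict, Iterable, Mapping

# Standard Nagios service state codes.
SERVICE_STATES = {0: "ok", 1: "warning", 2: "critical", 3: "unknown"}
PROBLEM_STATES = ("warning", "critical", "unknown")


def tally_services(services: Iterable[Mapping]) -> Dict[str, int]:
    """Count services by state, plus problem and unhandled-problem totals."""
    counts = {"total": 0, "ok": 0, "warning": 0, "critical": 0,
              "unknown": 0, "pending": 0, "problem": 0, "unhandled_problem": 0}
    for svc in services:
        counts["total"] += 1
        if not svc.get("has_been_checked", True):
            state = "pending"  # assumed meaning of "pending"
        else:
            # Unrecognized codes are counted as unknown.
            state = SERVICE_STATES.get(svc["state"], "unknown")
        counts[state] += 1
        if state in PROBLEM_STATES:
            counts["problem"] += 1
            if not svc.get("acknowledged", False):
                counts["unhandled_problem"] += 1
    return counts


# Hypothetical services in one Service Group.
sample = [
    {"state": 0},
    {"state": 2, "acknowledged": True},
    {"state": 1},
    {"state": 0, "has_been_checked": False},
]
print(tally_services(sample))
```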
System
Name | Description |
---|---|
Active Host Checks Enabled | Whether or not active host checks are enabled. |
Active Service Checks Enabled | Whether or not active service checks are enabled. |
Average CPU Use (from Host children) (%) | The average CPU used across all Host children. |
Maximum CPU Use (from Host children) (%) | The maximum CPU used across all Host children. |
Minimum CPU Use (from Host children) (%) | The minimum CPU used across all Host children. |
Average Disk Use (from Host children) (Megabytes) | The average disk used across all Host children. |
Maximum Disk Use (from Host children) (Megabytes) | The maximum disk used across all Host children. |
Minimum Disk Use (from Host children) (Megabytes) | The minimum disk used across all Host children. |
Total Disk Capacity (from Host children) (Megabytes) | The total disk capacity across all Host children. |
Total Disk Use (from Host children) (Megabytes) | The total disk used across all Host children. |
Event Handlers Enabled | Whether or not event handlers are enabled. |
Host Count | Number of hosts within this system. |
Host Down Count | Number of hosts within this system that are down. |
Host Pending Count | Number of hosts within this system that are pending. |
Host Problem Count | Number of host problems within this system. |
Host Unhandled Problem Count | Number of unhandled host problems within this system. |
Host Unreachable Count | Number of hosts within this system that are unreachable. |
Host Up Count | Number of hosts within this system that are up. |
Hostname | Hostname of the Nagios System. |
Is Currently Running | Whether or not the system is currently running. |
Average Memory Use (from Host children) (Megabytes) | The average memory used across all Host children. |
Maximum Memory Use (from Host children) (Megabytes) | The maximum memory used across all Host children. |
Minimum Memory Use (from Host children) (Megabytes) | The minimum memory used across all Host children. |
Total Memory Capacity (from Host children) (Megabytes) | The total memory across all Host children. |
Total Memory Use (from Host children) (Megabytes) | The total memory used across all Host children. |
Notifications Enabled | Whether or not notifications are enabled. |
Passive Host Checks Enabled | Whether or not passive host checks are enabled. |
Passive Service Checks Enabled | Whether or not passive service checks are enabled. |
Service Count | Number of services within this system. |
Service Critical Count | Number of services within this system that have state 'critical'. |
Service Pending Count | Number of services within this system that have state 'pending'. |
Service Problem Count | Number of service problems within this system. |
Service Unhandled Problem Count | Number of unhandled service problems within this system. |
Service Unknown Count | Number of services within this system that have state 'unknown'. |
Service OK Count | Number of services within this system that have state 'OK'. |
Service Warning Count | Number of services within this system that have state 'warning'. |