Nagios hard soft states

A short reference on the difference between Nagios hard and soft state types.

Recently I was gathering Nagios state history reports to see when hosts went down and when they came back up. In the reports I could see two specific State Types:

Hard
Soft

Both the Hard and Soft state types appeared against the DOWN state, and I wondered what the difference was if both refer to the same DOWN state. Sample screenshot for reference:

[Screenshot: nstate4]

State descriptions as shown during monitoring:

Hard

[Screenshot: nstate3]

Soft

[Screenshot: nstate1]

Detailed information on these state types is available on the Nagios site; the link is below for reference:

https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/3/en/statetypes.html

As mentioned in the link (content taken from the site),

There are two state types in Nagios – SOFT states and HARD states. These state types are a crucial part of the monitoring logic, as they are used to determine when event handlers are executed and when notifications are initially sent out.

In order to prevent false alarms from transient problems, Nagios allows you to define how many times a service or host should be (re)checked before it is considered to have a “real” problem. This is controlled by the max_check_attempts option in the host and service definitions. Understanding how hosts and services are (re)checked in order to determine if a real problem exists is important in understanding how state types work.
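To get a feel for what max_check_attempts means in time terms, here is a rough Python sketch that estimates how long a problem must persist before Nagios treats it as real. It assumes the retry_interval directive and the default 60-second time unit (interval_length), neither of which is covered above, and it ignores check latency and execution time. The function name is made up for this illustration.

```python
def seconds_until_hard(max_check_attempts: int, retry_interval: float,
                       interval_length: int = 60) -> float:
    """Estimate the time from the first failed check to the HARD state.

    The first failure already counts as attempt 1, so only the remaining
    (max_check_attempts - 1) rechecks add waiting time.
    Assumption: rechecks are spaced retry_interval time units apart.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    rechecks = max_check_attempts - 1
    return rechecks * retry_interval * interval_length


# Example: 3 attempts with a 1-minute retry interval
print(seconds_until_hard(max_check_attempts=3, retry_interval=1))
```

With max_check_attempts set to 1 the estimate is zero, because the very first failed check is already final.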

Soft States

Soft states occur in the following situations,

1. When a service or host check results in a non-OK or non-UP state and the service check has not yet been (re)checked the number of times specified by the max_check_attempts directive in the service or host definition. This is called a soft error.
2. When a service or host recovers from a soft error. This is considered a soft recovery.

The following things occur when hosts or services experience SOFT state changes:

The SOFT state is logged.
Event handlers are executed to handle the SOFT state.

The only important thing that really happens during a soft state is the execution of event handlers. Using event handlers can be particularly useful if you want to try and proactively fix a problem before it turns into a HARD state. The $HOSTSTATETYPE$ or $SERVICESTATETYPE$ macros will have a value of “SOFT” when event handlers are executed, which allows your event handler scripts to know when they should take corrective action.
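As an illustration of how an event handler might use the state type, here is a minimal Python sketch. It assumes the handler command passes $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ as three arguments; the attempt threshold, the function names and the "restart" action are hypothetical and only stand in for whatever corrective step fits your service.

```python
import sys

# Hypothetical threshold: act on this soft attempt, before the state turns HARD.
RESTART_ON_SOFT_ATTEMPT = 3


def decide_action(state: str, state_type: str, attempt: int) -> str:
    """Return 'restart' or 'none' for one event handler invocation."""
    if state != "CRITICAL":
        return "none"      # OK, WARNING, UNKNOWN: nothing to fix here
    if state_type == "SOFT" and attempt == RESTART_ON_SOFT_ATTEMPT:
        return "restart"   # try a fix while the problem is still soft
    if state_type == "HARD":
        return "restart"   # fallback in case the soft attempt did not help
    return "none"


def main(argv):
    # Assumed argument order: $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
    if len(argv) != 3:
        print("usage: handler.py STATE STATETYPE ATTEMPT", file=sys.stderr)
        return 2
    state, state_type, attempt = argv[0], argv[1], int(argv[2])
    print(decide_action(state, state_type, attempt))
    return 0


# Demonstration call with example values instead of real macro output
main(["CRITICAL", "SOFT", "3"])
```

In a real handler, main would be called with sys.argv[1:] and the "restart" branch would run the actual recovery command.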

Hard States

Hard states occur for hosts and services in the following situations:

1. When a host or service check results in a non-UP or non-OK state and it has been (re)checked the number of times specified by the max_check_attempts option in the host or service definition. This is a hard error state.
2. When a host or service transitions from one hard error state to another error state (e.g. WARNING to CRITICAL).
3. When a service check results in a non-OK state and its corresponding host is either DOWN or UNREACHABLE.
4. When a host or service recovers from a hard error state. This is considered to be a hard recovery.
5. When a passive host check is received. Passive host checks are treated as HARD unless the passive_host_checks_are_soft option is enabled.
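The soft and hard rules can be easier to follow as a small simulation. The Python sketch below is a simplified model for a single service: it covers soft errors, soft recoveries, hard errors, hard-to-hard changes and hard recoveries, and deliberately ignores the host DOWN/UNREACHABLE rule and passive checks. The function name and the sample sequence are made up for this illustration; this is not how Nagios itself is implemented.

```python
from typing import Iterable, List, Tuple


def classify(results: Iterable[str], max_check_attempts: int,
             ok_state: str = "OK") -> List[Tuple[str, str]]:
    """Return (state, state_type) for each check result, in order."""
    timeline = []
    state_type = "HARD"    # assume we start from a confirmed OK state
    last_state = ok_state
    attempt = 0            # failed checks counted in the current problem
    for state in results:
        if state == ok_state:
            if last_state != ok_state and state_type == "SOFT":
                state_type = "SOFT"   # recovery from a soft error
            else:
                state_type = "HARD"   # hard recovery, or still OK
            attempt = 0
        elif state_type == "HARD" and last_state != ok_state:
            pass                      # already a hard problem, stays HARD
        else:
            attempt += 1
            state_type = "HARD" if attempt >= max_check_attempts else "SOFT"
        timeline.append((state, state_type))
        last_state = state
    return timeline


checks = ["OK", "CRITICAL", "WARNING", "CRITICAL", "WARNING", "OK",
          "UNKNOWN", "OK", "OK"]
for state, state_type in classify(checks, max_check_attempts=3):
    print(state, state_type)
```

In this run the third consecutive failed check turns the problem HARD, the later WARNING stays HARD, and the single UNKNOWN result produces only a soft error followed by a soft recovery.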

The following things occur when hosts or services experience HARD state changes:

The HARD state is logged.
Event handlers are executed to handle the HARD state.
Contacts are notified of the host or service problem or recovery.

The $HOSTSTATETYPE$ or $SERVICESTATETYPE$ macros will have a value of “HARD” when event handlers are executed, which allows your event handler scripts to know when they should take corrective action.
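Since both SOFT and HARD states are logged, a state history report can separate confirmed outages from transient blips by filtering on the state type. The Python sketch below parses HOST ALERT lines from a Nagios log, assuming the usual "[timestamp] HOST ALERT: host;state;state type;attempt;output" layout. The sample lines, host name and helper names are invented for this illustration and are not real data.

```python
import re
from typing import NamedTuple, Optional


class HostAlert(NamedTuple):
    timestamp: int
    host: str
    state: str
    state_type: str
    attempt: int
    output: str


# Assumed layout: [epoch] HOST ALERT: host;state;state_type;attempt;output
_HOST_ALERT = re.compile(
    r"^\[(\d+)\] HOST ALERT: ([^;]+);([^;]+);([^;]+);(\d+);(.*)$"
)


def parse_host_alert(line: str) -> Optional[HostAlert]:
    """Parse one log line, or return None if it is not a HOST ALERT."""
    match = _HOST_ALERT.match(line.strip())
    if match is None:
        return None
    ts, host, state, state_type, attempt, output = match.groups()
    return HostAlert(int(ts), host, state, state_type, int(attempt), output)


# Invented sample lines, for illustration only
sample = [
    "[1700000000] HOST ALERT: web01;DOWN;SOFT;1;CRITICAL - Host Unreachable",
    "[1700000060] HOST ALERT: web01;DOWN;SOFT;2;CRITICAL - Host Unreachable",
    "[1700000120] HOST ALERT: web01;DOWN;HARD;3;CRITICAL - Host Unreachable",
    "[1700000300] HOST ALERT: web01;UP;HARD;1;PING OK",
]

# Keep only confirmed (HARD) state changes
confirmed = [a for a in map(parse_host_alert, sample)
             if a is not None and a.state_type == "HARD"]
for alert in confirmed:
    print(alert.timestamp, alert.host, alert.state)
```

Filtering this way keeps one DOWN entry and one UP entry per outage, which is usually what an availability report needs.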

Tags: nagios, Nagios hard soft, Nagios XI
