Embedded Event Manager
Embedded Event Manager (EEM) is a powerful device- and system-management technology integrated into NX-OS. EEM helps customers harness the network intelligence intrinsic to Cisco’s software and gives them the capability to customize behavior based on network events as they happen. EEM is an event-driven tool that takes various types of trigger input and lets the user define the actions to take in response. These actions include capturing the output of various show commands or executing a Tool Command Language (TCL) or Python script when the event is triggered.
EEM consists of two major components:
Event: Defines the event to be monitored from another NX-OS component
Action: Defines the action to be taken when the event is triggered
Another component of EEM is the EEM policy, which is an event paired with one or more actions to help troubleshoot or recover from that event. Some system-defined policies watch for certain system-level events, such as a line card reload or a supervisor switchover, and then perform predefined actions based on those events. These system-level policies are viewed using the command show event manager system-policy. The same command also shows whether each policy can be overridden. The system policies help prevent a larger impact on the device or the network. For instance, if a module has gone bad and keeps crashing continuously, it can severely impact services and cause major outages. A system policy that powers down the module after N crashes can reduce the impact.
Example 2-24 lists some of the system policy events and describes the actions on those events. The command show event manager policy-state system-policy-name checks how many times an event has occurred.
Example 2-24 EEM System Policy
NX-1# show event manager system-policy
           Name : __lcm_module_failure
    Description : Power-cycle 2 times then power-down
    Overridable : Yes

           Name : __pfm_fanabsent_any_singlefan
    Description : Shutdown if any fanabsent for 5 minute(s)
    Overridable : Yes

           Name : __pfm_fanbad_any_singlefan
    Description : Syslog when fan goes bad
    Overridable : Yes

           Name : __pfm_power_over_budget
    Description : Syslog warning for insufficient power overbudget
    Overridable : Yes

           Name : __pfm_tempev_major
    Description : TempSensor Major Threshold. Action: Shutdown
    Overridable : Yes

           Name : __pfm_tempev_minor
    Description : TempSensor Minor Threshold. Action: Syslog.
    Overridable : Yes

NX-1# show event manager policy-state __lcm_module_failure
Policy __lcm_module_failure
  Cfg count : 3
    Hash          Count   Policy will trigger if
----------------------------------------------------------------
    default       0       3 more event(s) occur
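The relationship between the Cfg count field and the "more event(s) occur" column in the policy-state output can be pictured with a small counter model. The following Python sketch is only an illustration and is not how NX-OS implements its policies; the CountedPolicy class, its method names, and the reset-after-firing behavior are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CountedPolicy:
    """Toy model of a policy that fires after a configured number of events."""

    name: str
    cfg_count: int  # events required before the action fires
    count: int = 0  # events seen so far

    def remaining(self) -> int:
        """Number of further events needed before the policy triggers."""
        return max(self.cfg_count - self.count, 0)

    def record_event(self) -> bool:
        """Record one event; return True when the configured count is reached."""
        self.count += 1
        if self.count >= self.cfg_count:
            self.count = 0  # assumption: the counter starts over after firing
            return True
        return False


policy = CountedPolicy(name="__lcm_module_failure", cfg_count=3)
print(f"{policy.remaining()} more event(s) needed")

# Feed three module-failure events; only the third reaches the threshold.
fired = [policy.record_event() for _ in range(3)]
print(fired)
```

With a configured count of 3 and no events recorded, the model reports that three more events are needed, which mirrors the output shown for __lcm_module_failure.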
An event can be either a system event or a user-triggered event, such as a configuration change. Actions define the workaround or notification that is triggered when an event occurs. EEM supports the following actions, which are defined in the action statement:
Executing CLI commands (configuration or show commands)
Updating the counter
Logging exceptions
Reloading devices
Printing a syslog message
Sending an SNMP notification
Setting the default action policy for the system policy
Executing a TCL or Python script
For example, an action can be taken when high CPU utilization is seen on the device, or logs can be collected when a BGP session flaps. Example 2-25 shows an EEM configuration on a Nexus platform. The trigger event is a high CPU condition (in this case, CPU utilization of 70% or higher), and the actions capture the output of BGP show commands when that condition is detected. The policy is viewed using the command show event manager policy internal policy-name.
Example 2-25 EEM Configuration and Verification
event manager applet HIGH-CPU
  event snmp oid 1.3.6.1.4.1.9.9.109.1.1.1.1.6.1 get-type exact entry-op ge entry-val 70 exit-val 30 poll-interval 1
  action 1.0 syslog msg High CPU hit $_event_pub_time
  action 2.0 cli command enable
  action 3.0 cli command "show clock >> bootflash:high-cpu.txt"
  action 4.0 cli command "show processes cpu sort >> bootflash:high-cpu.txt"
  action 5.0 cli command "show bgp vrf all all summary >> bootflash:high-cpu.txt"
  action 6.0 cli command "show clock >> bootflash:high-cpu.txt"
  action 7.0 cli command "show bgp vrf all all summary >> bootflash:high-cpu.txt"

NX-1# show event manager policy internal HIGH-CPU
           Name : HIGH-CPU
    Policy Type : applet
  action 1.0 syslog msg "High CPU hit $_event_pub_time"
  action 1.1 cli command "enable"
  action 3.0 cli command "show clock >> bootflash:high-cpu.txt"
  action 4.0 cli command "show processes cpu sort >> bootflash:high-cpu.txt"
  action 5.0 cli command "show bgp vrf all all summary >> bootflash:high-cpu.txt"
  action 6.0 cli command "show clock >> bootflash:high-cpu.txt"
  action 7.0 cli command "show bgp vrf all all summary >> bootflash:high-cpu.txt"
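The entry-val 70 and exit-val 30 pair in the applet describes a threshold with hysteresis: the event fires when the polled value reaches the entry value and is not expected to fire again until the value has dropped back to the exit value. The Python sketch below models that idea on a list of sample readings. It is a simplified illustration, not the EEM implementation; the function name threshold_events is hypothetical, and the exit comparison (less than or equal to) is an assumption, because the configuration shown does not state an exit operator.

```python
def threshold_events(samples, entry_val=70, exit_val=30):
    """Yield the index of each sample that would trigger the applet."""
    armed = True  # the event can fire only while armed
    for index, value in enumerate(samples):
        if armed and value >= entry_val:
            # Mirrors "entry-op ge entry-val 70": fire once, then disarm.
            armed = False
            yield index
        elif not armed and value <= exit_val:
            # Assumed exit condition: re-arm once the value falls to exit_val.
            armed = True


# One reading per poll interval (CPU utilization in percent).
cpu_samples = [20, 75, 90, 60, 25, 80]
triggers = list(threshold_events(cpu_samples))
print(triggers)
```

In this sample run the readings of 90 and 60 do not trigger again, because the value has not yet fallen to the exit value; the event fires a second time only after the reading of 25 re-arms it.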
In some instances, repetitive configuration or show commands must be issued when an event is triggered. Additionally, using an external script makes it difficult to continuously monitor the device for an event and then trigger the script. For such scenarios, a better solution is to use the automation scripts and tools that are available with NX-OS. NX-OS provides the capability to reference TCL and Python scripts within EEM itself, so the scripts run only when the event occurs.
Consider an example software problem in which any link shutdown on the switch causes switching to be disabled on all the VLANs present on the switch. Example 2-26 demonstrates triggering a TCL script on a link shutdown. The TCL script is saved on the bootflash with the .tcl extension. As written, the script loops over VLANs 1 through 9 and performs a no shutdown under the VLAN configuration mode for each one.
Example 2-26 EEM with TCL Script
! Save the file in bootflash with the .tcl extension
set i 1
while {$i<10} {
    cli configure terminal
    cli vlan $i
    cli no shutdown
    cli exit
    incr i
}

! EEM Configuration referencing TCL Script
event manager applet TCL
  event cli match "shutdown"
  action 1.0 syslog msg "Triggering TCL Script on Link Shutdown Event"
  action 2.0 cli local tclsh EEM.tcl
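For comparison, the loop in the TCL script can be sketched in Python as a pure function that only builds the command sequence, which makes it easy to check the VLAN range before anything is sent to a switch. This is a hedged illustration: the function name vlan_no_shutdown_commands and its parameters are hypothetical, and nothing here executes commands on a device.

```python
def vlan_no_shutdown_commands(first_vlan=1, last_vlan=9):
    """Build the command sequence that the TCL loop issues, one VLAN at a time."""
    commands = []
    for vlan_id in range(first_vlan, last_vlan + 1):
        # Same four steps as the TCL loop body.
        commands += [
            "configure terminal",
            f"vlan {vlan_id}",
            "no shutdown",
            "exit",
        ]
    return commands


commands = vlan_no_shutdown_commands()
print(len(commands))
print(commands[:4])
```

The default arguments match the TCL loop, which starts at 1 and stops before 10. Widening the range is a matter of passing different bounds.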
Similarly, a Python script can be referenced in an EEM applet. The Python script is also saved in the bootflash, with the .py extension. Example 2-27 illustrates a Python script and its reference in the EEM configuration. In this example, the applet is triggered when the traffic on an interface exceeds the configured storm-control threshold. When that happens, the Python script collects the output of multiple show commands.
Example 2-27 Python Script with EEM
! Save the Python script in bootflash:
import re
import cisco

cisco.cli("show module >> bootflash:EEM.txt")
cisco.cli("show redundancy >> bootflash:EEM.txt")
cisco.cli("show interface >> bootflash:EEM.txt")

! EEM Configuration referencing Python Script
event manager applet Py_EEM
  event storm-control
  action 1.0 syslog msg "Triggering Python Script on Storm-Control Event"
  action 2.0 cli local python EEM.py
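Because the cisco module is available only on the switch, a collection script like this is hard to test elsewhere. One possible approach, sketched below, is to pass the CLI function in as a parameter so the same logic can be dry-run off-box. The helper name collect_outputs, the SHOW_COMMANDS list, and the dry-run recorder are assumptions introduced for illustration; only cisco.cli and the three show commands come from the example above.

```python
def collect_outputs(run_cli, commands, log_file="bootflash:EEM.txt"):
    """Issue each command with its output appended to log_file on the device."""
    issued = []
    for command in commands:
        full_command = f"{command} >> {log_file}"
        run_cli(full_command)  # on the switch this would be cisco.cli
        issued.append(full_command)
    return issued


SHOW_COMMANDS = ["show module", "show redundancy", "show interface"]

# Off-box dry run: record the commands instead of executing them.
recorded = []
collect_outputs(recorded.append, SHOW_COMMANDS)
for line in recorded:
    print(line)

# On the switch, the equivalent call would be (not run here):
#   import cisco
#   collect_outputs(cisco.cli, SHOW_COMMANDS)
```

Keeping the command list separate from the execution step also makes it simpler to add or remove show commands without touching the loop.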