Performing an HP ProLiant Server NMI Crash Dump
Technical white paper
Table of contents
Introduction
NMI crash dump overview
Initiating NMI crash dumps
  NMI crash jumper pins and dump switches
  ROM-Based NMI Debug button
NMI crash dump compliant operating systems
  Microsoft Windows
  VMware
  Linux
iLO Virtual NMI Button
Resources
Technical white paper | ProLiant diagnostic tools
Introduction
This document describes the implementation of non-maskable interrupt (NMI)-based crash dump capabilities in HP ProLiant
servers, including ProLiant Gen8 servers. The ability to perform an NMI-based crash dump can be beneficial to system
administrators in their root cause failure analysis.
An NMI crash dump allows you to obtain critical diagnostic information in the event of system failures. We present both
user-initiated and automatic crash dump methods.
NMI crash dump overview
The NMI crash dump is a diagnostic mechanism that allows the creation of crash dump files in situations when a system is
unresponsive and traditional debugging mechanisms are unsuccessful.
Crash dump analysis is an essential diagnostic tool for addressing reliability problems in operating systems, device drivers,
and applications. Many crashes will freeze a system in such a way that your only recourse is to do a hard reset (cycling
power on the system). Since resetting the system erases any information supporting an analysis of the problem, the system
must execute a memory dump before you perform a hard reset. A hardware jumper, dump switch, or virtual NMI button
along with supported operating systems provide this function.
Figure 1 shows the course of events that occur when you force the operating system to invoke the NMI handler, generate a
crash dump log, and then use that log to diagnose software failures. The crash dump log can provide critical information for
root-cause analysis that may be difficult or impossible to obtain through other means. You initiate an NMI event by shorting
the jumper pins, by pressing the dump switch, or through the HP iLO Virtual NMI Button feature. The NMI can allow a frozen
system to become responsive enough to generate a crash dump log.
Figure 1.
Warning
Using the NMI crash jumper pins or dump switch on a functioning system (using any operating system) will cause the
system to halt abruptly. You should never use NMI crash dump during normal operation.
The jumper pins and dump switch operate even if the appropriate driver is not loaded. If present, the driver disables the
Automatic Server Recovery (ASR) feature so that the server does not reboot when a debug session is in progress.
The NMI crash dump jumper pins or dump switch may not work in all situations: after another NMI has already occurred in
the system, when the OS crash handler is incapable of running properly, and following some hardware failures. Table 1
highlights ProLiant server NMI crash dump capabilities and benefits.
Table 1.
ProLiant hardware
  ProLiant NMI crash dump compatibility: Newer ProLiant servers only include NMI jumpers, not dump switches. Consult
  the product documentation for your server. ProLiant server blades do not include physical NMI debug jumper pins or
  dump switches. You can only use iLO-based Virtual NMI functions. See the iLO Virtual NMI Button section later in
  this paper.
  NMI benefits for ProLiant servers: Jumper pins, a dump switch, or a virtual NMI function cause a ProLiant server to
  initiate an NMI (PCI SERR) event and create a crash dump file.

ProLiant software
  ProLiant NMI crash dump compatibility: Beginning with ProLiant Gen8 servers, an HP NMI Sourcing driver is not
  required for any operating system. The system ROM logs the NMI event. Older servers require the appropriate HP NMI
  Sourcing driver to create a crash dump file.
  NMI benefits for ProLiant servers: The NMI crash dump function depends on the ProLiant server generation, on
  whether the ProLiant is a rack or blade server, and on the appropriate driver being installed when necessary. All
  drivers are distributed with Service Pack for ProLiant (SPP), hp.com/go/spp. The SPP detects and installs the
  appropriate driver for the server automatically.
Note
The NMI jumper pins or dump switch will cause an NMI upon activation. This feature does not require any software to
generate the NMI. An NMI event by itself will not create a crash dump log.
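The compatibility rules summarized in Table 1 can be sketched as a small lookup. This is only an illustration of the rules as stated in this paper; the Server type, its field names, and the integer encoding of the server generation are hypothetical, and the product documentation for a given server remains the authority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical description of a ProLiant server for this sketch."""
    generation: int          # assumed encoding: 7 for G7, 8 for Gen8, and so on
    is_blade: bool           # server blades have no physical NMI controls
    has_dump_switch: bool = False  # newer servers only include jumper pins


def nmi_trigger_methods(server):
    """Return the ways an NMI can be initiated, per Table 1."""
    methods = ["iLO Virtual NMI button"]
    if not server.is_blade:
        # Rack servers expose jumper pins; some older models add a switch.
        methods.insert(0, "NMI jumper pins")
        if server.has_dump_switch:
            methods.insert(1, "dump switch")
    return methods


def sourcing_driver_required(server):
    """Table 1 rule: Gen8 and later need no HP NMI Sourcing driver.

    The Windows section of this paper treats the driver as optional on
    compliant Microsoft operating systems, so read True as "check the
    operating system guidance" rather than as a hard requirement.
    """
    return server.generation < 8


if __name__ == "__main__":
    blade = Server(generation=8, is_blade=True)
    print(nmi_trigger_methods(blade), sourcing_driver_required(blade))
```

Keeping the two questions (how to trigger the NMI, and whether a driver is needed to turn it into a crash dump) in separate functions mirrors the Note above: the NMI itself needs no software, but the dump does.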
Initiating NMI crash dumps
You can initiate an NMI event through the jumper pins or dump switch provided on the ProLiant server, or remotely through
the Virtual NMI button in iLO (see the “iLO Virtual NMI Button” section).
NMI crash jumper pins and dump switches
The NMI crash dump jumper pins or dump switch generates a PCI SERR under all operating systems.
Figures 2 and 3 are examples of jumper pins and dump switches found on ProLiant servers. For exact placement, refer to
the illustration on the hood label of the server or in the user guide.
Figure 2. Figure 3.
Note
Newer ProLiant servers only include NMI jumpers, not dump switches. Consult the product documentation for your server.
ProLiant server blades do not include physical NMI debug jumper pins or dump switches. You can only use iLO-based Virtual
NMI functions. See the iLO Virtual NMI Button section later in this paper.
ROM-Based NMI Debug button
The NMI Debug Button option is a toggle setting (Figure 4) that allows you to enable debug functionality when the system
has experienced a software lock-up. The NMI Debug Button generates an NMI to enable the use of the operating system
debugger. The NMI Debug Button is enabled by default.
Options include:
• Enabled (default)
• Disabled
Figure 4.
NMI crash dump compliant operating systems
The operating systems discussed here give you the ability to initiate crash memory dumps.
Note
Beginning with ProLiant Gen8 servers, an HP NMI sourcing driver is not required for any operating system. The system ROM
logs the NMI event.
Microsoft Windows
You can find the latest guidelines for generating an NMI crash dump file or a kernel crash dump file on a Windows-based
system in this Microsoft support article: support.microsoft.com/kb/927069
The article also contains a current list of applicable Microsoft operating systems. If your ProLiant server requires an HP
sourcing driver (a ProLiant pre-Gen8 server), you will need to use the SPP software appropriate for your operating system to
supply that driver.
Use either the SPP version that shipped with your ProLiant server, or a later version from hp.com. You can download the
latest version of SPP from: hp.com/go/spp_download
Note
For optimal functionality, we recommend using SPP to obtain the appropriate HP NMI sourcing driver for servers older than
ProLiant Gen8. This is an optional step since the HP NMI Sourcing driver is not required with compliant Microsoft operating
systems.
Warning
Before making changes in the Registry, HP recommends that you make a copy of the system settings. This will allow you to
restore the system settings if there are errors.
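As a rough illustration of the Registry change involved, the sketch below generates the text of a .reg file. The key path and the NMICrashDump value name are not given in this paper; they are assumptions based on what the Microsoft article referenced above has described, and some Windows versions may not need the value at all. Follow the article for your Windows version, and back up the Registry first as the warning above recommends.

```python
# Assumed key path and value name (from the Microsoft article, not this paper).
CRASH_CONTROL_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl"
)


def nmi_crash_dump_reg(enable=True):
    """Build .reg file text that sets NMICrashDump to 1 (on) or 0 (off)."""
    value = 1 if enable else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{CRASH_CONTROL_KEY}]",
        # REG_DWORD values are written as eight hexadecimal digits.
        f'"NMICrashDump"=dword:{value:08x}',
        "",
    ]
    # .reg files conventionally use Windows line endings.
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(nmi_crash_dump_reg())
```

Generating the file rather than writing to the Registry directly leaves the actual change as a deliberate, reviewable step on the target server.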
VMware
VMware 5 is compliant with the HP NMI Sourcing driver, but only on ProLiant pre-Gen8 servers. On ProLiant Gen8 servers,
even those without the HP NMI driver, VMware will halt the VMkernel with a purple diagnostic screen (panic) when an NMI
occurs. The sole purpose of the HP NMI Sourcing driver is to tell the OS to panic and log the NMI event to the HP Integrated
Management Log (IML).
For more information about managing a VMware NMI event, see the VMware Knowledge Base. You can access the VMware
Knowledge Base at either of the following locations:
• kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1014767
• kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=2002955
You can download the latest HP NMI sourcing driver for VMware at
hp.com/products/servers/software/vmware-esxi/driver_version.html.
Linux
Linux uses the kdump facility to create a crash dump when an NMI event occurs. Most Linux kernels are configured to be
"kdump-ready". You can typically find a description of the configuration inside the Linux kernel source tree file
"Documentation/kdump/kdump.txt".
This is generic information regarding Linux and NMI-related crash dumps. You should look for specific information relating
to your version of Linux.
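Before relying on an NMI to produce a dump, it can help to confirm that kdump is actually armed. The following is a minimal sketch, not a distribution-specific procedure: it assumes the common x86 Linux interfaces /proc/cmdline, /sys/kernel/kexec_crash_loaded, and the NMI-related sysctls under /proc/sys/kernel, none of which are named in this paper. Which sysctl governs the panic depends on the kernel version and on how the platform reports the NMI, so treat the output as a starting point.

```python
from pathlib import Path

# Sysctls that can make the kernel panic (and so run kdump) on an NMI.
# Assumed to exist on x86 kernels; missing entries are reported as None.
NMI_PANIC_SYSCTLS = (
    "unknown_nmi_panic",
    "panic_on_io_nmi",
    "panic_on_unrecovered_nmi",
)


def has_crashkernel(cmdline):
    """True if the kernel command line requests a crashkernel= reservation."""
    return any(tok.startswith("crashkernel=") for tok in cmdline.split())


def read_flag(path):
    """Return the stripped file contents, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def kdump_report(root="/"):
    """Collect kdump readiness indicators below the given filesystem root."""
    root = Path(root)
    cmdline = read_flag(root / "proc/cmdline") or ""
    report = {
        "crashkernel_param": has_crashkernel(cmdline),
        "crash_kernel_loaded":
            read_flag(root / "sys/kernel/kexec_crash_loaded") == "1",
    }
    for name in NMI_PANIC_SYSCTLS:
        report[name] = read_flag(root / "proc/sys/kernel" / name)
    return report


if __name__ == "__main__":
    for key, value in kdump_report().items():
        print(f"{key}: {value}")
```

The root parameter exists only so the function can be pointed at a copied directory tree when testing; on a live server the default is what you want.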
Table 2 provides an overview of compliant operating systems and benefits.
Table 2.
Compliant Microsoft Windows operating systems
  NMI crash dump compliance: Registry changes are required to generate a crash dump when the NMI dump switch is used.
  No special installation requirements are needed.
  NMI benefits: Allows user-level settings for crash dump file generation.

VMware
  NMI crash dump compliance: VMware 5 is compliant with the HP NMI sourcing driver, but only on ProLiant pre-Gen8
  servers. No drivers are required on ProLiant Gen8 servers.
  NMI benefits: VMware is compatible with NMI crash dump for ProLiant Gen8 and earlier servers.

Linux
  NMI crash dump compliance: Linux uses the kdump facility.
  NMI benefits: Linux is compatible with ProLiant server NMI crash dumps.
iLO Virtual NMI Button
ProLiant servers with iLO can initiate an NMI crash dump through a web browser. The iLO-based Virtual NMI button allows
users to trigger an NMI without requiring physical access to the server chassis or knowing the precise location of the NMI
control for the host. Access to this control is restricted to users with the “iLO Virtual Power & Reset” privilege. The same NMI
crash dump conditions and restrictions apply when using iLO.
To generate an NMI using iLO, you must:
1. Log in to the iLO processor of the target using an account with the Virtual Power & Reset privilege.
2. Navigate to the iLO Diagnostics screen as shown in Figure 5.
3. Click the Generate NMI to System button.
Figure 5.
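Where the browser steps above need to be scripted, one possible approach is a management API call. The sketch below is an assumption, not something this paper describes: it presumes iLO firmware that exposes the DMTF Redfish API, a system resource at /redfish/v1/Systems/1/, and support for the Redfish ResetType value "Nmi". The host name and credentials are placeholders. The code only builds the request; sending it will halt the target server, so the same warnings as for the physical NMI controls apply.

```python
import base64
import json
import urllib.request


def build_nmi_request(host, user, password, system_id="1"):
    """Build (but do not send) a Redfish reset request asking for an NMI.

    Assumptions: the URL layout and the "Nmi" reset type follow the
    Redfish ComputerSystem schema; verify both against the Actions
    listed by your iLO before use.
    """
    url = (
        f"https://{host}/redfish/v1/Systems/{system_id}"
        "/Actions/ComputerSystem.Reset/"
    )
    body = json.dumps({"ResetType": "Nmi"}).encode("utf-8")
    # HTTP Basic authentication; the account needs the
    # Virtual Power & Reset privilege mentioned above.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Basic " + token.decode("ascii"),
    }
    return urllib.request.Request(
        url, data=body, headers=headers, method="POST"
    )


if __name__ == "__main__":
    # Placeholder host and credentials; nothing is sent here.
    req = build_nmi_request("ilo.example.net", "admin", "secret")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Separating request construction from sending keeps the destructive step explicit: the request can be logged or reviewed, and passed to urllib.request.urlopen only once the target has been confirmed.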