Cancelling a stuck soft reboot task on Xen Server

Today, one of my colleagues received a call about a server that had run out of memory. They sent a soft reboot, and the reboot task hung. This happens because, for a soft reboot, the hypervisor compute node sends a message to the nova agent running on the guest virtual machine. If the guest has run out of memory, the agent may never receive that command, or, if it does, the soft (software) reboot can still fail because there is not enough memory to fork the process.

This could have been avoided by issuing a hard reboot straight away, but in this case we needed to cancel the task and send a hard reboot. Here is what I did:

List the tasks on xen-server (the stuck reboot shows up as pending)

# xe task-list

uuid ( RO)                : a9f84f3d-0b96-8da2-a1d1-f5b774cd9173
          name-label ( RO): VM.clean_reboot
    name-description ( RO): 
              status ( RO): pending
            progress ( RO): 0.275
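Picking the UUID out by eye is fine for a single task, but on a busy host the listing can be long. Here is a minimal sketch of a filter for it, assuming the output keeps the layout shown above (a uuid line first, then name-label, then status). The function name pending_task_uuids is my own invention, not part of xe, and it only reads text from stdin, so it never cancels anything itself.

```shell
# Hypothetical helper (not part of xe): read `xe task-list` output on stdin
# and print the UUID of every pending task whose name-label matches the
# first argument (default: VM.clean_reboot, the label in the listing above).
# Assumes each record lists uuid, then name-label, then status.
pending_task_uuids() {
    awk -v label="${1:-VM.clean_reboot}" '
        $1 == "uuid"       { uuid = $NF; name = "" }   # a new task record starts
        $1 == "name-label" { name = $NF }              # remember this task label
        $1 == "status"     { if (name == label && $NF == "pending") print uuid }
    '
}

# Usage on the hypervisor: review what it prints before cancelling anything.
#   xe task-list | pending_task_uuids
```

Each UUID it prints can then be handed to xe task-cancel once you have confirmed it really is the stuck reboot.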

Cancel a pending task on xen-server

# xe task-cancel uuid=a9f84f3d-0b96-8da2-a1d1-f5b774cd9173

This sets the active_state back to normal and clears the ‘pending soft reboot’, but the server still needs to be restarted.

Use supernova to stop and restart the server

supernova lon stop serveruuidhere
supernova lon start serveruuidhere

And… the customer is back online and running, yay!