Wednesday, July 20, 2011

VMware_Cannot create a snapshot because the snapshot operation exceeded the time limit for holding off I/O in the frozen virtual machine

  • Creating a quiesced snapshot fails.
  • While taking a snapshot, you see the error:

    Cannot create a quiesced snapshot because the snapshot operation exceeded the time limit for holding off I/O in the frozen virtual machine
     
  • You may also experience this issue when you are doing a hot clone of a virtual machine. You see the error: 
    An error occurred while quiescing the virtual machine. See the virtual machine's event log for details 'VssSyncStart' operation failed: IDispatch error #8449 (0x80042301)
  • The /var/log/vmware/vpx/vpxa logs contain entries similar to:
    2010-10-10T22:53:31.077Z [45EEEB90 info 'Default' opID=D1D8F924-00002337-83] [VpxLRO] -- ERROR task-768 -- -- vpxapi.VpxaService.createSnapshot: vim.fault.ApplicationQuiesceFault:
    --> Result:
    (vim.fault.ApplicationQuiesceFault) {
       dynamicType = <unset>,
       faultCause = (vmodl.MethodFault) null,
       faultMessage = (vmodl.LocalizableMessage) [
          (vmodl.LocalizableMessage) {
             dynamicType = <unset>,
             key = "msg.snapshot.quiesce.vmerr",
             arg = (vmodl.KeyAnyValue) [
                (vmodl.KeyAnyValue) {
                   dynamicType = <unset>,
                   key = "1",
                   value = "5",
                },
                (vmodl.KeyAnyValue) {
                   dynamicType = <unset>,
                   key = "2",
                   value = "'VssSyncStart' operation failed: IDispatch error #8449 (0x80042301)",
                }
             ],
             message = "The Guest OS has reported an error during quiescing.
    --> The error code was: 5
    --> The error message was: 'VssSyncStart' operation failed: IDispatch error #8449 (0x80042301)
    --> ",
          }
       ],
       msg = "An error occurred while quiescing the virtual machine. See the virtual machine's event log for details."
    }
    --> Args:
    --> Arg vmid:6
    --> Arg name:"clone-temp-1286776661914351"
    --> Arg description:
    "This temporary snapshot is taken as part of the clone operation. The temporary snapshot will be deleted once the clone operation is complete."
    --> Arg memory:false
    --> Arg quiesce:true
  • The vmware.log of the virtual machine contains entries similar to:
    2010-10-10T23:02:12.050Z| vcpu-3| [msg.snapshot.quiesce.vmerr] The Guest OS has reported an error during quiescing.
    2010-10-10T23:02:12.050Z| vcpu-3| --> The error code was: 5
    2010-10-10T23:02:12.050Z| vcpu-3| --> The error message was: 'VssSyncStart' operation failed: IDispatch error #8449 (0x80042301)
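
When many virtual machines are involved, it can help to scan the vmware.log files for these entries instead of reading them by hand. The following is a minimal sketch, assuming the log lines follow the format of the excerpts above; the function name find_quiesce_errors and the two regular expressions are illustrative and not part of any VMware tool.

```python
import re

# Patterns derived from the log excerpts above (an assumption about the format).
CODE_RE = re.compile(r"The error code was: (\d+)")
MSG_RE = re.compile(r"The error message was: (.+)")


def find_quiesce_errors(lines):
    """Return (error_code, error_message) pairs found in vmware.log lines."""
    errors = []
    code = None
    for line in lines:
        match = CODE_RE.search(line)
        if match:
            # Remember the code until the matching message line shows up.
            code = int(match.group(1))
            continue
        match = MSG_RE.search(line)
        if match and code is not None:
            errors.append((code, match.group(1).strip()))
            code = None
    return errors


sample = [
    "2010-10-10T23:02:12.050Z| vcpu-3| [msg.snapshot.quiesce.vmerr] The Guest OS has reported an error during quiescing.",
    "2010-10-10T23:02:12.050Z| vcpu-3| --> The error code was: 5",
    "2010-10-10T23:02:12.050Z| vcpu-3| --> The error message was: 'VssSyncStart' operation failed: IDispatch error #8449 (0x80042301)",
]

for code, message in find_quiesce_errors(sample):
    print(code, message)
```

In practice the lines would come from the virtual machine's vmware.log, opened from wherever the log has been copied.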

Resolution

This issue occurs because I/O in the virtual machine is high and the quiescing operation cannot flush all the data to disk within the time limit while further I/O is being generated.
To resolve this issue, perform one of these options:
  • Verify VSS, guest, and backup product configuration. For additional steps on troubleshooting VSS, see Troubleshooting Volume Shadow Copy (VSS) quiesce related issues (1007696).
  • Reduce the amount of ongoing I/O to the virtual machine. This can be accomplished using pre-freeze and post-thaw scripts to quiesce application I/O. For more information, see the Virtual Machine Backup Guide.
  • Opt for a crash-consistent snapshot (as opposed to application-consistent) of the virtual machine by avoiding quiescing of the file system.
If you do not want to quiesce the virtual machine during the snapshot creation, there are several options. Choose one of these options, based on your environment: 
  • If you are using VCB or a backup product that relies upon the VCB framework, see Quiescing Mechanisms in the Virtual Machine Backup Guide.
  • If you are taking the snapshot manually in ESX/ESXi 4.x, deselect the Quiesce guest file system option in the vSphere Client. In ESX/ESXi 3.5 it is not possible to take quiescent snapshots from the VMware Infrastructure Client.
  • If the snapshots are taken via the ESX host terminal using the vmware-cmd command, set the option of quiescing to 0.

    To set quiescing to 0, run the command:
    # vmware-cmd <cfg> createsnapshot <name> <description> <quiesce> <memory>

    For example:
    # vmware-cmd /vmfs/volumes/4adecc3a-62b367e8-5b15-001a4be960e0/VMname/VMname.vmx createsnapshot "Snap Name" "Snap Description" 0 0
  • If you are using a third-party backup product that does not allow you to configure non-quiescent snapshots, remove the VSS component, provided as part of VMware Tools, from the Windows guest operating system. When a quiescent snapshot is requested, VMware Tools does not find and use the VSS driver, and no attempt is made to quiesce the file system. The /var/log/vmware/hostd.log file reports this incident, but the snapshot creation completes without additional errors.

    Note: This requires a reboot of the virtual machine. VMware recommends scheduling downtime before performing this action.
    1. Uninstall VMware Tools.
    2. Allow the system to reboot.
    3. Reinstall VMware Tools. Be sure to select Custom Install.
    4. Deselect VSS.

      After the installation, you are able to take a snapshot where the quiescing operation is not performed even if specifically requested.
Note: Older versions of Windows and guests deployed prior to VMware ESX 3.5 Update 2 utilize the Sync Driver, instead of VSS, for quiescent snapshot requests.
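
If the vmware-cmd snapshots are driven from a script, the argument order shown above can be captured in a small helper so that quiescing stays off by default. This is a minimal sketch: the function name build_createsnapshot_command is hypothetical, and it only builds the argument list in the documented <cfg> createsnapshot <name> <description> <quiesce> <memory> order without running anything.

```python
import shlex


def build_createsnapshot_command(vmx_path, name, description,
                                 quiesce=False, memory=False):
    """Build the argument list for `vmware-cmd <cfg> createsnapshot ...`."""
    return [
        "vmware-cmd",
        vmx_path,
        "createsnapshot",
        name,
        description,
        "1" if quiesce else "0",  # 0 disables quiescing
        "1" if memory else "0",   # 0 skips the memory dump
    ]


cmd = build_createsnapshot_command(
    "/vmfs/volumes/4adecc3a-62b367e8-5b15-001a4be960e0/VMname/VMname.vmx",
    "Snap Name",
    "Snap Description",
)

# Print a shell-safe form of the command for review before running it.
print(" ".join(shlex.quote(part) for part in cmd))
```

The list form can be handed to subprocess.run on a host where vmware-cmd is available, which avoids quoting problems with snapshot names that contain spaces.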

4 comments:

  1. Creating a virtual machine snapshot fails with the error: The attempted operation cannot be performed in the current state (Powered Off)

    Details
    You cannot create a virtual machine snapshot
    You see this error in VMware Infrastructure (VI) Client:

    The attempted operation cannot be performed in the current state (Powered Off)


    You see this error on the ESX server console operating system command line (vmware-cmd /vmfs/volumes/my-vmfs3/my-vm1.vmx createsnapshot sample-name sample-description 0 0):

    VMControl error -3: Invalid arguments
    Solution
    This issue may occur if snapshot.current in the .vmsd metadata file points to a non-existent snapshot UID. In that case, attempting to create a snapshot causes the virtual machine to power off.
    In the following example, snapshot.current points to 38. This issue occurs if none of the snapshots have a UID of 38:

    [root@bshp020 my-vm1]# cat my-vm1.vmsd
    snapshot.lastUID = "39"
    snapshot.numSnapshots = "0"
    snapshot.current = "38"
    snapshot0.uid = "39"
    snapshot0.filename = "my-vm1-Snapshot39.vmsn"
    snapshot0.displayName = "Consolidate Helper"
    snapshot0.description = "Helper snapshot for online consolidate."
    snapshot0.createTimeHigh = "274900"
    snapshot0.createTimeLow = "-753698745"
    snapshot0.numDisks = "1"
    snapshot0.disk0.fileName = "my-vm1.vmdk"
    snapshot0.disk0.node = "scsi0:0"
    snapshot.needConsolidate = "FALSE"
    To resolve this issue, perform one of these options:
      • If the virtual machine does not have any snapshots, delete the existing .vmsd file. The file is recreated the next time a snapshot is created or the next time the virtual machine is powered ON.
      • If the virtual machine has existing snapshots:
        1. Modify snapshot.current in the .vmsd so that it points to an existing snapshot.
        2. When the virtual machine is up and running, commit all snapshots, power off the virtual machine, then delete the virtual machine's .vmsd file. The file is recreated the next time a snapshot is created or the next time the virtual machine is powered ON.
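
The check described above (does snapshot.current match any snapshotN.uid entry?) can be sketched in a few lines. This is an illustration only, assuming the .vmsd file uses the key = "value" layout shown in the example; the function names parse_vmsd and current_snapshot_is_dangling are made up for this sketch.

```python
import re


def parse_vmsd(text):
    """Parse `key = "value"` lines from .vmsd text into a dict."""
    entries = {}
    for line in text.splitlines():
        match = re.match(r'\s*([\w.]+)\s*=\s*"(.*)"\s*$', line)
        if match:
            entries[match.group(1)] = match.group(2)
    return entries


def current_snapshot_is_dangling(entries):
    """True if snapshot.current names a UID that no snapshotN.uid entry has."""
    current = entries.get("snapshot.current")
    if current is None:
        return False
    uids = {value for key, value in entries.items()
            if re.fullmatch(r"snapshot\d+\.uid", key)}
    return current not in uids


# Trimmed copy of the example .vmsd content shown above.
sample_vmsd = '''
snapshot.lastUID = "39"
snapshot.numSnapshots = "0"
snapshot.current = "38"
snapshot0.uid = "39"
snapshot0.filename = "my-vm1-Snapshot39.vmsn"
'''

print(current_snapshot_is_dangling(parse_vmsd(sample_vmsd)))
```

For the example file the check reports a dangling pointer, because snapshot.current is 38 while the only snapshot UID present is 39.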

  2. Creating a snapshot of a VM fails.
    If you are using snapshots to facilitate backups, clones, and so on, the dependent task also fails.
    If you are using VDR, the failure manifests as the error:

    Failed to create snapshot for "VMNAME" - error 3941 (create snapshot failed)

    Manual quiesce snapshot fails with the error:

    A snapshot operation cannot be performed

    In the vmware.log file of the virtual machine, you see errors similar to:

    Oct 01 06:06:08.347: vmx| DISKLIB-LIB : Resuming change tracking.
    Oct 01 06:06:08.355: vmx| FILE: File_VMFSSupportsFileSize: Requested file size (554051831808) larger than maximum supported filesystem file size (274877906944)
    Oct 01 06:06:08.355: vmx| DiskLibCreateCustom: if your disk is on VMFS, you may consider increasing the block size.
    Oct 01 06:06:08.355: vmx| DISKLIB-LIB : Failed to create link: The destination file system does not support large files (12)
    Oct 01 06:06:08.355: vmx| SNAPSHOT: BranchDisk: Failed to create child disk '/vmfs/volumes/4b7d5904-ff90f633-7345-a4badb09bd7f/svsql01.ttni.com.sg/svsql01.ttni.com.sg_1-000001.vmdk' : The destination file system does not support large files (12)
    Oct 01 06:06:08.355: vmx| DISKLIB-VMFS : "/vmfs/volumes/4b82e7df-198a6606-f37c-a4badb09bd7f/svsql01.ttni.com.sg/svsql01.ttni.com.sg_1-flat.vmdk" : closed.
    Oct 01 06:06:08.355: vmx| SNAPSHOT: SnapshotBranch: Unlinking '/vmfs/volumes/4b7d5904-ff90f633-7345-a4badb09bd7f/svsql01.ttni.com.sg/svsql01.ttni.com.sg-000001.vmdk'.
    Oct 01 06:06:08.355: vmx| DISKLIB-VMFS : "/vmfs/volumes/4b7d5904-ff90f633-7345-a4badb09bd7f/svsql01.ttni.com.sg/svsql01.ttni.com.sg-000001-delta.vmdk" : open successful (17) size = 137438953472, hd = 0. Type 8
    Oct 01 06:06:08.355: vmx| DISKLIB-LIB : Resuming change tracking.
    Oct 01 06:06:08.358: vmx| DISKLIB-VMFS : "/vmfs/volumes/4b7d5904-ff90f633-7345-a4badb09bd7f/svsql01.ttni.com.sg/svsql01.ttni.com.sg-000001-delta.vmdk" : closed.
    Oct 01 06:06:08.360: vmx| SNAPSHOT: SnapshotBranch failed: The destination file system does not support large files (5).
    Oct 01 06:06:08.360: vmx| CPT current = 2, requesting 6
    Oct 01 06:06:08.360: vmx| Checkpoint_Unstun: vm stopped for 1287944 us
    Oct 01 06:06:08.360: vmx| SnapshotVMX done with snapshot 'test': 0
    Oct 01 06:06:08.360: vmx| Msg_Reset:
    Oct 01 06:06:08.360: vmx| [msg.checkpoint.save.fail2.std3] Error encountered while saving snapshot.
    Oct 01 06:06:08.360: vmx| The destination file system does not support large files.----------------------------------------
    Oct 01 06:06:08.360: vmx| Vix: [7370 vmxCommands.c:2353]: VMAutomationCreateSnapshotCallback: Got CreateSnapshot callback, snapshotErr = 12, UID = 0
    Resolution
    This issue may occur if the size of the snapshot file is larger than the maximum file size the datastore supports, as the "larger than maximum supported filesystem file size" entry in the log indicates. In this case, the ESX host cancels the operation and displays the error.

    To resolve this issue, compare the base disk size of the virtual machine with the maximum file size allowed by the block size of the datastore that contains the working directory of the virtual machine.

    If the base disk size of the virtual machine is larger than the maximum file size allowed by the block size of the datastore, try one of these options:
      • Change the working directory (workingDir) to a datastore with sufficient block size. For more information, see Creating snapshots in a different location than default virtual machine directory (1002929).
      • Change the location of the virtual machine configuration files. To move the virtual machine configuration files, you can use Storage vMotion or cold migration with relocation of files. For more information, see Moving a single virtual disk using Storage VMotion (1004040).
    Note: This issue can also occur if you do not have the correct vCenter permissions to take a snapshot. Verify the permissions for your account if you see this error.
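
The comparison between disk size and block size can be worked through with the two numbers from the log above. The sketch below assumes the commonly documented VMFS-3 limits (roughly 256 GB of maximum file size per 1 MB of block size, up to 8 MB blocks); the table and the function name are illustrative, and the exact limit for 8 MB blocks is slightly under 2 TB.

```python
GIB = 1024 ** 3

# Assumed VMFS-3 limits: block size in MB -> approximate maximum file size in bytes.
VMFS3_MAX_FILE_SIZE = {
    1: 256 * GIB,
    2: 512 * GIB,
    4: 1024 * GIB,
    8: 2048 * GIB,
}


def smallest_sufficient_block_size(requested_bytes):
    """Return the smallest block size (MB) whose limit fits the file, or None."""
    for block_mb in sorted(VMFS3_MAX_FILE_SIZE):
        if requested_bytes <= VMFS3_MAX_FILE_SIZE[block_mb]:
            return block_mb
    return None


# Values taken from the File_VMFSSupportsFileSize log entry above.
requested = 554051831808
supported = 274877906944

print(supported == VMFS3_MAX_FILE_SIZE[1])
print(smallest_sufficient_block_size(requested))
```

Under these assumptions, the supported size in the log corresponds to a 1 MB block size, and the requested size of about 516 GB needs a datastore formatted with at least a 4 MB block size.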

  3. I think I have found another reason this could be happening. I'm running vSphere 5 and found this blog via Google...

    I'm using Acronis Backup & Recovery 11 and found that some VMs are failing because vSphere is unable to quiesce the VM. When I try to clone the machine, I get the error this blog describes. The machines that fail to quiesce do not currently contain snapshots, but they did at one point. When you click "Delete All" in the Snapshot Manager (vSphere), it removes the snapshot, but not completely in some cases. Browse your datastore and look in the folder where your VM files are located. In all three virtual machines that have this issue, there are still snapshot disks in use and virtual disk consolidation is needed. You'll know this is the case for you if you see multiple disks like Server01-000001.vmdk and Server01-000002.vmdk. These disks are used to capture changes and appear to grow to 17,408KB before another one is created. If you have more than 32 of these when you try to remove your snapshots, VMware cannot consolidate them without running the virtual disk consolidation task.

    To run this task, first close the datastore browser if it is still open. Then right-click the VM in the inventory, go to Snapshot -> Consolidate, and click Yes on the redo logs warning message. I'm going to try this tonight and see if it solves my problem. I'll post back my results here if I remember or can find this blog again.
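
For anyone who wants to check a folder listing for leftover snapshot disks without eyeballing it, here is a rough sketch. It assumes the naming pattern mentioned above (a base name, a dash, six digits, then .vmdk); the function name find_snapshot_disks is made up, and the file list is sample data, not output from a real datastore.

```python
import re

# Matches descriptors such as Server01-000001.vmdk but not Server01-000001-delta.vmdk.
SNAPSHOT_DISK_RE = re.compile(r"^(?P<base>.+)-(?P<seq>\d{6})\.vmdk$")


def find_snapshot_disks(filenames):
    """Return sorted (base_name, sequence_number) pairs for snapshot disks."""
    found = []
    for name in filenames:
        match = SNAPSHOT_DISK_RE.match(name)
        if match:
            found.append((match.group("base"), int(match.group("seq"))))
    return sorted(found)


listing = [
    "Server01.vmx",
    "Server01.vmdk",
    "Server01-flat.vmdk",
    "Server01-000001.vmdk",
    "Server01-000001-delta.vmdk",
    "Server01-000002.vmdk",
]

for base, seq in find_snapshot_disks(listing):
    print(base, seq)
```

If this returns anything for a VM that shows no snapshots in the Snapshot Manager, that VM is a candidate for the consolidation step described above.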

  4. So... that didn't fix the issue.
