Upgraded VC2.5u2 to VC2.5u3, had some issues
Friday, 19 June 2009 14:48

Case

Yesterday we upgraded vCenter 2.5u2 to vCenter 2.5u3 (yeah, please don't ask why we didn't upgrade to u4). The upgrade process went fine, until we discovered that the vCenter Server service had not started.

Errors

One thing that caught our eye was the service account used to start the vCenter Server service. It was set to the default 'Local System' account, while before the upgrade it was set to an NPA (Non-Person Account, which we use for this kind of service). We changed it back and hoped the vCenter Server service would start. It didn't, and gave us an unhelpful error 2:

"The VMware VirtualCenter Server service terminated with service-specific error 2 (0x2)."

For some reason we were unable to start the service.

Try to solve

The next thing I did was start vpxd.exe with the -s option from the command line:

vpxd.exe -s

[Screenshot: output of vpxd.exe -s. Note the 3rd line from the bottom.]
So I started my 'Microsoft SQL Server Management Studio' and opened up my vCenter database.

There is a table called 'VPX_VERSION'. This table has only one row with two values. VER_ID = '5', which is fine. The other one, VERSION_VALUE = 'VirtualCenter Database 2.5u2', is NOT fine. I changed it to 'VirtualCenter Database 2.5u3' and closed the table. Headed to the service and, tadaaa, starting the service works! I just fooled vCenter.

Anyway, since this is a heavy production environment (48 ESX hosts on this vCenter and growing), I wanted to make sure there were NO differences between our database and an original 2.5u3 database.

Export database structure

I exported the database tables and views structure: 'Right click on the database, Tasks, Generate Scripts...'

Click Next here (shouldn't be a problem).

Choose your vCenter database.

This is where things get interesting. I didn't care about the owner or the 'use database' part. I did care about everything that is in the database structure, so I set all the settings in the bottom part to 'true'.

I wanted to compare everything: stored procedures, tables and views.

I selected everything, since I wanted to compare everything.

And I wanted the output written to a file.

Take a quick look at the last page of the wizard, finish it, and do the same on your other database.

The compare

Using WinMerge (or better yet, the portable version), I compared both files. Lucky for us, there was only one difference, since the data in the table wasn't filled.

[Screenshot: WinMerge comparison of the two generated scripts]
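
For anyone who prefers to script the comparison instead of eyeballing it in WinMerge, here is a minimal sketch in Python using the standard difflib module. It is not what I used for this upgrade. The function name and the sample SQL lines are made up for illustration; in practice you would read the two generated script files and pass their contents in.

```python
import difflib


def schema_changes(upgraded_sql, reference_sql):
    """Return only the lines that differ between two schema scripts.

    Lines prefixed with '- ' exist only in the upgraded script,
    lines prefixed with '+ ' exist only in the reference script.
    """
    diff = difflib.ndiff(upgraded_sql.splitlines(), reference_sql.splitlines())
    # ndiff also emits unchanged lines ('  ') and hint lines ('? '); skip those
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical sample content, not taken from a real vCenter database
upgraded = "CREATE TABLE T1 (ID int)\nCREATE TABLE T2 (NAME varchar(10))"
reference = "CREATE TABLE T1 (ID int)\nCREATE TABLE T2 (NAME varchar(20))"

changes = schema_changes(upgraded, reference)
for line in changes:
    print(line)
```

An empty result means the two scripts are identical line for line.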

So the upgrade was quite successful; too bad the installer did some strange things. But using the export in SQL and WinMerge I was able to resolve the issue and get a satisfied feeling about the upgrade.

 
Cluster overview with capacity
Wednesday, 17 June 2009 16:56

@Depping asked me to send an overview of our current capacity, both hosts in clusters and storage. In half an hour I wrote this PowerShell script. It checks all your clusters and the hosts within them, calculates the total available GB and GHz and, at the end of the report, presents an overview of the datastores (from John Tuffin's blog).

Here is the script:

$VC = Connect-VIServer "<your vCenter Server here>"
$grandtotalGB = 0
$grandtotalGHz = 0
$Clusters = Get-Cluster -Location "<If needed, else delete it from -Location>" | sort name
foreach ($Cluster in $Clusters) {
    $TotalGB = 0
    $TotalGHz = 0
    # Largest host seen in this cluster; subtracted later for the n-1 figures
    $MaxHostGB = 0
    $MaxHostGHz = 0

    $ESXHosts = $Cluster | Get-VMHost | sort name
    Write-Host $Cluster.name
    foreach ($ESXHost in $ESXHosts) {
        $ESXHostProp = $ESXHost | Get-View
        # CpuInfo.Hz is in hertz: divide by 1e9 for GHz (1GB would be 2^30) and round after multiplying by the core count
        $ESXTotalGHz = [Math]::Round($ESXHostProp.Hardware.CpuInfo.Hz / 1e9 * $ESXHostProp.Hardware.CpuInfo.NumCpuCores)
        # MemorySize is in bytes, so 1GB (2^30) is the right divisor here
        $ESXTotalGB = [Math]::Round($ESXHostProp.Hardware.MemorySize / 1GB)
        $TotalGHz = $TotalGHz + $ESXTotalGHz
        $TotalGB = $TotalGB + $ESXTotalGB
        if ($ESXTotalGHz -gt $MaxHostGHz) { $MaxHostGHz = $ESXTotalGHz }
        if ($ESXTotalGB -gt $MaxHostGB) { $MaxHostGB = $ESXTotalGB }
        Write-Host " " $ESXHost.name ": Total GHz:" $ESXTotalGHz "- Total GB:" $ESXTotalGB
    }
    # n-1: capacity that remains when the largest host in the cluster is lost
    $TotalAvailableGB = $TotalGB - $MaxHostGB
    $TotalAvailableGHz = $TotalGHz - $MaxHostGHz
    Write-Host "Total GB =" $TotalGB
    Write-Host "Total Available GB (n-1) =" $TotalAvailableGB
    Write-Host "Total GHz =" $TotalGHz
    Write-Host "Total Available GHz (n-1) =" $TotalAvailableGHz
    Write-Host "-------------------------------------------------------------------"
    $grandtotalGB = $grandtotalGB + $TotalAvailableGB
    $grandtotalGHz = $grandtotalGHz + $TotalAvailableGHz
}
Write-Host "Total Available Memory : $grandtotalGB GB"
Write-Host "Total Available CPU : $grandtotalGHz GHz"
# Datastore overview: sizes are reported in MB, converted to GB here
Get-Datastore | sort name | ft name,@{ Label = "FreespaceGB"; Expression = { [Math]::round($_.FreeSpaceMB * 1MB / 1GB) } }, @{ Label = "CapacityGB"; Expression = { [Math]::round($_.CapacityMB * 1MB / 1GB) } }
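
To make the "n-1" figures in the report concrete, here is a small standalone sketch of the arithmetic in Python. It needs no vCenter connection. The host names and sizes are invented, and subtracting the largest host (the worst case when a single host fails) is my assumption about what n-1 should mean in a cluster with unequal hosts.

```python
def cluster_capacity(hosts):
    """Summarise a cluster given a list of (name, ghz, gb) tuples.

    Returns the totals plus the n-1 figures, i.e. what is left
    when the largest host in the cluster is unavailable.
    """
    total_ghz = sum(ghz for _, ghz, _ in hosts)
    total_gb = sum(gb for _, _, gb in hosts)
    largest_ghz = max((ghz for _, ghz, _ in hosts), default=0)
    largest_gb = max((gb for _, _, gb in hosts), default=0)
    return {
        "total_ghz": total_ghz,
        "total_gb": total_gb,
        "available_ghz": total_ghz - largest_ghz,
        "available_gb": total_gb - largest_gb,
    }


# Hypothetical cluster: three identical hosts and one bigger one
example_hosts = [
    ("esx01", 21, 64),
    ("esx02", 21, 64),
    ("esx03", 21, 64),
    ("esx04", 32, 128),
]
summary = cluster_capacity(example_hosts)
print(summary)
```

For this example the cluster totals 95 GHz and 320 GB, of which 63 GHz and 192 GB remain if esx04 fails.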

 
ESX Boot Date-Time
Tuesday, 16 June 2009 17:15

After my change yesterday I wanted to make sure I had rebooted all ESX servers with the latest queue depth settings. PowerShell to the rescue (and it saved me one time):

$VC = Connect-VIServer "<vCenter FQDN>"

Get-VMHost | sort name | % { $server = $_ | Get-View ; Write-Host $_ $server.Runtime.BootTime }
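
The check itself is just a date comparison: any host whose boot time is older than the moment the setting was changed still runs with the old value. A minimal sketch of that comparison in Python, with made-up host names and timestamps (the function name is hypothetical as well):

```python
from datetime import datetime


def hosts_needing_reboot(boot_times, changed_at):
    """Return the names of hosts that booted before the change was made."""
    return sorted(name for name, booted in boot_times.items() if booted < changed_at)


# Hypothetical boot times, as they might be copied from the report above
boot_times = {
    "esx01": datetime(2009, 6, 16, 9, 0),
    "esx02": datetime(2009, 5, 1, 7, 30),
    "esx03": datetime(2009, 6, 15, 18, 45),
}
changed_at = datetime(2009, 6, 15, 12, 0)

stale = hosts_needing_reboot(boot_times, changed_at)
print(stale)
```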

 
Queue Depth and more...
Monday, 15 June 2009 11:27

Over the last few weeks we have had some issues with our HDS USP-V. For some reason a SCSI lock locked the whole VMFS, and all ESX servers in that cluster were no longer able to read from or write to the VMFS. The LUN was still available, but the VMFS was not. This crashed all the VMs in the cluster. The incident repeated itself after 6 weeks, on another server, another chassis and other switches, so we contacted VMware and HDS to help us with this situation.

After a lot of log file sending, checking settings, etc., the things we were advised to change were:

  • Masking
  • Queue depth of HBA

Masking

According to a PDF sent by HDS Services, the masking can be done in several ways:

"Host Groups per HBA Versus Host Groups per ESX Hosts or VMware Cluster
To present a set of common, shared LUs to multiple ESX hosts or to a VMware cluster, host groups can be created either per HBA port (that is, per WWPN) or per a group of ESX hosts or VMware cluster.
A host group created on an HBA port basis contains the HBA's WWPN and a set of common, shared LUs (that is, only one WWPN, multiple LUs). A host group created per group of ESX hosts or per VMware cluster contains at least one WWPN from every ESX host and multiple LUs (that is, multiple WWPNs, multiple LUs). Every LU must be presented with the same host LU ID to every host or VMware treats the LU as a snapshot LU and disables access to the VMFS by default.
Although both concepts are supported, Hitachi Data Systems recommends creating host groups per HBA port (that is, per WWPN)."

So this is exactly what we changed: we now use host groups per HBA port. This creates more administrative overhead, but since HDS recommended it, we did it.

Queue Depth

HDS also told us the default queue depth on our Emulex adapters was too high. The default is 32, and after some calculations we needed to set it to 4. It is VERY important that you check with your vendor before changing this. More info about queue depth can be found on Duncan Epping's site. If you want to know how the recommended setting is calculated, check Frank Denneman's website. He did an excellent job describing how to calculate the queue depth.
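
I am not reproducing the HDS calculation here. Purely to illustrate why the number can end up so low, here is a sketch in Python of one simple model: a storage port has a fixed command queue, and that queue is shared evenly by every host and LU behind the port. The model, the function name and all the numbers are assumptions for illustration only, not the vendor's formula, so get the real figures from your storage vendor.

```python
def per_lu_queue_depth(port_queue_size, hosts, lus_per_host):
    """Split a storage port's command queue evenly over all host/LU pairs.

    Assumed model: every host can have `lus_per_host` LUs busy at once,
    and the port must never be offered more commands than it can queue.
    """
    if hosts < 1 or lus_per_host < 1:
        raise ValueError("hosts and lus_per_host must be at least 1")
    depth = port_queue_size // (hosts * lus_per_host)
    # A queue depth below 1 makes no sense, so never go lower than 1
    return max(1, depth)


# Hypothetical numbers: a port that queues 1024 commands, 32 hosts, 8 LUs each
depth = per_lu_queue_depth(1024, 32, 8)
print(depth)
```

With these made-up inputs the result is 4: the more hosts and LUs share a port, the smaller each share becomes.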

The next thing I needed to do was change 96 servers to the correct queue depth and reboot them. I first created a .sh script, but running that by hand on every server was not very clever. So I was thinking: PowerShell. PowerShell is the way to go. So I created this script:

$VC = Connect-VIServer "<vCenter FQDN>"
$QUEUEDEPTH = 4     # <---- change to your value!!!!
$ESXHOSTS = Get-VMHost

foreach ($ESXHOST in $ESXHOSTS)
{
    # Connect directly to the ESX host; replace the placeholders with an account that has the administrator role
    $VIServer = Connect-VIServer $ESXHOST.Name -User "<administrator role>" -Password "<password>"

    # Emulex driver module option (takes effect after a reboot), followed by the matching advanced settings
    Get-VMHostModule "lpfc_740" | Set-VMHostModule -Options "lpfc_lun_queue_depth=$QUEUEDEPTH"
    Set-VMHostAdvancedConfiguration -Name "Disk.SchedNumReqOutstanding" -Value $QUEUEDEPTH
    Set-VMHostAdvancedConfiguration -Name "Disk.UseDeviceReset" -Value 0
    Set-VMHostAdvancedConfiguration -Name "Disk.UseLunReset" -Value 1
}
 

Be aware that this script is written for our servers with an Emulex FC adapter. If you have a QLogic HBA you need to change "lpfc_740" to something else, and the options part needs to be changed as well. For more info, check http://www.vmware.com/pdf/vi3_301_201_san_cfg.pdf pages 107 and 108.

In the end, using this script, I was able to update 96 servers within 30 minutes, awesome! Again, consult your storage vendor before changing the queue depth, or you might end up with an unsupported configuration.

 
Virtual systems are gaining share over physical servers
Friday, 05 June 2009 16:27

Today I was reading an article on Computable.nl. In 2008, more virtual machines were sold than physical machines. For us, this is not surprising. I think that current hardware allows us to push consolidation ratios higher and higher. The full article is on Computable.nl (in Dutch).

I would like to note, though, that virtualisation isn't just about consolidation. We still gain massive advantages from using virtual machines instead of physical machines. In the case of VMware vSphere, think about vMotion, Storage vMotion, snapshots, record/replay, shared VMFS, resource control, DRS, HA, FT, simplified management, support for multiple storage solutions, fast performance, and it is cloud ready. How cool is that!

 
Did you know that ESX checks every 20ms whether to migrate a vCPU to another pCPU for the optimal workload balance? This is configurable (0ms - 5000ms) through Cpu.MigratePeriod in the Advanced Settings of your ESX server.