Monday, October 28, 2013

VMware vSphere: Script to change Default PSP to Round Robin


An automated way to set the default PSP for a particular SATP.
vCLI example:
esxcli --server myESXi --username user1 --password 'my_password' storage nmp satp set --default-psp=VMW_PSP_RR --satp=VMW_SATP_DEFAULT_AA

PowerCLI example:
$esxcli = Get-EsxCli -VMHost "myESXi"
$esxcli.storage.nmp.satp.set($null,"VMW_PSP_RR","VMW_SATP_DEFAULT_AA")

Please note that for both examples the ESXi host needs a reboot for the change to take effect.
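
If you want to verify the change afterwards, you can list all SATPs together with their default PSPs (shown here with the same vCLI connection options as above; this is just a quick sanity check, not a required step):
esxcli --server myESXi --username user1 --password 'my_password' storage nmp satp list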

Saturday, October 26, 2013

VMware VCDX by the numbers

Brian Suhr had a great idea to summarize and publicly share available information about VMware's top certified experts, known as VCDXs (VMware Certified Design Experts).

It is a real motivation for others preparing for the VCDX.

The write-up is available here: http://www.virtualizetips.com/2013/09/27/vmware-vcdx-numbers/

Friday, October 25, 2013

DELL EqualLogic valuable resources on the web

I received an email from a DELL EqualLogic expert whose mail signature contained links to very valuable DELL EqualLogic web resources. Here they are:
Also see my other blog post, DELL Storage useful links.

I'm sharing these links here because I'm sure they will be of interest to other people.

Saturday, October 19, 2013

DELL is able to build CDN (Content Delivery Network) for telco providers

Are you surprised that DELL is able to build a CDN? Yes, it's true ... Dell, EdgeCast Shake Up Content Delivery Networks ...
"Every single teleco service provider globally is trying to build some kind of content delivery network," said Segil. The rapid expansion of the use of video, pictures, and multimedia text and graphics is putting a strain on network operators' capacity that would be relieved by effective use of a content delivery network. A film that is suddenly in demand from many parts of the world, for example, would be more effectively streamed from servers close to requestors than struggling to scale from one point.
... I know some people who cannot imagine that DELL can help customers with a CDN (Content Delivery Network). That's probably because DELL is well known as a PC & laptop manufacturer. However, that is no longer an accurate image of the modern DELL. DELL has been manufacturing and delivering enterprise gear (servers, storage, and networking) for almost 8 years, and DELL GICS (Global Infrastructure Consulting Services) provides infrastructure consulting services. DELL today has all the hardware components needed to build a CDN. A CDN is usually described as a special virtual network (aka VPN, tunnel, overlay) on top of the internet, optimized to deliver digital content (aka digital objects). To be more specific, DELL has a partnership with EdgeCast, which has a complete software solution leveraging commodity x86 hardware. Dell is producing a content delivery platform based on its PowerEdge servers and software from the number-three content delivery network service supplier, EdgeCast Networks. More information about the DELL and EdgeCast CDN solution is available here, here, here and here.

However, it is worth mentioning that before anybody builds their own CDN, it is very important to gather the business requirements, target users, and the content types to be delivered. Conceptual and logical architectures have to be prepared based on specific requirements and constraints. Different CDNs can be built for different purposes. And last but not least, the technical architecture must be fully aligned with the business model, and the investor must fully believe that the business forecast is achievable.


Wednesday, October 16, 2013

Out-of-band BIOS settings management

Today I did some troubleshooting with a customer. We needed to verify which NUMA setting was configured in the server's BIOS. In the past I posted more info about BIOS NUMA settings here. The customer sighed that he could not restart the server just to jump into the BIOS screen and look. My answer was ...

... it is not necessary to reboot the server, because you have modern gear that allows you to read BIOS settings via the out-of-band management card.

In our case we had a DELL PowerEdge R620 rack server with an iDRAC 7 management card. BIOS settings are not visible in the iDRAC web interface, so you have to use the CLI (aka racadm). There are several ways to use the racadm CLI, but IMHO the simplest method is to SSH to the iDRAC IP address and execute the command:

 racadm get bios.MemSettings 
You should get a result similar to the screenshot below.
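
If you are only after the NUMA-related setting, you can also query a single attribute directly from an SSH session to the iDRAC. Treat the attribute name below as an assumption; it may differ between BIOS versions, so list the whole BIOS.MemSettings group first if it is not found:
ssh root@<idrac-ip-address>
racadm get BIOS.MemSettings.NodeInterleave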



For more information look at DELL Tech Center.

Tuesday, October 15, 2013

iSCSI NetGear datastore issues

Yesterday I had a phone call from my neighbor, who works as a vSphere admin for a local system integrator. He was in the middle of an upgrade from vSphere 4.1 to vSphere 5.5 and was in trouble.

He decided to move to vSphere 5.5 not by an in-place upgrade but by running two environments: the legacy one (vSphere 4.1) and the new one (vSphere 5.5). Each environment had its own vCenter, and he used one iSCSI datastore connected to both environments as a transfer datastore. He called me because he was experiencing issues powering on a particular VM stored on the transfer datastore and registered on an ESXi 5.5 host managed by vCenter 5.5. When a VM power-on was initiated, it took some time and the task failed at, he told me, 25%.

I remember we discussed some time ago whether it was better to use vSphere 5.1 or go directly to the brand new vSphere 5.5. My answer was "it depends", but in the end we agreed that in a small environment it is possible to go directly to vSphere 5.5 and accept some risk. That's the reason why I felt a little bit guilty.

As we are neighbors, he came to my garden. He smoked several cigarettes, probably to organize his thoughts, and we discussed the potential root cause and other best practices, including migration possibilities. All those general ideas and recommendations were just best practices and hypotheses. In the end we agreed that we had to look at the log files to understand what was really happening and what issue he was experiencing.

I have to say I like troubleshooting ...  the first log file to check in such situations is obviously /var/log/vmkernel.log

As he is more Microsoft (GUI) than *nix (CLI) oriented, I guided him over the phone through enabling SSH, logging in to ESXi, and checking the log file.

When we ran the command
tail -f /var/log/vmkernel.log 
the troubleshooting was almost done. Lots of SCSI errors were being continuously logged into vmkernel.log. The SCSI errors included the following useful information:
H:0x0 D:0x2 P:0x0  SCSI sense keys: 0x0B 0x24 0x00
Let's translate the log information into human language ... the device returned the sense key "aborted command" (0x0B), and the additional sense code 0x24/0x00 is documented in the SCSI specification as "invalid field in CDB". In other words, the array was aborting or rejecting the commands sent to it.
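
If you prefer to pre-filter the log instead of watching the live stream, a simple grep in the ESXi shell does the job (the search pattern is only an illustration; adjust it to whatever your vmkernel.log entries actually contain):
grep -i "sense" /var/log/vmkernel.log | tail -n 20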
However, the root cause was obvious ... it was a storage-related issue. We tried to create a directory on the affected datastore and it took almost 30 seconds, which proved our assumption of a storage issue. The problematic datastore was backed by an iSCSI NetGear storage array. The same operation on another datastore, backed by different storage connected directly over SAS, was, of course, immediate.
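
By the way, the quick latency test we did can be reproduced from the ESXi shell; the datastore name below is just a placeholder for your own transfer datastore:
date; mkdir /vmfs/volumes/<transfer-datastore>/latency-test; date
If the two timestamps are tens of seconds apart, as in our case, the datastore is clearly struggling.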

So I asked him again (we had talked about the HCL at the beginning of our general discussion) whether he had checked the HCL, and he confirmed he had, but said he would double-check it. An hour later he sent me a message that the storage model is supported, but the firmware must be upgraded to work correctly with ESXi 5.5.

All my "ad-hoc consulting" was done just like quick help to friend of mine so I don't even know what NetGear iSCSI storage my neighbor has but I will ask him and update this post because it can help other people.

Update 10/16/2013:
I have been informed that the exact NetGear iSCSI storage model is a "NetGear ReadyNAS 3100". I checked the VMware HCL myself, and at the moment it is supported only for ESX 5.1 with firmware RAIDiator-x86 4.2.21. So I warned my neighbor that even if it works with the new firmware, this configuration will be unsupported. Another lesson from this: don't trust anybody and validate everything yourself :-)

So what is the conclusion of this story? Plan, plan, and plan again before any vSphere upgrade. Don't just check hardware models against the HCL; check the firmware versions as well. Modern hardware and operating systems (including hypervisors) are very software dependent, so firmware versions matter.

Wednesday, October 09, 2013

Two (2) or four (4) socket servers for vSphere infrastructure?

Last week I had an interesting discussion with customer subject matter experts and VMware PSO experts about using 2-socket versus 4-socket servers for VMware vSphere infrastructure in an IaaS cloud environment. I was surprised how difficult it is, in some cases, to persuade infrastructure professionals of 4-socket server benefits.

Although it seems like a pretty easy question, it is actually more complex if we analyze it deeper. It is a common question from many of my customers, and because the answer is a typical "it depends", I've decided to blog about it.

Let's start with some general statements:
  • 2-socket servers are designed and used for general business workloads
  • 2-socket servers are less expensive
  • 4-socket servers are very expensive
  • 4-socket servers are designed and used for high-performance and mission-critical workloads
  • a failure of a single 4-socket server node in a vSphere cluster has a bigger impact on cluster capacity
All these general statements are relative, so what is really better for a particular environment depends on the customer's requirements, the expected workload size, and the chosen hardware platform.

It is important to note that at the time of writing this post there are two types of 4-socket Intel servers on the market: servers with the Intel E7 CPU family and servers with the Intel E5-4600 family. Comparing the Intel E7-4870 (10 cores, 2.4 GHz) with an Intel E5-4650 (8 cores, 2.7 GHz), you'll find that the E5 server outperforms the E7 server in the following benchmarks:
  •  CAE
  •  SPECfp*_rate_base2006
  •  Numerical Weather
  •  Financial Services
  •  Life Sciences
  •  Linpack AVX
  •  SPECint*_rate_base2006
The E7 server outperforms the E5 server in the following benchmarks:
  •  java* Middleware
  •  OLTP Database
  •  Enterprise Resource Planning
The CPU family comparison is taken from here.

Intel E7 CPUs are designed for mission-critical workloads and the E5-4600 family for general workloads with big CPU performance requirements. E7 processors are therefore significantly more expensive, whereas the price difference between E5-4600 (4-socket) and E5-2600 (2-socket) servers is usually less than 10 or 20 percent, although it can vary among hardware vendors.


Server consolidation is the most common use case for server virtualization. Before any server consolidation it is highly recommended to do "AS-IS" capacity monitoring and "TO-BE" capacity planning with consolidation scenarios. There are plenty of tools for such an exercise, for example VMware Capacity Planner, PlateSpin Recon, CIRBA, etc. However, if we design a greenfield environment and there is no legacy environment that can be monitored, we have to define the expected average and maximum VM. So let's define the average and maximum workload we are planning to virtualize in a single VM.

Let's assume our typical VM is configured as
  • 1 vCPU consuming 333 MHz CPU
  • 1 vCPU consuming 1/3 of one CPU Thread
  • 4GB RAM
and the maximum VM (aka monster VM) is configured as
  • 8 vCPU
  • 128 GB RAM
So which physical servers should we use for virtualization in such an environment? E7 CPUs are significantly more expensive, so let's compare 2-socket servers with the E5-2665 (2.4 GHz) and 4-socket servers with the E5-4640 (2.4 GHz). Here are our server options in detail.

4S-SERVER: A single 4-socket server with E5-4640 CPUs (8 cores each) has 32 cores and 64 CPU threads (logical CPUs) when hyper-threading is enabled. Total CPU capacity is 76.8 GHz. From the RAM perspective it can accommodate 48 DIMMs (4 sockets x 4 channels x 3 DIMMs).

2S-SERVER: A single 2-socket server with E5-2665 CPUs (8 cores each) has 16 cores and 32 CPU threads (logical CPUs) when hyper-threading is enabled. Total CPU capacity is 38.4 GHz. From the RAM perspective it can accommodate 24 DIMMs (2 sockets x 4 channels x 3 DIMMs).

So at first look, 8 x 4-socket servers have the same compute capacity and performance as 16 x 2-socket servers, right? A 4-socket server can accommodate twice the number of DIMMs, so the total RAM capacity of 8 x 4-socket servers and 16 x 2-socket servers is also the same: a single 4S-SERVER holds 768 GB of RAM with 16 GB DIMMs (or 1536 GB with 32 GB DIMMs), exactly twice as much as a single 2S-SERVER.

If we build a vSphere cluster with 8 x 4S-SERVER or 16 x 2S-SERVER, we have the same total raw capacity and performance, but 16 x 2S-SERVERs will beat 8 x 4S-SERVERs in real available capacity, because when a single server fails we lose just 1/16 of the capacity and performance instead of 1/8.

Is it true or not?

Yes, from the memory perspective.
Yes and sometimes no, from the CPU performance perspective.

Let's concentrate on CPU performance and compare the DELL 2-socket server M620 (E5-2665/2.4 GHz) with the DELL 4-socket server M820 (E5-4640/2.4 GHz). We all know that 1 MHz on two different systems doesn't represent comparable performance, so the question is how to compare CPU performance. The answer is CPU normalization. Good ... but wait, how can we normalize CPU performance? The answer is a CPU benchmark. Good ... but which benchmark?

Below are different benchmark results for a single host, so based on these results we can discuss in depth which system is better for a particular environment. Please note that some benchmark results are not available or published, so I use results from similar systems. I believe that is accurate enough for our comparison.

2S-SERVER: M620 (E5-2665/2.4 GHz)
  • SPECint_rate2006: 611
  • SPECfp_rate2006: 467
  • VMmark: 5.1 (calculation: the 2x M620 (E5-2680) VMmark result of 10.20 @ 10 tiles, divided by 2)
  • SPECvirt_sc2013: 236.15 (calculation: the 1x HP DL380p G8 (E5-2690) SPECvirt_sc2013 result of 472.3 @ 27, divided by 2)
4S-SERVER: M820 (E5-4640/2.4 GHz)
  • SPECint_rate2006: 1080
  • SPECfp_rate2006: 811
  • VMmark: 10.175 (calculation: the 2x HP DL560 (E5-4650) VMmark result of 20.35 @ 18 tiles, divided by 2)
  • SPECvirt_sc2013: 454.05 (calculation: the 1x HP DL560 (E5-4650) SPECvirt_sc2013 result of 908.1 @ 53, divided by 2)
Note 1: DELL 4S-SERVER results for the VMware benchmarks are not published, so I use results for HP DL560 servers.
Note 2: Some SPECvirt_sc2013 results are not available for VMware vSphere, so I use results for Red Hat KVM.
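
For clarity, the "4S against 2x 2S" column in the table below is simply the single 4S result divided by twice the single 2S result. For SPECint_rate2006, for example, the arithmetic is (a trivial shell sketch just to show the calculation):
awk 'BEGIN { printf "%.2f%%\n", 100 * 1080 / (2 * 611) }'    # prints 88.38%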

Based on the results above I prepared a performance benchmark comparison table:

Benchmark    | 2x 2S | 1x 4S  | 4S against 2x 2S
SPECint      | 1222  | 1080   | 88.38%
SPECfp       | 934   | 811    | 86.83%
VMmark       | 10.2  | 10.175 | 99.75%
SPECvirt_sc  | 472.3 | 454.05 | 96.13%

So what does it mean? I read it this way: 2-socket servers are better for raw mathematical operations (integer and floating point), but for more real-life workloads 4-socket servers have generally the same performance as 2-socket servers, plus more cores/threads per single system.

BTW: It seems to me that CPU performance normalization based on SPECint and/or SPECfp is not fair to 4-socket servers. That is what PlateSpin Recon uses for CPU normalization.

We can say that there is essentially no per-MHz performance difference between our 2S-SERVER and 4S-SERVER. So what is the advantage of 4-socket servers based on E5-4600 CPUs? CPU performance is not only about MHz but also about CPU scheduling (aka multi-threading). The 4S-SERVER advantage is the bigger count of logical CPUs, which has a positive impact on co-scheduling the vCPUs of vSMP virtual machines. Although vCPU co-scheduling has been dramatically improved since ESX 3.x, some co-scheduling is required anyway. Co-scheduling executes a set of threads or processes at the same time to achieve high performance. Because multiple cooperating threads or processes frequently synchronize with each other, not executing them concurrently would only increase the latency of synchronization. For more information about co-scheduling look at https://communities.vmware.com/docs/DOC-4960

In our example we are planning to have monster VMs with 8 vCPUs, so 64 logical CPUs in the 4S-SERVER offer potentially more scheduling opportunities than 32 logical CPUs in the 2S-SERVER. As far as I know, the tiles used in virtualization benchmarks (tiles are groups of VMs) usually contain VMs with up to 2 vCPUs, so I think the co-scheduling issue is not covered by these benchmarks.

So the final decision depends on the expected number of monster VMs, which can affect the real performance of workloads inside these monster VMs. CPU overloading can be monitored with the ESXi metric %RDY (vCPU ready but no pCPU available) and co-scheduling execution delays with the metric %CSTP (vCPU stopped because of co-scheduling). Recommended thresholds are discussed here, but every environment has different quality requirements, so your thresholds can differ; it depends on what quality of SLA you want to offer and what type of applications you want to run on top of the virtual infrastructure.
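
If you want to collect these metrics over time rather than watching esxtop interactively, esxtop batch mode can capture them into a CSV for offline analysis; the 5-second interval and 12 samples below are arbitrary values chosen for illustration:
esxtop -b -d 5 -n 12 > /tmp/esxtop-capture.csv
The per-VM ready and co-stop counters can then be reviewed in the CSV, for example by importing it into Windows Perfmon.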

Anyway, co-scheduling of monster VMs is a serious issue for IaaS cloud providers, because it is really hard to explain to customers that fewer vCPUs can paradoxically give them better CPU performance. I call this phenomenon the "VIRTUALIZATION PARADOX".

The final hardware selection or recommendation always depends on the justification of the vSphere architect, who has to carefully analyze the specific requirements and constraints of the particular environment and reasonably justify the selected decision. We should remember there can be other requirements favoring a specific platform. An example of such an "other" requirement (sometimes a constraint) is the situation when blade servers are to be used. In 2-socket blade servers it is usually very difficult and sometimes even impossible to avoid a single point of failure on the NIC/HBA/CNA adapter. 4-socket blade servers are usually full height (dual slot) and therefore the I/O cards are doubled ... but that's another topic.