CITRIX XENDESKTOP 5 POC On A Vblock 300 – The Results Are In!
Hi,
During July, a lot of people gathered in Boston to run a pretty interesting POC: a pretty heavy 2,000-user workload running in a XenDesktop 5 environment provided by a Vblock 300 series. While I have already published some VNX results for CITRIX XenDesktop, I would like to use this post to also share some real-world considerations about the deployment.
The environment overview:
A Vblock 300 GX – a full public overview can be found at the link below:
VCE VBLOCK 300 SERIES OVERVIEW
ESX CPU Performance Considerations:
The ESX hosts that ran the VDI VMs were based on CISCO B200 blades with 96GB of RAM and 2.93 GHz CPUs. Each of these servers was able to run 70 users with a heavy workload without any problem at all. Each VM ran with 1 vCPU and 2GB of RAM, and the ESX balloon driver worked hard, but that's what it is supposed to do. In other words, we performed heavy memory overcommitment, which worked great and really provided added value on a server that theoretically doesn't have enough RAM (96GB) to host 70 users (70 VMs x 2GB RAM plus guest overhead). No VMkernel disk swapping was observed.
At 80 users on this server configuration, we started to see some high %RDY% values. LoginVSI didn't complain about them, but I thought they would be too high for a real-world implementation, so 70 remained the sweet spot.
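Just to put rough numbers on that memory overcommitment, here is a minimal back-of-the-envelope sketch (Python). The per-VM overhead value is an approximation (it varies with the vSphere version and VM configuration); everything else uses the figures above.

```python
# Back-of-the-envelope memory math for one B200 (assumption: ~138 MB of
# VMkernel overhead per 1 vCPU / 2 GB VM; real overhead varies with the
# vSphere version and VM configuration).

vms_per_host       = 70
vm_ram_gb          = 2.0
per_vm_overhead_gb = 0.138    # approximate per-VM memory overhead
host_ram_gb        = 96.0

configured = vms_per_host * vm_ram_gb             # RAM granted to the guests
overhead   = vms_per_host * per_vm_overhead_gb    # per-VM VMkernel overhead
demand     = configured + overhead

print(f"Configured guest RAM : {configured:.1f} GB")
print(f"Per-VM overhead      : {overhead:.1f} GB")
print(f"Total demand         : {demand:.1f} GB vs {host_ram_gb:.0f} GB physical")
print(f"Overcommit ratio     : {demand / host_ram_gb:.2f}:1")
# -> roughly 150 GB of demand on a 96 GB host (~1.56:1), absorbed in this POC
#    by TPS and ballooning, with no VMkernel swapping observed.
```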
Below: the environment running 2,000 users, 70 users per B200. The reason you see 2,500 users is that at some point someone decided to stress the environment even further.. I love this job!
Below: the RAM usage.
Below: a B200 with 70 users running the heavy workload profile; note that the %RDY% is still within the "OK" limit.
Below: a B200 with 80 users running the heavy workload profile; the %RDY% is starting to look bad..
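As a side note on reading those %RDY numbers: esxtop reports %RDY summed across a VM's vCPUs, and a commonly cited rule of thumb is to start worrying somewhere around 10% per vCPU. The threshold and the sample values in this sketch are my assumptions, not figures from the test:

```python
# Flag VMs whose esxtop %RDY looks high. esxtop shows %RDY summed across a
# VM's vCPUs, so normalise it per vCPU first. The 10% threshold is only a
# commonly cited rule of thumb; pick your own limit.

READY_THRESHOLD_PER_VCPU = 10.0   # percent per vCPU

def flag_cpu_contention(vms):
    """vms: iterable of (name, aggregate %RDY, vCPU count) tuples."""
    for name, rdy_total, vcpus in vms:
        rdy_per_vcpu = rdy_total / vcpus
        status = "contended" if rdy_per_vcpu > READY_THRESHOLD_PER_VCPU else "ok"
        print(f"{name:12s} %RDY/vCPU = {rdy_per_vcpu:5.1f} -> {status}")

# Hypothetical sample values (the desktops in this POC were 1 vCPU, so the
# aggregate and per-vCPU numbers are the same):
flag_cpu_contention([("win7-desk01", 4.2, 1), ("win7-desk02", 13.7, 1)])
```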
We then started to experiment with a B230 server, which has far more RAM (192GB to be exact) and did carry more user workload – around 83 users – but the TCO for this type of server vs. the B200 isn't worth it, so B200 it is!!
Anti Virus Considerations:
OK, this is a very hot topic to discuss. There are a lot of new "specific to VDI" AV solutions out there; the one that was used here wasn't based on the vShield Endpoint API, and boy, you could tell.
Attached below, you can see a B200 that was perfectly capable of running 70 users before AV; now it runs 70 users with AV on, and %RDY% reared its ugly head again. So PLEASE, PLEASE, PLEASE make sure you evaluate a proper AV solution, or otherwise you are going to pay (literally speaking) a lot for the overhead that a non-VDI AV will bring to your environment.. To be fair to the AV vendor that was used here, I won't mention the company name, as the customer turned to them and they said they are working on a better solution..
Storage Considerations:
So, it's one thing to preach to your customers about EMC FAST Cache:
Permanent link to CITRIX XenDesktop 5 on EMC VNX – Match made in Heaven (Part1)
and here:
Permanent link to CITRIX XenDesktop 5 on EMC VNX – Match made in Heaven (Part2)
and it's another thing to actually eat your own dog food, with no safety net, and see the numbers in real action. So let's start.
Reads / Writes:
VDI workloads (as opposed to the common belief) tend to have a very high write percentage. How much exactly? Well, it depends; I've seen numbers varying from 40-60% for writes, so make sure your storage array cache supports both read AND write caching.
In the figure below, you can see writes ranging from 40-60% during the test.
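To see why that write percentage matters so much, here is a hedged sizing sketch. The per-user IOPS figure and the RAID 5 write penalty are assumptions for illustration only; the actual pool layout and steady-state IOPS in this POC may have been different:

```python
# Estimate back-end disk IOPS for a VDI steady state, before any cache help.
# All inputs are illustrative assumptions; plug in your own numbers.

def backend_iops(users, iops_per_user, write_pct, write_penalty=4, read_penalty=1):
    """write_penalty=4 corresponds to RAID 5; use 2 for RAID 10, 6 for RAID 6."""
    frontend = users * iops_per_user
    writes = frontend * write_pct
    reads = frontend - writes
    return reads * read_penalty + writes * write_penalty

for write_pct in (0.4, 0.5, 0.6):   # the 40-60% write range seen in the test
    total = backend_iops(users=2000, iops_per_user=10, write_pct=write_pct)
    print(f"write ratio {write_pct:.0%}: ~{total:,.0f} back-end IOPS (RAID 5, no cache)")

# The heavier the write mix, the more a write-capable flash cache such as
# FAST Cache pays off, because it absorbs those expensive writes before they
# ever hit the spinning disks.
```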
Let's see some more numbers, from the FAST Cache perspective:
Booting 2,000 VMs simultaneously on the VNX5700 (16:32)
80K IOPS in total (40K per SP).
A great response time!
A great FAST Cache utilization (yes, 1.000 actually means nearly 100% of the writes were cached, while almost 87% of the reads were absorbed by the FAST Cache.. yep, I know it sounds crazy, but in a good way!)
Total SP utilization: both SPs remained well below 80% (great!!!). In fact, if you take a closer look, you can see that the average utilization was more in the region of 60-65%. Also, as a real-life consideration, it is very rare that all users will work at 100% concurrency, not to mention that LoginVSI simulates users doing heavy tasks again and again and again… My bottom line is that this VNX is far more capable in a real-life scenario, and you can probably host far more users..
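Putting the boot-storm numbers together, here is a rough estimate of how much of that 80K IOPS actually had to be serviced by the spinning disks. The 50/50 read/write split is my assumption (the test showed 40-60% writes), and destaging of cached writes is ignored:

```python
# Rough estimate of the boot-storm IO that still had to hit the spinning
# disks, given the hit rates shown above. The 50/50 read/write split is an
# assumption, and background destaging of cached writes is ignored.

frontend_iops  = 80_000   # total across both SPs during the 2,000-VM boot
write_ratio    = 0.50     # assumed split (the test showed 40-60% writes)
write_hit_rate = 1.00     # ~100% of writes absorbed by FAST Cache
read_hit_rate  = 0.87     # ~87% of reads served from FAST Cache

writes = frontend_iops * write_ratio
reads  = frontend_iops - writes
disk_iops = writes * (1 - write_hit_rate) + reads * (1 - read_hit_rate)

print(f"IOPS reaching the disks : ~{disk_iops:,.0f} "
      f"({disk_iops / frontend_iops:.0%} of the front-end load)")
print(f"Per desktop during boot : ~{frontend_iops / 2000:.0f} front-end IOPS")
```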
Network Considerations:
Network Performance Analysis – Ethernet/IP Network.
We used the LAN topology below for Vblock internal and external connectivity..
Key Highlights:
Overview:
A snip showing 640 Mbps from the test:
Key findings and conclusions:
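If the 640 Mbps snip above represents the aggregate desktop traffic for the whole test (my assumption), the per-user LAN bandwidth works out to a pretty modest number:

```python
# Per-user LAN bandwidth, assuming the 640 Mbps shown in the snip is the
# aggregate desktop/ICA traffic for all active sessions (an assumption).

aggregate_mbps = 640
users = 2000

per_user_kbps = aggregate_mbps * 1000 / users
print(f"~{per_user_kbps:.0f} kbps per heavy-profile session")   # ~320 kbps
```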
Network Performance Analysis – Fibre-Channel/Storage Area Network.
Pushing the envelope..
At some point we wanted to push the storage to hold more than 2,000 users, so we loaded the Vblock with 2,500 users:
Now, one would expect adding a quarter of the original load to add at least 25% to the SP utilization, right?
Wrong!
Below, you can see the SP utilization: only a slight increase over the original 2,000-user workload. This is thanks to Mr. FAST Cache.
Below, you can see the FAST Cache utilization, almost 100%.. now that's really cool (at least in my mind..)
So, in this post I wasn't trying to cover all the aspects, just to show you some of the highlights that a Vblock can offer to your VDI environment, be it CITRIX or VMware; both of them run a very similar workload, CITRIX MCS works very similarly to VIEW Linked Clones, and the user / CPU core ratio works out the same. The end result is that a Vblock will ALWAYS generate the same results over and over again, in the same way you know that buying a given car model from one place or another will work the same.
Different Vblocks can also be managed from one console (UIM), which allows you to quickly deploy different Vblocks from one interface. Imagine this: you have 6 Vblocks that are all used for VDI with similar workloads running in different sites; you can create one service offering and basically clone it to the remote Vblocks, let UIM do the heavy lifting for you (SAN, NAS and ESXi deployments), and you are done!
Credits:
This type of work is never a one-man mission; a lot of people were involved in making this POC a successful one. I would like to give some credit to:
Miri Weiss Korn – EMC TC
Max (Hi Guys, this is max speaking!) Fishman – EMC TC
Gadi Feldman – CITRIX Consultant
Until next time..
Itzikr,
Great info! In the middle of a competitive VDI opp and was using B230s. Going to rerun the configs based on this to see how we end up.
Thanks
very good work, thanks for sharing 🙂
you're welcome crazy boy..
Very nice, what was the VNX array config?
Great Post! I was wondering how much fast cache was setup to support 2000 users?
How many B200’s were used?
28 for the VM workload (not including management servers, failover capacity, etc..)
I wonder if the OS used was 32 or 64-bit and how much memory saving was due to TPS?
Hi Duncan, the OS was Win7 64-bit. Let's do the math together: 70 VMs, each with 2GB of RAM = 140GB. Add each VM's CPU/RAM overhead, which for a 1 vCPU / 2048MB configuration is around 137.81MB: 137.81MB x 70 = 9,646.7MB. So even without the ESX VMkernel overhead, we are talking about pushing roughly 150GB of RAM workload onto a B200 with 96GB of RAM..
Do you have the esxtop memory data? would be cool to see those as well 🙂
Hi Itzik, thank you for using Login VSI!
Any info / links on pricing?
Great post!
Good one.. Did you use MCS or XenApp in this test?
Hi,
We used MCS (the CITRIX version of "linked clones") and it worked great! It was the first time we (and CITRIX) tested MCS with more than 2,000 users; FAST Cache helped a lot here. We didn't use XenApp or App-V; all the applications were installed in the replica (and therefore, "in" the MCS VMs).
Hi itzikr, great post, thanks for the information. Just wondering, was Intel SpeedStep on the processors disabled? UCS blades have it enabled by default, and we are currently having problems trying to disable it on B230's (B200's are no problem to disable). See this blog http://www.unidesk.com/blog/speedstep-and-vdi-it-good-thing-not-me . We have seen exactly what the above blog describes on our B230's and B200's regarding performance. Cheers
Is it possible to send me or publish some information about the structure of the Vblock 300 GX? Thanks for your cooperation 🙂
Is this with MCS or PVS????
MCS.. Traditionally speaking, MCS requires more IOPS, but when using EMC FAST Cache this goes away, and you end up with the administrative benefits of MCS vs. PVS.
Nice POC analysis. Was wondering if you have tried anything for a site failover with minimal infrastructure available at a DR site.
Hi Jerry, we didn't test any DR solution such as SRM.
Great and very informative post Itzikr. Awaiting more on VDI AV solutions.
Itzikr, great blog, very informative!
Thanks Mike,
glad you found it useful
Hey Itzik,
Sent you a mail regarding VMware View RAs etc. recently. I had been briefed that the customer wanted to deploy VMware View; when visiting the customer I noted that they actually wanted CITRIX XenDesktop with vSphere provided by Vblock infrastructure… Wanted to thank you for the above as it really helped me deliver a great presentation…
🙂 Thanks again
Glad it helped!
Also, this testing was with vSphere 4.1, any work/results/thoughts concerning vSphere 5?
Hi,
vSphere 5.0 won't change the results much; however, when CITRIX XD takes advantage of CBRC…