More detailed CloudFlare analysis

Following my last post about CloudFlare, I ran some further benchmarks in response to the feedback from their team. Here’s the summary:

  • CloudFlare only combines our JavaScript files, not our CSS, despite what it said on the tin (the site has since been updated; when we signed up it said JS & CSS).
  • This only happens for some user agents on some operating systems, and CloudFlare will not give me a list of which user agents.
  • On browsers where this is enabled we see a marked improvement; where it’s not, we see no gain or a small loss.
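As a rough probe of which user agents get the treatment, one could fetch the page as different browsers and count RocketLoader’s rewritten script tags (it marks deferred scripts with type="text/rocketscript"). The URL and UA strings below are placeholders, not from my tests:

```shell
#!/bin/bash
# Hypothetical probe: request the same page with different User-Agent strings
# and count how many script tags RocketLoader has rewritten.
# The URL and user-agent strings are placeholders.
URL="http://www.example.com/"
for ua in \
	"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 Chrome/14.0 Safari/535.1" \
	"Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20100101 Firefox/6.0"
do
	count=$(curl -s -A "$ua" "$URL" | grep -c 'text/rocketscript')
	echo "$count rewritten scripts for: $ua"
done
```

A user agent that reports zero rewritten scripts presumably isn’t getting the RocketLoader treatment.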

Graphs

I ran two sets of tests (using only browsers where RocketLoader is enabled).

[Graph showing marked improvement with CloudFlare on]

[Second graph showing marked improvement with CloudFlare on]

Conclusion

We probably won’t implement CloudFlare across all our sites. I might still experiment on one of our higher-traffic sites now that we’re running boomerang and gathering real user data to compare. However, the black-box nature of CloudFlare fundamentally leaves me feeling uneasy.

The product appears to be in beta, which wasn’t clear when we signed up. I thought it was a polished product ready for production use, but much of the support chat amounts to “RocketLoader is in beta”, “no list of user agents at this time”, and so on.

Bottom line, CloudFlare hasn’t done what I expected. I’ll test mod_pagespeed and we’ll probably go with that, pending any major roadblocks.


Benchmarking Rackspace dedicated vs cloud

People keep telling me that Magento performs better on dedicated hardware. I haven’t been able to find any numbers to support this, but I’ve heard it so often it’s either a very popular myth, or it’s true.

Now that our Rackspace dedicated box is online, I’m trying some benchmarks to put some numbers against the comparison. I wanted to test different operating systems and server sizes, so I booted 10 servers: one each of Ubuntu 10.04 LTS and RHEL 5.5 at 0.5, 1, 2, 4 and 8 GB.

In order to automate the setup, I ran the following command on the Ubuntu boxes:

mkdir .ssh && chmod 700 .ssh && echo "ssh-rsa <<snip>>" > .ssh/authorized_keys && chmod 600 .ssh/authorized_keys && locale-gen en_GB.UTF-8 && update-locale LANG=en_GB.UTF-8 && apt-get update && apt-get --yes dist-upgrade && apt-get --yes install sysbench screen && reboot

Then on the RHEL boxes, something vaguely similar:

mkdir .ssh && chmod 700 .ssh && echo "ssh-rsa <<snip>>" > .ssh/authorized_keys && chmod 600 .ssh/authorized_keys && yum -y update && rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-4.noarch.rpm && yum -y install screen sysbench && reboot

To automate the actual testing, I created a script bench.sh and uploaded it to each of the servers. It’s a simple nested for loop to run each test 3 times.

#!/bin/bash

for threads in 1 4 8 16 32 64
do
	for r in 1 2 3
	do
		sysbench --num-threads=$threads --test=cpu run > sysbench_cpu_${threads}_threads_$(date +%Y-%m-%d_%H-%M-%S).log
		sleep 30
	done

	for r in 1 2 3
	do
		sysbench --num-threads=$threads --test=memory run > sysbench_memory_${threads}_threads_$(date +%Y-%m-%d_%H-%M-%S).log
		sleep 30
	done

	sysbench --num-threads=$threads --test=fileio --file-test-mode=rndrw prepare

	for r in 1 2 3
	do
		sysbench --num-threads=$threads --test=fileio --file-test-mode=rndrw run > sysbench_fileiorndrw_${threads}_threads_$(date +%Y-%m-%d_%H-%M-%S).log
		sleep 30
	done

	# remove the test files created by prepare before the next thread count
	sysbench --num-threads=$threads --test=fileio --file-test-mode=rndrw cleanup
done

Then I connected to each server, uploaded the script (actually copied and pasted it into vim, which seemed quicker), made it executable with chmod +x, and ran it. The scripts are running now on 10 machines…
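The copy-and-paste step could itself be scripted; something along these lines (host names are made up, not our real ones) would push bench.sh to each box and start it detached under screen:

```shell
#!/bin/bash
# Hypothetical upload/run automation; substitute your own host names or IPs.
HOSTS="ubuntu-512m ubuntu-1g ubuntu-2g ubuntu-4g ubuntu-8g \
       rhel-512m rhel-1g rhel-2g rhel-4g rhel-8g"
for h in $HOSTS; do
	scp bench.sh "root@$h:bench.sh"
	# -dmS starts a detached, named screen session, so the benchmark
	# keeps running after the SSH connection closes
	ssh "root@$h" "chmod +x bench.sh && screen -dmS bench ./bench.sh"
done
```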

Results

I started writing this post about 2 months ago and haven’t yet published it. The bottom line was that memory comparisons were roughly even between the virtual and physical environments. However, disk IO was hugely variable on the virtual hardware: at the top end it was comparable to the dedicated hardware; at the bottom end, about 10% of it.

I didn’t notice any difference between operating systems, but I didn’t look for it very hard either. The results for the cloud servers were all over the place, while the dedicated box was very consistent.
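If you want to crunch the logs yourself, the throughput figure can be pulled out with a quick grep. This assumes the sysbench 0.4 summary format, where the fileio result ends with something like “Total transferred 156.25Mb (5.2083Mb/sec)”; other versions may need a different pattern:

```shell
#!/bin/bash
# Extract the Mb/sec figure from each fileio log.
# Pattern assumes sysbench 0.4-style output; adjust for other versions.
for f in sysbench_fileiorndrw_*_threads_*.log; do
	rate=$(grep -o '[0-9.]\+Mb/sec' "$f" | head -1)
	echo "$f: $rate"
done
```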

My takeaway result was that disk is unpredictable in the cloud. If you’d like to see the actual results to make a more detailed analysis, let me know in the comments and I’ll dig out the numbers. For now I’m going to finally publish this! 🙂

CloudFlare slowed down our site

I deployed CloudFlare onto one of our sites today. I wanted to see, hands on, exactly how it works, so I ran a simple benchmark: 3 tests, each run 4 times. The three tests were all from the same location, at 3 different network speeds. The 4 runs covered each configuration: CloudFlare enabled or disabled, with static assets served from 3 domains or 1. Here are the results:

In every case, the site was slower with CloudFlare than without. I ran the numbers for every single comparison. Some were very close, with only 0.02 or 0.03 seconds in it, but CloudFlare did not come out ahead a single time.

It could be a network issue. Maybe the test server (WebPageTest.org / Gloucester) is very close to our server and far from the CloudFlare servers. But even so, I’d have expected some performance gain from all the “magic” CloudFlare is supposed to do.
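One way to sanity-check that theory would be comparing round-trip times from the test location to our origin and to the CloudFlare edge. A sketch, with placeholder host names standing in for our real ones:

```shell
#!/bin/bash
# Hypothetical latency check: origin.example.com stands in for our server's
# direct address, www.example.com for the CloudFlare-proxied hostname.
for host in origin.example.com www.example.com; do
	echo "== $host =="
	# the last two lines of ping output carry the packet-loss and rtt summary
	ping -c 5 "$host" | tail -2
done
```

A much lower RTT to the origin than to the CloudFlare edge would support the “test server is close to us” theory.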

I’ll update with further tests. I’m also going to email CloudFlare and ask for their comments. I’ll post anything salient here.

[ Update: I’ve published some further test results here. ]