SPDY, Googlebot, Alternate-Protocol and SEO

We deployed SPDY on our production sites on Monday. It’s hard to tell precisely, but it looks like our pages are being removed from Google search results.

Some background: we send the Alternate-Protocol: 443:npn-spdy/2 response header. Our http pages are INDEX,FOLLOW and our https / SPDY pages are NOINDEX,NOFOLLOW.
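
To make the setup concrete, these are the moving parts (an illustration of the shape, not a copy of our exact config):

    # HTTP response header on the http pages, advertising SPDY on port 443:
    Alternate-Protocol: 443:npn-spdy/2

    <!-- robots meta tag on the http pages: -->
    <meta name="robots" content="INDEX,FOLLOW">

    <!-- robots meta tag on the https / SPDY pages: -->
    <meta name="robots" content="NOINDEX,NOFOLLOW">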

Could it be that Googlebot follows the Alternate-Protocol header, loads the https version of the page, and then doesn’t index it because of the noindex tags?

I can’t find anything on Google about this issue. Anyone else have experience? I’ll try to post back here if we find anything more definitive than pure speculation…

Firefox SPDY alternate-protocol

Firefox does not support the Alternate-Protocol header part of SPDY. I’m not 100% confident of this, but from scanning this and reading this, that’s my understanding.

I couldn’t find a definitive answer to this question, so I’m posting this in the hope of saving others the search time. If you have information to the contrary, or if the situation changes, please let me know in the comments and I’ll update this post.

This raises the question: how do we deploy SPDY for Firefox users? Do we redirect all traffic to SSL anyway? Redirect only Firefox browsers that we think support SPDY? Only use SPDY for Chrome users? I’ll post more once we make a decision…

Track performance by user?

Thanks to the awesome folks at SOASTA we’re now using their mPulse system instead of our own boomerang install. This gives us two major wins. First, we’re including the tracking code via the non-blocking asynchronous iframe method, which gives the best possible performance at this point. Second, we can actually see the data. Previously we just weren’t getting visibility into our boomerang data: we had the data, but weren’t using it, which was a total waste.
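
For the curious, the iframe method looks roughly like this. This is a minimal sketch of the technique, not SOASTA’s actual snippet, and the script URL is a placeholder:

    (function() {
      // Create an invisible iframe; scripts loaded inside it don't block
      // the parent page's rendering or onload event.
      // (Assumes the snippet runs once <body> exists.)
      var iframe = document.createElement('iframe');
      iframe.style.cssText = 'width:0;height:0;border:0;display:none';
      document.body.appendChild(iframe);

      // Load the beacon script from the iframe's own onload, so even a
      // slow tracking server can't delay the page being measured.
      var doc = iframe.contentWindow.document;
      doc.open();
      doc.write('<body onload="var s=document.createElement(\'script\');' +
        's.src=\'https://example.com/boomerang.js\';' +  // placeholder URL
        'document.body.appendChild(s);">');
      doc.close();
    })();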

Looking at our stats today, mPulse tracks the median page load time. I was looking at the data and thinking, I wonder what it looks like per user. For example, I wonder if users with faster connections typically hit more pages. If they do, their fast pageviews are over-represented in the data, which means the median user’s average load time is actually higher than our median page load time.

Take two users, Alice and Bob. Alice is on her desktop in London with a 100Mbps line. She visits 8 pages. The average load time for her 8 pageviews is 1.2s. Bob is on his iPad over 3G in Alabama. (We’re in the UK, so London is closer!) Bob visits 4 pages. The average load time for the 4 pages is 2.3s. Now our arithmetic mean is somewhere in between the two, but our median, in this case, is one of Alice’s pageviews.

What would be really interesting is to group pageviews by user: count up all the Alices and Bobs, then calculate the median (and 95th and 99th percentiles) of their averages. That actually tells me: 50% of our users saw an average page load of <1.4s, and 95% saw <8s.
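
To make that concrete, here’s the Alice and Bob example worked through in a few lines of javascript; the individual timings are invented, chosen to average 1.2s and 2.3s:

    // Median of a list of load times.
    function median(xs) {
      var s = xs.slice().sort(function(a, b) { return a - b; });
      var mid = Math.floor(s.length / 2);
      return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
    }

    var alice = [1.0, 1.1, 1.1, 1.2, 1.2, 1.3, 1.3, 1.4]; // 8 pageviews, mean 1.2s
    var bob   = [2.0, 2.2, 2.4, 2.6];                     // 4 pageviews, mean 2.3s

    median(alice.concat(bob)); // 1.3  - per-pageview median, dominated by Alice
    median([1.2, 2.3]);        // 1.75 - per-user median: each user counts once

The per-pageview median lands on one of Alice’s loads (1.3s), while the per-user median is 1.75s, which is exactly the gap I’m wondering about.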

Having said all that, the data might actually look very similar to what we’re currently seeing. I’ll try to dig out some of our archived boomerang data, do some analysis, and post an update once I have more info.

All Magento pages from Varnish

I just realised I haven’t updated the site since our last big development. We’re now serving almost all of our pages from Varnish. Crude research suggests around 90% of our pageviews are now coming from Varnish. In simple terms, we did the following:

  • Render the cart / account links from a cookie with javascript (see the sketch after this list)
  • Ajax all pages, so everything can be cached (with a few exceptions we manually exclude)
  • Cache page variants for currencies and tax rate
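
The first point is what makes the rest possible: every visitor gets an identical cached page, and a small script personalises the header client-side. A rough sketch, with an invented cookie format (our real implementation differs):

    // Hypothetical cookie set by the server on login / add-to-cart:
    //   frontend_state=cart_count=3&logged_in=1
    function readCookie(name) {
      var m = document.cookie.match(new RegExp('(?:^|; )' + name + '=([^;]*)'));
      return m ? decodeURIComponent(m[1]) : null;
    }

    var state = {};
    (readCookie('frontend_state') || '').split('&').forEach(function(pair) {
      var kv = pair.split('=');
      state[kv[0]] = kv[1];
    });

    // Patch the cached, anonymous header with the user's real state.
    document.getElementById('cart-link').textContent =
      'My Cart (' + (state.cart_count || 0) + ')';
    document.getElementById('account-link').textContent =
      state.logged_in === '1' ? 'Log Out' : 'Log In';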

We’re also warming / refreshing the cache using a bash script that parses the sitemap and hits every URL with a PURGE then a GET.
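
In outline, the warmer does something like this; the sitemap URL is a placeholder, and it assumes a VCL that turns PURGE requests into cache invalidations:

    #!/bin/bash
    # Warm the cache: purge then re-fetch every URL in the sitemap.
    SITEMAP="https://www.example.com/sitemap.xml"

    curl -s "$SITEMAP" |
      grep -oP '(?<=<loc>)[^<]+' |   # extract URLs (GNU grep, PCRE mode)
      while read -r url; do
        curl -s -o /dev/null -X PURGE "$url"  # evict any stale copy
        curl -s -o /dev/null "$url"           # fetch again so Varnish re-caches it
      done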

The hardest part of this whole performance effort has been measuring the impact of our changes. But our TTFB was previously in the 300-500ms range for most pages, and now it’s in the 20-30ms range for pages that come from Varnish. I’m very confident that it’s impacting our bottom line.

All category pages from Varnish

It’s a glorious day in the pursuit of ultra-high performance on Magento. Today, we serve all our category pages from Varnish. Plus, we artificially warm the cache to ensure that all category pages across all sites are already in the cache when a real user hits the sites.

Varnish typically takes our time to first byte from around 300ms – 400ms down to 20ms – 30ms. We were previously serving 80% of landing pages from Varnish, but this change should improve overall performance by a noticeable margin. Happy days. 🙂

The implementation is fairly custom. Essentially, we’re adding a header to all pages which tells Varnish whether the page can be cached or not. So on category pages that header says yes; on product pages it says no. We also did some custom coding to dynamically render the header links (My Cart, Login, Logout, etc.) from a cookie, which we set on login, add to cart, and so on.
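
On the Varnish side, honouring that header looks something like this sketch (VCL 3.x syntax; the header name X-Cacheable and the TTL are illustrative, not our production config):

    sub vcl_fetch {
        if (beresp.http.X-Cacheable == "yes") {
            # Category pages: drop any cookies and cache the response.
            unset beresp.http.Set-Cookie;
            set beresp.ttl = 1h;
            return (deliver);
        }
        # Product pages, cart, checkout, etc: don't cache.
        return (hit_for_pass);
    }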

varnishd: Child start failed: could not open sockets

I was banging my head against a wall trying to figure out this error:

varnishd: Child start failed: could not open sockets

I checked netstat -tlnp but nothing was listening on the target port or IP. Turns out, the IP was simply missing: ifconfig didn’t show that IP being configured at all. DOH! Simple solution once I found the actual problem. I’m posting this here because I couldn’t find much about this error online.
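
For anyone chasing the same thing, the checks boil down to something like this; the address and port here are placeholders:

    # Anything already listening on the port varnishd wants?
    netstat -tlnp | grep ':6081'

    # Is the IP varnishd is told to bind to (-a) actually configured?
    ifconfig | grep '192.0.2.10'

    # In my case it wasn't: add it (or fix the interface config), then retry.
    ip addr add 192.0.2.10/24 dev eth0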