Category Archives: General

ModPagespeedLoadFromFile all CDN domains

While investigating a serious performance issue over the last 24 hours, I discovered that some of our CSS files were being combined but were not being rewritten by mod_pagespeed. After much hair pulling, and far too many hours spent in front of curl, vim, less and friends, I finally tracked down the solution: I added a ModPagespeedLoadFromFile directive for every CDN domain, and now all our resources are being properly rewritten.

I think, but I’m not sure, that mod_pagespeed tries to retrieve from the same domain as the request arrives on. So if you’re rewriting resources from example.tld onto cdn.xmpl.tld, when the request arrives at mod_pagespeed with a Host header of cdn.xmpl.tld, mod_pagespeed tries to look up every resource on the cdn.xmpl.tld hostname, instead of doing a reverse lookup through the ModPagespeedShardDomain directive.
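If that theory is right, the fix works because ModPagespeedLoadFromFile sidesteps the HTTP fetch entirely: mod_pagespeed reads the resource straight off disk. A sketch with illustrative domains and paths (ours differ), one directive per CDN domain:

```apache
# Map each CDN hostname back to the local docroot, so mod_pagespeed
# reads resources from the filesystem instead of trying (and failing)
# to fetch them over HTTP from the CDN hostname.
ModPagespeedLoadFromFile "https://cdn.xmpl.tld/"  "/var/www/example/htdocs/"
ModPagespeedLoadFromFile "https://cdn2.xmpl.tld/" "/var/www/example/htdocs/"
```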

Interestingly, this only seemed to affect our primary CSS file. Our print.css and javascript appeared to be unaffected. Strange. Very glad I can put this one to bed.


Graphing performance metrics

I’ve spent a couple of days researching options to graph performance metrics. We’re trying to get all our metrics from all our services into a single interface. Here are the pieces I found which seemed to fit best.

  • dygraphs – Javascript timeseries graphing library, looks excellent
  • InfluxDB + Grafana – Appears to be the best in class for metric storage + display
  • Graphite (+ Carbon) – Looks to be more complex to set up and a little less powerful than InfluxDB
  • StatsD – Pre-process high volume metrics before feeding to influx / carbon

Next step is to play with Influx + Grafana, either their hosted service, or our own, and see what we can push into it. More to follow…
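As a taster, StatsD’s line protocol is simple enough to speak from the shell. A sketch (the metric name is made up, and it assumes a StatsD daemon on localhost:8125, its default port):

```shell
# Build a metric in StatsD's line protocol: <name>:<value>|<type>
# ("c" = counter, "g" = gauge, "ms" = timing).
statsd_metric() {
  printf '%s:%s|%s' "$1" "$2" "$3"
}

# Fire-and-forget over UDP to a local StatsD daemon (default port 8125).
# Guarded so the sketch degrades gracefully where nc isn't installed.
if command -v nc >/dev/null 2>&1; then
  statsd_metric shop.pageviews 1 c | nc -u -w1 localhost 8125
fi
```

Because it’s UDP, the send never blocks the application, which is the whole point of putting StatsD in front of the storage layer.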

mod_rpaf and intermittent port errors

We’re using mod_rpaf for SSL offloading so we can cache all HTTPS requests with Varnish. This worked well in testing, but in production we’re seeing intermittent port errors: making 1,000 requests to the same URL over HTTP, several hundred show up in the Apache logs as port 443.
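For context, the setup looks roughly like this (a sketch using the mod_rpaf 0.6 directive names; the maintained fork renames them to RPAF_Enable, RPAF_ProxyIPs, and so on):

```apache
# Trust the X-Forwarded-For header from our local Varnish instance so
# Apache logs and modules see the real client, not the proxy.
RPAFenable       On
RPAFsethostname  On
RPAFproxy_ips    127.0.0.1        # Varnish sits on localhost
RPAFheader       X-Forwarded-For
```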

This causes all sorts of unexpected side effects. Particularly with mod_pagespeed which serves 404 when the port has been set incorrectly. Nightmare.

Bottom line, we’ve taken this out of our architecture until we can find a solution. The hardest part is that we can’t replicate the issue on staging. I’ve opened an issue.

SPDY, Googlebot, Alternate-Protocol and SEO

We deployed SPDY on our production sites on Monday. It’s hard to tell precisely, but it looks like our pages are being removed from Google search results.

Some background: we used the Alternate-Protocol: 443:npn-spdy/2 header. Our http pages are INDEX,FOLLOW and our https / SPDY pages are NOINDEX,NOFOLLOW.
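For reference, advertising SPDY this way is a one-liner on the plain-HTTP vhost. A sketch (assumes mod_headers is loaded):

```apache
# Tell SPDY-capable clients that the same content is available
# over SPDY on port 443.
Header set Alternate-Protocol "443:npn-spdy/2"
```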

Could it be that Googlebot follows the Alternate-Protocol header, loads the https version of the page, and then doesn’t index it because of the noindex tags?

Can’t find anything in Google about this issue. Anyone else have experience? I’ll try to post back here if we find anything more definitive than pure speculation…

All Magento pages from Varnish

I just realised I haven’t updated the site since our last big development. We’re now serving almost all of our pages from Varnish. Crude research suggests around 90% of our pageviews are now coming from Varnish. In simple terms, we did the following:

  • Render the cart / account links from a cookie with javascript
  • Ajax all pages, so everything can be cached (with a few exceptions we manually exclude)
  • Cache page variants for currencies and tax rate

We’re also warming / refreshing the cache using a bash script parsing the sitemap and hitting every url with a PURGE then a GET.
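For the curious, the warm-up script boils down to something like this (a sketch rather than our exact script; the sitemap URL is a placeholder, and it assumes Varnish is configured to accept PURGE requests from this host):

```shell
# Pull every <loc> entry out of a sitemap XML stream.
extract_urls() {
  grep -o '<loc>[^<]*</loc>' | sed -e 's/<loc>//' -e 's/<\/loc>//'
}

SITEMAP_URL="https://example.tld/sitemap.xml"   # placeholder

curl -s "$SITEMAP_URL" | extract_urls | while read -r url; do
  curl -s -o /dev/null -X PURGE "$url"   # evict any stale copy
  curl -s -o /dev/null "$url"            # GET it back so the cache is warm
done
```

Run from cron, this keeps every sitemap URL hot, so the first real visitor never pays the backend render cost.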

The hardest part of the whole performance space has been to measure the impact of our changes. But our TTFB was previously in the 300-500ms range for most pages, and now it’s in the 20-30ms range for pages that come from Varnish. I’m very confident that it’s impacting our bottom line.

All category pages from varnish

It’s a glorious day in the pursuit of ultra high performance on Magento. Today, we serve all our category pages from varnish. Plus, we artificially warm the cache to ensure that all category pages across all sites are already in the cache when a real user hits the sites.

Varnish typically takes our time to first byte from around 300ms – 400ms down to 20ms – 30ms. We were previously serving 80% of landing pages from varnish, but this change should improve overall performance by a noticeable margin. Happy days. 🙂

The implementation is fairly custom. Essentially, we’re adding a header to all pages which tells varnish whether the page can be cached or not. So on category pages that header says yes, on product pages it says no. We also did some custom coding to dynamically render the header links (My Cart, Login, Logout, etc) from a cookie. We set that cookie on login, add to cart, etc.
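The header check itself is only a few lines of VCL. A sketch, assuming Varnish 3 and a made-up X-Cacheable header name (ours may differ):

```vcl
# The application sets X-Cacheable: yes on category pages and no on
# product pages; Varnish just obeys it.
sub vcl_fetch {
    if (beresp.http.X-Cacheable == "yes") {
        unset beresp.http.Set-Cookie;   # cookies would make it uncacheable
        return (deliver);
    }
    return (hit_for_pass);
}
```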

varnishd: Child start failed: could not open sockets

I was banging my head against a wall trying to figure out this error:

varnishd: Child start failed: could not open sockets

I checked netstat -tlnp but nothing was listening on the target port or IP. Turns out, the IP was simply missing. ifconfig didn’t show that IP being up at all. DOH! Simple solution once I found the actual problem. Posting here because I couldn’t find much on this one online.
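In case it saves someone else the head-banging, a quick text check against `ip addr` output catches this class of problem (a sketch; the address is a placeholder):

```shell
# Return success if the given IP appears as a configured "inet" address
# in interface output fed on stdin (e.g. from `ip addr`).
ip_is_configured() {
  grep -q "inet $1[/ ]"
}

# Usage (192.0.2.10 stands in for the IP varnishd should bind to):
#   ip addr | ip_is_configured 192.0.2.10 || echo "IP not configured!"
```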