Archive for the ‘Web Infrastructure’ Category

Google+ API Wishlist

Sunday, October 23rd, 2011

I was a very early adopter of Google+. Today I’ve basically disabled my Twitter account, and my Facebook account remains open only to manage a few advertising campaigns and applications. I’ve used Google+ as my primary social outlet since late June. Initially I started to write a scraper to fix a few things I didn’t like about Google+, but Google did mention that an API was coming. The Google+ API currently offers read-only access to your account and surely needs improvement.

While Games do appear to have access to these APIs, releasing them to the general public so that people can create their own apps would be greatly appreciated. I understand the complexity of writing an API and getting it right the first time, but I’d like to put forward a list of items that would be helpful.

Post to Stream
  Circles/Public/People
  Notification list. Perhaps each entry in the list of circles for the post
    is a tuple that can turn notification on for that circle or person. If
    Public is passed a notification flag, ignore it silently. Alternatively,
    accept a second list of notification targets.
  Content of post
  Attached media object(s): Picture URL, Gallery URL, Link URL, Video tag/URL.
    Currently Google+ only supports a single attached object, but why not
    plan for the future here? Options might include a preview thumbnail or
    fullsize image inserted into the stream.
  Email People not yet using Google+, default to false/no.
Get Circles
  return a list of the circles that the user currently has
Get Members in Circles
  return a list of the members in a circle. If no circle name passed, return
  list of circles with members. Pagination if too large?
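
To make the ‘Post to Stream’ item concrete, here is a minimal Python sketch of what such a request might carry. Everything in it is hypothetical: the class name, the field names and the `PUBLIC` constant are invented for illustration and do not correspond to any real Google+ API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

PUBLIC = "public"  # hypothetical name for the Public target


@dataclass
class StreamPost:
    """Hypothetical payload for a 'Post to Stream' call; not a real API."""
    content: str
    # Each target is (circle_or_person, notify), as suggested in the wishlist.
    targets: List[Tuple[str, bool]] = field(default_factory=list)
    media_urls: List[str] = field(default_factory=list)
    email_non_members: bool = False  # wishlist default: false/no

    def notification_targets(self) -> List[str]:
        """Return targets to notify, silently ignoring a flag set on Public."""
        return [name for name, notify in self.targets
                if notify and name != PUBLIC]


post = StreamPost(
    content="Weekly IPv6 statistics",
    targets=[(PUBLIC, True), ("Family", True), ("Coworkers", False)],
)
print(post.notification_targets())
```

Keeping the notify flag next to each target avoids a second list that could drift out of sync with the first, which is the trade-off between the two options in the wishlist.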

What would be nice for the Google+ API

Add Member to Circle
  Add a member ID to a particular circle
Delete Member from Circle
  Delete a member ID from a circle
Add Circle
Delete Circle

Personally, adding members to circles would greatly simplify the manual process behind http://plus.cd34.com/, but I understand the obvious spam implications here.

With even the basic functionality listed above, even if we couldn’t attach objects, we could have our blogs post to Google+ or have our favorite desktop/webtop software post to Google+, making it one of the ‘Big Three’ rather than the duopoly the social media world currently has.

I would love to have the ability to post to Google+ from certain apps that I have running locally. I used to tweet weekly statistics from my IPv6 traffic tracker: the percentage of email received over IPv6, IPv6 traffic volumes and other such data. I also set up a small project that I thought was fun – replaying historic events synchronized to the actual event so that people could follow along. At present, there is no easy way to do this on Google+. Knowing what application published to the stream would also be very helpful – allowing developers to customize the ‘posted by’ line. When someone sees a post, they would know if it was automated or entered through the web client.

As a hobbyist, I’d love to see a slightly expanded API.

Finding my XFS Bug

Thursday, October 6th, 2011

Recently one of our servers had some filesystem corruption – corruption that has occurred more than once over time. As we use hardlinks a lot with rsync and --link-dest, I’m reasonably sure the issue occurs due to the massive number of hardlinks and deletions that take place on that system.

I’ve written a small script to repeatedly test things and started it running a few minutes ago. My guess is that the problem should show up in a few days.

#!/bin/bash
# Repeatedly snapshot a kernel source tree with rsync --link-dest and prune
# old snapshots, to stress hardlink creation and deletion on the filesystem.

RSYNC=/usr/bin/rsync
REVISIONS=10

function rsync_kernel () {
  DATE=$(date +%Y%m%d%H%M%S)

  # Collect the existing snapshot directories, oldest first.
  BDATES=()
  for f in /tmp/2011*
  do
    [ -d "$f" ] && BDATES+=("$f")
  done

  CT=${#BDATES[@]}

  # Link against the newest snapshot, or the pristine tree on the first run.
  if (( CT > 0 ))
  then
    RECENT=${BDATES[$((CT-1))]}
  else
    RECENT="/tmp/linux-3.0.3"
  fi

  # Snapshots are written to /tmp so that the glob above finds them.
  $RSYNC -aplxo --link-dest="$RECENT" "$RECENT/" "/tmp/$DATE/"

  # Once we have enough revisions, remove the oldest ones.
  if (( CT >= REVISIONS ))
  then
    DELFIRST=$(( CT - REVISIONS ))
    loop=0
    for d in "${BDATES[@]}"
    do
      if (( loop <= DELFIRST ))
      then
        rm -rf "$d"
      fi
      loop=$((loop+1))
    done
  fi
}

while true
do
  rsync_kernel
  echo .
  sleep 1
done
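
The pruning rule in the script is easy to misread, so here is a small Python sketch of just that step. It is an illustration, not part of the test script; the function name `snapshots_to_delete` is made up, and it assumes the snapshot names are already sorted oldest first, as the timestamped directory names are.

```python
from typing import List


def snapshots_to_delete(existing: List[str], revisions: int) -> List[str]:
    """Mirror the script's pruning rule for a sorted list of snapshot names."""
    count = len(existing)
    if count < revisions:
        return []
    del_first = count - revisions
    # The script removes every snapshot whose index is <= del_first.
    return existing[:del_first + 1]


names = [f"/tmp/201110060000{i:02d}" for i in range(10)]
print(snapshots_to_delete(names, 10))
```

With ten snapshots on disk and ten revisions allowed, one snapshot is removed per pass. Since each pass also creates one new snapshot, the count stays at ten while every run still exercises a full tree of hardlink creation and deletion.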

Pyramid Apex – putting it in production

Monday, August 15th, 2011

After quite a bit of work we’ve finally gotten Pyramid Apex to a point where I can deploy it on two production apps to make sure things are working as I expect they should.

If you’re developing a Pyramid Application and are using Authentication/Authorization, I18N/L10N, Flash Messages and a Form Library, take a look at Pyramid Apex, a library Matthew Housden and I wrote to make it easier to quickly develop Pyramid applications.

It supports OpenID, Local authentication storage using bcrypt and a number of other basic features.

W3 Total Cache and Varnish

Thursday, July 21st, 2011

Last week I got called into a firestorm to fix a set of machines that were having problems. Varnish was in the mix, and the first thing I noticed was that the hit rate was extremely low because the VCL wasn’t really configured well for WordPress. Since WordPress uses a lot of cookies and Varnish passes anything with a cookie to the backend, we have to know which cookies we can ignore so that we can get the cache hit rate up.

Obviously, static assets like JavaScript, CSS and images generally don’t need cookies, so those make a good first target. Since some ad networks set their own cookies on the domain, we need to know which ones to strip. However, to make a site resilient, we have to get a little more aggressive and tell Varnish to cache things against its better judgement. When we do this, we don’t want surfers to see stale content, so we need to purge cached objects from Varnish when they change to keep the site interactive.
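
As a rough illustration of the cookie decision described above, here is a Python sketch. The real logic would live in VCL; the function name and the list of ignorable cookie prefixes are assumptions, and the prefixes you can safely drop depend on what your backend actually reads.

```python
import re

# Extensions treated as static assets; adjust to suit the site.
STATIC_RE = re.compile(
    r"\.(gif|jpe?g|png|swf|css|js|flv|mp3|mp4|pdf|ico)(\?.*|)$")

# Assumed cookie-name prefixes that the backend never reads.
IGNORED_PREFIXES = ("__utm", "_ga")


def cookies_for_backend(url: str, cookie_header: str) -> str:
    """Return the Cookie header worth sending to the backend, or '' if none."""
    if STATIC_RE.search(url):
        return ""  # static assets do not need cookies at all
    kept = []
    for part in cookie_header.split(";"):
        part = part.strip()
        if part and not part.startswith(IGNORED_PREFIXES):
            kept.append(part)
    return "; ".join(kept)


print(cookies_for_backend("/", "__utma=1; wordpress_logged_in_x=abc"))
```

A request that ends up with an empty Cookie header can be served from cache, which is where the hit rate improvement comes from.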

Caching is easy, purging is hard

This particular installation used W3 Total Cache, a plugin that does page caching, JavaScript/CSS minification and combining, and handles a number of other features. I was unable to find any suggested VCL, and several posts on the forums show a lack of interest in supporting Varnish.

In most cases, once we determine what we’re caching, we need to figure out what to purge. When a surfer posts a comment, we need to clear the cached representation of that post, the RSS feed and the front page of the site. This allows any post counters to be updated and keeps the RSS feed accurate.
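
The purge list for a new comment can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and `/feed/` is assumed to be the site’s feed path, which depends on the permalink settings.

```python
from typing import List


def purge_targets(post_path: str, feed_path: str = "/feed/") -> List[str]:
    """List the paths to purge after a comment is posted on post_path."""
    targets = [post_path, feed_path, "/"]
    # Drop duplicates while keeping order (e.g. the post is the front page).
    seen = set()
    unique = []
    for path in targets:
        if path not in seen:
            seen.add(path)
            unique.append(path)
    return unique


print(purge_targets("/2011/07/w3-total-cache-and-varnish/"))
```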

W3TC includes the ability to purge, but it only works in a single-server setting. If you put a domain name in the config box, it should work fine. If you put in a series of IP addresses, either your VCL needs to override the hostname or you need to apply the following patch. There are likely to be bugs, so try this at your own risk.

If you aren’t using the Javascript/CSS Minification and combining or some of the CDN features that W3TC provides, then I would suggest WordPress-Varnish which is maintained by some people very close to the Varnish team.

I’ve maintained the original line of code from W3TC commented above any changes for reference.

--- w3-total-cache/inc/define.php	2011-06-21 23:22:54.000000000 -0400
+++ w3-total-cache-varnish/inc/define.php	2011-07-21 16:10:39.270111723 -0400
@@ -1406,11 +1406,15 @@
  * @param boolean $check_status
  * @return string
  */
-function w3_http_request($method, $url, $data = '', $auth = '', $check_status = true) {
+#cd34, 20110721, added $server IP for PURGE support
+# function w3_http_request($method, $url, $data = '', $auth = '', $check_status = true) {
+function w3_http_request($method, $url, $data = '', $auth = '', $check_status = true, $server = '') {
     $status = 0;
     $method = strtoupper($method);

-    if (function_exists('curl_init')) {
+#cd34, 20110721, don't use CURL for purge
+#    if (function_exists('curl_init')) {
+    if ( (function_exists('curl_init')) && ($method != 'PURGE') ) {
         $ch = curl_init();

         curl_setopt($ch, CURLOPT_URL, $url);
@@ -1474,7 +1478,13 @@
             $errno = null;
             $errstr = null;

-            $fp = @fsockopen($host, $port, $errno, $errstr, 10);
+#cd34, 20110721, if method=PURGE, connect to $server, not $host
+#            $fp = @fsockopen($host, $port, $errno, $errstr, 10);
+            if ( ($method == 'PURGE') && ($server != '') ) {
+                $fp = @fsockopen($server, $port, $errno, $errstr, 10);
+            } else {
+                $fp = @fsockopen($host, $port, $errno, $errstr, 10);
+            }

             if (!$fp) {
                 return false;
@@ -1543,8 +1553,9 @@
  * @param bool $check_status
  * @return string
  */
-function w3_http_purge($url, $auth = '', $check_status = true) {
-    return w3_http_request('PURGE', $url, null, $auth, $check_status);
+#cd34, 20110721, added server IP
+function w3_http_purge($url, $auth = '', $check_status = true, $server = '') {
+    return w3_http_request('PURGE', $url, null, $auth, $check_status, $server);
 }

 /**
diff -Naur w3-total-cache/lib/W3/PgCache.php w3-total-cache-varnish/lib/W3/PgCache.php
--- w3-total-cache/lib/W3/PgCache.php	2011-06-21 23:22:54.000000000 -0400
+++ w3-total-cache-varnish/lib/W3/PgCache.php	2011-07-21 16:04:07.247499682 -0400
@@ -693,7 +693,9 @@
                     $varnish =& W3_Varnish::instance();

                     foreach ($uris as $uri) {
-                        $varnish->purge($uri);
+#cd34, 20110721 Added $domain_url to build purge hostname
+#                        $varnish->purge($uri);
+                        $varnish->purge($domain_url, $uri);
                     }
                 }
             }
diff -Naur w3-total-cache/lib/W3/Varnish.php w3-total-cache-varnish/lib/W3/Varnish.php
--- w3-total-cache/lib/W3/Varnish.php	2011-06-21 23:22:54.000000000 -0400
+++ w3-total-cache-varnish/lib/W3/Varnish.php	2011-07-21 16:04:52.836919164 -0400
@@ -70,7 +70,7 @@
      * @param string $uri
      * @return boolean
      */
-    function purge($uri) {
+    function purge($domain, $uri) {
         @set_time_limit($this->_timeout);

         if (strpos($uri, '/') !== 0) {
@@ -78,9 +78,11 @@
         }

         foreach ((array) $this->_servers as $server) {
-            $url = sprintf('http://%s%s', $server, $uri);
+#cd34, 20110721, Replaced $server with $domain
+#            $url = sprintf('http://%s%s', $server, $uri);
+            $url = sprintf('%s%s', $domain, $uri);

-            $response = w3_http_purge($url, '', true);
+            $response = w3_http_purge($url, '', true, $server);

             if ($this->_debug) {
                 $this->_log($url, ($response !== false ? 'OK' : 'Bad response code.'));
diff -Naur w3-total-cache/w3-total-cache.php w3-total-cache-varnish/w3-total-cache.php
--- w3-total-cache/w3-total-cache.php	2011-06-21 23:22:54.000000000 -0400
+++ w3-total-cache-varnish/w3-total-cache.php	2011-07-21 15:56:53.275922099 -0400
@@ -2,7 +2,7 @@
 /*
 Plugin Name: W3 Total Cache
 Description: The highest rated and most complete WordPress performance plugin. Dramatically improve the speed and user experience of your site. Add browser, page, object and database caching as well as minify and content delivery network (CDN) to WordPress.
-Version: 0.9.2.3
+Version: 0.9.2.3.v
 Plugin URI: http://www.w3-edge.com/wordpress-plugins/w3-total-cache/
 Author: Frederick Townes
 Author URI: http://www.linkedin.com/in/w3edge
@@ -47,4 +47,4 @@
     require_once W3TC_LIB_W3_DIR . '/Plugin/TotalCache.php';
     $w3_plugin_totalcache = & W3_Plugin_TotalCache::instance();
     $w3_plugin_totalcache->run();
-}
\ No newline at end of file
+}
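
The idea behind the patch is to open the connection to each Varnish server’s IP while keeping the site’s hostname in the request. Here is a minimal Python sketch of the same idea, not a port of the patch: the function names are hypothetical, and it assumes your VCL accepts the PURGE method from the sending host.

```python
import http.client
from typing import Tuple
from urllib.parse import urlsplit


def purge_request_parts(domain_url: str, uri: str) -> Tuple[str, str]:
    """Split a site URL and a path into the Host header and request path."""
    host = urlsplit(domain_url).netloc
    if not uri.startswith("/"):
        uri = "/" + uri
    return host, uri


def send_purge(server_ip: str, domain_url: str, uri: str,
               port: int = 80) -> int:
    """Send PURGE for uri to one Varnish server, keeping the site's Host."""
    host, path = purge_request_parts(domain_url, uri)
    # Connect to the server's IP, not to the hostname in the URL.
    conn = http.client.HTTPConnection(server_ip, port, timeout=10)
    try:
        conn.request("PURGE", path, headers={"Host": host})
        return conn.getresponse().status
    finally:
        conn.close()


print(purge_request_parts("http://example.com", "feed/"))
```

Calling `send_purge` once per configured server IP corresponds to the loop over the server list in Varnish.php.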

Gracefully Degrading Site with Varnish and High Load

Saturday, July 16th, 2011

If you run Varnish, you might want to gracefully degrade your site when traffic comes unexpectedly. There are other solutions listed on the net which maintain a Three State Throttle, but it seemed like this could be done easily within Varnish without needing too many external dependencies.

The first challenge was to figure out how we wanted to handle state. Our backend director is set up with a ‘level1’ backend which doesn’t do any health checks. We need at least one node that never fails the health check, since the ‘level2’ and ‘level3’ backends will go offline to signal to Varnish that we need to take action. This scenario assumes the failure modes cascade, i.e. level2 fails and then, if load continues to increase, level3 fails, but there is nothing preventing you from having separate failure modes and different VCL for those conditions.

You could have VCL that replaced the front page of your site with ‘top news’ during an event which links to your secondary page. You can rewrite your VCL to handle almost any condition and you don’t need to worry about doing a VCL load to update the configuration.

While maintaining three configurations is easier, there are a few extra points of failure added in that system. If the load on the machine gets too high and the cron job or daemon that is supposed to update the VCL doesn’t run quickly enough or has issues with network congestion talking with Varnish, your site could run in a degraded mode much longer than needed. With this solution, in the event that there is too much network congestion or too much load for the backend to respond, Varnish automatically considers that a level3 failure and enacts those rules – without the backend needing to acknowledge the problem.

The basics

First, we set up the script that Varnish will probe. The script doesn’t need to be PHP and only needs to respond with a 404 error to signal to Varnish that the probe request has failed.

<?php
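// Varnish probes this script as /load.php?2 or /load.php?3.
$level = $_SERVER['QUERY_STRING'];
// The first field of /proc/loadavg is the 1-minute load average.
$load = (float) file_get_contents('/proc/loadavg');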
if ( ($level == 2) and ($load > 10) ) {
  header("HTTP/1.0 404 Get the bilge pumps working!");
}
if ( ($level == 3) and ($load > 20) ) {
  header("HTTP/1.0 404 All hands abandon ship");
}
?>

Second, we need to have our backend pool configured to call our probe script:

backend level1 {
  .host = "66.55.44.216";
  .port = "80";
}
backend level2 {
  .host = "66.55.44.216";
  .port = "80";
  .probe = {
    .url = "/load.php?2";
    .timeout = 0.3 s;
    .window = 3;
    .threshold = 3;
    .initial = 3;
  }
}
backend level3 {
  .host = "66.55.44.216";
  .port = "80";
  .probe = {
    .url = "/load.php?3";
    .timeout = 0.3 s;
    .window = 3;
    .threshold = 3;
    .initial = 3;
  }
}

director crisis random {
  {
# base that should always respond so we don't get an Error 503
    .backend = level1;
    .weight = 1;
  }
  {
    .backend = level2;
    .weight = 1;
  }
  {
    .backend = level3;
    .weight = 1;
  }
}

Since both of our probes go to the same backend, it doesn’t matter which director we use or what weight we assign. We just need to have one backend configured that won’t fail the probe along with our level2 and level3 probes. In this example, when the load on the server is greater than 10, it triggers a level2 failure. If the load is greater than 20, it triggers a level3 failure.
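
Since the probe script doesn’t need to be PHP, here is the same threshold logic as a Python sketch. The names `THRESHOLDS`, `read_load` and `probe_status` are made up for illustration, and wiring it into a web server is left out.

```python
from typing import Dict

# Load thresholds per level, matching the values used in this example.
THRESHOLDS: Dict[int, float] = {2: 10.0, 3: 20.0}


def read_load(path: str = "/proc/loadavg") -> float:
    """Return the 1-minute load average (first field of /proc/loadavg)."""
    with open(path) as handle:
        return float(handle.read().split()[0])


def probe_status(level: int, load: float) -> int:
    """Return the HTTP status the probe should answer with for this level."""
    limit = THRESHOLDS.get(level)
    if limit is not None and load > limit:
        return 404  # tells Varnish this level's probe has failed
    return 200


for load in (5.0, 12.0, 25.0):
    print(load, probe_status(2, load), probe_status(3, load))
```

A load of 12 fails only the level2 probe, while a load of 25 fails both, which is the cascade the VCL relies on.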

In this case, when the backend probe request fails, we just rewrite the URL. Any VCL can be added, but you will have some duplication. Since the VCL is compiled into the Varnish server, it should have negligible performance impact.

sub vcl_recv {
  set req.backend = level2;
  if (!req.backend.healthy) {
    unset req.http.cookie;
    set req.url = "/level2.php";
  }
  set req.backend = level3;
  if (!req.backend.healthy) {
    unset req.http.cookie;
    set req.url = "/level3.php";
  }
  set req.backend = crisis;
}

In this case, when we have a level2 failure, we change any URL requested to serve the file /level2.php. In vcl_fetch, we make a few changes to the object TTL so that we prevent the backend from getting hit too hard. We also change the Server header so that we can look at the headers to see what level our server is currently running at. In Firefox, there is an extension called Header Spy which will allow you to keep track of a header. I’ll often track X-Cache, which I set to HIT or MISS to make sure Varnish is caching, but you could also track Server and be aware of whether things are running properly.

sub vcl_fetch {
  set beresp.ttl = 0s;

  set req.backend = level2;
  if (!req.backend.healthy) {
    set beresp.ttl = 5m;
    unset beresp.http.set-cookie;
    set beresp.http.Server = "(Level 2 - Warning)";
  }
  set req.backend = level3;
  if (!req.backend.healthy) {
    set beresp.ttl = 30m;
    unset beresp.http.set-cookie;
    set beresp.http.Server = "(Level 3 - Critical)";
  }
}

At this point, we’ve got a system that degrades gracefully, even if the backend cannot respond or update Varnish’s VCL, and it self-heals based on the load checks. Ideally you’ll also want to set grace timers and possibly run saint mode to handle significant failures, but this should help your system protect itself from meltdown.
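
To see how the two health checks combine, the cascade can be modelled outside Varnish. This Python sketch only mirrors the order of the checks in vcl_recv and vcl_fetch; the `Decision` type and the `degrade` function are invented for illustration.

```python
from typing import NamedTuple, Optional


class Decision(NamedTuple):
    url_override: Optional[str]  # None means serve the requested URL
    ttl_seconds: int
    server_header: Optional[str]  # None means leave the header alone


def degrade(level2_healthy: bool, level3_healthy: bool) -> Decision:
    """Apply the level2 check, then the level3 check; the later match wins."""
    decision = Decision(None, 0, None)
    if not level2_healthy:
        decision = Decision("/level2.php", 5 * 60, "(Level 2 - Warning)")
    if not level3_healthy:
        decision = Decision("/level3.php", 30 * 60, "(Level 3 - Critical)")
    return decision


print(degrade(True, True))
print(degrade(False, True))
print(degrade(False, False))
```

Because the level3 check runs last, it overrides the level2 settings whenever both probes are failing, so the most severe state always wins.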

Complete VCL

backend level1 {
  .host = "66.55.44.216";
  .port = "80";
}
backend level2 {
  .host = "66.55.44.216";
  .port = "80";
  .probe = {
    .url = "/load.php?2";
    .timeout = 0.3 s;
    .window = 3;
    .threshold = 3;
    .initial = 3;
  }
}
backend level3 {
  .host = "66.55.44.216";
  .port = "80";
  .probe = {
    .url = "/load.php?3";
    .timeout = 0.3 s;
    .window = 3;
    .threshold = 3;
    .initial = 3;
  }
}

director crisis random {
  {
# base that should always respond so we don't get an Error 503
    .backend = level1;
    .weight = 1;
  }
  {
    .backend = level2;
    .weight = 1;
  }
  {
    .backend = level3;
    .weight = 1;
  }
}

sub vcl_recv {
  set req.backend = level2;
  if (!req.backend.healthy) {
    unset req.http.cookie;
    set req.url = "/level2.php";
  }
  set req.backend = level3;
  if (!req.backend.healthy) {
    unset req.http.cookie;
    set req.url = "/level3.php";
  }
  set req.backend = crisis;
}

sub vcl_fetch {
  set beresp.ttl = 0s;

  set req.backend = level2;
  if (!req.backend.healthy) {
    set beresp.ttl = 5m;
    unset beresp.http.set-cookie;
    set beresp.http.Server = "(Level 2 - Warning)";
  }
  set req.backend = level3;
  if (!req.backend.healthy) {
    set beresp.ttl = 30m;
    unset beresp.http.set-cookie;
    set beresp.http.Server = "(Level 3 - Critical)";
  }

  if (req.url ~ "\.(gif|jpe?g|png|swf|css|js|flv|mp3|mp4|pdf|ico)(\?.*|)$") {
    set beresp.ttl = 365d;
  }
}