Binary Search for JavaScript Arrays

If you search large arrays, search arrays frequently, or both, a binary search will likely outperform a linear search (read: a for loop) in your JavaScript code. One caveat: binary search only works on sorted arrays, and the array must be sorted in the same order the search compares with. Here is a binary search function I sometimes use in my code:

Array.prototype.binSearch = function(needle, case_insensitive) {
	if (!this.length) return -1;

	var low = 0;
	var high = this.length - 1;
	// Declared locally so they do not leak into the global scope.
	var mid, element;

	// Treat a missing flag as false; lowercase the needle once up front.
	case_insensitive = !!case_insensitive;
	needle = case_insensitive ? needle.toLowerCase() : needle;

	while (low <= high) {
		// Math.floor keeps mid an integer index.
		mid = Math.floor((low + high) / 2);
		element = case_insensitive ? this[mid].toLowerCase() : this[mid];
		if (element > needle) {
			high = mid - 1;
		} else if (element < needle) {
			low = mid + 1;
		} else {
			return mid;
		}
	}

	// Not found.
	return -1;
};
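
As a quick usage sketch, here is a standalone variant of the same search that does not touch Array.prototype. The name binarySearch and the sample data are hypothetical, chosen only to show the sorted-input requirement:

```javascript
// Standalone variant of the search above (hypothetical name).
function binarySearch(haystack, needle) {
    var low = 0;
    var high = haystack.length - 1;
    while (low <= high) {
        var mid = Math.floor((low + high) / 2);
        if (haystack[mid] > needle) {
            high = mid - 1;
        } else if (haystack[mid] < needle) {
            low = mid + 1;
        } else {
            return mid;
        }
    }
    return -1;
}

// Sort first: the search is only meaningful on sorted input.
var names = ['mallory', 'alice', 'carol', 'bob'].sort();
// names is now ['alice', 'bob', 'carol', 'mallory']

var found = binarySearch(names, 'carol');   // 2
var missing = binarySearch(names, 'dave');  // -1
```

Each pass through the loop halves the remaining range, which is where the advantage over a for loop comes from on large arrays.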

A Better HAProxy Health Check For Dynamic Websites

Nobody wants their website to go down, or worse, for users to notice the site is down. Because of this, most larger websites run on multiple servers to provide some level of high availability. In a multi-server architecture there is typically a load-balancer (or cluster of load-balancers) to distribute the load among a pool of web servers. When a server goes down it’s taken out of the pool until it is once again ready to handle requests. HAProxy can handle this by performing periodic health checks on all the servers in a cluster. The default settings, though, can report a server as healthy when it is not, and thus create a bad user experience by allowing ill servers to continue receiving requests.

In HTTP mode, HAProxy’s default health check is a simple OPTIONS request. This has the advantage of being a very lightweight request that is easy to identify and filter from logs. Consider this scenario though: HAProxy balances the load between several web servers running nginx and PHP-FastCGI. If nginx is up but PHP-FastCGI goes down, nginx will still properly handle the OPTIONS request from HAProxy, giving the impression that all is well. HAProxy continues sending requests to the ill server, and those requests in turn get a 502 Bad Gateway, 504 Gateway Timeout, or similar response. Not a very good situation.

A solution would be to use a deeper health check, one that goes beyond nginx to the PHP-FastCGI process. That way if PHP-FastCGI goes down, the whole server is presumed ‘down’.

backend appservers
    mode http
    option httpchk HEAD /health_check.php HTTP/1.1\r\nHost:\ example.com
    server web1 x.x.x.x:80 weight 5 check inter 2000
    server web2 x.x.x.x:80 weight 5 check inter 2000
    server web3 x.x.x.x:80 weight 5 check inter 2000

In the above example I’m using a custom health check request which will be processed by PHP-FastCGI. health_check.php is a lightweight script that contains simply <?php echo "I'm healthy"; ?>. I also added a Host header so that the health check will be handled by a specific nginx virtual host. The nginx vhost config has this in it:

location = /health_check.php {
    access_log      off;
    fastcgi_pass    127.0.0.1:9000;
    fastcgi_index   index.php;
    include         /etc/nginx/fastcgi_params;
}
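
To make the check concrete, here is a rough JavaScript sketch of what the option httpchk line amounts to on the wire and how a reply might be judged. The function names are hypothetical, the exact bytes HAProxy sends may differ by version, and the pass/fail rule assumes HAProxy's documented default of treating 2xx and 3xx statuses as healthy:

```javascript
// Builds roughly the request described by the `option httpchk` line.
// In haproxy.cfg the "\r\n" and "\ " escapes become a CRLF and a space.
function buildCheckRequest(method, path, host) {
    return method + ' ' + path + ' HTTP/1.1\r\n' +
           'Host: ' + host + '\r\n' +
           '\r\n';
}

// Assumption: any 2xx or 3xx status counts as healthy, anything else fails.
function isHealthyStatus(statusLine) {
    var match = /^HTTP\/\d\.\d (\d{3})/.exec(statusLine);
    if (!match) return false;
    var code = parseInt(match[1], 10);
    return code >= 200 && code < 400;
}

var request = buildCheckRequest('HEAD', '/health_check.php', 'example.com');

// With PHP-FastCGI up, nginx relays the script's 200 response.
var phpUp = isHealthyStatus('HTTP/1.1 200 OK');             // true
// With PHP-FastCGI down, nginx answers with a gateway error instead.
var phpDown = isHealthyStatus('HTTP/1.1 502 Bad Gateway');  // false
```

The point of the deeper check is the second case: a gateway error from nginx now fails the check, so the server leaves the pool.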

And there you have it: a better HAProxy health check for dynamic websites.

Recursive Find and Replace With grep and Perl

I thought it might be a nice idea to start posting little commands and bits of code every now and then, ones I’ve found to be particularly useful. So here’s the first one: recursive find and replace. A masterfully crafted regular expression paired with this command can save you hours of tedious work.

This will search all files under the current directory recursively for SEARCH_STRING and replace all occurrences of SEARCH_STRING with REPLACE_STRING in each file that contains a match. It also creates a backup of each modified file, so that FILE is backed up as FILE~ (with a tilde). Note that SEARCH_STRING is treated as a regular expression by both grep and Perl.

grep -R --files-with-matches 'SEARCH_STRING' . | sort | uniq | xargs perl -pi~ -e 's/SEARCH_STRING/REPLACE_STRING/g'
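
For comparison, the same idea can be sketched in JavaScript for Node.js. This is not a drop-in replacement for the one-liner: the function name replaceInTree is hypothetical, it assumes UTF-8 text files, and it skips symbolic links and existing FILE~ backups.

```javascript
var fs = require('fs');
var path = require('path');

// Recursively replaces `search` (a RegExp, normally with the g flag) in
// every regular file under `dir`. Returns the list of modified files.
function replaceInTree(dir, search, replacement) {
    var changed = [];
    fs.readdirSync(dir).forEach(function (name) {
        var full = path.join(dir, name);
        var stat = fs.lstatSync(full); // lstat: symlinks are not followed
        if (stat.isDirectory()) {
            changed = changed.concat(replaceInTree(full, search, replacement));
        } else if (stat.isFile() && name.slice(-1) !== '~') {
            var before = fs.readFileSync(full, 'utf8');
            var after = before.replace(search, replacement);
            if (after !== before) {
                fs.writeFileSync(full + '~', before); // backup as FILE~
                fs.writeFileSync(full, after);
                changed.push(full);
            }
        }
    });
    return changed;
}

// Example call (commented out so nothing is modified by accident):
// replaceInTree('.', /SEARCH_STRING/g, 'REPLACE_STRING');
```

As with the shell version, only files that actually contain a match are rewritten and backed up.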

Taking The Pain Out Of Domain Hunting


Coming up with a suitable name for a business, product, or website is something I do on a fairly regular basis. In brainstorming a name I often make lists of words I’d like to use, like adjectives and nouns that relate to the product. Then I start combining the words to create a unique name and check to see if the related domain is taken or not. The problem is that even though I may have come up with a name I really like, if the domain name is taken, it isn’t worth keeping.

These days a LOT of domains are taken, either by people using them or companies squatting on them. So to make this whole process a little easier I built a website that takes most of the work out of hunting for a good domain: Trionym. The idea is fairly simple: enter up to three lists of words, choose which Top-Level Domains you’re willing to use, and search. Trionym will then create all the possible word combinations and check whois databases to see if the domains are registered. It’s relatively simple for now, but I’m considering adding more options, so if there’s a feature you’d like to see just let me know.
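
The combination step can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not Trionym's actual code: the function name candidateDomains is hypothetical, it assumes one word is taken from each list in list order, and the whois lookup is left out entirely.

```javascript
// Builds every name made of one word from each non-empty list (in list
// order), then appends each TLD. Hypothetical helper for illustration.
function candidateDomains(wordLists, tlds) {
    var names = [''];
    wordLists.forEach(function (list) {
        if (!list.length) return; // an unused list contributes nothing
        var next = [];
        names.forEach(function (prefix) {
            list.forEach(function (word) {
                next.push(prefix + word.toLowerCase());
            });
        });
        names = next;
    });

    var domains = [];
    names.forEach(function (name) {
        if (!name) return; // no words were supplied at all
        tlds.forEach(function (tld) {
            domains.push(name + '.' + tld);
        });
    });
    return domains;
}

var domains = candidateDomains([['quick', 'bright'], ['fox']], ['com', 'net']);
// domains: ['quickfox.com', 'quickfox.net', 'brightfox.com', 'brightfox.net']
```

The number of candidates is the product of the list lengths times the number of TLDs, so three lists of ten words across two TLDs already means 2,000 lookups.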

High Performance Comet on a Shoestring

I’ve had my eye on the advances that are being made in the Comet arena for a while now, but it was only this past weekend that I finally sat down and used it for a project. In doing so, there was a particular configuration problem I needed to address, and that was…uh, addressing.

Introducing Comet to an existing architecture assumes there is already a web server in the neighborhood, and that it is, in one way or another, receiving traffic on port 80. Because many site visitors will likely sit behind a firewall that blocks connections to ports other than 80 or 443, we need to get our Comet server running on port 80 as well. This normally wouldn’t be much of a problem, unless you don’t want to fork over the money for an extra IP address. I don’t, and I didn’t. So let me show you how I got around it.

As I alluded to above, the usual way to run two services on the same port in the same server environment is to assign two different IP addresses to the same front-end server. This is typically a load-balancer or firewall, but it could also be running on the same machine as the web server and Comet server. The load-balancer would then accept requests for x.x.x.1:80 and send them to the web server, and requests for x.x.x.2:80 would go to the Comet server. However, if we only have one IP address, we have to route requests at a higher network layer, the Application Layer (7). Now we route by domain name.

In fact, that is something most web servers can handle using name-based virtual hosts. “So why not set up Apache to reverse-proxy requests to the Comet server?”, you ask. Well, that would work. The reason Comet servers even exist, though, is that web server connection threads are too heavy to support the level of concurrency Comet requires (for a decent number of users). This is where the “high performance” part comes in. HAProxy is a fantastic high performance layer 7 load-balancer. Using HAProxy’s ACL feature we can basically mimic Apache virtual hosts. Consider this example snippet from haproxy.cfg:

frontend www *:80
    mode http
    acl comet hdr_beg(host) comet.
    use_backend meteor if comet
    default_backend apache

backend meteor
    mode http
    server server1 127.0.0.1:4670

backend apache
    mode http
    server server2 127.0.0.1:8080

As you can see, I set up a front-end to accept all connections on port 80. Then I use an ACL to examine the Host header and see if it begins with comet. (e.g. http://comet.example.com). If it does, the request is sent to the Comet server on port 4670, and if not, requests go to Apache on port 8080. And there you have it, a high performance Comet installation with no money out-of-pocket.
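
The routing decision itself is small enough to mimic in JavaScript. This is only a sketch of the logic, not how HAProxy is implemented: the names BACKENDS and chooseBackend are hypothetical, and the prefix match is case-sensitive on the assumption that the ACL is used without the -i flag.

```javascript
// Hypothetical mirror of the two backends in the haproxy.cfg snippet.
var BACKENDS = {
    meteor: '127.0.0.1:4670', // Comet server
    apache: '127.0.0.1:8080'  // regular web server
};

// Mimics `acl comet hdr_beg(host) comet.`: a Host header beginning with
// "comet." selects the Comet backend, anything else falls to the default.
function chooseBackend(hostHeader) {
    var host = hostHeader || '';
    var isComet = host.indexOf('comet.') === 0; // prefix match
    return isComet ? 'meteor' : 'apache';
}

var cometTarget = BACKENDS[chooseBackend('comet.example.com')]; // '127.0.0.1:4670'
var webTarget = BACKENDS[chooseBackend('www.example.com')];     // '127.0.0.1:8080'
```

Both hostnames resolve to the same IP address and port; only the Host header tells the two kinds of traffic apart.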
