[Thrift] Handling failover and high availability with Thrift services (oh, and load balancing too)
Mark Slee
mslee at facebook.com
Thu Aug 9 13:40:36 PDT 2007
Hi guys,
First of all, we're thrilled to hear you saying that some of this stuff
doesn't belong in Thrift. We totally agree. Keeping the core Thrift
library super lightweight is one of the main goals, while also designing
to make it easy to extend, build support libraries for, or add
wrappers/tools for more complicated tasks.
To give a little insight into how we do some load-balancing and
availability things here:
1/ The PHP TSocketPool is really useful for this, and it solves the
majority of our availability issues. When used in Apache, the
TSocketPool keeps track of server availability using the APC cache, and
is configurable to retry or ignore downed hosts on an
application-defined interval. Most of our services scale horizontally
on the backend, so we just let the TSocketPool pick a host at random
from the pool for each request. This is a dead simple approach, but
surprisingly effective and very maintainable.
2/ Thrift over HTTP was another of our design points. Note that there is
a THttpClient available in most languages now. We actually run some
Thrift services in PHP using Apache, with the clients connecting via
THttpClient. The nice thing you can do here is take advantage of all the
HTTP proxies, load-balancers, etc. A simple setup is to create a VIP for
your backend service, and use THttpClients to connect over that VIP. You
can move all your availability/load-balancing into whatever you use to
power your VIP (presumably some kind of load-balancer), and keep the
Thrift application code dead simple and cleanly isolated.
3/ It's also possible to use Thrift to actually build a services
management tool, i.e. a central Thrift service that can be queried to
find out which hosts are running which services. We have done this
internally, and would share more details or open source it, but it's a
bit too particular to the way our network is set up and how we cache
data. The gist of it, though, is that you have a highly available
meta-service that you use to configure your actual application
servers/clients.
4/ Finally, the TTransport interface was designed so that a lot of
complexity could be placed in proprietary implementations. For example,
you could write your own TReliableTransport and use it in your
clients/servers, filling it with whatever crazy logic you like to
guarantee delivery, no wire corruption, etc. At Facebook we haven't
yet found any cases where we weren't happy to just use vanilla TCP or
UDP.
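
To make the approach in point 1 concrete, here is a minimal sketch in
Python of random host selection with an application-defined retry
interval for downed hosts. This is not the TSocketPool code or its API:
the HostPool class, its method names, and the in-process dict standing
in for the shared APC cache are all assumptions made for illustration.

```python
import random
import time


class HostPool:
    """Hypothetical pool: picks hosts at random and skips hosts that were
    marked down until retry_interval seconds have passed."""

    def __init__(self, hosts, retry_interval=60.0, clock=time.monotonic, rng=None):
        self.hosts = list(hosts)
        self.retry_interval = retry_interval
        self.clock = clock
        self.rng = rng or random.Random()
        # host -> time it was last marked down. A plain dict here; a real
        # deployment would need shared state across processes.
        self.down_since = {}

    def mark_down(self, host):
        self.down_since[host] = self.clock()

    def mark_up(self, host):
        self.down_since.pop(host, None)

    def candidates(self):
        """Return usable hosts in random order."""
        now = self.clock()
        usable = [
            h for h in self.hosts
            if h not in self.down_since
            or now - self.down_since[h] >= self.retry_interval
        ]
        self.rng.shuffle(usable)
        return usable

    def call(self, request):
        """Try usable hosts in random order until one succeeds.

        request is any callable that takes a host and raises OSError
        when the host cannot be reached.
        """
        last_error = None
        for host in self.candidates():
            try:
                result = request(host)
            except OSError as exc:
                self.mark_down(host)  # skip this host until the interval expires
                last_error = exc
            else:
                self.mark_up(host)
                return result
        raise ConnectionError("no available hosts") from last_error
```

A caller would wrap its actual connect-and-send logic in the request
callable, e.g. pool.call(lambda host: send_to(host, payload)), where
send_to is whatever client code you already have.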
Cheers,
Mark
-----Original Message-----
From: thrift-bounces at publists.facebook.com
[mailto:thrift-bounces at publists.facebook.com] On Behalf Of Matt Reynolds
Sent: Thursday, August 09, 2007 8:33 AM
To: thrift at publists.facebook.com
Subject: Re: [Thrift] Handling failover and high availability with
Thrift services (oh, and load balancing too)
Sorry Chris, I accidentally sent this to you. Re-sending to list.
On Aug 9, 2007, at 12:58 AM, Chris Lamprecht wrote:
> (This is a repost of a message I put on the facebook thrift group
> earlier, but I'm moving over to this mailing list instead..easier
> to keep track of)
>
> Hi,
>
> I've been messing with thrift this week, it looks like it might be
> exactly what we've been looking for -- an easy way for our
> developers to use or write backend services without having to write
> networking code and without the overblown heavyweight wsdl stuff
> available.
Agreed! This is the impetus of my interest in thrift as well.
> The only thing we need that thrift doesn't address is failover/high
> availability and load balancing. I'm not suggesting that thrift add
> this in -- it's so ingeniously simple, I'd hate to mess that up :)
> But a layer on top of thrift services is what I'm thinking about. I
> have a few ideas that I'll explore, but I am curious how others
> (including facebook) have handled HA/failover/load balancing.
>
> If I understand the TProcessor interface, it looks like one could
> easily write a TProcessor that is basically a proxy, taking
> requests and dispatching them to a pool of servers, keeping track
> of load, etc.
>
> Another approach, which I've started already, is to write a Java
> servlet bridge to thrift services (it's about 5 lines of code in a
> servlet), and have the clients use HTTP. Then we can leverage
> apache mod_jk, which does a decent job at load balancing/failover
> for Java servlets. I know HTTP isn't as efficient as a pure
> streaming protocol, but with connection keep-alives it's probably
> fast enough for us. A quick ping() benchmark over localhost got
> about 1000 synchronous requests per second. I'll be happy to
> contribute this in a patch if anyone is interested.
I'm looking at using HTTP as well, but using my own (and at work, my
company's) existing stateless services and networking gear that
provides failover and load-balancing.
If you're thinking about HA, that's a much harder task that I don't
think Thrift should try to take on (maybe allowing the underlying
transport mechanism to handle things like guaranteed delivery), since
Thrift's simplicity is part of its benefit, and the protocol
abstraction will allow for handling of HA (in a rudimentary sense,
but good enough for most applications).
The bridge might be a useful utility, however, so submit away. It'll
be in the archives at the very least.
_______________________________________________
Thrift mailing list
Thrift at publists.facebook.com
http://lists.pub.facebook.com/mailman/listinfo/thrift