They use it !
Some happy users want to report their experience with HAProxy. Some sites openly confess they use it.
Some users are simply curious about who currently uses it. So I'm assembling all this information
here for them, in alphabetical site name order. If you think something is inaccurate, or if you'd
like to send a few sentences about your experience and have a link to your site, please contact me.
Quoting A0 Labs' CEO Olivier Warin: “A0 Labs is a French hosting company specialized in critical applications which need high performance and security. We have been using Haproxy for several years now and deploy it for our customers to provide high availability and performance to their web sites”. I would personally like to add that A0 Labs helped a lot on the FreeBSD port a few years ago, regularly helps by benchmarking and testing new releases, and contributes to quite a number of other opensource projects, not just HAProxy !
Airbnb built SmartStack on top of HAProxy to have exactly the load balancer they needed. The article explains in detail the limitations of all the other solutions they considered (including HAProxy itself).
Alibaba / Taobao CDN
Taobao's CDN is the world's largest picture CDN. It delivers content for all online shops hosted by Taobao and Alibaba, which represent around 80% of China's online business. They use a lot of opensource and a simple, scalable architecture including LVS + HAProxy for the load balancing layer, Squid for the cache and their fork of Nginx (Tengine) for the servers. A fairly complete and interesting overview is available here.
AppScale runs Google App Engine apps at scale in many different environments (clouds, clusters, laptops) around the world. As such, it must adapt to resource constraints imposed by the underlying server resources and cost constraints specified by our customers. To hide the differences among infrastructures, AppScale provides a layer of abstraction and isolation between the infrastructure and the app. "AppScale uses HAProxy statistics to inform its autoscaling decisions. HAProxy provides an easy to use interface that AppScale uses to configure and update the system quickly, on the fly." -- Chandra Krintz, CSO, AppScale. More details are provided here and here.
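To give an idea of what "using HAProxy statistics to inform autoscaling decisions" can look like, here is a minimal Python sketch. It is not AppScale's code: the sample data, the `scaling_hint` function and its thresholds are all hypothetical. Only the column names (`pxname`, `svname`, `qcur`, `scur`, `slim`, ...) come from the CSV format exported by the HAProxy stats interface.

```python
import csv
import io

# Sample in the CSV layout of the HAProxy stats export; the values are made up.
SAMPLE_STATS = """# pxname,svname,qcur,qmax,scur,smax,slim,stot
web,FRONTEND,,,120,300,2000,51234
web,app1,0,4,45,90,50,20111
web,app2,3,9,50,95,50,20987
web,BACKEND,3,12,95,180,200,41098
"""


def parse_stats(text):
    """Return one dict per row, keyed by the column names of the header line."""
    # The header line starts with "# ", which is not part of the first column name.
    cleaned = text.lstrip("# ")
    return list(csv.DictReader(io.StringIO(cleaned)))


def scaling_hint(rows, proxy, high=0.8, low=0.2):
    """Suggest 'scale_up', 'scale_down' or 'hold' for one proxy.

    Hypothetical policy: scale up when requests are queued or when the
    servers are close to their session limit, scale down when mostly idle.
    """
    servers = [
        r for r in rows
        if r["pxname"] == proxy and r["svname"] not in ("FRONTEND", "BACKEND")
    ]
    queued = sum(int(r["qcur"] or 0) for r in servers)
    current = sum(int(r["scur"] or 0) for r in servers)
    limit = sum(int(r["slim"] or 0) for r in servers)
    if queued > 0 or (limit and current / limit >= high):
        return "scale_up"
    if limit and current / limit <= low:
        return "scale_down"
    return "hold"


if __name__ == "__main__":
    print(scaling_hint(parse_stats(SAMPLE_STATS), "web"))
```

In a real deployment the CSV would be fetched from the stats page or the stats socket instead of a string, and the decision would probably be smoothed over several samples rather than taken on a single snapshot.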
“Biblio.com uses HAProxy to load balance and provide fault tolerance for 10 million searches per day across a cluster of Solr search servers that index 95 million bibliographic records of used and rare books. HAProxy has been very fast and rock solid for us. Thanks for providing it.” - CEO Brendan Sherar
BitPusher designs, implements and manages infrastructures that are highly reliable, scalable and performant. Quoting CEO Daniel Lieberman “We use HAProxy extensively, both as a standard edge load balancer and (especially in AWS) as a proxy for internal services, avoiding the need for IP-based failover.”
Border Stylo is one of the companies making heavy use of the stunnel + haproxy tandem on Amazon
instances. Carlo Flores, Sr Operations Engineer, says :
“Border Stylo is a social networking site, and we use HAProxy at the core of our stack after
benchmarking scale (speed, KB/connected user) vs Pound and Nginx for our architecture serving
retrollectapp.com. Operations Engineers here consider
the mailing list one of the better resources on the net, if only for Willy and Cyril's musings
and configuration files, and contribute what we can with our ctl script at
Personally, I think the statistics generation and halog sources is at least as impressive as
HAProxy's speed and resource management”.
An article on High Scalability
gives some details about the architecture behind DISQUS and the traffic it has to deal with. In short, 17000
requests per second, 250 million visitors. They're using HAProxy both in front of the web servers and in front of the database. A lot more information
is found in this detailed presentation.
Egnyte is a Cloud File Server. Quoting Sachin Shetty : “Haproxy is a fantastic feature-rich load balancer and we at Egnyte have been using it for a while. Apart from using haproxy for standard application load balancing, we are using haproxy to overcome some limitations of Apache like use queue timeout to prevent backlogging in Apache when application servers are loaded. We also use haproxy for load direction to route requests i.e. send requests to specific server under specific conditions and failover accordingly”. Thanks guys, your nice feedback is much appreciated !
El Comercio / Peru21
Héctor Paz, the sites' sysadmin, reported on the HAProxy mailing list :
“We use haproxy to handle web traffic for peruvian news sites: elcomercio.pe, peru21.pe, etc. Around 2k session rate in peak hours. Haproxy is the most reliable part of our architecture”.
Farmville is one of the most popular online games, published by Zynga. Mark William indicated here that in mid-2010 the site had over 70 million active users a month. While Zynga doesn't explicitly advertise its use of HAProxy, they don't hide it either, as they report using RightScale at Amazon EC2 to scale seamlessly, and even the error page has it in its URL.
Fedora is the community-driven distribution behind Red Hat. The Wiki explains how HAProxy is used there, and even provides links to the stats pages.
Free is a major player among the French ISPs. Free has always promoted the use of free software, and has been using HAProxy for many years. The Webmail and the file exchange service have been the most heavily loaded deployment ever reported in terms of network bandwidth, with more than 5 Gbps of traffic at any moment. They regularly provide extremely valuable feedback which has contributed to making 10 Gbps performance something real and to making TCP splicing a reliable solution.
There's a lengthy article on the Github Blog entitled "How We Made GitHub Fast". It explains in depth how the GitHub architecture works, and there's a lot to learn there for anyone who's planning on starting a scalable site. Interestingly, a second article here gives a few more details as to why they're not only using HAProxy but Ldirectord too (eg: smaller memory usage in VMs).
Globo.Tech is a Hosting and Managed Service Provider, in Canada. Anthony Levesque describes how they are using HAProxy in their internal infrastructure, as well as for their client-facing needs: “We know that using HAProxy allows us to build robust and scalable setups not only for our clients, but for our internal services as well. We operate multiple internal or private cloud Infrastructures where the management layer's high availability and scalability is done with HAProxy. HAProxy is a reliable constant in our clustering design”.
The Imgur guy describes his architecture choices on Reddit here and why HAProxy makes a good choice for him here. The full thread is quite informative about the issues such fast-growing sites are facing.
Jon Watte, IMVU's CTO, describes on Slideshare how IMVU's architecture works and how it scales. Haproxy is just one small piece in the puzzle there despite being at the front. The site's home page indicates the number of concurrent users in real time (more than 120k when last checked). It's nice to see some large sites that are sensitive to latency and report their usage numbers.
Mike Krieger explained on Slideshare how they scaled Instagram to 30 million users in less than 2 years. Now they're the first proposed choice when you strike letter "i" on Google! Of course, when extremely fast scaling is needed, HAProxy is in the mix :-)
ITA Software is one of the companies who acknowledge use of Open-source components.
Their experience is best described in their own words : “ITA Software is a leading provider of innovative solutions for the travel industry. Haproxy's power, flexibility, and reliability, have quickly made it a valuable part of ITA's infrastructure that supports leading travel companies worldwide including American Airlines, Bing, Continental Airlines, Kayak, Orbitz, Southwest Airlines, United Airlines, US Airways, Virgin Atlantic Airways, and others”.
Linux Kernel hosting infrastructure's admin and architect, Konstantin Ryabitsev, shares his experience running one of the most challenging sites dedicated to development, and of course it runs behind haproxy :-)
London Trust Media, Inc.
London Trust Media's President, Jonathan Roudier, says “Half of the Internet is built on HAProxy. We are pleased to support the Internet!”. While I personally think that "half of the internet" probably is a bit overstated, I'd like to state that London Trust Media is one of those now rare companies so obsessed with their customers' experience that they're constantly striving to squeeze the last possible nanosecond of latency out of their infrastructure. I'm delighted to discuss with them bits and bytes, CPU affinity, cache miss latency and other such important considerations that tend to be dismissed too often by many users these days. Sharing experience with people able to provide test results is pleasant and helps in designing better solutions. Thanks for your support guys and don't change the way you work!
MaxCDN indicates here that they're using haproxy in their CDN solution.
Kevin Phair of NYI reports that haproxy handles the load balancing aspects of their Fault Tolerant Web service : “HAProxy easily fits our performance needs, and we find it far easier to manage and trouble-shoot than any of the expensive big-name load-balancer options that we have had experience with”.
Olark / Hab.la
The Olark guys explained here how they set up their site with high availability, and some of their decisions to ensure uninterrupted service in case something goes wrong. They give a bit more details about the monitoring and some architecture fixes here. Please note that Olark is among the cool companies who funded the development of a number of features.
pfSense is an open source firewall based on FreeBSD. It has an optional haproxy module along with a web interface for configuring haproxy. More information on the package is available here.
Playfish, now part of EA, offers a large number of online social games. As of 2010, the site already counted 10M daily users and 50M monthly users. Like a number of other such gaming sites, it's hosted in Amazon EC2 and incoming traffic is load balanced using HAProxy.
Ravelry is a social network dedicated to knitting that was founded by Casey and Jessica Forbes in 2007. It quickly met with great success and Casey had to make significant changes to the architecture several times to keep up with the growth. He explains his adventures here. In 2008, one year after the project was born, Casey told me : “HAProxy is fantastic. We use it at http://www.ravelry.com to handle 5 million or so requests per day”. And now we're in 2011... Their project is quite original and I wish them a long success story !
It's probably the only site that is so open about its infrastructure that even its HAProxy configuration is available to everyone !
Red Hat's Cloud : OpenShift
As described in their architecture overview, Red Hat uses HAProxy as the load balancing solution in its cloud architecture OpenShift. While I know for sure they're not the first cloud provider to use it, I can say that they're the first one to openly admit it and that's nice of them (their architecture overview is well detailed and worth a read BTW).
Ben Timby of SmartFile says :
“Hundreds of gigabytes of data flows from SmartFile through HAProxy each day. SmartFile uses HAProxy both for HTTP and FTP protocol load balancing. The PROXY protocol makes it possible to provide highly available and lightning fast FTP service. Without the many features of HAProxy and the support of Willy Tarreau dragging the old FTP protocol into the 21st century would have been near impossible. Many thanks for such a stable and flexible product”. Please note that SmartFile was kind enough to fund development of the server-side PROXY protocol implementation.
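For readers who wonder what the PROXY protocol mentioned above carries, here is a minimal Python sketch of how a server could read the human-readable version 1 header that precedes the client's data. This is only an illustration, not SmartFile's or HAProxy's code: the `ProxyInfo` type and the `parse_proxy_v1` function are hypothetical names, and the binary version 2 of the protocol is not handled.

```python
from typing import NamedTuple, Optional, Tuple


class ProxyInfo(NamedTuple):
    """Addresses of the original connection, as announced by the proxy."""
    family: str
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int


def parse_proxy_v1(data: bytes) -> Tuple[Optional[ProxyInfo], bytes]:
    """Split a PROXY v1 header off the front of data.

    Returns (info, rest), where rest is the payload following the header.
    info is None when the proxy announced an UNKNOWN connection.
    """
    end = data.find(b"\r\n")
    # A version 1 header is a single line of at most 107 bytes, CRLF included.
    if end == -1 or end > 105:
        raise ValueError("no complete PROXY v1 header")
    fields = data[:end].decode("ascii").split(" ")
    rest = data[end + 2:]
    if len(fields) < 2 or fields[0] != "PROXY":
        raise ValueError("not a PROXY v1 header")
    if fields[1] == "UNKNOWN":
        return None, rest
    if fields[1] not in ("TCP4", "TCP6") or len(fields) != 6:
        raise ValueError("malformed PROXY v1 header")
    family, src_addr, dst_addr, src_port, dst_port = fields[1:]
    return ProxyInfo(family, src_addr, dst_addr, int(src_port), int(dst_port)), rest


if __name__ == "__main__":
    info, rest = parse_proxy_v1(
        b"PROXY TCP4 192.168.0.1 192.168.0.11 56324 21\r\nUSER bob\r\n"
    )
    print(info, rest)
```

The point is that the server learns the real client address from this first line even though the TCP connection it sees comes from the load balancer, which is what a protocol like FTP needs when it is placed behind a proxy.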
SOS Children's Villages UK
Anthony Gerrard of SOS Children's Villages UK says :
“SOS Children's Villages UK have been using haproxy with great success for nearly 3 years now. We were previously using a popular HTTP server's load balancing capabilities for distributing traffic to our CMS instances but were experiencing issues that all went away after the move to haproxy. Since then we've also made use of its capabilities to run content experiments on our donation forms by directing traffic to different Tomcat back-ends based on http path parameters”.
Spinn3r's CEO Kevin Burton says : “We've been really happy with haproxy and have been using it for more than 5 years. We have all web components of Spinn3r behind haproxy including our firehose API which serves over 5TB per day to our customers. Haproxy has been nothing but rock solid. It's one of our most reliable components of our infrastructure. It just works.”.
Stack Overflow / Server Fault
The same team is managing both sites. They're well known and have high expectations on reliability and quality of service. They've funded the development of the anti-abuse features in HAProxy.
Jeff Atwood says :
“We're big fans of HAProxy, which the guys at Reddit turned us on to. It has been working flawlessly for us in load balancing Stack Overflow between two - and now three - servers”. There's also a nice presentation of the updated architecture (2014) on High Scalability.
Transloadit is a file upload service for web applications. One of its co-founders, Kevin van Zonneveld, explains here that he uses it for the content switching, and also gives some hints about setting up logging under Ubuntu. I'll probably have to put that into the doc because it looks like it was not obvious.
As of April 2012, TubeMogul is the biggest Real Time Bidding video ads platform. Nicolas Brousse, Lead Operations Engineer, says : “We use Haproxy in four different EC2 regions and five Availability Zones. It allows us to handle over 10 billion HTTP bid requests a day and deliver over six billion video ad streams last year”.
As of December 2011, Tuenti is the most trafficked website in Spain with more than 12M users and 40 billion page views a month. In the following presentation, Senior Systems Engineer Ricardo Bartolomé explains the previous load balancing infrastructure, the new load balancing strategy, as well as the reasons why they chose HAProxy as the Layer7 load balancing solution : http://www.slideshare.net/ricbartm/load-balancing-at-tuenti.
This article gives some details on the Tumblr architecture. As of Feb 2012, it's at 500M page views a day, 40k requests/s with plans to go to 400k, and observes a 30% monthly growth. It involves more than 1000 servers, and employs 25 HAProxy, 15 Varnish and 8 Nginx to make this run smoothly, the same winning trio that is found on many large sites !
John Adams explains here how they scaled Twitter to support a traffic growth of 1358% in 2009. It looks like they adopted the principle of "one component per function" which generally scales the best. There is very little information about the load balancing part in this slide show, but it also happens that scaling the rest is much more important.
Andrew Rodland of Vimeo reported “We use HAProxy for our website, our API, our video packager, and quite a few internal services as well, for upwards of 20K requests/sec. Vimeo is built on open-source and proud to contribute back”.
Virgin America airlines
There was a presentation from Virgin America at LinuxCon 2010 where they explained how they migrated to full open-source. Among the numerous products involved, HAProxy is used for the load balancing. The complete presentation is available in PDF format here. A quick summary of the presentation is also available.
The W3C obviously needs no introduction if you're working in web environments. Yes, when you visit the W3C, you're passing via HAProxy, as is explained here. From some past discussions, I remember it also helps protect the whole site against unintentional misuse caused by excessive document validation.
This probably is the fastest site I've ever seen and certainly one of the most highly stressed I know. They deliver the nice counter you can see on the HAProxy page to millions of web sites around the world. To get an idea of the load, consider that each time one of these sites' pages is viewed, they receive a request. Their response time and availability are obviously critical to those sites, and they excel in this area with sub-millisecond response times. This site perfectly fits HAProxy's strengths, and some of the high performance optimizations directly come from their feedback.
As described in this highscalability article (slides here), YouPorn stacks HAProxy, Varnish and Nginx to achieve 300000 requests per second and 100 Gbps of traffic, all of this producing 15 GB of logs per hour. With numbers 10 times larger than any of the other sites listed here, it looks like porn will always be most of the net's traffic !
Feel free to contact me at for any questions or comments :
Some people regularly ask if it is possible to send donations, so I have set up a Paypal account for this.
Click here if you want to donate.