
Facebook Seeks Devs To Make Linux Network Stack As Good As FreeBSD's 195

Posted by timothy
from the high-praise-all-around dept.
An anonymous reader writes: Facebook posted a career application which, in their own words, is 'seeking a Linux Kernel Software Engineer to join our Kernel team, with a primary focus on the networking subsystem. Our goal over the next few years is for the Linux kernel network stack to rival or exceed that of FreeBSD.' Two interesting bullet points listing "responsibilities": Improve IPv6 support in the kernel, and eliminate perf and stability issues. FB is one of the world's largest IPv6 deployments; Investigate and participate in emerging protocols (MPTCP, QUIC, etc.) discussions, implementation, experimentation, tooling, etc.

  • by Anonymous Coward

    Why not use FreeBSD? It's already there and at least as good as linux. Or have they perhaps hung themselves on systemd?

    • by HuguesT (84078) on Wednesday August 06, 2014 @02:00PM (#47615573)

      I love FreeBSD, I support them financially every year, and I use it daily but it is not uniformly better than Linux. Hardware support, in particular, is very far behind. Two random examples:

      1- My NAS system does not recognise any USB storage when they are plugged in after boot (no hotplug). It does not support USB superspeed (USB 3.0) either (I have to boot in compatibility mode by disabling xHCI in the BIOS). This is a known issue with some Asus motherboards, still unfixed in 10.0
      2- FreeBSD does not install on some of my HP G6 servers. The kernel simply segfaults. I really wanted FreeBSD on this hardware, so I run it in a VM under Linux (using KVM). Has been running brilliantly for about 2 years now.

      Also, security updates in FreeBSD are really difficult. I haven't finished dealing with updating my ports since I moved from 9.2 to 9.3 last week.

      I have to say this though: when it runs, it runs really well.

      • by ci4 (98735) on Wednesday August 06, 2014 @02:35PM (#47615959)

        USB3 support in FreeBSD 10 is OK (bunch of external disks used for PC backup - speed was essential). No problem with hot-plug either. Ports upgrade is trivial (although I have switched to pkg-ng now). I really can't see why you think security updates are difficult either. I've got only one 9.2 system around, which I am not bothered to upgrade at the moment.

      • by jdew (644405)

        What asus board? I've got an asus crosshair iv formula and the usb3 has been working fine in both 9.2 and 9.3. The aibs driver doesn't support this board yet though.

        as for security updates:
        freebsd-update fetch; freebsd-update install

        and port updates:
        portsnap fetch update; portmaster -gda

        So not sure what the issues are here.

      • by TheRaven64 (641858) on Wednesday August 06, 2014 @02:45PM (#47616047) Journal
        I'm pretty sure that Facebook buys enough hardware that they can afford to write drivers for anything they're missing and demand FreeBSD support from vendors for their next round of purchases. Netflix already does this (they won't buy any hardware that doesn't have vendor support for FreeBSD), as do a few other companies, and so a number of NIC vendors (particularly in the 10G/40G space) are now putting quite a bit of effort into their FreeBSD drivers.
        • by lgw (121541) on Wednesday August 06, 2014 @03:10PM (#47616293) Journal

          If you have 1 million Linux machines deployed, with full Linux-specific software stacks on each, it's cheaper and easier (and most likely faster) to fix the problems you see in Linux than to move the fleet to a new OS.

          Facebook's dev shop culture is all about banging out code as fast as possible for the problem in front of them, then moving on. Forward planning isn't really the thing there, from what I hear (though I think you're no longer discouraged from testing your code before it goes live, that's a recent change). Moving to BSD might well be a better long term plan, but it would take years to get there and they don't really think on that timescale, from the rumors I hear anyhow.

        • by Pieroxy (222434) on Wednesday August 06, 2014 @03:10PM (#47616301) Homepage

          Don't you think it's easier and cheaper to optimize the network stack of Linux rather than writing tons of hardware drivers for FreeBSD? Hardware which, most of the time, will be undocumented. Furthermore, when you change your servers, yay, more drivers to write...

          • Facebook buys custom servers, so they will be 100% documented. Also, they are of the vanity-free variety, lacking any bolted-on bits added strictly to make the numbered list of features on the side of the box longer. I suspect that the only things they are going to care about are disks and NICs. Sound cards, video cards, random USB hardware, Bluetooth: none of that matters to them at all. These are datacenter-housed pieces of equipment.

            • Right. They want FreeBSD drivers, they can put that on the requirements list for network and storage controller vendors. But that does leave the issue of where the vendors are going to find good FreeBSD hackers to write these drivers.
              • by RockDoctor (15477)

                they can put that on the requirements list for network and storage controller vendors. But that does leave the issue of where the vendors are going to find good FreeBSD hackers to write these drivers.

                Isn't there a cart and horse misalignment here. If you're a hardware vendor looking to get a big (and FB are big buyers of hardware) contract, and the tendering terms include "must have drivers for [Open Source system]", then won't you look at what you've already got on the shelf in terms of capabilities, and

                • ASICs generally aren't flexible enough that you could simply emulate another controller in firmware, while FPGAs suck too much power to use on commodity network adapters. Writing a new driver (or bringing an existing neglected driver up to scratch) is going to be quicker than trying to make hardware that's compatible enough to work with a driver written for another vendor's controller.

                  (Besides which, as that other driver is probably maintained by your competitor, do you really think they're going to make an

          • The network stack is a complex bit of code that interacts in complex ways with the scheduler, the generic locking frameworks, the driver infrastructure, the bus and DMA infrastructure, and everything up to the system call layer. The device drivers are self-contained and (individually) small pieces of code. Which do you think is cheaper and easier to modify?
            • by Pieroxy (222434)

              Well, they may be self contained (which is not true because NIC drivers interact directly with the network stack for example) and small, but they are also highly untestable and undocumented.

              What I see is a stack to fix and a continuous stream of drivers to write. The one-shot operation is bound to be cheaper in the long term.

              • NIC drivers interact directly with the network stack only via a set of well-defined interfaces (in FreeBSD at least), and writing the drivers is a one-shot operation because for new purchases they can demand FreeBSD drivers and support from their vendors (as Netflix, Verisign, and others do already). In the networking space, enough big companies are already demanding FreeBSD support for the high-end cards that the drivers are already likely to exist, so the missing ones are likely to be other things.
      • by Fweeky (41046) on Wednesday August 06, 2014 @02:47PM (#47616075) Homepage

        pkgng's made port upgrading much less burdensome - even fairly complex dependency changes can be handled automatically as of 1.3, and the official package repositories are a lot more useful now. They even have stable security-fix-only branches.

        I still make my own customised builds, but I make binary packages in an isolated jail using poudriere. 99% of upgrades are a matter of updating its ports tree, running rebuild-packages, and running pkg upgrade on all my machines.

        You couldn't pay me to go back to portupgrade/portmaster/portmanager.

      • I used to run a lot of freebsd at home - until I tried linux for samba and never went back to bsd again. bsd samba simply sucks. it's dog slow and it hammers on disks, unlike linux which is very cache-oriented (from just looking at the drive lights).

        that was my main complaint. maybe linux has better integration with smb and the network stack but bsd is just not nearly as fast.

        stability wise, this isn't 10 yrs ago and linux stays up as long as you need it (years, even).

        bsd has better mgmt, though; one dist

      • by evilviper (135110)

        2- FreeBSD does not install on some of my HP G6 servers. The kernel simply segfaults.

        Did you try disabling anything and everything (drivers) in the kernel? Or perhaps just blacklisting most/all the modules?

        OpenBSD makes this easier with UKC and "config -e".

  • It might be a silly question, but why don't they just use FreeBSD in that case?
    • by halivar (535827)

      If they've done a lot of work already on a custom kernel, it may not make sense to try porting all that work to completely different kernel architecture. They may have done a cost benefit analysis and decided that the cost of improving their current architecture is less than retrofitting FreeBSD. Not saying this is the case, just posing a possible scenario where this would be the better option.

    • by discord5 (798235) on Wednesday August 06, 2014 @01:47PM (#47615461)

      It might be a silly question, but why don't they just use FreeBSD in that case?

      You haven't heard? I'm sorry... My condolences, but the writing had been on the wall for a while though. Netcraft even confirmed it years ago...

      *BSD is dying.

      • by gstoddart (321705)

        Isn't Netcraft dead?

        Because people have been saying the BSDs are dying for nigh on 20 years now, possibly longer.

        • by hawk (1151)

          Gee, Apple has been dying *far* longer than that . . . :)

          hawk

      • Ye gods I'm so tired of this 'joke' cropping up every so often. It wasn't really that funny 12 years ago, either.

        • by dr.Flake (601029)

          Ye gods I'm so tired of this 'joke' cropping up every so often. It wasn't really that funny 12 years ago, either.

          Actually,

          i still find the dying remark funny.
          And the more time has passed, the funnier it gets. Still being around, and still being a relevant OS, proves how silly the study was at the time.

          So i hope we can still laugh about it in 30 years or so.

          • by gstoddart (321705)

            So i hope we can still laugh about it in 30 years or so.

            You know, anybody who knows that joke now ... in 30 years or so will be sitting in the old geeks home laughing into their beard for reasons nobody else understands.

            And, yes, I consider myself among them ... now, if you'll excuse me, I should get started on my beard.

    • by HuguesT (84078)

      Maybe multiple issues. Perhaps better OpenMP support? Maybe NUMA? Maybe Linux has a better virtual machine infrastructure? Maybe hardware support.

  • by gstoddart (321705) on Wednesday August 06, 2014 @01:39PM (#47615385) Homepage

    Look, this is FreeBSD ... why not just take their damned code?

    It's not like you're not allowed to do that. That's what is great about the BSD license.

    If FreeBSD's network stack is what you aspire to, why reinvent the wheel?

    • by Bengie (1121981) on Wednesday August 06, 2014 @01:44PM (#47615429)
      You can't just copy/paste code and expect it to work, it must be refactored, and may not even be compatible until completely new features are added in the current code to allow the new code to function. It's like saying "that fusion reactor is an open design, why not just place that in our coal power plant?". You may need to make some changes to your current power plant before you change its core.
      • Quite a few consumer router manufacturers actually ship Linux with the FreeBSD network stack grafted onto it. I don't know why.
      • You can't just copy/paste code and expect it to work

        Yes you can. You'll be disappointed, but you can.

  • Insulting their work might not be the right way to get the best Linux kernel network engineers to join your company.

    • Re: (Score:3, Funny)

      by Anonymous Coward

      Insulting their work might not be the right way to get the best Linux kernel network engineers to join your company.

      Or it might be the best way ever. Linux people and their egos....

    • by Zapotek (1032314)
      How is that an insult? As a coder (software engineer/developer/whatever) myself I'm always glad when people point out things in my systems that can be improved. The possibility of someone hiring someone else to improve them for me, that'd make me ecstatic. This isn't about egos, this is about getting stuff done right.
      • As a programmer, project manager and substantial shareholder in my own company, I'm always grateful when people point out things I do that could be done better.

        If I was just the former and not the latter two, I think the relative importance of my ego and the quality of the product would be shifted somewhat.

        Most Open Source programmers tend to value recognition pretty highly relative to absolute merit of the product. To your average guy, if you say "your project sucks, we'll pay you to fix it" means that yo

  • by CadentOrange (2429626) on Wednesday August 06, 2014 @01:48PM (#47615471)
    What makes the FreeBSD network stack superior?
    • by gstoddart (321705) on Wednesday August 06, 2014 @01:53PM (#47615511) Homepage

      Way more cowbell, and a much cooler logo.

    • by maliqua (1316471)

      FBSD essentially always wins the "who has the fastest network stack" contests, sets internet speed records, etc.

      It just is, and always has been, slightly ahead.

      There is a reason why Juniper uses it as the base for Junos.

    • by Zarjazz (36278) on Wednesday August 06, 2014 @02:05PM (#47615631)

      As someone who has used various BSDs and Linux in large scale environments, and is a fan of both (I've configured servers with multi-10Gb interfaces handling 100k+ requests a second), I honestly can't think of any example where Linux has been inferior. The often repeated line that FreeBSD has a better networking stack was probably true over 10 years ago with Linux 2.2 and earlier, but since then I'd say that myth is just bullcrap.

      Maybe Facebook are talking about some specific IPv6 or cutting edge features like MPTCP they need on their network, but as a general statement it's utterly misleading.

    • by Bengie (1121981) on Wednesday August 06, 2014 @02:10PM (#47615685)
      A lot of sysadmins from companies that push a lot of data over lots of connections have blogs about tweaking your OS to handle stuff like 10gb+ of traffic and millions of connections. A lot of these people complain about Linux having strange problems under these loads, and FreeBSD just seems to work. Linux may be faster in some cases, but it still has stability issues that are hard to debug.

      Then there's the whole thing about most network stack research happening primarily on FreeBSD because of licensing. There's a new zero-copy network API that was developed in FreeBSD that allows line rate 64-byte 10Gb traffic on a 450MHz quad-core CPU. Linux and old-API FreeBSD were about 1/10th the packets-per-second.
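
For scale, the line-rate figure above can be checked with simple arithmetic. The sketch below assumes standard Ethernet framing overhead (8 bytes of preamble and 12 bytes of inter-frame gap on top of each 64-byte frame). The 900 MHz single-core clock is the figure quoted elsewhere in this thread and is used only as an illustration; the function names are made up for this example.

```python
# Back-of-the-envelope packet budget for 10 Gb/s Ethernet with minimum-size frames.
LINK_BPS = 10_000_000_000   # 10 Gb/s link
FRAME_BYTES = 64            # minimum Ethernet frame, including the CRC
PREAMBLE_BYTES = 8          # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # mandatory idle time between frames


def line_rate_pps(link_bps, frame_bytes):
    """Packets per second when frames are sent back to back."""
    bits_on_wire = (frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / bits_on_wire


def cycles_per_packet(clock_hz, cores, pps):
    """CPU cycles available per packet if load spreads evenly over the cores."""
    return clock_hz * cores / pps


pps = line_rate_pps(LINK_BPS, FRAME_BYTES)
budget = cycles_per_packet(900e6, 1, pps)  # illustrative clock, not a benchmark
print(f"{pps / 1e6:.2f} Mpps, about {budget:.0f} cycles per packet")
```

That works out to roughly 14.88 million packets per second, which leaves a 900 MHz core only about 60 cycles for each packet. That is why avoiding per-packet copies and system calls matters so much at this rate.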

      A new thread-friendly socket API has just been pushed to FreeBSD 11. One of Netflix's engineers had a pet project that now allows near-zero lock-contention thread scaling. He was able to do line speed 40Gb/s with 150k TCP sessions. Instead of having one file descriptor with a single listening thread, you have one file descriptor and listening thread per RSS queue from the NIC, and you can lock your thread to the same CPU as the RSS queue, so the packet is already in L2 cache. No shared network state. This also means no shared locks, with nearly perfect linear scaling and virtually no cache thrashing or bouncing.

      They're starting work to extend the API to also allow the OS to better handle NUMA and to attach the RSS queues to the CPU to which the NIC is attached. This will remove virtually all cross-talk among the CPU cores trying to handle the network state.

      They're looking into expanding this same concept to the Storage IO system.
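
The per-queue design described above rests on receive-side scaling (RSS): the NIC hashes each packet's addresses and ports to pick a receive queue, so every packet of a given connection lands on the same queue and can be handled by one thread with private state. The following is a user-space sketch of that idea only, not the FreeBSD API. `queue_for_flow`, `handle_packet` and `NUM_QUEUES` are hypothetical names, and CRC32 stands in for the Toeplitz hash that real NICs typically use.

```python
import zlib

NUM_QUEUES = 4  # hypothetical: one receive queue (and one worker thread) per core


def queue_for_flow(src_ip, src_port, dst_ip, dst_port, num_queues=NUM_QUEUES):
    """Map a TCP 4-tuple to a queue index.

    CRC32 is only a stand-in hash so the sketch stays self-contained;
    hardware normally uses a Toeplitz hash plus an indirection table.
    """
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_queues


# Each queue owns a private connection table, so the worker for one queue
# never touches another queue's state and needs no lock to update its own.
tables = [dict() for _ in range(NUM_QUEUES)]


def handle_packet(flow, payload_len):
    """Account a packet against its flow in the owning queue's table."""
    q = queue_for_flow(*flow)
    tables[q][flow] = tables[q].get(flow, 0) + payload_len
    return q


flow_a = ("198.51.100.7", 40000, "203.0.113.1", 443)
flow_b = ("198.51.100.8", 40001, "203.0.113.1", 443)
queues_a = {handle_packet(flow_a, 1448) for _ in range(3)}
handle_packet(flow_b, 512)
print(queues_a, [len(t) for t in tables])
```

Because the mapping is a pure function of the 4-tuple, all three packets of `flow_a` reach the same table. That property is what lets a thread pinned to that queue's CPU keep the connection state in its own cache.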
      • Re: (Score:2, Troll)

        by mvdwege (243851)

        [citation needed]

        I really want to believe you are sincere and everything you say is true. Unfortunately, you don't seem to have provided any references.

        This is the Web. Hyperlinks exist for a reason. Use them.

        • by phoenix_rizzen (256998) on Wednesday August 06, 2014 @02:37PM (#47615979)

          Google searches for "netmap" and "FreeBSD" will give you lots of information on pushing millions of pps through 900 MHz single-core machines. Netmap is also available on Linux. There's even a netmap-enabled version of IPFW that allows you to do packet filtering and routing completely in userspace, again with millions of pps. IPFW is also available on Linux, although I don't know if the netmap-enabled version is.

          Google searches for "openconnect" and "FreeBSD" will give you lots of information and blog posts from the Netflix guys about why they picked FreeBSD, and how it all works, including details on the networking.

          Google searches for "Adrian Chadd", or "RSS scaling", or similar terms will show you threads and posts on various FreeBSD mailing lists with information detailing a lot of the MSS/RSS work that's going into FreeBSD 11, and several projects that build off that. Those also have links to other information around sockets and similar.

          Google searches for "NUMA" and "FreeBSD" will bring up mailing list threads that cover the different projects being undertaken to improve the CPU affinity and thread locality and all that jazz.

          Sure, it would be nice if the OP had posted links to the info, but it's not like the information is secret or hard to find.

          • Re: (Score:2, Redundant)

            by mi (197448)

            Google searches for ...

            Sorry, but no. The onus is on you — one making the claim — to offer links supporting it.

            You Google it, you pick the links you deem most suitable, you embed them in your posting.

            Making the reader do it is not only impolite, it also makes it easier to attack your argument (that FreeBSD is superior) — the Google search could very well offer a link to some blog saying "FreeBSD networking sucks"...

            • What makes you believe OP was "making an argument"? He was responding to a comment that asked for more details, and he has provided them. If you want to argue with him, go ahead, but then you'll be the one making the claim that he's wrong (or whatever it is you want to argue about).

          • That's a useful list, especially if you have no clue as to what to actually search for. Like me.

            Cheers.

          • Netmap is also available on Linux

            While this is technically true, we recently had a PhD student try this. First, the Linux version is only available as patches so it took him a while to find a version of Linux that they'd apply against. Once he'd done this, it turns out that the driver support is basically only there in Linux for Intel NICs, which are modified as part of the patch. In FreeBSD, because it's merged into trunk, most NIC drivers support it.

            Oh, and Adrian isn't at Netflix anymore, but he's still working on networking stuff

            • by adri (173121)

              I don't work at a CDN company. :-)

              I work at norse-corp.com. RSS is a non-paid thing I do at home.

        • by Anonymous Coward

          I'm really getting tired of people like you: This is not a scientific publication, and this is not for sure wikipedia, this is a forum where people expose opinions. Want citations? Go back to the fucking Wikipedia and please DO verify some of the already present sources on many articles there (I grew tired of pointing out the ones that didn't even mention what they're supposed to support). Otherwise use a search engine, the beauty of the 2000's is that you no longer need to spend hours searching for informa

        • by Bengie (1121981)
          They're random blogs that I tend to come across when googling how to tweak network stacks for handling lots of connections. Take it as you will. It was an entirely anecdotal and opinionated blob of text that I probably just should have left out.
          • by Bengie (1121981)
            Re-reading what I posted, this was meant for just the first sentence talking about the blogs. The rest of the information is "true", but based off of memory. The part about the threaded API code for FreeBSD 11 was from a YouTube video from BSDCON (or similar convention) (https://www.youtube.com/watch?v=7CvIztTz-RQ) that was just published a week or two ago. The zero copy part is about "netmap", I think. Another /. post in this thread also mentioned it.
          • by mvdwege (243851)

            Ok, that's too bad. I was genuinely curious about the work being done.

            Sorry if I sounded a bit gruff, but when I see categorical statements, I like to have references to check myself.

        • by adri (173121)

          Me - adrian@freebsd.org. Also, https://wiki.freebsd.org/Netwo... [freebsd.org] .

      • Re: (Score:2, Informative)

        by Anonymous Coward

        Hi! adri here. I'm Adrian (adrian@freebsd.org) the ex-Netflix engineer who was doing this as his project whilst I was working there.

        I'm continuing to do it in my spare time now, as time and hardware permit.

        https://wiki.freebsd.org/NetworkRSS

        I'm working out the kinks in how IP fragments are correctly handled. It'll be more useful for real world deployments after that - unfortunately in the real world you have to deal with fragmented TCP and UDP; you can't just pretend it's not there.

        I'll continue chipping aw

      • Out of curiosity, are they giving the code back, or is it all tied up in their own distro?
      • by adri (173121)

        Hi,

        That'd be me (adrian@freebsd.org)

        https://wiki.freebsd.org/Netwo... [freebsd.org]

    • by dkman (863999)
      If I knew what made it superior I'd be sending a resume, not writing this post.
    • by ray-auch (454705)

      Red demon costumes vs. penguin costumes. No contest really.

      http://freebsd-image-gallery.n... [netcode.pl]
      http://freebsd-image-gallery.n... [netcode.pl]

      http://www.flickr.com/photos/1... [flickr.com]
      http://www.flickr.com/photos/1... [flickr.com]

  • by phizi0n (1237812) on Wednesday August 06, 2014 @02:33PM (#47615939)

    I don't understand why there's all these comments saying they should just use FreeBSD. There are many reasons to despise Facebook but their desire to improve the Linux networking stack is admirable. We should be encouraging corporations to contribute to OSS, not telling them to just use that other thing that is better in some ways but not others. Kudos to them for contributing back to the projects they use.

  • by m.dillon (147925) on Wednesday August 06, 2014 @02:38PM (#47615991) Homepage

    Designing algorithms that play well in a SMP environment under heavy loads is not easy. It isn't just a matter of locking within the protocol stack... contention between cpus can get completely out of control even from small 6-instruction locking windows. And it isn't just the TCP stack which needs be contention-free. The *entire* packet path from the hardware all the way through to the system calls made by userland have to be contention-free. Plus the scheduler has to be able to optimize the data flow to reduce unnecessary cache mastership changes.

    It's fun, but so many kernel subsystems are involved that it takes a very long time to get it right. And there are only a handful of kernel programmers in the entire world capable of doing it.

    -Matt

  • by Anonymous Coward on Wednesday August 06, 2014 @07:33PM (#47618395)

    FreeBSD also includes an alternative to select/poll called kqueue that allows it to scale client connections massively with minimal performance degradation. Linux introduced epoll as a work-alike, but it has some drawbacks ...

    http://www.eecs.berkeley.edu/~sangjin/2012/12/21/epoll-vs-kqueue.html
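
For readers who have not used either interface, both follow the same pattern: register file descriptors once, then ask the kernel which of them are ready. Below is a minimal sketch using Python's standard `selectors` module, which picks kqueue on the BSDs and epoll on Linux when they are available. The socket pair and the `"peer"` label are purely illustrative.

```python
import selectors
import socket

# DefaultSelector chooses the most efficient backend the platform offers
# (kqueue on the BSDs and macOS, epoll on Linux, poll/select otherwise).
sel = selectors.DefaultSelector()
backend = type(sel).__name__

a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)

# Register interest once; the kernel then reports readiness on demand.
sel.register(b, selectors.EVENT_READ, data="peer")

a.send(b"ping")

received = []
for key, mask in sel.select(timeout=1.0):
    if mask & selectors.EVENT_READ:
        received.append((key.data, key.fileobj.recv(16)))

sel.unregister(b)
sel.close()
a.close()
b.close()
print(backend, received)
```

A server handling many connections registers each accepted socket the same way and loops on `select()`, so the cost of each wakeup depends on how many sockets are ready, not on how many are registered.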

    What's a massive scale? WhatsApp, recently acquired by Facebook, uses FreeBSD and Erlang to power its service offerings. They sustain over 2 million simultaneous client connections per FreeBSD server ...

    http://blog.whatsapp.com/196/1-million-is-so-2011

    I wouldn't be surprised if the internal comparison between Linux and FreeBSD network features/performance was fueled by feedback from their new subsidiary.

    FreeBSD also works very closely with the Nginx community. If you look at the dev mailing list, you will see a fair amount of kernel-level dev work sponsored by companies that use nginx on top of FreeBSD. This constant tuning keeps nginx consumers loyal to FreeBSD for obvious reasons. It is no wonder that this combination was selected by Netflix to power their new content delivery network.
