A Complete Guide to FreeNAS Hardware Design, Part I: Purpose and Best Practices

A guide to selecting and building FreeNAS hardware, written by the FreeNAS Team, is long overdue. For that, we apologize. The delay was due to the depth and complexity of the subject and the variety of ways FreeNAS can be utilized, as you’ll see from the extensive nature of this four-part guide. There is no “one-size-fits-all” hardware recipe. Instead, there is a wealth of hardware available, with varying levels of compatibility with FreeNAS, and there are many things to take into account beyond the basic components, from use case and application to performance, reliability, redundancy, capacity, budget, need for support, etc. This document draws on years of experience with FreeNAS, ZFS, and the OS that lives underneath FreeNAS, FreeBSD. Its purpose is to give guidance on intelligently selecting hardware for use with the FreeNAS storage operating system, taking the complexity of its myriad uses into account, as well as providing some insight into both pathological and optimal configurations for ZFS and FreeNAS.

A word about software defined storage:

FreeNAS is an implementation of Software Defined Storage; although software and hardware are both required to create a functional system, they are decoupled from one another. We develop and provide the software and leave the hardware selection to the user. Implied in this model is the fact that there are a lot of moving pieces in a storage device (figuratively, not literally). Although these parts are all supposed to work together, the reality is that all parts have firmware, many devices require drivers, and the potential for there to be subtle (or gross) incompatibilities is always present.

Best Practices

ECC RAM or Not?

This is probably the most contested issue surrounding ZFS (the filesystem that FreeNAS uses to store your data) today. I’ve run ZFS with ECC RAM and I’ve run it without. I’ve been involved in the FreeNAS community for many years and have seen people argue that ECC is required and others argue that it is a pointless waste of money. ZFS does something no other filesystem you’ll have available to you does: it checksums your data, it checksums the metadata used by ZFS, and it checksums the checksums. If your data is corrupted in memory before it is written, ZFS will happily write (and checksum) the corrupted data. Additionally, ZFS has no pre-mount consistency checker or tool that can repair filesystem damage. This is very nice when dealing with large storage arrays, as a 64TB pool can be mounted in seconds, even after a bad shutdown. However, if a non-ECC memory module goes haywire, it can cause irreparable damage to your ZFS pool, up to and including complete loss of the storage. For this reason, I highly recommend the use of ECC RAM with “mission-critical” ZFS. Systems with ECC RAM will correct single-bit errors on the fly, and will halt the system before multiple-bit errors can do any damage to the array. If it’s imperative that your ZFS-based system always be available, ECC RAM is a requirement. If it’s only some level of annoying (slightly, moderately…) that you need to restore your ZFS system from backups, non-ECC RAM will fit the bill.
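
If you want to check what your hardware actually reports, the SMBIOS tables usually record whether the memory controller is running in ECC mode. Below is a minimal sketch (not an official FreeNAS tool) that parses the output of dmidecode; it assumes the sysutils/dmidecode package is installed and that the script is run as root, and the result should be treated as a hint rather than a guarantee.

```python
#!/usr/bin/env python3
"""Rough ECC check: parse 'dmidecode -t memory' for the error correction type.

Assumes sysutils/dmidecode is installed and the script runs as root.
"""
import subprocess

def memory_error_correction():
    """Return the set of 'Error Correction Type' values reported by SMBIOS."""
    out = subprocess.run(
        ["dmidecode", "-t", "memory"],
        capture_output=True, text=True, check=True,
    ).stdout
    types = set()
    for line in out.splitlines():
        line = line.strip()
        if line.startswith("Error Correction Type:"):
            types.add(line.split(":", 1)[1].strip())
    return types

if __name__ == "__main__":
    kinds = memory_error_correction()
    print("Reported error correction:", ", ".join(sorted(kinds)) or "unknown")
    if kinds and kinds.isdisjoint({"None", "Unknown"}):
        print("The memory controller reports ECC support.")
    else:
        print("No ECC reported; consider ECC RAM for mission-critical pools.")
```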

How Much RAM is needed?

FreeNAS requires 8 GB of RAM for the base configuration. If you are using plugins and/or jails, 12 GB is a better starting point. There’s a lot of advice about how RAM-hungry ZFS is and how it requires massive amounts of RAM; an oft-quoted number is 1GB of RAM per TB of storage. The reality is, it’s complicated. ZFS does require a base level of RAM to be stable, and the amount of RAM it needs to be stable does grow with the size of the storage. 8GB of RAM will get you through the 24TB range. Beyond that, 16GB is a safer minimum, and once you get past 100TB of storage, 32GB is recommended. However, that’s just to satisfy the stability side of things. ZFS performance lives and dies by its caching. There are no good guidelines for how much cache a given storage size with a given number of simultaneous users will need. You can have a 2TB array with 3 users that needs 1GB of cache, and a 500TB array with 50 users that needs 8GB of cache. Neither of those scenarios is likely, but they are possible. The optimal cache size for an array tends to increase with the size of the array, but outside of that guidance, the only thing we can recommend is to measure and observe as you go. FreeNAS includes tools in the GUI and the command line to see cache utilization. If your cache hit ratio is below 90%, you will see performance improvements by adding cache to the system in the form of RAM or SSD L2ARC (dedicated read cache devices in the pool).
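
As a concrete example of checking cache utilization from the command line, the sketch below reads the FreeBSD ARC counters (the kstat.zfs.misc.arcstats sysctls, which are stock FreeBSD and present on FreeNAS) and reports the hit ratio since boot. The 90% threshold is simply the rule of thumb from this article, and watching the ratio over time under your real workload is more meaningful than a single number.

```python
#!/usr/bin/env python3
"""Report the ZFS ARC hit ratio since boot using the FreeBSD kstat sysctls."""
import subprocess

def sysctl_int(name):
    """Return an integer sysctl value via 'sysctl -n'."""
    out = subprocess.run(["sysctl", "-n", name],
                         capture_output=True, text=True, check=True).stdout
    return int(out.strip())

if __name__ == "__main__":
    hits = sysctl_int("kstat.zfs.misc.arcstats.hits")
    misses = sysctl_int("kstat.zfs.misc.arcstats.misses")
    total = hits + misses
    ratio = 100.0 * hits / total if total else 0.0
    print(f"ARC hit ratio since boot: {ratio:.1f}% ({hits} hits, {misses} misses)")
    if ratio < 90.0:
        print("Below ~90%: more RAM or an SSD L2ARC device may help.")
```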

RAID vs. Host Bus Adapters (HBAs)

ZFS wants direct control of the underlying storage that it is putting your data on. Nothing will make ZFS more unstable than something manipulating bits underneath it. Therefore, connecting your drives to an HBA or directly to the ports on the motherboard is preferable to using a RAID controller; fortunately, HBAs are cheaper than RAID controllers to boot! If you must use a RAID controller, disable all write caching on it and disable all consistency checks. If the RAID controller has a passthrough or JBOD mode, use it. RAID controllers complicate disk replacement, and improperly configuring them can jeopardize the integrity of your volume (using the write cache on a RAID controller is an almost sure-fire way to cause data loss with ZFS, to the tune of losing the entire pool).
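
One quick sanity check is to look at how your disks are presented to the OS. The sketch below reads the kern.disks sysctl and flags device names created by common FreeBSD hardware RAID drivers; the prefix list is an assumption on my part and is not exhaustive (some controllers in passthrough mode still present disks as da devices), so treat it as a starting point rather than a definitive test.

```python
#!/usr/bin/env python3
"""Flag disks that appear to sit behind a hardware RAID driver on FreeBSD."""
import subprocess

# Device-name prefixes used by common FreeBSD RAID drivers (assumed, not exhaustive).
RAID_PREFIXES = ("mfid", "mfisyspd", "aacd", "amrd", "twed")

def list_disks():
    """Return the disk device names reported by the kern.disks sysctl."""
    out = subprocess.run(["sysctl", "-n", "kern.disks"],
                         capture_output=True, text=True, check=True).stdout
    return out.split()

if __name__ == "__main__":
    for disk in list_disks():
        if disk.startswith(RAID_PREFIXES):
            print(f"{disk}: behind a RAID driver, check the controller configuration")
        else:
            print(f"{disk}: direct attach (HBA or onboard ports)")
```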

Virtualization vs. Bare Metal

FreeBSD (the underlying OS of FreeNAS) is not the best virtualization guest: it lacks some virtio drivers, it lacks some OS features that would make it a better-behaved guest, and most importantly, it lacks full support from some virtualization vendors. In addition, ZFS wants direct access to your storage hardware. Many virtualization solutions only support hardware RAID locally (I’m looking at you, VMware), which leads to the worst-case scenario of passing through a virtual disk on a datastore backed by a hardware RAID controller to a VM running FreeNAS. This puts two layers between ZFS and your data: one for the host virtualization solution’s filesystem on the datastore and another for the RAID controller. If you can do PCI passthrough of an HBA to a FreeNAS VM, and get all the moving pieces to work properly, you can successfully virtualize FreeNAS. We even include the guest VM tools in FreeNAS for VMware, mainly because we use VMware to do a lot of FreeNAS development. However, if you have problems, there are no developer systems running FreeNAS as a production VM and help will be hard to come by. For this reason, I highly recommend that FreeNAS be run “On the Metal” as the only OS on dedicated hardware.

Josh Paetzel
iXsystems Director of IT

Part 2/4 of A Complete Guide to FreeNAS Hardware Design: Hardware Specifics >>

18 Comments

  1. Jon

    I look forward to the rest of the entries in this series.

    One question: Can you elaborate on what exactly you mean by a lack of stability on ZFS configurations with low amounts of RAM? It seems to me that a small/tiny ARC for a pool, while certainly impacting performance on that pool, shouldn’t make the storage and normal I/O against that pool less “stable.” So do you mean that ZFS will crash or lose data? That there would be a kernel panic? I just can’t imagine it failing in those ways solely due to a small ARC. Is there some sort of interaction between the FreeBSD kernel’s memory management and ZFS?

    Perhaps I’m mistaken, but maybe you mean to say that the _performance_ of the pool will be “less stable?” I’m sure a small, constrained ARC, with a workload that shifted between metadata and data loads in unpredictable ways would certainly give some bizarre perf characteristics, for example.

    • Michael Dexter

      Correct. Performance degradation is the concern, and ZFS does not handle resource exhaustion well. A 95%+ full pool will also grind performance to a halt.

  2. Tim Herklots

    If ZFS can’t detect that its non-ECC RAM cache has an occasional bit error, how does the sysadmin find out that the server has stored faulty data?

    • Michael Dexter

      It would be the same with any other file system and this is a risk. Do consider using ECC RAM.

  3. Jason

    I’ve always been intrigued by FreeNAS, I just wish the bar was lower for the hardware requirements. I sure would prefer to use my old hardware rather than having to build a new system with 8-12 GB of ECC RAM.

    • Michael Dexter

      The article describes a pretty high-end system and many users use repurposed hardware. Do consider the FreeNAS Mini for something in between.

  4. Paul

    In section “How Much RAM is needed?”, it should say “2TB array with 3 users that needs 1GB” not 1TB

  5. W.T.

    I see that FreeNAS has left the casual home media user behind.

    • Michael Dexter

      The article represents a high-end configuration. The plugin functionality and Mini have not gone anywhere.

  6. pablo andres

    Good morning,
    I have a question about the installation of FreeNAS: which is more advisable, installing FreeNAS in a virtual machine or directly (without a virtual machine)? Thank you very much for your help.

    • Michael Dexter

      Directly, on dedicated hardware.

  7. Jean-Charles Lambert

    Is it “safe”, for the ZFS pool, to use a RAID controller in JBOD mode and keep its write cache enabled? I ask because I noticed that it speeds up NFS operations a lot. I do not see why it would not be safe, since the ZIL is always written to the pool.
    Could you be more specific when you write: “Using the write cache on a RAID controller is an almost sure-fire way to cause data loss with ZFS”?

    Thanks in advance,
    Jean-Charles

    • Michael Dexter

      Jean-Charles,

      The problem with an on-card write cache is that it reports to the OS that the incoming data has been written to disk when in fact it has only been written to cache. While a battery backup unit (BBU) and even on-disk super capacitors exist to “guarantee” that the data makes it to persistent storage in the case of a power loss or system panic, these “guarantees” have not proven adequately reliable. The proper solution with ZFS is to add a separate log device, or SLOG. Note the recent freenas.org blog post on this topic; this is considered standard practice for NFS.

      • Jean-Charles Lambert

        Michael,

        thanks for the reply. I see the point, which can be critical for bank transactions that can lose crucial information during a power loss if some data remains in the RAID cache while the ZIL has not been committed to the pool.
        In our case, we use FreeNAS to store astrophysical scientific data which can be reproduced. So the only risk I take, by using a RAID controller with cache, in case of power loss and/or system panic, is to lose the latest data from the RAID cache which has not been committed? In any case I could lose the entire zpool, right? (This is important to know)
        Thanks in advance

        • Jean-Charles Lambert

          In any case I could not lose the entire zpool, right? I meant……

          • Charles

            I think that’s exactly the kind of scenario in which you could lose the whole zpool. I think ZFS is quite sensitive to small amounts of corruption in metadata – you may be lucky and lose only a file or two, but it’s possible that the whole file system goes.

  8. Vincent Jansen

    Any recommendations for HBA cards?

    • Joshms

      Many of our users have used the LSI 9211 card with great success!

