In the early 1980s, Motorola introduced the first commercial cell phones. They were huge, heavy, cost thousands of dollars, and ran on an analog network called AMPS, which was bandwidth-intensive and didn’t have basic security features to prevent calls from being intercepted or tapped.
As cutting-edge as they were in their day, no one in their right mind would use one now.
Around the time the Motorola DynaTAC 8000X was introduced, Sun Microsystems developed its Network File System (NFS) as a protocol for client computers to access files on a single central server.
NFS was a breakthrough at the time, but no one in their right mind would use it today, right?
Back then, dial-up connections via modems were measured in bits per second, and local Ethernet LANs peaked at 10 Mbit/s.
With the advent of scale-out architectures in IT – or warehouse-scale computing, as Google calls it – we now have environments for which even the latest and greatest NFSv4 is not suitable. Indeed, it has become a liability.
The biggest problem: NFS was designed for a single central server, not for scale-out. Today’s NFSv4 and even parallel NFS are still based on that centralized model. NFS was designed simply for clients to communicate with one server – a server whose disks held only a few MB, whose files were relatively small, and whose throughput was low.
Every IT manager, CIO, and data scientist in every company today has two goals for storage: one, storage that meets the performance needs of users and applications; and two, adequate data security to ensure protection, compliance, and availability.
Scale-out requires full-mesh (many-to-many) communication between clients and storage servers. Otherwise, bottlenecks and performance drops follow, especially with read- or write-intensive workloads – which are essentially all workloads in a modern enterprise.
And this is ultimately the critical flaw: NFS is itself a bottleneck. The NFS server sits squarely in the data path and cannot scale performance to meet the needs of I/O-intensive computing or many concurrent workloads.
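To make the contrast concrete, here is a minimal sketch in Python of the difference between funnelling every request through one endpoint and letting clients talk to all storage servers in a full mesh. The servers and the fetch_chunk() helper are simulated placeholders, not a real NFS or scale-out client API.

```python
# Sketch: why a single storage endpoint becomes the bottleneck.
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_chunk(server: str, chunk_id: int) -> bytes:
    # Placeholder for a network read; assume each chunk takes ~10 ms.
    time.sleep(0.01)
    return b"x" * 1024

def read_via_single_server(num_chunks: int) -> float:
    # Classic NFS model: every chunk is served by the same machine,
    # so its NIC, CPU and disks cap the aggregate throughput.
    start = time.time()
    for chunk_id in range(num_chunks):
        fetch_chunk("nfs-server", chunk_id)
    return time.time() - start

def read_full_mesh(num_chunks: int, servers: list[str]) -> float:
    # Scale-out model: chunks are striped across many servers and
    # fetched in parallel, so throughput grows with the server count.
    start = time.time()
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = [pool.submit(fetch_chunk, servers[i % len(servers)], i)
                   for i in range(num_chunks)]
        for f in futures:
            f.result()
    return time.time() - start

if __name__ == "__main__":
    servers = [f"storage-{i}" for i in range(8)]
    print("single server:", read_via_single_server(64))
    print("full mesh    :", read_full_mesh(64, servers))
```

The point of the toy timing is only that the single-server path is serialized by one machine, while the full-mesh path spreads the same work across all of them.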
Any gateway is also a bottleneck, and NFS gateways are no exception. Architectures built on NFS gateways face hard limits on performance scaling because the caches of the gateways must be kept consistent to maintain the illusion of a single NFS server. A single server is all NFS understands, and cache consistency is an expensive crutch that keeps an outdated protocol running rather than fixing the actual problem: NFS.
“Load balancing” – I use quotation marks because most of the time the result is anything but balanced – requires a distributed system. Since NFS was never intended for distributed systems, load balancing with NFS is painful and disruptive. The protocol simply doesn’t think in those terms.
Ah, but that’s where parallel NFS comes in – people think it solves all of these problems. Unfortunately, pNFS is still just as broken and still the opposite of scale-out. Only I/O is distributed across multiple servers; there is still a single central server for metadata and the control plane. It will come as no surprise that the explosion in corporate data brings a corresponding explosion in metadata, and performance and scalability in metadata processing are particularly important in big-data applications such as AI/ML and analytics.
Unfortunately, as I keep seeing, pNFS only solves a tiny part of the problem: data transmission. It may be the most modern iteration, but it’s 15 years late to market and leaves many of the real problems unsolved.
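A rough way to see why the centralized metadata path matters: in a pNFS-style layout, reads and writes fan out across data servers, but every create, attribute lookup and layout request still lands on the one metadata server. The toy model below uses hypothetical classes, not the actual pNFS protocol, to count those operations for a “create many small files” workload typical of AI/ML pipelines.

```python
# Toy model: data is striped across many data servers, but all
# metadata operations hit a single metadata server.
from collections import Counter

class MetadataServer:
    def __init__(self):
        self.ops = Counter()
    def create(self, path): self.ops["create"] += 1
    def getattr(self, path): self.ops["getattr"] += 1
    def layout_get(self, path): self.ops["layout_get"] += 1  # which data servers hold the file

class DataServer:
    def __init__(self, name):
        self.name, self.writes = name, 0
    def write(self, data): self.writes += 1

mds = MetadataServer()
data_servers = [DataServer(f"ds{i}") for i in range(16)]

# Create 100,000 small files, one 4 KB write each.
for i in range(100_000):
    path = f"/dataset/sample-{i}"
    mds.create(path)
    mds.layout_get(path)
    data_servers[i % len(data_servers)].write(b"x" * 4096)
    mds.getattr(path)

print("metadata ops on the single server:", sum(mds.ops.values()))  # 300,000
print("writes per data server:", data_servers[0].writes)            # ~6,250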
NFS also fails at failover. Anyone using NFS is familiar with the “stale file handle” problem that shows up during an NFS failover. The protocol, even NFSv4, has no concept of failover – again, it wasn’t designed for it – and instead relies on fragile IP failover, which is slow and disruptive. As with many critical functions, fault tolerance needs to be designed into a protocol from the start; NFS bolted failover on later, like a poorly designed building waiting to collapse.
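On Linux, that stale handle surfaces to applications as the ESTALE errno. Below is a hedged sketch of the workaround client code typically ends up with – reopen by path and retry – using standard Python; the path and retry policy are illustrative only.

```python
# After an NFS "failover", handles held by the client may no longer be
# valid on the replacement server, and reads fail with ESTALE.
import errno
import os

def read_with_estale_retry(path: str, size: int) -> bytes:
    """Read `size` bytes, reopening the file once if the handle went stale."""
    for attempt in range(2):
        fd = os.open(path, os.O_RDONLY)
        try:
            return os.read(fd, size)
        except OSError as e:
            # ESTALE: the server-side handle no longer exists, e.g. after an
            # IP failover to a different NFS server. Reopen by path and retry.
            if e.errno != errno.ESTALE or attempt == 1:
                raise
        finally:
            os.close(fd)
```

Pushing this burden onto every application is exactly the kind of disruption a protocol with built-in fault tolerance avoids.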
This brings me to the second goal of corporate IT: data security – a collective term for data integrity, governance, compliance, protection, access control, etc.
Data security is a major concern, whether the driver is preventing data breaches or meeting industry regulations. Recently, data breaches have resulted in significant fines for companies subject to the European Union’s GDPR. Companies that process personal or health data must implement state-of-the-art data protection, including encryption.
Again, NFS is a liability, as neither pNFS nor NFSv4 offers proper end-to-end encryption, let alone other security mechanisms such as TLS and X.509 certificates – all of which are available today in storage technologies designed for scale-out and security, including Quobyte’s data center file system. In comparison, NFS is a serious business and compliance risk.
pNFS and NFSv4 also lack end-to-end checksums to detect data corruption. In part, this is due to how much larger data operations are today than when NFS was developed. In the 1980s, data integrity via checksums was not an issue: the payloads carried in IP packets were small and the TCP checksums were adequate. Today, however, the TCP checksum is too weak, especially when more than 64 KB are transferred per packet, and companies expect gigabytes per second. Decades later, NFS still does not deal adequately with data integrity. You are likely underestimating how often you get corrupted data back from your NFS storage – and tracking down the problem is difficult and time-consuming.
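For readers unfamiliar with the term, “end-to-end checksums” means the checksum is computed where the data is written and verified again where it is read back, so corruption anywhere in between (NIC, network, server RAM, disk) is detected regardless of what TCP does. A minimal sketch using Python’s standard-library CRC32 follows; real scale-out systems typically use stronger, hardware-accelerated checksums such as CRC32C, and the in-memory store here is only a placeholder for a storage backend.

```python
# End-to-end checksum sketch: the checksum travels with the data and is
# verified by the reader on every access, independent of TCP's checksum.
import zlib

_store: dict[str, tuple[bytes, int]] = {}   # placeholder storage backend

def write_block(key: str, data: bytes) -> None:
    checksum = zlib.crc32(data)
    _store[key] = (data, checksum)          # checksum is persisted with the block

def read_block(key: str) -> bytes:
    data, expected = _store[key]
    if zlib.crc32(data) != expected:
        raise IOError(f"checksum mismatch on block {key}: data is corrupt")
    return data

write_block("chunk-0001", b"some payload")
assert read_block("chunk-0001") == b"some payload"
```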
Whether the workload is high-throughput, random, or mixed, and whether the concern is performance or data security and access, there is nowhere in today’s enterprise where NFS excels. It’s time to ditch the Back to the Future-era protocols in favor of alternatives that give users and applications the performance and reliability they need.