NAS or cloud storage? (Networking)

iljitsch

Ars Tribunus Angusticlavius
8,604
Subscriptor++
After living with modest 100/100 Mbps internet service for the past year, I went back to gigabit. And I could even upgrade to 4 Gbps now. Coincidentally, the past week I experimented with adding 2.5 Gbps Ethernet to my Synology NAS. (See later on in this thread.)

Which made me think: we tended to migrate stuff to the cloud as our internet connectivity became fast enough to no longer notice a speed difference between running it locally and running the same service in some remote datacenter.

For instance, in the late 1980s / early 1990s remote login at modem speeds was mostly on par with using a command line locally. A few years later, Gmail worked just as well as a local email client. Then complex web apps. And now the time has come that raw bandwidth wise, I can access files on a server at the same speed locally as from the cloud.

So... is the NAS industry about to collapse?

On the one hand that makes sense. On the other hand, I'm not sure bandwidth alone is enough if round-trip times are much longer. And of course privacy. But also: it makes sense to share one big CPU with a lot of other people, because CPU usage is extremely spiky. Storage, not so much: minimum and maximum over a year may not even differ by a factor of 2. So if I need 8 TB then I'd need to get an 8 TB drive here at home, but the cloud storage provider would probably use half of a 16 TB drive just for me in their datacenter. Those are not favorable economies of scale. And it gets worse: spinning rust is still cheapest, but you need power to keep it spinning. Probably 24/7 in a datacenter, with data from different customers on the same drive. Here at home, the drives get to sleep when I do.
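To put a rough number on the "power to keep it spinning" point, here's a back-of-the-envelope sketch. The 5 W average draw, $0.20/kWh price, and 8-hours-a-day home duty cycle are all my own assumptions for illustration, not measured figures:

```python
# Rough annual electricity cost of one spinning drive.
# Assumptions (for illustration only): ~5 W average draw, $0.20/kWh,
# a datacenter drive spinning 24/7 vs. a home NAS drive that is
# awake ~8 hours a day and asleep the rest.
HOURS_PER_YEAR = 24 * 365
WATTS = 5.0
PRICE_PER_KWH = 0.20

def annual_cost(duty_cycle):
    """Electricity cost per year of keeping one drive spinning."""
    kwh = WATTS * HOURS_PER_YEAR * duty_cycle / 1000
    return kwh * PRICE_PER_KWH

datacenter = annual_cost(1.0)     # spinning 24/7
home = annual_cost(8 / 24)        # sleeps when the owner does
print(f"datacenter: ${datacenter:.2f}/yr, home: ${home:.2f}/yr")
```

Not a huge amount per drive, but it's a per-drive, per-year cost that the datacenter pays in full and the home NAS mostly avoids.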

Thoughts, considerations?

Just for my own amusement I'm posting this same thread both in the networking forum and the other hardware forum to see how the reactions are going to differ.
 

redleader

Ars Legatus Legionis
35,125
I think cloud storage has been at the point where it's good enough for people who just need storage for a number of years now, so I don't think it is going to collapse now. Personally I had a NAS from about 2000 to 2012, then did cloud only from 2012 to 2020, and then went back to a NAS because I wanted the extra features a dedicated machine offers (Docker to run Home Assistant, and local video recording). The future is probably increasingly marketing NAS devices as a hybrid of storage and services.
 

Paladin

Ars Legatus Legionis
32,735
Subscriptor
The problem is that most cloud providers (and the internet in between) will often average out to peak performance around 200-400 megabit per client IP/transaction, so even as gigabit becomes more common and devices can provide more performance, you still get better performance from a NAS in a lot of situations.

For the average user who just wants to dump their pictures from their phone, yeah cloud storage is fine. Of course, that has been solved for years without a NAS. Google and Apple already let you easily copy your phone data or similar data to the cloud in the background at a few megabit per second.

I don't think I will ever get away from using a NAS at home unless cloud storage becomes much, much cheaper ($10 a year for 50TB or something) and it comes with better reliability and local caching features somehow. I want access to my data when the internet is down, and I don't want to lose my data because of a stupid account lockout mistake or something.
 
Last edited:
The problem is that most cloud providers (and the internet in between) will often average out to peak performance around 200-400 megabit per client IP/transaction, so even as gigabit becomes more common and devices can provide more performance, you still get better performance from a NAS in a lot of situations.
That's precisely it for me. I could get up to 8Gbps from my ISP (which blows my mind, as I still remember my excitement at getting my first 2400 baud modem), but I won't get even 10% of that sort of speed from a cloud storage provider. Local is fast, so the NAS stays.

Cloud storage has its place. If you're not dealing with huge data, or if you are dealing with huge data but can afford to wait, it's probably fine. As an offsite backup for your NAS, it's also fine. But at least for me, it's not a replacement.
 

iljitsch

Ars Tribunus Angusticlavius
8,604
Subscriptor++
Is that raw speed the issue, though? That seems like something the cloud storage providers could offer without too much trouble. But the benefits of scale just don't seem to be there. You still need to have those platters covered with rust spinning, probably 24/7, with mostly the advantage that instead of one NAS box per two or four drives, you have one system per 8-20 drives, with the drives costing mostly the same and the power costing maybe even more.

Now this is for large amounts of data such as around 10 TB per user. A few GB here and there (like the iCloud free tier) is not going to move the needle.
 

continuum

Ars Legatus Legionis
95,187
Moderator
Is that raw speed the issue, though? That seems like something the cloud storage providers could offer without too much trouble.
They could offer it, but they don't, and even if they could for whatever $sumofmoney, internet bandwidth between said provider and end user is much harder to guarantee. Most users* wouldn't want to pay either, which is likely also a major issue. Latency is also an issue for a lot of use cases.

As stated, for the average consumer, cloud storage, even in the TB range, is usually fine. Those who know they need/want a NAS can still buy a NAS.

* = we have enough trouble getting large organizations to pay...
 
Last edited:
Is that raw speed the issue, though?
It's not the only issue, but IMO it's the only issue that's a blocker rather than negotiable.

Pricing could always be better, but you could argue around it. Ease of access is actually an advantage for the cloud side. What to do if your cloud service goes bankrupt, or you forget your password, or you die and your kids want the family photos, is all sorts of disaster-casting, and it works against whichever side you're inclined to dislike most.

But speed? It's either there or it's not. And it's not, so I'm not even looking to answer all those negotiable questions. And let's say they decide to offer multi-gig speeds and they have access to the distributed infrastructure to make that a reality all the way to the consumer Internet connection, and let's add on the definite fiction that they accomplish all of this without raising prices. Well... for 3Gbps service at my house, it's $70 a month more than what I pay now. So if I want to take advantage of this completely free faster service (let's assume I already have an account and I get the new speeds gratis), it's still $840/year over what I'm spending right now with a NAS to get something that in the end is very similar. So with a scenario already tilted with assumptions that heavily favor the cloud service, I'm still not convinced.

I'm not saying I'll never be convinced. Internet prices may keep falling, and maybe there will be other reasons by then to have multi-gig internet to sweeten the pot. But it's a VERY tough sell. Again, just in my opinion, with my personal priorities.

And this is speaking as someone who just went through the trouble of configuring multiple laptops to access the NAS over a VPN connection back to my house from wherever they are, so that they can access their data, even if it's slower. A cloud service would be a MUCH easier and less time-consuming way of accomplishing this. But the crazy thing is, even with my weird VPN duct-tape-and-baling-wire hackery, a roaming laptop will still have speeds back to my house comparable to those of an average cloud storage provider.
 
Yeah, once you get symmetric gigabit self-hosting stuff gets just dramatically more attractive. All my mobile devices VPN to my home 100% of the time. Zero battery impact with wireguard that I can see. If you want it easy just use tailscale instead.
Agree, but in spite of this, I opted out of gigabit for lack of enough bytes to fill the pipe. You could leave off the gigabit and just say that self-hosting on any symmetric connection is attractive, compared to web hosting. After all, if you've got a slow 100/100 connection, your web-hosted and self-hosted content are both limited to 100 Mbps, delivering an equivalent experience. Not me though, I've got blazing fast 200 Mbps symmetrical ;) That's for a household of five, and we still extremely rarely fill the pipe...
 

iljitsch

Ars Tribunus Angusticlavius
8,604
Subscriptor++
[Whether speed is the issue]
It's not the only issue, but IMO it's the only issue that's a blocker rather than negotiable.
[...] It's either there or it's not. And it's not, so I'm not even looking to answer all those negotiable questions.
If speed is the blocking issue, then that's simply a business opportunity for the cloud storage provider and the ISP. On the storage provider side you just need to make sure the striping and prefetching is set up for good performance and these days servers are connected to the network at speeds well over 10 Gbps.

In an ISP network, reserving 100 Gbps or some such as extra headroom to be able to sell SLAs that guarantee 10 Gbps from their customers through their backbone to the interconnect with the storage provider is definitely not a big deal.
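As a quick sanity check on that headroom claim: with statistical multiplexing, a fixed reservation backs "guaranteed" rates for far more subscribers than a naive division would suggest, because few of them burst at once. This is a toy model, not how ISPs actually provision, and the 5% concurrency figure is purely an assumption:

```python
# Toy oversubscription model for the 100 Gbps headroom idea.
# Assumption (for illustration): only ~5% of subscribers are
# bursting at their guaranteed rate at any given moment.
HEADROOM_GBPS = 100
GUARANTEE_GBPS = 10
CONCURRENCY = 0.05   # fraction of subscribers bursting simultaneously

# Naive division would allow only 10 subscribers; with statistical
# multiplexing the same headroom stretches much further.
max_subscribers = HEADROOM_GBPS / (GUARANTEE_GBPS * CONCURRENCY)
print(f"~{max_subscribers:.0f} subscribers per 100 Gbps of headroom")
```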

But I think I made an incorrect hidden assumption: the use case I was thinking about is working on files directly off the file server. Here you need decent speed, say at least 1 Gbps and more like 10 if you want to come close to built-in SSD speeds.

But of course Dropbox, Google Drive, Apple iCloud and Microsoft OneDrive, to name a few, all work with a locally stored cache that is synced with the cloud in the background. This makes raw speed much less important, and thus the end of NASes has come and gone for that particular use case.

(Although I like using Synology Drive, which does the same but off of my NAS, as per my personal motto: cave nube.)

But I'm thinking that with the rise of 5G there could be a decent business case for, for instance, video editing with cloud storage. The storage won't be cheap, but maintaining that level of NAS performance locally won't be either. So just order a 100 Gbps internet link and you're done! And the real benefit will be that you can edit video on a laptop on location without having to think about storage. (As long as there is good 5G.)
 

Xelas

Ars Praefectus
5,564
Subscriptor++
There are still cases where a NAS makes sense, and that's if you are actively working with very large files, or sets of files larger than what local PC storage can accommodate, such as with video production or processing large photo shoots. A fast NAS will always have much better latency and more reliable throughput. You could still have a backup in the cloud, or a tiered system where you offload older footage into cloud storage for archiving, to free up space on the NAS.
No internet cloud provider can outperform a decent NAS. It's trivial to run a 10Gb LAN, but it is very expensive to manage a 10Gb internet connection. The cost of the service is high, and having a good router that can handle actual security at these speeds (and not just basic NAT) is $$$.
 

ocf81

Smack-Fu Master, in training
89
I think there will always be groups who prefer local storage: businesses, privacy-minded people, performance-minded people who want to exceed their internet link capacity, etc. So the NAS market is not about to disappear any time soon. For example, I'm on 10GbE internally and have a NAS to match (might upgrade to 25Gbit some time next year or so). Even that 8Gbit uplink I could get in theory won't make me change my mind any time soon.
 

iljitsch

Ars Tribunus Angusticlavius
8,604
Subscriptor++
What is your drive configuration that could take advantage of more than 10 Gbps?

(BTW, I was copying 5 GB between two virtual servers presumably in the same datacenter and also from my home now at 1 Gbps to one of those. The former reached 120 MB/s (about the max for gigabit Ethernet) quickly, the latter 50 MB/s slowly. Distance: ~50 km, 5 ms.)
 

ocf81

Smack-Fu Master, in training
89
What is your drive configuration that could take advantage of more than 10 Gbps?

(BTW, I was copying 5 GB between two virtual servers presumably in the same datacenter and also from my home now at 1 Gbps to one of those. The former reached 120 MB/s (about the max for gigabit Ethernet) quickly, the latter 50 MB/s slowly. Distance: ~50 km, 5 ms.)
8x SATA SSD is enough, but you'll need to tune TrueNAS a bit to get it consistently.
 

w00key

Ars Tribunus Angusticlavius
6,148
Subscriptor
A single NVMe disk can max out a 10 Gb link, or ~1 GByte/s after overhead. Even a fast SATA 600 disk like an 870 EVO does 500+ MByte/s, or ~5 Gb. Two to four in a RAID should max out the link easily.

Plus, long distance shit only works well with protocols and usages that aren't latency sensitive. Do a bunch of small reads and writes locally and then try the same on a remote mount, and you will notice that cloud stuff works much slower. So instead of copying, try opening a large document with lots of external embeds. Lots of tiny reads takes way longer if not on the same machine.

Even on this PC, local disk vs. the local network is noticeable. The next step out is a remote but nearby network, like Europe west1a to west1b: a few ms already makes you optimize code to do big SQL queries instead of lots of tiny ones.
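That "big queries instead of lots of tiny ones" point can be sketched in a few lines. SQLite is in-process (no real network), so this only models latency by counting round trips; the 2 ms cross-zone RTT is an assumed figure for illustration:

```python
# Toy illustration of "big SQL queries instead of lots of tiny ones":
# N per-row queries cost ~N round trips, one batched query costs ~1.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(i, f"item{i}") for i in range(100)])

# Anti-pattern: one query per row = 100 round trips.
names_naive, naive_trips = [], 0
for i in range(100):
    row = conn.execute("SELECT name FROM items WHERE id = ?", (i,)).fetchone()
    names_naive.append(row[0])
    naive_trips += 1

# Batched: one query = 1 round trip, same result.
names_batched = [r[0] for r in
                 conn.execute("SELECT name FROM items ORDER BY id")]
batched_trips = 1

RTT_MS = 2.0  # assumed cross-zone round-trip time
print(f"naive: ~{naive_trips * RTT_MS:.0f} ms in latency alone, "
      f"batched: ~{batched_trips * RTT_MS:.0f} ms")
```

Same data back either way, but at a few ms per round trip the naive loop burns a couple hundred milliseconds on latency alone.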

Then cross region (west1 to central) - if not designed for it from scratch, it will crater performance. This, or further, is the standard "distance" of cloud providers.


Their workaround is to cache as much as possible locally. Sure, if you don't mind having a duplicate on the PC, work PC, laptop, etc., or having performance crater to "opens the .indd in 5 minutes" when not cached. For me that's fine for rarely accessed files, but it won't replace the NAS; it's in addition to it.

Our production media Google shared drive works like that: slow unless you tell it to download things in advance. Full of InDesign, Photoshop, etc. files. But it's accessible everywhere, and backup by afi.ai makes data loss pretty much impossible.
 

iljitsch

Ars Tribunus Angusticlavius
8,604
Subscriptor++
Plus long distance shit only works well with protocols and usages that aren't latency sensitive. Do a bunch of small read writes locally and then try the same on a remote mount and you will notice that cloud stuff work much slower.
It's not just the small stuff or really large latency. I'm migrating some websites and ordered a new virtual private server yesterday. Then I copied ~ 5 GB of data (one big tar file) from another VPS in the same datacenter and that happened at ~ 120 MB/sec. That's just about the max you can get out of Gigabit Ethernet.

However, I've been working on this stuff locally, here at home some 50 km and 5 ms away, again over gigabit Ethernet. (50 km should really only be 0.5 milliseconds...) The overhead is slightly higher due to a VLAN header and a PPPoE header. But I started copying more or less the same 5 GB from my Synology through my Mac mini to that new VPS at about 20 MB/sec, slowly creeping up to 50 MB/sec.
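For what it's worth, the ~120 MB/s ceiling and the small VLAN + PPPoE penalty can both be computed from per-frame overheads. This is a sketch of the standard arithmetic (TCP/IPv4, full 1500-byte MTU, no TCP options), not a measurement:

```python
# Why gigabit Ethernet tops out around 118 MB/s of file data, and how
# much VLAN + PPPoE headers shave off. Per-frame overheads in bytes:
# preamble + interframe gap + Ethernet header + FCS = 38; IP = 20;
# TCP = 20. PPPoE (8 bytes) rides inside the 1500-byte MTU, while a
# VLAN tag adds 4 bytes on the wire.
LINK_BPS = 1_000_000_000

def goodput_mb_s(pppoe=False, vlan=False):
    mtu = 1500 - (8 if pppoe else 0)       # PPPoE eats into the MTU
    payload = mtu - 20 - 20                # minus IP and TCP headers
    wire = 1500 + 38 + (4 if vlan else 0)  # frame size on the wire
    return LINK_BPS * payload / wire / 8 / 1e6

print(f"plain GE:    {goodput_mb_s():.1f} MB/s")
print(f"VLAN+PPPoE:  {goodput_mb_s(pppoe=True, vlan=True):.1f} MB/s")
```

So the headers only cost about 1 MB/s; the 20-50 MB/s I actually saw is down to latency and TCP behavior, not header overhead.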

(Too lazy, and too protective of my aging SSDs, to try directly from the Mac mini to the VPS.)

I had an interesting experience back in 2015 when I was in Honolulu, some 12 timezones away from the Netherlands. Dutch sites loaded remarkably slowly, with several hundred milliseconds of round-trip delay. Interestingly, from the continental US this was still not too bad, but another half an ocean made all the difference.
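A rough model of why that happens: a cold HTTPS fetch spends several serial round trips before the first byte of content arrives (TCP handshake, TLS handshake, then the HTTP request itself). The round-trip count of 4 and the RTT figures below are assumptions for illustration, not measurements:

```python
# Serial round trips dominate page load time on high-RTT paths.
# Assumed: ~4 round trips before first byte (TCP + TLS + request).
def time_to_first_byte_ms(rtt_ms, round_trips=4):
    return rtt_ms * round_trips

for where, rtt in [("same country", 10), ("continental US", 150),
                   ("Honolulu", 300)]:
    print(f"{where}: ~{time_to_first_byte_ms(rtt):.0f} ms to first byte")
```

Over a second of dead time per fresh connection at 300 ms RTT, before any bandwidth is even used.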

Continuum: so what's faster, 25 GE or "directly connected" (how exactly?)?
 

804solutions

Smack-Fu Master, in training
80
Speed on a LAN is better than anything you will get from a cloud provider. A 5-10 Gbps internet connection will still muddle along at a much lower speed to virtually every site on the internet.

However, my main issue is privacy. I really don't want my personal data stored on any cloud site. If I was going to put that data on a cloud site, I would personally encrypt it before uploading.
 
Trivially easy to E2E encrypt via rclone; I used it for years myself. You can also mount the encrypted volume, so it's transparent. I believe there's Windows and Mac software for it too.

Excellent software.
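For anyone curious, this is roughly what that looks like. A minimal sketch assuming a cloud remote and a "crypt" remote named `secret` wrapping it; both remote names and all paths are placeholders, and you'd create the remotes interactively with `rclone config` first:

```shell
# Copy files up; rclone encrypts them client-side before upload,
# so the cloud provider only ever sees ciphertext.
rclone copy ~/Documents secret:documents

# Mount the encrypted remote as a transparent local filesystem
# (Linux/macOS; Windows needs WinFsp).
rclone mount secret: ~/mnt/secret --vfs-cache-mode writes
```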

 

w00key

Ars Tribunus Angusticlavius
6,148
Subscriptor
Typically you go with iSCSI when performance and latency are important not NFS or SMB. Much less overhead.
RoCE stands for RDMA over Converged Ethernet, a network protocol which allows remote direct memory access (RDMA) over an Ethernet network.
Even that has noticeable latency, never mind iSCSI. Infiniband has (had?) its own little niche as a high bandwidth, low latency network with Remote Direct Memory Access as a core feature of its HBAs. You can't get anywhere near that speed with any other protocol; direct write access to another system's RAM can't be beaten by SCSI command and data packets wrapped in IP that the CPU needs to process first.

Most software actually doesn't really mind lowish speed storage; see everyone hosting VMFS on NFS and iSCSI. But for some tasks, nothing beats local. It's rough: dev = local with many GB/s and 6-digit IOPS on demand; prod on a VM = hah, have fun sharing that with 200 other VMs. Or the VM is in availability zone 1a, the DB in 1b, and the site runs slow as shit due to latency.
 

continuum

Ars Legatus Legionis
95,187
Moderator
Even that has noticeable latency, never mind iSCSI. Infiniband has (had?) its own little niche as a high bandwidth, low latency network with Remote Direct Memory Access as a core feature of its HBAs.
Yar. IIRC it's sub-5 microseconds (5us) for RoCE, sub-2 microseconds (2us) for Infiniband. In absolute terms not much, but that's still a 60% reduction.
 

804solutions

Smack-Fu Master, in training
80
Trivially easy to E2E encrypt via rclone; I used it for years myself. You can also mount the encrypted volume, so it's transparent. I believe there's Windows and Mac software for it too.

Excellent software.

I still don't want all my personal data in the cloud. It took the FBI 40 minutes to hack Trump's shooter's phone.
 

Jeff S

Ars Tribunus Angusticlavius
9,100
Subscriptor++
The problem is that most cloud providers (and the internet in between) will often average out to peak performance around 200-400 megabit per client IP/transaction, so even as gigabit becomes more common and devices can provide more performance, you still get better performance from a NAS in a lot of situations.

For the average user who just wants to dump their pictures from their phone, yeah cloud storage is fine. Of course, that has been solved for years without a NAS. Google and Apple already let you easily copy your phone data or similar data to the cloud in the background at a few megabit per second.

I don't think I will ever get away from using a NAS at home unless cloud storage becomes much, much cheaper ($10 a year for 50TB or something) and it comes with better reliability and local caching features somehow. I want access to my data when the internet is down, and I don't want to lose my data because of a stupid account lockout mistake or something.
I would also add that a NAS can provide some advanced features a cloud provider can't. Someone else mentioned docker (e.g. running your own network services - which might also be done with a VM). Of course, most people aren't going to do that.

But another advanced feature of a NAS is snapshotting on CoW filesystems (ZFS, Btrfs, maybe APFS [not a Mac guy, so I don't know much about this]). That is, not only can your NAS have a backup of all your current files, but past snapshots too.

Cloud storage providers tend to provide file versioning, but not full snapshots. So you might be stuck restoring files to previous versions on a file-by-file basis. While it's true that generally speaking, most users would only want to roll back to a previous version of a single file (e.g. to undo a bad edit of a file), it can be useful sometimes to roll back entire directory trees of related files.

I've also found it can be somewhat hard to see what the actual contents of previous versions are with cloud storage providers - you can see there were 3 previous versions at certain dates and times, but not easily view the contents or a diff of those versions.

With snapshots, you can mount a snapshot and browse the full snapshot like it's a directory. You can use standard dir diff tools (diff, kdiff3, BeyondCompare, etc).
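The snapshot workflow described above looks roughly like this on ZFS; the pool/dataset names (`tank/home`) and paths are hypothetical. Snapshots appear under the hidden `.zfs` directory, so ordinary tools work on them:

```shell
# Take a named snapshot of the dataset.
zfs snapshot tank/home@before-edit

# ...make changes, then compare the live tree against the snapshot
# with any standard directory diff tool:
diff -r /tank/home/.zfs/snapshot/before-edit/project /tank/home/project

# Or ask ZFS directly which files changed between two snapshots:
zfs diff tank/home@before-edit tank/home@after-edit
```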

A NAS also allows you to do much faster recovery if you have, e.g., a 4, 6 or 8 drive RAID 1+0 setup. If you have a drive failure and want to restore all or much of a 2+ TB backup to your local drive, be prepared to wait a while (although, to the point of the OP, maybe the files just live in the cloud and are only downloaded on demand). But if you have a RAID and one drive fails, replace it and start a rebuild; the NAS might even allow you to continue accessing all the files while the RAID copies data to the replacement drive, I think?

Of course, all that comes at a cost. Even a relatively modest 4-drive NAS setup in RAID 1+0 with, say, 2TB drives, plus perhaps a 1TB system SSD for the OS and fast SSD caching of the files on the main data drives (ZFS allows this; not sure if Btrfs does), can add up: $800 for the base NAS server + 4 x $150 per hard drive = $600 for drives, for a total system cost of something like $1,400; maybe with bargain hunting you can get that down to $1,000.

You could do a hybrid where you have your local NAS and do cloud backup of only the latest version of files, so you pay less for cloud storage by not storing the old versions there. Maybe you only pay for 2TB of cloud storage, which is still fairly affordable (a lot of cloud providers will force you to pay a lot more if you go over 2TB), but store old snapshots/versions of files on the NAS, using maybe 4TB there.

Finally, on the topic of saving money on cloud storage: maybe you want cloud storage for all your unique original files (documents you created; photos and videos you took yourself on your phone or SLR camera; databases you compiled yourself; etc.), but you decide not to pay to back up 6 TB of movies and TV shows you, err, acquired by whatever means and could re-acquire if you really need to; or maybe you have the movies and TV shows backed up to Blu-ray discs or whatever.
 

iljitsch

Ars Tribunus Angusticlavius
8,604
Subscriptor++
Of course, all that comes at a cost. Even a relatively modest 4-drive NAS setup in RAID 1+0 with, say, 2TB drives, plus perhaps a 1TB system SSD for the OS and fast SSD caching of the files on the main data drives (ZFS allows this; not sure if Btrfs does), can add up: $800 for the base NAS server + 4 x $150 per hard drive = $600 for drives, for a total system cost of something like $1,400; maybe with bargain hunting you can get that down to $1,000.
You can do this well below $1000 if you omit the SSD. See my calculations in the other thread. If you go for a lower end NAS then that SSD is not going to do much for you anyway as these don't have 5 or 10 Gbps Ethernet and HDDs will keep up with 1 or even 2.5 Gbps just fine.
 

Jeff S

Ars Tribunus Angusticlavius
9,100
Subscriptor++
You can do this well below $1000 if you omit the SSD. See my calculations in the other thread. If you go for a lower end NAS then that SSD is not going to do much for you anyway as these don't have 5 or 10 Gbps Ethernet and HDDs will keep up with 1 or even 2.5 Gbps just fine.
Fair point. The SSD was more, in my mind, for two purposes: 1) so you aren't taking up space on the spinning disks for the NAS OS, and 2) so the OS loads up a lot faster when you need to reboot (e.g. to install an upgraded kernel, because something's glitching, or after you lose power temporarily).

Also, the read cache on an SSD might help with latency: issues with hard drives aren't just about average throughput, but also seek times when dealing with lots of small files or a highly fragmented FS.

Of course, for home users, I'm not clear how much SSD caching would help anyway, as for a cache to be useful you need to read the files more than once (the first read always has to hit the spinning disks), in a relatively short time frame. That's more likely to happen in an organization with lots of users, and much less likely on a home NAS with maybe 2 to 6 users, who are more likely to be accessing different files, and only once.
 

steelghost

Ars Praefectus
5,058
Subscriptor++
See https://forum.level1techs.com/t/zfs-metadata-special-device-z/159954

ZFS doesn't really do file caching in the way people expect, and while you can have special VDEVs that will give you faster access to metadata and optionally, smaller files, they are part of the main pool and need to be redundant just like everything else. If you only put a single SSD as your special VDEV and it fails, you lose the whole pool. So approach with caution for ZFS!
 

Paladin

Ars Legatus Legionis
32,735
Subscriptor
See https://forum.level1techs.com/t/zfs-metadata-special-device-z/159954

ZFS doesn't really do file caching in the way people expect, and while you can have special VDEVs that will give you faster access to metadata and optionally, smaller files, they are part of the main pool and need to be redundant just like everything else. If you only put a single SSD as your special VDEV and it fails, you lose the whole pool. So approach with caution for ZFS!
Unless things have changed a lot recently, you can also use an SSD(s) for L2ARC (secondary read cache after RAM) and SLOG (high speed write area). For a system with a lot of access (read and write), those can be helpful but for the average home, they are not super useful. If you had to choose one, the SLOG is probably best since it speeds up writes which are generally the most time-expensive operation. A couple of cheap, leftover SLC/MLC SSDs in the 100-200GB range are perfect for that kind of thing. Of course, RAM is the top order cache for the system so if you can put in more RAM, that is usually better than an SSD.

For the L2ARC and SLOG, they can fail without much chance of data loss or even downtime. I have had them fail after a few years of use on a system; it raises a warning and stops using the device until you replace it, with lowered performance of course.
 

steelghost

Ars Praefectus
5,058
Subscriptor++

SLOG only really makes a difference for synchronous write operations (those issued with sync semantics); I guess that would be some databases, iSCSI (?), and NFS (depending on settings).

I imagine most normal home NAS use is going to be via SMB, which issues writes asynchronously. So a SLOG probably isn't going to make any performance difference for the average home NAS user who might be choosing between a NAS device and cloud storage. It absolutely could come in handy in a homelab storage backend scenario, though.

Ultimately though, ZFS will throttle overall throughput to match the ability of your pool to commit the TXGs in RAM, so none of these caching strategies really makes much difference when transferring large amounts of data, especially over gigabit Ethernet, at least outside of non-filesystem setups like the UNraid cache disk / "mover" combo.