Thinking Differently About Storage for Unstructured Data

If you ask the internet what the main types of storage are, you’ll get a technically accurate answer that completely misses how businesses consume storage, especially when it comes to storing unstructured data.

The textbook answer goes something like:

“There are three types of storage where unstructured data can sit – Block, File, and Object.”

Whilst this is technically correct, the problem is that these are technical storage classifications, designed by engineers for engineers. They don’t reflect how organisations actually think about, budget for, or consume storage in the real world.

Storage as a Commodity: The Coffee Analogy

If storage is a commodity then we can assume that its base level of functionality is simply to act as a medium to store data without corruption.

That’s it.

Everything else about storage “the commodity” is a variation of quality, performance, and how you pay for it.

Now think about coffee.

Its base functionality is to provide a way to transport flavour and caffeine to the end user. And the price varies depending on the quality of the flavour, the way it is produced, and the delivery experience.

Let’s compare factory-produced instant coffee that you pour boiling water on yourself with boutique hand-roasted beans prepared by an experienced barista. Both deliver caffeine. Both technically “work.” But the quality and experience are what affect the price.

This same logic applies to storage. If the underlying commodity’s job is to store data without corruption, then price variability comes down to performance characteristics, latency requirements, and the commercial consumption model.

Just like coffee, you can get cheap storage that technically works but at the expense of performance, latency or both. Or you can pay premium prices for storage that delivers blistering performance, super low latency, and a white-glove service to your business.

Storage as a Cost Centre: The Business Reality

This is where storage classifications don’t align with the way businesses buy stuff.

If a CFO looks at storage, they don’t see “block versus file versus object.” They see a cost centre. They see budget line items that need to be coded to a department, project or customer.

Storage vendors build their go-to-market strategy around this way of working. They are selling the commodity.

Storage is sold as capacity on day one – either as a large capital expenditure (CapEx) or as a recurring operational expense (OpEx) that spreads the cost over the asset’s lifespan. The vendor’s job is to sell you more capacity, regardless of whether you actually need it or are using what you already have effectively.
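To put a rough number on capacity that is paid for on day one but not used, here is a minimal sketch. The function name, the price, the lifespan and the utilisation levels are all hypothetical figures chosen for illustration, and the even spread over the lifespan is a simplifying assumption rather than an accounting method.

```python
def cost_per_used_tb_per_month(purchase_price, lifespan_months, used_tb):
    """Spread the purchase price evenly over the lifespan,
    then divide by the capacity that is actually in use."""
    if lifespan_months <= 0 or used_tb <= 0:
        raise ValueError("lifespan_months and used_tb must be positive")
    return purchase_price / lifespan_months / used_tb


# Hypothetical purchase: 500 TB for 300,000 (any currency), kept for 60 months.
PRICE = 300_000
LIFESPAN_MONTHS = 60

# The less of the purchased capacity is in use, the more each used TB costs.
for used_tb in (500, 250, 100):
    monthly = cost_per_used_tb_per_month(PRICE, LIFESPAN_MONTHS, used_tb)
    print(f"{used_tb} TB in use: {monthly:.2f} per used TB per month")
```

Under these assumed figures, halving the capacity in use doubles what each used terabyte costs, even though the invoice never changes.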

But there’s a fundamental misalignment at play.

Storage vendors succeed when you consume more storage. But your business succeeds when you extract maximum value from the data that resides on the underlying storage.

Think of a boutique coffee vendor. They will sell you a barista-served doppio espresso (my personal favourite). They’ll sell you some of their beans to take home. They might even sell you some ground beans if you don’t have a grinder.

But they’re not going to sell you a cheap jar of a competitor’s instant coffee brand.

The same is true in storage.

A storage vendor will provide analytics and insights about your data usage, but only as a means to an end. The analytics are the means, and purchasing more of their storage is the end.

The Four Types of Storage That Matter to Business

Forget the technical categories of block, file, and object for a moment. From a business consumption perspective, there are four main types of storage where unstructured data actually lives:

1. Filesystem Storage

This is where most of your active unstructured data lives. It might be referred to as network attached storage (NAS), file servers or shared drives. Users have immediate access by browsing familiar folder structures. Everyone understands it, most applications expect it, and it’s where active work gets done.

We dive deeper into the performance, latency and consumption models of filesystems in this post.

2. Object Storage

This is the equivalent of a warehouse for your data. Object storage excels at scale and is accessed by API. It’s perfect for tightly integrated applications (the easiest example is a webserver), backups, and content that needs to be accessed by systems rather than humans browsing through folders.

The main technical difference between object storage and a filesystem is latency of access. Unlike a filesystem, object storage’s job is to get data there “eventually” (applications are happy to wait a few hundred milliseconds; end users may not be).

Read more about Object storage in this post.

3. Platform Storage

Think OneDrive, Box.com or LucidLink. There are lots of SaaS platforms where data storage is included as part of a service paid for in convenient monthly instalments. You’re not buying storage directly – you’re paying for the platform capability.

And as part of the contract you have with the platform, they provide the base functionality of keeping your data safe, and then you’ll get different levels of performance, latency and availability.

4. Archive Storage

This is a storage class that could be considered part Object, part Platform. Archive platforms are designed to store data that needs to be kept but is rarely accessed.

An archive system is often a complete solution that an organisation purchases. Some consist of a filesystem and some form of proprietary tape storage (occasionally accessed using an object protocol), wrapped in a license agreement that makes it feel like Platform Storage.

For more information about Archive storage, read here.

Why This Framework Actually Matters

Understanding these four categories helps organisations make better decisions about where their data should live and how much they should pay for it.

The real opportunity isn’t in optimising storage types independently; it’s in understanding how data flows between these classes over its lifecycle.
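As a rough illustration of thinking in data flows rather than silos, the sketch below suggests a storage class for a dataset based on how recently it was accessed and who accesses it. The `DataSet` fields, the 90-day and 730-day thresholds and the rules themselves are hypothetical assumptions for illustration only, not a recommended policy. Platform Storage is left out because, as described above, it is bought for the platform capability rather than chosen by data age.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    days_since_last_access: int
    accessed_by_people: bool  # people browsing folders, as opposed to applications using an API
    must_be_retained: bool    # e.g. kept for compliance or contractual reasons


def suggest_storage_class(data, active_days=90, archive_days=730):
    """Suggest where a dataset might live at this point in its lifecycle.

    The thresholds are illustrative assumptions; real values depend on the business.
    """
    if data.days_since_last_access <= active_days:
        # Active data: people expect folders, applications are happy with an API.
        return "filesystem" if data.accessed_by_people else "object"
    if data.days_since_last_access >= archive_days and data.must_be_retained:
        # Rarely accessed but must be kept.
        return "archive"
    # Everything else: cooling data that systems may still need to reach.
    return "object"


datasets = [
    DataSet("current-project", 3, True, False),
    DataSet("web-assets", 1, False, False),
    DataSet("last-year-renders", 400, True, False),
    DataSet("signed-contracts-2015", 2000, True, True),
]

for d in datasets:
    print(f"{d.name}: {suggest_storage_class(d)}")
```

The point of the sketch is that the decision is driven by attributes of the data, so the same dataset can move between classes over time instead of staying wherever it was first created.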

Most organisations treat the storage types as silos defined by their technical classifications, both from an IT department’s perspective and from a CFO and budgeting perspective.

The problem with this is that future spending follows the same pattern. As more data gets created in one type of storage, a business will approach their friendly vendor and ask for more.

Companies that think about their data rather than their storage optimise based on the business value that their data can create.

They can break free from the trap of continually (and blindly) purchasing technology.