What is Block Storage?

May 15th, 2024 | 8 min read

How does Block Storage Work?
Block Storage Use Cases
Block Storage vs File Storage
Block Storage vs Object Storage
Block Storage as a Service
Questions and Answers

The term block storage describes a technology that controls how data is stored on storage devices. The name derives from the fact that block storage splits any kind of information, such as files or raw disk content, into chunks (or blocks) of equal size.

Block storage is the most versatile type of storage, as it is the underlying structure of other storage options, such as file or object storage. It is also the best-known type of storage, since most typical storage media (HDDs, SSDs, NVMe drives, …) are exposed to the system as block storage devices.

How does Block Storage Work?

Block storage devices are split into a number of independent blocks. Each block has a logical block address (LBA) that uniquely identifies it. Furthermore, the blocks are all the same size for the same block device, and typically only one piece of information can be stored within a single block (as it is the smallest addressable unit).
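To make the addressing concrete, here is a minimal sketch of how a logical block address relates to a byte position on the device. The 4096-byte block size and the function names are assumptions chosen for illustration; real devices report their own block size.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes; real devices report their own


def lba_to_byte_range(lba: int, block_size: int = BLOCK_SIZE) -> tuple[int, int]:
    """Return the (start, end) byte offsets covered by one block; end is exclusive."""
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    start = lba * block_size
    return start, start + block_size


def byte_offset_to_lba(offset: int, block_size: int = BLOCK_SIZE) -> tuple[int, int]:
    """Return (lba, offset_within_block) for an absolute byte offset."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, block_size)
```

With these assumptions, lba_to_byte_range(3) returns (12288, 16384), and byte 10000 falls into block 2 at offset 1808 within that block.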

When an application wants to write a file, the first step is to determine whether the file fits into a single block. If it does, the operation is simple: find a free (unused) block and write the file to it.

If the file is larger than a single block, it is split into multiple parts, with each part being written to a separate free block. These blocks are not guaranteed to be adjacent or in order on the device.

Once the file is written to one or more blocks, the block address(es) are recorded in a lookup table. The lookup table is provided by the filesystem installed on the block device, and its format varies from filesystem to filesystem. If you’ve ever heard the term inode in Linux, that’s part of this lookup mechanism.

When reading the file, the filesystem uses the filename to look up the blocks and their read order in the lookup table. The block storage device then reads the requested blocks back into memory, where the file is pieced together in the right order.
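The write and read path described above can be sketched with a toy in-memory model. This is an illustration under stated assumptions, not how any real filesystem is implemented: the class name ToyBlockStore, the tiny 8-byte block size, and the dictionary used as the lookup table are all hypothetical.

```python
import random

BLOCK_SIZE = 8  # deliberately tiny so that short files span several blocks


class ToyBlockStore:
    """In-memory stand-in for a block device plus a filesystem-style lookup table."""

    def __init__(self, num_blocks: int, seed: int = 0) -> None:
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.free = list(range(num_blocks))
        # Shuffle so free blocks are handed out in no particular order.
        random.Random(seed).shuffle(self.free)
        # Lookup table: file name -> (block addresses in read order, file length).
        self.table: dict[str, tuple[list[int], int]] = {}

    def write_file(self, name: str, data: bytes) -> list[int]:
        if name in self.table:
            raise FileExistsError(name)
        # Split the data into block-sized chunks; an empty file still takes one block.
        chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)] or [b""]
        if len(chunks) > len(self.free):
            raise OSError("not enough free blocks")
        addresses = []
        for chunk in chunks:
            lba = self.free.pop()
            self.blocks[lba] = chunk.ljust(BLOCK_SIZE, b"\0")  # pad the last block
            addresses.append(lba)
        self.table[name] = (addresses, len(data))
        return addresses

    def read_file(self, name: str) -> bytes:
        addresses, length = self.table[name]
        # Read the blocks in the recorded order, then trim the padding.
        return b"".join(self.blocks[lba] for lba in addresses)[:length]
```

Writing a 19-byte file with this sketch occupies three blocks at whatever addresses happen to be free, and read_file returns the original bytes because the lookup table remembers both the block order and the file length.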

Block Storage Use Cases

Because blocks are a generic, low-level abstraction, block storage can serve almost any use case. Typical simple use cases include general computer storage, such as virtual hard drives for virtual machines, which store and boot the operating system.

Where block storage really shines, though, is when high performance is required: IO-intensive, latency-sensitive, and mission-critical workloads, such as relational or transactional databases, time-series databases, and container storage. For these workloads, the general rule is: the faster, the better.

Database Workloads

A transactional workload is a series of changes from different users: the database receives reads and writes from various users over time. A group of related modifications needs to be atomic (meaning all of them happen or none of them do), and such a group is known as a transaction. A common example of a transactional workload is a banking system, where multiple money transfers happen in parallel.
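As a small illustration of atomicity, the following sketch uses SQLite from the Python standard library. The accounts table, the account names, and the helper functions are made up for this example; the point is only that a failed transfer leaves no partial change behind.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn


def balance(conn: sqlite3.Connection, name: str) -> int:
    (value,) = conn.execute("SELECT balance FROM accounts WHERE name = ?", (name,)).fetchone()
    return value


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> None:
    """Move money between accounts; either both updates apply or neither does."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        if balance(conn, src) < 0:
            # Raising here undoes the debit that was just executed.
            raise ValueError("insufficient funds")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
```

Calling transfer(conn, "alice", "bob", 30) moves the money in one step, while an attempt to transfer 500 raises an error and leaves both balances exactly as they were.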

Due to the nature of block storage, where each block is an independent unit, databases can optimally read and write data, either with a filesystem in between, or taking on the role of managing the block assignment themselves. With a growing data set, the underlying physical storage can be split into multiple devices, or even multiple storage nodes. The logical view of a block storage device stays intact.
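One simple way to picture a logical device spanning several physical devices is linear concatenation, sketched below. This is only one possible layout (striping is another), and the class name ConcatenatedVolume and its method names are hypothetical; it does not describe how any particular product maps blocks.

```python
from bisect import bisect_right
from itertools import accumulate


class ConcatenatedVolume:
    """Maps one logical block address space onto several devices, laid end to end."""

    def __init__(self, device_sizes: list[int]) -> None:
        if not device_sizes or any(size <= 0 for size in device_sizes):
            raise ValueError("need at least one device, each with a positive block count")
        # Cumulative block counts: ends[i] is the first logical LBA after device i.
        self.ends = list(accumulate(device_sizes))

    @property
    def total_blocks(self) -> int:
        return self.ends[-1]

    def locate(self, lba: int) -> tuple[int, int]:
        """Return (device_index, lba_on_that_device) for a logical LBA."""
        if not 0 <= lba < self.total_blocks:
            raise IndexError("logical block address out of range")
        device = bisect_right(self.ends, lba)
        start = self.ends[device - 1] if device else 0
        return device, lba - start
```

For devices of 100, 50, and 200 blocks, the volume exposes 350 logical blocks, and logical block 100 resolves to block 0 on the second device. The application only ever sees the logical addresses.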

Cloud and Container Workloads

Virtual machines and containers are designed to be a flexible way to place workloads on machines, isolated from each other. This flexibility requires storage that is just as flexible and can easily be grown in size and migrated to other locations (servers, data centers, or operating environments). While alternative storage technologies are available, none of them is as flexible as pure block storage devices.

For organizations considering block storage migration, ensuring flexibility and performance during the transition is critical. Block Storage Migration allows businesses to move their workloads seamlessly while avoiding vendor lock-in and maintaining efficiency.

Other High Velocity Data Workloads

Workloads with high data velocity, meaning data that changes rapidly, often within seconds, need storage solutions that can keep up with the speed of writes and reads. Typical examples include big data analytics, but also real-time use cases such as GPS tracking data (Uber, DHL, etc.). In these cases, directly addressable block access improves read and write performance by removing additional, non-standard access layers.

Block storage speed put to test.

Block Storage vs File Storage

File-level storage, or file storage, refers to storage options that work purely on a file level. File storage is commonly associated with local file systems such as NTFS or ext4, and with network file systems like SMB or NFS, which are widely supported across operating systems. For Mac users, learning how to use these protocols on a Mac can help clarify how they integrate with the system.

From a user’s perspective, file storage is easy to use and navigate, since its design replicates how we work with local file systems. File storage systems present directories and files and mimic their hierarchical nesting. They often provide access control and permissions on a per-file basis.

While easy to use, file storage implementations typically introduce a single access path, so performance can suffer compared to block storage, especially with many concurrent accesses. Interoperability may also be lower than with a pure block storage device, since not every file system implementation is available on every operating system.

Typically, file storage is backed by a block storage device in combination with a file system. This file system is either used locally or made available remotely through one of the available network file systems.

Block Storage vs Object Storage

Object storage, sometimes also known as blob storage, is a storage approach that stores information in blobs or objects (which explains the origin of its name). Each object has a variable amount of metadata attached to it, and is globally uniquely identifiable.

Object identities are commonly collected and managed by the application that stores or reads the objects. These identities are usually represented by URIs, since most object stores today are HTTP-based services, such as AWS S3 or Azure Blob Storage. This means that typical file-style access patterns aren’t available, and that adopting object storage most often requires application changes. The S3 protocol is currently a de facto standard across many object storage implementations. Still, not all of them (especially other cloud providers) implement it, which means implementations aren’t always compatible or interchangeable.

Object storage, while versatile, affects the performance and accessibility of files. Its access patterns, and the additional protocol overhead that comes with them, work well for unstructured, static files such as images, video data, and backup files, but aren’t a good fit for frequently accessed or updated data.

Block Storage as a Service

In summary, block storage, file storage, and object storage each offer distinct advantages and are suited to different use cases. While block storage excels in performance-critical applications, file storage is ideal for shared file access, and object storage provides scalable storage for unstructured data.

Each of those storage types is available via one or more storage-as-a-service offerings. While some are compatible within their own category, others are not. Being able to interchange implementations, or change cloud providers when necessary, may be a requirement. Incompatible protocols, especially in the case of object storage, paired with the performance impact compared to block storage, make basic block storage the tool of choice for most use cases.

Simplyblock offers a highly distributed, NVMe-optimized, block storage solution. It combines the performance and capacity of many storage devices throughout the attached cluster nodes and enables the creation of logical block devices of various sizes and performance characteristics. Imagine virtualization, but for your storage. To build your own Amazon EBS-like storage solution today, you can get started right away. An overview of all the features, such as snapshots, copy-on-write clones, online scalability, and much more, can be found on our feature page.

Questions and Answers

What is block storage?

Block storage is a data storage method that provides virtual hard disk-like raw block devices. It is highly performant and ideal for databases, virtual machines, and Kubernetes stateful workloads, where low-latency access is essential.

How does block storage differ from file and object storage?

Unlike file storage (which stores data as files in directories) or object storage (which stores data with metadata and unique IDs), block storage offers direct access to raw storage volumes. This makes it faster and more suitable for high-performance applications like databases on Kubernetes.

What are common use cases for block storage?

Block storage is ideal for transactional databases, virtual machines, containers, and any workload requiring fast, consistent I/O. It’s especially effective in scenarios demanding high IOPS and low latency, such as AI/ML pipelines or cloud cost optimization.

What are the advantages of block storage in cloud environments?

Block storage allows dynamic provisioning, scalability, and performance tuning. When used with protocols like NVMe over TCP, it provides near-local performance across the network while remaining infrastructure-agnostic. In addition, block storage looks like a typical hard disk and is highly compatible with modern and legacy applications.

How does Simplyblock enhance block storage?

Simplyblock delivers high-performance software-defined block storage with NVMe over Fabrics, native Kubernetes integration, snapshotting, and encryption. It’s designed for modern workloads needing scalable, reliable, and cost-efficient storage without vendor lock-in.
