Due to processing by IBM, this request was reassigned to have the following updated attributes:
Brand - Servers and Systems Software
Product family - IBM Spectrum Scale
Product - Spectrum Scale (formerly known as GPFS) - Private RFEs
Component - Product functionality
For record keeping, the previous attributes were:
Brand - Servers and Systems Software
Product family - IBM Spectrum Scale
Product - Spectrum Scale (formerly known as GPFS) - Private RFEs
Component - V3 Product functionality
Duplicate of another RFE, so closing this one.
This is a requirement that has been around for a long time. In particular, once support for different data and metadata block sizes was introduced, it seemed logical to take this one step further and allow different block sizes for different data pools. Unfortunately, that extra step, while logical, is also substantially harder to take, because it introduces a new class of problems. Data and metadata are never mixed within a single object -- an object is either data or metadata -- so there is no scenario in which a migration between the two could happen. The situation is different when data pools have different block sizes. The whole point of having multiple data pools is the ability to migrate files between them. Such a migration cannot be done atomically for a file with more than one block, which means the GPFS code would need to be able to describe a file whose blocks reside in two (or more) pools with different block sizes. If a migration between pools is interrupted, it can later be restarted, possibly involving yet another pool. With the way data addressing is currently implemented, describing data that resides in different pools within the same file is quite hard.
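As a rough illustration of the addressing problem described above -- a hypothetical sketch only, not GPFS source code or its actual on-disk format -- consider what per-file metadata would have to record if one file's blocks could live in pools with different block sizes, for example while a migration is in flight:

    /* Hypothetical sketch only -- not GPFS source code or its on-disk layout.
     * With a single data block size, a file byte offset maps to a block index
     * by one division.  If one file's blocks could live in pools with
     * different block sizes (e.g. mid-way through an interrupted migration),
     * each range of the file would need to record which pool, and therefore
     * which block size, it resides in. */
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint16_t poolId;         /* storage pool holding this range of the file */
        uint8_t  blockSizeLog2;  /* that pool's data block size, as log2        */
        uint64_t startOffset;    /* first file byte covered by this range       */
        uint64_t length;         /* number of bytes covered by this range       */
        /* ... disk addresses of the blocks in this range would follow ...      */
    } DataExtent;

    /* Map a file byte offset to (extent index, block within extent, offset
     * within block).  Returns 0 on success, -1 if the offset is not mapped. */
    static int mapOffset(const DataExtent *ext, int nExt, uint64_t off,
                         int *extIdx, uint64_t *blk, uint64_t *offInBlk)
    {
        for (int i = 0; i < nExt; i++) {
            if (off >= ext[i].startOffset &&
                off <  ext[i].startOffset + ext[i].length) {
                uint64_t rel   = off - ext[i].startOffset;
                uint64_t bsize = 1ULL << ext[i].blockSizeLog2;
                *extIdx = i;
                *blk = rel / bsize;
                *offInBlk = rel % bsize;
                return 0;
            }
        }
        return -1;
    }

    int main(void)
    {
        /* A file partially migrated: first 32 MiB still in a 1 MiB-block pool,
         * the rest already moved to a 16 MiB-block pool.                      */
        DataExtent ext[] = {
            { 1, 20, 0,           32ULL << 20 },   /* pool 1, 1 MiB blocks  */
            { 2, 24, 32ULL << 20, 64ULL << 20 },   /* pool 2, 16 MiB blocks */
        };
        int e; uint64_t b, o;
        if (mapOffset(ext, 2, 40ULL << 20, &e, &b, &o) == 0)
            printf("offset 40M -> extent %d, block %llu, offset-in-block %llu\n",
                   e, (unsigned long long)b, (unsigned long long)o);
        return 0;
    }

The point of the sketch is that once two block sizes can appear in one file, the simple offset-to-block division has to become a per-range lookup, and that range metadata must survive an interrupted and restarted migration.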
The provocative question that usually gets asked in discussions about this item is: why do users feel the need to use different block sizes in the first place? Wouldn't it be much better if a single block size worked equally well for all workloads? Why not use 16M for everything? The two reasons usually quoted are (a) performance and (b) disk space utilization. While reason (a) was very pressing some years ago, the performance picture has been changing, and the issue should not be nearly as pressing now. In GPFS 4.1, the granularity of IO is largely independent of the block size: one can use 16M blocks with a small-record workload, and GPFS will read and write only the relevant parts of a block, down to 4K granularity. There is still some known work left to handle very large blocks with full efficiency, though.

Reason (b) is a real problem. Since the smallest unit of allocation in GPFS is a subblock, currently fixed at 1/32nd of a full block, a small file that does not fit in the inode occupies at least one subblock. While support for 4K inodes and data-in-inode has ameliorated the issue for very small files and directories, the problem still exists. One way to address it is to allow more than 32 subblocks per block, which is a work item under consideration. If both (a) and (b) were addressed, that might be a credible alternative to using multiple block sizes.
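To make reason (b) concrete, here is a back-of-the-envelope calculation (illustrative only; the 1024-subblocks-per-block figure below is just an example, not a committed design) of how much space a 4 KiB file that does not fit in the inode would occupy under different block sizes and subblock counts:

    /* Illustration of reason (b), not GPFS code: the smallest unit of
     * allocation is a subblock, currently 1/32 of a full block, so a small
     * file that does not fit in the inode occupies at least blockSize/32. */
    #include <stdio.h>
    #include <stdint.h>

    /* Space actually allocated for a file of fileSize bytes, given the
     * pool's block size and the number of subblocks per block.          */
    static uint64_t allocatedBytes(uint64_t fileSize, uint64_t blockSize,
                                   uint64_t subblocksPerBlock)
    {
        uint64_t sub = blockSize / subblocksPerBlock;     /* subblock size     */
        return ((fileSize + sub - 1) / sub) * sub;        /* round up to a sub */
    }

    int main(void)
    {
        uint64_t fileSize = 4 * 1024;                     /* a 4 KiB file      */

        /* 1 MiB blocks, 32 subblocks: 32 KiB minimum allocation.              */
        printf("1M  block, 32 subblocks:   %llu KiB\n",
               (unsigned long long)(allocatedBytes(fileSize, 1ULL << 20, 32) / 1024));

        /* 16 MiB blocks, 32 subblocks: 512 KiB for the same 4 KiB file.       */
        printf("16M block, 32 subblocks:   %llu KiB\n",
               (unsigned long long)(allocatedBytes(fileSize, 16ULL << 20, 32) / 1024));

        /* More subblocks per block (here 1024, purely as an example)
         * would shrink the minimum allocation to 16 KiB.                      */
        printf("16M block, 1024 subblocks: %llu KiB\n",
               (unsigned long long)(allocatedBytes(fileSize, 16ULL << 20, 1024) / 1024));
        return 0;
    }

With 16M blocks and 32 subblocks, the 4 KiB file consumes 512 KiB on disk; raising the subblock count shrinks that minimum allocation without requiring a smaller block size, which is why more subblocks per block could substitute for per-pool block sizes as far as space utilization is concerned.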
Due to processing by IBM, this request was reassigned to have the following updated attributes:
Brand - Servers and Systems Software
Product family - General Parallel File System
Product - GPFS
Component - V3 Product functionality
Operating system - Multiple
For record keeping, the previous attributes were:
Brand - Servers and Systems Software
Product family - General Parallel File System
Product - GPFS
Component - V3
Operating system - Multiple
Creating a new RFE based on Community RFE #56776 in product GPFS.