Any overheads introduced into these environments reduce the usable bandwidth available to applications and lower overall efficiency, which in turn increases the time it takes to arrive at a result. Therefore, the Lustre file system architecture does not implement a redundant storage pattern for data objects across storage servers, because replication and other data redundancy mechanisms introduce inherent latency and bandwidth overheads. Data protection is instead delegated to the storage layer: these storage systems typically comprise multi-ported enclosures, each containing an array of disks or other persistent storage devices.
Intelligent storage arrays abstract the complexity of managing storage redundancy through RAID configuration and offload the computational overhead of checksum and parity calculations from the host server.
Dedicated storage controllers are typically configured with battery-backed cache for buffering data, further isolating the storage services running on the host computer from the IO overhead associated with writing additional data blocks (for example, parity).
After the block layout is defined, a file system can be formatted on top of the software volume. In recent years, however, the JBOD architecture has received a significant boost in popularity thanks to the development of advanced file system technology, exemplified by OpenZFS, which makes storage management easier while improving reliability and data integrity.
The OpenZFS file system reduces the administrative complexity of maintaining software-based storage by taking a holistic view of both the file system and storage management. ZFS integrates volume management features with an advanced file system that scales efficiently and provides enhancements including end-to-end checksums for protection against data corruption, versatility in storage configuration, online data integrity verification, and a copy-on-write architecture that eliminates the need to perform offline repairs.
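To make this concrete, the following sketch shows how a ZFS pool combines volume management and the file system in a single step, and how integrity is verified online; the pool name and device names are hypothetical and will differ on real hardware.

```
# Create a double-parity (raidz2) pool from six disks; ZFS handles both the
# volume layout and the file system, so no separate mkfs step is needed.
zpool create tank raidz2 sda sdb sdc sdd sde sdf

# End-to-end checksums are enabled by default. Data integrity is verified
# online with a scrub rather than with an offline fsck-style repair.
zpool scrub tank
zpool status tank   # reports scrub progress and any checksum errors found
```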
There is no fsck in ZFS. Advances in software-based storage architectures are also influencing storage hardware design, giving rise to hybrid server and storage enclosures that combine storage trays with standard servers in a single high-density chassis. These integrated systems can offer higher density per rack and simpler physical integration, including reduced cabling. Lustre clients do not have direct connections to the block storage and are often completely diskless, with no local data persistence.
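Because clients reach the file system only over the network, mounting Lustre on a client is a single operation against the management server; the NID, file system name, and mount point below are assumed values for illustration.

```
# Mount the file system on a (possibly diskless) client over the network.
# 10.0.0.2@tcp is the MGS NID and "lustre" the file system name (assumed).
mount -t lustre 10.0.0.2@tcp:/lustre /mnt/lustre
```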
Because data is not replicated between Lustre servers, loss of access to a server means loss of access to the data it manages, so a subset of the data in the Lustre file system becomes unavailable to clients. To protect against service failure, Lustre data is usually held on multi-ported, dedicated storage to which two or more servers are connected.
Each server attached to the enclosure has equal access to the storage targets and can be configured to present them to the network, although only one server is permitted to access an individual storage target in the enclosure at any given time. Lustre uses an inter-node failover model to maintain service availability: if a server develops a fault, any Lustre storage target managed by the failed server can be transferred to a surviving server connected to the same storage array. This configuration is usually referred to as a high-availability cluster.
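As a rough sketch of how such a failover pair is expressed, a storage target can be formatted with both servers recorded as service nodes; the NIDs, device path, and file system name below are hypothetical.

```
# Record both OSS nodes as service nodes for a shared OST, so either server
# can mount (and therefore serve) the target, though only one does so at a time.
mkfs.lustre --fsname=lustre --ost --index=0 \
    --mgsnode=10.0.0.2@tcp \
    --servicenode=10.0.0.10@tcp \
    --servicenode=10.0.0.11@tcp \
    /dev/mapper/ost0

# After a failure, the surviving server simply mounts the shared target:
mount -t lustre /dev/mapper/ost0 /mnt/ost0
```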
WekaFS has also solved the metadata choke point that too often cripples the Lustre file system. While the early design decision to separate data and metadata alleviated many of the problems of legacy NAS, it created a new bottleneck for modern workloads. WekaFS addresses the metadata performance issue by spreading data and metadata equally across all the nodes in the storage cluster.
The infrastructure is greatly simplified, while small-file and metadata performance is orders of magnitude better than the Lustre design. Performance scales linearly as more nodes are added, and there is no requirement for complex inode configuration to ensure the file system will scale to the size required by the application.
Figure 2: WekaFS vs. Lustre on IO benchmark.

With its huge parallelism, distributed metadata, enterprise feature set, integrated access to NFS and SMB, and greatly simplified infrastructure design, the Weka file system is much easier to configure and manage for enterprises wishing to leverage a parallel file system for modern workloads in AI and machine learning. It is fully cloud enabled and can be deployed as a private cloud, in the public cloud, or in a hybrid cloud storage model between the two for cloud bursting and backup.
Figure 3: WekaFS in a production environment.

As noted earlier, the Lustre file system is a complex product to set up and configure to meet demanding workloads.
Amazon FSx for Lustre has removed the deployment complexity for users; however, it cannot remedy the inherent challenges outlined in the Lustre design. The feature set is minimal and would not meet most enterprise requirements. Weka also offers a simple-to-deploy instance of WekaFS in AWS that delivers significantly higher performance density with an extensive feature set.
Figure 4: FSx for Lustre vs. Flashblade vs. SAN vs.

A single client should be able to saturate the bandwidth of one OSS. Only the metadata is read up front, so files are downloaded on demand as they are accessed. Apart from these on-demand downloads, however, the other archival commands are not automatic. The copytool for Azure is available here. This copytool supports users, groups, and UNIX file permissions, which are added as metadata to the files stored in Azure Blob storage.
The HSM actions are available through the lfs command, and all the commands that follow accept multiple files as arguments. Once a file has been archived and released, it no longer takes up space in Lustre, but it still appears in the filesystem.
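A minimal sketch of that archive-and-release cycle, using a hypothetical file path:

```
# Ask the copytool to copy the file to the archive (Azure Blob storage here).
lfs hsm_archive /lustre/project/results.dat

# Once archiving has completed, free the data blocks on the OSTs; the file
# remains visible in the namespace but no longer consumes Lustre capacity.
lfs hsm_release /lustre/project/results.dat
```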
When the file is opened, it is downloaded again. The HSM state of a file can be queried to show whether it is not archived, or archived and released (that is, in storage but not taking up space in the filesystem). Querying the current HSM action is most useful when checking the progress of files being archived or restored.
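These queries map to the lfs hsm_state and lfs hsm_action subcommands; the path below is hypothetical, and the exact flag strings mentioned in the comments can vary between Lustre versions.

```
# Report the HSM state of a file.
lfs hsm_state /lustre/project/results.dat
# A file that is not archived typically reports empty flags, e.g. "(0x00000000)".
# An archived and released file typically reports flags such as
# "released exists archived".

# Report any HSM action currently in progress (archive or restore),
# which is useful for checking progress on a file being moved.
lfs hsm_action /lustre/project/results.dat
```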
In certain cases, you may want to restore all the released or imported files into the filesystem. This is best used when all the files are required and you don't want the application to wait for each file to be retrieved separately. A bulk restore of this kind can be started from the command line, and you can check how many files are left to be restored (see the sketch below). You can also view progress in the portal by selecting Monitor and then Logs and querying the copytool's log data.
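A rough sketch of a bulk restore and a progress check, assuming the file system is mounted at /lustre; the exact paths and the format of the action output may vary.

```
# Queue a restore request for every file in the tree; requests are processed
# asynchronously by the copytool, so this command returns quickly.
find /lustre -type f -print0 | xargs -0 -n 1 lfs hsm_restore

# Count how many files still report a restore in progress.
find /lustre -type f -print0 | xargs -0 -n 1 lfs hsm_action | grep -c RESTORE
```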
Lustre itself is written in C; it is a massively parallel, global, distributed file system, generally used for large-scale cluster computing.