Get started with cunoFS Fusion#

Overview#

Warning

cunoFS Fusion is currently in BETA. Your feedback and bug reports will help us improve it.

cunoFS Fusion is a way to upgrade high-performance attached storage solutions, such as Amazon Elastic File System (EFS), with the throughput of object storage. It is a cheaper and faster solution than using EFS alone.

cunoFS Fusion takes an attached storage filesystem and an initially empty object storage bucket or directory, and exposes a single interface for both. It migrates files between object storage and the local filesystem, placing each file on whichever tier offers the better performance and cost.

Note

This feature is available only to Professional and Enterprise Tier customers. If you have any questions, please feel free to contact us.

How it works#

cunoFS Fusion combines the attached storage and the object store into a single virtual mount or FUSE mount. The files on object storage are represented as hidden links from the host filesystem to the object store. Unlike other solutions, the object store is a first-class high-throughput tier, rather than a slow archival tier. cunoFS Fusion automatically migrates files between the two tiers according to how applications access those files. New files may be written to either tier, depending on predicted and observed access properties. cunoFS Fusion supports multiple users simultaneously accessing files on multiple nodes by sharing the attached storage and mount location.

Setting up cunoFS Fusion#

cunoFS Fusion is expected to be used within the same high-speed LAN as your object storage. For example, if you use AWS S3, cunoFS Fusion should only be set up on EC2 nodes within the same AWS Region as the bucket to be accessed. If you are using an S3-compatible on-premises object storage solution, then cunoFS Fusion should be set up on a computer on the same high-speed local network.
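
As a rough way to check the same-Region requirement on AWS, the sketch below compares the Region of a bucket with the Region of the current EC2 instance. It assumes the AWS CLI and curl are installed, that credentials are configured, and that the instance metadata service (IMDSv2) is reachable. The function names and the bucket name are illustrative and are not part of cunoFS.

```shell
#!/bin/sh
# Hypothetical helpers for checking the same-Region requirement; not part of cunoFS.

# get-bucket-location reports no LocationConstraint for us-east-1
# ("None" with --output text) and the legacy value "EU" for eu-west-1.
normalize_bucket_region() {
    case "$1" in
        ''|None|null) echo "us-east-1" ;;
        EU)           echo "eu-west-1" ;;
        *)            echo "$1" ;;
    esac
}

# Region of an S3 bucket (requires the AWS CLI and valid credentials).
bucket_region() {
    normalize_bucket_region "$(aws s3api get-bucket-location \
        --bucket "$1" --query LocationConstraint --output text)"
}

# Region of the current EC2 instance, read from instance metadata (IMDSv2).
instance_region() {
    token=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
        -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
    curl -s -H "X-aws-ec2-metadata-token: $token" \
        "http://169.254.169.254/latest/meta-data/placement/region"
}

# Succeeds only when both Regions are non-empty and identical.
same_region() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example (run on the EC2 node; "my-bucket" is a placeholder):
#   if same_region "$(bucket_region my-bucket)" "$(instance_region)"; then
#       echo "Bucket and instance are in the same Region"
#   fi
```

The network calls are wrapped in functions so that nothing is contacted until you invoke them on the node itself.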

If you’re already set up and using an attached storage system, such as EFS, you may skip ahead to Mounting a cunoFS Fusion filesystem.

Set up an empty bucket or directory on object storage#

For use as the object storage half of a running cunoFS Fusion mount, you will need a location on object storage that is empty. This will be used as the location that the Fusion filesystem will migrate data to when it’s more suitably stored on object storage. This can either be an entirely empty bucket, or an empty directory on an existing bucket. This location must not be modified by anything but cunoFS Fusion filesystems.

For a new empty bucket, follow the instructions for setting up a new bucket.

Set up a new compute node#

cunoFS Fusion needs to be set up on a compute node in the same LAN/region as the object storage bucket to be used.

You will also need to attach a high-speed file storage solution to the instance. If it is convenient, or if attaching storage later will not be possible, do this while creating the new instance/image.

To set up a compute node in the same region as your bucket, follow the relevant steps:

Attach a file storage device#

You will need an empty directory on your attached file storage that is writable by your user. If you don’t have this set up already, follow the relevant steps to attach a writable storage device to your compute node:

Mounting a cunoFS Fusion filesystem#

To set up cunoFS Fusion, you need cunoFS installed on the compute node, and you need to set up credentials so that the bucket/container is accessible.

cunoFS Fusion is set up and accessed through a cunoFS Mount. To do this, additional options must be passed to cuno mount when it is run:

cuno mount \
    --fusion "<path to attached storage backing directory>" \
    --root "<path to object storage backing directory>" \
    "<mount location>"

After you run the above command, <mount location> is the access point for the Fusion filesystem. All operations, workflows, pipelines, etc. should therefore be pointed to <mount location>.

cuno mount options relevant to cunoFS Fusion#

Option

Description

<mount location>

The location from which the Fusion filesystem will be accessed.

--fusion "<path to attached storage backing directory>"

Enables Fusion for this mount and sets the location on attached file storage where file-storage data is kept. Files that are migrated off the object store into local storage are placed here. This can happen when access patterns suggest the file would benefit from higher IOPS, or when a file doesn’t meet the minimum size or age thresholds.

Example: /dev/sdf/fusion-store

--root "<path to object storage backing directory>"

Sets the object storage location used for data that is better suited to object storage than to file storage.

Example: /cuno/s3/bucket/fusion-store

--fusion-size-threshold <size: default 10M> (optional)

Define the minimum size that a file needs to be for it to be migrated from file storage to object storage.

The argument value is of the form <INTEGER>[UNIT]. If no unit is given, the value is assumed to be in bytes. Valid units are K (Kilobytes), M (Megabytes), G (Gigabytes), T (Terabytes).

--fusion-age-threshold <age: default 1h> (optional)

Defines the minimum age a file must reach before it is considered for migration from file storage to object storage. Age is measured from the most recent of the file's POSIX creation time, modify time, access time, and change time; in other words, all four timestamps must be older than the set threshold.

The argument value is of the form <INTEGER>[UNIT]. If no unit is given, the value is assumed to be in seconds. Valid units are s (seconds), m (minutes), h (hours), d (days).
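
To make the threshold value formats and the age rule concrete, here is a minimal sketch in plain shell. It is not how cunoFS implements these checks. All function names are hypothetical, the sketch assumes binary multiples (1K = 1024 bytes), which the text above does not specify, and it assumes that a file exactly at a threshold qualifies.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the threshold rules; not part of cunoFS.

# Convert a size of the form <INTEGER>[UNIT] to bytes.
# ASSUMPTION: units are binary multiples (1K = 1024 bytes).
size_to_bytes() {
    number=${1%[KMGT]}
    unit=${1#"$number"}
    case "$number" in ''|*[!0-9]*) return 1 ;; esac
    case "$unit" in
        '') echo "$number" ;;
        K)  echo $((number * 1024)) ;;
        M)  echo $((number * 1024 * 1024)) ;;
        G)  echo $((number * 1024 * 1024 * 1024)) ;;
        T)  echo $((number * 1024 * 1024 * 1024 * 1024)) ;;
    esac
}

# Convert an age of the form <INTEGER>[UNIT] to seconds.
age_to_seconds() {
    number=${1%[smhd]}
    unit=${1#"$number"}
    case "$number" in ''|*[!0-9]*) return 1 ;; esac
    case "$unit" in
        ''|s) echo "$number" ;;
        m)    echo $((number * 60)) ;;
        h)    echo $((number * 3600)) ;;
        d)    echo $((number * 86400)) ;;
    esac
}

# Print the most recent (largest) of the given epoch timestamps.
newest_timestamp() {
    newest=0
    for t in "$@"; do
        [ "$t" -gt "$newest" ] && newest=$t
    done
    echo "$newest"
}

# Succeed if a file meets both thresholds.
# ASSUMPTION: a file exactly at a threshold counts as meeting it.
# Usage: is_candidate NOW SIZE_BYTES MIN_SIZE MIN_AGE TIMESTAMP...
is_candidate() {
    now=$1 size=$2
    min_bytes=$(size_to_bytes "$3") || return 2
    min_age=$(age_to_seconds "$4") || return 2
    shift 4
    age=$((now - $(newest_timestamp "$@")))
    [ "$size" -ge "$min_bytes" ] && [ "$age" -ge "$min_age" ]
}
```

With GNU coreutils, `stat -c '%s' FILE` prints the size in bytes and `stat -c '%W %Y %X %Z' FILE` prints the birth, modify, access, and change times as seconds since the epoch (`%W` is 0 when the birth time is unknown). Those values, together with `date +%s` and the defaults `10M` and `1h`, can be passed to `is_candidate` to estimate whether a given file would currently meet both thresholds.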

  1. If the bucket is not empty, create a new empty directory on your bucket:

    cuno run mkdir "/cuno/s3/<bucket>/fusion-store"
    
  2. Create a directory on the attached-storage device (assuming you’ve mounted it at /dev/sdf) to use for this purpose:

    mkdir "/dev/sdf/fusion-store"
    
  3. Create the cunoFS Fusion mount:

    cuno mount \
        --fusion "/dev/sdf/fusion-store" \
        --root "/cuno/s3/<bucket>/fusion-store" \
        "$HOME/my-fusion-filesystem"
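
Before running the steps above, it can help to confirm that the attached-storage backing directory is empty and writable, and to preview the mount command with any optional thresholds. The sketch below does both. The helper names are hypothetical and not part of cunoFS, and the second helper only prints a command built from the options documented above; it does not run it.

```shell
#!/bin/sh
# Hypothetical preflight helpers; not part of cunoFS.

# Succeed if DIR exists, is writable by the current user, and is empty.
check_backing_dir() {
    dir=$1
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    [ -w "$dir" ] || { echo "not writable: $dir" >&2; return 1; }
    [ -z "$(ls -A "$dir")" ] || { echo "not empty: $dir" >&2; return 1; }
}

# Print (but do not run) the cuno mount command for the given locations.
# Usage: print_fusion_mount FUSION_DIR OBJECT_ROOT MOUNT_POINT [SIZE] [AGE]
print_fusion_mount() {
    printf 'cuno mount --fusion "%s" --root "%s"' "$1" "$2"
    [ -n "${4:-}" ] && printf ' --fusion-size-threshold %s' "$4"
    [ -n "${5:-}" ] && printf ' --fusion-age-threshold %s' "$5"
    printf ' "%s"\n' "$3"
}

# Example:
#   check_backing_dir "/dev/sdf/fusion-store" &&
#       print_fusion_mount "/dev/sdf/fusion-store" \
#           "/cuno/s3/<bucket>/fusion-store" "$HOME/my-fusion-filesystem" 100M 1d
```

The threshold values `100M` and `1d` in the example are arbitrary; omit them to use the defaults.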
    

For instructions on unmounting, see Unmounting.

Using the cunoFS Fusion filesystem#

All operations, workflows, pipelines, etc. should be pointed to your mount location.

Examples#

The following examples assume you have mounted a Fusion filesystem at ~/my-fusion-filesystem.

Download data from the web into the Fusion filesystem:

cd ~/my-fusion-filesystem
wget http://vision.stanford.edu/aditya86/ImageNetDogs/images.tar

List the files in the Fusion filesystem:

ls ~/my-fusion-filesystem/

Unpack a tar archive:

cd ~/my-fusion-filesystem
tar -xf images.tar

Testing#

Check that access, modification and migration are working as expected. We recommend comparing cost and overall run time against equivalent jobs run on attached storage alone.
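
A minimal sketch of such a check follows. It exercises access and modification only; it cannot observe migration, which cunoFS Fusion decides internally. The function names are hypothetical, and the timer has a resolution of one second, so it suits longer jobs.

```shell
#!/bin/sh
# Hypothetical smoke-test helpers; not part of cunoFS.

# Write, read back, modify, and delete a small file under DIR.
smoke_test() {
    dir=$1
    file="$dir/.fusion-smoke-test.$$"
    printf 'hello\n' > "$file" || return 1
    [ "$(cat "$file")" = "hello" ] || return 1                        # access
    printf 'world\n' >> "$file" || return 1                           # modification
    [ "$(cat "$file")" = "$(printf 'hello\nworld')" ] || return 1
    rm -f "$file" || return 1
    [ ! -e "$file" ]                                                  # deletion
}

# Print the wall-clock seconds taken to run a command inside DIR.
# Usage: time_in_dir DIR COMMAND [ARG...]
time_in_dir() {
    dir=$1; shift
    start=$(date +%s)
    ( cd "$dir" && "$@" ) >/dev/null 2>&1 || return 1
    end=$(date +%s)
    echo $((end - start))
}

# Example:
#   smoke_test ~/my-fusion-filesystem && echo "basic access works"
#   time_in_dir ~/my-fusion-filesystem tar -xf images.tar
```

Running `time_in_dir` with the same command in a directory on the attached storage alone gives a simple baseline for comparison.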

Not seeing what you expect? Get in touch on our public Discourse forum, or privately at support@cuno.io.