Get started with cunoFS Fusion#

Overview#

Warning

cunoFS Fusion is currently in BETA. Your feedback and bug reports will help us improve it.

cunoFS Fusion is a way to upgrade high-performance attached storage solutions like Amazon Elastic File System (EFS) with the throughput of object storage. It is cheaper and faster than using EFS alone.

cunoFS Fusion takes an attached storage filesystem and an initially empty object storage bucket/directory and exposes a single interface for both. cunoFS Fusion will migrate files between object storage and the local filesystem depending on where they are best placed for performance and cost.

Note

This feature is available only to Professional and Enterprise Tier customers. If you have any questions, please feel free to contact us.

How it works#

cunoFS Fusion combines the attached storage filesystem and the object storage location into a virtual mount or a FUSE mount. The files on object storage are represented as hidden links from the host filesystem to the object store. Unlike other solutions, the object store is a first-class high-throughput tier, rather than a slow archival tier. cunoFS Fusion automatically migrates files between the two tiers according to how applications use those files. New files may be written in either tier, depending on predicted and observed access properties. cunoFS Fusion supports multiple users simultaneously accessing files on multiple nodes by sharing the attached storage and mount location.

Setting up cunoFS Fusion#

cunoFS Fusion is expected to be used within the same high-speed LAN as your object storage. For example, if you use AWS S3, cunoFS Fusion should only be set up on EC2 nodes within the same AWS Region as the bucket to be accessed. If you are using an S3-compatible on-premises object storage solution, then cunoFS Fusion should be set up on a computer on the same high-speed local network.

If you’re already set up and using an attached storage system, such as EFS, you may skip ahead to Mounting a cunoFS Fusion filesystem.

Set up an empty bucket or directory on object storage#

For use as the object storage half of a running cunoFS Fusion mount, you will need a location on object storage that is empty. This will be used as the location that the Fusion filesystem will migrate data to when it’s more suitably stored on object storage. This can either be an entirely empty bucket, or an empty directory on an existing bucket. This location must not be modified by anything but cunoFS Fusion filesystems.

For a new empty bucket, follow the instructions for setting up a new bucket.

Set up a new compute node#

cunoFS Fusion needs to be set up on a compute node in the same LAN/region as the object storage bucket to be used.

You will also need to attach a high-speed file storage solution to the instance. If it is convenient, or if attaching storage later is not possible, do this while creating the new instance/image.

To set up a compute node in the same region as your bucket, follow the relevant steps:

Attach a file storage device#

You will need an empty directory on your attached file storage that is writable by your user. If you don’t have this set up already, follow the relevant steps to attach a writable storage device to your compute node:
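Once storage is attached, the three requirements above (the directory exists, is writable by your user, and is empty) can be checked before going further. The following is a minimal sketch; the function name `check_backing_dir` is illustrative and not part of cunoFS.

```shell
# check_backing_dir DIR: succeed only if DIR exists, is writable by the
# current user, and contains no entries (hidden files included).
# Hypothetical helper for illustration; not a cunoFS command.
check_backing_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable by $(id -un): $dir" >&2
        return 1
    fi
    # ls -A lists every entry except . and .. so dotfiles count as content.
    if [ -n "$(ls -A "$dir")" ]; then
        echo "not empty: $dir" >&2
        return 1
    fi
    return 0
}
```

For example, `check_backing_dir /dev/sdf/fusion-store && echo "ready"` would print `ready` only when that directory meets all three conditions.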

Mounting a cunoFS Fusion filesystem#

To set up cunoFS Fusion, you need cunoFS installed on the compute node, and you need to set up credentials so that the bucket/container is accessible.

Your next steps depend on the filesystem you are using and the options used when mounting/attaching it. Please see the relevant section below:

Filesystems without extended attribute support (e.g. EFS)#

cunoFS Fusion is set up and accessed through a cunoFS Mount. To do this, additional options must be passed to cuno mount when it is run. You will need to use the --fusion option to specify the path to the attached storage backing directory, and the --root option to specify the path to the object storage:

cuno mount \
    --fusion "<path to attached-storage backing directory>" \
    --root "<path to object storage backing directory>" \
    "<mount location>"

Example: cuno mount --fusion /dev/sdf/fusion-store --root s3://bucket/fusion-store $HOME/my-fusion-filesystem

By running the above command, the <mount location> becomes the way to access the Fusion filesystem. All operations, workflows, pipelines, etc. should therefore be pointed to <mount location>.
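If the mount is created from a script, it can help to print the exact command before running it. The sketch below assumes nothing beyond the --fusion and --root options shown above; the `fusion_mount` function and the `DRY_RUN` variable are hypothetical names introduced for illustration.

```shell
# Hypothetical wrapper around the mount command shown above.
# With DRY_RUN=1 (the default in this sketch) it only prints the command;
# with DRY_RUN=0 it runs cuno mount, which requires cunoFS to be installed.
DRY_RUN=${DRY_RUN:-1}

fusion_mount() {
    if [ "$#" -ne 3 ]; then
        echo "usage: fusion_mount <backing-dir> <object-root> <mount-location>" >&2
        return 2
    fi
    backing_dir=$1   # attached-storage backing directory
    object_root=$2   # object storage backing directory
    mount_point=$3   # where the Fusion filesystem will be accessed

    if [ "$DRY_RUN" = 1 ]; then
        printf 'cuno mount --fusion "%s" --root "%s" "%s"\n' \
            "$backing_dir" "$object_root" "$mount_point"
    else
        cuno mount --fusion "$backing_dir" --root "$object_root" "$mount_point"
    fi
}

# Example paths taken from this page.
fusion_mount /dev/sdf/fusion-store s3://bucket/fusion-store "$HOME/my-fusion-filesystem"
```

Setting `DRY_RUN=0` in the environment turns the same call into a real mount.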

cuno mount options relevant to cunoFS Fusion#

Argument/Option

Description

<mount location>

The location from which the Fusion filesystem will be accessed.

Example: $HOME/my-fusion-filesystem

--fusion "<path to attached storage backing directory>"

Enables Fusion for this mount, and sets it to use the location specified to store file-storage data. This is where files that are migrated off the object store into local storage go. This can happen when access patterns suggest the file would benefit from higher IOPS, or when a file doesn’t meet the minimum size or age thresholds.

Example: /dev/sdf/fusion-store

--root "<path to object storage backing directory>"

Sets the object storage location to be used as the place to store data that is better suited to be stored on object storage rather than on the file storage.

Example: /cuno/s3/bucket/fusion-store

--fusion-size-threshold <size: default 10M> (optional)

Defines the minimum size that a file needs to be for it to be migrated from file storage to object storage.

The argument value is of the form <INTEGER>[UNIT]. If no unit is given, the value is assumed to be in bytes. Valid units are K (Kilobytes), M (Megabytes), G (Gigabytes), T (Terabytes).

--fusion-age-threshold <age: default 1h> (optional)

Defines the minimum age requirement for files to be considered for migration from file storage to object storage. The age is measured as the time since the most recent of the POSIX creation time, modify time, access time, and change time - in other words, the time since each of these must be greater than the set threshold.

The argument value is of the form <INTEGER>[UNIT]. If no unit is given, the value is assumed to be in seconds. Valid units are s (seconds), m (minutes), h (hours), d (days).
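To reason about what a threshold value means, the <INTEGER>[UNIT] form can be expanded into plain bytes or seconds. This is a sketch for illustration only, not cunoFS's own parser. It assumes the size units are powers of 1024, which this page does not state; the function names are hypothetical.

```shell
# to_bytes VALUE: expand <INTEGER>[UNIT] with UNIT in K, M, G, T to bytes.
# ASSUMPTION: 1K = 1024 bytes. This page does not say whether K means 1000 or 1024.
to_bytes() {
    num=${1%[KMGT]}      # strip one trailing unit letter, if present
    unit=${1#"$num"}     # whatever was stripped (empty when no unit given)
    # Reject empty, non-numeric, or zero-padded input (the shell would
    # otherwise read a leading 0 as octal).
    case $num in
        ''|*[!0-9]*|0?*) echo "invalid size: $1" >&2; return 1 ;;
    esac
    case $unit in
        '') mult=1 ;;
        K)  mult=1024 ;;
        M)  mult=$((1024 * 1024)) ;;
        G)  mult=$((1024 * 1024 * 1024)) ;;
        T)  mult=$((1024 * 1024 * 1024 * 1024)) ;;
    esac
    echo $((num * mult))
}

# to_seconds VALUE: expand <INTEGER>[UNIT] with UNIT in s, m, h, d to seconds.
to_seconds() {
    num=${1%[smhd]}
    unit=${1#"$num"}
    case $num in
        ''|*[!0-9]*|0?*) echo "invalid age: $1" >&2; return 1 ;;
    esac
    case $unit in
        ''|s) mult=1 ;;
        m)    mult=60 ;;
        h)    mult=3600 ;;
        d)    mult=86400 ;;
    esac
    echo $((num * mult))
}

to_bytes 10M     # default size threshold
to_seconds 1h    # default age threshold
```

Under the 1024 assumption, the default size threshold of 10M corresponds to 10485760 bytes, and the default age threshold of 1h to 3600 seconds.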

  1. If the bucket is not empty, create a new empty directory on your bucket:

    cuno run mkdir "/cuno/s3/<bucket>/fusion-store"
    
  2. Create a directory on the attached-storage device (assuming you’ve mounted it at /dev/sdf) to use for this purpose:

    mkdir "/dev/sdf/fusion-store"
    
  3. Create the cunoFS Fusion mount:

    cuno mount \
        --fusion "/dev/sdf/fusion-store" \
        --root "/cuno/s3/<bucket>/fusion-store" \
        "$HOME/my-fusion-filesystem"
    

Filesystems with extended attribute support (e.g. NFSv4 with Linux Kernel 5.9+)#

When using a filesystem that supports extended attributes, please “bind” it to a cloud location first, as this does additional error checking and makes for a simpler mount/unmount procedure.
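If you are unsure which case applies to your attached storage, one way to probe it is to write and read back an extended attribute in the backing directory. The sketch below uses the standard `setfattr` and `getfattr` tools. It assumes that support for the `user.` attribute namespace is a reasonable proxy; this page does not say which namespace cunoFS Fusion relies on. The function name and the attribute name are hypothetical.

```shell
# supports_user_xattrs DIR
#   returns 0 if a user.* extended attribute can be written and read back in DIR,
#   returns 1 if it cannot,
#   returns 2 if setfattr/getfattr are not installed.
# ASSUMPTION: the user.* namespace is representative of what Fusion needs.
supports_user_xattrs() {
    dir=$1
    if ! command -v setfattr >/dev/null 2>&1 ||
       ! command -v getfattr >/dev/null 2>&1; then
        echo "setfattr/getfattr not found (usually in the 'attr' package)" >&2
        return 2
    fi
    # Temporary probe file; removed again so the directory stays empty.
    probe=$(mktemp "$dir/.xattr-probe.XXXXXX") || return 1
    if setfattr -n user.fusion_probe -v 1 "$probe" 2>/dev/null &&
       [ "$(getfattr -n user.fusion_probe --only-values "$probe" 2>/dev/null)" = 1 ]; then
        status=0
    else
        status=1
    fi
    rm -f "$probe"
    return "$status"
}
```

A return value of 1 suggests following the instructions for filesystems without extended attribute support instead.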

Bind a directory to a cloud location#

To bind a local Fusion directory to a cloud location, use cuno fusion. This command records locally which cloud location the mount you create later will point to. It also saves any relevant options used at this stage as metadata, which any mount that uses this Fusion binding will apply by default.

cuno fusion "<path to attached storage backing directory>" "<path to object storage backing directory>"

Example: cuno fusion /dev/sdf/fusion-store s3://bucket/fusion-store

cuno fusion arguments#

Argument

Description

<path to attached storage backing directory>

Sets the location specified as the place to store file-storage data. This is where files that are migrated off the object store into local storage go. This can happen when access patterns suggest the file would benefit from higher IOPS, or when a file doesn’t meet the minimum size or age thresholds.

<path to object storage backing directory>

Sets the object storage location to be used as the place to store data that is more suited for object storage than file storage.

cuno fusion options#

Option

Description

--fusion-size-threshold <size: default 10M> (optional)

Defines the minimum size that a file needs to be for it to be migrated from file storage to object storage.

The argument value is of the form <INTEGER>[UNIT]. If no unit is given, the value is assumed to be in bytes. Valid units are K (Kilobytes), M (Megabytes), G (Gigabytes), T (Terabytes).

--fusion-age-threshold <age: default 1h> (optional)

Defines the minimum age requirement for files to be considered for migration from file storage to object storage. The age is measured as the time since the most recent of the POSIX creation time, modify time, access time, and change time - in other words, the time since each of these must be greater than the set threshold.

The argument value is of the form <INTEGER>[UNIT]. If no unit is given, the value is assumed to be in seconds. Valid units are s (seconds), m (minutes), h (hours), d (days).

cuno fusion subcommands#

Subcommand                            Description

cuno fusion bind <local> <cloud>      Bind a local directory to a cloud directory.
cuno fusion unbind <local>            Unbind a local directory from a cloud directory.
cuno fusion rebind <local> <cloud>    Rebind a local directory to a cloud directory.
cuno fusion info <local>              Get information about the binding at the location.
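As a rough illustration of how these subcommands fit together, the sketch below walks through creating, inspecting, and removing a binding. It prints the commands rather than running them unless `DRY_RUN=0` is set; the `run` helper and `DRY_RUN` variable are hypothetical, and the paths are the example paths used elsewhere on this page.

```shell
# run CMD...: print the command when DRY_RUN=1 (default in this sketch),
# otherwise execute it. Hypothetical helper, not part of cunoFS.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

LOCAL=/dev/sdf/fusion-store       # attached-storage backing directory (example)
CLOUD=s3://bucket/fusion-store    # object storage backing directory (example)

run cuno fusion bind "$LOCAL" "$CLOUD"   # bind the local directory to the cloud directory
run cuno fusion info "$LOCAL"            # show information about the binding
run cuno fusion unbind "$LOCAL"          # remove the binding again
```

Running with `DRY_RUN=0` requires cunoFS to be installed and credentials for the bucket to be configured.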

Mount the cunoFS Fusion filesystem after binding#

cunoFS Fusion is set up and accessed through a cunoFS Mount. To do this, additional options must be used when creating a cunoFS Mount.

If you have already bound a directory to a cloud location, you only need to use the --root option to specify the path to the attached-storage backing directory in order to use the mount as a Fusion mount:

cuno mount \
    --root "<path to attached-storage backing directory>" \
    "<mount location>"

cuno mount arguments/options relevant to cunoFS Fusion#

Argument/Option

Description

<mount location>

The location from which the Fusion filesystem will be accessed.

--root "<path to bound attached-storage backing directory>"

Sets the path where the mount will look for a directory that has been set up for cunoFS Fusion. This means that you must previously have run cuno fusion with the same path, so that the Fusion mount can see which object storage path it is bound to. If your filesystem does not support extended attributes, you should use --root as specified in Filesystems without extended attribute support (e.g. EFS).

Example: /dev/sdf/fusion-store

  1. If the bucket is not empty, create a new empty directory on your bucket:

    cuno run mkdir "s3://<bucket>/fusion-store"
    
  2. Create a directory on the attached-storage device (assuming you’ve mounted it at /dev/sdf) to use for this purpose:

    mkdir "/dev/sdf/fusion-store"
    
  3. Bind the directory, then create the cunoFS Fusion mount:

    cuno fusion "/dev/sdf/fusion-store" "s3://<bucket>/fusion-store"
    cuno mount --root "/dev/sdf/fusion-store" "$HOME/my-fusion-filesystem"
    

For instructions on unmounting, see cunoFS Mount commands.

Using the cunoFS Fusion filesystem#

All operations, workflows, pipelines, etc. should be pointed to your mount location.

Examples#

The following examples assume you have mounted a Fusion filesystem at $HOME/my-fusion-filesystem.

  • Download data from the web into the Fusion filesystem:

    cd $HOME/my-fusion-filesystem
    wget http://vision.stanford.edu/aditya86/ImageNetDogs/images.tar
    
  • List the files in the Fusion filesystem:

    ls $HOME/my-fusion-filesystem/
    
  • Unpack a tar archive:

    cd $HOME/my-fusion-filesystem
    tar -xf images.tar
    

Testing#

Check that access, modification and migration are working as expected. We recommend comparing the cost and the overall time taken against equivalent jobs run on attached storage alone.
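When checking migration, it can be useful to know which files already meet the documented size and age thresholds. The sketch below approximates that test with GNU `stat`. It is not cunoFS's own logic: it ignores creation time and observed access patterns, and the function name is hypothetical.

```shell
# is_migration_candidate FILE MIN_BYTES MIN_AGE_SECONDS
# Succeeds when FILE is at least MIN_BYTES long and the most recent of its
# access, modify and change times is more than MIN_AGE_SECONDS in the past.
# ASSUMPTION: an approximation of the documented thresholds only.
is_migration_candidate() {
    file=$1
    min_bytes=$2
    min_age=$3
    # GNU stat: %s = size in bytes, %X/%Y/%Z = access/modify/change time (epoch seconds).
    info=$(stat -c '%s %X %Y %Z' "$file") || return 2
    set -- $info
    size=$1
    newest=$2
    if [ "$3" -gt "$newest" ]; then newest=$3; fi
    if [ "$4" -gt "$newest" ]; then newest=$4; fi
    age=$(( $(date +%s) - newest ))
    [ "$size" -ge "$min_bytes" ] && [ "$age" -gt "$min_age" ]
}
```

With the default thresholds (10M and 1h, taking 10M as 10485760 bytes, which is an assumption), a check would look like `is_migration_candidate "$HOME/my-fusion-filesystem/images.tar" 10485760 3600`.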

Not seeing what you expect? Get in touch on our public discourse forum, or privately at support@cuno.io.