Configure cunoFS

Overview

Each time you set up cunoFS in a new compute environment, you will need to assess the requirements of your workloads and your end goals.

By default, cunoFS is configured for core object storage access in a LAN setting. If you have a specialized use case, follow the relevant quick start guide.

Which mode is for me?

When configuring cunoFS, it is important to understand the needs of the tools and workloads you wish to run. There are many configuration options, which can be reviewed in our user guide as well as in the cuno help pages. You can change your settings for each workflow or tool, or you can set some options at the bucket level to impose a usage mode for all users.

There are two independent aspects you need to consider when configuring cunoFS:

  • The requirements of your tools in order to work correctly.

  • The needs of your tools in different environments to work efficiently.

For correct operation, you need to know what level of POSIX compatibility your tools require to function. For efficient operation, your primary concerns are the bandwidth and latency between the compute and object storage resources, as well as the behaviour of your tools.

Correct operation - Levels of POSIX compatibility

Some applications have limited requirements from a POSIX perspective and only need the filesystem for basic operations like renaming, reading, or writing new data. Other applications may require a fully POSIX-compatible interface, which requires additional work from the compatibility layer. Because full compatibility requires extra metadata to be written to object storage and may slightly affect performance, we make this optional.

The main levels of POSIX compatibility are Core File Access, POSIX File Access, and POSIX Enforced File Access.

Note

How to check if you need POSIX File Access

  • Check that the outputs match those produced when the same workload runs on the local file system. If the workload has randomised elements (such as sampling in machine learning use cases), fix the seeds so that the comparison is meaningful.

  • If the application fails under cunoFS with Core File Access, reporting (134) ENOTSUP (not supported) or Operation not permitted, then POSIX File Access likely needs to be enabled.
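The two checks above can be sketched as small shell helpers. This is a minimal illustration rather than part of cunoFS: the function names outputs_match and log_suggests_posix are hypothetical, and it assumes your workload writes its results to a directory and its error messages to a log file.

```shell
# outputs_match LOCAL_DIR CUNOFS_DIR
# Returns 0 when both directory trees contain identical files,
# non-zero when they differ or when either directory is missing.
outputs_match() {
    diff -r "$1" "$2" >/dev/null 2>&1
}

# log_suggests_posix LOG_FILE
# Returns 0 when the log contains one of the error messages named above,
# which hints that POSIX File Access may need to be enabled.
log_suggests_posix() {
    grep -E -q 'ENOTSUP|not supported|Operation not permitted' "$1"
}
```

For example, after running the workload once against local storage and once under cunoFS, call outputs_match with the two output directories, and call log_suggests_posix with the log of the cunoFS run. Both helpers only report a hint; a differing output can have causes unrelated to POSIX metadata.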

Core File Access

Objects as files, files as objects; this is the default mode. It does not support the persistence or modification of POSIX users, groups, symlinks, hard links, permission controls, or file mode attributes on objects.

  • Use this when you don’t have any metadata requirements and your tools don’t need any POSIX metadata persistence to function correctly.

  • Use this when you don’t have any write access to the bucket in question, or you do not want to create any cunoFS-internal objects there.

Example use-cases

  • Use this when interacting with data you already have on object storage and only require access to the names and data of those objects. For example, if you have machine learning datasets in the cloud and you have previously configured your libraries to read them directly from object storage.

How to enable

This is the default mode.

POSIX File Access

This mode will maintain POSIX metadata for your objects, but it won’t enforce any permissions or modes set on the objects. That means any user can use cunoFS to read or write any file that their object storage credentials give them access to.

Warning

This mode stores POSIX metadata as objects in a “hidden” directory in your buckets alongside your data. You cannot see these directories when using cunoFS to list objects, but you will see them if you use other tools (such as your storage provider’s web console). Non-cunoFS access which renames, moves, or copies objects with cunoFS-stored POSIX file attributes will result in those objects losing their metadata. You will need to use cunoFS to manage those files while preserving their attributes.

Example use-cases

  • Use this when your applications depend on the preservation of POSIX metadata (owner/group permissions, change/modify times, etc.) or POSIX “links” (symlinks or hard links).

  • Use this if you’re moving workflows from POSIX to object storage, such as workloads that were previously run on EC2 with EFS.

Note

We don’t currently support POSIX ACLs or extended attributes on the cloud. Please get in contact with us at support@cuno.io if you need these features.

How to enable

There are two main ways to enable this mode. If the object storage provider supports setting tags at the bucket level, POSIX compatibility mode can be enabled with the command cuno creds setposix s3://yourbucket true. This affects everyone using the bucket and forces all cunoFS users of that bucket into POSIX compatibility mode. Otherwise, an individual user can enable it manually by setting the environment variable with export CUNO_POSIX=1 (valid for the current session only).
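The two options above might look as follows in a shell session. This is a sketch that reuses only the command and the variable named in the text; s3://yourbucket is the placeholder bucket from that text, and the check for cuno on the PATH is an added safeguard so that the snippet does nothing harmful on a machine without cunoFS.

```shell
# Placeholder bucket URI from the text above; replace it with your own.
BUCKET="s3://yourbucket"

# Option 1: bucket-wide. Applies to every cunoFS user of the bucket and
# needs a storage provider that supports bucket-level tags.
if command -v cuno >/dev/null 2>&1; then
    cuno creds setposix "$BUCKET" true \
        || echo "could not set POSIX mode on $BUCKET" >&2
else
    echo "cuno not found on PATH; skipping bucket-level setting" >&2
fi

# Option 2: per session. Affects only this shell and the processes it starts.
export CUNO_POSIX=1
echo "CUNO_POSIX=$CUNO_POSIX"
```

In practice you would normally pick one of the two options. Because the variable is only exported into the current shell, it has to be set again in each new session, for example from a shell profile.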

POSIX Enforced File Access

This mode will maintain POSIX metadata for your objects, and will enforce POSIX access controls on those objects. Use this when you want to manage what users have access to based on the UID/GID of their UNIX user and the corresponding POSIX metadata (owner, group, mode) on files. That means users will encounter access denied errors if they try to read or write to a file/directory they haven’t been given permission to (by a suitably privileged user doing chown, chgrp or chmod).

Note that this is client-side rather than server-side enforcement. If the user has access to object storage credentials with server-side privileges beyond this, then the user can potentially access or modify objects outside of these POSIX access controls. Contact us at support@cuno.io to find out how to set up ACL policies that enforce server-side access control reflecting the POSIX access controls.
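As an illustration of the permission changes mentioned above, the following sketch applies a mode with chmod and then inspects the result. It runs against a temporary local file so that it works anywhere; the assumption is that under this mode the same standard commands would be pointed at a path on object storage instead. chown and chgrp are left out because they usually require elevated privileges.

```shell
# Temporary local file standing in for a file on object storage (assumption).
demo_file="$(mktemp)"

# Owner may read and write, group may read, others get no access.
chmod 640 "$demo_file"

# The first 10 characters of `ls -l` are the file type and permission bits.
mode="$(ls -l "$demo_file" | cut -c1-10)"
echo "$mode"    # prints -rw-r-----

rm -f "$demo_file"
```

With enforcement enabled, a user who is neither the owner nor a member of the file's group would then be denied access to a file with this mode.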

Warning

This mode stores POSIX metadata as objects in a “hidden” directory in your buckets alongside your data. You cannot see these directories when using cunoFS to list objects, but you will see them if you use other tools (such as your storage provider’s web console). Non-cunoFS access which renames, moves, or copies objects with cunoFS-stored POSIX file attributes will result in those objects losing their metadata. You will need to use cunoFS to manage those files while preserving their attributes.

Example use-cases

  • Host a website using NGINX (or other server technologies) entirely backed by object storage, without any attached storage device (such as EFS). This mode lets you do this while controlling which files/directories the NGINX process (as the nginx user) can access.

  • Host an organisation’s user filesystem in the cloud.

Note

We don’t currently support POSIX ACLs or extended attributes on the cloud. Please get in contact with us at support@cuno.io if you need these features.

How to enable

Please refer to the getting started guide: Setting up Enforced POSIX Access.

Efficient operation in your environments

Where is the client?

Accessing object storage from within the same high-speed network

If the client is in the same high-speed network as the object storage system, the connection is considered LAN (local-area network) access. Examples include:

  • The client is a cloud-hosted EC2 instance / virtual machine within the same region as the cloud data.

  • The client is on the same LAN as an on-premises object storage cluster, with a high-speed, low-latency connection linking the client to the storage.

If this is the case, consider how your tools behave:

  • If the workload requires both high IOPS (such as database operations) and high throughput, consider using cunoFS Fusion, which combines the best of high-speed attached storage and object storage.

  • Otherwise, no additional configuration is necessary.

Accessing object storage from remote networks

If the client is outside the network of the object storage system, the connection is considered WAN (wide-area network). Examples include:

  • Accessing buckets from a cloud-hosted EC2 instance / virtual machine in a different region to the bucket being accessed.

  • A home computer accessing cloud-hosted object storage.

  • Accessing cloud storage of a different cloud provider.

If this is the case, consider:

  • If the client needs to repeatedly read the same parts of a file, consider using client-side caching on a fast local disk.

Note

Client-side caching on disk for workloads requiring many reads of the same data, such as video editing, is coming soon. Be the first to find out about it by signing up to our mailing list on our website or by emailing us directly at info@cuno.io.