Limitations
Note
cunoFS imposes no restrictions on local file access. The limitations listed in this section only apply to data stored on object storage.
Note
This document describes technical limitations that apply to all cunoFS users. Additional limitations are dictated by your licence tier; see Pricing.
Direct interception
Direct interception (using the cunoFS CLI or `LD_PRELOAD`) does not currently support SUID binaries or certain packaged applications, such as Snap, AppImage, or Flatpak apps. Future updates are planned to address this.
If you need to use such applications, use cunoFS Mount or cunoFS FlexMount instead.
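For instance, a packaged application can read data through a cunoFS Mount path rather than through interception. This is only a sketch: the application name and the layout under the mount point `~/my-object-storage` (reused later on this page) are illustrative.

```
# Run an AppImage against data exposed through a cunoFS Mount instead of
# relying on direct interception (application and paths are examples).
./MyEditor.AppImage ~/my-object-storage/my-bucket/notes.txt
```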
Maximum object size
The maximum file size that cunoFS can store on a remote location depends on the storage provider. The following table lists the maximum file size per provider.
| Cloud provider | Maximum file size |
|---|---|
| AWS S3 | 5 TB |
| Google Cloud Storage | 5 TB |
| Azure Storage | 4.77 TB |
Ownership, permissions and file metadata
In Core File Access mode, the owner of remote objects is by default always reported as the current user, and remote file permissions are always reported as `777`. In addition, the creation time of directories is always displayed as the Unix Epoch (00:00:00 UTC on 1 January 1970). These defaults can be overridden using `CUNO_OPTIONS` (see Ownership and Permissions).
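As an illustration of the behaviour described above (the bucket and directory names are placeholders):

```
# In Core File Access mode, a listing like this reports the current user
# as owner, 777 permissions, and the Unix Epoch as the directory creation
# time (bucket and directory names are illustrative).
(cuno) $ ls -ld /cuno/s3/my-bucket/my-directory
```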
Directories in Azure
Creating a directory in Azure Storage (using `mkdir`) results in a remote blob called `<no name>` being displayed inside the created directory when browsing with the GUI/file explorer provided by the Azure portal. However, `ls` and all other CLI commands behave as expected.
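For example (the container and directory names are illustrative):

```
# Create a directory on Azure storage through cunoFS; CLI tools list it
# normally, and only the Azure portal GUI shows an extra "<no name>" blob
# inside it.
(cuno) $ mkdir /cuno/az/my-container/new-dir
(cuno) $ ls -la /cuno/az/my-container/new-dir
```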
Auto-completion
Auto-completion and wildcard characters are fully supported in a cunoFS active shell. Such a shell can be created either using the `cuno` command or using `LD_PRELOAD` (e.g. `LD_PRELOAD=/usr/lib/cuno.so bash`). In the latter case, auto-completion of cloud paths containing colons, such as `s3://bucket`, will only succeed if `:` is removed from the separator list variable `COMP_WORDBREAKS`. For example:
[[ "$LD_PRELOAD" =~ cuno ]] && export COMP_WORDBREAKS=${COMP_WORDBREAKS/:/}
Memory-mapping
Currently, only read-only private file memory mapping is supported.
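As a sketch of what remains possible under this limitation, the following creates a private, read-only mapping of a remote file (the `/cuno/s3/my-bucket/data.bin` path is illustrative):

```
# A read-only, private memory mapping of a file on object storage
# (the path is an example).
python3 -c '
import mmap

with open("/cuno/s3/my-bucket/data.bin", "rb") as f:
    # PROT_READ + MAP_PRIVATE is the supported combination; writable or
    # shared mappings are not.
    m = mmap.mmap(f.fileno(), 0, flags=mmap.MAP_PRIVATE, prot=mmap.PROT_READ)
    print(m[:16])
'
```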
Applications
Python
Python’s `os.path.realpath(path)` is not supported for URI-based access like `xx://`.
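A hedged workaround sketch, based on the directory-style paths described elsewhere on this page (the bucket name is illustrative, and this assumes such paths resolve like ordinary local paths):

```
# os.path.realpath() cannot resolve URI-style paths such as s3://my-bucket/data;
# the /cuno/<provider>/<bucket> directory form is the assumed alternative.
python3 -c 'import os; print(os.path.realpath("/cuno/s3/my-bucket/data"))'
```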
Rsync
We strongly recommend running `rsync` with the options `--inplace -W`. This makes `rsync` work more efficiently with object storage.
To use `rsync` options that preserve permissions (`-p`) and modification times (`-t`), for example when you want to update files only when the source has changed, you must enable POSIX File Access.
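A sketch of a recommended invocation (the source and destination paths are illustrative; `-r` is used instead of `-a` because `-a` implies `-p` and `-t`, which require POSIX File Access):

```
# Copy a local tree to object storage with the recommended options
# (paths and bucket name are examples).
rsync -rv --inplace -W /local/data/ /cuno/s3/my-bucket/data/
```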
Fpsync
When using `fpsync`, use the `-o` option to pass the options recommended for `rsync` down to the worker processes, e.g. `-o "--inplace -W"`. Furthermore, because cunoFS is already parallelised, we recommend limiting the number of fpsync worker processes using the `-n` option.
As with `rsync`, to use options that preserve permissions (`-p`) and modification times (`-t`), such as when you want to update files only when the source has changed, you must enable POSIX File Access.
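A sketch of a combined invocation (the paths and the worker count are illustrative):

```
# Pass the recommended rsync options to each worker and cap the number of
# concurrent fpsync jobs, since cunoFS already parallelises transfers.
fpsync -n 4 -o "--inplace -W" /local/data/ /cuno/s3/my-bucket/data/
```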
sudo with Direct Interception
Using Direct Interception (including when using the cunoFS CLI) requires the `LD_PRELOAD` environment variable to be set and maintained for all executed child processes. Since `sudo` does not preserve environment variables by default, the following requirements apply:

- `sudo` needs to be run with `--preserve-env` to preserve `CUNO_OPTIONS`.
- `sudo` needs to launch a child shell that will then run the command, so that the `LD_PRELOAD` environment variable can be set before running the command to be intercepted.
- `LD_PRELOAD` needs to be set manually inside the launched child shell.

To use `sudo` with Direct Interception, do the following:

1. Start a wrapped shell using the cunoFS CLI:

   cuno

2. Run `sudo` in the following way:

   sudo --preserve-env /bin/bash -c "export LD_PRELOAD=$LD_PRELOAD && <YOUR COMMAND HERE>"
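For instance, to list a bucket as the root user (the bucket name is illustrative):

```
# Concrete example of the pattern above: run `ls` on a bucket as root,
# with LD_PRELOAD re-exported inside the child shell.
(cuno) $ sudo --preserve-env /bin/bash -c "export LD_PRELOAD=$LD_PRELOAD && ls /cuno/s3/my-bucket"
```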
Locate
The `locate` application requires heightened privileges to create its database, and it also has some incompatibilities with cunoFS Direct Interception.
Issues with Direct Interception
When using Direct Interception, note that `locate` and `updatedb` do not work with URI-style paths. Please use directory-style paths of the form `/cuno/xx/<bucket>`.
Furthermore, Direct Interception (even when using the cunoFS CLI) requires the `LD_PRELOAD` environment variable to be set and maintained for all executed child processes. Since `updatedb` typically needs to be run with `sudo`, the limitations specified in sudo with Direct Interception apply here.
Instructions for using Locate
To help work around these limitations, follow the steps below to use `locate`.
1. Create a new database, which we call `cunoloc.db`:

   - If you have a cunoFS Mount set up at `~/my-object-storage`, you can use `updatedb` directly to crawl all paired buckets from all your object storage providers:

     sudo updatedb -U ~/my-object-storage -o cunoloc.db

     Warning
     The mount location should not change (`~/my-object-storage` in the example above), because it is written into the created database.

   - Alternatively, run `updatedb` with the workarounds for sudo:

     1. Launch a cunoFS shell:

        $ cuno

     2. Run `updatedb` inside the shell, using a cloud path in the directory format:

        (cuno) $ sudo --preserve-env /bin/bash -c "export LD_PRELOAD=$LD_PRELOAD && updatedb -U /cuno/<s3/az/gs>/<bucket> -o cunoloc.db"

2. Change the database ownership back to your current user:

   sudo chown $(whoami):$(whoami) cunoloc.db

3. Add the database to your `LOCATE_PATH` environment variable and use `locate` normally, or search within the database directly:

   locate -d cunoloc.db myfile
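For example, to make the database available to plain `locate` invocations (this assumes `cunoloc.db` is in your home directory):

```
# Point locate at the cunoFS database. Whether LOCATE_PATH adds to or
# replaces the default database depends on your locate implementation;
# the database location is an assumption.
export LOCATE_PATH="$HOME/cunoloc.db"
locate myfile
```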
Setting up the Locate cron job
By default, the global locate database is periodically updated by a cron job. To set up the cron job for cunoFS, you need to edit the file `/etc/cron.daily/mlocate`. The last line of this file updates the global database:
flock --nonblock /run/mlocate.daily.lock $NOCACHE $IONICE nice /usr/bin/updatedb.mlocate
This should be replaced with something like:
LD_PRELOAD='/usr/lib/cuno.so' CUNO_OPTIONS='<your options>' CUNO_CREDENTIALS='<path to credentials usable by the root user>' flock --nonblock /run/mlocate.daily.lock $NOCACHE $IONICE nice /usr/bin/updatedb.mlocate
If you don’t want to index all of your object storage, you can specify where `updatedb` should not look for files by adding paths to `PRUNEPATHS` in the file `/etc/updatedb.conf`.
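For example, a hypothetical `/etc/updatedb.conf` entry that skips a scratch bucket while indexing everything else (the bucket path is illustrative; keep whatever system paths your distribution already lists):

```
# /etc/updatedb.conf: paths listed in PRUNEPATHS are skipped by updatedb.
# The /cuno/s3/scratch-bucket entry is an example; retain your defaults.
PRUNEPATHS="/tmp /var/spool /media /cuno/s3/scratch-bucket"
```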