Large File Storage (LFS)
By default, the ARC Commander tracks the following files via LFS:
- All files stored in an assay’s dataset folder, and
- All files with a size larger than 150 MB.
The threshold of 150 MB can be adjusted using the ARC Commander. For instance, you can decrease it to 5 MB (i.e. 5000000 bytes).
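The byte value is the size in decimal megabytes multiplied by 1,000,000. A minimal sketch of that conversion follows. The setting name `general.gitlfsbytethreshold` and the `arc config set` flags in the final comment are assumptions, so verify them against your installed ARC Commander version before running the command.

```shell
# Threshold in decimal megabytes (1 MB = 1,000,000 bytes), as in the 5 MB example
threshold_mb=5
threshold_bytes=$((threshold_mb * 1000000))
echo "$threshold_bytes"   # prints 5000000

# Assumed invocation (setting name and flags not confirmed by this page):
# arc config set -g -n "general.gitlfsbytethreshold" -v "$threshold_bytes"
```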
Track files via LFS
In addition to the defaults, you can also actively choose which files to track via LFS.
- Update your local ARC via arc sync
- Add large files or folders by copying or moving them to your ARC
- Track the new files or folders via LFS
- Sync your ARC to the DataHUB via arc sync
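The four steps above can be sketched as one shell function. This is an illustration under assumptions, not the ARC Commander's own workflow: the helper name `add_large_file` and the example paths are hypothetical, and `git lfs track` is the standard Git LFS command for marking a path as LFS-tracked. Run it from the root folder of your ARC, with the target path given relative to that folder.

```shell
# Hypothetical helper wrapping the four steps above.
# Usage: add_large_file <source file> <target path inside the ARC>
add_large_file() {
  src="$1"
  dest="$2"
  arc sync &&                 # 1. update the local ARC
  cp "$src" "$dest" &&        # 2. copy the large file into the ARC
  git lfs track "$dest" &&    # 3. track it via Git LFS (recorded in .gitattributes)
  arc sync                    # 4. sync the ARC to the DataHUB
}

# Example call (hypothetical paths):
# add_large_file ~/Downloads/large_table.csv studies/myStudy/resources/large_table.csv
```

Chaining the steps with `&&` stops the sequence as soon as one step fails, so a failed copy is not followed by a sync.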
Downloading an ARC without large data files
Sometimes you may want to download your ARC to a smaller computer, where you do not need a full copy of your ARC including all its large data files. For instance, you may just want to work with smaller derived data sets or update ISA metadata.
In this case, you can add the -n or --nolfs flag to your arc get command.
For example, have a look at the ARC https://git.nfdi4plants.org/shiltemann/physcomitrium-patens-light-signaling-2022/. In the DataHUB, this ARC has a storage volume of ~84 GB (December 2023), most of which comes from the large RNASeq data files flagged as “LFS”.
You can download this ARC without the LFS objects by adding the --nolfs flag to arc get.
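A minimal sketch of such a download, using the repository address of the example ARC. The wrapper name `get_arc_nolfs` is hypothetical, and the `-r` option for passing the repository address is an assumption about `arc get`; only the `--nolfs` flag is taken from this page.

```shell
# Repository address of the example ARC mentioned above
arc_url="https://git.nfdi4plants.org/shiltemann/physcomitrium-patens-light-signaling-2022/"

# Hypothetical wrapper: download an ARC without its LFS objects.
# --nolfs is the flag described above; -r (repository address) is assumed.
get_arc_nolfs() {
  arc get --nolfs -r "$1"
}

# Usage: get_arc_nolfs "$arc_url"
```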
Selectively download large files
If at some point you wish to selectively download one or more of the LFS objects of your ARC to that machine, you can do so via git lfs pull --include "<path/to/fileOrFolder>"
For example, this can be used to download just one of the large RNASeq data files.
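A minimal sketch of a selective download for more than one file, assuming a hypothetical helper name and placeholder paths (the real paths depend on the assay folders of your ARC). It relies on `git lfs pull --include` accepting a comma-separated list of paths or patterns.

```shell
# Hypothetical helper: pull only the listed LFS paths.
# Joins all arguments with commas, the list format --include expects.
lfs_pull_selected() {
  include=""
  for p in "$@"; do
    include="${include:+$include,}$p"
  done
  git lfs pull --include "$include"
}

# Example call (placeholder paths, run from inside the ARC):
# lfs_pull_selected "<path/to/fileA>" "<path/to/folderB>"
```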
Download all large files in the ARC
If at some point you wish to download all LFS files of your ARC, run git lfs pull without the --include option.