Rsync for Mediaflux
Introduction
Welcome to this step-by-step tutorial on using the rsync command line tool with your Mediaflux project at the University of Melbourne (UoM). This guide covers the basic usage and the supported options of the rsync command for efficient file synchronisation and transfer with your UoM Mediaflux projects.
Why use rsync?
rsync is a command line tool for transferring and synchronising files and directories. It is especially well suited to file sets where only a small portion of the files have changed, or where existing files have been partly modified. It transfers only the files that have changed, and only the changed parts of those files, rather than resending entire files, which makes it more bandwidth efficient.
It also has advantages over the unimelb-mf-clients: it offers this transfer efficiency, and most Linux and macOS systems have rsync pre-installed, whereas the unimelb-mf-clients must be installed separately. On Windows you can get rsync through WSL (Windows Subsystem for Linux) or Cygwin.
The main limitations are that rsync is single threaded and slow to upload entire large files. It is also slow to upload large files that have changed, because it needs to compare existing blocks with those to be transferred. Hence, it can sometimes be faster to delete existing large files that have changed and transfer the whole file again, especially if you have a fast network connection.
Please note, this is a special access method offered by the UoM Mediaflux service and not a standard Mediaflux function.
This is currently a Beta release, offered on a USE AT OWN RISK basis. Please verify data integrity after each transfer and before deleting other copies of your data. You can use our unimelb-mf-check client to verify your data (the module is already installed on SPARTAN).
Prerequisites
A UoM Mediaflux project
Access to UoM VPN
Access to a terminal or command prompt
rsync installed on your system (ensure the rsync protocol version is 30 or later)
Connect to university VPN
Connect to RCS VPN
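
The protocol version requirement above can be checked from the terminal. The following is a minimal sketch, assuming your client prints a "protocol version N" string in its `rsync --version` output (standard rsync does). The helper names `parse_protocol_version` and `protocol_ok` are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Extract the protocol number from `rsync --version`-style text on stdin.
parse_protocol_version() {
  sed -n 's/.*protocol version \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed only if the protocol number is 30 or later.
protocol_ok() {
  [ -n "$1" ] && [ "$1" -ge 30 ]
}

# Only query the local client if rsync is actually installed.
if command -v rsync >/dev/null 2>&1; then
  proto="$(rsync --version 2>/dev/null | parse_protocol_version)"
  if protocol_ok "$proto"; then
    echo "rsync protocol version $proto: OK"
  else
    echo "rsync protocol version '${proto:-unknown}' is below 30; consider upgrading"
  fi
else
  echo "rsync not found; install it before continuing"
fi
```

Older clients such as rsync 2.6.9 report protocol version 29 and would fail this check.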
Options supported
List of supported args/options:
-a
-t
-p
-g
-n/--dry-run
--delete
--exclude
--include
-v
-l
-r
-i/--itemize-changes
--filter
-I/--ignore-times
-o
-d/--dirs
--progress
--stderr=e|a|c
-u/--update
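
Before running a long transfer, it can help to confirm that every option you plan to pass appears in the list above. The following is a rough sketch only: `is_supported_option` is a hypothetical helper, not part of rsync or Mediaflux, and it simply compares one option token against the list in this guide.

```shell
#!/bin/sh
# Options copied from the supported list in this guide.
SUPPORTED_SHORT="atpgnvlriIodu"
SUPPORTED_LONG="dry-run delete exclude include itemize-changes filter ignore-times dirs progress stderr update"

# Hypothetical helper: succeed if every flag in one option token is in the list.
is_supported_option() {
  opt="$1"
  case "$opt" in
    --*)
      name="${opt#--}"
      name="${name%%=*}"            # drop any =VALUE suffix
      case " $SUPPORTED_LONG " in
        *" $name "*) return 0 ;;
        *) return 1 ;;
      esac
      ;;
    -?*)
      flags="${opt#-}"
      while [ -n "$flags" ]; do
        rest="${flags#?}"
        ch="${flags%"$rest"}"       # first remaining character
        case "$SUPPORTED_SHORT" in
          *"$ch"*) ;;
          *) return 1 ;;
        esac
        flags="$rest"
      done
      return 0
      ;;
    *) return 1 ;;
  esac
}

# Report each option token as listed or not listed.
for o in -rv --delete -z --stderr=e; do
  if is_supported_option "$o"; then
    echo "$o: listed"
  else
    echo "$o: not listed"
  fi
done
```

The `-e` option used in the examples in this guide tells the local client how to start ssh, so it is deliberately left out of this check.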
Example 1: Upload Files From Your Local Computer Into Your Mediaflux Project
$ rsync -rv -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' YOUR_LOCAL_DIRECTORY/ mediaflux.researchsoftware.unimelb.edu.au::projects/YOUR_MEDIAFLUX_PROJECT/FOLDER/
$ rsync -rv -e 'ssh -p 6600 -l unimelb:smithj' /Downloads/files/ mediaflux.researchsoftware.unimelb.edu.au::projects/proj-test-1234.5.6/dir01/folder01/
Example 2: Download Files From Your Mediaflux Project to Your Local Computer
$ rsync -rv -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' mediaflux.researchsoftware.unimelb.edu.au::projects/YOUR_MEDIAFLUX_PROJECT/FOLDER/ YOUR_LOCAL_DIRECTORY/
$ rsync -rv -e 'ssh -p 6600 -l unimelb:smithj' mediaflux.researchsoftware.unimelb.edu.au::projects/proj-test-1234.5.6/dir01/folder01/ /Downloads/files/
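
Because the upload and download commands differ only in the order of source and destination, they can be wrapped in small shell functions to reduce typing mistakes. This is a sketch under stated assumptions: the function names `mf_remote`, `mf_upload` and `mf_download` are hypothetical, and the host, port and path layout are taken from the examples above.

```shell
#!/bin/sh
MF_HOST="mediaflux.researchsoftware.unimelb.edu.au"
MF_PORT=6600

# Hypothetical helper: build the remote path for a project folder.
# $1 = project name, $2 = folder inside the project (trailing slash optional)
mf_remote() {
  printf '%s::projects/%s/%s/\n' "$MF_HOST" "$1" "${2%/}"
}

# Hypothetical wrappers around the upload and download commands shown above.
# $1 = UoM username, $2 = local directory, $3 = project, $4 = folder
mf_upload() {
  rsync -rv -e "ssh -p $MF_PORT -l unimelb:$1" "${2%/}/" "$(mf_remote "$3" "$4")"
}

mf_download() {
  rsync -rv -e "ssh -p $MF_PORT -l unimelb:$1" "$(mf_remote "$3" "$4")" "${2%/}/"
}
```

With these defined, `mf_upload smithj /Downloads/files proj-test-1234.5.6 dir01/folder01` would run the same command as the upload example, and `mf_download` with the same arguments would run the download example.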
Example 3: Using "Exclude" & "Include" Directories and Files
Exclude and/or include specific files or directories during synchronisation:
For include (needs to be paired with the exclude option):
$ rsync -rv --include "FILTER_CONDITION" --exclude "*" -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' YOUR_LOCAL_DIRECTORY/ mediaflux.researchsoftware.unimelb.edu.au::projects/YOUR_MEDIAFLUX_PROJECT/FOLDER/
$ rsync -rv --include "*.zip" --exclude "*" -e 'ssh -p 6600 -l unimelb:smithj' /Downloads/files/ mediaflux.researchsoftware.unimelb.edu.au::projects/proj-test-1234.5.6/dir01/folder01/
For exclude:
$ rsync -rv --exclude "FILTER_CONDITION" -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' YOUR_LOCAL_DIRECTORY/ mediaflux.researchsoftware.unimelb.edu.au::projects/YOUR_MEDIAFLUX_PROJECT/FOLDER/
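
rsync applies the first filter rule that matches, so include rules only take effect when they appear before the catch-all exclude. The sketch below builds the rules in that order for several file extensions and prints the resulting command for review instead of running it. The helper names `include_only` and `print_filtered_upload` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print one filter option per line for the given
# extensions, ending with the catch-all exclude that the includes must precede.
include_only() {
  for ext in "$@"; do
    printf '%s\n' "--include=*.$ext"
  done
  printf '%s\n' "--exclude=*"
}

# Hypothetical helper: print (not run) an upload command with those filters.
# $1 = UoM username, $2 = local directory, $3 = remote path, rest = extensions
print_filtered_upload() {
  user="$1"; src="$2"; dest="$3"
  shift 3
  printf 'rsync -rv'
  include_only "$@" | while IFS= read -r rule; do
    printf ' "%s"' "$rule"
  done
  printf " -e 'ssh -p 6600 -l unimelb:%s' %s %s\n" "$user" "$src" "$dest"
}

print_filtered_upload smithj /Downloads/files/ \
  mediaflux.researchsoftware.unimelb.edu.au::projects/proj-test-1234.5.6/dir01/folder01/ \
  zip csv
```

In standard rsync, `--exclude "*"` also excludes subdirectories, so matching files below the top level are only reached if `--include "*/"` comes before it. Whether the Mediaflux endpoint behaves identically is worth confirming with a dry run (`-n`).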
Example 4: Delete Directories/Files in the Destination Directory
Delete directories and files in the destination directory if they no longer exist in the source directory:
$ rsync -rv --delete -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' SOURCE_DIRECTORY/ DESTINATION_DIRECTORY/
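
Since `--delete` removes data at the destination, it can be useful to count how many items a run would delete before running it for real. The following is a minimal sketch that assumes the output follows the standard rsync format, where each deletion is reported as a line starting with "deleting" (or "*deleting" when `-i` is used). The helper name `count_deletions` is hypothetical, and the output format of the Mediaflux endpoint has not been verified here.

```shell
#!/bin/sh
# Hypothetical helper: count "deleting ..." lines in rsync output read from stdin.
# Assumes the standard format: "deleting PATH", or "*deleting PATH" with -i.
count_deletions() {
  # grep -c exits non-zero when the count is 0, so mask that exit status.
  grep -c -E '^\*?deleting ' || true
}

# Demonstration on sample text shaped like a dry-run listing.
printf 'deleting old/a.txt\ndeleting old/\nnew.txt\n' | count_deletions
```

In practice the input would come from a dry run, for example `rsync -rv -n --delete -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' SOURCE_DIRECTORY/ DESTINATION_DIRECTORY/ | count_deletions`.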
Example 5: Preview Changes Without Making Them (Recommended Before Every Transfer)
The -n flag previews the changes that would be made without actually making them. We recommend previewing your changes with this flag before running the actual upload or download, to help prevent data loss.
$ rsync -rv -n -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' SOURCE_DIRECTORY/ DESTINATION_DIRECTORY/
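
The preview-then-transfer habit can be scripted so that the real transfer only runs after you have seen the dry run and confirmed. This is a sketch, not a definitive implementation: `preview_then_sync` and the `RSYNC_BIN` variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Overridable so the flow can be tried without a real transfer.
RSYNC_BIN="${RSYNC_BIN:-rsync}"

# Hypothetical helper: run a dry run first, then ask before the real transfer.
# All arguments are passed to rsync unchanged; -n is added for the preview only.
preview_then_sync() {
  "$RSYNC_BIN" -n "$@" || return 1
  printf 'Proceed with the real transfer? [y/N] '
  read -r answer
  case "$answer" in
    y|Y) "$RSYNC_BIN" "$@" ;;
    *)   echo "Cancelled; nothing was transferred." ;;
  esac
}
```

For example, `preview_then_sync -rv -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' SOURCE_DIRECTORY/ DESTINATION_DIRECTORY/` would show the dry run and then wait for a `y` before transferring anything.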
Errors from unexpected usage
If you do not specify your project name after projects when running rsync, you will get an error.
For example, this command:
$ rsync -rv -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' YOUR_LOCAL_DIRECTORY/ mediaflux.researchsoftware.unimelb.edu.au::projects
gives this error:
Failed to add /projects/. to initial file list: java.io.IOException: java.lang.RuntimeException: java.lang.RuntimeException: arc.mf.server.Services$ExServiceError: call to service 'asset.get' failed: No permission to access metadata for asset (id) 199
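
A script can catch this mistake before contacting the server by checking that the remote path names something after `::projects/`. The sketch below is an illustration only. `has_project` is a hypothetical helper, and it checks just the shape of the path, not whether the project exists or whether you have access to it.

```shell
#!/bin/sh
# Hypothetical guard: succeed only if a remote path names a project
# after "::projects/".
has_project() {
  case "$1" in
    *::projects/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the remote path that triggers the error shown above.
remote="mediaflux.researchsoftware.unimelb.edu.au::projects"
if has_project "$remote"; then
  echo "Remote path looks complete: $remote"
else
  echo "Remote path is missing a project name: $remote"
fi
```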
Conclusion
Congratulations! You've learned the basics of using rsync to synchronise your data with your UoM Mediaflux project. If you have any questions, please feel free to contact us via:
UoM Staff: Login - Employee Center
UoM Students: Login - Student Portal