Rsync for Mediaflux

Introduction

Welcome to the step-by-step tutorial on using the rsync command-line tool with your Mediaflux project at the University of Melbourne (UoM). This guide covers the basic usage of rsync and the options you can use to synchronise and transfer files efficiently to and from your UoM Mediaflux projects.

Why use rsync?

rsync is a command-line tool for transferring and synchronising files and directories. It is especially well suited to file sets where only a small portion of the files have changed, or where existing files have been partly modified. It transfers only the files that have changed, and only the changed parts of those files, instead of resending entire files, which makes it more bandwidth efficient.

Besides this efficiency, rsync has another advantage over the unimelb-mf-clients: most Linux and macOS systems have rsync pre-installed, whereas the unimelb-mf-clients must be installed separately. On Windows you can get rsync through WSL (Windows Subsystem for Linux) or Cygwin.

The main limitations are that rsync is single threaded and is slow to upload entire large files. It is also slow to upload large files that have changed, as it needs to compare the existing blocks with those to be transferred. Hence, it can sometimes be faster to delete existing large files that have changed and transfer the whole file again, especially if you have a fast network connection.

Please note, this is a special access method offered by the UoM Mediaflux service and not a standard Mediaflux function.

This access method is currently EXPERIMENTAL and offered on a USE AT OWN RISK basis. Please verify data integrity after each transfer and before deleting other copies of your data.
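One way to act on the integrity advice above is to download a fresh copy of the uploaded data into a separate local directory and compare checksums against the original. The sketch below is only an illustration, not an official UoM tool: the function names manifest and same_tree and the example paths are hypothetical, and it assumes that either sha256sum or shasum is available on your computer.

```shell
# Pick a SHA-256 tool: sha256sum on most Linux systems, shasum on macOS.
if command -v sha256sum >/dev/null 2>&1; then
    hash_cmd="sha256sum"
else
    hash_cmd="shasum -a 256"
fi

# Print a sorted "checksum  ./relative/path" manifest for every file under $1.
manifest() {
    ( cd "$1" && find . -type f -exec $hash_cmd {} + | sort -k 2 )
}

# Succeed only if both directory trees hold the same files with the same content.
same_tree() {
    [ "$(manifest "$1")" = "$(manifest "$2")" ]
}

# Usage (hypothetical paths):
#   same_tree ~/data/original ~/data/downloaded_copy && echo "copies match"
```

The comparison covers file names and contents only; it does not check permissions or timestamps.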

Prerequisites

  • A UoM Mediaflux project

  • Access to UoM VPN

  • Access to a terminal or command prompt

  • rsync installed on your system (ensure the rsync protocol version is 30 or later)

  • Connect to university VPN

  • Connect to RCS VPN
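The protocol version requirement above can be checked from the terminal, since rsync --version reports it on its first line. The following is a small sketch; the function names rsync_protocol_version and protocol_ok are hypothetical helpers introduced here for illustration.

```shell
# Extract the protocol number from `rsync --version` style text on stdin.
rsync_protocol_version() {
    sed -n 's/.*protocol version \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed if the given protocol number meets the minimum of 30.
protocol_ok() {
    [ "$1" -ge 30 ]
}

if command -v rsync >/dev/null 2>&1; then
    proto=$(rsync --version | rsync_protocol_version)
    if protocol_ok "${proto:-0}"; then
        echo "rsync protocol $proto: OK"
    else
        echo "rsync protocol ${proto:-unknown}: version 30 or later is needed"
    fi
else
    echo "rsync not found on PATH"
fi
```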

Options supported

List of supported args/options:

-a

-t

-p

-g

-n/--dry-run

--delete

--exclude

--include

-v

-l

-r

-i/--itemize-changes

--filter

-I/--ignore-times

-o

-d/--dirs

--progress

--stderr=e|a|c

-u/--update



Example 1: Upload Files From Your Local Computer Into Your Mediaflux Project

$ rsync -rv -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' YOUR_LOCAL_DIRECTORY/ mediaflux.researchsoftware.unimelb.edu.au::projects/YOUR_MEDIAFLUX_PROJECT/FOLDER/

$ rsync -rv -e 'ssh -p 6600 -l unimelb:smithj' /Downloads/files/ mediaflux.researchsoftware.unimelb.edu.au::projects/proj-test-1234.5.6/dir01/folder01/
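If you run these transfers often, it may help to keep the fixed parts of the command in one place. The sketch below builds the same command as the example above from a few variables and prints it for review rather than running it. The helper names mf_remote and mf_ssh and the variables MF_HOST and MF_PORT are hypothetical; the host, port, username and paths are the example values from this guide.

```shell
MF_HOST="mediaflux.researchsoftware.unimelb.edu.au"
MF_PORT=6600

# Build the remote location for a path inside a project,
# e.g. proj-test-1234.5.6/dir01/folder01 (a single trailing slash is added).
mf_remote() {
    printf '%s::projects/%s/\n' "$MF_HOST" "${1%/}"
}

# Build the ssh transport string that is passed to rsync's -e option.
mf_ssh() {
    printf 'ssh -p %s -l unimelb:%s\n' "$MF_PORT" "$1"
}

# Print (do not run) the upload command so it can be checked first.
echo rsync -rv -e "'$(mf_ssh smithj)'" /Downloads/files/ "$(mf_remote proj-test-1234.5.6/dir01/folder01)"
```

To run the transfer instead of printing it, the same helpers can be used directly: rsync -rv -e "$(mf_ssh smithj)" /Downloads/files/ "$(mf_remote proj-test-1234.5.6/dir01/folder01)".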

Example 2: Download Files From Your Mediaflux Project to Your Local Computer

$ rsync -rv -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' mediaflux.researchsoftware.unimelb.edu.au::projects/YOUR_MEDIAFLUX_PROJECT/FOLDER/ YOUR_LOCAL_DIRECTORY/

$ rsync -rv -e 'ssh -p 6600 -l unimelb:smithj' mediaflux.researchsoftware.unimelb.edu.au::projects/proj-test-1234.5.6/dir01/folder01/ /Downloads/files/

Example 3: Using "Exclude" & "Include" Directories and Files

Exclude and/or include specific files or directories during synchronisation:

  • For include (needs to be paired with an exclude option): $ rsync -rv --include "FILTER_CONDITION" --exclude "*" -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' YOUR_LOCAL_DIRECTORY/ mediaflux.researchsoftware.unimelb.edu.au::projects/YOUR_MEDIAFLUX_PROJECT/FOLDER/

$ rsync -rv --include "*.zip" --exclude "*" -e 'ssh -p 6600 -l unimelb:smithj' /Downloads/files/ mediaflux.researchsoftware.unimelb.edu.au::projects/proj-test-1234.5.6/dir01/folder01/

 

  • For exclude: $ rsync -rv --exclude "FILTER_CONDITION" -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' YOUR_LOCAL_DIRECTORY/ mediaflux.researchsoftware.unimelb.edu.au::projects/YOUR_MEDIAFLUX_PROJECT/FOLDER/
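Filter rules are easy to get wrong, so it can be worth trying them between two throwaway local directories first. In standard rsync, --exclude "*" also excludes subdirectories, so matching files inside them are skipped unless directories are included with "*/". The sketch below shows this with a dry run. The function name list_zip_only and the test files are hypothetical, and this guide does not confirm that downloads from Mediaflux treat the "*/" rule in the same way.

```shell
# Dry run (-n) listing what a zip-only transfer would copy from $1 to $2.
# "*/" keeps directories in play so nested .zip files are reached.
list_zip_only() {
    rsync -rv -n --include "*/" --include "*.zip" --exclude "*" "$1/" "$2/"
}

if command -v rsync >/dev/null 2>&1; then
    # Throwaway local directories, so nothing touches Mediaflux.
    work=$(mktemp -d)
    mkdir -p "$work/src/sub" "$work/dst"
    touch "$work/src/a.zip" "$work/src/notes.txt" "$work/src/sub/b.zip"
    list_zip_only "$work/src" "$work/dst"
    rm -rf "$work"
fi
```

Including "*/" means that directories containing no matching files may still be created at the destination.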

Example 4: Delete Directories/Files in the Destination Directory

Delete directories and files in the destination directory that no longer exist in the source directory:

$ rsync -rv --delete -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' SOURCE_DIRECTORY/ DESTINATION_DIRECTORY/
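Because --delete removes data, it is worth seeing exactly what would be removed before a real run. The sketch below combines --delete with -n between two throwaway local directories and keeps only the lines that report deletions. The function name preview_deletions is hypothetical, and the "deleting" prefix is what stock rsync 3.x prints in verbose mode; output from other rsync implementations or from the Mediaflux endpoint may differ.

```shell
# Dry run of --delete from $1 to $2; lines starting with "deleting"
# name what a real run would remove from the destination.
preview_deletions() {
    rsync -rv -n --delete "$1/" "$2/" | grep '^deleting ' || true
}

if command -v rsync >/dev/null 2>&1; then
    # Throwaway local directories, so nothing touches Mediaflux.
    work=$(mktemp -d)
    mkdir -p "$work/src" "$work/dst"
    touch "$work/src/keep.txt" "$work/dst/keep.txt" "$work/dst/stale.txt"
    preview_deletions "$work/src" "$work/dst"
    rm -rf "$work"
fi
```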

Example 5: Preview Changes Without Making Them (Recommended Before Every Transfer)

The -n flag previews the changes that would be made without actually making them. We recommend previewing your changes with this flag before every upload or download to help prevent data loss.

$ rsync -rv -n -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' SOURCE_DIRECTORY/ DESTINATION_DIRECTORY/
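To make the preview a habit, a wrapper script can default to a dry run and only transfer when asked to. The following is a minimal sketch of that idea. The function name rsync_flags and the APPLY environment variable are hypothetical, and the placeholders are the same ones used in the command above.

```shell
# Flags for this run: dry run unless APPLY=1 is set in the environment.
rsync_flags() {
    if [ "${APPLY:-0}" = "1" ]; then
        echo "-rv"
    else
        echo "-rv -n"
    fi
}

SRC="SOURCE_DIRECTORY/"
DEST="DESTINATION_DIRECTORY/"
SSH="ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME"

# Shown with echo so the command can be reviewed before use.
echo rsync $(rsync_flags) -e "'$SSH'" "$SRC" "$DEST"
```

To run the transfer for real, the last line would become rsync $(rsync_flags) -e "$SSH" "$SRC" "$DEST", with APPLY=1 set only once the preview looks right.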

 

Errors from unexpected usage

If you do not specify your project name after projects in the remote path, rsync will return an error.

For example, this command:

$ rsync -rv -e 'ssh -p 6600 -l unimelb:YOUR_UOM_USERNAME' YOUR_LOCAL_DIRECTORY/ mediaflux.researchsoftware.unimelb.edu.au::projects

will give this error:

Failed to add /projects/. to initial file list: java.io.IOException: java.lang.RuntimeException: java.lang.RuntimeException: arc.mf.server.Services$ExServiceError: call to service 'asset.get' failed: No permission to access metadata for asset (id) 199
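A script can catch this mistake before rsync is called by checking that the remote location names something after projects/. The sketch below uses a shell pattern match; the function name has_project_name is hypothetical, and it only checks the shape of the path, not whether the project exists or whether you have access to it.

```shell
# Succeed only if the remote location names a project after "projects/".
has_project_name() {
    case "$1" in
        *::projects/?*) return 0 ;;
        *) return 1 ;;
    esac
}

dest="mediaflux.researchsoftware.unimelb.edu.au::projects"
if has_project_name "$dest"; then
    echo "destination looks complete: $dest"
else
    echo "missing project name after projects/: $dest"
fi
```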

Conclusion

Congratulations! You've learned the basics of using rsync to synchronise your data with your UoM Mediaflux project. If you have any questions, please feel free to contact us via:

UoM Staff: https://go.unimelb.edu.au/tu78

UoM Students: https://go.unimelb.edu.au/6o78