Table of Contents |
---|
...
Introduction
Please see the Terminology section first.
...
Excerpt |
---|
The Mediaflux Data Mover tool is used to move data to and from Mediaflux. It has two primary capabilities:
These two basic capabilities are further combined at the University of Melbourne into two additional specialised capabilities:
The Data Mover also has the following attributes:
The Data Mover is formally supported for macOS, Windows 10 and Linux.
The Data Mover tool (client application) is free to use for all users. You do not need your own Mediaflux system to use it, in the same way that you don't need a Zoom system to use the Zoom video conferencing client. |
...
Anchor DataMoverInstallation DataMoverInstallation
...
Installing and Configuring the Data Mover
You can fetch the Data Mover manually and install it, or you can wait until you receive a Shareable (for upload or download). Clicking on the Shareable link will initiate the download and installation. Currently, on Linux it is only possible to install manually.
The Data Mover is formally supported for macOS, Windows 10 and Linux. The Data Mover is built using Java, so the system requirements for OpenJDK 17.0.2 give a list of supported platforms. In addition to those listed, we know that Windows 7 currently works.
...
- Download the Data Mover manually and Install
- Preparation
- When doing a new install, remove any .jar files that you may have downloaded (part of the auto-update process, see section 3) in your .Arcitecta/DataMover/updates folder (under your home directory).
- If you don't remove these, they will conflict with the newly installed version.
- Uninstall the old version of the app (or just install over the top of it).
- Do not rename the old application to something else as this will cause conflicts. If you want to keep an earlier version, then zip up the old application first.
- Download
- The URL is https://mediaflux.researchsoftware.unimelb.edu.au/mflux/data/mover/index.html
- From there you can download the version for your operating system (macOS, Windows 10 and Linux)
- If you'd like to download the installer for a specific platform with a tool like curl or wget, the URLs are:
- macOS : https://mediaflux.researchsoftware.unimelb.edu.au/mflux/data/mover/installers/mac/Mediaflux%20Data%20Mover.dmg
- Windows : https://mediaflux.researchsoftware.unimelb.edu.au/mflux/data/mover/installers/windows/Mediaflux%20Data%20Mover.msi
- Linux : https://mediaflux.researchsoftware.unimelb.edu.au/mflux/data/mover/installers/linux/mediaflux-data-mover.zip
- Install
- macOS
- Double click on the file Mediaflux Data Mover.dmg to install.
If you encounter an issue where macOS complains that the .dmg file is damaged, you can resolve it by removing the extended attributes. Start the Terminal (command line) application, change to the directory where you downloaded the .dmg (the Downloads directory in this example) and issue a command like this:
Code Block
cd ~/Downloads
xattr -cr Mediaflux\ Data\ Mover.dmg
- Using the GUI that is presented, drag the Data Mover to /Applications
- Windows
- Double click on the file Mediaflux Data Mover.msi to install.
- Linux
- Unpack the mediaflux-data-mover.zip file. Note that the resulting directory mediaflux-data-mover must not be stored in a directory that contains bin as one of its elements, due to a Linux Java bug (avoid /usr/local/bin or ~/bin, for example). Set the binary to be executable with a command like:
Code Block chmod +x mediaflux-data-mover/bin/mediaflux-data-mover
- Optionally, add <path>/mediaflux-data-mover/bin to your PATH variable
- Optionally, read the README.txt for instructions on setting up your web browser to automatically open arcio links with the Data Mover
- Check
- After you install, start up the Data Mover and make sure the version running (see bottom left of GUI) is the version you expect.
- Use a Shareable link to download and install
- Paste a Shareable (upload or download) into a browser
- Whether you already have the Data Mover is detected automatically. If not, the correct installer for your operating system (Windows 7/10, macOS, Linux) is downloaded
- Execute the installer (for whatever platform you are on, as in Section 1 above)
- An optional XML configuration file called settings.xml can be created in the .Arcitecta/DataMover folder (beneath the home directory). This file controls various aspects of Data Mover behaviour. Details are found on this page.
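The manual Linux install described above (remove stale update jars, download, unpack, mark the launcher executable) can be sketched as a small set of shell functions. This is a minimal sketch, not an official installer: the installer URL and the updates folder come from this page, while the function names and the choice of install directory are assumptions made for illustration.

```shell
# Sketch of the manual Linux install. Only INSTALLER_URL and the
# ~/.Arcitecta/DataMover/updates location come from this page; the function
# names and the install directory are illustrative assumptions.
INSTALLER_URL="https://mediaflux.researchsoftware.unimelb.edu.au/mflux/data/mover/installers/linux/mediaflux-data-mover.zip"

# Succeed if any element of the path is exactly "bin" (e.g. /usr/local/bin).
path_has_bin_element() {
    case "/$1/" in
        */bin/*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Delete stale auto-update jars so they cannot conflict with the new install.
clean_stale_updates() (
    updates_dir="${1:-$HOME/.Arcitecta/DataMover/updates}"
    [ -d "$updates_dir" ] || exit 0
    find "$updates_dir" -type f -name '*.jar' -exec rm -f {} +
)

# Download, unpack and mark the launcher executable under the given directory.
install_data_mover() (
    target_dir="$1"
    if path_has_bin_element "$target_dir"; then
        echo "Refusing to install under a path with a 'bin' element: $target_dir" >&2
        exit 1
    fi
    clean_stale_updates || exit 1
    mkdir -p "$target_dir" || exit 1
    curl -fL -o "$target_dir/mediaflux-data-mover.zip" "$INSTALLER_URL" || exit 1
    unzip -q -o "$target_dir/mediaflux-data-mover.zip" -d "$target_dir" || exit 1
    chmod +x "$target_dir/mediaflux-data-mover/bin/mediaflux-data-mover"
)
```

For example, `install_data_mover "$HOME/apps"` followed by `export PATH="$HOME/apps/mediaflux-data-mover/bin:$PATH"` would cover the optional PATH step as well. The `bin` check is done up front so that nothing is downloaded into a location the Data Mover cannot run from.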
...
Anchor | ||||
---|---|---|---|---|
|
...
Updating the Data Mover via the GUI
You will download the initial install of the Data Mover from our Mediaflux server (see section 2 above). Thereafter, the Data Mover GUI will offer you updates when they are available (from Arcitecta, the vendor). When you start it, there will be a prompt in the bottom right of the main screen where you can update and relaunch the Data Mover.
...
Anchor | ||||
---|---|---|---|---|
|
...
Starting the Data Mover GUI / Consuming Upload and Download Shareables
There are two ways to use a Shareable with the Data Mover.
- When you click on a Shareable URL (e.g. received by email), it will automatically start the Data Mover if it's installed, or take you through the process to install it if not.
- You can also start the Data Mover manually
- The Data Mover is just an application, so start it as you would start any other application on your platform (e.g. double click or select from menu).
- Click 'Add New'
- Paste in the Shareable (copy it from wherever you got it, usually an email)
The Data Mover will know what kind (upload or download) of Shareable it is, and you will then be presented with a GUI for uploading or downloading. In that GUI you can select the source (for Upload) or destination (for Download) for the data, and click Upload or Download to activate the data transfer task. After the task is completed, it moves to the Completed section where you can download the activity log.
...
Figure 2b (middle): After clicking Download, the download is progressing
...
Figure 3c (right): After the upload completes, the secondary GUI disappears and Completed has been selected on the main GUI.
Watch Folders
The Data Mover can watch a folder and pick up files as they are created. This enables you to, for example, upload files as they are created by an instrument. To enable this feature, check the Enable Watching checkbox when selecting the location to upload. If you wish to start with an empty folder, also check the Allow empty uploads checkbox. The Data Mover then follows this process:
- Every 5 seconds, check the folder for the presence of new files
- If a new file is detected, wait 60 seconds and check the file again. If the file hasn't changed in that time, queue it for upload
The progress bar will oscillate, indicating that the Data Mover is waiting for files to be created. Watching will continue until it is disabled or the Data Mover is closed. To disable watching and complete the upload, click the "eye" icon next to the indicator of files and bytes uploaded.
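The poll-and-settle process above can be sketched in shell. This is an illustration of the idea only, not the Data Mover's implementation: the page does not say how a change is detected, so the sketch assumes a content checksum plus size (as printed by `cksum`), and the function names are hypothetical. For simplicity it also waits for each file in turn rather than tracking several files at once.

```shell
# Fingerprint a file by content checksum and size (cksum prints both).
file_fingerprint() {
    cksum < "$1"
}

# Succeed if the file is unchanged after a settle period (default 60 seconds).
is_settled() (
    file="$1"
    settle_seconds="${2:-60}"
    before=$(file_fingerprint "$file") || exit 1
    sleep "$settle_seconds"
    after=$(file_fingerprint "$file") || exit 1
    [ "$before" = "$after" ]
)

# Poll a folder every 5 seconds and print each new file once it has settled.
# Runs until interrupted, mirroring "until it is disabled or closed".
watch_folder() (
    folder="$1"
    seen_list=$(mktemp)
    while :; do
        for file in "$folder"/*; do
            [ -f "$file" ] || continue
            grep -Fxq -- "$file" "$seen_list" && continue
            if is_settled "$file" 60; then
                echo "queue for upload: $file"
                echo "$file" >> "$seen_list"
            fi
        done
        sleep 5
    done
)
```

The settle period is what protects against uploading a file that an instrument is still writing: a file that changes during the wait is simply checked again on a later pass.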
Follow symlinks
By default, symlinks in the source upload will be uploaded as symlinks to the Mediaflux system. If you would like the file referenced by the symlink to be uploaded instead, effectively "dereferencing" the symlinks, you can select the Follow Symlinks checkbox when selecting the location to upload.
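To see which entries this choice would affect, you can list the symbolic links in the source folder before uploading. This is a minimal sketch using the standard `find` command; the function name `list_symlinks` is illustrative and is not part of the Data Mover.

```shell
# Print every symbolic link under the given directory, one path per line.
list_symlinks() {
    find "$1" -type l
}
```

If the function prints nothing, the Follow Symlinks checkbox makes no difference for that upload.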
Log file
After Data Mover tasks are completed, a log file of the task is available for download: click on the Completed tab and then, for the task of interest, click on the CSV icon (second from right, next to Bin). The log records the transaction details of all files that should have been uploaded or downloaded. The downloaded CSV file has a column called state, which may have one of the following values:
...
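Once downloaded, the log can be summarised from a shell by counting rows per value of the state column. This is a minimal sketch under stated assumptions: the file has a header row containing a column named state, and no field contains a quoted comma (a full CSV parser would be needed otherwise). The function name `count_states` is hypothetical.

```shell
# Count rows per value of the "state" column in a downloaded CSV log.
# Assumes a header row and no quoted commas inside fields.
count_states() {
    awk -F',' '
        NR == 1 {
            # Locate the state column, tolerating quotes and CRLF line ends.
            for (i = 1; i <= NF; i++) {
                name = $i
                gsub(/["\r]/, "", name)
                if (name == "state") col = i
            }
            if (!col) exit 1   # no state column: print nothing
            next
        }
        {
            value = $col
            gsub(/["\r]/, "", value)
            counts[value]++
        }
        END { for (s in counts) print s, counts[s] }
    ' "$1" | sort
}
```

Running `count_states` on a log gives a quick check of whether every file reached the state you expect before you dig into individual rows.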
Anchor | ||||
---|---|---|---|---|
|
...
Behaviour when data pre-exist
When the data pre-exist (upload or download) it is important for you to know how the Data Mover handles this.
...
Download
You are presented with multiple choices (there are also tool tips on these selections that you can review) to direct what the Data Mover does if data pre-exist on download.
- Rename - if the destination directory already exists, creates a new copy of the directory as directoryname.1 (or directoryname.2 etc.).
- Update - inspects any files already in the destination directory and, if changed (detected using file path, size and checksum), overwrites them with the version on the server. If you have made any changes on your client machine they will be overwritten.
- Overwrite - if a file to be downloaded already exists in the destination directory (detected using file path only), always overwrites with the version on the server.
- Skip - if a file to be downloaded already exists in the destination directory (detected using file path only), it will be skipped.
- Fail - if a file to be downloaded already exists in the destination directory (detected using file path only), that file will be noted as failed.
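The naming scheme used by Rename can be illustrated with a short shell function that finds the first unused numbered name. This is a sketch of the scheme as described above, not the Data Mover's own code, and the function name `next_free_name` is hypothetical.

```shell
# Print the first unused name among base.1, base.2, ... for a given path.
next_free_name() (
    base="$1"
    n=1
    while [ -e "$base.$n" ]; do
        n=$((n + 1))
    done
    echo "$base.$n"
)
```

With directories `results` and `results.1` already present, `next_free_name results` prints `results.2`, which matches the directoryname.2 case in the list above.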
...
Upload
Upload is a little more complex. This is because the Data Mover can recover from failures (such as a network failure), which may leave a partially uploaded file fragment in the asset in Mediaflux.
- If the last version of the target asset (determined by path) is not a partial fragment (determined by path, size and checksum if needed) of the source file being uploaded, you get a new asset version, transmitting the source file from the beginning. This is the case where the source file has changed (but has the same path).
- If the last version of the target asset is a partial file fragment (determined by path, size and checksum if needed) of the source file being uploaded, you get a new asset version that copies from the previous version (the part previously uploaded) and then transmits the rest of the source file. This is the case where the source file has not changed, but a previous upload failed and is being restarted.
- If the last version of the target asset (determined by path) is the same (determined by size and checksum) as the source, the upload is skipped. This is the case of uploading the same file twice.
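The three cases above amount to a small decision procedure, which can be simulated locally. In this sketch a second local file stands in for the last asset version in Mediaflux, a byte-for-byte comparison stands in for the size and checksum test, and the function names and the labels `skip`, `resume` and `restart` are hypothetical; the real comparison happens between the Data Mover and the server.

```shell
# Byte size of a file, normalised to a plain integer.
file_size() {
    echo $(( $(wc -c < "$1") ))
}

# Print the action for uploading $1 when $2 is the last uploaded version:
#   skip    - identical content (same file uploaded twice)
#   resume  - $2 is a leading fragment of $1 (an earlier upload failed)
#   restart - anything else (the source file has changed)
upload_action() (
    source_file="$1"
    target_file="$2"
    source_size=$(file_size "$source_file")
    target_size=$(file_size "$target_file")
    if [ "$source_size" -eq "$target_size" ] && cmp -s "$source_file" "$target_file"; then
        echo skip
    elif [ "$target_size" -lt "$source_size" ] &&
         head -c "$target_size" "$source_file" | cmp -s - "$target_file"; then
        echo resume
    else
        echo restart
    fi
)
```

Checking sizes first keeps the common cases cheap: content is only compared when the sizes make a match or a partial fragment possible.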
...
Utilising the Command-Line Interface of the Data Mover
Not all environments offer a graphical (windowing) environment. This is most common in Unix high-performance computing environments, although it is slowly becoming a thing of the past. For this reason, the Data Mover also has a Command-Line Interface (CLI), as described in this section.
...
Anchor | ||||
---|---|---|---|---|
|
...
Creating and Working With Upload and Download Shareables
Shareables can be created by Mediaflux Users with the Mediaflux Explorer (V 1.5.1 and later) and are consumed by the Data Mover.
Download shareables allow the consumer to recursively download data from a namespace (folder) when they otherwise have no access. Upload shareables allow the consumer to upload data to a specified Mediaflux namespace (folder) when they otherwise have no access. The consumer has no visibility on the data as it arrives in Mediaflux - it is an opaque or anonymous upload process.
...
Download
Include Page | ||||
---|---|---|---|---|
|
...
Upload
Include Page | ||||
---|---|---|---|---|
|