...
Here are all the details for the command-line arguments to this client.
Examples
You will need to know the path where your data should be located in Mediaflux (the --dest argument of the command) and the local path to upload from (the last positional argument).
Example 1 - parallel upload with checksum check
Upload data with four worker threads and turn on checksums for upload integrity checking (recommended). As the location of the configuration file is not specified, the client will look for it in the .Arcitecta directory of your home directory.
Code Block |
---|
unimelb-mf-upload --csum-check --nb-workers 4 --dest /projects/proj-myproject-1128.1.59/12Jan2018 /data/projects/punim0058
|
Example 2 - using a configuration file
Upload data with one worker thread and specify explicitly where the configuration file is.
Code Block |
---|
unimelb-mf-upload --mf.config /Users/nebk/.Arcitecta/mflux.cfg --dest /projects/proj-myproject-1128.1.59/12Jan2018 /data/projects/punim0058 |
The Configuration File might look like this:
Code Block |
---|
host=mediaflux.researchsoftware.unimelb.edu.au
port=443
transport=https
token=phooP1Angohb2ooyahbiLiuwa6ahjuoKooViedaifooPhiqu1ookahXae7keichael4Shae2ael8ietit2phawucai0Aighifu6olah9OquahDei2aevae3keich8ain1OoLa4O |
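Because the configuration file holds a token, it may be worth making it readable only by your own account. The following is a minimal sketch rather than a required step: it writes a placeholder configuration into a temporary directory (standing in for ~/.Arcitecta) and restricts its permissions. The `REPLACE_WITH_YOUR_TOKEN` value and the temporary location are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: create a configuration file that only the owner can read or write.
# cfg_dir is a temporary stand-in for ~/.Arcitecta (an assumption for this demo).
cfg_dir=$(mktemp -d)
cfg="$cfg_dir/mflux.cfg"

# Make sure new files are not group- or world-readable.
umask 077

cat > "$cfg" <<'EOF'
host=mediaflux.researchsoftware.unimelb.edu.au
port=443
transport=https
token=REPLACE_WITH_YOUR_TOKEN
EOF

# Explicitly set owner-only read/write permissions.
chmod 600 "$cfg"
ls -l "$cfg"
```

For a real configuration file, the same `chmod 600` command can be applied to the path you pass to --mf.config.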
Pre-existing files
The client checks whether files already exist in Mediaflux or not. If they do exist it will skip the upload. The checks it uses are:
- File path/name exists and is the same
- File size is the same
- If checksums are enabled, the checksum is the same
If any of these checks fail, the file is treated as not pre-existing and will be uploaded. If the path/name is the same but the content of the source file has changed, the file will be uploaded to the pre-existing asset in Mediaflux as a new version.
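The three checks above can be sketched as a small shell function. This is a hypothetical illustration of the decision, not the client's actual code: `needs_upload`, `remote_size` and `remote_csum` are invented names, the remote values stand in for metadata the server would report, and `cksum` is used as a stand-in checksum.

```shell
#!/bin/sh
# Hypothetical sketch of the skip-or-upload decision (not the client's real code).
# Usage: needs_upload LOCAL_FILE REMOTE_SIZE REMOTE_CSUM
#   REMOTE_SIZE empty  -> no file exists at that path on the server
#   REMOTE_CSUM empty  -> checksum checking is disabled
# Returns 0 if the file should be uploaded, 1 if it can be skipped.
needs_upload() {
    local_file=$1
    remote_size=$2
    remote_csum=$3

    # Check 1: the path/name must already exist on the server.
    [ -n "$remote_size" ] || return 0

    # Check 2: the sizes must match.
    local_size=$(wc -c < "$local_file" | tr -d ' ')
    [ "$local_size" = "$remote_size" ] || return 0

    # Check 3: if checksums are enabled, they must match too.
    if [ -n "$remote_csum" ]; then
        local_csum=$(cksum < "$local_file" | cut -d' ' -f1)
        [ "$local_csum" = "$remote_csum" ] || return 0
    fi

    return 1  # every check passed, so the upload can be skipped
}

# Demonstration with a temporary file.
demo=$(mktemp)
printf 'hello' > "$demo"
size=$(wc -c < "$demo" | tr -d ' ')
csum=$(cksum < "$demo" | cut -d' ' -f1)

if needs_upload "$demo" "$size" "$csum"; then echo "upload"; else echo "skip"; fi

# Same size, different content: only the checksum check notices the change.
printf 'hellO' > "$demo"
if needs_upload "$demo" "$size" "$csum"; then echo "upload"; else echo "skip"; fi
```

The second call shows why enabling checksums matters: a file whose size is unchanged but whose content differs would pass the first two checks.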
Checksums
Checksums (a number computed from the contents of a file that is, for practical purposes, unique to those contents) are an important data integrity mechanism. The Mediaflux server computes a checksum for each file it receives. The upload client can compute checksums from the source data on the client side and compare them with the checksum computed by the server when it receives the file. If the checksums match, we can be very confident that the file uploaded correctly. Many clients for other protocols (e.g. SFTP and SMB) do not do this.
...
If the checksums differ, the local file has changed, so the client re-uploads it (following the process in Case 1 above) and creates a new asset version. Thus, overall, two checksums are computed by the client and one by the server.
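The principle of comparing a checksum of the source with a checksum of the received copy can be illustrated locally. This is only a sketch: the POSIX `cksum` utility stands in for whatever checksum the client and server actually compute, and the file names are made up for the demonstration.

```shell
#!/bin/sh
# Illustration only: cksum stands in for the checksum used by the client and server.
workdir=$(mktemp -d)
printf 'instrument data\n' > "$workdir/source.dat"

# Simulate a faithful transfer by copying the file.
cp "$workdir/source.dat" "$workdir/received.dat"

source_sum=$(cksum < "$workdir/source.dat" | cut -d' ' -f1)
received_sum=$(cksum < "$workdir/received.dat" | cut -d' ' -f1)

if [ "$source_sum" = "$received_sum" ]; then
    echo "checksums match"
else
    echo "checksums differ"
fi

# Simulate corruption in transit: one character changes, the size does not.
printf 'instrument dat4\n' > "$workdir/received.dat"
corrupt_sum=$(cksum < "$workdir/received.dat" | cut -d' ' -f1)

[ "$source_sum" = "$corrupt_sum" ] || echo "corruption detected"
```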
Scheduled uploads
If you have a location that should be uploaded on a regular schedule, such as a directory on an instrument PC where data is saved, you can schedule uploads with unimelb-mf-upload. It is best to request an upload token for this purpose, as the credential will be stored on the computer that is doing the uploads. Contact Research Computing Services to request a token.
Windows
In this example:
- we will put the unimelb-mf-client files in the %HOMEPATH%\Documents directory
- we will save logs to the %HOMEPATH%\Documents\logs directory
- we will put the configuration file in the %HOMEPATH%\Documents directory
...
Create a Configuration File. In this case we are going to use a secure token. In our example, it will be stored in %HOMEPATH%\Documents\mflux.cfg.
Code Block | ||
---|---|---|
| ||
host=mediaflux.researchsoftware.unimelb.edu.au
port=443
transport=https
token=phooP1Angohb2ooyahbiLiuwa6ahjuoKooViedaifooPhiqu1ookahXae7keichael4Shae2ael8ietit2phawucai0Aighifu6olah9OquahDei2aevae3keich8ain1OoLa4O |
Create a batch file to perform the upload using Notepad. In our example, it will be stored in %HOMEPATH%\Documents\upload.bat:
Code Block | ||
---|---|---|
| ||
%HOMEPATH%\Documents\unimelb-mf-clients-0.7.7\bin\windows\unimelb-mf-upload --mf.config %HOMEPATH%\Documents\mflux.cfg --log-dir %HOMEPATH%\Documents\logs --dest /projects/proj-demonstration-1128.4.15 %HOMEPATH%\Documents\data-to-upload |
...
- Give your task a name and description, then click Next >
- Choose a start date and time and click Next >
- Choose Start a program and click Next >
- Click the Browse button and find the script you created above.
- Click Next > and then check the Open the Properties dialog for this task when I click Finish box, then click Finish.
- Under Security options, choose which user you would like the task to run under. You may wish to set the scheduled job to run even if the user is not logged in.
Linux
In this example:
- we will put the unimelb-mf-client files in the ~/bin directory
- we will save logs to the ~/logs directory
- we will put the configuration file in the ~/.Arcitecta directory
...
Create a Configuration File. In this case we are going to use a secure token. In our example, it will be stored in ~/.Arcitecta/mflux.cfg.
Code Block | ||
---|---|---|
| ||
host=mediaflux.researchsoftware.unimelb.edu.au
port=443
transport=https
token=phooP1Angohb2ooyahbiLiuwa6ahjuoKooViedaifooPhiqu1ookahXae7keichael4Shae2ael8ietit2phawucai0Aighifu6olah9OquahDei2aevae3keich8ain1OoLa4O |
Create a shell script to perform the upload using the text editor of your choice. In our example, it will be stored in ~/bin/upload.sh:
Code Block | ||
---|---|---|
| ||
#!/bin/bash
~/bin/unimelb-mf-clients-0.7.4/bin/unix/unimelb-mf-upload --mf.config ~/.Arcitecta/mflux.cfg --log-dir ~/logs --dest /projects/proj-demonstration-1128.4.15 ~/data-to-upload |
On Linux there are typically two options for scheduling tasks: cron and systemd timers. In this example, we will use a cron job.
Edit the crontab file with the following command:
Code Block | ||
---|---|---|
| ||
crontab -e |
Create a new scheduled task at the end of the crontab file. To see documentation on the format, try the man 5 crontab command. In our example, we will run the command once per day at 1 am local time.
Code Block | ||
---|---|---|
| ||
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h dom mon dow command
0 1 * * * $HOME/bin/upload.sh |
Save the file, and your job will be scheduled.
Code Block | ||
---|---|---|
| ||
crontab: installing new crontab |
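Cron can only run the script directly if the file exists and is marked executable, which the steps above do not state explicitly. The following is a minimal sketch for checking this and for listing any installed crontab entry; `script_ready` is a hypothetical helper name, and the script path is the one used in this example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the file exists and is executable.
script_ready() {
    [ -f "$1" ] && [ -x "$1" ]
}

script="$HOME/bin/upload.sh"   # path used in this example

if script_ready "$script"; then
    echo "$script is ready for cron"
else
    echo "$script is missing or not executable; try: chmod +x $script"
fi

# Show installed crontab entries that mention the script, if crontab is available.
if command -v crontab >/dev/null 2>&1; then
    crontab -l 2>/dev/null | grep 'upload.sh' || echo "no crontab entry found for upload.sh"
fi
```

The same check applies to the macOS example, which also uses cron.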
macOS
In this example:
- we will put the unimelb-mf-clients in the ~/Applications folder
- we will save logs to the ~/Documents/logs folder
- we will put the configuration file in the ~/.Arcitecta folder
Download the client from the GitLab page, selecting the Mac 64bit release. Extract the tar.gz file by clicking on it. It will be extracted to a folder in your Downloads folder, so move it to the ~/Applications folder.
Create a Configuration File. In this case we are going to use a secure token. In our example, it will be stored in ~/.Arcitecta/mflux.cfg.
Code Block | ||
---|---|---|
| ||
host=mediaflux.researchsoftware.unimelb.edu.au
port=443
transport=https
token=phooP1Angohb2ooyahbiLiuwa6ahjuoKooViedaifooPhiqu1ookahXae7keichael4Shae2ael8ietit2phawucai0Aighifu6olah9OquahDei2aevae3keich8ain1OoLa4O |
Create a shell script to perform the upload using the text editor of your choice. In our example, it will be stored in ~/bin/upload.sh:
Code Block | ||
---|---|---|
| ||
#!/bin/bash
~/Applications/unimelb-mf-clients-0.7.4/bin/unix/unimelb-mf-upload --mf.config ~/.Arcitecta/mflux.cfg --log-dir ~/Documents/logs --dest /projects/proj-demonstration-1128.4.15 ~/data-to-upload |
Edit the crontab file with the following command. By default the vim text editor will be used.
Code Block | ||
---|---|---|
| ||
crontab -e # this will use the default text editor, usually vim
# if you would prefer to use the pico text editor, use the following command instead:
EDITOR=/usr/bin/pico crontab -e |
Create a new scheduled task at the end of the crontab file. To see documentation on the format, try the man 5 crontab command. In our example, we will run the command once per day at 1 am local time.
Code Block | ||
---|---|---|
| ||
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h dom mon dow command
0 1 * * * $HOME/bin/upload.sh |
Save the file, and your job will be scheduled.
Code Block | ||
---|---|---|
| ||
crontab: installing new crontab |
...