Primary Data Workflow

This page describes the primary workflow that utilises the Data Mover tool to upload data from an instrument to Mediaflux and then dispatches the data to the User. There are two distinct components to the workflow:

  1. Upload data to a Mediaflux Instrument Project from an instrument acquisition computer and notify platform instrument staff.
    1. This is achieved via the Data Mover GUI that is operated by either platform staff or trusted power users. The Data Mover is pre-configured with the Mediaflux Instrument Project destination.
    2. The person using the GUI selects the data to upload and enters an end-user email address (for dispatch of the data to that User).
    3. The person using the GUI activates the upload, which sends the selected data to the Mediaflux Instrument Project.
    4. Platform staff are notified by email of the successful upload.
  2. Dispatch that data (and only that data) to an end user by allowing them to use one of the following methods:
    1. Download it (from the Instrument Project) to their local storage via a Download Shareable (received at the email address provided in step 1b) and the Data Mover. This method is best for big data (> tens of GB).
    2. Download it (from the Instrument Project) to their local storage via a direct shareable link (received at the email address provided in step 1b). This link does not require the Data Mover and downloads the data directly into a zip file. This method is optimal for small data (< tens of GB).
    3. Copy it to their own Mediaflux Project (from the Instrument Project) via a web-based GUI (URL received at the email address provided in step 1b). This step requires the User to log in to Mediaflux.
    4. The end user and platform staff can be notified when these transactions complete.

With this process, Platform Instrument staff

  • can be confident that the data have securely and robustly reached their Mediaflux Instrument Project and the end user
  • do not need to play any time-consuming role in managing the dispatch of data to the end user


Figure 1 - A schematic of the key data flow components. The user at the Instrument utilises the GUI of the pre-configured Data Mover to upload data to the Mediaflux Instrument Project. With the Download Shareable, received by email, the recipient User can download the data (and only the data just uploaded) from the Instrument Project to their local workstation. Only a Mediaflux User will receive the WEB URL in the email and can also copy the data to their own Mediaflux Project. The notification email (containing the Download Shareable and, optionally, the WEB URL) that the User receives from Mediaflux is not shown in this diagram.

Transactions

When data are uploaded to the Mediaflux Instrument Project, an additional namespace (folder/directory) is added at the top of the upload. It is named with the date and time of the oldest file uploaded (which is more useful than the date and time at which the upload was done). This approach, in which an extra layer is inserted, is referred to as 'Transactional'. Every upload is a transaction: if you upload the same instrument data twice, you will get two transactions and two distinct uploads of the same data.

This approach makes it clear to platform staff precisely what has been uploaded (if uploads could be updated in place, there would be no such clarity about what is stored in the Instrument Project). When the upload to the Instrument Project is complete, a Manifest Asset is also created in the top-level parent transactional namespace in the Mediaflux Instrument Project. That asset contains a range of information about the upload and can be viewed and queried.
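
For illustration, the naming idea can be sketched in a few lines of Python: the transaction folder takes its name from the oldest file in the selected upload. The timestamp format and the way the oldest file is found are assumptions for illustration only; the complete form of the transaction namespace is given in the reference below.

    from datetime import datetime
    from pathlib import Path

    def transaction_folder_name(upload_root: str) -> str:
        """Illustrative only: name a transaction folder after the oldest
        (earliest-modified) file in the selected upload. The timestamp
        format below is an assumption, not the Data Mover's actual format."""
        files = [p for p in Path(upload_root).rglob("*") if p.is_file()]
        oldest = min(f.stat().st_mtime for f in files)
        return datetime.fromtimestamp(oldest).strftime("%Y-%m-%d-%H-%M-%S")

    # e.g. uploading /data/uom/neil/mydata would then land under
    #   <Instrument Project parent>/<transaction folder>/mydata
    print(transaction_folder_name("/data/uom/neil/mydata"))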

For reference, the complete form of the transaction parent namespace into which data are uploaded is described here.


Transiently Held Data

It is quite common for instrument platforms to hold the data in Mediaflux only for a short period of time (the lifetime of the download links) and then destroy it. In that time, the researcher can download or copy the data, as it is ultimately their responsibility to manage it, not the platform's. For maximum data security, the platform may still wish to replicate the data to our Disaster Recovery (DR) Mediaflux system in the Noble Park data centre. We have developed a process so that

  • It's easy for platform operators to destroy expired data (i.e. data older than the lifetime of the download links) via a web-based dashboard
  • The replicas on the DR system will also be destroyed some time (currently 30 days) after the primary copies are destroyed by the operator
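
As a rough sketch of that timeline, assuming an example link lifetime of 14 days (platforms choose their own) and that the operator destroys the expired data as soon as the links lapse:

    from datetime import date, timedelta

    upload_date = date(2024, 7, 1)              # when the transaction was uploaded (example)
    link_lifetime = timedelta(days=14)          # assumed lifetime of the download links
    dr_lag = timedelta(days=30)                 # current lag before DR replicas are destroyed

    links_expire = upload_date + link_lifetime  # data are 'expired' after this date
    primary_destroyed = links_expire            # assume the operator destroys them straight away
    dr_destroyed = primary_destroyed + dr_lag   # DR replicas follow ~30 days later

    print("Download links expire:   ", links_expire)
    print("DR replicas destroyed by:", dr_destroyed)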

If you are interested in this capability, please contact the Data Solutions team.


Mapping Source Paths to Mediaflux Paths

By default, the unique transactional folder (see above) created for instrument uploads is located under a specific parent folder in the Instrument Project. That parent can be anywhere, but it is fixed by the shareable. Sometimes, however, users prefer a structure that has some relationship to the source file system structure.

Let's illustrate with an example. Say your data are being uploaded to the parent folder /projects/proj-neil-1128.4.1000/DM-uploads. By default, the upload process will create a transactional folder and place the data in it. For example, if you uploaded /data/uom/neil/mydata from the instrument, you would get /projects/proj-neil-1128.4.1000/DM-uploads/<transactional folder>/mydata

You can see that only the child part ("mydata") of the uploaded folder is preserved in Mediaflux. It is not uncommon, though, for instrument operators to want to preserve some of that source structure (often it reflects organisations, departments and users). This is possible with the Data Mover via a special mapping capability that moves the data elsewhere after it is first uploaded.

With this approach, it's possible to maintain any part of the source folder structure. For example, we can arrange for the upload to go to any of the following (a sketch of the mapping idea follows the list):

  • /projects/proj-neil-1128.4.1000/DM-uploads/neil/<transactional folder>/mydata
  • /projects/proj-neil-1128.4.1000/DM-uploads/uom/neil/<transactional folder>/mydata
  • /projects/proj-neil-1128.4.1000/DM-uploads/data/uom/neil/<transactional folder>/mydata
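
Below is a minimal Python sketch of these layouts. It is not the Data Mover's actual mapping configuration; the mapped_destination function and its keep parameter are hypothetical, purely to show how many trailing components of the source path could be preserved.

    from pathlib import PurePosixPath

    def mapped_destination(source: str, parent: str, txn: str, keep: int) -> str:
        """Illustrative only: preserve the last `keep` parent components of
        the source path between the upload parent folder and the
        transactional folder."""
        parts = PurePosixPath(source).parts[1:]          # drop the leading "/"
        parents, leaf = parts[:-1], parts[-1]            # ("data","uom","neil"), "mydata"
        kept = parents[-keep:] if keep else ()
        return str(PurePosixPath(parent, *kept, txn, leaf))

    parent = "/projects/proj-neil-1128.4.1000/DM-uploads"
    for keep in (0, 1, 2, 3):                            # 0 reproduces the default behaviour
        print(mapped_destination("/data/uom/neil/mydata", parent,
                                 "<transactional folder>", keep))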

If you are interested in this capability, please contact the Data Solutions team.



Storing Meta-data in the Download

The standard process delivers to the user just the data that were uploaded. It is now also possible to configure your Instrument upload shareable so that, when the download links are made, the download includes, along with the data, a file called _metadata.xml. This file contains meta-data such as the keywords that were set during the upload (e.g. via the Data Mover GUI).
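
For example, a downstream script could read those keywords back out of the download. The element names in the sketch below are hypothetical, since the actual schema of _metadata.xml is not documented on this page.

    import xml.etree.ElementTree as ET

    # Hypothetical structure: the real _metadata.xml schema may differ.
    sample = """<metadata>
      <keywords>
        <keyword>calibration</keyword>
        <keyword>run-42</keyword>
      </keywords>
    </metadata>"""

    root = ET.fromstring(sample)
    keywords = [kw.text for kw in root.iter("keyword")]
    print("Keywords recorded at upload time:", keywords)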

If you are interested in this capability, please contact the Data Solutions team.