Basic Concepts
1. Introduction
This section describes the basic concepts you need in order to use DaRIS as an end user. The data repository is built on top of the commercial Mediaflux digital asset management platform. At present, DaRIS is used primarily to hold biomedical imaging data and meta-data. These data and meta-data are stored in what are termed 'assets', which are managed by Mediaflux and physically reside on a data store accessible to the Mediaflux server.
The system allows us to:
- Have a standard process to receive the data directly from the source (e.g. an MR scanner)
- Organize data by projects, each of which has a unique team that can access it
  - Access permissions are managed by roles assigned to users
- Store data and meta-data
- Download data to compute servers with on-the-fly format conversions for processing
For those who are technically minded, here is an article about the DaRIS framework and data model: http://www.frontiersin.org/neuroinformatics/10.3389/neuro.11.019.2009/full
2. Data Model
It is useful to understand the data model that DaRIS uses, that is, the way the data are organised. Our data model (called PSSD) is a simple hierarchy in which the Project is the highest-level entity. A Project consists of a targeted scientific experiment operated by a finite collection of people.
Each Project has a number of Subjects. Subjects may be animal (e.g. human or mouse) or non-animal (e.g. a contrast-agent sample). Each Subject has a number (usually 1 but sometimes more) of ExMethods (Ex for Executable). An ExMethod describes an experimental method (e.g. acquire a mouse, put it to sleep, make an MR scan, wake it up, kill it, remove its brain etc). Except for some specific projects, we are using very simple ExMethods at the moment (acquire some data). Each ExMethod has a number of Studies. A Study is an acquisition of data (e.g. MR or other). A Study contains a number of DataSets. For MR, a DataSet is equivalent to a Series (image volume). Thus the hierarchy is Project.Subject.ExMethod.Study.DataSet:
- Project --has(m)→ Subject --has(1)→ ExMethod --has(m)→ Study --contains(m)→ DataSet
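The hierarchy above can be sketched as a set of nested containers. This is only an illustrative model, not the DaRIS or Mediaflux API: the class layout, field names and the `count_datasets` helper are assumptions introduced here to show how each level holds the next.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical in-memory model of the PSSD hierarchy (not the DaRIS API).
@dataclass
class DataSet:
    name: str  # for MR, equivalent to a Series (image volume)


@dataclass
class Study:
    name: str
    datasets: List[DataSet] = field(default_factory=list)


@dataclass
class ExMethod:
    name: str
    studies: List[Study] = field(default_factory=list)


@dataclass
class Subject:
    name: str
    ex_methods: List[ExMethod] = field(default_factory=list)


@dataclass
class Project:
    name: str
    subjects: List[Subject] = field(default_factory=list)


def count_datasets(project: Project) -> int:
    """Walk Project -> Subject -> ExMethod -> Study and count the DataSets."""
    return sum(
        len(study.datasets)
        for subject in project.subjects
        for ex_method in subject.ex_methods
        for study in ex_method.studies
    )


# One Subject with a single simple ExMethod, one Study and two DataSets.
project = Project(
    "Mouse brain MR",
    subjects=[
        Subject(
            "mouse-01",
            ex_methods=[
                ExMethod(
                    "acquire some data",
                    studies=[
                        Study(
                            "MR session 1",
                            datasets=[DataSet("T1 series"), DataSet("T2 series")],
                        )
                    ],
                )
            ],
        )
    ],
)
print(count_datasets(project))
```

Each level only knows about its direct children, which mirrors the way every object in the hierarchy is reached through its parent.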
3. Citable Identifiers (CIDs)
We assign citable identifiers to objects in the data model (they are like internet addresses). This is of fundamental importance to using DaRIS, and you can read about them here.
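This page does not spell out the CID format, so the following is a sketch under a stated assumption: that a CID is a dotted sequence of integers and that a child object's CID is its parent's CID with one more component appended. The example values and the helper names (`child_cid`, `parent_cid`, `is_descendant`) are hypothetical and are not part of DaRIS.

```python
# Assumption: a CID is a dotted sequence of integers, and a child object's
# CID extends its parent's CID by one component. All names are hypothetical.

def child_cid(parent: str, index: int) -> str:
    """Build the CID of the index-th child of the given parent."""
    return f"{parent}.{index}"


def parent_cid(cid: str) -> str:
    """Drop the last component to get the parent's CID ('' if there is none)."""
    head, _, _ = cid.rpartition(".")
    return head


def is_descendant(cid: str, ancestor: str) -> bool:
    """True if cid lies somewhere below ancestor in the hierarchy."""
    # The trailing dot stops "1.5.10" from matching the ancestor "1.5.1".
    return cid.startswith(ancestor + ".")


project_cid = "1.5.1"                    # made-up Project CID
subject_cid = child_cid(project_cid, 3)  # third Subject in that Project
print(subject_cid, parent_cid(subject_cid))
```

Under this assumption a CID behaves much like a path: it identifies an object and also says where that object sits in the Project.Subject.ExMethod.Study.DataSet hierarchy.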
4. Methods, Metadata and ExMethods
Please read here about the basics of Methods and meta-data. This is of central importance to how DaRIS works, so please take the time to read it.
5. Authorisation (Access Control)
An important concept in this system is that of authorisation. Authorisation refers to the process of controlling access to information and services.
DaRIS uses what is called 'role-based access'. Every user in the system is granted roles which control what the user can do:
- All users get basic access to the framework
- Specific entitled users are allowed to create Projects and administer them
- Specific entitled users are given 'power user' status
- When a user creates a Project, they assign the users who are allowed to access it. Each of these users is given one of the following roles (the roles are hierarchical):
  - Project Administrator : full access and control over the project
  - Subject Administrator : can see private meta-data (usually the people recruiting subjects are given this role)
  - Member : Access to all public meta-data and all data
  - Guest : Access to all public meta-data and no data (not implemented)
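The hierarchical project roles above can be sketched as an ordered enumeration. This is an illustration only, not how DaRIS or Mediaflux implements roles: the numeric ordering, the function names, and the assumption that each role includes everything granted to the roles below it are introduced here for the example.

```python
from enum import IntEnum


# Hypothetical ordering of the project roles; a higher value is assumed to
# include every permission of the lower ones ("these roles are hierarchical").
class ProjectRole(IntEnum):
    GUEST = 1          # public meta-data only (listed as not implemented)
    MEMBER = 2         # public meta-data and all data
    SUBJECT_ADMIN = 3  # additionally sees private meta-data
    PROJECT_ADMIN = 4  # full access and control over the project


def can_read_public_metadata(role: ProjectRole) -> bool:
    return role >= ProjectRole.GUEST


def can_read_data(role: ProjectRole) -> bool:
    return role >= ProjectRole.MEMBER


def can_read_private_metadata(role: ProjectRole) -> bool:
    return role >= ProjectRole.SUBJECT_ADMIN


def can_administer_project(role: ProjectRole) -> bool:
    return role >= ProjectRole.PROJECT_ADMIN


for role in ProjectRole:
    print(role.name, can_read_data(role), can_read_private_metadata(role))
```

Comparing roles by rank keeps each check to a single threshold, which is one simple way to express a hierarchy in which higher roles inherit the access of lower ones.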
6. Integration with Compute Platform
DaRIS can download data either to your desktop or to a sink attached to the Mediaflux server. This sink is attached to a file system visible to the server. In this way, for example, a file system attached to a high-performance computing (HPC) cluster can also be attached to DaRIS. This allows you to download data directly to the HPC platform's file system.
7. End-to-End Process
To provide some more context, here is an example of a simple end-to-end workflow to acquire some MRI data in our system:
- To initiate your Project, use the portal to create it:
  - Select the Method(s) you will use with this Project
  - Assign the team that has access to the Project
  - Receive the citable ID back from the system
- For each subject, before any data are acquired, use the portal to create the Subject and receive back its CID
- Acquire data for a specific subject
- Upload the data to the desired Subject in one of two ways:
  - Use a DICOM client and overload the appropriate DICOM element (usually the Patient name; see your DaRIS administrator) with the Subject CID
  - Drag-and-drop in the portal
- Access the newly arrived data with the portal and download (with format conversion) to the compute platform
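The DICOM upload step in the workflow above relies on placing the Subject CID in a DICOM element so that the data can be matched to the right Subject. The sketch below illustrates the idea with a plain dictionary standing in for a DICOM header. It makes several assumptions: the element is the Patient name, a CID is a dotted sequence of integers, and the function name, dictionary keys and example values are hypothetical. A real DICOM client or toolkit would be used in practice.

```python
import re
from typing import Dict

# Assumption: a CID is a dotted sequence of integers (e.g. "1.5.1.3").
CID_PATTERN = re.compile(r"^\d+(\.\d+)*$")


def overload_patient_name(header: Dict[str, str], subject_cid: str) -> Dict[str, str]:
    """Return a copy of the header with the Patient name replaced by the CID.

    The header is a plain dict standing in for a DICOM header; the key
    "PatientName" is a hypothetical stand-in for the Patient name element.
    """
    if not CID_PATTERN.match(subject_cid):
        raise ValueError(f"not a dotted-integer CID: {subject_cid!r}")
    updated = dict(header)  # leave the caller's header untouched
    updated["PatientName"] = subject_cid
    return updated


original = {"PatientName": "ANON", "Modality": "MR"}
tagged = overload_patient_name(original, "1.5.1.3")
print(tagged["PatientName"])
```

Checking the CID before overwriting the element guards against a mistyped identifier, which could otherwise leave the upload without a matching Subject.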