Mediaflux Replication

Mediaflux is a database-backed asset management system. At the University of Melbourne, it is primarily used as a file store for research data. When a file is stored in Mediaflux, it is stored as an asset, which consists of a database entry containing the file’s metadata, and an associated content object containing the file’s data.

Files held in Mediaflux are versioned. This means that if a file is updated, both the older version and the new version will be stored and can be retrieved if necessary.

In order to avoid data loss, Mediaflux is backed by two forms of replication: database replication and asset replication.

Database replication

Mediaflux is configured to replicate its database in real time to a second server. That is, every transaction is committed to both the local database and a remote copy of it. This configuration provides redundancy in the event of a local storage system failure.

Replication is the equivalent of a continuous database backup. However, it also replicates object deletions, so a deletion on the local database is applied to the remote copy as well. For this reason, in addition to database replication, we back up the database to tape several times a day.

Asset content replication

When an asset is created or modified on the primary Mediaflux storage cluster, this asset/version pair is added to an asset processing queue. After a delay (currently 10 seconds), this asset version will be copied to the DR server.
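The queueing behaviour described above can be sketched as a small model. This is an illustration only, not Mediaflux code: the class name ReplicationQueue, its methods, and the use of a set to stand in for the DR server are all hypothetical, and the clock is passed in explicitly so the example runs without waiting. Only the 10 second delay and the asset/version pairs come from the text.

```python
from collections import deque
from dataclasses import dataclass, field

REPLICATION_DELAY_SECONDS = 10  # delay quoted in the text above


@dataclass
class ReplicationQueue:
    """Toy FIFO queue of (asset_id, version) pairs; hypothetical, not Mediaflux code."""

    delay: float = REPLICATION_DELAY_SECONDS
    _pending: deque = field(default_factory=deque)
    replica: set = field(default_factory=set)  # stands in for the DR server

    def enqueue(self, asset_id: int, version: int, now: float) -> None:
        # Record the time at which this entry becomes eligible for copying.
        self._pending.append((now + self.delay, asset_id, version))

    def process(self, now: float) -> list:
        """Copy every entry whose delay has elapsed, oldest first."""
        copied = []
        while self._pending and self._pending[0][0] <= now:
            _, asset_id, version = self._pending.popleft()
            self.replica.add((asset_id, version))
            copied.append((asset_id, version))
        return copied


queue = ReplicationQueue()
queue.enqueue(asset_id=101, version=1, now=0.0)
queue.enqueue(asset_id=101, version=2, now=4.0)
early = queue.process(now=5.0)    # neither entry has waited 10 seconds yet
first = queue.process(now=10.0)   # version 1 is now due
second = queue.process(now=14.0)  # version 2 is now due
```

Because each version is queued as its own entry, both versions of asset 101 end up in the replica, which matches the per-asset, per-version behaviour described in this section.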

Note that asset deletions are not currently replicated. If a file is deleted and then a new file with the same name is created, the DR server will keep both files (renaming them to avoid name collisions).

This differs from a traditional backup, which provides point-in-time snapshots across an entire dataset. Instead, we work at the asset level; you can think of this as a backup of each asset independently. This is mostly relevant when recovering entire directories; see Recovering files.

Currently, content replicas are kept indefinitely, though this may change in the future.

Recovering files

If a file is overwritten by a new version, the older version can be restored from the primary server alone, because Mediaflux keeps all versions of a file.
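To illustrate why an overwrite does not need the DR server, here is a minimal sketch of an asset that keeps every version of its content. The class VersionedAsset and its methods are hypothetical names chosen for this example, and numbering versions from 1 is an assumption; the sketch does not reflect how Mediaflux stores versions internally.

```python
class VersionedAsset:
    """Toy model of an asset that keeps every version of its content."""

    def __init__(self, name: str):
        self.name = name
        self._versions = []  # index 0 holds version 1 (numbering is an assumption)

    def store(self, content: bytes) -> int:
        """Append new content and return its version number."""
        self._versions.append(content)
        return len(self._versions)

    def retrieve(self, version: int) -> bytes:
        """Return the content stored under a given version number."""
        if not 1 <= version <= len(self._versions):
            raise KeyError(f"{self.name} has no version {version}")
        return self._versions[version - 1]


asset = VersionedAsset("results.csv")
v1 = asset.store(b"first draft")
v2 = asset.store(b"overwritten")  # the earlier content is kept, not replaced
older = asset.retrieve(v1)
```

Storing new content appends to the history rather than replacing it, so the earlier content remains retrievable on the same system.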

If a file or directory is deleted, it will be recovered from the DR server by an administrator.

When an entire directory is recovered from the DR server, the result may differ from the directory as it was at the time of deletion. Please note that:

  • Files that have been deleted on the primary cluster will still exist in the replica copy

  • Files that have been deleted and re-created will have multiple copies in the replica copy
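The two caveats above follow from deletions not being replicated. The sketch below replays a short, made-up sequence of create and delete events against a toy primary and replica. The function replay, the event format, the sequential asset IDs, and the file names are all assumptions made for illustration; the actual renaming scheme used on the DR server is not modelled.

```python
from collections import Counter
from itertools import count


def replay(events):
    """Replay (action, name) events; return the primary and replica state."""
    next_id = count(1)  # hypothetical sequential asset IDs
    primary = {}        # name -> asset ID currently live on the primary
    replica = []        # (asset ID, name) pairs held by the replica
    for action, name in events:
        if action == "create":
            asset_id = next(next_id)
            primary[name] = asset_id
            replica.append((asset_id, name))  # creations are copied across
        elif action == "delete":
            primary.pop(name, None)           # deletions are not replicated
    return primary, replica


events = [
    ("create", "a.csv"),
    ("create", "b.csv"),
    ("delete", "b.csv"),  # deleted on the primary, still in the replica
    ("create", "c.csv"),
    ("delete", "c.csv"),
    ("create", "c.csv"),  # re-created, so the replica holds two copies
]
primary, replica = replay(events)
copies = Counter(name for _, name in replica)
extra = sorted(name for name in copies if name not in primary)
duplicated = sorted(name for name, n in copies.items() if n > 1)
```

In this example, extra lists the files a recovered directory would contain that were no longer on the primary, and duplicated lists the names that appear more than once in the replica.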

Glossary

Production cluster

The Production Mediaflux cluster is the main system that users interact with. It consists of:

  • one controller node (containing the database)

  • one database replica node

  • two cluster nodes (which assist with moving data to/from users' machines)

Disaster Recovery (DR) cluster

The Disaster Recovery cluster receives asset content from the Production (primary) Mediaflux cluster. It is accessible only to administrative staff. Users can request that asset content be recovered from the DR cluster.

Asset processing queue

An asset processing queue is a first-in, first-out (FIFO) data structure that can be used to perform operations on a list of asset ID/version pairs. For asset replication, any asset version that is added to the queue is copied to the DR server.