An object store for scientific data that is:
- a storage platform designed from the ground up to be fast, scalable and fault tolerant.
- fully RESTful. The API is accessible from desktops, HPC systems, exotic hardware, the cloud and your smartphone.
- designed from scratch for complex scientific data, allowing the storage and querying of complex user-defined metadata.
- a full data management system that supports operations in the storage layer, such as quality control, format conversion, filtering or subsetting.
- integrated with S3, Microsoft Azure Storage, Google Cloud Storage and IBM's Tivoli TSM storage management system.
- integrated with HSM operations and caching.
- part of our reproducible science platform Skyport, which creates Research Objects when combined with CWL and CWLProv.
Shock is actively being developed at github.com/MG-RAST/Shock.
(see Shock: Active Storage for Multicloud Streaming Data Analysis, Big Data Computing (BDC), 2015 IEEE/ACM 2nd International Symposium on, 2015)
Check out the notes on building and installing Shock and configuration.
The following examples assume that Docker and docker-compose are already installed and that curl is available locally. To improve the readability of JSON output, we recommend using a JSON pretty printer such as jq or python -m json.tool.
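A quick way to confirm these prerequisites before starting is sketched below. The helper name missing_tools is hypothetical and not part of Shock; it relies only on the POSIX `command -v` builtin.

```shell
# Hypothetical helper (not part of Shock): print every command from the
# argument list that cannot be found on the PATH, one name per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Calling `missing_tools docker docker-compose curl jq` prints nothing when all four tools are installed, and otherwise lists the ones that still need to be installed.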
    docker-compose up

This automatically downloads and starts the Shock and MongoDB Docker images. Note that in this demo configuration Shock does not store data persistently.
Open another terminal:
    curl http://localhost:7445/ | jq .

This should return a JSON object describing Shock, including its version number.
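Right after docker-compose up, the server may need a few seconds before it answers (an assumption; startup time depends on the machine). A minimal sketch of a polling loop follows. The retry helper and its arguments are hypothetical and not part of Shock.

```shell
# Hypothetical helper: run a command up to MAX times, sleeping DELAY seconds
# between attempts. Returns 0 as soon as the command succeeds, 1 otherwise.
# Usage: retry MAX DELAY command [args...]
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep when another attempt is still to come.
        if [ "$attempt" -lt "$max" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

For example, `retry 30 1 curl -sf -o /dev/null http://localhost:7445/` polls the root endpoint once per second for up to 30 attempts. The curl options -s (silent), -f (fail on HTTP errors) and -o (write output to a file) are standard.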
Upload a file to create a node:

    curl -X POST -F 'upload=@test/testdata/10kb.fna' http://localhost:7445/node | jq .

returns:
    {
      "status": 200,
      "data": {
        "id": "8eb28ad3-2561-4847-8034-6d5473fecfad",
        "version": "88e3227e7ba47a8d595f243ef16f9c2b",
        "file": {
          "name": "10kb.fna",
          "size": 11914,
          "checksum": {
            "md5": "730c276ea1510e2b7ef6b682094dd889"
          },
          "format": "",
          "virtual": false,
          "virtual_parts": null,
          "created_on": "2019-09-23T20:22:52.506403Z",
          "locked": null
        },
        "attributes": null,
        "indexes": {
          "size": {
            "total_units": 1,
            "average_unit_size": 1048576,
            "created_on": "2019-09-23T20:22:52.5165769Z",
            "locked": null
          }
        },
        "version_parts": {
          "acl_ver": "b46701cc24139e5cca2100e09ec48c19",
          "attributes_ver": "2d7c3414972b950f3d6fa91b32e7920f",
          "file_ver": "fcd9613da51b9b181ff94434a19add87",
          "indexes_ver": "88455c093e82651aa042252dca2a37f8"
        },
        "tags": null,
        "linkage": null,
        "priority": 0,
        "created_on": "2019-09-23T20:22:52.5193448Z",
        "last_modified": "0001-01-01T00:00:00Z",
        "expiration": "0001-01-01T00:00:00Z",
        "type": "basic",
        "parts": null,
        "locations": null
      },
      "error": null
    }
The resulting JSON object contains an id field (line 4 in this example) with the value 8eb28ad3-2561-4847-8034-6d5473fecfad. This identifier is a UUID that can be used to download the file.
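Instead of copying the identifier by hand, it can be extracted from the response with jq. This is a sketch that assumes jq is installed and that the response has the shape shown above; the function name node_id_from_response is hypothetical.

```shell
# Hypothetical helper: read a Shock JSON response on stdin and print the
# value of .data.id. If the field is null or absent, "// empty" suppresses
# the output and -e makes jq exit with a non-zero status.
node_id_from_response() {
    jq -er '.data.id // empty'
}
```

With this in place, an upload and capture could look like `NODE_ID=$(curl -s -X POST -F 'upload=@test/testdata/10kb.fna' http://localhost:7445/node | node_id_from_response)`.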
Saving the node id in an environment variable lets you copy and paste the following examples directly:

    NODE_ID=<uuid>

Retrieve the node's metadata:

    curl http://localhost:7445/node/${NODE_ID} | jq .

Download the file:

    curl -OJ "http://localhost:7445/node/${NODE_ID}?download"

returns:

    curl: Saved to filename '10kb.fna'

The -OJ option makes curl save the file under the filename supplied by the server rather than one derived from the URL.
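The node metadata includes an MD5 checksum of the file (file.checksum.md5 in the response above), which makes it possible to check a download. The following is a minimal sketch, assuming md5sum from GNU coreutils is available, with `md5 -q` as a fallback on BSD and macOS. The function names md5_of and md5_matches are hypothetical.

```shell
# Hypothetical helper: print the MD5 digest of FILE as a hex string.
md5_of() {
    if command -v md5sum >/dev/null 2>&1; then
        # GNU md5sum prints "<digest>  <filename>"; keep the digest only.
        md5sum "$1" | cut -d ' ' -f 1
    else
        # BSD/macOS fallback: -q prints just the digest.
        md5 -q "$1"
    fi
}

# Hypothetical helper: succeed if the MD5 digest of FILE equals EXPECTED.
# Usage: md5_matches FILE EXPECTED
md5_matches() {
    [ "$(md5_of "$1")" = "$2" ]
}
```

For the example upload, `md5_matches 10kb.fna 730c276ea1510e2b7ef6b682094dd889` should succeed if the download is intact. The expected value can also be read from the server with `curl -s http://localhost:7445/node/${NODE_ID} | jq -r '.data.file.checksum.md5'`.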
Attach attributes (user-defined metadata) to the node:

    curl -X PUT -F 'attributes_str={"project":"extraterrestrial_lifeforms", "sample-nr": 1}' http://localhost:7445/node/${NODE_ID} | jq .

List all nodes in the project:

    curl 'http://localhost:7445/node?query&project=extraterrestrial_lifeforms' | jq .

- Getting Started Tutorial -- hands-on walkthrough with curl examples.
- API Documentation -- full REST API reference with examples.
- Building Shock -- build from source, Docker images, and testing.
- Configuration Guide -- all config options, Locations.yaml, and Types.yaml.
- Concepts -- architecture, caching, HSM, and locations.
- Caching and Data Migration -- cache TTL, auto-upload, S3/MinIO setup.
- Data Types -- Types.yaml format and type system.
- Use Cases -- real-world deployment examples.