Unified authorization service for neuroscience datasets.
DatasetGateway is a single Django service that centralizes dataset access control across multiple platforms:
- CAVE — drop-in replacement for middle_auth with compatible API endpoints
- Neuroglancer — implements the ngauth protocol for GCS token-based access
- Clio and neuPrint — provides authorization APIs these services call to check user permissions
- WebKnossos — planned; will require building compatible APIs based on their open source code, similar to the CAVE integration approach
Requirements:
- pixi
- Docker (for production deployment only)
- A Google OAuth 2.0 client (for login — the setup wizard walks you through it)
```
cd dsg
pixi install
pixi run setup   # interactive wizard — generates .env, runs migrations
pixi run serve   # starts the Django dev server
```
If .env doesn't exist yet, the setup wizard runs automatically.
```
pixi run deploy   # builds the Docker image, starts the container, runs migrations and seed commands
```
Put a reverse proxy (nginx/caddy) in front for TLS.
The Django admin is at `/admin/`.
Login requires a Google OAuth 2.0 client. Without one the server runs, but
all login/authorize links will fail with a `client_id` error. The setup
wizard (`pixi run setup`) will walk you through creating one if
secrets/client_credentials.json is missing.
Alternatively, you can set it up manually:
- Go to the Google Cloud Console and create an OAuth 2.0 Client ID (type: Web application).
- Add `http://localhost:8200/accounts/google/login/callback/` as an authorized redirect URI (and your production URI if known).
- Download the JSON credentials and save them:
```
mkdir -p dsg/secrets
cp ~/Downloads/client_secret_*.json dsg/secrets/client_credentials.json
```
The secrets/ directory is gitignored. Alternatively, you can set environment
variables instead of using the JSON file:
```
export GOOGLE_CLIENT_ID="your-client-id.apps.googleusercontent.com"
export GOOGLE_CLIENT_SECRET="your-client-secret"
```
All users authenticate via Google OpenID Connect. On successful login,
the server creates a DB-stored API key and sets it as the dsg_token
cookie. This single cookie is shared by all services in the ecosystem.
API requests are authenticated by checking for the token in this order:
1. `dsg_token` cookie
2. `Authorization: Bearer {token}` header
3. `?dsg_token=` query parameter
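That lookup order can be sketched as a small helper (illustrative only; `resolve_token` and the Django-style request attributes are assumptions, not the actual server code):

```python
def resolve_token(request):
    """Return the API token using the lookup order above:
    cookie first, then Authorization header, then query parameter.

    `request` is assumed to expose Django-style `COOKIES`, `headers`,
    and `GET` mappings; this is a sketch, not DatasetGateway's code.
    """
    # 1. dsg_token cookie
    token = request.COOKIES.get("dsg_token")
    if token:
        return token
    # 2. Authorization: Bearer {token} header
    auth = request.headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        return auth[len("Bearer "):]
    # 3. ?dsg_token= query parameter
    return request.GET.get("dsg_token")
```

Note that the cookie wins even when a header or query parameter is also present.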
CAVE services (MaterializationEngine, AnnotationEngine, etc.) call
DatasetGateway's /api/v1/user/cache endpoint on every request to validate
the user's token and retrieve their permissions. This is a drop-in
replacement for CAVE's original middle_auth server — CAVE services
only need their AUTH_URL environment variable pointed at DatasetGateway.
Users log in via /api/v1/authorize, which redirects through Google
OAuth and sets the dsg_token cookie.
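From a CAVE service's point of view, the validation call might look roughly like this (the endpoint path comes from above, but the `AUTH_URL` value, the function name, and the response shape are assumptions):

```python
import json
import urllib.request

# Hypothetical: the value a CAVE service would have in its AUTH_URL env var.
AUTH_URL = "https://dataset-gateway.example.org"

def validate_user(token, opener=urllib.request.urlopen):
    """Call DatasetGateway's /api/v1/user/cache endpoint with the user's
    token and return the decoded JSON user/permissions record.

    `opener` is injectable so the call can be stubbed in tests; the
    response fields depend on the server and are not shown here.
    """
    req = urllib.request.Request(
        f"{AUTH_URL}/api/v1/user/cache",
        headers={"Authorization": f"Bearer {token}"},
    )
    with opener(req) as resp:
        return json.loads(resp.read())
```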
Neuroglancer uses the ngauth protocol.
Users log in via a popup that hits /auth/login → Google OAuth →
dsg_token cookie. Because Neuroglancer runs on a different origin
(e.g., neuroglancer.org), it cannot read the cookie directly. Instead
it calls POST /token, which reads the cookie server-side and returns a
short-lived token. Neuroglancer then exchanges that token for a
time-limited GCS access credential via POST /gcs_token, which grants
read access to the specific cloud storage bucket holding the dataset.
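The two-step exchange can be sketched in Python (the `/token` and `/gcs_token` endpoint names come from the protocol description above; the `post` helper, request bodies, and response shapes are assumptions, not the real wire format):

```python
def get_gcs_credential(server, bucket, post):
    """Sketch of the ngauth exchange from the client's side.

    `post(url, data) -> dict` is a hypothetical HTTP helper; for the
    /token call it must also send the dsg_token cookie, which the
    server reads server-side (the client never sees the cookie value).
    """
    # Step 1: exchange the session cookie for a short-lived ngauth token.
    token = post(f"{server}/token", {})["token"]
    # Step 2: exchange that token for a time-limited GCS credential
    # scoped to the bucket holding the dataset.
    return post(f"{server}/gcs_token", {"token": token, "bucket": bucket})
```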
Other services (neuPrint, celltyping-light, Clio) validate users by
calling /api/v1/user/cache with the dsg_token value, the same way
CAVE services do. When all services share a cookie domain (configured
via AUTH_COOKIE_DOMAIN), users log in once and are authenticated
everywhere.
```
cd dsg
pixi run -e dev python -m pytest
```
DatasetGateway is designed for a single-server Docker deployment behind a reverse proxy that handles TLS.
```
cd dsg
pixi run setup    # generates .env interactively (set DJANGO_DEBUG=False for production)
pixi run deploy   # builds Docker image, starts container, runs migrations + seeds
```
Then create an admin user:
```
docker compose -f docker-compose.yml exec dsg python manage.py make_admin user@example.com
```
Put a reverse proxy (nginx or Caddy) in front for TLS, pointed at
localhost:8080. The setup wizard defaults SECURE_SSL_REDIRECT=False
since most deployments terminate TLS at the proxy.
The SQLite database and static files are stored in Docker volumes
(dsg-data and dsg-static) so they survive container
restarts. If you need PostgreSQL or Redis, swap the DATABASES / CACHES
settings and add services to docker-compose.yml.
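As a rough sketch, the PostgreSQL swap in the Django settings might look like this (the environment variable names and the `db` host are illustrative; you would also need to add a postgres service to docker-compose.yml):

```python
import os

# Illustrative only: replace the SQLite DATABASES entry with PostgreSQL,
# configured through hypothetical POSTGRES_* environment variables.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.environ.get("POSTGRES_DB", "dsg"),
        "USER": os.environ.get("POSTGRES_USER", "dsg"),
        "PASSWORD": os.environ.get("POSTGRES_PASSWORD", ""),
        "HOST": os.environ.get("POSTGRES_HOST", "db"),  # compose service name
        "PORT": os.environ.get("POSTGRES_PORT", "5432"),
    }
}
```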
| Variable | Default | Description |
|---|---|---|
| `DJANGO_SECRET_KEY` | insecure dev key | Secret key for sessions and CSRF. Set in production. |
| `DJANGO_DEBUG` | `True` | Set to `False` in production. |
| `DJANGO_ALLOWED_HOSTS` | `*` | Comma-separated list of allowed hostnames. |
| `DATABASE_PATH` | `db.sqlite3` | Path to SQLite database file. |
| `SECURE_SSL_REDIRECT` | `True` (prod) | Set to `False` if reverse proxy handles TLS. |
| `DSG_ORIGIN` | (empty) | Public origin for CSRF trusted origins (e.g., `https://dataset-gateway.mydomain.org`). |
| `DSG_PORT` | `8200` | Port for the development server. |
| `GOOGLE_CLIENT_ID` | (empty) | Google OAuth 2.0 client ID (overrides client_credentials.json). |
| `GOOGLE_CLIENT_SECRET` | (empty) | Google OAuth 2.0 client secret (overrides client_credentials.json). |
| `NGAUTH_ALLOWED_ORIGINS` | `^https?://.*\.neuroglancer\.org$` | Regex for allowed CORS origins. |
| `AUTH_COOKIE_DOMAIN` | (empty) | Cookie domain for cross-subdomain auth (e.g., `.example.org`). |
| `PORT` | `8080` | Port for gunicorn (Docker). |
| `GUNICORN_WORKERS` | `2` | Number of gunicorn worker processes. |
| `LOG_LEVEL` | `info` | Gunicorn log level. |
- User manual — setup, admin workflows, user workflows, management commands
- Architecture — system design, authorization model, deployment strategy
- CAVE auth endpoints — CAVE API compatibility reference and SCIM 2.0 provisioning
- Implementation record — what was built, with retrospective notes on deviations from the original plan
- Admin manual — administration and operational reference