---
title: "Research & Resources"
subtitle: Publications, presentations, datasets, and source code
bibliography: isamples-outcomes.bib
csl: cite-full-inline.csl
nocite: |
  @*
number-sections: false
---
## Presentations {.unnumbered}
### 2020 SPNHC Conference Talk {.unnumbered}
{{< video https://youtu.be/eRUw5NMksFo?t=105 >}}
### iSamples Metadata Model Talk {.unnumbered}
<embed src="assets/2022-11_iSamplesMetadata.pdf" type="application/pdf" width="100%" height="600px">
[Download slides (PDF)](assets/2022-11_iSamplesMetadata.pdf)
## Publications {.unnumbered}
::: {#refs}
:::
## Zenodo Community {.unnumbered}
The [iSamples Zenodo Community](https://zenodo.org/communities/isamples) archives datasets for reproducible research, including the GeoParquet files that power this site's tutorials and Interactive Explorer.
- [iSamples Combined Dataset](https://zenodo.org/communities/isamples) — 6.7M samples from SESAR, OpenContext, GEOME, and Smithsonian
- All data files are also served from [`data.isamples.org`](https://data.isamples.org) with HTTP range request support
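
Range support matters because a Parquet reader can fetch only a file's footer and the column chunks it needs instead of downloading the whole file. The sketch below is illustrative only and makes no network calls: it builds the `Range` headers a client could send and locates the footer from the standard 8-byte Parquet trailer (a 4-byte little-endian footer length followed by the magic bytes `PAR1`). The helper names and the file size in the usage lines are hypothetical and are not part of any iSamples tooling.

```python
from __future__ import annotations

import struct

PARQUET_MAGIC = b"PAR1"
TRAILER_SIZE = 8  # 4-byte footer length + 4-byte magic


def suffix_range_header(n_bytes: int = TRAILER_SIZE) -> dict[str, str]:
    """Header asking the server for the last n_bytes of a resource."""
    return {"Range": f"bytes=-{n_bytes}"}


def byte_range_header(start: int, end: int) -> dict[str, str]:
    """Header asking for the inclusive byte range [start, end]."""
    return {"Range": f"bytes={start}-{end}"}


def footer_byte_range(file_size: int, trailer: bytes) -> tuple[int, int]:
    """Return the inclusive (start, end) offsets of the Parquet footer
    metadata, given the file size and the file's final 8 bytes."""
    if len(trailer) != TRAILER_SIZE or trailer[4:] != PARQUET_MAGIC:
        raise ValueError("not a Parquet trailer")
    (footer_len,) = struct.unpack("<I", trailer[:4])
    end = file_size - TRAILER_SIZE - 1
    start = end - footer_len + 1
    # A Parquet file also starts with the 4-byte magic, so the footer
    # cannot begin before offset 4.
    if start < len(PARQUET_MAGIC):
        raise ValueError("footer length exceeds file size")
    return start, end


# Usage with made-up numbers: a 1000-byte file whose footer is 100 bytes.
trailer = struct.pack("<I", 100) + PARQUET_MAGIC
start, end = footer_byte_range(1000, trailer)
print(byte_range_header(start, end))  # {'Range': 'bytes=892-991'}
```

Clients such as DuckDB issue requests of this kind internally when reading remote Parquet, so they rarely need to be written by hand.
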
## GitHub Repositories {.unnumbered}
All iSamples source code is available at the [isamplesorg GitHub org](https://github.com/isamplesorg/). The repositories form a tight pipeline from **schema** through **serialization** to **consumers**:
```
metadata + vocabularies        ← canonical data model & SKOS terms
          │
          ▼
         pqg                   ← property-graph parquet format + tooling
          │
          ▼
data.isamples.org + Zenodo     ← published parquet snapshots (narrow, wide, H3, lite, facet caches)
          │
   ┌──────┴──────┐
   ▼             ▼
examples   isamplesorg.github.io
(Python)   (Web + DuckDB-WASM + Cesium)
```
### Core repositories {.unnumbered}
| Repository | Role | Layer |
|---|---|---|
| [metadata](https://github.com/isamplesorg/metadata) | Canonical data model — the 8 entity types (MaterialSampleRecord, SamplingEvent, SamplingSite, GeospatialCoordLocation, …) and their relationships | schema |
| [vocabularies](https://github.com/isamplesorg/vocabularies) | SKOS vocabularies for material type, context, and specimen categories | schema |
| [pqg](https://github.com/isamplesorg/pqg) | Property-graph Parquet format spec + conversion tooling (narrow ↔ wide); H3 augmentation and facet caches | serialization |
| [examples](https://github.com/isamplesorg/examples) | Python client and Jupyter notebooks — DuckDB + lonboard for interactive analysis. Also known as `isamples-python` (see below) | consumer |
| [isamplesorg.github.io](https://github.com/isamplesorg/isamplesorg.github.io) | This documentation site — Quarto, Observable, browser-side DuckDB-WASM, Cesium globe | consumer |
### Domain extensions {.unnumbered}
Domain-specific vocabularies extend the core terms via `skos:broader`:
- [metadata_profile_earth_science](https://github.com/isamplesorg/metadata_profile_earth_science) — mineral groups, rock/sediment types, sampled-feature roles
- [metadata_profile_biology](https://github.com/isamplesorg/metadata_profile_biology) — sampled-feature extensions for biological specimens
- [metadata_profile_archaeology](https://github.com/isamplesorg/metadata_profile_archaeology) — OpenContext-style material and object-type extensions
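
As a rough illustration of what extending via `skos:broader` implies for consumers, the sketch below walks a term's broader chain so that a domain-specific term can be rolled up to a core term. The term identifiers are hypothetical placeholders, not actual iSamples vocabulary URIs, and the sketch assumes each term has a single broader term, which SKOS itself does not require.

```python
# Hypothetical term identifiers, for illustration only.
# Each key maps to its skos:broader target.
BROADER = {
    "ext:basalt": "ext:igneous_rock",
    "ext:igneous_rock": "core:rock",
    "core:rock": "core:material",
}


def ancestors(term, broader=BROADER):
    """Return the chain of broader terms, nearest first."""
    chain = []
    seen = {term}
    while term in broader:
        term = broader[term]
        if term in seen:
            raise ValueError(f"cycle in broader chain at {term!r}")
        seen.add(term)
        chain.append(term)
    return chain


def rolls_up_to(term, target, broader=BROADER):
    """True if term is target or has target among its broader terms."""
    return term == target or target in ancestors(term, broader)


print(ancestors("ext:basalt"))
# ['ext:igneous_rock', 'core:rock', 'core:material']
print(rolls_up_to("ext:basalt", "core:rock"))  # True
```

A query for a core term can then match records tagged with any extension term beneath it, without the core vocabulary needing to list the extension terms.
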
### Legacy / infrastructure {.unnumbered}
- [isamples_inabox](https://github.com/isamplesorg/isamples_inabox) — the original iSamples-in-a-Box server (Solr + FastAPI). The public [iSamples Central](https://central.isample.xyz/isamples_central/) API was offline as of August 2025; the Solr schema there remains the authoritative precedent for query-dimension names (see [Query Specification](query-spec.qmd))
### Related documents {.unnumbered}
- [Query Specification](query-spec.qmd) — substrate-neutral query contract (v0.1)
- [Serialization catalog](SERIALIZATIONS.md) — every published parquet file with role, size, upstream, and consumer
::: {.callout-note}
### Naming note: `examples` vs `isamples-python`
The Python client repo is called `examples` on GitHub but is referred to as `isamples-python` in its own README, the Zenodo deposition metadata, and most prose documentation. This mismatch is known and slated for reconciliation, most likely by renaming the GitHub repo, with automatic redirects keeping prior links working.
:::