Ontology Vocabularies
Attach formal ontology meaning to the free-text values you store in a muDM feature's properties. Mapping a human-friendly string like "pyramidal" to a stable URI such as http://purl.obolibrary.org/obo/CL_0000598 makes your data machine-resolvable against community ontologies (Cell Ontology, UBERON, and others) without changing the simple strings biologists actually read.
Fully optional and backwards compatible
Vocabularies are an additive layer. Any GeoJSON is valid muDM and any muDM document is valid GeoJSON — the vocabularies member is optional and defaults to None. Existing data and existing readers are unaffected.
Why vocabularies?
A muDM feature carries arbitrary key/value metadata in its GeoJSON properties:
from mudm import MuDMFeature
from geojson_pydantic import Point

feat = MuDMFeature(
    type="Feature",
    geometry=Point(type="Point", coordinates=(1.0, 2.0)),
    properties={"cell_type": "pyramidal", "brain_region": "hippocampus_CA1"},
)
Strings like "pyramidal" are convenient for humans but ambiguous for machines: another lab might write "pyr", "pyramidal cell", or "PC" for the same concept. A vocabulary records, alongside the data, exactly which ontology term each value stands for. Tools can then resolve "pyramidal" to a canonical URI, follow it to an ontology, and reason about it (e.g. "is this a subtype of neuron?").
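To make the ambiguity concrete, the sketch below uses plain dictionaries rather than the muDM models. The two labs and the `canonical_uri` helper are hypothetical, introduced only for illustration; the URI is the Cell Ontology term quoted above.

```python
# Two labs label the same concept differently. Each records, next to its
# data, which ontology URI its own string stands for.
PYRAMIDAL_NEURON = "http://purl.obolibrary.org/obo/CL_0000598"

lab_a_terms = {"pyramidal": PYRAMIDAL_NEURON}  # hypothetical lab A
lab_b_terms = {"pyr": PYRAMIDAL_NEURON}        # hypothetical lab B


def canonical_uri(value, terms):
    """Return the URI recorded for a property value, or None if unmapped."""
    return terms.get(value)


# The raw strings differ, but they resolve to the same term.
assert "pyramidal" != "pyr"
assert canonical_uri("pyramidal", lab_a_terms) == canonical_uri("pyr", lab_b_terms)

# A string with no recorded term stays unresolved instead of being guessed.
assert canonical_uri("PC", lab_a_terms) is None
```

Comparing URIs instead of strings is what lets a tool treat both labs' annotations as the same concept.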
muDM models this with two small, plain objects, both importable from the top-level package:
| Object | Role |
|---|---|
| OntologyTerm | A single term: a required uri, plus an optional human label and description. |
| Vocabulary | A mapping for one property: an optional namespace and description, plus a terms dict from property value to OntologyTerm. |
The building blocks
OntologyTerm
A term is a reference to an entry in some ontology. Only uri is required.
from mudm import OntologyTerm

# Full term
term = OntologyTerm(
    uri="http://purl.obolibrary.org/obo/CL_0000598",
    label="pyramidal neuron",
    description="A projection neuron",
)

# Minimal term: just the URI
minimal = OntologyTerm(uri="http://example.org/term/1")
assert minimal.label is None
assert minimal.description is None
| Field | Type | Required | Description |
|---|---|---|---|
| uri | str | yes | Full URI of the ontology term, e.g. http://purl.obolibrary.org/obo/CL_0000598. |
| label | str or None | no | Human-readable label, e.g. "pyramidal neuron". |
| description | str or None | no | Optional longer description of the term. |
Vocabulary
A Vocabulary describes the allowed values for a single property and maps each value to its term. The terms dict is keyed by the exact string that appears in properties.
from mudm import OntologyTerm, Vocabulary

cell_types = Vocabulary(
    namespace="http://purl.obolibrary.org/obo/CL_",
    description="Cell ontology",
    terms={
        "pyramidal": OntologyTerm(
            uri="http://purl.obolibrary.org/obo/CL_0000598",
            label="pyramidal neuron",
        ),
        "interneuron": OntologyTerm(
            uri="http://purl.obolibrary.org/obo/CL_0000099",
        ),
    },
)
assert "pyramidal" in cell_types.terms
assert cell_types.namespace == "http://purl.obolibrary.org/obo/CL_"
| Field | Type | Required | Description |
|---|---|---|---|
| namespace | str or None | no | Common URI prefix for the ontology, e.g. http://purl.obolibrary.org/obo/CL_. |
| description | str or None | no | Optional description of the vocabulary. |
| terms | dict[str, OntologyTerm] | yes | Mapping from a property value (the dict key) to its OntologyTerm. |
A minimal vocabulary needs only terms:
from mudm import OntologyTerm, Vocabulary
v = Vocabulary(terms={"a": OntologyTerm(uri="http://example.org/a")})
assert v.namespace is None
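Since namespace is described as a common URI prefix, a reader may want to confirm that the term URIs actually share it. The source does not say that muDM enforces this, so the following is only an optional consistency check, written against plain dicts in the JSON shape of a Vocabulary. The helper name `uris_outside_namespace` and the "custom" entry are hypothetical.

```python
def uris_outside_namespace(vocabulary):
    """List term URIs that do not start with the vocabulary's namespace.

    `vocabulary` is a plain dict shaped like the JSON form of a Vocabulary.
    With no namespace set there is nothing to compare against, so the
    result is empty.
    """
    namespace = vocabulary.get("namespace")
    if namespace is None:
        return []
    return [
        term["uri"]
        for term in vocabulary["terms"].values()
        if not term["uri"].startswith(namespace)
    ]


cell_types = {
    "namespace": "http://purl.obolibrary.org/obo/CL_",
    "terms": {
        "pyramidal": {"uri": "http://purl.obolibrary.org/obo/CL_0000598"},
        "interneuron": {"uri": "http://purl.obolibrary.org/obo/CL_0000099"},
        # Deliberately inconsistent entry, used only to exercise the check.
        "custom": {"uri": "http://example.org/term/1"},
    },
}

assert uris_outside_namespace(cell_types) == ["http://example.org/term/1"]
# Without a namespace, every URI is accepted.
assert uris_outside_namespace({"terms": cell_types["terms"]}) == []
```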
Attaching vocabularies to data
Both MuDMFeature and MuDMFeatureCollection expose an optional vocabularies member that behaves the same way on each model. It accepts either of two shapes:
- a dict {propertyName: Vocabulary}: inline definitions, keyed by property name; or
- a string URI pointing to an external vocabulary document.
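A consumer therefore has three cases to handle: no vocabularies, an external reference, or inline definitions. A minimal sketch of that dispatch, assuming the value has already been read into ordinary Python objects; the function name `vocabularies_kind` and its return labels are hypothetical:

```python
def vocabularies_kind(vocabularies):
    """Classify a `vocabularies` value as 'none', 'external', or 'inline'."""
    if vocabularies is None:
        return "none"
    if isinstance(vocabularies, str):
        # A plain string is a URI pointing at an external document.
        return "external"
    if isinstance(vocabularies, dict):
        # A dict holds inline definitions keyed by property name.
        return "inline"
    raise TypeError(f"unexpected vocabularies value: {type(vocabularies).__name__}")


assert vocabularies_kind(None) == "none"
assert vocabularies_kind("https://neuromorpho.org/vocab/neuroscience-v1.json") == "external"
assert vocabularies_kind({"cell_type": {"terms": {}}}) == "inline"
```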
Inline vocabularies on a collection
Define vocabularies once on the collection and let every feature inherit them. The dict key ("cell_type") matches the property key used inside properties.
from mudm import MuDMFeature, MuDMFeatureCollection, OntologyTerm, Vocabulary
from geojson_pydantic import Point

fc = MuDMFeatureCollection(
    type="FeatureCollection",
    features=[
        MuDMFeature(
            type="Feature",
            geometry=Point(type="Point", coordinates=(1.0, 2.0)),
            properties={"cell_type": "pyramidal"},
        ),
    ],
    vocabularies={
        "cell_type": Vocabulary(
            namespace="http://purl.obolibrary.org/obo/CL_",
            terms={
                "pyramidal": OntologyTerm(
                    uri="http://purl.obolibrary.org/obo/CL_0000598",
                    label="pyramidal neuron",
                ),
            },
        ),
    },
)
assert "cell_type" in fc.vocabularies
assert fc.vocabularies["cell_type"].terms["pyramidal"].label == "pyramidal neuron"
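Because the vocabulary key matches the property key, resolving a feature's values is a two-level lookup: property name to vocabulary, then property value to term. The sketch below works on plain dicts in the JSON shape of the data and is not part of the muDM API; `resolve_properties` and the `depth_um` property are hypothetical.

```python
def resolve_properties(properties, vocabularies):
    """Map each property name to the URI of its value's ontology term.

    Properties without a vocabulary, and values without a term, are left
    out of the result rather than guessed.
    """
    resolved = {}
    for name, value in properties.items():
        vocabulary = vocabularies.get(name)
        # Only string values can be keys of a terms mapping.
        if vocabulary is None or not isinstance(value, str):
            continue
        term = vocabulary["terms"].get(value)
        if term is not None:
            resolved[name] = term["uri"]
    return resolved


vocabularies = {
    "cell_type": {
        "namespace": "http://purl.obolibrary.org/obo/CL_",
        "terms": {
            "pyramidal": {
                "uri": "http://purl.obolibrary.org/obo/CL_0000598",
                "label": "pyramidal neuron",
            },
        },
    },
}
properties = {
    "cell_type": "pyramidal",
    "depth_um": 120.5,                  # numeric, no vocabulary applies
    "brain_region": "hippocampus_CA1",  # no vocabulary defined for this key
}

assert resolve_properties(properties, vocabularies) == {
    "cell_type": "http://purl.obolibrary.org/obo/CL_0000598",
}
```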
Referencing an external vocabulary document
For large or shared vocabularies, store a URI instead of inlining. The value is a plain string; resolving and fetching it is left to the consuming tool.
from mudm import MuDMFeature, MuDMFeatureCollection
from geojson_pydantic import Point

fc = MuDMFeatureCollection(
    type="FeatureCollection",
    features=[
        MuDMFeature(
            type="Feature",
            geometry=Point(type="Point", coordinates=(1.0, 2.0)),
            properties={},
        ),
    ],
    vocabularies="https://neuromorpho.org/vocab/neuroscience-v1.json",
)
assert fc.vocabularies == "https://neuromorpho.org/vocab/neuroscience-v1.json"
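Since fetching is left to the consumer, one possible approach is a small loader that takes the fetch step as a parameter. This is a sketch under a loud assumption: the format of the external document is not specified here, so the code assumes it is a JSON object with the same shape as the inline dict. The names `load_vocabularies` and `fake_fetch`, and the example.org URL, are hypothetical.

```python
import json


def load_vocabularies(vocabularies, fetch_text):
    """Return inline vocabularies, fetching an external document if needed.

    `fetch_text` is a caller-supplied function that takes a URI and returns
    the document text. ASSUMPTION: the external document is a JSON object
    shaped like the inline {propertyName: vocabulary} dict.
    """
    if vocabularies is None:
        return {}
    if isinstance(vocabularies, str):
        return json.loads(fetch_text(vocabularies))
    return vocabularies


def fake_fetch(uri):
    """Stand-in for a real HTTP or file fetch, so the sketch runs offline."""
    documents = {
        "https://example.org/vocab.json": json.dumps(
            {
                "cell_type": {
                    "terms": {
                        "pyramidal": {
                            "uri": "http://purl.obolibrary.org/obo/CL_0000598"
                        }
                    }
                }
            }
        )
    }
    return documents[uri]


loaded = load_vocabularies("https://example.org/vocab.json", fake_fetch)
assert loaded["cell_type"]["terms"]["pyramidal"]["uri"].endswith("CL_0000598")
assert load_vocabularies(None, fake_fetch) == {}
```

Passing the fetcher in keeps network policy (caching, timeouts, offline use) in the hands of the consuming tool.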
Multiple property vocabularies on one collection
The dict form lets one collection carry independent vocabularies for several properties — for example a cell-type vocabulary drawn from Cell Ontology and a brain-region vocabulary drawn from UBERON.
from mudm import MuDMFeature, MuDMFeatureCollection, OntologyTerm, Vocabulary
from geojson_pydantic import Point

fc = MuDMFeatureCollection(
    type="FeatureCollection",
    features=[
        MuDMFeature(
            type="Feature",
            geometry=Point(type="Point", coordinates=(1.0, 2.0)),
            properties={"cell_type": "pyramidal", "brain_region": "hippocampus_CA1"},
        ),
    ],
    vocabularies={
        "cell_type": Vocabulary(
            namespace="http://purl.obolibrary.org/obo/CL_",
            terms={
                "pyramidal": OntologyTerm(uri="http://purl.obolibrary.org/obo/CL_0000598"),
            },
        ),
        "brain_region": Vocabulary(
            namespace="http://purl.obolibrary.org/obo/UBERON_",
            terms={
                "hippocampus_CA1": OntologyTerm(
                    uri="http://purl.obolibrary.org/obo/UBERON_0003881"
                ),
            },
        ),
    },
)
assert len(fc.vocabularies) == 2
assert (
    fc.vocabularies["brain_region"].terms["hippocampus_CA1"].uri
    == "http://purl.obolibrary.org/obo/UBERON_0003881"
)
Resolution order: feature overrides collection
A feature may define its own vocabularies to override the collection's for the same property. The model does not merge these for you — resolution is a deliberate choice made by the reader. The canonical pattern is a single or: prefer the feature's vocabularies, falling back to the collection's when the feature has none.
A full example where the feature wins:
from mudm import MuDMFeature, MuDMFeatureCollection, OntologyTerm, Vocabulary
from geojson_pydantic import Point

collection_vocab = {
    "cell_type": Vocabulary(
        terms={"pyramidal": OntologyTerm(uri="http://example.org/COLLECTION")},
    ),
}
feature_vocab = {
    "cell_type": Vocabulary(
        terms={"pyramidal": OntologyTerm(uri="http://example.org/FEATURE")},
    ),
}

feat = MuDMFeature(
    type="Feature",
    geometry=Point(type="Point", coordinates=(1.0, 2.0)),
    properties={"cell_type": "pyramidal"},
    vocabularies=feature_vocab,
)
fc = MuDMFeatureCollection(
    type="FeatureCollection",
    features=[feat],
    vocabularies=collection_vocab,
)

# Resolution: check the feature first, then fall back to the collection.
resolved = feat.vocabularies or fc.vocabularies
assert resolved["cell_type"].terms["pyramidal"].uri == "http://example.org/FEATURE"
Whole-object override, not per-key merge
feat.vocabularies or fc.vocabularies selects one object or the other in its entirety. If a feature defines vocabularies at all, the collection's vocabularies are not consulted for that feature — even for property keys the feature does not redefine. If you need per-key fallback, merge the two dicts yourself before resolving.
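The difference between whole-object override and per-key fallback can be sketched with plain dicts. The helper `merge_vocabularies` is hypothetical, not a muDM function; it merges at the level of property keys only, so a feature's vocabulary for a property still replaces the collection's vocabulary for that property in full.

```python
def merge_vocabularies(collection_vocabularies, feature_vocabularies):
    """Per-key fallback: feature entries win, collection entries fill gaps.

    Only inline (dict) vocabularies are merged; None is treated as empty.
    """
    for value in (collection_vocabularies, feature_vocabularies):
        if isinstance(value, str):
            raise TypeError("resolve external vocabulary references before merging")
    merged = dict(collection_vocabularies or {})
    merged.update(feature_vocabularies or {})
    return merged


collection_vocab = {
    "cell_type": {"terms": {"pyramidal": {"uri": "http://example.org/COLLECTION"}}},
    "brain_region": {
        "terms": {
            "hippocampus_CA1": {"uri": "http://purl.obolibrary.org/obo/UBERON_0003881"}
        }
    },
}
feature_vocab = {
    "cell_type": {"terms": {"pyramidal": {"uri": "http://example.org/FEATURE"}}},
}

# Whole-object override: the collection's brain_region vocabulary is lost.
whole_object = feature_vocab or collection_vocab
assert "brain_region" not in whole_object

# Per-key merge: the feature wins for cell_type, brain_region is kept.
merged = merge_vocabularies(collection_vocab, feature_vocab)
assert merged["cell_type"]["terms"]["pyramidal"]["uri"] == "http://example.org/FEATURE"
assert "brain_region" in merged
```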
Backwards compatibility
vocabularies is optional on both models and defaults to None. A feature or collection that never sets it behaves exactly like plain GeoJSON.
from mudm import MuDMFeature, MuDMFeatureCollection
from geojson_pydantic import Point

feat = MuDMFeature(
    type="Feature",
    geometry=Point(type="Point", coordinates=(1.0, 2.0)),
    properties={},
)
assert feat.vocabularies is None

fc = MuDMFeatureCollection(
    type="FeatureCollection",
    features=[feat],
)
assert fc.vocabularies is None
Because the member is omitted when None, documents without vocabularies serialize to ordinary GeoJSON, and any GeoJSON loads cleanly into muDM. See Coordinate Transforms for the same additive philosophy applied to coordinate systems.
JSON on the wire
Vocabularies serialize and deserialize losslessly. The two forms below are the same feature collection — one built in Python, one as it appears on disk.
from mudm import MuDMFeature, MuDMFeatureCollection, OntologyTerm, Vocabulary
from geojson_pydantic import Point

fc = MuDMFeatureCollection(
    type="FeatureCollection",
    features=[
        MuDMFeature(
            type="Feature",
            geometry=Point(type="Point", coordinates=(1.0, 2.0)),
            properties={"cell_type": "pyramidal"},
        ),
    ],
    vocabularies={
        "cell_type": Vocabulary(
            namespace="http://purl.obolibrary.org/obo/CL_",
            description="Cell ontology",
            terms={
                "pyramidal": OntologyTerm(
                    uri="http://purl.obolibrary.org/obo/CL_0000598",
                    label="pyramidal neuron",
                ),
            },
        ),
    },
)
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": { "type": "Point", "coordinates": [1.0, 2.0] },
      "properties": { "cell_type": "pyramidal" }
    }
  ],
  "vocabularies": {
    "cell_type": {
      "namespace": "http://purl.obolibrary.org/obo/CL_",
      "description": "Cell ontology",
      "terms": {
        "pyramidal": {
          "uri": "http://purl.obolibrary.org/obo/CL_0000598",
          "label": "pyramidal neuron"
        }
      }
    }
  }
}
Round-tripping through Python preserves every field:
from mudm import MuDMFeature, MuDMFeatureCollection, OntologyTerm, Vocabulary
from geojson_pydantic import Point

fc = MuDMFeatureCollection(
    type="FeatureCollection",
    features=[
        MuDMFeature(
            type="Feature",
            geometry=Point(type="Point", coordinates=(1.0, 2.0)),
            properties={"cell_type": "pyramidal"},
        ),
    ],
    vocabularies={
        "cell_type": Vocabulary(
            terms={
                "pyramidal": OntologyTerm(
                    uri="http://purl.obolibrary.org/obo/CL_0000598"
                ),
            },
        ),
    },
)

data = fc.model_dump()
fc2 = MuDMFeatureCollection(**data)
assert fc2.vocabularies["cell_type"].terms["pyramidal"].uri == (
    "http://purl.obolibrary.org/obo/CL_0000598"
)
When a string URI is used instead of an inline dict, the vocabularies member is simply that string in the JSON.
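A tool that reads the JSON directly, without the muDM models, can still check the two requirements stated in the field tables: every vocabulary needs terms, and every term needs a uri. The sketch below uses only the standard library; `vocabulary_problems` and `document_with` are hypothetical helpers, and the message wording is illustrative.

```python
import json


def vocabulary_problems(document_text):
    """Return a list of structural problems in a document's `vocabularies`."""
    document = json.loads(document_text)
    vocabularies = document.get("vocabularies")
    if not isinstance(vocabularies, dict):
        # Absent, or a string reference: nothing to check inline.
        return []
    problems = []
    for name, vocabulary in vocabularies.items():
        terms = vocabulary.get("terms") if isinstance(vocabulary, dict) else None
        if not isinstance(terms, dict):
            problems.append(f"{name}: missing 'terms'")
            continue
        for value, term in terms.items():
            if not isinstance(term, dict) or not isinstance(term.get("uri"), str):
                problems.append(f"{name}.{value}: missing 'uri'")
    return problems


def document_with(vocabularies):
    """Build a minimal feature collection document around `vocabularies`."""
    return json.dumps(
        {"type": "FeatureCollection", "features": [], "vocabularies": vocabularies}
    )


good = document_with(
    {"cell_type": {"terms": {"pyramidal": {"uri": "http://purl.obolibrary.org/obo/CL_0000598"}}}}
)
bad = document_with(
    {"cell_type": {"terms": {"pyramidal": {"label": "pyramidal neuron"}}}}
)
external = document_with("https://neuromorpho.org/vocab/neuroscience-v1.json")

assert vocabulary_problems(good) == []
assert vocabulary_problems(bad) == ["cell_type.pyramidal: missing 'uri'"]
assert vocabulary_problems(external) == []
```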
Where to next
- Metadata & Properties: the properties member that vocabularies annotate.
- Coordinate Transforms: physical/voxel coordinate systems, another optional metadata layer.
- Provenance & Traceability: record how a document was produced.
- Worked examples: end-to-end documents that combine these layers.
- For pipelines, converters, tiling, and visualization that consume these vocabularies, see the mudm-tools documentation.
API reference
OntologyTerm
Bases: BaseModel
A reference to a formal ontology term.
Attributes:
| Name | Type | Description |
|---|---|---|
| uri | str | Full URI of the ontology term (e.g. "http://purl.obolibrary.org/obo/CL_0000598"). |
| label | Optional[str] | Human-readable label (e.g. "pyramidal neuron"). |
| description | Optional[str] | Optional longer description of the term. |
Vocabulary
Bases: BaseModel
Maps property values to formal ontology terms.
Attributes:
| Name | Type | Description |
|---|---|---|
| namespace | Optional[str] | Common URI prefix for the ontology (e.g. "http://purl.obolibrary.org/obo/CL_"). |
| description | Optional[str] | Optional description of this vocabulary. |
| terms | Dict[str, OntologyTerm] | Mapping from property values to ontology terms. |