Roadmap
This page tracks how mudm-tools has evolved and where it is headed. The project grew from a lightweight annotation format into a high-performance, Rust-accelerated tiling platform for 2D and 3D spatial biology data.
How to read this page
Phases 1–3 are complete. Phase 4 (adoption and sustainability) is in progress as of mid-2026. Where a phase maps onto a shipped feature, you'll find a link to the relevant guide so you can try it today. Dates and benchmark numbers describe the current pipeline.
Phase 1: Consolidation and Documentation (Complete)
- Refinement of the core model
    - Finalized and stabilized the mudm core model on Pydantic v2.
    - Published updated specifications and documentation.
    - Implemented hierarchical references (parentId, ref) for skeletons and compartment trees.
- Community engagement
    - Established communication channels via GitHub issues.
    - Gathered feedback from stakeholders and early adopters.
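To make the hierarchical references from this phase concrete, here is a minimal sketch of how a parentId link can be resolved into a tree. It uses plain dictionaries rather than the actual muDM Pydantic models, and it assumes each node carries an id and a parentId (None for the root). The record layout is hypothetical, and the ref field is not modeled.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical skeleton nodes: each record points at its parent via "parentId".
# This layout is an assumption for illustration, not the muDM schema.
nodes = [
    {"id": 1, "parentId": None},  # root (e.g. a soma)
    {"id": 2, "parentId": 1},
    {"id": 3, "parentId": 2},
    {"id": 4, "parentId": 2},
]


def build_children(records: List[dict]) -> Dict[Optional[int], List[int]]:
    """Index child ids by parent id; the root is listed under the key None."""
    children: Dict[Optional[int], List[int]] = defaultdict(list)
    for record in records:
        children[record["parentId"]].append(record["id"])
    return dict(children)


def path_to_root(node_id: int, parent_of: Dict[int, Optional[int]]) -> List[int]:
    """Follow parentId links upward, raising if the links form a cycle."""
    path: List[int] = []
    seen = set()
    current: Optional[int] = node_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at node {current}")
        seen.add(current)
        path.append(current)
        current = parent_of[current]
    return path


parent_of = {record["id"]: record["parentId"] for record in nodes}
children = build_children(nodes)
```

Walking upward from a leaf with path_to_root(4, parent_of) visits nodes 4, 2, and 1 in that order, which is the kind of traversal a compartment tree or skeleton needs when validating that every node reaches a root.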
Phase 2: Expanded Features and Extensions (Complete)
- Harmonization with GeoJSON Pydantic
    - Full compatibility with geojson-pydantic: muDM models extend the GeoJSON Pydantic types.
- Harmonization with the OME model
    - Integrated OME-compatible coordinate systems and multiscale metadata. See the OME-NGFF guide.
    - Provenance tracking for data lineage.
- Tiling with TileJSON and binary formats
    - TileJSON 3.0.0 metadata for muDM tilesets.
    - 2D vector-tile pipeline: GeoJSON → PBF (MVT) and tiled Parquet.
    - Python TileWriter/TileReader for the legacy PBF workflow.
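For readers unfamiliar with TileJSON, the sketch below builds a minimal TileJSON 3.0.0 document using only the standard library. It includes the three keys the 3.0.0 specification requires (tilejson, tiles, vector_layers) plus a few optional ones. The tileset name, URL template, layer id, and field names are placeholders, and which additional keys mudm-tools writes is not specified on this page.

```python
import json
from typing import Dict


def make_tilejson(
    name: str,
    tile_url_template: str,
    layers: Dict[str, Dict[str, str]],
    minzoom: int = 0,
    maxzoom: int = 6,
) -> dict:
    """Build a minimal TileJSON 3.0.0 metadata dict for a vector tileset."""
    if minzoom > maxzoom:
        raise ValueError("minzoom must not exceed maxzoom")
    return {
        "tilejson": "3.0.0",            # required: spec version
        "name": name,                   # optional
        "tiles": [tile_url_template],   # required: one or more URL templates
        "minzoom": minzoom,             # optional
        "maxzoom": maxzoom,             # optional
        # required in 3.0.0: each layer needs an "id" and a "fields" mapping
        "vector_layers": [
            {"id": layer_id, "fields": fields}
            for layer_id, fields in layers.items()
        ],
    }


# Placeholder values for illustration only.
metadata = make_tilejson(
    name="example-tileset",
    tile_url_template="https://example.com/tiles/{z}/{x}/{y}.pbf",
    layers={"cells": {"cell_id": "String", "area": "Number"}},
)
print(json.dumps(metadata, indent=2))
```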
Phase 3: 3D Data and Rust Acceleration (Complete)
This phase delivered the Rust extension (mudm_tools._rs, built with PyO3 + maturin) and the full 3D tiling stack. See the 3D tiling guide to run it.
- Rust-accelerated tiling engine
    - Hot-path tiling rewritten in Rust.
    - 2D pipeline: StreamingTileGenerator2D with quadtree clipping, Douglas–Peucker simplification, and parallel tile encoding via rayon (see the 2D tiling guide).
    - 3D pipeline: StreamingTileGenerator with octree indexing, QEM mesh simplification, and multi-format output.
- 3D geometry and mesh support
    - Added PolyhedralSurface and TIN geometry types.
    - OBJ mesh ingestion with parallel parsing (see the converters guide).
    - Fragment file format (MJF2) for sharded intermediate storage.
- Output formats
    - 3D Tiles (GLB) with meshopt or Draco compression.
    - PBF3: a custom protobuf format for 3D tile data.
    - Tiled Parquet (ZSTD) for ML training pipelines (see the GeoParquet/glTF guide).
    - Neuroglancer precomputed format for web-based 3D visualization (see the Neuroglancer guide).
    - 2D PBF (MVT) with a pure-Rust encoder/decoder.
- Compression
    - Meshopt (lossless, fast decode, Brotli-friendly): the default for viewer output.
    - Draco (lossy quantization, smallest on disk): optional.
    - ZSTD for Parquet columns.
    - Brotli HTTP transport compression for GLB serving.
- Neuroglancer precomputed path (hardened in this phase)
    - Multi-LOD Draco meshes for level-of-detail streaming.
    - Opt-in neuroglancer_uint64_sharded_v1 output that packs every segment into .shard files for object-store-scale hosting, instead of one loose file per segment.
    - Deterministic fragment ordering so repeated runs produce byte-identical output (reproducible builds, content-addressable caching).
    - Large-mesh Draco-encode performance improvements.

The sharded layout is opt-in on generate_neuroglancer_multilod (the basic generate_neuroglancer method takes only output_dir and world_bounds). The compiled Rust method is not introspectable by autodoc, so its signature is hand-written here:

```python
generator.generate_neuroglancer_multilod(
    output_dir,
    world_bounds,
    vertex_quantization_bits=10,
    max_memory_bytes=0,
    sharded=False,
    minishard_bits=6,
    shard_bits=0,
)
```

| Parameter | Default | Meaning |
| --- | --- | --- |
| output_dir | required | Destination directory for the precomputed dataset. |
| world_bounds | required | World-space bounding box used to place meshes. |
| vertex_quantization_bits | 10 | Draco position-quantization bits. |
| max_memory_bytes | 0 | Soft memory ceiling for streaming. 0 = auto: the generator's resolved budget (MUDM_MAX_MEMORY_GB env, else 0.8 × physical RAM, else an 8 GiB fallback); never truly unbounded. |
| sharded | False | Emit neuroglancer_uint64_sharded_v1 .shard files instead of loose per-segment files. |
| minishard_bits | 6 | Minishard index bits (sharded mode only). |
| shard_bits | 0 | Shard index bits (sharded mode only). |

When to enable sharding
Loose per-segment files (the default) are simplest for local viewing. Switch on sharded=True when hosting many thousands of segments on an object store (S3/GCS), where one .shard file per group is far cheaper than millions of small objects. See the Neuroglancer guide for a full walkthrough.

- Dataset pipelines
    - MouseLight: 38 brains, 876K rows, meshopt 3D Tiles (84 min total).
    - Hemibrain: 5,000 neurons (95 cell types), Parquet tiling complete.
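The fallback chain described for max_memory_bytes=0 in the Neuroglancer parameters above (environment variable, then a fraction of physical RAM, then a fixed fallback) can be sketched in pure Python. This is an illustration of the documented order only, not the Rust implementation. It assumes MUDM_MAX_MEMORY_GB holds a number measured in GiB, and the function names and the ram_probe parameter are hypothetical.

```python
import math
import os
from typing import Callable, Mapping, Optional

GIB = 1024 ** 3
FALLBACK_BYTES = 8 * GIB   # last-resort budget named in the parameter table
RAM_FRACTION = 0.8         # share of physical RAM used when no env override


def physical_ram_bytes() -> Optional[int]:
    """Return total physical RAM in bytes, or None if it cannot be probed."""
    try:
        total = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None  # e.g. platforms without os.sysconf
    return total if total > 0 else None


def resolve_memory_budget(
    max_memory_bytes: int = 0,
    env: Optional[Mapping[str, str]] = None,
    ram_probe: Callable[[], Optional[int]] = physical_ram_bytes,
) -> int:
    """Resolve a soft memory ceiling in bytes; 0 means 'auto'."""
    if max_memory_bytes > 0:
        return max_memory_bytes          # explicit value wins
    env = os.environ if env is None else env
    raw = env.get("MUDM_MAX_MEMORY_GB")
    if raw:
        try:
            gib = float(raw)             # assumption: value is in GiB
        except ValueError:
            gib = 0.0                    # unparsable value: fall through
        if math.isfinite(gib) and gib > 0:
            return int(gib * GIB)
    ram = ram_probe()
    if ram:
        return int(ram * RAM_FRACTION)   # 0.8 x physical RAM
    return FALLBACK_BYTES                # nothing else known: 8 GiB
```

Passing env and ram_probe explicitly keeps the function deterministic in tests. Called with no arguments, it reads the real environment and probes the host machine.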
Phase 4: Adoption and Long-Term Sustainability (In Progress)
- Format converter registry (shipped)
    - Pluggable converter system in mudm_tools.converters with a convert()/list_formats() API.
    - Built-in converters: GeoJSON, OBJ, Xenium.
    - CLI front end: python -m mudm_tools.converters.cli.

    See the converters guide and CLI reference.
- Xenium 2D spatial-transcriptomics pipeline at scale (shipped)
    - End-to-end converter for 10x Genomics Xenium output (transcripts, cell boundaries, nuclei).
    - Rust-native Parquet ingestion (add_parquet_points, add_parquet_polygons), which bypasses GeoJSON serialization entirely.
    - Rust-native Parquet output (generate_parquet_native) with parallel per-zoom part files.
    - Validated at scale: 42.97M features (breast-cancer dataset) in ~20 minutes. See the 2D tiling guide.
- Web viewers (shipped)
    - 2D Leaflet viewer (src/mudm_tools/viewers/viewer2d/) with a DAPI raster base layer and multi-layer MVT overlays, gene-category color mapping, layer toggles, a hover/click info panel, and a dataset selector.
    - 3D Three.js viewer (src/mudm_tools/viewers/viewer3d/) for streaming 3D Tiles, with bounding-volume display, slice planes, an axis gizmo, and overview/info panels.
    - Both are served by the mudm-serve console script; see the CLI reference.
- Documentation and guides (in progress)
    - This documentation site (the page you're reading).
    - Task-oriented guides for 2D tiling, 3D tiling, converters, Neuroglancer, GeoParquet/glTF, OME-NGFF, and the legacy pipeline.
    - A Python API reference and runnable example scripts under src/mudm_tools/examples/, e.g.:
- Remaining work (general direction)
    - Reference implementations in additional languages.
    - A governance model and standards process for the muDM specification.
    - Broader dataset coverage and additional case studies.
    - Continued community engagement: user meetings and feedback sessions.
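To illustrate what a pluggable converter registry like the one listed earlier in this phase can look like, here is a generic, standard-library-only sketch of the pattern. It is not the mudm_tools.converters code. The names convert and list_formats echo the API names mentioned above, but their signatures here, the register decorator, and the toy "points-csv" format are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry: format name -> function producing a GeoJSON-like dict.
_CONVERTERS: Dict[str, Callable[[Any], dict]] = {}


def register(fmt: str) -> Callable[[Callable[[Any], dict]], Callable[[Any], dict]]:
    """Decorator that adds a converter under a case-insensitive format name."""
    def decorator(func: Callable[[Any], dict]) -> Callable[[Any], dict]:
        key = fmt.lower()
        if key in _CONVERTERS:
            raise ValueError(f"converter for {key!r} already registered")
        _CONVERTERS[key] = func
        return func
    return decorator


def list_formats() -> List[str]:
    """Return the registered format names in sorted order."""
    return sorted(_CONVERTERS)


def convert(data: Any, fmt: str) -> dict:
    """Dispatch to the converter registered for fmt."""
    try:
        converter = _CONVERTERS[fmt.lower()]
    except KeyError:
        raise ValueError(f"unknown format {fmt!r}; known: {list_formats()}") from None
    return converter(data)


@register("points-csv")
def _points_csv(text: str) -> dict:
    """Toy converter: lines of 'x,y' become a FeatureCollection of Points."""
    features = []
    for line in text.strip().splitlines():
        x, y = (float(value) for value in line.split(","))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [x, y]},
            "properties": {},
        })
    return {"type": "FeatureCollection", "features": features}


result = convert("1,2\n3,4", "points-csv")
```

The appeal of this design is that new formats plug in by registration alone, so the dispatch code and any CLI built on top of it never need to change when a converter is added.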
Want to try the current pipeline?
Start with Getting Started for a guided tour, then pick the guide that matches your data: 2D tiling for spatial transcriptomics, or 3D tiling for meshes and neuron morphologies.