Usage

CLI

The tro-utils command-line tool is the primary interface for managing TROs.

Global options

These options (or equivalent environment variables) apply to every command:

Option

Env var

Description

--declaration PATH

Path to the .jsonld TRO file

--profile PATH

TRS_PROFILE

Path to the TRS profile JSON

--gpg-fingerprint KEY

GPG_FINGERPRINT

GPG key fingerprint for signing

--gpg-passphrase PASS

GPG_PASSPHRASE

GPG key passphrase

--tro-creator TEXT

Creator field for a new TRO

--tro-name TEXT

Name field for a new TRO

--tro-description TEXT

Description field for a new TRO

Commands

tro-utils verify-timestamp <declaration>

Verifies the RFC 3161 timestamp and GPG signature on the TRO.

tro-utils verify-package <declaration> <package> [-a ARRANGEMENT] [-s SUBPATH] [-v]

Verifies a replication package (directory or .zip) against the hashes stored in an arrangement.

tro-utils arrangement add <directory> [-m COMMENT] [-i IGNORE]
tro-utils arrangement add --from-snapshot <snapshot.jsonld> [-m COMMENT]
tro-utils arrangement snapshot <directory> -o <snapshot.jsonld> [-m COMMENT] [-i IGNORE]
tro-utils arrangement list [-v]

Adds a directory snapshot as a new arrangement, lists existing arrangements, or computes a reusable arrangement snapshot file without a TRO (see Arrangement snapshots).

tro-utils composition info [-v]

Shows artifacts in the current composition with their MIME type and hash.

tro-utils performance add [-m COMMENT] [-s START] [-e END] [-a ATTRIBUTE] [-A ACCESSED] [-M CONTRIBUTED]

Records a TrustedResearchPerformance entry in the TRO. -A and -M can be repeated to reference multiple input or output arrangements:

tro-utils --declaration my.jsonld performance add \
  -m "Processing run" \
  -A arrangement/0 -A arrangement/1 \
  -M arrangement/2
tro-utils sign

GPG-signs the TRO declaration, writing a .sig file.

tro-utils report -t TEMPLATE -o OUTPUT

Renders a Jinja2 report template with an embedded workflow diagram.


Arrangement snapshots

Scanning a large directory is expensive. arrangement snapshot decouples the scan from the TRO creation step: compute the snapshot once and reuse it across multiple TROs or workflow runs.

Compute a snapshot (no TRO required)

tro-utils arrangement snapshot /mnt/large-dataset \
  -m "Reference dataset v1.2" \
  -o dataset-v1.2.jsonld

This produces a self-contained JSON-LD file that records every file path and its SHA-256 hash, along with the MIME type of each artifact. No TRO declaration file is needed.

Reuse a snapshot when building a TRO

# Use the pre-computed snapshot instead of re-scanning
tro-utils --declaration my.jsonld arrangement add --from-snapshot dataset-v1.2.jsonld

# Optionally override the comment stored in the snapshot
tro-utils --declaration my.jsonld arrangement add \
  --from-snapshot dataset-v1.2.jsonld \
  -m "Static data mount"

Artifacts already present in the TRO composition are matched by content hash — no duplicates are ever inserted, even if the same file appears in multiple snapshots or live arrangements.


Multiple accessed / contributed arrangements

A TrustedResearchPerformance can reference more than one input (accessed) or output (contributed) arrangement. This models workflows that merge several data sources or produce several outputs.

CLI

Repeat -A / -M as many times as needed:

tro-utils --declaration my.jsonld performance add \
  -m "Merge and process" \
  -A arrangement/0 \
  -A arrangement/1 \
  -M arrangement/2

When a single arrangement is provided the JSON-LD value is a plain object; when multiple are provided it becomes a list — both forms are handled transparently on load.

Python API

tro.add_performance(
    start_time=start,
    end_time=end,
    comment="Merge and process",
    accessed_arrangement=["arrangement/0", "arrangement/1"],  # list accepted
    modified_arrangement="arrangement/2",                     # single string also accepted
    attrs=[TRPAttribute.NET_ISOLATION],
)

The accessed_arrangement and modified_arrangement parameters accept a single str, a (arrangement_id, mount_path) tuple, a list of either, or None.


Mount paths and ArrangementBinding

The same arrangement can be mounted at different paths in different performances (e.g. /input in one run and /output in another). To represent this unambiguously in RDF, each reference is wrapped in an intermediate trov:ArrangementBinding object that ties the arrangement, the mount path, and the performance together.

"trov:accessedArrangement": [
  {
    "@id": "trp/0/binding/0",
    "@type": "trov:ArrangementBinding",
    "trov:arrangement": { "@id": "arrangement/0" },
    "trov:boundTo": "/mnt/input"
  }
]

Binding IDs are generated automatically when you call add_performance. The boundTo field is optional — omit it when the path is not meaningful.

CLI — ARRANGEMENT_ID:MOUNT_PATH syntax

Supply a mount path by separating the arrangement ID and path with ::

tro-utils --declaration my.jsonld performance add \
  --start 2024-01-01T10:00:00 \
  --end   2024-01-01T11:00:00 \
  -A arrangement/0:/mnt/input \
  -A arrangement/1 \
  -M arrangement/2:/mnt/output

Entries without a : are plain arrangement IDs (no mount path recorded). The two forms can be mixed freely in a single command.

Python API

add_performance accepts each arrangement as either a plain str (no mount path) or a (arrangement_id, mount_path) tuple:

tro.add_performance(
    start_time=start,
    end_time=end,
    comment="Containerised run",
    accessed_arrangement=[
        ("arrangement/0", "/mnt/input"),  # tuple: (id, boundTo path)
        "arrangement/1",                  # plain string: no path
    ],
    modified_arrangement=("arrangement/2", "/mnt/output"),
    attrs=[TRPAttribute.NET_ISOLATION],
)

A single value or a list is accepted for both parameters.

The resolved ArrangementBinding objects are stored on TrustedResearchPerformance.accessed_arrangements and TrustedResearchPerformance.contributed_to_arrangements:

for binding in perf.accessed_arrangements:
    print(binding.binding_id)       # e.g. "trp/0/binding/0"
    print(binding.arrangement_id)   # e.g. "arrangement/0"
    print(binding.path)             # e.g. "/mnt/input"  (or None)

TRO — high-level facade

tro_utils.tro_utils.TRO is the recommended entry point for programmatic use. It wraps TransparentResearchObject and adds signing, timestamping, and report generation.

from tro_utils.tro_utils import TRO

# Create or load a TRO
tro = TRO(
    filepath="sample_tro.jsonld",
    gpg_fingerprint="ABCD1234...",
    gpg_passphrase="secret",
    profile="trs.jsonld",
    tro_creator="Alice",
    tro_name="My TRO",
    tro_description="A sample transparent research object",
)

# Add an arrangement (scans a directory)
tro.add_arrangement("/path/to/workdir", comment="Before workflow", ignore=[".git"])

# Or load a pre-computed snapshot (avoids rescanning an expensive directory)
tro.add_arrangement_from_snapshot("/path/to/dataset.jsonld", comment="Static mount")

# Record a performance
from datetime import datetime
tro.add_performance(
    start_time=datetime(2024, 3, 1, 9, 22, 1),
    end_time=datetime(2024, 3, 2, 10, 0, 11),
    comment="My workflow run",
    attrs=["trov:InternetIsolation"],
    accessed_arrangement="arrangement/0",   # str | (id, path) tuple | list | None
    modified_arrangement="arrangement/1",   # str | (id, path) tuple | list | None
)

# Save, sign, and timestamp
tro.save()
tro.trs_signature()
tro.request_timestamp()
tro.verify_timestamp()

# Verify a replication package
result = tro.verify_replication_package(
    arrangement_id="arrangement/1",
    package="/path/to/package.zip",
    subpath=None,
)
print(result.is_valid)          # True / False
print(result.mismatched_hashes) # list of (path, expected, actual) tuples

# Render a report
tro.generate_report(template="tro.md.jinja2", report="report.md")

Key properties

Property

Returns

Description

tro_filename

str

Path to the .jsonld file

sig_filename

str

Path to the .sig file

tsr_filename

str

Path to the .tsr file

data

dict

Full JSON-LD representation (calls to_jsonld())


TransparentResearchObject — data model

tro_utils.models.TransparentResearchObject is the root data-model class. Use it directly when you need fine-grained control or are building tooling on top of the model.

from tro_utils.models import TransparentResearchObject

# Load an existing TRO
tro = TransparentResearchObject.load("sample_tro.jsonld")

# Inspect structure
print(tro.name, tro.description)
print(tro.composition.artifacts)
for arr in tro.arrangements:
    print(arr.arrangement_id, arr.comment)
for perf in tro.performances:
    print(perf.performance_id, perf.started_at, perf.ended_at)

# Save
tro.save("output.jsonld")

Model hierarchy

TransparentResearchObject          tro.py
├── TrustedResearchSystem          trs.py
│   └── TRSCapability[]            trs.py
├── TimeStampingAuthority          tsa.py   (optional)
├── ArtifactComposition            composition.py
│   ├── ResearchArtifact[]         artifact.py
│   │   └── HashValue              hash_value.py
│   └── CompositionFingerprint     composition.py
├── ArtifactArrangement[]          arrangement.py
│   └── ArtifactLocation[]         arrangement.py
├── TrustedResearchPerformance[]   performance.py
│   ├── ArrangementBinding[]       performance.py  (accessed_arrangements)
│   ├── ArrangementBinding[]       performance.py  (contributed_to_arrangements)
│   └── PerformanceAttribute[]     performance.py
└── TROAttribute[]                 attribute.py

All model classes inherit from TROVModel and implement to_jsonld() / from_jsonld().

ArtifactArrangement also supports standalone snapshot serialisation:

from tro_utils.models import ArtifactArrangement, ArtifactComposition

# Compute and persist a snapshot (no TRO needed)
comp = ArtifactComposition()
arr = ArtifactArrangement.from_directory(
    "/mnt/large-dataset", comp, arrangement_id="arrangement/0", comment="Dataset v1.2"
)
arr.save_snapshot("dataset-v1.2.jsonld", comp)

# Later — merge the snapshot into any TRO's composition
tro_model = TransparentResearchObject.load("my.jsonld")
tro_model.add_arrangement_from_snapshot("dataset-v1.2.jsonld")
tro_model.save("my.jsonld")

Enums

tro_utils.TROVCapability maps capability names to trov:Can* JSON-LD types. tro_utils.TRPAttribute maps the same names to trov:* performance-attribute types. Use TROVCapability.translate(trp_member) to convert between the two.

from tro_utils import TROVCapability, TRPAttribute

cap = TROVCapability.NET_ISOLATION        # "trov:CanProvideInternetIsolation"
attr = TRPAttribute.NET_ISOLATION         # "trov:InternetIsolation"
assert TROVCapability.translate(attr) == cap

Available enum keys: RECORD_NETWORK, NET_ISOLATION, ENV_ISOLATION, NON_INTERACTIVE, EXCLUDE_INPUT, EXCLUDE_OUTPUT, ALL_DATA_INCLUDED, REQUIRE_INPUT_DATA, REQUIRE_LOCAL_DATA, DATA_PERSIST, OUTPUT_INCLUDED, CODE_INCLUDED, SOFTWARE_RECORD, NET_ACCESS, MACHINE_ENFORCEMENT.


ReplicationPackage

from tro_utils.replication_package import ReplicationPackage

result = ReplicationPackage.verify(
    arrangement=tro.arrangements[1],
    composition=tro.composition,
    package="/path/to/package",   # directory or .zip
    subpath=None,
)

if not result.is_valid:
    print("Missing in arrangement:", result.files_missing_in_arrangement)
    print("Mismatched hashes:", result.mismatched_hashes)
    print("Missing in package:", result.files_missing_in_package)