roboto.domain.topics.record#

Module Contents#

class roboto.domain.topics.record.CanonicalDataType(*args, **kwds)#

Bases: enum.Enum

Normalized data types used across different robotics frameworks.

Well-known and simplified data types that provide a common vocabulary for describing message path data types across different frameworks and technologies. These canonical types are primarily used for UI rendering decisions and cross-platform compatibility.

The canonical types abstract away framework-specific details while preserving the essential characteristics needed for data processing and visualization.

References

ROS 1 field types: http://wiki.ros.org/msg
ROS 2 field types: https://docs.ros.org/en/iron/Concepts/Basic/About-Interfaces.html#field-types
uORB: https://docs.px4.io/main/en/middleware/uorb.html#adding-a-new-topic

Example mappings:

float32 -> CanonicalDataType.Number
uint8[] -> CanonicalDataType.Array
sensor_msgs/Image -> CanonicalDataType.Image
geometry_msgs/Pose -> CanonicalDataType.Object
std_msgs/Header -> CanonicalDataType.Object
string -> CanonicalDataType.String
char -> CanonicalDataType.String
bool -> CanonicalDataType.Boolean
byte -> CanonicalDataType.Byte
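
The example mappings above can be sketched as a plain lookup table. This is only an illustration: the enum below is a local stand-in that lists a subset of members, and `EXAMPLE_MAPPINGS` and `to_canonical` are hypothetical names, not part of the Roboto SDK. How the platform actually derives canonical types is not described here.

```python
from enum import Enum


class CanonicalDataType(Enum):
    """Local stand-in listing only the members used in this sketch."""

    Array = "array"
    Boolean = "boolean"
    Byte = "byte"
    Image = "image"
    Number = "number"
    Object = "object"
    String = "string"
    Unknown = "unknown"


# Hypothetical table built only from the example mappings listed above.
EXAMPLE_MAPPINGS = {
    "float32": CanonicalDataType.Number,
    "uint8[]": CanonicalDataType.Array,
    "sensor_msgs/Image": CanonicalDataType.Image,
    "geometry_msgs/Pose": CanonicalDataType.Object,
    "std_msgs/Header": CanonicalDataType.Object,
    "string": CanonicalDataType.String,
    "char": CanonicalDataType.String,
    "bool": CanonicalDataType.Boolean,
    "byte": CanonicalDataType.Byte,
}


def to_canonical(native_type: str) -> CanonicalDataType:
    """Look up a native type, falling back to Unknown when it is not listed."""
    return EXAMPLE_MAPPINGS.get(native_type, CanonicalDataType.Unknown)


print(to_canonical("float32"))
print(to_canonical("my_pkg/CustomMsg"))
```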

Array = 'array'#: A sequence of values.

Boolean = 'boolean'#

Byte = 'byte'#

Categorical = 'categorical'#

Data that can take a limited, fixed set of values. To be interpreted correctly by Roboto clients, a MessagePathRecord with this type must have a "categories" metadata key whose value is the ordered list of values that the Categorical can take.

For example, a signal that is logged as either “off” or “on” could be represented as a Categorical with the metadata "categories"=["off", "on"]. This allows Roboto to map “off” to 0 and “on” to 1 (each value's index position in the metadata array) and therefore visualize these state transitions as a plot.

The default visual representation of Categorical data will be the same as String data, but the Roboto visualizer will be capable of rendering Categorical data in a plot.
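
The index mapping described for Categorical data can be illustrated with a short, self-contained sketch. `encode_categorical` is a hypothetical helper written for this example; it is not an SDK function, and the way Roboto clients perform the mapping internally may differ.

```python
def encode_categorical(values, categories):
    """Map each categorical value to its index position in `categories`."""
    index_by_value = {value: index for index, value in enumerate(categories)}
    return [index_by_value[value] for value in values]


# Metadata as it might appear on a message path of type Categorical.
metadata = {"categories": ["off", "on"]}

# A logged signal that switches between the two states.
signal = ["off", "on", "on", "off"]

encoded = encode_categorical(signal, metadata["categories"])
print(encoded)  # [0, 1, 1, 0]
```

The resulting integer series is what makes the state transitions plottable.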

Image = 'image'#: Special purpose type for data that can be rendered as an image.

LatDegFloat = 'latdegfloat'#: Geographic point in degrees. E.g. 47.6749387 (used in ULog ver_data_format >= 2)

LatDegInt = 'latdegint'#: Geographic point in degrees, expressed as an integer. E.g. 317534036 (used in ULog ver_data_format < 2)

LonDegFloat = 'londegfloat'#: Geographic point in degrees. E.g. 9.1445274 (used in ULog ver_data_format >= 2)

LonDegInt = 'londegint'#: Geographic point in degrees, expressed as an integer. E.g. 1199146398 (used in ULog ver_data_format < 2)

Number = 'number'#

NumberArray = 'number_array'#

Object = 'object'#: A struct with attributes.

String = 'string'#

Timestamp = 'timestamp'#: Time elapsed since the Unix epoch, identifying a single instant on the time-line. Roboto clients will look for a "unit" metadata key on the MessagePath record, and will assume “ns” if none is found. If the timestamp is in a different unit, add the following metadata to the MessagePath record: { "unit": "s"|"ms"|"us"|"ns" } The unit must be a known value from TimeUnit.
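
A minimal sketch of the unit handling described for Timestamp: read the "unit" metadata key, assume "ns" when it is absent, and normalize to nanoseconds. `NS_PER_UNIT` and `to_nanoseconds` are hypothetical names for this illustration and do not reproduce the SDK's TimeUnit type.

```python
# Nanoseconds per supported unit; the keys mirror the units listed above.
NS_PER_UNIT = {"s": 1_000_000_000, "ms": 1_000_000, "us": 1_000, "ns": 1}


def to_nanoseconds(value: int, metadata: dict) -> int:
    """Convert an integer timestamp to nanoseconds using the 'unit' metadata key."""
    unit = metadata.get("unit", "ns")  # default to nanoseconds when no unit is set
    if unit not in NS_PER_UNIT:
        raise ValueError(f"Unsupported time unit: {unit!r}")
    return value * NS_PER_UNIT[unit]


print(to_nanoseconds(1_700_000_000, {"unit": "s"}))  # 1700000000000000000
print(to_nanoseconds(42, {}))  # 42
```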

Unknown = 'unknown'#: This is a fallback and should be used sparingly.

class roboto.domain.topics.record.MessagePathMetadataWellKnown#

Bases: roboto.compat.StrEnum

Well-known metadata key names (with well-known semantics) that may be set in metadata.

These are most often set by Roboto’s first-party ingestion actions and used by Roboto clients.

Categories = 'categories'#

An ordered list of values that a Categorical can take.

Examples

"categories"=["off", "on"]
"categories"=["left", "up", "right", "down"]

ColumnName = 'column_name'#

The original name or path to this field in the source data schema. May differ from message_path if character substitutions were applied to conform to naming requirements.

Notes

Use of this metadata field is soft-deprecated as of SDK v0.24.0.
Prefer use of source_path and path_in_schema instead. Those attributes are now first-class fields on MessagePathRecord and can be specified via AddMessagePathRequest.

Unit = 'unit'#: Unit of a field. E.g., ‘ns’ for a timestamp. If provided, must match a known, supported unit from TimeUnit.

class roboto.domain.topics.record.MessagePathRecord(/, **data)#

Bases: pydantic.BaseModel

Record representing a message path within a topic.

Defines a specific field or signal within a topic’s data schema, including its data type, metadata, and statistical information. Message paths use dot notation to specify nested attributes within complex message structures.

Message paths are the fundamental units for accessing individual data elements within time-series robotics data, enabling fine-grained analysis and visualization of specific signals or measurements.

Parameters:: data (Any)

canonical_data_type: CanonicalDataType#: Normalized data type, used primarily internally by the Roboto Platform.

created: datetime.datetime#

created_by: str#

data_type: str#: ‘Native’/framework-specific data type of the attribute at this path. E.g. “float32”, “uint8[]”, “geometry_msgs/Pose”, “string”.

message_path: str#: Dot-delimited path to the attribute within the datum record.

message_path_id: str#

metadata: collections.abc.Mapping[str, Any] = None#: Key-value pairs to associate with this message path for discovery and search, e.g. { ‘min’: ‘0.71’, ‘max’: ‘1.77’ }

modified: datetime.datetime#

modified_by: str#

org_id: str#: This message path’s organization ID, which is the organization ID of the containing topic.

parents(delimiter='.')#

Logical message path ancestors of this path.

Examples

Given a deeply nested field root.sub_obj_1.sub_obj_2.leaf_field:

>>> field = "root.sub_obj_1.sub_obj_2.leaf_field"
>>> record = MessagePathRecord(message_path=field)  # other fields omitted for brevity
>>> print(record.parents())
['root.sub_obj_1.sub_obj_2', 'root.sub_obj_1', 'root']

Parameters:: delimiter (str)
Return type:: list[str]
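
For readers who want to see the ancestor computation outside of a record, the following standalone function reproduces the documented output. It is a sketch of the behavior shown in the example above, not the SDK's implementation; the free function `parents` here is a hypothetical stand-in for the method.

```python
def parents(message_path: str, delimiter: str = ".") -> list[str]:
    """Return ancestor paths, ordered from the nearest ancestor to the root."""
    parts = message_path.split(delimiter)
    # Drop one trailing component at a time, stopping before the empty path.
    return [delimiter.join(parts[:end]) for end in range(len(parts) - 1, 0, -1)]


print(parents("root.sub_obj_1.sub_obj_2.leaf_field"))
# ['root.sub_obj_1.sub_obj_2', 'root.sub_obj_1', 'root']

print(parents("a/b/c", delimiter="/"))
# ['a/b', 'a']
```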

path_in_schema: list[str]#: List of path components representing the field’s location in the original data schema. Unlike message_path, which must conform to Roboto-specific naming requirements and assumes that dot-separated path parts imply nested data, this preserves the exact path from the source data for accurate attribute access. This is expected to be the split representation of source_path.
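
To illustrate why a list of components can be more reliable than a dotted string, consider a source field whose name itself contains a dot. The message dictionary and the `get_by_path` helper below are hypothetical and exist only for this sketch.

```python
from functools import reduce


def get_by_path(record: dict, path_in_schema: list):
    """Walk a nested mapping one schema-native component at a time."""
    return reduce(lambda node, key: node[key], path_in_schema, record)


# Hypothetical decoded message: the leaf attribute is literally named "position.x".
message = {"pose": {"position.x": 1.5}}

# The component list keeps the dotted attribute name intact.
print(get_by_path(message, ["pose", "position.x"]))  # 1.5

# Naively splitting a dotted string yields three components instead of two,
# which would not resolve against the message above.
print("pose.position.x".split("."))  # ['pose', 'position', 'x']
```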

representations: collections.abc.MutableSequence[RepresentationRecord] = None#: Zero to many Representations of this MessagePath.

source_path: str#

The original name of this field in the source data schema. May differ from message_path if character substitutions were applied to conform to naming requirements.

This is the preferred field to use when specifying message_path_include or message_path_exclude to the get_data or get_data_as_df methods of Topic and Event.

topic_id: str#

class roboto.domain.topics.record.MessagePathRepresentationMapping(/, **data)#

Bases: pydantic.BaseModel

Mapping between message paths and their data representation.

Associates a set of message paths with a specific representation that contains their data. This mapping is used to efficiently locate and access data for specific message paths within topic representations.

Parameters:: data (Any)

message_paths: collections.abc.MutableSequence[MessagePathRecord]#

representation: RepresentationRecord#

class roboto.domain.topics.record.MessagePathStatistic(*args, **kwds)#

Bases: enum.Enum

Statistics computed by Roboto in our standard ingestion actions.

Count = 'count'#

Max = 'max'#

Mean = 'mean'#

Median = 'median'#

Min = 'min'#

class roboto.domain.topics.record.RepresentationRecord(/, **data)#

Bases: pydantic.BaseModel

Record representing a data representation for topic content.

A representation is a pointer to processed topic data stored in a specific format and location. Representations enable efficient access to topic data by providing multiple storage formats optimized for different use cases.

Most message paths within a topic point to the same representation (e.g., an MCAP or Parquet file containing all topic data). However, some message paths may have multiple representations for analytics or preview formats.

Representations are versioned and associated with specific files or storage locations through the association field.

Parameters:: data (Any)

association: roboto.association.Association#: Identifier and entity type with which this Representation is associated. E.g., a file, a database.

created: datetime.datetime#

format: str | None = None#: Content format descriptor for this representation. For image topics: the image encoding (e.g. “jpeg”, “png”) for simplified representations, or the ROS schema name (e.g. “sensor_msgs/Image”) for original/passthrough representations. None for non-image topics or legacy representations.

modified: datetime.datetime#

representation_id: str#

storage_format: RepresentationStorageFormat#

topic_id: str#

transformations: list[str] = None#

Ordered list of transformation descriptors applied to produce this representation. Empty for original/passthrough representations.

Each entry is a "<kind>:<param>" string where <kind> is a TransformationKind member. Construct entries via TransformationKind.with_param() and parse them via TransformationKind.parse() to keep the vocabulary centralized.

Example: ["downsample:0.5", "encode:jpeg"]

version: int#

class roboto.domain.topics.record.RepresentationSelector(/, **data)#

Bases: pydantic.BaseModel

Criteria for selecting among multiple representations of the same data.

When a message path has multiple representations (e.g., both raw sensor data and a processed JPEG encoding), a selector acts as a hard filter: only matching representations qualify, and message paths with no matching representation are dropped from selection results. Callers must handle empty or partial output.

Legacy carve-out for content_format: representations with no format set (i.e., predating the field) are treated as matching any content_format request. This keeps older data accessible. When both an explicit format match and a legacy representation are available for the same message path, the explicit match wins.

Instances are immutable (frozen=True) so they can be safely shared — including as default arguments to methods like Topic.get_data().

Parameters:: data (Any)

content_format#: If set, only representations whose format field matches this value qualify (e.g., "jpeg"). Representations with no format also qualify under the legacy carve-out. None means no constraint.

transformations#: If set, only representations whose transformations field matches exactly qualify. [] matches representations with no transformations (i.e., raw/original data). None means no constraint.

content_format: str | None = None#

matches(representation)#

Check whether a representation satisfies this selector’s criteria.

A representation matches when each non-None selector field is satisfied. For content_format, representations with no format set are treated as matching (legacy carve-out — see class docstring).

Parameters:: representation (RepresentationRecord)
Return type:: bool

model_config#: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

classmethod raw()#

Select representations with no transformations applied (original data).

Return type:: RepresentationSelector

select_representations(mappings)#

Select one representation per message path that matches this selector.

When the API returns multiple representations for the same message paths (e.g., both a raw MCAP and a processed JPEG MCAP for an image topic), this method picks a matching representation for each path and deduplicates so each message path appears in exactly one mapping.

Non-matching representations are excluded. When both an explicit format match and a legacy representation (no format set) cover the same message path, the explicit match wins. Message paths covered by no matching representation are dropped — callers must handle empty or partial results.

Parameters:: mappings (list[MessagePathRepresentationMapping]) – All representation mappings, potentially with overlapping message paths.
Returns:: Deduplicated mappings of message paths to matching representations. Empty if no representation matches.
Return type:: list[MessagePathRepresentationMapping]
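
The matching and selection rules above can be sketched with simplified stand-ins. `Rep` and `Selector` below are hypothetical dataclasses, and `select` takes a plain dictionary of message path to candidate representations rather than the SDK's mapping objects. The sketch assumes only what the docstrings state: a hard filter, the legacy carve-out for representations with no format, and preference for explicit format matches.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rep:
    """Simplified stand-in for a representation record."""

    representation_id: str
    format: Optional[str] = None
    transformations: tuple[str, ...] = ()


@dataclass(frozen=True)
class Selector:
    """Simplified stand-in for a representation selector."""

    content_format: Optional[str] = None
    transformations: Optional[tuple[str, ...]] = None

    def matches(self, rep: Rep) -> bool:
        # Legacy carve-out: a representation with no format matches any format request.
        if (
            self.content_format is not None
            and rep.format is not None
            and rep.format != self.content_format
        ):
            return False
        # Transformations must match exactly when a constraint is set.
        if self.transformations is not None and rep.transformations != self.transformations:
            return False
        return True

    def select(self, candidates: dict[str, list[Rep]]) -> dict[str, Rep]:
        """Pick one matching representation per path; drop paths with no match."""
        chosen: dict[str, Rep] = {}
        for path, reps in candidates.items():
            matching = [rep for rep in reps if self.matches(rep)]
            if not matching:
                continue  # hard filter: the path is dropped from the result
            explicit = [
                rep
                for rep in matching
                if self.content_format is not None and rep.format == self.content_format
            ]
            # An explicit format match wins over a legacy (format-less) one.
            chosen[path] = explicit[0] if explicit else matching[0]
        return chosen


raw = Rep("rep_raw", format="sensor_msgs/Image")
jpeg = Rep("rep_jpeg", format="jpeg", transformations=("encode:jpeg",))
legacy = Rep("rep_legacy")

candidates = {
    "camera.data": [legacy, jpeg, raw],
    "imu.accel": [legacy],
    "lidar.points": [raw],
}
selected = Selector(content_format="jpeg").select(candidates)
print({path: rep.representation_id for path, rep in selected.items()})
# {'camera.data': 'rep_jpeg', 'imu.accel': 'rep_legacy'}
```

Note that "lidar.points" is absent from the result, which is the partial-output case callers need to handle.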

transformations: list[str] | None = None#

class roboto.domain.topics.record.RepresentationStorageFormat(*args, **kwds)#

Bases: enum.Enum

Supported storage formats for topic data representations.

Defines the available formats for storing and accessing topic data within the Roboto platform. Each format has different characteristics and use cases.

MCAP = 'mcap'#: MCAP format - optimized for robotics time-series data with efficient random access.

PARQUET = 'parquet'#: Parquet format - columnar storage optimized for analytics and large-scale data processing.

class roboto.domain.topics.record.SchemaFieldRecord(/, **data)#

Bases: pydantic.BaseModel

A single field within a topic schema.

One entry per unique field path within a schema; field paths are deduplicated across topics that share the schema.

Parameters:: data (Any)

canonical_data_type: CanonicalDataType#: Normalized data type used for cross-framework compatibility and UI rendering decisions.

created: datetime.datetime | None = None#

created_by: str#

data_type: str#: Native, framework-specific data type of the field. E.g. “float32”, “uint8[]”, “geometry_msgs/Pose”.

field_id: str#

model_config#: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

modified: datetime.datetime | None = None#

modified_by: str#

name: str#: Human-readable display name of the field (typically the final component of path_in_schema).

org_id: str#

path_in_schema: tuple[str, Ellipsis]#: Path components locating this field in the source data schema. Each component is a schema-native attribute name, in order from the schema root to the leaf.

schema_id: str#

unit: str | None = None#: Optional unit of the field’s values (e.g., "ns", "m/s"). None if the field is unitless or unknown.

class roboto.domain.topics.record.TimelineExtentRecord(/, **data)#

Bases: pydantic.BaseModel

Min/max timestamp bounds for one topic partition measured against one timeline source.

Written by ingest when a partition’s timestamps are summarized for a given source (e.g., a schema timestamp field, or message log/publish time).

Stored timestamps come through verbatim from the data source: they may be absolute nanoseconds since the Unix epoch, or partition-relative (e.g., monotonic from zero). unix_epoch_offset_ns is the calibration that projects stored values onto Unix-epoch wall-clock: session_time_ns = stored_time_ns + unix_epoch_offset_ns. A value of 0 means the stored timestamps are already absolute Unix-epoch ns, or that no calibration has been applied yet.

Parameters:: data (Any)

created: datetime.datetime | None = None#

created_by: str#

max_timestamp: int | None = None#: Largest stored timestamp in this extent, in nanoseconds. Absolute or partition-relative per the source.

min_timestamp: int | None = None#: Smallest stored timestamp in this extent, in nanoseconds. Absolute or partition-relative per the source.

model_config#: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

modified: datetime.datetime | None = None#

modified_by: str#

org_id: str#

timeline_extent_id: str#

timeline_source_id: str#: ID of the timeline source these bounds are measured against.

topic_part_id: str#: ID of the topic partition these bounds apply to.

unix_epoch_offset_ns: int = 0#: Nanoseconds to add to each stored timestamp to obtain Unix-epoch wall-clock time: session_time_ns = stored_time_ns + unix_epoch_offset_ns. 0 when stored timestamps are already absolute Unix-epoch ns, or when no calibration has been recorded for this partition/source pair.
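
As a worked example of the calibration formula, the sketch below projects partition-relative bounds onto Unix-epoch time. The numeric values and the helper `to_unix_epoch_ns` are made up for illustration and are not part of the SDK.

```python
from datetime import datetime, timezone

NS_PER_SECOND = 1_000_000_000


def to_unix_epoch_ns(stored_time_ns: int, unix_epoch_offset_ns: int = 0) -> int:
    """Apply session_time_ns = stored_time_ns + unix_epoch_offset_ns."""
    return stored_time_ns + unix_epoch_offset_ns


# Hypothetical extent: timestamps are monotonic from zero and span five seconds.
min_timestamp = 0
max_timestamp = 5 * NS_PER_SECOND
unix_epoch_offset_ns = 1_700_000_000 * NS_PER_SECOND

start_ns = to_unix_epoch_ns(min_timestamp, unix_epoch_offset_ns)
end_ns = to_unix_epoch_ns(max_timestamp, unix_epoch_offset_ns)
print(start_ns, end_ns)

# Whole-second precision is enough to display the wall-clock start time.
print(datetime.fromtimestamp(start_ns // NS_PER_SECOND, tz=timezone.utc))
```

With an offset of 0 the function returns the stored value unchanged, which covers both the already-absolute case and the not-yet-calibrated case; the two cannot be told apart from the offset alone.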

type roboto.domain.topics.record.TimelineSourceKind = Literal['schema_field', 'message_log_time', 'message_publish_time']#

Discriminator for how a TimelineSourceRecord derives its timestamps.

"schema_field" points at a timestamp field inside the schema (field_id is set). "message_log_time" and "message_publish_time" point at the message envelope’s log or publish timestamp respectively (field_id is None).

class roboto.domain.topics.record.TimelineSourceRecord(/, **data)#

Bases: pydantic.BaseModel

A registered time source for a schema.

A time source either points at a timestamp field inside the schema (source="schema_field", field_id set) or at the message envelope’s log or publish timestamp (source in {"message_log_time", "message_publish_time"}, field_id is None). Time sources are scoped to a schema, not a topic, so topics that share a schema share their time sources.

Parameters:: data (Any)

created: datetime.datetime | None = None#

created_by: str#

field_id: str | None = None#: ID of the schema field supplying timestamps. Set when source == "schema_field"; otherwise None.
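
The relationship between source and field_id can be expressed as a small consistency check. `check_timeline_source` is a hypothetical helper for illustration; whether the SDK enforces this invariant at model-validation time is not stated here.

```python
from typing import Optional

# Source kinds that read the message envelope rather than a schema field.
ENVELOPE_SOURCES = {"message_log_time", "message_publish_time"}


def check_timeline_source(source: str, field_id: Optional[str]) -> None:
    """Raise ValueError when source and field_id are inconsistent."""
    if source == "schema_field":
        if field_id is None:
            raise ValueError("field_id must be set when source is 'schema_field'")
    elif source in ENVELOPE_SOURCES:
        if field_id is not None:
            raise ValueError(f"field_id must be None when source is {source!r}")
    else:
        raise ValueError(f"Unknown timeline source kind: {source!r}")


check_timeline_source("schema_field", "fld_123")  # consistent: passes silently
check_timeline_source("message_log_time", None)  # consistent: passes silently
```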

is_default: bool = False#: Whether this time source is the default for its schema when no source is specified explicitly.

model_config#: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

modified: datetime.datetime | None = None#

modified_by: str#

name: str#: Human-readable label for this time source.

org_id: str#

schema_id: str#: ID of the schema this time source is registered against.

source: TimelineSourceKind#: Where timestamps come from: a schema field ("schema_field"), or the message envelope’s log or publish timestamp ("message_log_time" / "message_publish_time").

timeline_source_id: str#

class roboto.domain.topics.record.TopicIdentityRecord(/, **data)#

Bases: pydantic.BaseModel

A durable log-stream identity.

Within an organization, topic names are unique. Contributions from different files with the same topic name share a single identity record.

Parameters:: data (Any)

created: datetime.datetime | None = None#

created_by: str#

model_config#: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

modified: datetime.datetime | None = None#

modified_by: str#

name: str#: Human-readable topic name (e.g., "/camera/image_raw"). Unique within an organization.

org_id: str#

topic_id: str#: Stable identifier for this topic identity.

class roboto.domain.topics.record.TopicPartitionRecord(/, **data)#

Bases: pydantic.BaseModel

One file’s contribution to a logical topic.

Pairs a topic identity with a file and carries the per-contribution facts that vary by file: the schema used (schema_id), message count, device provenance, and, for formats that pack multiple logical groups into one file, sub-file segmentation (segment_index, segment_name) and row-level storage bounds (data_from_index, data_to_index). Row bounds are half-open [data_from_index, data_to_index), matching Python slice semantics; both are set together for a row-bounded partition, or both are None when the partition covers the whole file. A partition references a file, not a specific version; reads always resolve to the current version.

Parameters:: data (Any)

created: datetime.datetime | None = None#

created_by: str#

data_from_index: int | None = None#: Inclusive lower bound of this partition’s row range within the file, or None if the partition covers the whole file. Must be set if and only if data_to_index is set.

data_to_index: int | None = None#: Exclusive upper bound of this partition’s row range within the file, forming a half-open range [data_from_index, data_to_index). None if the partition covers the whole file. Must be set if and only if data_from_index is set, and strictly greater than it.
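
The half-open row bounds behave like a Python slice. The sketch below checks the documented constraints and then extracts a partition's rows from an in-memory sequence; `validate_bounds` and `partition_rows` are hypothetical helpers, and real partitions refer to rows in a file rather than to a list.

```python
from typing import Optional, Sequence


def validate_bounds(data_from_index: Optional[int], data_to_index: Optional[int]) -> None:
    """Check that bounds are set together and form a non-empty half-open range."""
    if (data_from_index is None) != (data_to_index is None):
        raise ValueError("data_from_index and data_to_index must be set together")
    if data_from_index is not None and data_to_index <= data_from_index:
        raise ValueError("data_to_index must be strictly greater than data_from_index")


def partition_rows(
    rows: Sequence, data_from_index: Optional[int], data_to_index: Optional[int]
) -> list:
    """Return the rows covered by [data_from_index, data_to_index)."""
    validate_bounds(data_from_index, data_to_index)
    if data_from_index is None:
        return list(rows)  # the partition covers the whole file
    return list(rows[data_from_index:data_to_index])


rows = list(range(10))
print(partition_rows(rows, 2, 5))  # [2, 3, 4]
print(len(partition_rows(rows, None, None)))  # 10
```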

device_id: str | None = None#: ID of the device that produced this contribution, if known.

fs_node_id: str#: ID of the file this partition’s data lives in.

message_count: int | None = None#: Number of messages this partition contributes, if known.

model_config#: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

modified: datetime.datetime | None = None#

modified_by: str#

org_id: str#

schema_id: str#: ID of the schema this contribution conforms to.

segment_index: int = 0#: Zero-based index of the logical segment within the file this partition represents. 0 for formats that hold a single logical group per file.

segment_name: str | None = None#: Optional human-readable name for this segment, when the file format names its segments.

topic_id: str#: ID of the topic identity this contribution belongs to.

topic_part_id: str#

class roboto.domain.topics.record.TopicRecord(/, **data)#

Bases: pydantic.BaseModel

Record representing a topic in the Roboto platform.

A topic is a collection of timestamped data records that share a common name and association (typically a file). Topics represent logical data streams from robotics systems, such as sensor readings, robot state information, or other time-series data.

Data from the same file with the same topic name are considered part of the same topic. Data from different files or with different topic names belong to separate topics, even if they have similar schemas.

When source files are chunked by time or size but represent the same logical data collection, they will produce multiple topic records for the same “logical topic” (same name and schema) across those chunks.

Parameters:: data (Any)

association: roboto.association.Association#: Identifier and entity type with which this Topic is associated. E.g., a file, a dataset.

created: datetime.datetime#

created_by: str#

default_representation: RepresentationRecord | None = None#: Default Representation for this Topic. Assume that if a MessagePath is not more specifically associated with a Representation, it should use this one.

end_time: int | None = None#: Timestamp of latest message in topic, in nanoseconds since epoch (assumed Unix epoch).

message_count: int | None = None#

message_paths: collections.abc.MutableSequence[MessagePathRecord] = None#: Zero to many MessagePathRecords associated with this Topic.

metadata: collections.abc.Mapping[str, Any] = None#: Arbitrary metadata.

modified: datetime.datetime#

modified_by: str#

org_id: str#

schema_checksum: str | None = None#: Checksum of topic schema. May be None if topic does not have a known/named schema.

schema_id: str | None = None#: ID of the schema record for this topic. May be None if the topic has no schema, or if the schema record has not yet been populated.

schema_name: str | None = None#: Type of messages in topic. E.g., “sensor_msgs/PointCloud2”. May be None if topic does not have a known/named schema.

start_time: int | None = None#: Timestamp of earliest message in topic, in nanoseconds since epoch (assumed Unix epoch).

topic_id: str#

topic_name: str#

class roboto.domain.topics.record.TopicSchemaRecord(/, **data)#

Bases: pydantic.BaseModel

A content-addressed topic schema.

Within an organization, two schemas with identical fields share a single record (identified by a deterministic checksum of the fields). name is a mutable, informational label (last-writer-wins) and is not part of the schema’s identity.

Parameters:: data (Any)

checksum: str#: Deterministic checksum computed over the schema’s fields; identical schemas share a checksum.

created: datetime.datetime | None = None#

created_by: str#

model_config#: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

modified: datetime.datetime | None = None#

modified_by: str#

name: str | None = None#: Informational label for the schema (e.g., "sensor_msgs/PointCloud2"). Not part of identity.

org_id: str#

schema_id: str#: Stable identifier for this schema record.

class roboto.domain.topics.record.TransformationKind#

Bases: roboto.compat.StrEnum

Canonical vocabulary of transformations that can be applied when producing a representation.

A transformation is serialized into RepresentationRecord.transformations as a "<kind>:<param>" string (e.g. "downsample:0.5", "encode:jpeg"). This enum is the source of truth for the set of supported kinds; the parameter tail remains free-form because different kinds carry different parameter shapes (floats, format tokens, etc.).

Producers should construct transformation strings via with_param() and consumers should destructure them via parse() to keep the vocabulary centralized.

Examples

>>> TransformationKind.DOWNSAMPLE.with_param(0.5)
'downsample:0.5'
>>> TransformationKind.parse("encode:jpeg")
(<TransformationKind.ENCODE: 'encode'>, 'jpeg')

DOWNSAMPLE = 'downsample'#: Spatial or temporal downsampling. Parameter is a float scale factor in (0, 1].

ENCODE = 'encode'#: Re-encoding to a different content format. Parameter is the target format token (e.g. "jpeg").

classmethod parse(descriptor)#

Parse a "<kind>:<param>" transformation descriptor into its kind and raw parameter.

Raises:: ValueError – If the kind prefix is not a known TransformationKind member.
Parameters:: descriptor (str)
Return type:: tuple[TransformationKind, str]

with_param(param)#

Construct a transformation descriptor string for this kind with the given parameter.

Parameters:: param (object)
Return type:: str
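
The "<kind>:<param>" round trip documented for TransformationKind can be reproduced with a standalone enum. `TransformationKindSketch` is a local stand-in written for this example, built on `str` and `enum.Enum` rather than roboto.compat.StrEnum; it assumes the descriptor is split at the first colon, which the documentation does not spell out.

```python
from __future__ import annotations

from enum import Enum


class TransformationKindSketch(str, Enum):
    """Local stand-in mirroring the documented members and methods."""

    DOWNSAMPLE = "downsample"
    ENCODE = "encode"

    def with_param(self, param: object) -> str:
        """Build a '<kind>:<param>' descriptor for this kind."""
        return f"{self.value}:{param}"

    @classmethod
    def parse(cls, descriptor: str) -> tuple[TransformationKindSketch, str]:
        """Split a descriptor into its kind and raw parameter string."""
        kind, _, param = descriptor.partition(":")  # assumption: split at the first colon
        return cls(kind), param  # cls(kind) raises ValueError for unknown kinds


transformations = [
    TransformationKindSketch.DOWNSAMPLE.with_param(0.5),
    TransformationKindSketch.ENCODE.with_param("jpeg"),
]
print(transformations)  # ['downsample:0.5', 'encode:jpeg']

for descriptor in transformations:
    kind, param = TransformationKindSketch.parse(descriptor)
    print(kind.value, param)
```

The parameter comes back as a raw string, so a consumer that needs the downsample factor as a number would convert it itself, for example with float(param).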