libera_utils.packets

Module for reading packet data using Space Packet Parser.

Functions

parse_packets_to_dataframe(...[, apid, ...])

Parse packets from files into a pandas DataFrame using Space Packet Parser v6.0.0rc3.

parse_packets_to_dataset(packet_files, ...)

Parse packets from files into an xarray Dataset using specified packet definition.

parse_packets_to_l1a_dataset(packet_files, apid)

Parse packets to L1A dataset with configurable sample expansion.

read_azel_packet_data(packet_data_filepaths)

Read Az/El packet data from a list of file paths.

read_sc_packet_data(packet_data_filepaths[, ...])

Read spacecraft packet data from a list of file paths.

libera_utils.packets._aggregate_fields(dataset: Dataset, group: AggregationGroup) → ndarray

Aggregate multiple sequential fields into a single binary blob per packet.

Optimized version using vectorized numpy operations with zero-copy view conversion. Assumes all fields exist (validated by Space Packet Parser during parsing).

Parameters:
  • dataset (xr.Dataset) – Dataset containing the individual fields to aggregate.

  • group (AggregationGroup) – Configuration for the aggregation group.

Returns:

Array of aggregated binary data with dtype matching group.dtype.

Return type:

np.ndarray
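The aggregation described above can be illustrated with a minimal sketch. It assumes each sequential field holds exactly one byte per packet and uses plain NumPy arrays in place of the Dataset and AggregationGroup; the helper name `aggregate_byte_fields` is hypothetical and is not part of libera_utils.

```python
import numpy as np


def aggregate_byte_fields(fields: list[np.ndarray], dtype: np.dtype) -> np.ndarray:
    """Pack per-packet single-byte fields into one binary blob per packet (illustrative only)."""
    # Stack to shape (n_packets, n_fields): one row holds one packet's bytes in field order.
    stacked = np.ascontiguousarray(np.stack(fields, axis=1), dtype=np.uint8)
    if stacked.shape[1] != dtype.itemsize:
        raise ValueError("blob dtype size must equal the number of byte fields")
    # Reinterpret each row as a single item of `dtype`; the view shares the stacked buffer.
    return stacked.view(dtype).reshape(-1)


# Three hypothetical one-byte fields from two packets.
byte0 = np.array([0x01, 0x0A], dtype=np.uint8)
byte1 = np.array([0x02, 0x0B], dtype=np.uint8)
byte2 = np.array([0x03, 0x0C], dtype=np.uint8)
blobs = aggregate_byte_fields([byte0, byte1, byte2], np.dtype("V3"))
```

Only the final `view` call avoids a copy here; stacking the columns still allocates one new array. The real function may organize its buffers differently.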

libera_utils.packets._drop_duplicates(dataset: Dataset, coordinate_name: str)

Detect and drop duplicate values based on a coordinate.

Issues a warning when duplicates are detected.

Parameters:
  • dataset (xr.Dataset) – The dataset to deduplicate

  • coordinate_name (str) – The name of the coordinate over which to search for duplicates

Returns:

  • dataset (xr.Dataset) – Deduplicated dataset

  • n_duplicates (int) – Number of duplicates detected and dropped
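As a rough illustration of this behavior, the sketch below deduplicates a coordinate array with NumPy and reports how many entries were dropped. It assumes the first occurrence of each value is the one kept, which the documentation above does not state, and the helper name `drop_duplicate_times` is hypothetical.

```python
import warnings

import numpy as np


def drop_duplicate_times(times: np.ndarray, values: np.ndarray) -> tuple[np.ndarray, np.ndarray, int]:
    """Keep the first occurrence of each coordinate value (assumed policy, illustrative only)."""
    # np.unique returns the index of the first occurrence of every unique value.
    _, first_idx = np.unique(times, return_index=True)
    keep = np.sort(first_idx)  # restore the original ordering of the kept entries
    n_duplicates = times.size - keep.size
    if n_duplicates:
        warnings.warn(f"Dropping {n_duplicates} duplicate coordinate value(s)", stacklevel=2)
    return times[keep], values[keep], n_duplicates


times = np.array([10, 20, 20, 30, 10])
values = np.array([1, 2, 3, 4, 5])
unique_times, unique_values, n_dropped = drop_duplicate_times(times, values)
```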

libera_utils.packets._expand_sample_group(dataset: Dataset, group: SampleGroup) → tuple[dict[str, ndarray], ndarray]

Expand a sample group (timestamps and measured values) into separate field arrays.

For samples within a packet (1 or many), expand those samples into separate arrays, with coordinates of sample time rather than packet time.

Notes

For periodic samples based on an epoch, we use the epoch and the period to calculate sample times assuming that the epoch is the first sample time. For samples that each have their own timestamp, we convert each sample time to microseconds.

Parameters:
  • dataset (xr.Dataset) – Dataset containing the packet data.

  • group (SampleGroup) – Configuration for the sample group.

Returns:

Dictionary of field name to field array, and time array.

Return type:

tuple[dict[str, np.ndarray], np.ndarray]
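The periodic case described in the Notes (epoch plus a fixed period, with the epoch taken as the first sample time) can be sketched as follows. This is an illustration with plain NumPy arrays, not the library's implementation; the function name and its arguments are hypothetical.

```python
import numpy as np


def periodic_sample_times(epochs: np.ndarray, period_us: int, n_samples: int) -> np.ndarray:
    """Compute one timestamp per sample from per-packet epochs and a fixed period."""
    epochs = epochs.astype("datetime64[us]")
    # Sample k within a packet is offset k * period from that packet's epoch.
    offsets = (np.arange(n_samples) * period_us).astype("timedelta64[us]")
    # Broadcast (n_packets, 1) + (n_samples,) to (n_packets, n_samples), then flatten packet by packet.
    return (epochs[:, None] + offsets).ravel()


# Two packets one second apart, four samples per packet, 250 ms between samples.
epochs = np.array(["2024-01-01T00:00:00", "2024-01-01T00:00:01"], dtype="datetime64[us]")
sample_times = periodic_sample_times(epochs, period_us=250_000, n_samples=4)
```

The flattened array is ordered packet by packet, so it can serve as a sample-time coordinate aligned with measured values flattened in the same order.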

libera_utils.packets._expand_sample_times(dataset: Dataset, time_fields: TimeFieldMapping, n_samples: int) → ndarray

Expand sample time fields into a flat array.

Parameters:
  • dataset (xr.Dataset) – Dataset containing the time fields.

  • time_fields (TimeFieldMapping) – Time field mapping with patterns that may include %i placeholders.

  • n_samples (int) – Number of samples per packet.

Returns:

Flattened array of sample times as datetime64[us].

Return type:

np.ndarray
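To make the %i placeholder idea concrete, here is a minimal sketch under several assumptions: the placeholder is replaced by a zero-based sample index, each expanded field already holds integer microseconds since the Unix epoch, and a plain dict of arrays stands in for the Dataset. The field names and both helper functions are hypothetical, and the real TimeFieldMapping may combine several fields per timestamp.

```python
import numpy as np


def expand_pattern(pattern: str, n_samples: int) -> list[str]:
    """Expand a field-name pattern such as 'T_US_%i' into one name per sample (zero-based, assumed)."""
    return [pattern.replace("%i", str(i)) for i in range(n_samples)]


def flatten_sample_times(fields: dict[str, np.ndarray], pattern: str, n_samples: int) -> np.ndarray:
    """Gather per-sample time fields and flatten them packet by packet."""
    names = expand_pattern(pattern, n_samples)
    per_packet = np.stack([fields[name] for name in names], axis=1)  # (n_packets, n_samples)
    return per_packet.ravel().astype("datetime64[us]")


# Two packets with three timestamped samples each (microseconds since epoch).
fields = {
    "T_US_0": np.array([0, 100]),
    "T_US_1": np.array([10, 110]),
    "T_US_2": np.array([20, 120]),
}
flat_times = flatten_sample_times(fields, "T_US_%i", n_samples=3)
```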

libera_utils.packets._get_aggregated_field_names(dataset: Dataset, group: AggregationGroup) → set[str]

Get all field names that are aggregated for an aggregation group.

Parameters:
  • dataset (xr.Dataset) – Dataset containing the fields.

  • group (AggregationGroup) – Aggregation group configuration.

Returns:

Set of field names that are aggregated.

Return type:

set[str]

libera_utils.packets._get_expanded_field_names(dataset: Dataset, group: SampleGroup) → set[str]

Get all field names that are expanded for a sample group.

This extracts all the field names for a sample group that we use to expand the samples (time fields and data fields) so that we can remove these fields from the primary array to save space.

Parameters:
  • dataset (xr.Dataset) – Dataset containing the fields.

  • group (SampleGroup) – Sample group configuration.

Returns:

Set of field names that are expanded.

Return type:

set[str]

libera_utils.packets.parse_packets_to_dataframe(packet_definition: str | CloudPath | Path | XtcePacketDefinition, packet_data_filepaths: list[str | CloudPath | Path], apid: int | None = None, skip_header_bytes: int = 0) → DataFrame

Parse packets from files into a pandas DataFrame using Space Packet Parser v6.0.0rc3.

Parameters:
  • packet_definition (str | PathType | XtcePacketDefinition) – XTCE packet definition file path or pre-loaded XtcePacketDefinition object.

  • packet_data_filepaths (list[str | PathType]) – List of filepaths to packet files.

  • apid (Optional[int]) – APID to filter on, which avoids mismatches when the parser finds multiple parsable packet definitions in the files. This can happen if the XTCE document defines multiple packet types and more than one of those types is present in the packet data files.

  • skip_header_bytes (int) – Number of header bytes to skip when reading packet files. Default is 0.

Returns:

pandas DataFrame containing parsed packet data.

Return type:

pd.DataFrame
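For context on the apid filter: in the CCSDS Space Packet primary header, the APID is the low 11 bits of the first 16-bit word, and the third 16-bit word stores the packet data length minus one. The standard-library sketch below walks a byte string and keeps only packets with a given APID. It is an illustration of the header layout, not how Space Packet Parser or this function is implemented, and `iter_packets` and `make_packet` are hypothetical helpers.

```python
import struct
from collections.abc import Iterator

PRIMARY_HEADER_LEN = 6  # the CCSDS primary header is 6 bytes long


def iter_packets(stream: bytes) -> Iterator[tuple[int, bytes]]:
    """Yield (apid, packet_bytes) for each CCSDS packet in a byte string."""
    offset = 0
    while offset + PRIMARY_HEADER_LEN <= len(stream):
        word1, _sequence, data_len = struct.unpack_from(">HHH", stream, offset)
        apid = word1 & 0x07FF  # low 11 bits of the first 16-bit word
        total_len = PRIMARY_HEADER_LEN + data_len + 1  # length field stores (data bytes - 1)
        yield apid, stream[offset : offset + total_len]
        offset += total_len


def make_packet(apid: int, payload: bytes) -> bytes:
    """Build a minimal packet for the demonstration (hypothetical helper, non-empty payload)."""
    return struct.pack(">HHH", apid & 0x07FF, 0xC000, len(payload) - 1) + payload


# A stream mixing two packet types; keep only APID 11.
stream = make_packet(11, b"\x01\x02") + make_packet(1048, b"\x03") + make_packet(11, b"\x04")
selected = [packet for apid, packet in iter_packets(stream) if apid == 11]
```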

libera_utils.packets.parse_packets_to_dataset(packet_files: list[PathLike | str], packet_definition: str | PathLike, apid: int, **generator_kwargs) → Dataset

Parse packets from files into an xarray Dataset using specified packet definition.

This function does not make any changes to the packet data other than filtering by a single APID.

Parameters:
  • packet_files (list[PathLike | str]) – List of filepaths to packet files.

  • packet_definition (str | PathLike) – Path to the XTCE packet definition file.

  • apid (int) – Application Process Identifier to filter for.

  • **generator_kwargs – Additional keyword arguments passed to the packet generator.

Returns:

xarray Dataset containing parsed packet data.

Return type:

xr.Dataset

libera_utils.packets.parse_packets_to_l1a_dataset(packet_files: list[PathLike | str], apid: int) → Dataset

Parse packets to L1A dataset with configurable sample expansion.

This function parses binary packet files and expands multi-sample fields according to a configuration identified by APID. It creates xarray Datasets with time coordinates as dimensions.

Parameters:
  • packet_files (list[PathLike | str]) – List of filepaths to packet files.

  • apid (int) – The APID (Application Process Identifier) value for the packet type. Used to select the appropriate configuration for generating the L1A Dataset structure.

Returns:

xarray Dataset with:
  • Main packet data array with packet timestamp dimension

  • Separate arrays for each sample group with optional multi-field expansion

  • All time coordinates properly set as dimensions

Return type:

xr.Dataset

libera_utils.packets.read_azel_packet_data(packet_data_filepaths: list[str | CloudPath | Path], apid: int = 1048) → DataFrame

Read Az/El packet data from a list of file paths.

Parameters:
  • packet_data_filepaths (list[str | Path | CloudPath]) – The list of file paths to the raw packet data

  • apid (int) – Application Process Identifier to filter for. Default is 1048 for Az/El sample packets.

Returns:

packet_data – The configured packet data as a pandas DataFrame with restructured samples.

Return type:

pd.DataFrame

libera_utils.packets.read_sc_packet_data(packet_data_filepaths: list[str | CloudPath | Path], apid: int = 11) → DataFrame

Read spacecraft packet data from a list of file paths.

Parameters:
  • packet_data_filepaths (list[str | PathType]) – The list of file paths to the raw packet data

  • apid (int) – Application Process Identifier to filter for. Default is 11 for JPSS geolocation packets.

Returns:

packet_data – The configured packet data as a pandas DataFrame.

Return type:

pd.DataFrame