libera_utils.io.product_definition.LiberaDataProductDefinition

class libera_utils.io.product_definition.LiberaDataProductDefinition(*, coordinates: dict[str, LiberaVariableDefinition], variables: dict[str, LiberaVariableDefinition], attributes: dict[str, Any])

Bases: BaseModel

Pydantic model for a Libera data product definition.

Used for validating existing data product Datasets with helper methods for creating valid Datasets and DataArrays.

data_variables

A dictionary of variable names and their corresponding LiberaVariable objects, which contain metadata and data.

Type:

dict[str, LiberaVariable]

product_metadata

The metadata associated with the data product, including dynamic metadata and spatio-temporal metadata.

Type:

ProductMetadata | None

Attributes:
dynamic_attributes

Return product-level attributes that are dynamically defined (null values) in the data product definition

model_extra

Get extra fields set during validation.

model_fields_set

Returns the set of fields that have been explicitly set on this model instance.

static_attributes

Return product-level attributes that are statically defined (have values) in the data product definition

Methods

check_dataset_conformance(dataset[, strict])

Check the conformance of a Dataset object against a DataProductDefinition

copy(*[, include, exclude, update, deep])

Returns a copy of the model.

create_conforming_dataset(data[, ...])

Create a Dataset from numpy arrays that is valid against the data product definition

enforce_dataset_conformance(dataset)

Analyze and update a Dataset to conform to the expectations of the DataProductDefinition

from_yaml(product_definition_filepath)

Create a DataProductDefinition from a Libera data product definition YAML file.

generate_data_product_filename(utc_start, ...)

Generate a standardized Libera data product filename.

model_construct([_fields_set])

Creates a new instance of the Model class with validated data.

model_copy(*[, update, deep])

Returns a copy of the model.

model_dump(*[, mode, include, exclude, ...])

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

model_dump_json(*[, indent, ensure_ascii, ...])

Generates a JSON representation of the model using Pydantic's to_json method.

model_json_schema([by_alias, ref_template, ...])

Generates a JSON schema for a model class.

model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

model_post_init(context, /)

Override this method to perform additional initialization after __init__ and model_construct.

model_rebuild(*[, force, raise_errors, ...])

Try to rebuild the pydantic-core schema for the model.

model_validate(obj, *[, strict, extra, ...])

Validate a pydantic model instance.

model_validate_json(json_data, *[, strict, ...])

Validate the given JSON data against the Pydantic model.

model_validate_strings(obj, *[, strict, ...])

Validate the given object with string data against the Pydantic model.

construct

dict

from_orm

json

parse_file

parse_obj

parse_raw

schema

schema_json

update_forward_refs

validate

__init__(**data: Any) → None

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Attributes

dynamic_attributes

Return product-level attributes that are dynamically defined (null values) in the data product definition

model_computed_fields

model_config

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_extra

Get extra fields set during validation.

model_fields

model_fields_set

Returns the set of fields that have been explicitly set on this model instance.

static_attributes

Return product-level attributes that are statically defined (have values) in the data product definition

coordinates

variables

attributes

_check_dataset_attrs(dataset_attrs: dict[str, Any]) → list[str]

Validate the product level attributes of a Dataset against the product definition

Static attributes must match exactly. Some special attributes have their values checked for validity.

Parameters:

dataset_attrs (dict[str, Any]) – Dataset attributes to validate

Returns:

List of error messages describing problems found. Empty list if no problems.

Return type:

list[str]
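
The rule "static attributes must match exactly" can be pictured with a small stand-in. This is only a sketch under stated assumptions, not the library's implementation: check_attrs, the sample attribute names, and the message wording are hypothetical, and the extra validity checks for special attributes are not modeled.

```python
from typing import Any


def check_attrs(definition_attrs: dict[str, Any], dataset_attrs: dict[str, Any]) -> list[str]:
    """Hypothetical stand-in: compare dataset attributes to a definition."""
    errors: list[str] = []
    for name, expected in definition_attrs.items():
        if name not in dataset_attrs:
            # Every attribute named in the definition is assumed to be required
            errors.append(f"missing attribute: {name}")
        elif expected is not None and dataset_attrs[name] != expected:
            # Static attributes (non-null in the definition) must match exactly
            errors.append(
                f"attribute {name}: expected {expected!r}, got {dataset_attrs[name]!r}"
            )
    return errors


# Hypothetical definition: one static attribute and one dynamic (null) attribute
definition = {"project": "Libera", "algorithm_version": None}

print(check_attrs(definition, {"project": "Libera", "algorithm_version": "1.2.0"}))
print(check_attrs(definition, {"project": "CERES"}))
```

The first call prints an empty list; the second reports one mismatch and one missing attribute, mirroring the "empty list if no problems" convention above.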

static _get_static_project_attributes(file_path=None)

Loads project-wide consistent product-level attribute metadata from a YAML file.

These global attributes are expected on every Libera data product so we store them in a global config.

Parameters:

file_path (Path) – The path to the global attribute metadata YAML file.

Returns:

Dictionary of key-value pairs for static product attributes.

Return type:

dict

classmethod _set_attributes(raw_attributes: dict[str, Any]) → dict[str, Any]

Validates product level attributes and adds requirements for globally consistent attributes

Parameters:

raw_attributes (dict[str, Any]) – The attributes specification in the product definition.

Returns:

The validated attributes dictionary, including standard defaults that we always require.

Return type:

dict[str, Any]

check_dataset_conformance(dataset: Dataset, strict: bool = True) → list[str]

Check the conformance of a Dataset object against a DataProductDefinition

This method is responsible only for finding errors, not fixing them.

Parameters:
  • dataset (Dataset) – Dataset object to validate against expectations in the product configuration

  • strict (bool) – Defaults to True. If True, an exception is raised when the dataset does not conform.

Returns:

List of error messages describing problems found. Empty list if no problems.

Return type:

list[str]
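
How strict interacts with the returned list can be illustrated with a stand-in helper. This is a sketch only: report_problems is hypothetical, and ValueError is an assumption because the exception type is not stated here.

```python
def report_problems(errors: list[str], strict: bool = True) -> list[str]:
    """Hypothetical helper: raise when strict, otherwise hand the messages back."""
    if strict and errors:
        # ValueError is an assumption; the documented exception type is not given
        raise ValueError("; ".join(errors))
    return errors


problems = ["missing attribute: algorithm_version"]

# Non-strict mode: the caller receives the messages and decides what to do
print(report_problems(problems, strict=False))

# Strict mode (the default): any problem raises
try:
    report_problems(problems)
except ValueError as exc:
    print(f"strict mode raised: {exc}")
```

Under this reading, callers who want to inspect the messages themselves would pass strict=False.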

create_conforming_dataset(data: dict[str, ndarray], user_product_attributes: dict[str, Any] | None = None, user_variable_attributes: dict[str, dict[str, Any]] | None = None, strict: bool = True) → tuple[Dataset, list[str]]

Create a Dataset from numpy arrays that is valid against the data product definition

Parameters:
  • data (dict[str, np.ndarray]) – Dictionary of variable/coordinate data keyed by variable/coordinate name.

  • user_product_attributes (dict[str, Any] | None) – Algorithm developers should not need to use this kwarg. Product level attributes for the data product. This allows the user to specify product level attributes that are required but not statically specified in the product definition (e.g. the algorithm version used to generate the product)

  • user_variable_attributes (dict[str, dict[str, Any]] | None) – Algorithm developers should not need to use this kwarg. Per-variable attributes for each variable’s DataArray. Key is variable name, value is an attributes dict. This allows the user to specify variable level attributes that are required but not statically defined in the product definition.

  • strict (bool) – Defaults to True. If True, an exception is raised when the resulting dataset does not conform.

Returns:

Tuple of (Dataset, error_messages) where error_messages contains any validation problems. Empty list if the dataset is fully valid.

Return type:

tuple[Dataset, list[str]]

Notes

  • We make no distinction between coordinate and data variable input data and assume that we can determine which is which based on the coordinate/variable names in the product definition.

  • This method is not responsible for primary validation or error reporting. We call out to check_dataset_conformance at the end for that.
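
The first note, sorting one flat data dictionary into coordinates and data variables by name, can be sketched as follows. The function split_by_definition and the names time, radiance, and extra are hypothetical, and plain lists stand in for numpy arrays so the example needs no third-party imports.

```python
from typing import Any


def split_by_definition(
    data: dict[str, Any],
    coordinate_names: set[str],
    variable_names: set[str],
) -> tuple[dict[str, Any], dict[str, Any], list[str]]:
    """Hypothetical stand-in: route each input array by its name."""
    coords = {k: v for k, v in data.items() if k in coordinate_names}
    variables = {k: v for k, v in data.items() if k in variable_names}
    # Names the definition does not know about are collected for reporting
    unknown = sorted(set(data) - coordinate_names - variable_names)
    return coords, variables, unknown


coords, variables, unknown = split_by_definition(
    {"time": [0, 1, 2], "radiance": [1.5, 1.6, 1.7], "extra": [9]},
    coordinate_names={"time"},
    variable_names={"radiance"},
)
print(coords)
print(variables)
print(unknown)
```

The caller never has to label an array as a coordinate or a variable; the names in the definition carry that information.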

property dynamic_attributes

Return product-level attributes that are dynamically defined (null values) in the data product definition

These attributes are required but are expected to be defined externally to the data product definition
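
The static/dynamic distinction comes down to whether the definition supplies a value or leaves it null. A minimal sketch, assuming the attributes are held in a plain dict; split_attributes and the attribute names are hypothetical, and the actual return type of the properties is not stated here.

```python
from typing import Any


def split_attributes(attributes: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Hypothetical stand-in: separate attributes with values from null ones."""
    static = {k: v for k, v in attributes.items() if v is not None}
    dynamic = {k: v for k, v in attributes.items() if v is None}
    return static, dynamic


# Hypothetical product-level attributes as they might appear after loading YAML
attributes = {"project": "Libera", "processing_level": "L1B", "algorithm_version": None}

static, dynamic = split_attributes(attributes)
print(static)
print(dynamic)
```

Here algorithm_version lands in the dynamic group, so its value would have to be supplied from outside the definition.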

enforce_dataset_conformance(dataset: Dataset) → tuple[Dataset, list[str]]

Analyze and update a Dataset to conform to the expectations of the DataProductDefinition

This method is for modifying an existing xarray Dataset. If you are creating a Dataset from scratch with numpy arrays, consider using create_conforming_dataset instead.

Parameters:

dataset (Dataset) – Possibly non-compliant dataset

Returns:

Tuple of (updated Dataset, error_messages) where error_messages contains any problems that could not be fixed. Empty list if all problems were fixed.

Return type:

tuple[Dataset, list[str]]

Notes

  • This method is responsible for trying (and possibly failing) to coerce a Dataset into a valid form with attributes and encodings. We use check_dataset_conformance to check for validation errors.
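
The "fix what can be fixed, then report the rest" pattern described in this note might look like the sketch below. It is illustrative only: enforce_attrs is hypothetical, it handles plain attribute dicts rather than a Dataset, and overwriting static values is an assumption about what "coerce" means here.

```python
from typing import Any


def enforce_attrs(
    definition_attrs: dict[str, Any], dataset_attrs: dict[str, Any]
) -> tuple[dict[str, Any], list[str]]:
    """Hypothetical stand-in: apply fixable attributes, report unfixable ones."""
    updated = dict(dataset_attrs)  # work on a copy, leave the input untouched
    for name, expected in definition_attrs.items():
        if expected is not None:
            # Static values are known from the definition, so they can be set
            updated[name] = expected
    # Dynamic (null) attributes cannot be invented; report them if absent
    errors = [
        f"missing attribute: {name}"
        for name, expected in definition_attrs.items()
        if expected is None and name not in updated
    ]
    return updated, errors


definition = {"project": "Libera", "algorithm_version": None}
fixed, remaining = enforce_attrs(definition, {"project": "libera"})
print(fixed)
print(remaining)
```

The returned pair mirrors the documented (updated Dataset, error_messages) shape: the static value is corrected, while the missing dynamic attribute remains as an error message.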

classmethod from_yaml(product_definition_filepath: str | CloudPath | Path)

Create a DataProductDefinition from a Libera data product definition YAML file.

Parameters:

product_definition_filepath (str | CloudPath | Path) – Path to YAML file with product and variable definitions

Returns:

Configured instance with loaded metadata and optional data

Return type:

LiberaDataProductDefinition

generate_data_product_filename(utc_start: datetime, utc_end: datetime) → LiberaDataProductFilename

Generate a standardized Libera data product filename.

Parameters:
  • utc_start (datetime) – Start time of data in the file

  • utc_end (datetime) – End time of data in the file

Returns:

Properly formatted filename object

Return type:

LiberaDataProductFilename

model_config: ClassVar[ConfigDict] = {'frozen': True}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

property model_extra: dict[str, Any] | None

Get extra fields set during validation.

Returns:

A dictionary of extra fields, or None if config.extra is not set to “allow”.

property model_fields_set: set[str]

Returns the set of fields that have been explicitly set on this model instance.

Returns:

A set of strings representing the fields that have been set, i.e. that were not filled from defaults.

property static_attributes

Return product-level attributes that are statically defined (have values) in the data product definition