libera_utils.io.product_definition.LiberaDataProductDefinition#
- class libera_utils.io.product_definition.LiberaDataProductDefinition(*, coordinates: dict[str, LiberaVariableDefinition], variables: dict[str, LiberaVariableDefinition], attributes: dict[str, Any])#
Bases: BaseModel
Pydantic model for a Libera data product definition.
Used for validating existing data product Datasets with helper methods for creating valid Datasets and DataArrays.
- data_variables#
A dictionary of variable names and their corresponding LiberaVariable objects, which contain metadata and data.
- product_metadata#
The metadata associated with the data product, including dynamic metadata and spatio-temporal metadata.
- Type:
ProductMetadata | None
- Attributes:
dynamic_attributes: Return product-level attributes that are dynamically defined (null values) in the data product definition.
model_extra: Get extra fields set during validation.
model_fields_set: Returns the set of fields that have been explicitly set on this model instance.
static_attributes: Return product-level attributes that are statically defined (have values) in the data product definition.
Methods
check_dataset_conformance(dataset[, strict]): Check the conformance of a Dataset object against a data product definition.
copy(*[, include, exclude, update, deep]): Returns a copy of the model.
create_product_dataset(data[, ...]): Create a product Dataset from numpy arrays.
enforce_dataset_conformance(dataset): Analyze and update a Dataset to conform to the expectations of the DataProductDefinition.
from_yaml(product_definition_filepath): Create a DataProductDefinition from a Libera data product definition YAML file.
generate_data_product_filename(dataset, ...): Generate a standardized Libera data product filename.
model_construct([_fields_set]): Creates a new instance of the Model class with validated data.
model_copy(*[, update, deep]): Returns a copy of the model.
model_dump(*[, mode, include, exclude, ...]): Generate a dictionary representation of the model.
model_dump_json(*[, indent, ensure_ascii, ...]): Generate a JSON representation of the model.
model_json_schema(by_alias, ref_template, ...): Generates a JSON schema for a model class.
model_parametrized_name(params): Compute the class name for parametrizations of generic classes.
model_post_init(context, /): Override this method to perform additional initialization after __init__ and model_construct.
model_rebuild(*[, force, raise_errors, ...]): Try to rebuild the pydantic-core schema for the model.
model_validate(obj, *[, strict, extra, ...]): Validate a pydantic model instance.
model_validate_json(json_data, *[, strict, ...]): Validate the given JSON data against the Pydantic model.
model_validate_strings(obj, *[, strict, ...]): Validate the given object with string data against the Pydantic model.
Deprecated Pydantic v1 compatibility methods inherited from BaseModel: construct, dict, from_orm, json, parse_file, parse_obj, parse_raw, schema, schema_json, update_forward_refs, validate
- __init__(**data: Any) None#
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError (pydantic_core.ValidationError) if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- _check_dataset_attrs(dataset_attrs: dict[str, Any]) list[str]#
Validate the product-level attributes of a Dataset against the product definition.
Static attributes must match exactly. Some special attributes have their values checked for validity.
- static _get_static_project_attributes(file_path=None) dict[str, Any]#
Loads project-wide consistent product-level attribute metadata from a YAML file.
These global attributes are expected on every Libera data product so we store them in a global config.
- classmethod _set_attributes(raw_attributes: dict[str, Any]) dict[str, Any]#
Validate product-level attributes and add requirements for globally consistent attributes.
Any attribute defined with a null value is treated as a required dynamic attribute that must be set either by the user's data product definition or dynamically on the Dataset before writing.
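The null-as-dynamic convention can be illustrated with a standalone sketch. This is not the library's implementation; the function name `resolve_attributes` is invented for illustration:

```python
def resolve_attributes(defined: dict, dynamic: dict) -> dict:
    """Merge statically defined attributes with dynamically supplied ones.

    Attributes defined as None (null in YAML) are treated as required
    dynamic attributes and must be supplied externally.
    """
    resolved = {}
    for name, value in defined.items():
        if value is None:  # dynamic attribute: must come from the caller
            if name not in dynamic:
                raise ValueError(f"missing required dynamic attribute: {name!r}")
            resolved[name] = dynamic[name]
        else:  # static attribute: the definition's value takes precedence
            resolved[name] = value
    return resolved
```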
- check_dataset_conformance(dataset: Dataset, strict: bool = True) list[str]#
Check the conformance of a Dataset object against a data product definition.
This method is responsible only for finding errors, not fixing them. It emits a warning for each violation and logs the full list of errors at the end. If strict is True, it raises an exception when any errors are found; if strict is False, it simply returns the list of error messages.
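The strict/non-strict contract can be sketched with a standalone helper (hypothetical function, not the library's code):

```python
import warnings


def report_conformance_errors(errors: list[str], strict: bool = True) -> list[str]:
    """Warn on each violation; raise in strict mode, otherwise return the list."""
    for message in errors:
        warnings.warn(message)
    if strict and errors:
        raise ValueError(f"{len(errors)} conformance error(s) found")
    return errors
```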
- create_product_dataset(data: dict[str, ndarray], dynamic_product_attributes: dict[str, Any] | None = None, dynamic_variable_attributes: dict[str, dict[str, Any]] | None = None) Dataset#
Create a product Dataset from numpy arrays.
This method creates a Dataset from numpy arrays, setting attributes and encodings according to the product definition. This does not guarantee a fully conformant Dataset. To bring the Dataset into conformance, use enforce_dataset_conformance on the resulting Dataset and check the result with check_dataset_conformance.
- Parameters:
data (dict[str, np.ndarray]) – Dictionary of variable/coordinate data keyed by variable/coordinate name.
dynamic_product_attributes (dict[str, Any] | None) – Product-level attributes for the data product. Algorithm developers should not normally need this kwarg. It allows the user to specify product-level attributes that are required but not statically specified in the product definition (e.g. the algorithm version used to generate the product).
dynamic_variable_attributes (dict[str, dict[str, Any]] | None) – Per-variable attributes for each variable's DataArray, keyed by variable name with an attributes dict as value. Algorithm developers should not normally need this kwarg. It allows the user to specify variable-level attributes that are required but not statically defined in the product definition.
- Returns:
The created Dataset. This Dataset is not guaranteed to be conformant and should be checked with check_dataset_conformance.
- Return type:
Dataset
Notes
No distinction is made between coordinate and data-variable input; which is which is determined from the coordinate and variable sections of the product definition.
This method is not responsible for primary validation or error reporting. The caller is responsible for checking the result with check_dataset_conformance and fixing any errors that arise.
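The coordinate/variable disambiguation described in the notes can be sketched as follows. This is a standalone illustration, not the library's code; the real method reads the coordinate and variable sections of the product definition:

```python
def split_by_definition(data: dict, coordinate_names: set, variable_names: set):
    """Partition input arrays into coordinates and data variables.

    Membership in the definition's coordinate/variable sections, not
    anything about the arrays themselves, decides which is which.
    """
    coords = {k: v for k, v in data.items() if k in coordinate_names}
    variables = {k: v for k, v in data.items() if k in variable_names}
    unknown = set(data) - coordinate_names - variable_names
    if unknown:
        raise KeyError(f"names not in product definition: {sorted(unknown)}")
    return coords, variables
```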
- property dynamic_attributes#
Return product-level attributes that are dynamically defined (null values) in the data product definition.
These attributes are _required_ but are expected to be defined externally to the data product definition.
- enforce_dataset_conformance(dataset: Dataset) Dataset#
Analyze and update a Dataset to conform to the expectations of the DataProductDefinition
This method attempts to bring a Dataset into conformance with a product definition, including enforcing conformance of variable DataArrays. When making changes, the data product definition takes precedence over any existing metadata or settings on the Dataset. Logs are emitted for all changes made. When the Dataset configuration contradicts the data product definition, warnings are also issued. This method is not responsible for validating the final result and does not guarantee that the resulting Dataset will pass the validation checks because some problems simply can’t be fixed.
- Parameters:
dataset (Dataset) – Possibly non-compliant dataset
- Returns:
The updated Dataset. This Dataset is not guaranteed to be fully conformant and should be checked with check_dataset_conformance to verify.
- Return type:
Dataset
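The recommended enforce-then-verify workflow can be wrapped in a small helper. The helper itself is a hypothetical convenience function, but the two method names it calls are the documented ones:

```python
def make_conformant(definition, dataset):
    """Enforce conformance first, then re-check and return remaining errors.

    Returns the updated dataset and the list of errors that could not be
    fixed automatically (empty if the dataset is fully conformant).
    """
    fixed = definition.enforce_dataset_conformance(dataset)
    errors = definition.check_dataset_conformance(fixed, strict=False)
    return fixed, errors
```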
- classmethod from_yaml(product_definition_filepath: str | CloudPath | Path)#
Create a DataProductDefinition from a Libera data product definition YAML file.
- Parameters:
product_definition_filepath (str | PathType) – Path to YAML file with product and variable definitions
- Returns:
Configured instance with loaded metadata and optional data
- Return type:
DataProductDefinition
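A product-definition file might look like the sketch below. All key names shown here are assumptions for illustration; consult the actual Libera product-definition schema for the real structure:

```yaml
# Hypothetical sketch; the actual schema is defined by libera_utils.
coordinates:
  time:
    attributes:
      units: seconds since 2000-01-01T00:00:00
variables:
  radiance:
    attributes:
      units: W m-2 sr-1
attributes:
  project: Libera          # static attribute: value fixed by the definition
  algorithm_version: null  # dynamic attribute: must be set before writing
```

Such a file would then be loaded with `LiberaDataProductDefinition.from_yaml(path)`.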
- generate_data_product_filename(dataset: Dataset, time_variable: str) LiberaDataProductFilename#
Generate a standardized Libera data product filename.
- Parameters:
dataset (Dataset) – The Dataset for which to create a filename. Used to extract algorithm version and start and end times.
time_variable (str) – Name of the time dimension to use for determining the start and end time.
- Returns:
Properly formatted filename object
- Return type:
LiberaDataProductFilename
- model_config = {'frozen': True}#
Configuration for the model; should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).
- property model_extra: dict[str, Any] | None#
Get extra fields set during validation.
- Returns:
A dictionary of extra fields, or None if config.extra is not set to “allow”.
- property model_fields_set: set[str]#
Returns the set of fields that have been explicitly set on this model instance.
- Returns:
A set of strings representing the fields that have been set, i.e. that were not filled from defaults.
- property static_attributes#
Return product-level attributes that are statically defined (have values) in the data product definition.