libera_utils.io.product_definition#
Data Product configuration and writing for Libera NetCDF4 data product files
Classes

LiberaDataProductDefinition: Pydantic model for a Libera data product definition.
LiberaVariableDefinition: Pydantic model for a Libera variable definition.
- class libera_utils.io.product_definition.LiberaDataProductDefinition(*, coordinates: dict[str, LiberaVariableDefinition], variables: dict[str, LiberaVariableDefinition], attributes: dict[str, Any])#
Pydantic model for a Libera data product definition.
Used for validating existing data product Datasets with helper methods for creating valid Datasets and DataArrays.
- data_variables#
A dictionary of variable names and their corresponding LiberaVariable objects, which contain metadata and data.
- product_metadata#
The metadata associated with the data product, including dynamic metadata and spatio-temporal metadata.
- Type:
ProductMetadata | None
- Attributes:
dynamic_attributes: Return product-level attributes that are dynamically defined (null values) in the data product definition.
model_extra: Get extra fields set during validation.
model_fields_set: Returns the set of fields that have been explicitly set on this model instance.
static_attributes: Return product-level attributes that are statically defined (have values) in the data product definition.
Methods
check_dataset_conformance(dataset[, strict]): Check the conformance of a Dataset object against a DataProductDefinition.
copy(*[, include, exclude, update, deep]): Returns a copy of the model.
create_conforming_dataset(data[, ...]): Create a Dataset from numpy arrays that is valid against the data product definition.
enforce_dataset_conformance(dataset): Analyze and update a Dataset to conform to the expectations of the DataProductDefinition.
from_yaml(product_definition_filepath): Create a DataProductDefinition from a Libera data product definition YAML file.
generate_data_product_filename(utc_start, ...): Generate a standardized Libera data product filename.
model_construct([_fields_set]): Creates a new instance of the Model class with validated data.
model_copy(*[, update, deep]): Returns a copy of the model.
model_dump(*[, mode, include, exclude, ...]): Generate a dictionary representation of the model.
model_dump_json(*[, indent, ensure_ascii, ...]): Generate a JSON representation of the model.
model_json_schema([by_alias, ref_template, ...]): Generates a JSON schema for a model class.
model_parametrized_name(params): Compute the class name for parametrizations of generic classes.
model_post_init(context, /): Override this method to perform additional initialization after __init__ and model_construct.
model_rebuild(*[, force, raise_errors, ...]): Try to rebuild the pydantic-core schema for the model.
model_validate(obj, *[, strict, extra, ...]): Validate a pydantic model instance.
model_validate_json(json_data, *[, strict, ...]): Validate the given JSON data against the Pydantic model.
model_validate_strings(obj, *[, strict, ...]): Validate the given object with string data against the Pydantic model.
Deprecated Pydantic v1 compatibility methods (use the model_* equivalents instead): construct, dict, from_orm, json, parse_file, parse_obj, parse_raw, schema, schema_json, update_forward_refs, validate.
- _check_dataset_attrs(dataset_attrs: dict[str, Any]) list[str]#
Validate the product level attributes of a Dataset against the product definition
Static attributes must match exactly. Some special attributes have their values checked for validity.
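As a hedged illustration of this kind of check (not the library's actual implementation), a static-attribute validator might collect mismatch messages like the sketch below; the attribute names used here are hypothetical:

```python
from typing import Any

def check_static_attrs(definition_attrs: dict[str, Any],
                       dataset_attrs: dict[str, Any]) -> list[str]:
    """Collect error messages for statically defined attributes that are
    missing or do not match exactly (illustrative sketch only)."""
    errors = []
    for name, expected in definition_attrs.items():
        if expected is None:  # dynamically defined; no exact-match check
            continue
        if name not in dataset_attrs:
            errors.append(f"Missing required attribute '{name}'")
        elif dataset_attrs[name] != expected:
            errors.append(
                f"Attribute '{name}' is {dataset_attrs[name]!r}, expected {expected!r}"
            )
    return errors

definition = {"Conventions": "CF-1.8", "project": "Libera", "version": None}
dataset = {"Conventions": "CF-1.8", "project": "LIBERA"}
print(check_static_attrs(definition, dataset))
```

Returning a list of message strings rather than raising immediately matches the `list[str]` return annotation documented above, so callers can report all problems at once.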
- static _get_static_project_attributes(file_path=None)#
Loads project-wide consistent product-level attribute metadata from a YAML file.
These global attributes are expected on every Libera data product so we store them in a global config.
- Parameters:
file_path (Path) – The path to the global attribute metadata YAML file.
- Returns:
Dictionary of key-value pairs for static product attributes.
- Return type:
dict[str, Any]
- classmethod _set_attributes(raw_attributes: dict[str, Any]) dict[str, Any]#
Validates product level attributes and adds requirements for globally consistent attributes
- check_dataset_conformance(dataset: Dataset, strict: bool = True) list[str]#
Check the conformance of a Dataset object against a DataProductDefinition
This method is responsible only for finding errors, not fixing them.
- create_conforming_dataset(data: dict[str, ndarray], user_product_attributes: dict[str, Any] | None = None, user_variable_attributes: dict[str, dict[str, Any]] | None = None, strict: bool = True) tuple[Dataset, list[str]]#
Create a Dataset from numpy arrays that is valid against the data product definition
- Parameters:
data (dict[str, np.ndarray]) – Dictionary of variable/coordinate data keyed by variable/coordinate name.
user_product_attributes (dict[str, Any] | None) – Product-level attributes for the data product. This allows the user to specify product-level attributes that are required but not statically specified in the product definition (e.g. the algorithm version used to generate the product). Algorithm developers should not normally need this kwarg.
user_variable_attributes (dict[str, dict[str, Any]] | None) – Per-variable attributes for each variable's DataArray. Key is variable name, value is an attributes dict. This allows the user to specify variable-level attributes that are required but not statically defined in the product definition. Algorithm developers should not normally need this kwarg.
strict (bool) – Default True. If True, raise an exception on nonconformance.
- Returns:
Tuple of (Dataset, error_messages) where error_messages contains any validation problems. Empty list if the dataset is fully valid.
- Return type:
tuple[Dataset, list[str]]
Notes
We make no distinction between coordinate and data variable input data and assume that we can determine which is which based on the coordinate/variable names in the product definition.
This method is not responsible for primary validation or error reporting. We call out to check_dataset_conformance at the end for that.
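The note above, that coordinates and data variables are distinguished purely by name lookup in the product definition, can be sketched with plain dicts (the names here are hypothetical stand-ins, not real Libera variables):

```python
def split_inputs(data, coordinate_names, variable_names):
    """Partition input arrays into coordinates and data variables by
    matching names against the product definition (illustrative sketch)."""
    coords, variables, unknown = {}, {}, []
    for name, values in data.items():
        if name in coordinate_names:
            coords[name] = values
        elif name in variable_names:
            variables[name] = values
        else:
            unknown.append(name)  # reported later as a conformance error
    return coords, variables, unknown

coords, variables, unknown = split_inputs(
    {"time": [0, 1], "radiance": [3.2, 3.4], "bogus": [9]},
    coordinate_names={"time"},
    variable_names={"radiance"},
)
print(coords, variables, unknown)
```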
- property dynamic_attributes#
Return product-level attributes that are dynamically defined (null values) in the data product definition
These attributes are _required_ but are expected to be defined externally to the data product definition
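A minimal sketch of the dynamic/static split, assuming (as the wording above suggests) that dynamic attributes are exactly the ones with null values in the definition; the attribute names are hypothetical:

```python
# Attributes as they might appear after loading a product definition.
attributes = {
    "Conventions": "CF-1.8",    # static: value given in the definition
    "history": None,            # dynamic: must be supplied at runtime
    "algorithm_version": None,  # dynamic
}

# Dynamic attributes are required but have no value in the definition.
dynamic = {k for k, v in attributes.items() if v is None}
# Static attributes carry their value directly.
static = {k: v for k, v in attributes.items() if v is not None}
print(sorted(dynamic), static)
```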
- enforce_dataset_conformance(dataset: Dataset) tuple[Dataset, list[str]]#
Analyze and update a Dataset to conform to the expectations of the DataProductDefinition
This method is for modifying an existing xarray Dataset. If you are creating a Dataset from scratch with numpy arrays, consider using create_conforming_dataset instead.
- Parameters:
dataset (Dataset) – Possibly non-compliant dataset
- Returns:
Tuple of (updated Dataset, error_messages) where error_messages contains any problems that could not be fixed. Empty list if all problems were fixed.
- Return type:
tuple[Dataset, list[str]]
Notes
This method is responsible for trying (and possibly failing) to coerce a Dataset into a valid form with attributes and encodings. We use check_dataset_conformance to check for validation errors.
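Conceptually, enforcement fills in what it can and then reports what remains wrong. A rough, hypothetical sketch of that two-phase pattern, using plain attribute dicts rather than real xarray objects:

```python
def enforce(dataset_attrs, definition_attrs):
    """Fix what is fixable (fill in missing static attributes), then report
    what could not be fixed (sketch of the enforce-then-check pattern)."""
    fixed = dict(dataset_attrs)
    for name, expected in definition_attrs.items():
        if expected is not None and name not in fixed:
            fixed[name] = expected  # fixable: copy the static value in
    errors = [
        f"Attribute '{n}' is {fixed[n]!r}, expected {v!r}"
        for n, v in definition_attrs.items()
        if v is not None and fixed.get(n) != v  # unfixable: wrong value present
    ]
    return fixed, errors

fixed, errors = enforce({"project": "Libera"},
                        {"project": "Libera", "Conventions": "CF-1.8"})
print(fixed, errors)
```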
- classmethod from_yaml(product_definition_filepath: str | CloudPath | Path)#
Create a DataProductDefinition from a Libera data product definition YAML file.
- Parameters:
product_definition_filepath (str | CloudPath | Path) – Path to YAML file with product and variable definitions
- Returns:
Configured instance with loaded metadata and optional data
- Return type:
DataProductDefinition
- generate_data_product_filename(utc_start: datetime, utc_end: datetime) LiberaDataProductFilename#
Generate a standardized Libera data product filename.
- Parameters:
utc_start (datetime) – Start time of data in the file
utc_end (datetime) – End time of data in the file
- Returns:
Properly formatted filename object
- Return type:
LiberaDataProductFilename
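The actual naming convention is defined by LiberaDataProductFilename; as a purely hypothetical illustration of embedding a UTC start/end range into a filename:

```python
from datetime import datetime, timezone

def example_filename(product_name: str, utc_start: datetime,
                     utc_end: datetime) -> str:
    """Format a time-ranged product filename. This pattern is made up for
    illustration; the real convention comes from LiberaDataProductFilename."""
    fmt = "%Y%m%dT%H%M%S"
    return f"{product_name}_{utc_start.strftime(fmt)}_{utc_end.strftime(fmt)}.nc"

start = datetime(2025, 1, 1, tzinfo=timezone.utc)
end = datetime(2025, 1, 2, tzinfo=timezone.utc)
print(example_filename("LIBERA_L1B_RAD", start, end))
```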
- model_config: ClassVar[ConfigDict] = {'frozen': True}#
Configuration for the model; a dictionary conforming to pydantic's ConfigDict.
- property static_attributes#
Return product-level attributes that are statically defined (have values) in the data product definition
- class libera_utils.io.product_definition.LiberaVariableDefinition(*, dtype: str, attributes: dict[str, ~typing.Any] = {}, dimensions: list[str] = [], encoding: dict = <factory>)#
Pydantic model for a Libera variable definition.
This model is the same for both data variables and coordinate variables
- attributes#
The attribute metadata for the variable, containing specific key value pairs for CF metadata compliance
- Type:
VariableAttributes
- dimensions#
A list of dimensions that the variable’s data array references. These should be instances of LiberaDimension.
- Type:
list[LiberaDimension]
- encoding#
A dictionary specifying how the variable’s data should be encoded when written to a NetCDF file.
- Type:
dict
- Attributes:
dynamic_attributes: Return attributes for a variable that are dynamically defined (null values) in the data product definition.
model_extra: Get extra fields set during validation.
model_fields_set: Returns the set of fields that have been explicitly set on this model instance.
static_attributes: Return attributes for a variable that are statically defined (have values) in the data product definition.
Methods
check_data_array_conformance(data_array, ...): Validate variable data array based on product definition.
copy(*[, include, exclude, update, deep]): Returns a copy of the model.
create_conforming_data_array(data, variable_name): Create a DataArray for a single variable that is valid against the data product definition.
enforce_data_array_conformance(data_array, ...): Analyze and fix a DataArray to conform to variable specifications in the data product definition.
model_construct([_fields_set]): Creates a new instance of the Model class with validated data.
model_copy(*[, update, deep]): Returns a copy of the model.
model_dump(*[, mode, include, exclude, ...]): Generate a dictionary representation of the model.
model_dump_json(*[, indent, ensure_ascii, ...]): Generate a JSON representation of the model.
model_json_schema([by_alias, ref_template, ...]): Generates a JSON schema for a model class.
model_parametrized_name(params): Compute the class name for parametrizations of generic classes.
model_post_init(context, /): Override this method to perform additional initialization after __init__ and model_construct.
model_rebuild(*[, force, raise_errors, ...]): Try to rebuild the pydantic-core schema for the model.
model_validate(obj, *[, strict, extra, ...]): Validate a pydantic model instance.
model_validate_json(json_data, *[, strict, ...]): Validate the given JSON data against the Pydantic model.
model_validate_strings(obj, *[, strict, ...]): Validate the given object with string data against the Pydantic model.
Deprecated Pydantic v1 compatibility methods (use the model_* equivalents instead): construct, dict, from_orm, json, parse_file, parse_obj, parse_raw, schema, schema_json, update_forward_refs, validate.
- _check_data_array_attributes(data_array_attrs: dict[str, Any], variable_name: str) list[str]#
Validate the variable level attributes of a DataArray against the product definition
Attributes must match exactly
- classmethod _set_encoding(encoding: dict | None)#
Merge configured encoding with required defaults, issuing warnings on conflicts.
- check_data_array_conformance(data_array: DataArray, variable_name: str) list[str]#
Validate variable data array based on product definition.
This does not verify that all required coordinate data exists on the DataArray. Dimensions lacking coordinates are treated as index dimensions. If coordinate data is later added to a Dataset under a dimension of the same name, the dimension will reference that coordinate data.
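A hedged sketch of the kind of checks such a method performs (dtype and dimension names), using plain Python stand-ins rather than a real xarray DataArray:

```python
def check_array(dims: tuple[str, ...], dtype: str,
                expected_dims: list[str], expected_dtype: str) -> list[str]:
    """Compare an array's dims and dtype to the variable definition and
    return human-readable error messages (illustrative sketch only)."""
    errors = []
    if dtype != expected_dtype:
        errors.append(f"dtype {dtype} does not match expected {expected_dtype}")
    if list(dims) != expected_dims:
        errors.append(f"dimensions {list(dims)} do not match expected {expected_dims}")
    return errors

print(check_array(("time",), "float64", ["time"], "float32"))
```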
- create_conforming_data_array(data: ndarray, variable_name: str, user_variable_attributes: dict[str, Any] | None = None) DataArray#
Create a DataArray for a single variable that is valid against the data product definition.
Coordinate data is not required. Dimensions that reference coordinate dimensions are created as index dimensions. If coordinate data is added later (e.g. to a Dataset), these dimensions will reference the coordinates.
- Parameters:
data (np.ndarray) – Data for the variable DataArray.
variable_name (str) – Name of the variable. Used for log messages and warnings.
user_variable_attributes (dict[str, Any] | None) – Variable-level attributes defined by the user. This allows a user to specify dynamic attributes that may be required by the definition but not statically defined in YAML. Algorithm developers should not normally need this kwarg.
- Returns:
A valid DataArray for the specified variable
- Return type:
DataArray
- property dynamic_attributes#
Return attributes for a variable that are dynamically defined (null values) in the data product definition
These attributes are _required_ but are expected to be defined externally to the data product definition
- enforce_data_array_conformance(data_array: DataArray, variable_name: str) tuple[DataArray, list[str]]#
Analyze and fix a DataArray to conform to variable specifications in data product definition
- Parameters:
data_array (DataArray) – The variable data array to analyze and update
variable_name (str) – Name of the variable being enforced (for logging)
- Returns:
Tuple of (updated DataArray, error_messages) where error_messages contains any problems that could not be fixed. Empty list if all problems were fixed.
- Return type:
tuple[DataArray, list[str]]
- model_config: ClassVar[ConfigDict] = {'frozen': True}#
Configuration for the model; a dictionary conforming to pydantic's ConfigDict.
- property static_attributes#
Return attributes for a variable that are statically defined (have values) in the data product definition