libera_utils.io.netcdf.DataProductConfig

class libera_utils.io.netcdf.DataProductConfig(*, data_product_id: DataProductIdentifier, static_project_metadata: StaticProjectMetadata = <factory>, version: str, variable_configuration_path: Path | None = None, variables: dict[str, LiberaVariable] | None = None, product_metadata: ProductMetadata | None = None, data_product_dataset: Dataset | None = None, data_start_time: datetime | None = None, data_end_time: datetime | None = None)

Bases: BaseModel

Pydantic model for a Libera data product configuration.

data_product_id

The identifier for the data product, which is used to generate the filename.

Type:

DataProductIdentifier

static_project_metadata

The static metadata associated with the Libera project, loaded automatically.

Type:

StaticProjectMetadata

version

The version number of the data product in X.Y.Z format, where X is the major version, Y is the minor version, and Z is the patch version.

Type:

str

variable_configuration_path

The path to the variable configuration file, which can be used to load variable metadata.

Type:

Path | None

variables

A dictionary of variable names and their corresponding LiberaVariable objects, which contain metadata and data.

Type:

dict[str, LiberaVariable] | None

product_metadata

The metadata associated with the data product, including dynamic metadata and spatio-temporal metadata.

Type:

ProductMetadata | None

Notes

This is the primary object used to configure and write properly formatted NetCDF4 files that can be archived with the Libera SDC. It includes methods for loading variable metadata, validating the configuration, and writing the data product to a file.
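
As a rough illustration of the workflow described in these notes, the sketch below builds a configuration from a product config file and writes it to a folder. The helper name build_and_write_product and its arguments are hypothetical; the two library calls follow the from_data_config_file and write signatures documented further down this page, and the import is deferred so the sketch loads even where libera_utils is not installed.

```python
def build_and_write_product(product_config_path, data_arrays, output_dir):
    """Build a DataProductConfig from a product config file and write it out.

    Hypothetical helper: the import is deferred to call time so the sketch
    can be loaded without libera_utils installed.
    """
    from libera_utils.io.netcdf import DataProductConfig

    # data_arrays: one numpy array or xarray DataArray per variable defined
    # in the configuration file.
    config = DataProductConfig.from_data_config_file(
        product_config_path, data=data_arrays
    )

    # write() returns the path of the file it wrote; with the default
    # allow_incomplete=False it raises ValueError for incomplete variables.
    return config.write(str(output_dir))
```

Calling the helper requires a real product configuration file and matching data, so no output is shown here.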

Attributes:
model_extra

Get extra fields set during validation.

model_fields_set

Returns the set of fields that have been explicitly set on this model instance.

variable_encoding_dict

Create the needed variable encodings for writing data from the variable metadata

Methods

add_data_to_variable(variable_name, ...)

Adds the actual data to an existing LiberaVariable

add_variable_metadata_from_file(...)

A wrapper around the load_data_product_variables_with_metadata method.

copy(*[, include, exclude, update, deep])

Returns a copy of the model.

enforce_version_format(version_string)

Enforces the proper formatting of the version string as M.m.p.

ensure_data_product_id(raw_data_product_id)

Converts raw data product id string to DataProductIdentifier class if necessary.

from_data_config_file(product_config_filepath)

Primary means of making a data product config all at once

get_static_project_metadata([file_path])

Loads the static project metadata field of the object from a file

load_data_product_variables_with_metadata(...)

Creates properly formed LiberaVariable objects from a config file.

load_variables_from_config()

If the model is instantiated with a variable configuration path, populates the variables from that file.

model_construct([_fields_set])

Creates a new instance of the Model class with validated data.

model_copy(*[, update, deep])

Returns a copy of the model.

model_dump(*[, mode, include, exclude, ...])

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

model_dump_json(*[, indent, include, ...])

Generates a JSON representation of the model using Pydantic's to_json method.

model_json_schema([by_alias, ref_template, ...])

Generates a JSON schema for a model class.

model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

model_post_init(context, /)

Override this method to perform additional initialization after __init__ and model_construct.

model_rebuild(*[, force, raise_errors, ...])

Try to rebuild the pydantic-core schema for the model.

model_validate(obj, *[, strict, ...])

Validate a pydantic model instance.

model_validate_json(json_data, *[, strict, ...])

Validate the given JSON data against the Pydantic model.

model_validate_strings(obj, *[, strict, ...])

Validate the given object with string data against the Pydantic model.

use_variable_configuration(...)

Optional validator method that allows the user to specify a path to the variable configuration file.

write(folder_location[, allow_incomplete, ...])

The primary writing method for the Libera Data Products

construct

dict

from_orm

json

parse_file

parse_obj

parse_raw

schema

schema_json

update_forward_refs

validate

__init__(**data: Any) → None

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Attributes

model_computed_fields

model_config

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_extra

Get extra fields set during validation.

model_fields

model_fields_set

Returns the set of fields that have been explicitly set on this model instance.

variable_encoding_dict

Create the needed variable encodings for writing data from the variable metadata

data_product_id

static_project_metadata

version

variable_configuration_path

variables

product_metadata

data_product_dataset

data_start_time

data_end_time

_check_for_complete_variables()

An internal method checking if all defined variables have data and metadata

Notes

This method iterates through all the variables in the data product and checks if each variable has both data and metadata associated with it. If any variable is missing either, it returns False.

Returns:

True if all variables have both data and metadata, False otherwise.

Return type:

bool
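
The completeness check described above can be sketched as follows. SketchVariable is a hypothetical stand-in for LiberaVariable, which is not reproduced here; only the "both data and metadata present" rule comes from the description.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class SketchVariable:
    """Hypothetical stand-in for LiberaVariable: metadata plus optional data."""

    metadata: Optional[Dict[str, Any]] = None
    data: Any = None


def all_variables_complete(variables: Dict[str, SketchVariable]) -> bool:
    """Return True only if every variable has both data and metadata."""
    return all(
        var.data is not None and var.metadata is not None
        for var in variables.values()
    )


complete = {"radiance": SketchVariable({"units": "W m-2 sr-1"}, [1.0, 2.0])}
incomplete = {"radiance": SketchVariable({"units": "W m-2 sr-1"}, None)}
```

Here all_variables_complete(complete) is True, while all_variables_complete(incomplete) is False because the variable has metadata but no data.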

_format_version_for_filename()

Internal method for ensuring version string is proper for file output

Returns:

A formatted version string suitable for use in filenames, with dots replaced by dashes and prefixed with “V”.

Return type:

str

Notes

This method replaces the dots in the version string with dashes and prepends a “V” to it.
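
A minimal sketch of the transformation described in the note, written as a free function (the function name is an assumption; the real method reads the version from the model instance):

```python
def format_version_for_filename(version: str) -> str:
    """Replace dots with dashes and prepend "V", as described in the note."""
    return "V" + version.replace(".", "-")


formatted = format_version_for_filename("1.2.3")
```

With this sketch, the version string "1.2.3" becomes "V1-2-3".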

_generate_data_product_filename(utc_start_time: datetime, utc_end_time: datetime, revision: datetime | None = None) → LiberaDataProductFilename

Generate a valid data product filename using the Filenaming methods

Parameters:
  • utc_start_time (datetime) – The start time of the data product, used to generate the filename.

  • utc_end_time (datetime) – The end time of the data product, used to generate the filename.

  • revision (datetime | None) – The revision date of the data product, used to generate the filename. If None, the current UTC time is used.

Returns:

An instance of LiberaDataProductFilename representing the generated filename for the data product.

Return type:

LiberaDataProductFilename

Notes

This method generates a filename for the data product based on its ID, version, start and end times, and revision date. It uses the LiberaDataProductFilename class to create the filename.

_generate_internal_dataset(allow_incomplete: bool = False)

An internal method to create the data product dataset from variables data and metadata

Parameters:

allow_incomplete (bool) – If True, allows variables without data to be skipped. If False, raises an error if any variable is missing data or metadata.

Raises:

ValueError – If any variable is missing data or metadata and allow_incomplete is False.

Notes

This method creates an xarray Dataset from the variables defined in the data product. It checks that each variable has both data and metadata associated with it. The static project metadata is added as attributes to the dataset.

_set_data_start_end_time() → None

An internal method that sets the start and end times of the data in this product

add_data_to_variable(variable_name: str, variable_data: ndarray | DataArray)

Adds the actual data to an existing LiberaVariable

Parameters:
  • variable_name (str) – The name of the variable to which the data will be added.

  • variable_data (np.ndarray | DataArray) – The data to be added to the variable. It can be a numpy ndarray or an xarray DataArray.

Raises:
  • KeyError – If the variable name does not exist in the configuration.

  • TypeError – If the variable data is not of type np.ndarray or DataArray.

  • ValueError – If the variable data does not match the expected dimensions defined in the variable’s metadata.

Notes

This method takes the name of a variable and the data to be added to that variable. It checks if the variable exists in the configuration and then sets the data for that variable.

add_variable_metadata_from_file(variable_config_file_path)

A wrapper around the load_data_product_variables_with_metadata method.

Parameters:

variable_config_file_path (str | Path) – The path to the configuration file containing variable metadata.

Raises:

ValueError – If the provided file path does not point to a valid JSON or YAML file.

Notes

This allows the model to be validated after the variables have been added.

classmethod enforce_version_format(version_string: str)

Enforces the proper formatting of the version string as M.m.p.

Parameters:

version_string (str) – The version string to be validated, expected to be in the format M.m.p.

Returns:

The validated version string in the format M.m.p.

Return type:

str

Raises:

ValueError – If the version string does not match the expected format M.m.p, where M, m, and p are integers.

Notes

This method checks if the version string is formatted as M.m.p, where M, m, and p are integers. If the version string does not match this format, it raises a ValueError.
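
One way such a check could be written is with a regular expression. This is a sketch under the assumption that M, m, and p are plain non-negative decimal integers; the pattern the library actually uses is not shown on this page.

```python
import re

# Assumed pattern: three dot-separated groups of decimal digits.
_VERSION_PATTERN = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def check_version_format(version_string: str) -> str:
    """Return the version string unchanged if it looks like M.m.p."""
    if not _VERSION_PATTERN.fullmatch(version_string):
        raise ValueError(
            f"Version {version_string!r} is not in the format M.m.p"
        )
    return version_string
```

Under these assumptions, check_version_format("1.0.3") returns the string unchanged, while inputs such as "1.0" or "v1.0.3" raise ValueError.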

classmethod ensure_data_product_id(raw_data_product_id: str | DataProductIdentifier) → DataProductIdentifier

Converts raw data product id string to DataProductIdentifier class if necessary.

Parameters:

raw_data_product_id (str | DataProductIdentifier) – The raw data product ID, which can be a string or an instance of DataProductIdentifier.

Returns:

An instance of DataProductIdentifier representing the data product ID.

Return type:

DataProductIdentifier

Notes

This method checks if the provided data product ID is already an instance of DataProductIdentifier. If it is, it returns it as is. If it is a string, it converts it to a DataProductIdentifier instance.
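
The pass-through-or-convert pattern in this note can be illustrated with a small enum. SketchProductId and its members are hypothetical, and treating the identifier as an Enum whose values are strings is an assumption; the real DataProductIdentifier members are not listed on this page.

```python
from enum import Enum
from typing import Union


class SketchProductId(Enum):
    """Hypothetical identifiers standing in for DataProductIdentifier."""

    EXAMPLE_A = "EXAMPLE-A"
    EXAMPLE_B = "EXAMPLE-B"


def ensure_product_id(raw: Union[str, SketchProductId]) -> SketchProductId:
    """Return an identifier instance, converting from a string if needed."""
    if isinstance(raw, SketchProductId):
        return raw
    # Enum lookup by value raises ValueError for unknown strings.
    return SketchProductId(raw)
```

In this sketch, an existing SketchProductId is returned as is, and a known string such as "EXAMPLE-A" is converted to the matching member.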

classmethod from_data_config_file(product_config_filepath: str | AnyPath, data: list[DataArray] | list[ndarray] | None = None)

Primary means of making a data product config all at once

Parameters:
  • product_config_filepath (str | AnyPath) – The path to the configuration file containing the product metadata and variable definitions.

  • data (list[DataArray] | list[np.ndarray] | None) – Optional data to be associated with the variables. If provided, it should be a list of DataArray or numpy ndarray objects, where each entry corresponds to a variable defined in the configuration file.

Returns:

An instance of DataProductConfig with the loaded data product ID, version, and variables.

Return type:

DataProductConfig

Raises:

ValueError – If the data is provided as a list but contains an empty entry, or if the configuration file does not contain the expected structure.

Notes

This method reads a configuration file (in YAML format) that contains the static product metadata, version, and variable metadata. It then creates an instance of DataProductConfig with the loaded data.

classmethod get_static_project_metadata(file_path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/lasp-libera-sdc-libera-utils/checkouts/latest/libera_utils/data/static_project_metadata.yml'))

Loads the static project metadata field of the object from a file

Parameters:

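file_path (Path) – The path to the YAML file containing the static project metadata. Defaults to the static_project_metadata.yml file packaged under libera_utils/data.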

Returns:

An instance of StaticProjectMetadata containing the static metadata for the Libera project.

Return type:

StaticProjectMetadata

classmethod load_data_product_variables_with_metadata(file_path: str | Path)

Creates properly formed LiberaVariable objects from a config file.

Parameters:

file_path (str | Path) – The path to the configuration file containing variable metadata.

Returns:

A dictionary where the keys are variable names and the values are LiberaVariable objects

Return type:

dict

Notes

This method is used as part of a validator if a file path is passed in to construct the DataProductConfig object. It reads a JSON or YAML file containing variable metadata and returns a dictionary of LiberaVariable objects with their metadata.

load_variables_from_config()

If the model is instantiated with a variable configuration path, populates the variables from that file.

Returns:

The instance of DataProductConfig with the variables loaded from the configuration file, if applicable.

Return type:

DataProductConfig

Notes

This method is called after the model is validated. It checks if the variable_configuration_path is not None and if the variables are None. If so, it calls the add_variable_metadata_from_file method to load the variable metadata from the specified file. This allows the model to be validated after the variables have been added.
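
The condition in this note (a path is set and no variables exist yet) can be sketched as a small function. The names maybe_load_variables and loader are hypothetical; loader stands in for whatever reads the variable metadata file.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def maybe_load_variables(
    config_path: Optional[Path],
    variables: Optional[Dict[str, object]],
    loader: Callable[[Path], Dict[str, object]],
) -> Optional[Dict[str, object]]:
    """Load variables from config_path only when none are defined yet."""
    if config_path is not None and variables is None:
        return loader(config_path)
    # Either there is no path, or variables were supplied directly.
    return variables


def fake_loader(path: Path) -> Dict[str, object]:
    """Stand-in loader that records which file it was asked to read."""
    return {"loaded_from": path.name}
```

With these stand-ins, variables are loaded only when a path is given and no variables were supplied; otherwise the existing value is returned untouched.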

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

property model_extra: dict[str, Any] | None

Get extra fields set during validation.

Returns:

A dictionary of extra fields, or None if config.extra is not set to “allow”.

property model_fields_set: set[str]

Returns the set of fields that have been explicitly set on this model instance.

Returns:

A set of strings representing the fields that have been set, i.e. that were not filled from defaults.

classmethod use_variable_configuration(variable_configuration_path: str | Path)

Optional validator method that allows the user to specify a path to the variable configuration file.

Parameters:

variable_configuration_path (str | Path | None) – The path to the variable configuration file. It can be a string, a Path object, or None.

Returns:

A Path object representing the variable configuration path, or None if no path is provided.

Return type:

Path | None

Notes

This method checks if the provided variable configuration path is None or a string. If it is a string, it converts it to a Path object. If the path is None, it returns None.
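
A minimal sketch of the normalization described in this note, using only pathlib (the function name is an assumption):

```python
from pathlib import Path
from typing import Optional, Union


def normalize_config_path(path: Union[str, Path, None]) -> Optional[Path]:
    """Convert a string to a Path; pass None and Path objects through."""
    if path is None:
        return None
    # Path() accepts both str and Path, so one call covers both cases.
    return Path(path)
```

In this sketch a string such as "config/variables.yml" becomes a Path object, and None is returned unchanged.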

property variable_encoding_dict: dict

Create the needed variable encodings for writing data from the variable metadata

Notes

This property returns a dictionary where the keys are variable names and the values are the encoding dictionaries for each variable. If the variables are None, it returns None.

Returns:

A dictionary of variable encodings, where each key is a variable name and the value is its encoding. If a variable has no data, then no encoding is defined for it. If no variables are defined, returns None.

Return type:

dict | None
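
The shape of the returned mapping can be sketched with plain dictionaries standing in for LiberaVariable objects. The "data" and "encoding" keys and the variable names are hypothetical; the rules (None when no variables, no entry for variables without data) follow the description above.

```python
from typing import Any, Dict, Optional


def build_encoding_dict(
    variables: Optional[Dict[str, Dict[str, Any]]],
) -> Optional[Dict[str, Dict[str, Any]]]:
    """Collect per-variable encodings, skipping variables without data."""
    if variables is None:
        return None
    return {
        name: dict(var["encoding"])
        for name, var in variables.items()
        if var.get("data") is not None
    }


example = {
    "radiance": {"data": [1.0, 2.0], "encoding": {"zlib": True, "complevel": 4}},
    "quality_flag": {"data": None, "encoding": {"dtype": "int8"}},
}
encodings = build_encoding_dict(example)
```

Here encodings contains an entry only for "radiance", since "quality_flag" has no data.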

write(folder_location: str | AnyPath, allow_incomplete: bool = False, start_time: datetime = None, end_time: datetime = None) → AnyPath

The primary writing method for the Libera Data Products

Parameters:
  • folder_location (str | AnyPath) – The location where the data product file will be written. It can be a string or an AnyPath object.

  • allow_incomplete (bool) – If True, allows the writing of the data product even if some variables are incomplete (i.e., missing data or metadata). If False, raises an error if any variable is incomplete.

  • start_time (datetime | None) – The start time of the data product. If not provided, it will be set to the earliest time in the data.

  • end_time (datetime | None) – The end time of the data product. If not provided, it will be set to the latest time in the data.

Returns:

The path to the written data product file.

Return type:

AnyPath

Raises:
  • ValueError – If not all variables have metadata or data, and allow_incomplete is False. It will also raise an error if the data product dataset is None and no internal dataset can be generated.

  • ValueError – If start_time is provided without end_time, or vice versa.
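
The pairing rule for start_time and end_time can be sketched as a standalone check. The function name and error message are assumptions; only the "both or neither" rule comes from the description above.

```python
from datetime import datetime, timezone
from typing import Optional


def check_time_bounds(
    start_time: Optional[datetime], end_time: Optional[datetime]
) -> None:
    """Raise ValueError when exactly one of the two bounds is provided."""
    if (start_time is None) != (end_time is None):
        raise ValueError(
            "start_time and end_time must be provided together or not at all"
        )


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
end = datetime(2025, 1, 2, tzinfo=timezone.utc)
check_time_bounds(start, end)   # both given: accepted
check_time_bounds(None, None)   # neither given: accepted
```

Passing only one of the two values, for example check_time_bounds(start, None), raises ValueError in this sketch.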