libera_utils.io.filenaming
Module for file naming utilities
Functions
| format_semantic_version | Formats a semantic version string X.Y.Z into a filename-compatible string like VX-Y-Z, where X is the major version, Y the minor version, and Z the patch. |
|
Retrieve the current version of an algorithm package and format it for inclusion in a filename. |
Classes
| AbstractDataProductFilename | Abstract base class for data product filenames. |
| AbstractValidFilename | Filename class that ensures validity of a filename based on regex pattern. |
| L0Filename | Filename validation class for L0 Production Data Set (PDS) files from EDOS. |
| LiberaDataProductFilename | Filename validation class for Libera SDC data products. |
| ManifestFilename | Class for naming manifest files. |
- class libera_utils.io.filenaming.AbstractDataProductFilename(*args, **kwargs)
Abstract base class for data product filenames.
This class adds the requirements that are specific to data products: a processing step ID and a data product ID. For example, L0Filename and LiberaDataProductFilename are both subclasses of AbstractDataProductFilename.
- Attributes:
archive_prefix – Property that contains the generated prefix used for archiving, when applicable
data_product_id – Property that contains the DataProductIdentifier for this file type
filename_parts – Property that contains a namespace of filename parts
path – Property containing the file path
Methods
from_file_path(*args, **kwargs) – Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)
from_filename_parts(*args, **kwargs) – Abstract method that must be implemented to provide hinting for required parts
generate_prefixed_path(parent_path) – Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).
regex_match(path) – Parse and validate a given path against the regex defined as a class attribute
- abstract property data_product_id: DataProductIdentifier
Property that contains the DataProductIdentifier for this file type
- class libera_utils.io.filenaming.AbstractValidFilename(*args, **kwargs)
Filename class that ensures validity of a filename based on regex pattern.
Notes
This is an abstract base class that must be inherited by concrete filename classes.
This class internally stores a CloudPath or Path object in the path property (composition).
- Attributes:
archive_prefix – Property that contains the generated prefix used for archiving, when applicable
filename_parts – Property that contains a namespace of filename parts
path – Property containing the file path
Methods
from_file_path(*args, **kwargs) – Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)
from_filename_parts(*args, **kwargs) – Abstract method that must be implemented to provide hinting for required parts
generate_prefixed_path(parent_path) – Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).
regex_match(path) – Parse and validate a given path against the regex defined as a class attribute
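To make the pattern above concrete (a regex held as a class attribute, a regex_match validator, and a namespace of parsed parts), here is a minimal sketch. The class name ExampleFilename, its pattern, and its part names are hypothetical and are not part of libera_utils; the real classes also support CloudPath objects and define abstract methods that this sketch omits.

```python
import re
from pathlib import Path
from types import SimpleNamespace


class ExampleFilename:
    """Hypothetical stand-in for a concrete filename class (not the libera_utils API)."""

    # Assumed pattern: <name>_<YYYYMMDD>.<ext>. Real Libera patterns differ.
    _regex = re.compile(r"^(?P<name>[A-Za-z0-9-]+)_(?P<date>\d{8})\.(?P<extension>[a-z0-9]+)$")

    def __init__(self, path):
        # Composition: the path object is stored on the instance, not inherited from.
        self.path = Path(path)
        self._match = self.regex_match(self.path)

    @classmethod
    def regex_match(cls, path):
        """Validate the basename against the class-level regex; raise on mismatch."""
        match = cls._regex.match(Path(path).name)
        if match is None:
            raise ValueError(f"{path} does not match {cls._regex.pattern}")
        return match

    @property
    def filename_parts(self):
        """Namespace built from the named groups captured by the regex."""
        return SimpleNamespace(**self._match.groupdict())


parts = ExampleFilename("/data/example-product_20250101.nc").filename_parts
print(parts.name, parts.date, parts.extension)
```

Constructing the object with a non-matching name raises ValueError in this sketch; how the library itself reports invalid names is not stated here.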
- abstractmethod classmethod _format_filename_parts(*args: Any, **kwargs: Any)
Format parts into a filename
Note: When this is implemented by concrete classes, *args and **kwargs become specific parameters.
- classmethod _from_filename_parts(*, basepath: str | Path | S3Path | None = None, **parts: Any)
Create instance from filename parts.
The keyword arguments for the parts are named according to the regex for the file type.
- Parameters:
basepath (Union[str, Path, S3Path], Optional) – Allows prepending a basepath or prefix.
parts (Any) – Passed directly to _format_filename_parts. This is a dict of variable kwargs that will differ in each filename class based on the required parts for that particular filename type.
- Return type:
- abstractmethod _parse_filename_parts()
Parse the filename parts into objects from regex matched strings
- Returns:
namespace object containing filename parts as parsed objects
- Return type:
- abstract property archive_prefix: str
Property that contains the generated prefix used for archiving, when applicable
- property filename_parts
Property that contains a namespace of filename parts
- classmethod from_file_path(*args, **kwargs)
Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)
- abstractmethod classmethod from_filename_parts(*args: Any, **kwargs: Any)
Abstract method that must be implemented to provide hinting for required parts
- generate_prefixed_path(parent_path: str | CloudPath | Path) → CloudPath | Path
Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).
- Parameters:
parent_path (Union[str, Path, S3Path]) – Absolute path to the parent directory or S3 bucket prefix. The generated path prefix is appended to the parent path and followed by the file basename.
- Return type:
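The path layout described for generate_prefixed_path can be sketched with plain strings. This is an illustration only: the function below takes the prefix and basename as explicit arguments, returns a str rather than a CloudPath or Path, and raises ValueError for relative local paths, all of which are assumptions rather than documented behavior.

```python
from pathlib import PurePosixPath


def generate_prefixed_path(parent_path: str, archive_prefix: str, file_basename: str) -> str:
    """Join {parent_path}/{prefix_structure}/{file_basename} (illustrative only)."""
    if parent_path.startswith("s3://"):
        # Treat S3 URIs as plain strings here; the real method returns a CloudPath.
        return "/".join([parent_path.rstrip("/"), archive_prefix.strip("/"), file_basename])
    if not parent_path.startswith("/"):
        # Assumed error handling for the "must start with /" requirement.
        raise ValueError("Local parent_path must be absolute (start with /)")
    return str(PurePosixPath(parent_path) / archive_prefix.strip("/") / file_basename)


print(generate_prefixed_path("s3://example-bucket", "input/2025/01/02", "example.json"))
print(generate_prefixed_path("/tmp/archive", "input/2025/01/02", "example.json"))
```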
- class libera_utils.io.filenaming.L0Filename(*args, **kwargs)
Filename validation class for L0 Production Data Set (PDS) files from EDOS.
- Attributes:
archive_prefix – Property that contains the generated prefix for L0 archiving
data_product_id – Property that contains the DataProductIdentifier for this file type
filename_parts – Property that contains a namespace of filename parts
path – Property containing the file path
Methods
from_file_path(*args, **kwargs) – Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)
from_filename_parts(*, id_char, scid, ...[, ...]) – Create instance from filename parts
generate_prefixed_path(parent_path) – Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).
regex_match(path) – Parse and validate a given path against the regex defined as a class attribute
- classmethod _format_filename_parts(*, id_char: str, scid: int, first_apid: int, fill: str, created_time: datetime, numeric_id: int, file_number: int, extension: str, signal: str | None = None)
Construct a path from filename parts
- Parameters:
id_char (str) – Either P (for PDS files, Construction Records) or X (for Delivery Records)
scid (int) – Spacecraft ID
first_apid (int) – First APID in the file
fill (str) – Custom string up to 14 characters long
created_time (datetime.datetime) – Creation time of the file
numeric_id (int) – Data set ID, 0-9, one digit
file_number (int) – File number within the data set. Construction records are always file number zero.
extension (str) – File name extension. Either PDR or PDS
signal (Optional[str], Optional) – Optional signal suffix. When present, it is always ‘.XFR’.
- Returns:
Formatted filename
- Return type:
- _parse_filename_parts()
Parse the filename parts into objects from regex matched strings
- Returns:
namespace object containing filename parts as parsed objects
- Return type:
- property data_product_id: DataProductIdentifier
Property that contains the DataProductIdentifier for this file type
- classmethod from_filename_parts(*, id_char: str, scid: int, first_apid: int, fill: str, created_time: datetime, numeric_id: int, file_number: int, extension: str, signal: str | None = None, basepath: str | Path | S3Path | None = None)
Create instance from filename parts
This method exists primarily to expose type hints to the user for use with the generic _from_filename_parts. The parts are named according to the regex for the file type.
- Parameters:
id_char (str) – Either P (for PDS files, Construction Records) or X (for Delivery Records)
scid (int) – Spacecraft ID
first_apid (int) – First APID in the file
fill (str) – Custom string up to 14 characters long
created_time (datetime.datetime) – Creation time of the file
numeric_id (int) – Data set ID, 0-9, one digit
file_number (int) – File number within the data set. Construction records are always file number zero.
extension (str) – File name extension. Either PDR or PDS
signal (Optional[str]) – Optional signal suffix. When present, it is always ‘.XFR’.
basepath (Optional[Union[str, Path, S3Path]]) – Allows prepending a basepath or prefix.
- Return type:
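The constraints stated in the parameter list above can be expressed as a small pre-check. The helper check_l0_parts is hypothetical and not part of libera_utils; it only mirrors the documented value ranges and says nothing about how the library itself validates or formats these parts.

```python
from typing import Optional


def check_l0_parts(*, id_char: str, fill: str, numeric_id: int,
                   extension: str, signal: Optional[str] = None) -> None:
    """Hypothetical pre-check of the constraints listed in the parameter docs."""
    if id_char not in ("P", "X"):
        raise ValueError("id_char must be 'P' or 'X'")
    if len(fill) > 14:
        raise ValueError("fill must be at most 14 characters")
    if not 0 <= numeric_id <= 9:
        raise ValueError("numeric_id must be a single digit, 0-9")
    if extension not in ("PDR", "PDS"):
        raise ValueError("extension must be 'PDR' or 'PDS'")
    if signal is not None and signal != ".XFR":
        raise ValueError("signal, when given, must be '.XFR'")


# Passes silently: every value is within the documented range.
check_l0_parts(id_char="P", fill="", numeric_id=0, extension="PDS")
```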
- class libera_utils.io.filenaming.LiberaDataProductFilename(*args, **kwargs)
Filename validation class for Libera SDC data products.
- Attributes:
applicable_date – Property that returns the applicable date based on the midpoint of start and end times.
archive_prefix – Property that contains the generated prefix for L1B and L2 archiving
data_product_id – Property that contains the DataProductIdentifier for this file type
filename_parts – Property that contains a namespace of filename parts
path – Property containing the file path
processing_step_id – Property that contains the ProcessingStepIdentifier that generates this file
Methods
from_file_path(*args, **kwargs) – Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)
from_filename_parts(*, product_name, ...[, ...]) – Create instance from filename parts.
generate_prefixed_path(parent_path) – Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).
regex_match(path) – Parse and validate a given path against the regex defined as a class attribute
- classmethod _format_filename_parts(*, data_level: str, product_name: str, version: str, utc_start: datetime, utc_end: datetime, revision: datetime, extension: str)
Construct a path from filename parts
- Parameters:
data_level (str) – L1B or L2
product_name (str) – Libera instrument, cam or rad for L1B and cloud-fraction etc. for L2. May contain anything except for underscores.
version (str) – Software version that the file was created with. Corresponds to the algorithm version as determined by the algorithm software.
utc_start (datetime.datetime) – Start of the time range covered by the file
utc_end (datetime.datetime) – End of the time range covered by the file
revision (datetime.datetime) – Time when the file was created.
extension (str) – File extension (.nc or .h5)
- Returns:
Formatted filename
- Return type:
- _parse_filename_parts()
Parse the filename parts into objects from regex matched strings
- Returns:
namespace object containing filename parts as parsed objects
- Return type:
- property applicable_date: date
Property that returns the applicable date based on the midpoint of start and end times.
Issues a warning if the time range covers more than 24 hours.
- Returns:
The date of the midpoint between utc_start and utc_end
- Return type:
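The midpoint rule for applicable_date can be sketched as a standalone function. This is not the library's implementation: the free-function form, the warning category, and the warning text are assumptions.

```python
import warnings
from datetime import date, datetime, timedelta, timezone


def applicable_date(utc_start: datetime, utc_end: datetime) -> date:
    """Return the date of the midpoint between utc_start and utc_end."""
    span = utc_end - utc_start
    if span > timedelta(hours=24):
        # Assumed warning category and message.
        warnings.warn("Time range covers more than 24 hours", UserWarning)
    return (utc_start + span / 2).date()


start = datetime(2025, 1, 1, 20, 0, tzinfo=timezone.utc)
end = datetime(2025, 1, 2, 8, 0, tzinfo=timezone.utc)
print(applicable_date(start, end))  # midpoint is 2025-01-02 02:00 UTC
```

A file that straddles midnight is therefore assigned to whichever day contains the middle of its time range, not to the day it starts on.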
- property data_product_id: DataProductIdentifier
Property that contains the DataProductIdentifier for this file type
- classmethod from_filename_parts(*, product_name: str | DataProductIdentifier, version: str, utc_start: datetime, utc_end: datetime, data_level: str | DataLevel | None = None, revision: datetime = datetime.datetime(2025, 12, 8, 23, 33, 28, 559294, tzinfo=datetime.timezone.utc), extension: str | None = None, basepath: str | Path | S3Path | None = None)
Create instance from filename parts. The keyword arguments product_name, version, utc_start, and utc_end are required; data_level, revision, extension, and basepath have defaults.
This method exists primarily to expose type hints to the user for use with the generic _from_filename_parts. The parts are named according to the regex for the file type.
- Parameters:
data_level (str | DataLevel | None) – L1B or L2 identifying the level of the data product. Default None will infer the data level from the product name (DataProductIdentifier)
product_name (str | DataProductIdentifier) – Product type, e.g. CF-RAD for L2 or RAD-4CH for L1B. May contain anything except underscores.
version (str) – Software version that the file was created with. Corresponds to the algorithm version as determined by the algorithm software.
utc_start (datetime.datetime) – Start of the time range covered by the file
utc_end (datetime.datetime) – End of the time range covered by the file
revision (datetime.datetime) – Time when the file was created. Defaults to the current time in UTC.
extension (str | None) – File extension. Default None will infer extension based on product_name.
basepath (Optional[Union[str, Path, S3Path]]) – Allows prepending a basepath or prefix.
- Return type:
- property processing_step_id: ProcessingStepIdentifier | None
Property that contains the ProcessingStepIdentifier that generates this file
- class libera_utils.io.filenaming.ManifestFilename(*args, **kwargs)
Class for naming manifest files
- Attributes:
archive_prefix – Manifests are not archived like data products, but for convenience and ease of debugging they are kept in the dropbox bucket, organized by input/output and the day they were made.
filename_parts – Property that contains a namespace of filename parts
path – Property containing the file path
Methods
from_file_path(*args, **kwargs) – Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)
from_filename_parts(manifest_type, ulid_code) – Create instance from filename parts.
generate_prefixed_path(parent_path) – Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).
regex_match(path) – Parse and validate a given path against the regex defined as a class attribute
- classmethod _format_filename_parts(manifest_type: ManifestType, ulid_code: ULID)
Construct a path from filename parts
- Parameters:
manifest_type (ManifestType) – Input or output
ulid_code (ulid.ULID) – ULID code for use in filename parts
- Returns:
Formatted filename
- Return type:
- _parse_filename_parts()
Parse the filename parts into objects from regex matched strings
- Returns:
namespace object containing filename parts as parsed objects
- Return type:
- property archive_prefix: str
Manifests are not archived like data products, but for convenience and ease of debugging they are kept in the dropbox bucket, organized by manifest type (input or output) and the day they were made. This prefix is used by the Step Function cleanup function in the CDK. The prefix structure is <manifest_type>/<year>/<month>/<day>.
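The prefix layout <manifest_type>/<year>/<month>/<day> can be sketched as follows. The helper manifest_archive_prefix is hypothetical: it takes the manifest type as a plain string and the creation day as an explicit datetime, and it zero-pads month and day. How the library obtains the date and whether it pads the fields are assumptions not confirmed by this page.

```python
from datetime import datetime, timezone


def manifest_archive_prefix(manifest_type: str, created: datetime) -> str:
    """Build <manifest_type>/<year>/<month>/<day> (zero padding is an assumption)."""
    return f"{manifest_type}/{created:%Y/%m/%d}"


print(manifest_archive_prefix("input", datetime(2025, 3, 7, tzinfo=timezone.utc)))
```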
- classmethod from_filename_parts(manifest_type: ManifestType, ulid_code: ULID, basepath: str | Path | S3Path | None = None)
Create instance from filename parts.
This method exists primarily to expose type hints to the user for use with the generic _from_filename_parts. The parts are named according to the regex for the file type.
- Parameters:
manifest_type (ManifestType) – Input or output
ulid_code (ulid.ULID) – ULID code for use in filename parts
basepath (Optional[Union[str, Path, S3Path]]) – Allows prepending a basepath or prefix.
- Return type:
- libera_utils.io.filenaming._ensure_utc_timezone(dt_obj: datetime) → datetime
Ensure datetime object has UTC timezone info.
If the datetime is timezone-naive, assume it is in UTC and add timezone info. If the datetime is timezone-aware, convert it to UTC.
- Parameters:
dt_obj (datetime) – Input datetime object
- Returns:
Timezone-aware datetime in UTC
- Return type:
datetime
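The naive-versus-aware rule described above maps onto two standard datetime calls. The function name ensure_utc is a stand-in for illustration; this is a sketch of the described behavior, not the library source.

```python
from datetime import datetime, timedelta, timezone


def ensure_utc(dt_obj: datetime) -> datetime:
    """Return a timezone-aware datetime in UTC."""
    if dt_obj.tzinfo is None:
        # Naive: assume the value is already UTC and only attach tzinfo.
        return dt_obj.replace(tzinfo=timezone.utc)
    # Aware: convert the same instant to UTC.
    return dt_obj.astimezone(timezone.utc)


naive = datetime(2025, 1, 1, 12, 0)
aware = datetime(2025, 1, 1, 12, 0, tzinfo=timezone(timedelta(hours=2)))
print(ensure_utc(naive))  # clock time unchanged, tzinfo attached
print(ensure_utc(aware))  # clock time shifted back by two hours
```

The distinction matters because replace keeps the clock reading and changes only the label, while astimezone keeps the instant and changes the clock reading.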
- libera_utils.io.filenaming.format_semantic_version(semantic_version: str) → str
Formats a semantic version string X.Y.Z into a filename-compatible string like VX-Y-Z, where X is the major version, Y the minor version, and Z the patch.
The result is uppercase. Release candidate suffixes are allowed because no strict checking is done on the contents of X, Y, or Z. For example, 1.2.3rc1 becomes V1-2-3RC1.
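The transformation can be reproduced in one line. The name format_semantic_version_sketch is hypothetical, and the body is an assumption about one way to get the documented result, not the library's code.

```python
def format_semantic_version_sketch(semantic_version: str) -> str:
    """Turn X.Y.Z into VX-Y-Z, uppercased, with no validation of X, Y, or Z."""
    return "V" + semantic_version.replace(".", "-").upper()


print(format_semantic_version_sketch("1.2.3rc1"))  # V1-2-3RC1
```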