libera_utils.io.filenaming#

Module for file naming utilities

Functions

format_semantic_version(semantic_version)

Formats a semantic version string X.Y.Z into a filename-compatible string like VX-Y-Z, for X = major version, Y = minor version, Z = patch.

get_current_revision_str()

Get the current r%y%j%H%M%S string for filename revisions.

get_current_version_str(package_name)

Retrieve the current version of an algorithm package and format it for inclusion in a filename

Classes

AbstractValidFilename(*args, **kwargs)

Composition of a CloudPath/Path instance with some methods to perform regex validation on filenames

AttitudeKernelFilename(*args, **kwargs)

Class to construct, store, and manipulate a CK (attitude kernel) filename

EphemerisKernelFilename(*args, **kwargs)

Class to construct, store, and manipulate an SPK filename

L0Filename(*args, **kwargs)

Filename validation class for L0 files from EDOS.

LiberaDataProductFilename(*args, **kwargs)

Filename validation class for L1B and L2 science products

ManifestFilename(*args, **kwargs)

Class for naming manifest files

ProductName(value)

Enum of valid product names as used in filenames, defined and sourced from the LASP-ASDC ICD

class libera_utils.io.filenaming.AbstractValidFilename(*args, **kwargs)#

Composition of a CloudPath/Path instance with some methods to perform regex validation on filenames

Attributes:
archive_prefix

Property that contains the generated prefix used for archiving, when applicable

data_product_id

Property that contains the DataProductIdentifier for this file type

filename_parts

Property that contains a namespace of filename parts

path

Property containing the file path

processing_step_id

Property that contains the ProcessingStepIdentifier that generates this file

Methods

from_file_path(*args, **kwargs)

Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)

from_filename_parts(*args[, basepath])

Abstract method that must be implemented to provide hinting for required parts

generate_prefixed_path(parent_path)

Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).

regex_match(path)

Parse and validate a given path against class-attribute defined regex

static _calculate_applicable_time(start: datetime, end: datetime) → date#

Based on the start time and end time of a file, returns the applicable time (date)

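Parameters:
  • start (datetime.datetime) – Start time of the file.

  • end (datetime.datetime) – End time of the file.

Returns: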

The date of the mean time between start and end

Return type:

datetime.date
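
The midpoint rule described above can be sketched as follows. This is a minimal illustration, not the library's code: the function name applicable_date_sketch is hypothetical, and it assumes "mean time" means the arithmetic midpoint of start and end.

```python
from datetime import date, datetime


def applicable_date_sketch(start: datetime, end: datetime) -> date:
    """Return the calendar date of the midpoint between start and end."""
    # Assumption: the "mean time" is start plus half of the elapsed interval.
    midpoint = start + (end - start) / 2
    return midpoint.date()


# A file spanning 22:00 on 1 Jan to 04:00 on 2 Jan has its midpoint at 01:00 on 2 Jan.
print(applicable_date_sketch(datetime(2024, 1, 1, 22, 0), datetime(2024, 1, 2, 4, 0)))
```

A file that straddles midnight is therefore assigned to whichever day holds more of its time span.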

abstractmethod classmethod _format_filename_parts(**parts)#

Format parts into a filename

Note: When this is implemented by concrete classes, **parts becomes a set of explicitly named arguments

classmethod _from_filename_parts(*, basepath: str | Path | S3Path = None, **parts: Any)#

Create instance from filename parts.

The part keyword argument names match the part names in the regex for the file type.

Parameters:
  • basepath (Optional[Union[str, Path, S3Path]]) – Allows prepending a basepath or prefix.

  • parts (Any) – Passed directly to _format_filename_parts. This is a dict of variable kwargs that will differ in each filename class based on the required parts for that particular filename type.

Return type:

AbstractValidFilename

abstractmethod _parse_filename_parts()#

Parse the filename parts into objects from regex matched strings

Returns:

namespace object containing filename parts as parsed objects

Return type:

types.SimpleNamespace
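
As a rough illustration of what "parsing regex matched strings into objects" can look like, the sketch below converts a dict of strings into a types.SimpleNamespace with typed values. The part names, the timestamp layout in TIMESTAMP_FORMAT, and the function name are all hypothetical and are not taken from the library.

```python
from datetime import datetime
from types import SimpleNamespace

# Hypothetical timestamp layout, chosen only for this example.
TIMESTAMP_FORMAT = "%Y%m%dT%H%M%S"


def parse_parts_sketch(matched: dict) -> SimpleNamespace:
    """Convert regex-matched strings into typed objects on a namespace."""
    return SimpleNamespace(
        name=matched["name"],  # kept as a plain string
        utc_start=datetime.strptime(matched["utc_start"], TIMESTAMP_FORMAT),
        file_number=int(matched["file_number"]),  # "007" becomes 7
    )


parts = parse_parts_sketch(
    {"name": "cam", "utc_start": "20240201T030405", "file_number": "007"}
)
print(parts.utc_start.year, parts.file_number)
```

Attribute access on the namespace (parts.utc_start) then mirrors how the filename_parts property is described.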

abstract property archive_prefix: str#

Property that contains the generated prefix used for archiving, when applicable

abstract property data_product_id: DataProductIdentifier#

Property that contains the DataProductIdentifier for this file type

property filename_parts#

Property that contains a namespace of filename parts

classmethod from_file_path(*args, **kwargs) → AVF#

Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)

abstractmethod classmethod from_filename_parts(*args: Any, basepath: str | Path | S3Path = None, **kwargs: Any)#

Abstract method that must be implemented to provide hinting for required parts

generate_prefixed_path(parent_path: str | Path | S3Path) → Path | S3Path#

Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).

Parameters:

parent_path (Union[str, Path, S3Path]) – Absolute path to the parent directory or S3 bucket prefix. The generated path prefix is appended to the parent path and followed by the file basename.

Return type:

pathlib.Path or cloudpathlib.s3.s3path.S3Path
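
The path layout {parent_path}/{prefix_structure}/{file_basename} can be sketched with the standard library alone. This is an illustration under assumptions: the function name, the string-based S3 handling (the real method returns Path or S3Path objects), the ValueError on relative paths, and the example prefix are all hypothetical.

```python
from pathlib import PurePosixPath


def prefixed_path_sketch(parent_path: str, prefix: str, basename: str) -> str:
    """Join parent, archive prefix, and basename into one absolute location."""
    if parent_path.startswith("s3://"):
        # Join the bucket/key part separately so the "s3://" scheme is preserved.
        bucket_and_key = parent_path[len("s3://"):].rstrip("/")
        return "s3://" + str(PurePosixPath(bucket_and_key, prefix, basename))
    if not parent_path.startswith("/"):
        # Assumption: a relative local parent is rejected with ValueError.
        raise ValueError(f"parent_path must be absolute or an S3 URI: {parent_path!r}")
    return str(PurePosixPath(parent_path, prefix, basename))


# Hypothetical prefix and basename, for illustration only.
print(prefixed_path_sketch("/data/archive", "L1B/2024/02/01", "file.nc"))
print(prefixed_path_sketch("s3://my-bucket", "L1B/2024/02/01", "file.nc"))
```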

property path: Path | S3Path#

Property containing the file path

abstract property processing_step_id: ProcessingStepIdentifier#

Property that contains the ProcessingStepIdentifier that generates this file

regex_match(path: str | Path | S3Path)#

Parse and validate a given path against class-attribute defined regex

Returns:

Match group dict of filename parts

Return type:

dict
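
A minimal sketch of validating a basename against a class-attribute regex and returning the match group dict. The class ToyFilename, its pattern, and the choice to raise ValueError on a mismatch are assumptions made for this example; the actual Libera patterns and error behavior are not shown in this reference.

```python
import re
from pathlib import PurePosixPath


class ToyFilename:
    # Hypothetical pattern with named groups; not a real Libera filename regex.
    _regex = re.compile(
        r"^(?P<name>[a-z]+)_(?P<version>V\d+-\d+-\d+)\.(?P<extension>nc|h5)$"
    )

    @classmethod
    def regex_match(cls, path) -> dict:
        """Validate the basename of path and return its named groups."""
        basename = PurePosixPath(str(path)).name
        match = cls._regex.match(basename)
        if match is None:
            raise ValueError(f"{basename!r} does not match {cls._regex.pattern!r}")
        return match.groupdict()


print(ToyFilename.regex_match("/tmp/cam_V1-2-3.nc"))
```

Only the basename is matched, so the same file validates regardless of the directory or bucket it sits in.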

class libera_utils.io.filenaming.AttitudeKernelFilename(*args, **kwargs)#

Class to construct, store, and manipulate a CK (attitude kernel) filename

Attributes:
archive_prefix

Property that contains the generated prefix for SPICE archiving

data_product_id

Property that contains the DataProductIdentifier for this file type

filename_parts

Property that contains a namespace of filename parts

path

Property containing the file path

processing_step_id

Property that contains the ProcessingStepIdentifier that generates this file

Methods

from_file_path(*args, **kwargs)

Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)

from_filename_parts(*, ck_object, version, ...)

Create instance from filename parts.

generate_prefixed_path(parent_path)

Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).

regex_match(path)

Parse and validate a given path against class-attribute defined regex

classmethod _format_filename_parts(*, ck_object: str, version: str, utc_start: datetime, utc_end: datetime, revision: datetime)#

Format filename parts as a string

Parameters:
  • ck_object (str) – Name of object whose attitude is represented in this CK.

  • utc_start (datetime.datetime) – Start time of data.

  • utc_end (datetime.datetime) – End time of data.

  • version (str) – Software version that the file was created with. Corresponds to the algorithm version as determined by the algorithm software.

  • revision (datetime.datetime) – When the file was last revised.

Return type:

str

_parse_filename_parts()#

Parse the filename parts into objects from regex matched strings

Returns:

namespace object containing filename parts as parsed objects

Return type:

types.SimpleNamespace

property archive_prefix: str#

Property that contains the generated prefix for SPICE archiving

property data_product_id: DataProductIdentifier#

Property that contains the DataProductIdentifier for this file type

classmethod from_filename_parts(*, ck_object: str, version: str, utc_start: datetime, utc_end: datetime, revision: datetime, basepath: str | Path | S3Path | None = None)#

Create instance from filename parts.

This method exists primarily to expose type hints to the user; the work is done by the generic _from_filename_parts. The part argument names match the part names in the regex for the file type.

Parameters:
  • ck_object (str) – Name of object whose attitude is represented in this CK.

  • version (str) – Software version that the file was created with. Corresponds to the algorithm version as determined by the algorithm software.

  • utc_start (datetime.datetime) – Start time of data.

  • utc_end (datetime.datetime) – End time of data.

  • revision (datetime.datetime) – When the file was last revised.

  • basepath (Optional[Union[str, Path, S3Path]]) – Allows prepending a basepath or prefix.

Return type:

AttitudeKernelFilename
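
The pattern described above, a keyword-only classmethod that exposes type hints and delegates to a generic _from_filename_parts, might look like the sketch below. The classes ToyBase and ToyKernelFilename, the reduced set of parts, and the filename layout are hypothetical and do not reproduce the library's implementation.

```python
from datetime import datetime
from pathlib import PurePosixPath
from typing import Any, Optional


class ToyBase:
    def __init__(self, path: PurePosixPath):
        self.path = path

    @classmethod
    def _format_filename_parts(cls, **parts: Any) -> str:
        raise NotImplementedError

    @classmethod
    def _from_filename_parts(cls, *, basepath: Optional[str] = None, **parts: Any):
        # Generic path: format the basename, then optionally prepend basepath.
        filename = cls._format_filename_parts(**parts)
        path = PurePosixPath(basepath, filename) if basepath else PurePosixPath(filename)
        return cls(path)


class ToyKernelFilename(ToyBase):
    @classmethod
    def _format_filename_parts(cls, *, ck_object: str, version: str, utc_start: datetime) -> str:
        # Hypothetical layout, for illustration only.
        return f"{ck_object}_{version}_{utc_start:%Y%m%dT%H%M%S}.bc"

    @classmethod
    def from_filename_parts(cls, *, ck_object: str, version: str, utc_start: datetime,
                            basepath: Optional[str] = None):
        # Explicit keyword-only signature gives editors and type checkers the part names.
        return cls._from_filename_parts(
            basepath=basepath, ck_object=ck_object, version=version, utc_start=utc_start
        )


fn = ToyKernelFilename.from_filename_parts(
    ck_object="toy-object", version="V1-0-0",
    utc_start=datetime(2024, 2, 1, 3, 4, 5), basepath="/kernels",
)
print(fn.path)
```

Because the parameters are keyword-only, a positional call fails immediately with TypeError instead of silently swapping parts.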

property processing_step_id: ProcessingStepIdentifier#

Property that contains the ProcessingStepIdentifier that generates this file

class libera_utils.io.filenaming.EphemerisKernelFilename(*args, **kwargs)#

Class to construct, store, and manipulate an SPK filename

Attributes:
archive_prefix

Property that contains the generated prefix for SPICE archiving

data_product_id

Property that contains the DataProductIdentifier for this file type

filename_parts

Property that contains a namespace of filename parts

path

Property containing the file path

processing_step_id

Property that contains the ProcessingStepIdentifier that generates this file

Methods

from_file_path(*args, **kwargs)

Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)

from_filename_parts(*, spk_object, version, ...)

Create instance from filename parts.

generate_prefixed_path(parent_path)

Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).

regex_match(path)

Parse and validate a given path against class-attribute defined regex

classmethod _format_filename_parts(*, spk_object: str, version: str, utc_start: datetime, utc_end: datetime, revision: datetime)#

Format filename parts as a string

Parameters:
  • spk_object (str) – Name of object whose ephemeris is represented in this SPK.

  • version (str) – Software version that the file was created with. Corresponds to the algorithm version as determined by the algorithm software.

  • utc_start (datetime.datetime) – Start time of data.

  • utc_end (datetime.datetime) – End time of data.

  • revision (datetime.datetime) – Time when the file was last revised

Return type:

str

_parse_filename_parts()#

Parse the filename parts into objects from regex matched strings

Returns:

namespace object containing filename parts as parsed objects

Return type:

types.SimpleNamespace

property archive_prefix: str#

Property that contains the generated prefix for SPICE archiving

property data_product_id: DataProductIdentifier#

Property that contains the DataProductIdentifier for this file type

classmethod from_filename_parts(*, spk_object: str, version: str, utc_start: datetime, utc_end: datetime, revision: datetime, basepath: str | Path | S3Path | None = None)#

Create instance from filename parts.

This method exists primarily to expose type hints to the user; the work is done by the generic _from_filename_parts. The part argument names match the part names in the regex for the file type.

Parameters:
  • spk_object (str) – Name of object whose ephemeris is represented in this SPK.

  • version (str) – Software version that the file was created with. Corresponds to the algorithm version as determined by the algorithm software.

  • utc_start (datetime.datetime) – Start time of data.

  • utc_end (datetime.datetime) – End time of data.

  • revision (datetime.datetime) – When the file was last revised.

  • basepath (Optional[Union[str, Path, S3Path]]) – Allows prepending a basepath or prefix.

Return type:

EphemerisKernelFilename

property processing_step_id: ProcessingStepIdentifier#

Property that contains the ProcessingStepIdentifier that generates this file

class libera_utils.io.filenaming.L0Filename(*args, **kwargs)#

Filename validation class for L0 files from EDOS.

Attributes:
archive_prefix

Property that contains the generated prefix for L0 archiving

data_product_id

Property that contains the DataProductIdentifier for this file type

filename_parts

Property that contains a namespace of filename parts

path

Property containing the file path

processing_step_id

Property that contains the ProcessingStepIdentifier that generates this file

Methods

from_file_path(*args, **kwargs)

Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)

from_filename_parts(*, id_char, scid, ...[, ...])

Create instance from filename parts

generate_prefixed_path(parent_path)

Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).

regex_match(path)

Parse and validate a given path against class-attribute defined regex

classmethod _format_filename_parts(*, id_char: str, scid: int, first_apid: int, fill: str, created_time: datetime, numeric_id: int, file_number: int, extension: str, signal: str | None = None)#

Construct a path from filename parts

Parameters:
  • id_char (str) – Either P (for PDS files, Construction Records) or X (for Delivery Records)

  • scid (int) – Spacecraft ID

  • first_apid (int) – First APID in the file

  • fill (str) – Custom string up to 14 characters long

  • created_time (datetime.datetime) – Creation time of the file

  • numeric_id (int) – Data set ID, 0-9, one digit

  • file_number (int) – File number within the data set. Construction records are always file number zero.

  • extension (str) – File name extension. Either PDR or PDS

  • signal (Optional[str]) – Optional signal suffix. Always ‘.XFR’

Returns:

Formatted filename

Return type:

str

_parse_filename_parts()#

Parse the filename parts into objects from regex matched strings

Returns:

namespace object containing filename parts as parsed objects

Return type:

types.SimpleNamespace

property archive_prefix: str#

Property that contains the generated prefix for L0 archiving

property data_product_id: DataProductIdentifier#

Property that contains the DataProductIdentifier for this file type

classmethod from_filename_parts(*, id_char: str, scid: int, first_apid: int, fill: str, created_time: datetime, numeric_id: int, file_number: int, extension: str, signal: str | None = None, basepath: str | Path | S3Path | None = None)#

Create instance from filename parts

This method exists primarily to expose type hints to the user; the work is done by the generic _from_filename_parts. The part names match those in the regex for the file type.

Parameters:
  • id_char (str) – Either P (for PDS files, Construction Records) or X (for Delivery Records)

  • scid (int) – Spacecraft ID

  • first_apid (int) – First APID in the file

  • fill (str) – Custom string up to 14 characters long

  • created_time (datetime.datetime) – Creation time of the file

  • numeric_id (int) – Data set ID, 0-9, one digit

  • file_number (int) – File number within the data set. Construction records are always file number zero.

  • extension (str) – File name extension. Either PDR or PDS

  • signal (Optional[str]) – Optional signal suffix. Always ‘.XFR’

  • basepath (Optional[Union[str, Path, S3Path]]) – Allows prepending a basepath or prefix.

Return type:

L0Filename

property processing_step_id: ProcessingStepIdentifier#

Property that contains the ProcessingStepIdentifier that generates this file

class libera_utils.io.filenaming.LiberaDataProductFilename(*args, **kwargs)#

Filename validation class for L1B and L2 science products

Attributes:
archive_prefix

Property that contains the generated prefix for L1B and L2 archiving

data_product_id

Property that contains the DataProductIdentifier for this file type

filename_parts

Property that contains a namespace of filename parts

path

Property containing the file path

processing_step_id

Property that contains the ProcessingStepIdentifier that generates this file

Methods

from_file_path(*args, **kwargs)

Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)

from_filename_parts(*, data_level, ...[, ...])

Create instance from filename parts.

generate_prefixed_path(parent_path)

Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).

regex_match(path)

Parse and validate a given path against class-attribute defined regex

classmethod _format_filename_parts(*, data_level: str, product_name: str, version: str, utc_start: datetime, utc_end: datetime, revision: datetime, extension: str)#

Construct a path from filename parts

Parameters:
  • data_level (str) – L1B or L2

  • product_name (str) – Libera instrument, cam or rad for L1B and cloud-fraction etc. for L2. May contain anything except for underscores.

  • version (str) – Software version that the file was created with. Corresponds to the algorithm version as determined by the algorithm software.

  • utc_start (datetime.datetime) – First timestamp in the data product

  • utc_end (datetime.datetime) – Last timestamp in the data product

  • revision (datetime.datetime) – Time when the file was created.

  • extension (str) – File extension (.nc or .h5)

Returns:

Formatted filename

Return type:

str

_parse_filename_parts()#

Parse the filename parts into objects from regex matched strings

Returns:

namespace object containing filename parts as parsed objects

Return type:

types.SimpleNamespace

property archive_prefix: str#

Property that contains the generated prefix for L1B and L2 archiving

property data_product_id: DataProductIdentifier#

Property that contains the DataProductIdentifier for this file type

classmethod from_filename_parts(*, data_level: str, product_name: str, version: str, utc_start: datetime, utc_end: datetime, revision: datetime, extension: str = 'nc', basepath: str | Path | S3Path | None = None)#

Create instance from filename parts. All keyword arguments other than extension and basepath are required.

This method exists primarily to expose type hints to the user; the work is done by the generic _from_filename_parts. The part names match those in the regex for the file type.

Parameters:
  • data_level (str) – L1B or L2 identifying the level of the data product

  • product_name (str) – Product type. e.g. cloud-fraction for L2 or cam for L1B. May contain anything except for underscores.

  • version (str) – Software version that the file was created with. Corresponds to the algorithm version as determined by the algorithm software.

  • utc_start (datetime.datetime) – First timestamp in the data product

  • utc_end (datetime.datetime) – Last timestamp in the data product

  • revision (datetime.datetime) – Time when the file was created.

  • extension (str) – File extension (.nc or .h5). Defaults to 'nc'.

  • basepath (Optional[Union[str, Path, S3Path]]) – Allows prepending a basepath or prefix.

Return type:

LiberaDataProductFilename
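
The constraints stated in the parameter descriptions (data level is L1B or L2, product names contain no underscores, extension is nc or h5) can be checked up front. The sketch below is a hypothetical pre-check, not part of the library; the function name and the use of ValueError are assumptions, and it accepts the extension with or without a leading dot because the reference shows both forms.

```python
def check_product_parts_sketch(data_level: str, product_name: str, extension: str = "nc") -> None:
    """Raise ValueError if a part violates the documented constraints."""
    if data_level not in {"L1B", "L2"}:
        raise ValueError(f"data_level must be 'L1B' or 'L2', got {data_level!r}")
    if "_" in product_name:
        # Underscores are documented as not allowed in product names.
        raise ValueError(f"product_name must not contain underscores: {product_name!r}")
    if extension.lstrip(".") not in {"nc", "h5"}:
        raise ValueError(f"extension must be nc or h5, got {extension!r}")


check_product_parts_sketch("L2", "cloud-fraction")      # passes silently
check_product_parts_sketch("L1B", "cam", extension=".h5")  # passes silently
```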

property processing_step_id: ProcessingStepIdentifier#

Property that contains the ProcessingStepIdentifier that generates this file

class libera_utils.io.filenaming.ManifestFilename(*args, **kwargs)#

Class for naming manifest files

Attributes:
archive_prefix

Manifests are not archived like data products, but for convenience and ease of debugging they are kept in the dropbox bucket, organized by manifest type (input or output) and the day they were made.

data_product_id

Property that contains the DataProductIdentifier for this file type

filename_parts

Property that contains a namespace of filename parts

path

Property containing the file path

processing_step_id

Property that contains the ProcessingStepIdentifier that generates this file

Methods

from_file_path(*args, **kwargs)

Factory method to produce an AbstractValidFilename from a valid Libera file path (str or Path)

from_filename_parts(manifest_type, ulid_code)

Create instance from filename parts.

generate_prefixed_path(parent_path)

Generates an absolute path of the form {parent_path}/{prefix_structure}/{file_basename}. The parent_path can be an S3 bucket or an absolute local filepath (must start with /).

regex_match(path)

Parse and validate a given path against class-attribute defined regex

classmethod _format_filename_parts(manifest_type: ManifestType, ulid_code: ULID)#

Construct a path from filename parts

Parameters:
  • manifest_type (ManifestType) – Input or output

  • ulid_code (ulid.ULID) – ULID code for use in filename parts

Returns:

Formatted filename

Return type:

str

_parse_filename_parts()#

Parse the filename parts into objects from regex matched strings

Returns:

namespace object containing filename parts as parsed objects

Return type:

types.SimpleNamespace

property archive_prefix: str#

Manifests are not archived like data products, but for convenience and ease of debugging they are kept in the dropbox bucket, organized by manifest type (input or output) and the day they were made. This prefix is used by the step function cleanup function in the CDK. The prefix structure is <manifest_type>/<year>/<month>/<day>.
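
A minimal sketch of building a <manifest_type>/<year>/<month>/<day> prefix. The function name, the zero-padded month and day, and passing the creation date in directly are assumptions; how the library derives the date and spells the manifest type is not stated here.

```python
from datetime import date


def manifest_prefix_sketch(manifest_type: str, created: date) -> str:
    """Build a manifest_type/year/month/day prefix from a creation date."""
    # Assumption: month and day are zero-padded to two digits.
    return f"{manifest_type}/{created:%Y}/{created:%m}/{created:%d}"


print(manifest_prefix_sketch("input", date(2024, 2, 1)))  # input/2024/02/01
```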

property data_product_id: DataProductIdentifier#

Property that contains the DataProductIdentifier for this file type

classmethod from_filename_parts(manifest_type: ManifestType, ulid_code: ULID, basepath: str | Path | S3Path = None)#

Create instance from filename parts.

This method exists primarily to expose type hints to the user; the work is done by the generic _from_filename_parts. The part names match those in the regex for the file type.

Parameters:
  • manifest_type (ManifestType) – Input or output

  • ulid_code (ulid.ULID) – ULID code for use in filename parts

  • basepath (Optional[Union[str, Path, S3Path]]) – Allows prepending a basepath or prefix.

Return type:

ManifestFilename

property processing_step_id: ProcessingStepIdentifier#

Property that contains the ProcessingStepIdentifier that generates this file

class libera_utils.io.filenaming.ProductName(value)#

Enum of valid product names as used in filenames, defined and sourced from the LASP-ASDC ICD

Methods

Members are str instances, so all built-in str methods are inherited unchanged.

property data_product_id: DataProductIdentifier#

DataProductIdentifier for this product name

property processing_step_id: ProcessingStepIdentifier#

ProcessingStepIdentifier for this product name

libera_utils.io.filenaming.format_semantic_version(semantic_version: str) → str#

Formats a semantic version string X.Y.Z into a filename-compatible string like VX-Y-Z, for X = major version, Y = minor version, Z = patch.

Result is uppercase. Release candidate suffixes are allowed as no strict checking is done on the contents of X, Y, or Z. e.g. 1.2.3rc1 becomes V1-2-3RC1

Parameters:

semantic_version (str) – String matching X.Y.Z where X, Y and Z are integers of any length

Return type:

str
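
The transformation described above can be sketched in one line. This is an illustrative reimplementation based only on the documented behavior (prefix V, dots to hyphens, uppercase, no validation); the function name is hypothetical and the library's actual code may differ.

```python
def format_semantic_version_sketch(semantic_version: str) -> str:
    """Turn 'X.Y.Z' into 'VX-Y-Z', uppercased, without validating X, Y, or Z."""
    # Replace dots with hyphens, then uppercase so suffixes such as 'rc1' become 'RC1'.
    return "V" + semantic_version.replace(".", "-").upper()


print(format_semantic_version_sketch("1.2.3"))     # V1-2-3
print(format_semantic_version_sketch("1.2.3rc1"))  # V1-2-3RC1
```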

libera_utils.io.filenaming.get_current_revision_str() → str#

Get the current r%y%j%H%M%S string for filename revisions.

Returns:

Current (now) revision string.

Return type:

str
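
The r%y%j%H%M%S layout is a literal r followed by a two-digit year, a three-digit day of year, and the time as HHMMSS. A minimal sketch, assuming the current time is taken in UTC (the reference does not say which clock is used) and using a hypothetical function name:

```python
from datetime import datetime, timezone
from typing import Optional

REVISION_FORMAT = "r%y%j%H%M%S"


def revision_str_sketch(now: Optional[datetime] = None) -> str:
    """Format a timestamp as a revision string; defaults to the current time."""
    if now is None:
        now = datetime.now(timezone.utc)  # assumption: revisions use UTC
    return now.strftime(REVISION_FORMAT)


# 1 Feb 2024 is day 032 of the year, so 03:04:05 on that day gives r24032030405.
print(revision_str_sketch(datetime(2024, 2, 1, 3, 4, 5)))
```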

libera_utils.io.filenaming.get_current_version_str(package_name: str) → str#

Retrieve the current version of an algorithm package and format it for inclusion in a filename

Parameters:

package_name (str) – Package for which to retrieve a version string. This should be your algorithm package and it must use a semantic versioning scheme, configured in project metadata.

Returns:

Version string in format vM1m2p3

Return type:

str
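
One plausible way to implement this lookup is to read the installed version from package metadata and reuse the VX-Y-Z formatting documented for format_semantic_version. That delegation is an assumption, as are the function name and the injectable lookup parameter (added here so the example runs without any particular package installed).

```python
from importlib.metadata import version as installed_version
from typing import Callable


def current_version_str_sketch(
    package_name: str, lookup: Callable[[str], str] = installed_version
) -> str:
    """Look up a package's semantic version and format it for a filename."""
    semantic_version = lookup(package_name)
    # Assumption: same formatting as format_semantic_version (VX-Y-Z, uppercase).
    return "V" + semantic_version.replace(".", "-").upper()


# "my-algorithm" is a hypothetical package; the lambda stands in for real metadata.
print(current_version_str_sketch("my-algorithm", lookup=lambda name: "3.1.4"))  # V3-1-4
```

With the default lookup, importlib.metadata.version raises PackageNotFoundError when the package is not installed.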