libera_utils.io.smart_open

Utilities for opening and copying files that may reside on the local filesystem or in an S3 bucket.

Functions

is_gzip(path)

Determine if a string points to a gzip file.

is_s3(path)

Determine if a string points to an s3 location or not.

smart_copy_file(source_path, dest_path[, delete])

Copy function that can handle local files or files in an S3 bucket.

smart_open(path[, mode, enable_gzip])

Open function that can handle local files or files in an S3 bucket.

libera_utils.io.smart_open._copy_local_to_local(source_path: str | Path, dest_path: str | Path, delete: bool | None = False)

Copy a local source file to a local destination.

Parameters:
  • source_path (Union[str, Path]) – Path to the source file to be copied.

  • dest_path (Union[str, Path]) – Path to the destination for the copied file.

  • delete (bool, optional) – If True, delete the source file after it has been copied. Defaults to False.

Returns:

The path to the newly created file

Return type:

Path
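
As a rough illustration of the local-to-local case, the sketch below copies a file with the standard library and returns the destination as a Path. The name copy_local_sketch is hypothetical, and reading delete=True as "remove the source once it has been copied" is an assumption about the documented flag, not a confirmed detail of libera_utils.

```python
import shutil
from pathlib import Path


def copy_local_sketch(source_path, dest_path, delete=False) -> Path:
    """Copy one local file to another local path (hypothetical stand-in)."""
    source, dest = Path(source_path), Path(dest_path)
    shutil.copyfile(source, dest)  # copies the file contents only
    if delete:
        # Assumed meaning of delete=True: remove the source once copied.
        source.unlink()
    return dest
```

With delete=True the sketch behaves like a move: the destination exists afterwards and the source does not.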

libera_utils.io.smart_open._copy_local_to_s3(source_path: str | Path, dest_path: str | S3Path, delete: bool | None = False)

Copy a local file to an S3 object.

Parameters:
  • source_path (Union[str, Path]) – Path to the source file to be copied.

  • dest_path (Union[str, S3Path]) – Path to the destination for the copied file. Paths to files in an S3 bucket must begin with “s3://”.

  • delete (bool, optional) – If True, delete the source file after it has been copied. Defaults to False.

Returns:

The path to the newly created file

Return type:

S3Path

libera_utils.io.smart_open._copy_s3_to_local(source_path: str | S3Path, dest_path: str | Path, delete: bool | None = False)

Copy an S3 object to a local file.

Parameters:
  • source_path (Union[str, S3Path]) – Path to the source file to be copied. Paths to files in an S3 bucket must begin with “s3://”.

  • dest_path (Union[str, Path]) – Path to the destination for the copied file.

  • delete (bool, optional) – If True, delete the source file after it has been copied. Defaults to False.

Returns:

The path to the newly created file

Return type:

Path

libera_utils.io.smart_open._copy_s3_to_s3(source_path: str | S3Path, dest_path: str | S3Path, delete: bool | None = False)

Copy an S3 object to a different S3 object.

Parameters:
  • source_path (Union[str, S3Path]) – Path to the source file to be copied. Paths to files in an S3 bucket must begin with “s3://”.

  • dest_path (Union[str, S3Path]) – Path to the destination for the copied file. Paths to files in an S3 bucket must begin with “s3://”.

  • delete (bool, optional) – If True, delete the source file after it has been copied. Defaults to False.

Returns:

The path to the newly created file

Return type:

S3Path

libera_utils.io.smart_open.is_gzip(path: str | Path | S3Path)

Determine if a string points to a gzip file.

Parameters:

path (Union[str, Path, S3Path]) – Path to check.

Return type:

bool
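
The smart_open description says gzip files are identified by a *.gz extension, so a check of this kind can plausibly be done on the name alone. The sketch below is one way to do that; is_gzip_sketch is a hypothetical name, the file names are made up for illustration, and the real is_gzip may differ (for example in how it treats case).

```python
from pathlib import Path


def is_gzip_sketch(path) -> bool:
    """Guess gzip-ness from the name alone; the file is never opened."""
    # Assumption: only a trailing ".gz" counts, matching the "*.gz" wording.
    return str(path).endswith(".gz")


print(is_gzip_sketch("archive/packets.bin.gz"))       # True
print(is_gzip_sketch(Path("archive/packets.bin")))    # False
```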

libera_utils.io.smart_open.is_s3(path: str | Path | S3Path)

Determine if a string points to an s3 location or not.

Parameters:

path (Union[str, Path, S3Path]) – Path to check for an S3 location.

Return type:

bool
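
Since the parameter descriptions on this page require S3 paths to begin with “s3://”, a prefix test is the simplest way to picture this check. The sketch below assumes exactly that; is_s3_sketch is a hypothetical name, the bucket and key are invented, and the real function may also inspect the object's type.

```python
def is_s3_sketch(path) -> bool:
    """Treat anything whose string form starts with "s3://" as an S3 location."""
    return str(path).startswith("s3://")


print(is_s3_sketch("s3://bucket/data/file.h5"))  # True
print(is_s3_sketch("/tmp/data/file.h5"))         # False
```

One caveat with a string-based check: on POSIX systems pathlib.Path("s3://bucket/key") collapses the double slash to "s3:/bucket/key", so under this sketch an S3 location should be passed as a str rather than a plain Path.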

libera_utils.io.smart_open.smart_copy_file(source_path: str | Path | S3Path, dest_path: str | Path | S3Path, delete: bool | None = False)

Copy function that can handle local files or files in an S3 bucket. Returns the path to the newly created file as a Path or an S3Path, depending on the destination.

Parameters:
  • source_path (Union[str, Path, S3Path]) – Path to the source file to be copied. Paths to files in an S3 bucket must begin with “s3://”.

  • dest_path (Union[str, Path, S3Path]) – Path to the destination for the copied file. Paths to files in an S3 bucket must begin with “s3://”.

  • delete (bool, optional) – If True, delete the source file after it has been copied. Defaults to False.

Returns:

The path to the newly created file

Return type:

Path or S3Path
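
The four private helpers documented on this page cover every combination of local and S3 endpoints, which suggests that smart_copy_file selects one of them based on where the source and destination live. The sketch below only illustrates that selection; pick_copy_helper is a hypothetical name, the file and bucket names are invented, and whether the library dispatches this way is an assumption.

```python
def pick_copy_helper(source_path, dest_path) -> str:
    """Return the name of the documented helper matching the two locations."""
    def kind(path) -> str:
        # Assumption: S3 locations are recognised by their "s3://" prefix.
        return "s3" if str(path).startswith("s3://") else "local"

    return f"_copy_{kind(source_path)}_to_{kind(dest_path)}"


for src, dst in [
    ("in.nc", "out.nc"),
    ("in.nc", "s3://bucket/out.nc"),
    ("s3://bucket/in.nc", "out.nc"),
    ("s3://bucket/in.nc", "s3://bucket/out.nc"),
]:
    print(f"{src} -> {dst}: {pick_copy_helper(src, dst)}")
```

The return type of the real function follows the same split: a Path when the destination is local and an S3Path when it is in a bucket.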

libera_utils.io.smart_open.smart_open(path: str | Path | S3Path, mode: str | None = 'rb', enable_gzip: bool | None = True)

Open function that can handle local files or files in an S3 bucket. It also correctly handles gzip files determined by a *.gz extension.

Parameters:
  • path (Union[str, Path, S3Path]) – Path to the file to be opened. Files residing in an s3 bucket must begin with “s3://”.

  • mode (str, optional) – Mode in which the file is opened. Defaults to ‘rb’.

  • enable_gzip (bool, optional) – If True, *.gz files are opened as a GzipFile object. Setting this to False is useful when computing the md5sum of a *.gz file itself. Defaults to True.

Return type:

IO or gzip.GzipFile
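
To make the effect of enable_gzip concrete, the sketch below reproduces the described behavior for local files only, using the standard library. The name open_local_sketch and the example file name are hypothetical, S3 handling is left out entirely, and the real smart_open may be implemented differently.

```python
import gzip
import hashlib
import tempfile
from pathlib import Path


def open_local_sketch(path, mode="rb", enable_gzip=True):
    """Open a local file; wrap *.gz files in gzip unless enable_gzip is False."""
    if enable_gzip and str(path).endswith(".gz"):
        return gzip.open(path, mode)  # a gzip.GzipFile for binary modes
    return open(path, mode)


with tempfile.TemporaryDirectory() as tmp:
    gz_path = Path(tmp) / "example.txt.gz"
    with gzip.open(gz_path, "wb") as f:
        f.write(b"hello")

    # Default: the caller sees the decompressed payload.
    with open_local_sketch(gz_path) as f:
        payload = f.read()

    # enable_gzip=False: the caller sees the compressed bytes on disk,
    # which is what a checksum of the *.gz file itself needs.
    with open_local_sketch(gz_path, enable_gzip=False) as f:
        checksum = hashlib.md5(f.read()).hexdigest()

print(payload)   # b'hello'
print(checksum)  # md5 of the compressed file, not of b'hello'
```

Hashing the decompressed stream instead would give the checksum of the payload, which does not match an md5sum computed over the *.gz file as stored.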