ds_protocol_sftp_py_lib¶
File: __init__.py
Region: ds-protocol-sftp-py-lib
Description¶
A Python package from the ds-protocol-sftp-py-lib library.
Example
from ds_protocol_sftp_py_lib import __version__
print(f"Package version: {__version__}")
Submodules¶
Attributes¶
Classes¶
| SftpDataset | Tabular dataset object which identifies data within a data store. |
| SftpDatasetSettings | Settings for the SFTP dataset. |
| SftpLinkedService | SFTP Linked Service implementation. |
| SftpLinkedServiceSettings | Settings for SFTP Linked Service connections. |
Package Contents¶
- class ds_protocol_sftp_py_lib.SftpDataset[source]¶
Bases:
ds_resource_plugin_py_lib.common.resource.dataset.TabularDataset[SftpLinkedServiceType, SftpDatasetSettingsType, ds_resource_plugin_py_lib.common.serde.serialize.PandasSerializer, ds_resource_plugin_py_lib.common.serde.deserialize.PandasDeserializer], Generic[SftpLinkedServiceType, SftpDatasetSettingsType]
Tabular dataset object which identifies data within a data store, such as table/csv/json/parquet/parquetdataset and other documents.
The input of the dataset is a pandas DataFrame. The output of the dataset is a pandas DataFrame.
- linked_service: SftpLinkedServiceType¶
- settings: SftpDatasetSettingsType¶
- serializer: ds_resource_plugin_py_lib.common.serde.serialize.PandasSerializer¶
- deserializer: ds_resource_plugin_py_lib.common.serde.deserialize.PandasDeserializer¶
- property type: ds_protocol_sftp_py_lib.enums.ResourceType¶
Get the type of the dataset.
- read() None[source]¶
Read files from the SFTP server.
- Returns:
The output is stored in the output attribute as a DataFrame containing the contents of the matched files.
- Return type:
None
- Raises:
ReadError – If there is an error reading from the SFTP dataset.
- create() None[source]¶
Create data on the SFTP server.
Note
This method is not idempotent. If called multiple times with the same parameters, it will raise a CreateError if the file already exists. If a network or server error occurs after the file is created but before the method returns, retrying may result in a CreateError due to the file’s existence. Orchestration and retry policies should account for this non-idempotent behavior.
- Returns:
None
- Raises:
CreateError – If there is an error creating the dataset on the SFTP server, or if the file already exists.
- update() None[source]¶
The update operation is not supported by this provider.
- Returns:
None
- Raises:
NotSupportedError – Always raised since update is not supported for SftpDataset.
- upsert() None[source]¶
Upsert a file on the SFTP server. If the file already exists, it will be overwritten.
- Returns:
None
- Raises:
UpsertError – If there is an error upserting the dataset on the SFTP server.
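Because upsert() overwrites an existing file, repeating it after a transient failure should leave the same end state, which is not true of create(). A retry wrapper might look like the minimal sketch below. It is not part of the library: `UpsertError` is a local stand-in for the library's exception, `FlakyDataset` is a hypothetical in-memory substitute for SftpDataset, and `upsert_with_retry` with its `attempts` and `delay` parameters is an assumed helper.

```python
import time


class UpsertError(Exception):
    """Local stand-in for the library's UpsertError (assumption)."""


class FlakyDataset:
    """Hypothetical stand-in for SftpDataset that fails a fixed number of times."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0
        self.files = {}

    def upsert(self) -> None:
        self.calls += 1
        # Simulate a write that reaches the server before the call fails.
        self.files["report.csv"] = "id,value\n1,a\n"
        if self.calls <= self.failures:
            raise UpsertError("connection dropped after write")


def upsert_with_retry(dataset, attempts: int = 3, delay: float = 0.0) -> int:
    """Call dataset.upsert() until it succeeds; return the attempts used."""
    for attempt in range(1, attempts + 1):
        try:
            dataset.upsert()
            return attempt
        except UpsertError:
            if attempt == attempts:
                raise  # retries exhausted, surface the last error
            time.sleep(delay)
    raise ValueError("attempts must be at least 1")
```

Repeating the overwrite leaves a single copy of the file, so the wrapper needs no existence check between attempts.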
- delete() None[source]¶
The delete operation is not supported by this provider.
- Returns:
None
- Raises:
NotSupportedError – Always raised since delete is not supported for SftpDataset.
- purge() None[source]¶
Purge the dataset, deleting all files matching the pattern from the SFTP server.
- Returns:
None
- Raises:
PurgeError – If there is an error purging files from the SFTP server.
- list() None[source]¶
List the files in the directory on the SFTP server based on the specified pattern and settings.
- Returns:
The output is stored in the output attribute as a DataFrame containing the file information.
- Return type:
None
- Raises:
ListError – If there is an error listing the files in the SFTP dataset.
- rename() None[source]¶
The rename operation is not supported by this provider.
- Returns:
None
- Raises:
NotSupportedError – Always raised since rename is not supported for SftpDataset.
- _get_folder_and_file_path() str[source]¶
Get combined path of folder_path and file_name, using forward slashes. This ensures consistent path formatting across Windows, Linux, and macOS. It also replaces any Windows-style backslashes with forward slashes.
- Returns:
The full file path as a POSIX-style string.
- Return type:
str
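The path normalisation described for _get_folder_and_file_path() can be illustrated with a small standalone function. This is a sketch of the idea only, not the library's implementation: `join_folder_and_file` is a hypothetical helper, and its handling of empty and root folders is an assumption.

```python
def join_folder_and_file(folder_path: str, file_name: str) -> str:
    """Join a folder and a file name into a forward-slash path."""
    # Replace Windows-style backslashes before joining.
    normalised = folder_path.replace("\\", "/")
    folder = normalised.rstrip("/")
    name = file_name.replace("\\", "/").lstrip("/")
    if not folder:
        # Either the root ("/") or an empty, relative folder.
        return "/" + name if normalised.startswith("/") else name
    return f"{folder}/{name}"
```

For instance, `join_folder_and_file("data\\inbound", "file.csv")` yields `data/inbound/file.csv` on any operating system, because no call depends on the local path separator.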
- _ensure_file_does_not_exist(remote_path: str) None[source]¶
Ensure the target file does not already exist on the SFTP server.
- Parameters:
remote_path (str) – Full target file path on the SFTP server.
- Raises:
FileExistsError – If the target file already exists.
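An existence check of this kind is commonly built on a stat call. The sketch below shows the pattern under stated assumptions: `FakeSftpClient` is a hypothetical in-memory stand-in for an SFTP client whose `stat()` raises FileNotFoundError for a missing path, and `ensure_file_does_not_exist` is an illustrative free function rather than the library's method.

```python
class FakeSftpClient:
    """Hypothetical stand-in for an SFTP client, backed by a set of paths."""

    def __init__(self, existing: set) -> None:
        self.existing = existing

    def stat(self, path: str) -> str:
        if path not in self.existing:
            raise FileNotFoundError(path)
        return path  # a real client would return file attributes


def ensure_file_does_not_exist(client, remote_path: str) -> None:
    """Raise FileExistsError if remote_path is already present."""
    try:
        client.stat(remote_path)
    except FileNotFoundError:
        return  # nothing at the target path, safe to create
    raise FileExistsError(f"File already exists: {remote_path}")
```

The check and a later write are two separate round trips, so another writer could create the file in between. A guard like this narrows that window but does not close it.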
- _list_directory(path: str) list[paramiko.SFTPAttributes][source]¶
List the files in the specified directory on the SFTP server.
- Parameters:
path (str) – The directory path to list files from.
- Returns:
A list of SFTPAttributes for the files in the directory.
- Return type:
list[SFTPAttributes]
- _get_files_by_pattern(path: str, fnmatch_pattern: str) list[paramiko.SFTPAttributes][source]¶
Get files from the SFTP server that match the specified pattern.
- Parameters:
path (str) – The directory path to search for files.
fnmatch_pattern (str) – The pattern to match file names against.
- Returns:
A list of SFTPAttributes for the matching files.
- Return type:
list[SFTPAttributes]
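Pattern matching of this kind can be sketched with the standard-library fnmatch module. The example is illustrative rather than the library's code: `FileEntry` is a hypothetical stand-in for paramiko.SFTPAttributes that models only a `filename` attribute, and using the case-sensitive `fnmatchcase` is an assumption.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class FileEntry:
    """Hypothetical stand-in for paramiko.SFTPAttributes (name only)."""

    filename: str


def filter_by_pattern(entries, fnmatch_pattern: str):
    """Keep the entries whose file name matches a shell-style pattern."""
    # fnmatchcase compares case-sensitively on every platform.
    return [e for e in entries if fnmatchcase(e.filename, fnmatch_pattern)]


entries = [FileEntry("a.csv"), FileEntry("b.csv"), FileEntry("notes.txt"), FileEntry("A.CSV")]
matched = [e.filename for e in filter_by_pattern(entries, "*.csv")]
```

With these entries, `matched` contains `a.csv` and `b.csv`. `A.CSV` is excluded because the comparison is case-sensitive.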
- _ensure_sftp_directory(remote_directory: str, max_depth: int = 20) None[source]¶
Ensure that the specified directory exists on the SFTP server. If it does not exist, create it.
- Parameters:
remote_directory (str) – The directory path to ensure on the SFTP server.
max_depth (int) – The maximum directory depth to traverse when ensuring the directory exists. Default is 20.
- Returns:
None
- Raises:
CreateError – If the maximum directory depth is exceeded while ensuring the SFTP directory.
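The create-if-missing behaviour with a depth limit can be sketched as a walk over the path components. The code below is an assumption-laden illustration, not the library's implementation: `CreateError` is a local stand-in, `FakeSftpClient` is a hypothetical in-memory client, and counting depth as the number of path components is a guess at how `max_depth` is applied.

```python
class CreateError(Exception):
    """Local stand-in for the library's CreateError (assumption)."""


class FakeSftpClient:
    """Hypothetical in-memory client that tracks existing directories."""

    def __init__(self) -> None:
        self.dirs = {"/"}

    def stat(self, path: str) -> None:
        if path not in self.dirs:
            raise FileNotFoundError(path)

    def mkdir(self, path: str) -> None:
        self.dirs.add(path)


def ensure_directory(client, remote_directory: str, max_depth: int = 20) -> None:
    """Create each missing level of remote_directory, up to max_depth levels."""
    normalised = remote_directory.replace("\\", "/")
    parts = [p for p in normalised.split("/") if p]
    if len(parts) > max_depth:
        raise CreateError(f"Directory depth {len(parts)} exceeds max_depth={max_depth}")
    current = "/" if normalised.startswith("/") else ""
    for part in parts:
        current = f"{current.rstrip('/')}/{part}" if current else part
        try:
            client.stat(current)  # level already exists
        except FileNotFoundError:
            client.mkdir(current)  # create the missing level
```

Checking the depth before any mkdir call means a path that is too deep fails without leaving partially created directories behind.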
- _read_files_as_dataframe(files: list[paramiko.SFTPAttributes]) pandas.DataFrame[source]¶
Read the dataset from the SFTP server as a dataframe.
- Parameters:
files (list[SFTPAttributes]) – List of SFTPAttributes for the files to read.
- Returns:
The combined data from the files as a single DataFrame.
- Return type:
pd.DataFrame
- _list_directory_files(files: list[paramiko.SFTPAttributes]) pandas.DataFrame[source]¶
List the files in the directory as a dataframe.
- Parameters:
files (list[SFTPAttributes]) – List of SFTPAttributes for the files to list.
- Returns:
A dataframe containing the file information.
- Return type:
pd.DataFrame
- class ds_protocol_sftp_py_lib.SftpDatasetSettings[source]¶
Bases:
ds_resource_plugin_py_lib.common.resource.dataset.DatasetSettings
Settings for the SFTP dataset.
- folder_path: str¶
Path to the folder containing the file(s) to read/write on the SFTP server.
- file_name: str¶
Name of the file to read/write on the SFTP server.
- list: ListSettings¶
Settings for listing the SFTP dataset.
- class ds_protocol_sftp_py_lib.SftpLinkedService[source]¶
Bases:
ds_resource_plugin_py_lib.common.resource.linked_service.LinkedService[SftpLinkedServiceSettingsType], Generic[SftpLinkedServiceSettingsType]
SFTP Linked Service implementation.
- settings¶
Linked service settings.
- _connection¶
Underlying SFTP client connection.
- Type:
SFTPClient | None
- settings: SftpLinkedServiceSettingsType¶
- _sftp: ds_protocol_sftp_py_lib.utils.sftp.provider.Sftp | None = None¶
- property type: ds_protocol_sftp_py_lib.enums.ResourceType¶
Get the type of linked service.
- Returns:
The type of the linked service.
- Return type:
ResourceType
- property connection: ds_protocol_sftp_py_lib.utils.sftp.provider.Sftp¶
Get the SFTP client connection.
- Returns:
The active SFTP client connection.
- Return type:
Sftp
- Raises:
ConnectionError – If the connection is not initialized.
- _init_sftp() ds_protocol_sftp_py_lib.utils.sftp.provider.Sftp[source]¶
Initialize the Sftp client.
- Returns:
An initialized Sftp provider instance.
- Return type:
Sftp
- connect() None[source]¶
Initialize the Sftp client instance if not already initialized.
- Raises:
ConnectionError – If connection fails.
AuthenticationError – If authentication fails.
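The relationship between connect() and the connection property follows a common lazy-initialisation pattern, which might look roughly like the sketch below. `LinkedServiceSketch` and its `factory` argument are hypothetical, and the built-in ConnectionError is used here as a stand-in for whichever exception class the library raises.

```python
class LinkedServiceSketch:
    """Hypothetical illustration of a lazily initialised connection."""

    def __init__(self, factory) -> None:
        self._factory = factory  # callable that builds the client (assumption)
        self._sftp = None

    @property
    def connection(self):
        # Accessing the connection before connect() is an error.
        if self._sftp is None:
            raise ConnectionError("SFTP connection is not initialized; call connect() first")
        return self._sftp

    def connect(self) -> None:
        # Only build the client once; later calls are no-ops.
        if self._sftp is None:
            self._sftp = self._factory()
```

Under this pattern, calling connect() repeatedly is harmless, while reading `connection` first fails fast instead of returning None.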
- class ds_protocol_sftp_py_lib.SftpLinkedServiceSettings[source]¶
Bases:
ds_resource_plugin_py_lib.common.resource.linked_service.LinkedServiceSettings
Settings for SFTP Linked Service connections.
- host¶
SFTP server hostname.
- Type:
str
- username¶
Username for authentication.
- Type:
str
- password¶
Password for authentication.
- Type:
str | None
- private_key¶
Private key for authentication.
- Type:
str | None
- passphrase¶
Passphrase for private key.
- Type:
str | None
- timeout¶
Connection timeout in seconds.
- Type:
float | None
- host_key_fingerprint¶
Expected host key fingerprint.
- Type:
str | None
- host_key_validation¶
Whether to validate host key.
- Type:
bool
- port¶
SFTP server port.
- Type:
int
- host: str¶
Hostname or IP address of the SFTP server.
- username: str¶
Username for authentication.
- password: str | None = None¶
Password for authentication.
- private_key: str | None = None¶
Private key for authentication.
- passphrase: str | None = None¶
Passphrase for private key.
- timeout: float | None = None¶
Connection timeout in seconds.
- host_key_fingerprint: str | None = None¶
Expected host key fingerprint (base64-encoded MD5, as produced by Paramiko’s get_fingerprint(); e.g., ‘AbCdEfGhIjKlMnOpQrStUvWxYz0123456789abcdEf==’).
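A value in this format can be derived from the raw public key bytes with the standard library alone. The sketch below assumes the fingerprint is the MD5 digest of the serialized host key, encoded with standard base64. `md5_fingerprint_b64` is a hypothetical helper and the input bytes are placeholder data, not a real key.

```python
import base64
import hashlib


def md5_fingerprint_b64(host_key_blob: bytes) -> str:
    """Return the base64-encoded MD5 digest of a serialized host key."""
    digest = hashlib.md5(host_key_blob).digest()  # 16 raw bytes
    return base64.b64encode(digest).decode("ascii")


# Placeholder bytes standing in for a real serialized public key.
fingerprint = md5_fingerprint_b64(b"placeholder-host-key-bytes")
```

An MD5 digest is 16 bytes, so the encoded value is always 24 characters long and ends in `==`.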
- host_key_validation: bool = True¶
Whether to validate host key.
- port: int = 22¶
SFTP server port.
- ds_protocol_sftp_py_lib.__version__¶