ds_provider_microsoft_py_lib
============================

.. py:module:: ds_provider_microsoft_py_lib

.. autoapi-nested-parse::

   **File:** ``__init__.py``
   **Region:** ``ds-provider-microsoft-py-lib``

   Description
   -----------
   A Python package from the ds-provider-microsoft-py-lib library.

   .. rubric:: Example

   .. code-block:: python

       from ds_provider_microsoft_py_lib import __version__

       print(f"Package version: {__version__}")


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/ds_provider_microsoft_py_lib/dataset/index
   /autoapi/ds_provider_microsoft_py_lib/enums/index
   /autoapi/ds_provider_microsoft_py_lib/linked_service/index


Attributes
----------

.. autoapisummary::

   ds_provider_microsoft_py_lib.__version__


Classes
-------

.. autoapisummary::

   ds_provider_microsoft_py_lib.MsSqlTable
   ds_provider_microsoft_py_lib.MsSqlTableDatasetSettings
   ds_provider_microsoft_py_lib.MsSqlLinkedService
   ds_provider_microsoft_py_lib.MsSqlLinkedServiceSettings


Package Contents
----------------

.. py:class:: MsSqlTable

   Bases: :py:obj:`ds_resource_plugin_py_lib.common.resource.dataset.TabularDataset`\ [\ :py:obj:`MsSqlLinkedServiceType`\ , :py:obj:`MsSqlTableDatasetSettingsType`\ , :py:obj:`ds_resource_plugin_py_lib.common.serde.serialize.PandasSerializer`\ , :py:obj:`ds_resource_plugin_py_lib.common.serde.deserialize.PandasDeserializer`\ ], :py:obj:`Generic`\ [\ :py:obj:`MsSqlLinkedServiceType`\ , :py:obj:`MsSqlTableDatasetSettingsType`\ ]

   Tabular dataset object which identifies data within a data store,
   such as table/csv/json/parquet/parquetdataset and other documents.

   Both the input and the output of the dataset are pandas DataFrames.

   .. py:attribute:: linked_service
      :type: MsSqlLinkedServiceType

   .. py:attribute:: settings
      :type: MsSqlTableDatasetSettingsType

   .. py:attribute:: serializer
      :type: ds_resource_plugin_py_lib.common.serde.serialize.PandasSerializer | None

   .. py:attribute:: deserializer
      :type: ds_resource_plugin_py_lib.common.serde.deserialize.PandasDeserializer | None

   .. py:property:: type
      :type: ds_provider_microsoft_py_lib.enums.ResourceType

      Get the type of the Dataset.

      :returns: ResourceType

   .. py:method:: create(**_kwargs: Any) -> None

      Create/write data to the specified table.

      Writes ``self.input`` (a pandas DataFrame) to the database table using the
      configured create settings (mode, etc.).

      :param _kwargs: Additional keyword arguments to pass to the request.
      :raises ConnectionError: If the connection fails.
      :raises CreateError: If the create operation fails.

   .. py:method:: read(**_kwargs: Any) -> None

      Read rows from the configured table into ``self.output``.

      :param _kwargs: Additional keyword arguments for interface compatibility.
      :returns: None
      :raises ReadError: If reading data fails.

   .. py:method:: purge(**_kwargs: Any) -> None

      Remove all content from the target table.

      Drops the entire table, leaving the structure empty.
      Per contract, the target is empty after ``purge()`` returns.
      This is idempotent -- purging an already-empty (or non-existent) table is a no-op.

      :param _kwargs: Additional keyword arguments (ignored).
      :raises ConnectionError: If the connection is not established.
      :raises PurgeError: If the purge operation fails.

   .. py:method:: delete(**_kwargs: Any) -> None

      Delete specific rows from the target table.

      Removes only the rows in ``self.input``, matched by all columns as identity.
      Per contract: empty input is a no-op (returns immediately).
      Deleting a row that does not exist is not an error.

      :param _kwargs: Additional keyword arguments (ignored).
      :raises ConnectionError: If the connection is not established.
      :raises DeleteError: If the delete operation fails.

   .. py:method:: update(**_kwargs: Any) -> None

      Update existing rows in the target table.

      This operation is not supported for SQL Server datasets at this time.

      :param _kwargs: Additional keyword arguments (ignored).
      :raises NotSupportedError: Always -- update is not supported.

   .. py:method:: rename(**_kwargs: Any) -> None

      Rename a resource (table) in the backend.

      This operation is not supported for SQL Server datasets at this time.

      :param _kwargs: Additional keyword arguments (ignored).
      :raises NotSupportedError: Always -- rename is not supported.

   .. py:method:: close() -> None

      Clean up the connection to the backend.

      Per contract: must be safe to call multiple times and never raise.

      :returns: None

   .. py:method:: list(**_kwargs: Any) -> None

      Discover available resources (tables) in the schema.

      Uses SQLAlchemy's Inspector to reflect and retrieve all tables
      in the configured schema with their metadata (type: table or view).

      :param _kwargs: Additional keyword arguments (ignored).
      :raises ConnectionError: If the connection is not established.
      :raises ListError: If the list operation fails.

   .. py:method:: upsert(**_kwargs: Any) -> None

      Insert or update rows in the target table.

      This operation is not supported for SQL Server datasets at this time.

      :param _kwargs: Additional keyword arguments (ignored).
      :raises NotSupportedError: Always -- upsert is not supported.

   .. py:method:: _get_table() -> sqlalchemy.Table

      Get the SQLAlchemy Table object for the configured schema and table.

      :returns: The SQLAlchemy Table object.
      :rtype: Table

   .. py:method:: _pandas_dtype_to_sqlalchemy(dtypes: pandas.Series) -> dict[str, Any]
      :staticmethod:

      Convert pandas dtypes Series to a dict mapping column names to SQLAlchemy types.

      :param dtypes: Pandas Series where index is column names and values are dtypes.
      :returns: Dictionary mapping column names to SQLAlchemy types.
      :rtype: dict[str, Any]

   .. py:method:: _validate_column(table: sqlalchemy.Table, column_name: str) -> None

      Validate that a column exists in the table.

      :param table: The SQLAlchemy Table object.
      :param column_name: The name of the column to validate.
      :raises ValueError: If the column doesn't exist in the table.

   .. py:method:: _validate_columns(table: sqlalchemy.Table, column_names: collections.abc.Sequence[str]) -> None

      Validate that all requested columns exist in the reflected table.

      :param table: Reflected SQLAlchemy table.
      :param column_names: Column names to validate.
      :returns: None
      :raises ValidationError: If one or more columns do not exist in the table.

   .. py:method:: _build_select_columns(table: sqlalchemy.Table) -> sqlalchemy.sql.Select[Any]

      Build a SELECT statement for configured columns or all columns.

      :param table: Reflected SQLAlchemy table.
      :returns: SELECT statement with chosen columns.
      :rtype: Select[Any]
      :raises ValidationError: If any selected column does not exist.

   .. py:method:: _build_filters(stmt: sqlalchemy.sql.Select[Any], table: sqlalchemy.Table) -> sqlalchemy.sql.Select[Any]

      Apply equality filters from read settings to the SELECT statement.

      :param stmt: Current SELECT statement.
      :param table: Reflected SQLAlchemy table.
      :returns: SELECT statement with WHERE conditions applied.
      :rtype: Select[Any]
      :raises ValidationError: If any filter column does not exist.

   .. py:method:: _build_order_by(stmt: sqlalchemy.sql.Select[Any], table: sqlalchemy.Table) -> sqlalchemy.sql.Select[Any]

      Apply ORDER BY clauses from read settings to the SELECT statement.

      :param stmt: Current SELECT statement.
      :param table: Reflected SQLAlchemy table.
      :returns: SELECT statement with ORDER BY applied.
      :rtype: Select[Any]
      :raises ValidationError: If any order-by column does not exist.

   .. py:method:: _quote_identifier(name: str) -> str

      Quote identifiers safely for SQL Server using SQLAlchemy's identifier preparer.

      Rejects identifiers containing obvious injection primitives such as quotes,
      semicolons, or brackets before quoting.

      :param name: The identifier name to quote.
      :returns: The safely quoted identifier.
      :rtype: str
      :raises ValueError: If the identifier contains unsafe characters.

   .. py:method:: get_details() -> dict[str, Any]

      Get details about the dataset.

      Constructs and returns a dictionary containing metadata about the current
      dataset configuration, including table name, schema name, and optional
      query filters and delete settings.

      :returns: A dictionary containing:

                - table_name (str): The name of the target table
                - schema_name (str): The schema containing the table
                - query_filter (Any, optional): Filter criteria if specified
                - delete_table (str, optional): Delete table setting if specified
      :rtype: dict[str, Any]

   .. py:method:: _is_na_scalar(v: Any) -> bool
      :staticmethod:

      Check whether *v* is a scalar NA value (NaN, NaT, None, pd.NA).

      ``pd.isna()`` returns an array-like result for non-scalar inputs
      (list, tuple, dict, ndarray), which makes a bare ``if pd.isna(v)``
      raise ``ValueError: The truth value of an array is ambiguous``.
      This helper guards against that by only calling ``pd.isna`` on
      values that are known to be scalar.

      :param v: Any value from a record dict.
      :returns: ``True`` when *v* is a scalar NA-like value.
      :rtype: bool

   .. py:method:: _sanitize_records(records: collections.abc.Sequence[dict[collections.abc.Hashable, Any]]) -> collections.abc.Sequence[dict[collections.abc.Hashable, Any]]
      :staticmethod:

      Replace NaN and NaT values with None in record dicts.

      SQL Server rejects ``float('nan')`` over the TDS/ODBC protocol with
      *"The supplied value is not a valid instance of data type float"*.
      Converting these sentinel values to ``None`` causes SQLAlchemy to emit
      proper SQL ``NULL`` parameters instead.

      Non-scalar values (lists, tuples, dicts, ndarrays) are left as-is
      because ``pd.isna()`` returns an array-like result for them, which
      cannot be evaluated as a boolean.

      :param records: Row dicts produced by ``DataFrame.to_dict(orient="records")``.
      :returns: The same rows with NaN/NaT replaced by None.
      :rtype: Sequence[dict[Hashable, Any]]

   .. py:method:: _get_identity_columns(table: sqlalchemy.Table) -> collections.abc.Sequence[str]
      :staticmethod:

      Return the names of identity (auto-increment) columns on *table*.

      :param table: A reflected or constructed SQLAlchemy Table.
      :returns: Column names that have an identity property.
      :rtype: Sequence[str]

   .. py:method:: _set_identity_insert(conn: Any, *, enabled: bool) -> None

      Toggle ``IDENTITY_INSERT`` for the configured table.

      :param conn: Active SQLAlchemy connection.
      :param enabled: ``True`` to turn identity insert ON, ``False`` for OFF.

   .. py:method:: _copy_into_table(conn: Any, table: sqlalchemy.Table, content: pandas.DataFrame) -> None

      Insert rows from a DataFrame into a SQL Server table.

      Handles identity-column awareness (toggling ``IDENTITY_INSERT``) and
      sanitises NaN / NaT values so that SQL Server receives valid parameters.

      :param conn: SQLAlchemy connection inside an active transaction.
      :param table: SQLAlchemy Table object (metadata only).
      :param content: DataFrame containing rows to insert.

   .. py:method:: _resolve_create_primary_key_columns(content: pandas.DataFrame) -> collections.abc.Sequence[str] | None

      Resolve and validate create-time primary key columns.

      :param content: Input DataFrame used for table creation.
      :returns: Primary key columns for new table creation.
      :rtype: Sequence[str] | None
      :raises ValidationError: If ``primary_key`` is enabled but columns are invalid.

   .. py:method:: _build_table_from_input(content: pandas.DataFrame) -> sqlalchemy.Table

      Build a SQLAlchemy Table definition from input DataFrame dtypes.

      :param content: Input DataFrame to build the table from.
      :returns: SQLAlchemy Table definition.
      :rtype: Table

   .. py:method:: _output_from_empty_input() -> pandas.DataFrame

      Build a consistent empty-operation output while preserving input schema.

      :returns: Empty dataframe or a schema-preserving input copy.
      :rtype: pd.DataFrame

   .. py:method:: _validate_read_settings() -> None

      Validate read settings before query construction.

      :returns: None
      :raises ValidationError: If limit or order direction is invalid.


.. py:class:: MsSqlTableDatasetSettings

   Bases: :py:obj:`ds_resource_plugin_py_lib.common.resource.dataset.DatasetSettings`

   The object containing the settings of the dataset.

   .. py:attribute:: table
      :type: str

      Table name for dataset operations.

   .. py:attribute:: schema
      :type: str

      Schema for dataset operations.

   .. py:attribute:: read
      :type: ReadSettings

      Settings for ``read()``.

   .. py:attribute:: create
      :type: CreateSettings

      Settings for ``create()``.


.. py:class:: MsSqlLinkedService

   Bases: :py:obj:`ds_resource_plugin_py_lib.common.resource.linked_service.LinkedService`\ [\ :py:obj:`MsSqlLinkedServiceSettingsType`\ ], :py:obj:`Generic`\ [\ :py:obj:`MsSqlLinkedServiceSettingsType`\ ]

   Linked service for connecting to Microsoft SQL Server.

   This linked service manages connections to SQL Server databases.
   It handles authentication, connection lifecycle, and error handling
   according to the linked service contract.

   .. rubric:: Example

   >>> import uuid
   >>> from ds_provider_microsoft_py_lib import MsSqlLinkedService, MsSqlLinkedServiceSettings
   >>> settings = MsSqlLinkedServiceSettings(
   ...     server="localhost",
   ...     database="mydb",
   ...     username="user",
   ...     password="pass"
   ... )
   >>> service = MsSqlLinkedService(
   ...     settings=settings,
   ...     id=uuid.uuid4(),
   ...     name="my_mssql",
   ...     version="0.0.1"
   ... )
   >>> service.connect()
   >>> with service as svc:
   ...     # connection is an Engine, so open a Connection to run statements.
   ...     with svc.connection.connect() as conn:
   ...         data = conn.execute(...)

   .. py:attribute:: settings
      :type: MsSqlLinkedServiceSettingsType

   .. py:attribute:: _connection
      :type: sqlalchemy.engine.Engine | None
      :value: None

      The SQLAlchemy Engine instance representing the connection to the SQL Server database.

   .. py:method:: check_settings_is_set() -> None

      Check if settings are set correctly.

      :returns: None
      :raises AttributeError: If settings are not set correctly.

   .. py:property:: connection
      :type: sqlalchemy.engine.Engine

      Get the backend connection (SQLAlchemy Engine).

      :returns: The SQLAlchemy Engine instance.
      :rtype: Engine
      :raises ConnectionError: If ``connect()`` has not been called.

   .. py:property:: type
      :type: ds_provider_microsoft_py_lib.enums.ResourceType

      Get the type of the linked service.

      :returns: ResourceType

   .. py:method:: _get_connection_string() -> str

      Build the ODBC connection string.

      :returns: The ODBC connection string.
      :rtype: str

   .. py:method:: _create_engine() -> sqlalchemy.engine.Engine

      Connect to SQL Server and return a SQLAlchemy Engine.

      :returns: The SQLAlchemy Engine instance.
      :rtype: Engine
      :raises ConnectionError: If the engine cannot be created.
      :raises AuthenticationError: If credentials are invalid.

   .. py:method:: connect() -> None

      Establish a connection to Microsoft SQL Server.

      The result is stored internally and accessible via the ``connection`` property.

      :returns: None
      :raises ConnectionError: If the connection cannot be established.
      :raises AuthenticationError: If credentials are invalid.

      Rules:

      - Idempotent: Calling ``connect()`` on an already-connected service reuses the connection.
      - Must authenticate using credentials from ``self.settings``.
      - Must fail loudly if connection cannot be established.

   .. py:method:: test_connection() -> tuple[bool, str]

      Verify that the connection to Microsoft SQL Server is healthy.

      Performs a lightweight check against the backend (a simple ``SELECT 1`` query).
      This method does not raise on connection failure -- instead it returns
      ``(False, "error message")``. Exceptions are reserved for unexpected internal errors.

      :returns: On success: ``(True, "")``. On failure: ``(False, "reason")``.
      :rtype: tuple[bool, str]

      Rules:

      - Must not raise on connection failure.
      - Must not modify any data.
      - Should complete quickly.
      - Idempotent: Yes.

   .. py:method:: close() -> None

      Release connections, sessions, or handles held by the linked service.

      This method is safe to call multiple times and does not raise even if the
      connection is already closed. Called automatically by ``__exit__`` when using
      a context manager.

      :returns: None

      Rules:

      - Must release any open connections, sessions, or handles.
      - Must not raise if the connection is already closed.
      - Must be safe to call multiple times.
      - Idempotent: Yes.


.. py:class:: MsSqlLinkedServiceSettings

   Bases: :py:obj:`ds_resource_plugin_py_lib.common.resource.linked_service.LinkedServiceSettings`

   The object containing the Microsoft SQL Server linked service settings.

   .. py:attribute:: server
      :type: str

      The hostname or IP address of the SQL Server instance.

   .. py:attribute:: database
      :type: str

      The name of the database to connect to.

   .. py:attribute:: username
      :type: str

      The username for authentication.

   .. py:attribute:: password
      :type: str

      The password for authentication. This field is masked in logs and serialized output.

   .. py:attribute:: port
      :type: int
      :value: 1433

      The port number for the SQL Server instance. Defaults to 1433, the standard port for SQL Server.

   .. py:attribute:: driver
      :type: str
      :value: 'ODBC Driver 18 for SQL Server'

      The ODBC driver to use for the connection. Defaults to "ODBC Driver 18 for SQL Server".

   .. py:attribute:: encrypt
      :type: bool
      :value: True

      Whether to encrypt the connection. Defaults to True.

   .. py:attribute:: trust_server_certificate
      :type: bool
      :value: False

      Whether to trust the server certificate when encrypting. Defaults to False.

   .. py:attribute:: connection_timeout
      :type: int
      :value: 30

      The connection timeout in seconds. Defaults to 30.


.. py:data:: __version__
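
The ``MsSqlLinkedServiceSettings`` fields map naturally onto an ODBC connection
string. The sketch below is illustrative only and is not taken from this
package: ``SketchSettings``, ``_brace`` and ``build_odbc_connection_string`` are
hypothetical names, and the real ``_get_connection_string()`` may be implemented
differently. It assumes the common ODBC keywords ``DRIVER``, ``SERVER``,
``DATABASE``, ``UID``, ``PWD``, ``Encrypt`` and ``TrustServerCertificate``, and
it leaves ``connection_timeout`` out because how a timeout is passed depends on
the DB-API driver in use.

.. code-block:: python

    from dataclasses import dataclass


    @dataclass
    class SketchSettings:
        """Hypothetical stand-in mirroring the documented settings fields."""

        server: str
        database: str
        username: str
        password: str
        port: int = 1433
        driver: str = "ODBC Driver 18 for SQL Server"
        encrypt: bool = True
        trust_server_certificate: bool = False


    def _brace(value: str) -> str:
        """Wrap a value in braces, doubling any closing brace (ODBC escaping)."""
        return "{" + value.replace("}", "}}") + "}"


    def build_odbc_connection_string(s: SketchSettings) -> str:
        """Assemble a semicolon-separated ODBC connection string (assumed layout)."""
        parts = [
            f"DRIVER={_brace(s.driver)}",
            f"SERVER={s.server},{s.port}",  # SQL Server expects "host,port"
            f"DATABASE={s.database}",
            f"UID={s.username}",
            f"PWD={_brace(s.password)}",  # braces protect ';' inside the password
            f"Encrypt={'yes' if s.encrypt else 'no'}",
            f"TrustServerCertificate={'yes' if s.trust_server_certificate else 'no'}",
        ]
        return ";".join(parts)


    settings = SketchSettings(
        server="localhost", database="mydb", username="user", password="p;ss"
    )
    print(build_odbc_connection_string(settings))

Bracing the password is a defensive choice in this sketch: without it, a
semicolon in the secret would be read as a keyword separator.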