ds_provider_postgresql_py_lib.dataset
=====================================

.. py:module:: ds_provider_postgresql_py_lib.dataset

.. autoapi-nested-parse::

   **File:** ``__init__.py``
   **Region:** ``ds_provider_postgresql_py_lib/dataset``

   PostgreSQL Dataset

   This module implements a dataset for PostgreSQL databases.

   .. rubric:: Example

   >>> dataset = PostgreSQLDataset(
   ...     deserializer=PandasDeserializer(format=DatasetStorageFormatType.JSON),
   ...     serializer=PandasSerializer(format=DatasetStorageFormatType.JSON),
   ...     settings=PostgreSQLDatasetSettings(
   ...         table="users",
   ...         read=ReadSettings(
   ...             columns=["id", "name"],
   ...             filters={"status": "active"},
   ...             order_by=["created_at"],
   ...         ),
   ...     ),
   ...     linked_service=PostgreSQLLinkedService(
   ...         settings=PostgreSQLLinkedServiceSettings(
   ...             uri="postgresql://user:password@localhost:5432/mydb",
   ...         ),
   ...     ),
   ... )
   >>> dataset.read()
   >>> data = dataset.output


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/ds_provider_postgresql_py_lib/dataset/postgresql/index


Classes
-------

.. autoapisummary::

   ds_provider_postgresql_py_lib.dataset.PostgreSQLDataset
   ds_provider_postgresql_py_lib.dataset.PostgreSQLDatasetSettings


Package Contents
----------------

.. py:class:: PostgreSQLDataset

   Bases: :py:obj:`ds_resource_plugin_py_lib.common.resource.dataset.TabularDataset`\ [\ :py:obj:`PostgreSQLLinkedServiceType`\ , :py:obj:`PostgreSQLDatasetSettingsType`\ , :py:obj:`ds_resource_plugin_py_lib.common.serde.serialize.PandasSerializer`\ , :py:obj:`ds_resource_plugin_py_lib.common.serde.deserialize.PandasDeserializer`\ ], :py:obj:`Generic`\ [\ :py:obj:`PostgreSQLLinkedServiceType`\ , :py:obj:`PostgreSQLDatasetSettingsType`\ ]


   Tabular dataset object which identifies data within a data store,
   such as table/csv/json/parquet/parquetdataset/ and other documents.

   The input of the dataset is a pandas DataFrame.
   The output of the dataset is a pandas DataFrame.


   .. py:attribute:: linked_service
      :type:  PostgreSQLLinkedServiceType


   .. py:attribute:: settings
      :type:  PostgreSQLDatasetSettingsType


   .. py:attribute:: serializer
      :type:  ds_resource_plugin_py_lib.common.serde.serialize.PandasSerializer | None


   .. py:attribute:: deserializer
      :type:  ds_resource_plugin_py_lib.common.serde.deserialize.PandasDeserializer | None


   .. py:property:: type
      :type: ds_provider_postgresql_py_lib.enums.ResourceType


      Get the type of the dataset.

      :returns: The dataset resource type.
      :rtype: ResourceType


   .. py:method:: create(**_kwargs: Any) -> None

      Create/write data to the configured table.

      :param _kwargs: Additional keyword arguments for interface compatibility.

      :returns: None

      :raises CreateError: If writing data fails.


   .. py:method:: read(**_kwargs: Any) -> None

      Read rows from the configured table into `self.output`.

      :param _kwargs: Additional keyword arguments for interface compatibility.

      :returns: None

      :raises ReadError: If reading data fails.


   .. py:method:: delete(**_kwargs: Any) -> None

      Delete rows matching configured identity columns.

      :param _kwargs: Additional keyword arguments for interface compatibility.

      :returns: None

      :raises DeleteError: If deleting rows fails.


   .. py:method:: update(**_kwargs: Any) -> None

      Update rows matching configured identity columns.

      :param _kwargs: Additional keyword arguments for interface compatibility.

      :returns: None

      :raises UpdateError: If updating rows fails.


   .. py:method:: upsert(**_kwargs: Any) -> None

      Insert or update rows using PostgreSQL ON CONFLICT semantics.

      :param _kwargs: Additional keyword arguments for interface compatibility.

      :returns: None

      :raises UpsertError: If upserting rows fails.


   .. py:method:: purge(**_kwargs: Any) -> None

      Purge table contents or drop the table.

      :param _kwargs: Additional keyword arguments for interface compatibility.

      :returns: None

      :raises PurgeError: If purging table data fails.


   .. py:method:: list(**_kwargs: Any) -> None

      List operation is not supported for this provider.

      :param _kwargs: Additional keyword arguments for interface compatibility.

      :returns: None

      :raises NotSupportedError: Always, as list is not supported.


   .. py:method:: rename(**_kwargs: Any) -> None

      Rename operation is not supported for this provider.

      :param _kwargs: Additional keyword arguments for interface compatibility.

      :returns: None

      :raises NotSupportedError: Always, as rename is not supported.


   .. py:method:: close() -> None

      Close the dataset and underlying linked service.

      :returns: None


   .. py:method:: _output_from_empty_input() -> pandas.DataFrame

      Build a consistent empty-operation output while preserving input schema.

      :returns: Empty dataframe or a schema-preserving input copy.
      :rtype: pd.DataFrame


   .. py:method:: _get_table() -> sqlalchemy.Table

      Get the reflected SQLAlchemy table for configured schema and table.

      :returns: Reflected table object.
      :rtype: Table


   .. py:method:: _build_table_from_input(content: pandas.DataFrame) -> sqlalchemy.Table

      Build a SQLAlchemy Table definition from input DataFrame dtypes.

      :param content: Input DataFrame to build the table from.

      :returns: SQLAlchemy Table definition.
      :rtype: Table


   .. py:method:: _resolve_create_primary_key_columns(content: pandas.DataFrame) -> collections.abc.Sequence[str] | None

      Resolve and validate create-time primary key columns.

      :param content: Input DataFrame used for table creation.

      :returns: Primary key columns for new table creation.
      :rtype: Sequence[str] | None

      :raises ValidationError: If `primary_key` is enabled but columns are invalid.


   .. py:method:: _copy_into_table(conn: Any, table: sqlalchemy.Table, content: pandas.DataFrame) -> None

      Insert rows using PostgreSQL COPY.


   .. py:method:: _validate_columns(table: sqlalchemy.Table, column_names: collections.abc.Sequence[str]) -> None

      Validate that all requested columns exist in the reflected table.

      :param table: Reflected SQLAlchemy table.
      :param column_names: Column names to validate.

      :returns: None

      :raises ValidationError: If one or more columns do not exist in the table.


   .. py:method:: _validate_read_settings() -> None

      Validate read settings before query construction.

      :returns: None

      :raises ValidationError: If limit or order direction is invalid.


   .. py:method:: _build_select_columns(table: sqlalchemy.Table) -> sqlalchemy.sql.Select[Any]

      Build a SELECT statement for configured columns or all columns.

      :param table: Reflected SQLAlchemy table.

      :returns: SELECT statement with chosen columns.
      :rtype: Select[Any]

      :raises ValidationError: If any selected column does not exist.


   .. py:method:: _build_filters(stmt: sqlalchemy.sql.Select[Any], table: sqlalchemy.Table) -> sqlalchemy.sql.Select[Any]

      Apply equality filters from read settings to the SELECT statement.

      :param stmt: Current SELECT statement.
      :param table: Reflected SQLAlchemy table.

      :returns: SELECT statement with WHERE conditions applied.
      :rtype: Select[Any]

      :raises ValidationError: If any filter column does not exist.


   .. py:method:: _build_order_by(stmt: sqlalchemy.sql.Select[Any], table: sqlalchemy.Table) -> sqlalchemy.sql.Select[Any]

      Apply ORDER BY clauses from read settings to the SELECT statement.

      :param stmt: Current SELECT statement.
      :param table: Reflected SQLAlchemy table.

      :returns: SELECT statement with ORDER BY applied.
      :rtype: Select[Any]

      :raises ValidationError: If any order-by column does not exist.


.. py:class:: PostgreSQLDatasetSettings

   Bases: :py:obj:`ds_resource_plugin_py_lib.common.resource.dataset.DatasetSettings`


   Settings for PostgreSQL dataset operations.

   The `read` settings contains read-specific configuration that only
   applies to the read() operation, not to create(), delete(), update(), etc.


   .. py:attribute:: schema
      :type:  str
      :value: 'public'


      Schema for dataset operations.


   .. py:attribute:: table
      :type:  str

      Table for dataset operations.


   .. py:attribute:: read
      :type:  ReadSettings

      Settings for read().


   .. py:attribute:: create
      :type:  CreateSettings

      Settings for create().


   .. py:attribute:: update
      :type:  UpdateSettings | None
      :value: None


      Settings for update().


   .. py:attribute:: upsert
      :type:  UpsertSettings | None
      :value: None


      Settings for upsert().


   .. py:attribute:: delete
      :type:  DeleteSettings | None
      :value: None


      Settings for delete().


   .. py:attribute:: purge
      :type:  PurgeSettings

      Settings for purge().