ds_provider_grasp_py_lib.dataset
================================

.. py:module:: ds_provider_grasp_py_lib.dataset

.. autoapi-nested-parse::

   **File:** ``__init__.py``
   **Region:** ``ds_provider_grasp_py_lib/dataset``

   Grasp Datasets

   This module provides access to both Grasp Cart and Grasp Ingress datasets.

   .. rubric:: Example

   >>> dataset = GraspCartDataset(
   ...     id=uuid.uuid4(),
   ...     name="cart-dataset",
   ...     version="1.0.0",
   ...     deserializer=PandasDeserializer(format=DatasetStorageFormatType.JSON),
   ...     serializer=PandasSerializer(format=DatasetStorageFormatType.JSON),
   ...     settings=GraspCartDatasetSettings(
   ...         owner_id="owner_id",
   ...         product_group_name="product_group_name",
   ...         product_name="product_name",
   ...         version="version",
   ...         include_history=True,
   ...     ),
   ...     linked_service=GraspAwsLinkedService(
   ...         id=uuid.uuid4(),
   ...         name="aws-linked-service",
   ...         version="1.0.0",
   ...         settings=GraspAwsLinkedServiceSettings(
   ...             access_key_id="access_key_id",
   ...             access_key_secret="access_key_secret",
   ...             region="region",
   ...         ),
   ...     ),
   ... )
   >>> dataset.read()
   >>> data = dataset.output


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/ds_provider_grasp_py_lib/dataset/cart/index
   /autoapi/ds_provider_grasp_py_lib/dataset/ingress/index


Classes
-------

.. autoapisummary::

   ds_provider_grasp_py_lib.dataset.GraspCartDataset
   ds_provider_grasp_py_lib.dataset.GraspCartDatasetSettings
   ds_provider_grasp_py_lib.dataset.GraspIngressDataset
   ds_provider_grasp_py_lib.dataset.GraspIngressDatasetSettings


Package Contents
----------------

.. py:class:: GraspCartDataset

   Bases: :py:obj:`ds_resource_plugin_py_lib.common.resource.dataset.TabularDataset`\ [\ :py:obj:`AWSLinkedServiceType`\ , :py:obj:`GraspCartDatasetSettingsType`\ , :py:obj:`ds_resource_plugin_py_lib.common.serde.serialize.AwsWranglerSerializer`\ , :py:obj:`ds_resource_plugin_py_lib.common.serde.deserialize.AwsWranglerDeserializer`\ ], :py:obj:`Generic`\ [\ :py:obj:`AWSLinkedServiceType`\ , :py:obj:`GraspCartDatasetSettingsType`\ ]

   Tabular dataset object which identifies data within a data store,
   such as table/csv/json/parquet/parquetdataset/ and other documents.

   The input of the dataset is a pandas DataFrame.
   The output of the dataset is a pandas DataFrame.


   .. py:attribute:: linked_service
      :type: AWSLinkedServiceType


   .. py:attribute:: settings
      :type: GraspCartDatasetSettingsType


   .. py:method:: __post_init__() -> None


   .. py:property:: type
      :type: ds_provider_grasp_py_lib.enums.ResourceType

      Get the type of the dataset.


   .. py:method:: _get_s3_path(tenant_id: str) -> str


   .. py:method:: create() -> None

      Insert all rows in ``self.input`` into the target as a single atomic
      transaction. Must not delete, update, or overwrite existing data.

      :raises CreateError: If the operation fails.
      :raises NotSupportedError: If the provider does not support create.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``create()``


   .. py:method:: read() -> None

      Read data from the Grasp Cart dataset.

      :raises ReadError: If the read operation fails, including when no files
          are found at the S3 path or when the S3 path is invalid.


   .. py:method:: delete() -> NoReturn

      Remove specific rows from the target matched by identity columns
      defined in ``self.settings``. Atomic. Idempotent.

      :raises DeleteError: If the operation fails.
      :raises NotSupportedError: If the provider does not support delete.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``delete()``


   .. py:method:: update() -> NoReturn

      Update existing rows in the target matched by identity columns
      defined in ``self.settings``. Atomic. Must not insert new rows.

      :raises UpdateError: If the operation fails.
      :raises NotSupportedError: If the provider does not support update.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``update()``


   .. py:method:: upsert() -> NoReturn

      Insert rows that do not exist, update rows that do, matched by
      identity columns defined in ``self.settings``. Atomic.

      :raises UpsertError: If the operation fails.
      :raises NotSupportedError: If the provider does not support upsert.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``upsert()``


   .. py:method:: rename() -> NoReturn

      Rename the resource in the backend. Atomic. Not idempotent.

      :raises RenameError: If the operation fails.
      :raises NotSupportedError: If the provider does not support renaming.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``rename()``


   .. py:method:: purge() -> NoReturn

      Remove all content from the target. ``self.input`` is not used.
      Atomic. Idempotent.

      :raises PurgeError: If the operation fails.
      :raises NotSupportedError: If the provider does not support purge.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``purge()``


   .. py:method:: list() -> NoReturn

      Discover available resources and populate ``self.output`` with a
      DataFrame of resources and their metadata. Idempotent.

      :raises ListError: If the operation fails.
      :raises NotSupportedError: If the provider does not support listing.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``list()``


   .. py:method:: close() -> None

      Close the dataset.


.. py:class:: GraspCartDatasetSettings

   Bases: :py:obj:`ds_resource_plugin_py_lib.common.resource.dataset.DatasetSettings`

   Settings for Grasp Cart dataset operations.


   .. py:attribute:: owner_id
      :type: str

      The owner ID of the cart.


   .. py:attribute:: product_group_name
      :type: str

      The product group name of the cart.


   .. py:attribute:: product_name
      :type: str

      The product name of the cart.


   .. py:attribute:: version
      :type: str
      :value: '1.0'

      The version of the cart.


   .. py:attribute:: include_history
      :type: bool
      :value: False

      Whether to include history in the cart.


.. py:class:: GraspIngressDataset

   Bases: :py:obj:`ds_resource_plugin_py_lib.common.resource.dataset.TabularDataset`\ [\ :py:obj:`AWSLinkedServiceType`\ , :py:obj:`GraspIngressDatasetSettingsType`\ , :py:obj:`ds_resource_plugin_py_lib.common.serde.serialize.AwsWranglerSerializer`\ , :py:obj:`ds_resource_plugin_py_lib.common.serde.deserialize.AwsWranglerDeserializer`\ ], :py:obj:`Generic`\ [\ :py:obj:`AWSLinkedServiceType`\ , :py:obj:`GraspIngressDatasetSettingsType`\ ]

   Tabular dataset object which identifies data within a data store,
   such as table/csv/json/parquet/parquetdataset/ and other documents.

   The input of the dataset is a pandas DataFrame.
   The output of the dataset is a pandas DataFrame.


   .. py:attribute:: linked_service
      :type: AWSLinkedServiceType


   .. py:attribute:: settings
      :type: GraspIngressDatasetSettingsType


   .. py:attribute:: serializer
      :type: ds_resource_plugin_py_lib.common.serde.serialize.AwsWranglerSerializer | None


   .. py:attribute:: deserializer
      :type: ds_resource_plugin_py_lib.common.serde.deserialize.AwsWranglerDeserializer | None


   .. py:property:: type
      :type: ds_provider_grasp_py_lib.enums.ResourceType

      Get the type of the dataset.


   .. py:method:: _get_s3_path(tenant_id: str, session_id: str) -> str

      Get the S3 path for the Grasp Ingress dataset.

      :returns: The S3 path for the Grasp Ingress dataset.
      :rtype: str


   .. py:method:: create() -> None

      Insert all rows in ``self.input`` into the target as a single atomic
      transaction. Must not delete, update, or overwrite existing data.

      :raises CreateError: If the operation fails.
      :raises NotSupportedError: If the provider does not support create.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``create()``


   .. py:method:: read() -> None

      Read data from the Grasp Ingress dataset.

      :raises ReadError: If the read operation fails, including when no files
          are found at the S3 path or when the S3 path is invalid.


   .. py:method:: delete() -> NoReturn

      Remove specific rows from the target matched by identity columns
      defined in ``self.settings``. Atomic. Idempotent.

      :raises DeleteError: If the operation fails.
      :raises NotSupportedError: If the provider does not support delete.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``delete()``


   .. py:method:: update() -> NoReturn

      Update existing rows in the target matched by identity columns
      defined in ``self.settings``. Atomic. Must not insert new rows.

      :raises UpdateError: If the operation fails.
      :raises NotSupportedError: If the provider does not support update.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``update()``


   .. py:method:: upsert() -> NoReturn

      Insert rows that do not exist, update rows that do, matched by
      identity columns defined in ``self.settings``. Atomic.

      :raises UpsertError: If the operation fails.
      :raises NotSupportedError: If the provider does not support upsert.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``upsert()``


   .. py:method:: rename() -> NoReturn

      Rename the resource in the backend. Atomic. Not idempotent.

      :raises RenameError: If the operation fails.
      :raises NotSupportedError: If the provider does not support renaming.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``rename()``


   .. py:method:: purge() -> NoReturn

      Remove all content from the target. ``self.input`` is not used.
      Atomic. Idempotent.

      :raises PurgeError: If the operation fails.
      :raises NotSupportedError: If the provider does not support purge.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``purge()``


   .. py:method:: list() -> NoReturn

      Discover available resources and populate ``self.output`` with a
      DataFrame of resources and their metadata. Idempotent.

      :raises ListError: If the operation fails.
      :raises NotSupportedError: If the provider does not support listing.

      .. seealso:: Full contract: ``docs/DATASET_CONTRACT.md`` -- ``list()``


   .. py:method:: close() -> None

      Close the dataset.


.. py:class:: GraspIngressDatasetSettings

   Bases: :py:obj:`ds_resource_plugin_py_lib.common.resource.dataset.DatasetSettings`

   Settings for Grasp Ingress dataset operations.
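
The calling pattern implied by the signatures above can be sketched as follows. This is a minimal illustration, not the library's implementation. ``StubCartDataset``, the two exception classes, and ``read_rows`` are hypothetical stand-ins defined here so the snippet runs without the real package, and a plain list replaces the pandas DataFrame that the real ``output`` holds. It assumes that ``read()`` returns ``None`` and leaves its result in ``output``, and that a ``NoReturn`` annotation (as on ``delete()``) means the method always raises rather than returning.

```python
from typing import NoReturn


class ReadError(Exception):
    """Hypothetical stand-in for the library's ReadError."""


class NotSupportedError(Exception):
    """Hypothetical stand-in for the library's NotSupportedError."""


class StubCartDataset:
    """Hypothetical stand-in that mirrors the documented method signatures."""

    def __init__(self, rows):
        self._rows = rows
        self.output = None
        self.closed = False

    def read(self) -> None:
        # Documented shape: read() returns None; the result lands in `output`.
        if not self._rows:
            raise ReadError("no files found at the S3 path")
        self.output = list(self._rows)

    def delete(self) -> NoReturn:
        # Assumption: NoReturn means the operation is unsupported and always raises.
        raise NotSupportedError("delete is not supported by this dataset")

    def close(self) -> None:
        self.closed = True


def read_rows(dataset):
    """Read the dataset, always close it, and return the rows (None on ReadError)."""
    try:
        dataset.read()
        return dataset.output
    except ReadError:
        return None
    finally:
        dataset.close()


populated = StubCartDataset([{"sku": "A1", "qty": 2}])
rows = read_rows(populated)

empty = StubCartDataset([])
missing = read_rows(empty)
```

Wrapping ``read()`` in ``try``/``finally`` keeps ``close()`` on every path, and callers that may invoke a ``NoReturn`` operation can catch ``NotSupportedError`` explicitly rather than expecting a return value.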