ds_provider_grasp_py_lib.dataset.ingress
File: ingress.py
Region: ds_provider_grasp_py_lib/dataset/ingress
Grasp Ingress Dataset
This module implements a dataset for Grasp Ingress.
Example
>>> import uuid
>>> dataset = GraspIngressDataset(
...     id=uuid.uuid4(),
...     name="ingress-dataset",
...     version="1.0.0",
...     deserializer=PandasDeserializer(format=DatasetStorageFormatType.JSON),
...     serializer=PandasSerializer(format=DatasetStorageFormatType.JSON),
...     settings=GraspIngressDatasetSettings(
...         owner_id="owner_id",
...         product_group_name="product_group_name",
...         product_name="product_name",
...         version="version",
...         include_history=True,
...     ),
...     linked_service=GraspAwsLinkedService(
...         id=uuid.uuid4(),
...         name="aws-linked-service",
...         version="1.0.0",
...         settings=GraspAwsLinkedServiceSettings(
...             access_key_id="access_key_id",
...             access_key_secret="access_key_secret",
...             region="region",
...         ),
...     ),
... )
>>> dataset.read()
>>> data = dataset.output
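Attributes
logger |
GraspIngressDatasetSettingsType |
AWSLinkedServiceType |
Classes
GraspIngressDatasetSettings | Settings for Grasp Ingress dataset operations.
GraspIngressDataset | Tabular dataset object which identifies data within a data store, such as table, csv, json, parquet, parquetdataset, and other documents.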
Module Contents
- ds_provider_grasp_py_lib.dataset.ingress.logger
- class ds_provider_grasp_py_lib.dataset.ingress.GraspIngressDatasetSettings
Bases: ds_resource_plugin_py_lib.common.resource.dataset.DatasetSettings
Settings for Grasp Ingress dataset operations.
- ds_provider_grasp_py_lib.dataset.ingress.GraspIngressDatasetSettingsType
- ds_provider_grasp_py_lib.dataset.ingress.AWSLinkedServiceType
- class ds_provider_grasp_py_lib.dataset.ingress.GraspIngressDataset
Bases: ds_resource_plugin_py_lib.common.resource.dataset.TabularDataset[AWSLinkedServiceType, GraspIngressDatasetSettingsType, ds_resource_plugin_py_lib.common.serde.serialize.AwsWranglerSerializer, ds_resource_plugin_py_lib.common.serde.deserialize.AwsWranglerDeserializer], Generic[AWSLinkedServiceType, GraspIngressDatasetSettingsType]
Tabular dataset object which identifies data within a data store, such as table, csv, json, parquet, parquetdataset, and other documents.
Both the input and the output of the dataset are pandas DataFrames.
- linked_service: AWSLinkedServiceType
- settings: GraspIngressDatasetSettingsType
- serializer: ds_resource_plugin_py_lib.common.serde.serialize.AwsWranglerSerializer | None
- deserializer: ds_resource_plugin_py_lib.common.serde.deserialize.AwsWranglerDeserializer | None
- property type: ds_provider_grasp_py_lib.enums.ResourceType
Get the type of the dataset.
- _get_s3_path(tenant_id: str, session_id: str) -> str
Get the S3 path for the Grasp Ingress dataset.
- Returns:
The S3 path for the Grasp Ingress dataset.
- Return type:
str
- create() -> None
Insert all rows in self.input into the target as a single atomic transaction. Must not delete, update, or overwrite existing data.
- Raises:
CreateError – If the operation fails.
NotSupportedError – If the provider does not support create.
See also
Full contract: docs/DATASET_CONTRACT.md – create()
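The create() contract (append only, all or nothing) can be illustrated with a small in-memory sketch. This is not the provider's implementation: plain dictionaries stand in for DataFrame rows so the sketch has no dependencies, and `create_rows`, the `key` parameter, and the choice to reject a batch whose keys already exist are assumptions made for illustration.

```python
from typing import Any

Row = dict[str, Any]


def create_rows(target: list[Row], rows: list[Row], key: str) -> list[Row]:
    """Append `rows` to a copy of `target`, or raise and leave `target` untouched."""
    existing = {row[key] for row in target}
    clashes = [row[key] for row in rows if row[key] in existing]
    if clashes:
        # Assumption: a clashing key rejects the whole batch, so nothing is overwritten.
        raise ValueError(f"keys already exist: {clashes}")
    # The result is built in one step: callers see every new row or none of them.
    return target + rows


target = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]
new_rows = [{"id": 3, "value": "c"}, {"id": 4, "value": "d"}]
result = create_rows(target, new_rows, key="id")
```

Returning a new list rather than mutating `target` is one simple way to keep the operation all or nothing in memory; a real backend would need its own transaction mechanism.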
- read() -> None
Read data from the Grasp Ingress dataset.
- Raises:
ReadError – If the read operation fails, including when no files are found at the S3 path or when the S3 path is invalid.
- delete() -> NoReturn
Remove specific rows from the target matched by identity columns defined in self.settings. Atomic. Idempotent.
- Raises:
DeleteError – If the operation fails.
NotSupportedError – If the provider does not support delete.
See also
Full contract: docs/DATASET_CONTRACT.md – delete()
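Idempotency in the delete() contract means that repeating the same delete leaves the target unchanged. A minimal sketch of that property, using plain dictionaries in place of DataFrame rows; `delete_rows` and the single `key` identity column are hypothetical names, not part of this library.

```python
from typing import Any

Row = dict[str, Any]


def delete_rows(target: list[Row], rows: list[Row], key: str) -> list[Row]:
    """Return `target` without the rows whose identity value appears in `rows`."""
    doomed = {row[key] for row in rows}
    # Identity values that are already absent are simply ignored,
    # which is what makes a repeated call a no-op.
    return [row for row in target if row[key] not in doomed]


target = [{"id": 1}, {"id": 2}, {"id": 3}]
once = delete_rows(target, [{"id": 2}], key="id")
twice = delete_rows(once, [{"id": 2}], key="id")
```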
- update() -> NoReturn
Update existing rows in the target matched by identity columns defined in self.settings. Atomic. Must not insert new rows.
- Raises:
UpdateError – If the operation fails.
NotSupportedError – If the provider does not support update.
See also
Full contract: docs/DATASET_CONTRACT.md – update()
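The "must not insert new rows" rule of update() can be sketched in memory as follows. The helper `update_rows` and the `key` parameter are hypothetical, and silently ignoring input rows with no match is an assumption; a real provider might raise instead.

```python
from typing import Any

Row = dict[str, Any]


def update_rows(target: list[Row], rows: list[Row], key: str) -> list[Row]:
    """Overlay matching rows onto `target`; unmatched input rows are dropped."""
    changes = {row[key]: row for row in rows}
    # Iterating over `target` only guarantees that no new identity can appear.
    return [
        {**row, **changes[row[key]]} if row[key] in changes else row
        for row in target
    ]


target = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]
changes = [{"id": 2, "value": "B"}, {"id": 9, "value": "new"}]
updated = update_rows(target, changes, key="id")
```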
- upsert() -> NoReturn
Insert rows that do not exist, update rows that do, matched by identity columns defined in self.settings. Atomic.
- Raises:
UpsertError – If the operation fails.
NotSupportedError – If the provider does not support upsert.
See also
Full contract: docs/DATASET_CONTRACT.md – upsert()
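Upsert semantics combine the two previous operations: matched rows are updated and unmatched rows are inserted. A dependency-free sketch under the same assumptions as before (dictionaries instead of DataFrame rows, a hypothetical `upsert_rows` helper, and a single `key` identity column):

```python
from typing import Any

Row = dict[str, Any]


def upsert_rows(target: list[Row], rows: list[Row], key: str) -> list[Row]:
    """Update rows whose identity exists in `target`, insert the rest."""
    merged = {row[key]: row for row in target}
    for row in rows:
        # Existing identity: overlay the new values. New identity: add the row.
        merged[row[key]] = {**merged.get(row[key], {}), **row}
    # dict preserves insertion order, so existing rows keep their position.
    return list(merged.values())


target = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]
incoming = [{"id": 2, "value": "B"}, {"id": 3, "value": "c"}]
upserted = upsert_rows(target, incoming, key="id")
```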
- rename() -> NoReturn
Rename the resource in the backend. Atomic. Not idempotent.
- Raises:
RenameError – If the operation fails.
NotSupportedError – If the provider does not support renaming.
See also
Full contract: docs/DATASET_CONTRACT.md – rename()
- purge() -> NoReturn
Remove all content from the target. self.input is not used. Atomic. Idempotent.
- Raises:
PurgeError – If the operation fails.
NotSupportedError – If the provider does not support purge.
See also
Full contract: docs/DATASET_CONTRACT.md – purge()
- list() -> NoReturn
Discover available resources and populate self.output with a DataFrame of resources and their metadata. Idempotent.
- Raises:
ListError – If the operation fails.
NotSupportedError – If the provider does not support listing.
See also
Full contract: docs/DATASET_CONTRACT.md – list()
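Several methods on this class are annotated NoReturn, which in Python typing means the call never returns normally. A plausible reading, not confirmed by this page, is that those methods always raise NotSupportedError for this provider. The sketch below shows how a caller might guard such a call; `NotSupportedError` and `ReadOnlyDataset` are local stand-ins defined here, since the real exception's import path is not shown above.

```python
from typing import NoReturn


class NotSupportedError(Exception):
    """Hypothetical stand-in for the library's exception of the same name."""


class ReadOnlyDataset:
    """Hypothetical dataset that supports reading only."""

    def purge(self) -> NoReturn:
        # NoReturn tells type checkers that this call always raises.
        raise NotSupportedError("purge is not supported by this dataset")


def try_purge(dataset: ReadOnlyDataset) -> str:
    """Attempt a purge and report why it was skipped."""
    try:
        dataset.purge()
    except NotSupportedError as exc:
        return f"skipped: {exc}"


message = try_purge(ReadOnlyDataset())
```

Catching the specific exception rather than a bare Exception keeps genuine failures, such as a PurgeError from a provider that does support the operation, visible to the caller.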