ds_protocol_graphql_py_lib.dataset
File: __init__.py
Region: ds_protocol_graphql_py_lib/dataset
Package initialization for GraphQL dataset module.
Example
>>> from ds_protocol_graphql_py_lib.dataset import GraphqlDataset, GraphqlDatasetSettings
>>> from ds_protocol_http_py_lib import HttpLinkedService, HttpLinkedServiceSettings
>>> linked_service = HttpLinkedService(
...     settings=HttpLinkedServiceSettings(host="https://api.example.graphql/graphql"),
...     id="service-id",
...     name="graphql_service",
...     version="1.0.0",
... )
>>> dataset = GraphqlDataset(
...     linked_service=linked_service,
...     settings=GraphqlDatasetSettings(url="https://api.example.graphql/graphql"),
...     id="dataset-id",
...     name="graphql_dataset",
...     version="1.0.0",
... )
Classes
GraphqlDataset | Represents a GraphQL dataset.
GraphqlDatasetSettings | The object containing the settings of the dataset.
GraphqlReadSettings | Settings specific to reading data from GraphQL API.
Package Contents
- class ds_protocol_graphql_py_lib.dataset.GraphqlDataset
Bases:
ds_resource_plugin_py_lib.common.resource.dataset.TabularDataset[ds_protocol_http_py_lib.dataset.http.HttpLinkedServiceType, GraphqlDatasetSettingsType, ds_resource_plugin_py_lib.common.serde.serialize.PandasSerializer, ds_protocol_graphql_py_lib.serde.deserializer.GraphqlDeserializer], Generic[ds_protocol_http_py_lib.dataset.http.HttpLinkedServiceType, GraphqlDatasetSettingsType]
Represents a GraphQL dataset.
- settings: GraphqlDatasetSettingsType
- linked_service: ds_protocol_http_py_lib.dataset.http.HttpLinkedServiceType
- deserializer: ds_protocol_graphql_py_lib.serde.deserializer.GraphqlDeserializer | None
- property type: ds_protocol_graphql_py_lib.enums.ResourceType
Get the type of the dataset.
- property supports_checkpoint: bool
Indicate whether this provider supports incremental loads via checkpointing.
The GraphQL provider does not yet support checkpoint-based incremental loads; every read is a full load.
- Returns:
False, indicating checkpointing is not supported.
- read() → None
Read the GraphQL dataset.
Sends a GraphQL query to the endpoint with the query, variables, and operation name specified in settings.read. Populates self.output with the result as a DataFrame.
Handles various GraphQL response patterns via GraphqlDeserializer:
  - Direct arrays: {"data": {"users": [...]}}
  - Relay connections: {"data": {"users": {"edges": [{"node": {...}}]}}}
  - Single objects: {"data": {"user": {...}}}
- Returns:
None. The result is stored in self.output as a DataFrame.
- Raises:
ConnectionError – If the linked service connection is not initialized.
ReadError – If read settings are not provided or if the GraphQL query fails.
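The three response shapes listed for read() can be normalized into rows in a few lines. The following is a minimal sketch only, not the actual GraphqlDeserializer implementation: the function name rows_from_response is hypothetical, and it assumes that only the first root field under "data" is of interest.

```python
from typing import Any


def rows_from_response(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Normalize the first root field under "data" into a list of row dicts.

    Hypothetical helper for illustration; not part of the library.
    """
    data = payload.get("data") or {}
    if not data:
        return []
    # Take the first root field, e.g. "users" or "user".
    root = next(iter(data.values()))
    if isinstance(root, list):
        # Direct array: each element is already a row.
        return [row for row in root if isinstance(row, dict)]
    if isinstance(root, dict) and isinstance(root.get("edges"), list):
        # Relay connection: unwrap the "node" of every edge.
        return [
            edge["node"]
            for edge in root["edges"]
            if isinstance(edge, dict) and "node" in edge
        ]
    if isinstance(root, dict):
        # Single object: wrap it so the result is always a list of rows.
        return [root]
    return []


direct = rows_from_response({"data": {"users": [{"id": 1}, {"id": 2}]}})
relay = rows_from_response(
    {"data": {"users": {"edges": [{"node": {"id": 1}}, {"node": {"id": 2}}]}}}
)
single = rows_from_response({"data": {"user": {"id": 7, "name": "Ada"}}})
```

A list of row dicts like this can be passed directly to pandas.DataFrame to obtain a tabular result.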
- create() → None
Create new rows in the GraphQL endpoint using mutations.
Sends all rows in a single atomic GraphQL mutation request. Populates self.output with the created rows.
- Returns:
None. The result is stored in self.output as a DataFrame.
- Raises:
ConnectionError – If the linked service connection is not initialized.
CreateError – If create settings are not provided or if the GraphQL mutation fails.
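One common way to send many rows in a single mutation request is to pass them as one list-valued variable. The sketch below illustrates that idea under stated assumptions: the helper build_create_body, the mutation field createUsers, the input type UserInput, and the variable name rows are all hypothetical, and the library may build its request differently.

```python
import json
from typing import Any


def build_create_body(
    mutation: str,
    rows: list[dict[str, Any]],
    variable_name: str = "rows",
) -> str:
    """Serialize one request body that carries every row as a single variable.

    Hypothetical helper for illustration; not part of the library.
    """
    return json.dumps({"query": mutation, "variables": {variable_name: rows}})


# Hypothetical mutation document; the schema names are assumptions.
CREATE_USERS = (
    "mutation CreateUsers($rows: [UserInput!]!) "
    "{ createUsers(input: $rows) { id name } }"
)

new_rows = [{"name": "Ada"}, {"name": "Grace"}]
create_body = build_create_body(CREATE_USERS, new_rows)
```

Whether such a request is applied atomically depends on the server's implementation of the mutation.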
- delete() → None
Delete specific rows from the GraphQL endpoint using mutations.
Sends all rows in a single atomic GraphQL mutation request. Populates self.output with the deleted rows.
- Returns:
None. The result is stored in self.output as a DataFrame.
- Raises:
ConnectionError – If the linked service connection is not initialized.
DeleteError – If delete settings are not provided, if identity columns are missing, or if the GraphQL mutation fails.
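The identity-column check mentioned for delete() can be pictured as follows. This is a sketch that assumes the identity columns are the names configured in primary_keys, which the source does not confirm; the helper identity_rows is hypothetical, and it raises a plain ValueError where the library would raise DeleteError.

```python
from typing import Any


def identity_rows(
    rows: list[dict[str, Any]],
    primary_keys: list[str],
) -> list[dict[str, Any]]:
    """Keep only the identity columns of each row, failing if any is absent.

    Hypothetical helper for illustration; not part of the library.
    """
    if not primary_keys:
        raise ValueError("no identity columns configured")
    selected = []
    for index, row in enumerate(rows):
        missing = [key for key in primary_keys if key not in row]
        if missing:
            raise ValueError(f"row {index} is missing identity columns: {missing}")
        selected.append({key: row[key] for key in primary_keys})
    return selected


to_delete = identity_rows(
    [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}],
    primary_keys=["id"],
)
```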
- list() → None
Discover available resources in the GraphQL schema via introspection.
Executes a GraphQL introspection query to fetch all available queries and their arguments from the schema. Populates self.output with a DataFrame containing resource metadata (name, type, description, etc.).
- Returns:
None. The result is stored in self.output as a DataFrame.
- Raises:
ConnectionError – If the linked service connection is not initialized.
ListError – If the GraphQL introspection query fails.
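To make the introspection step concrete, the sketch below shows a trimmed-down introspection query and one way to turn its response into resource metadata rows. The query text, the function name resources_from_introspection, and the output column names are assumptions for illustration; the query the library actually sends, and the columns it produces, may differ.

```python
from typing import Any

# Trimmed-down introspection query using standard GraphQL introspection fields.
# Illustrative only; not necessarily the query the library sends.
INTROSPECTION_QUERY = """
query IntrospectQueries {
  __schema {
    queryType {
      fields {
        name
        description
        type { kind name }
        args { name type { kind name } }
      }
    }
  }
}
"""


def resources_from_introspection(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn an introspection response into one metadata row per root query field.

    Hypothetical helper for illustration; not part of the library.
    """
    schema = (payload.get("data") or {}).get("__schema") or {}
    fields = (schema.get("queryType") or {}).get("fields") or []
    return [
        {
            "name": field["name"],
            # "name" is null for LIST and NON_NULL wrapper types.
            "type": (field.get("type") or {}).get("name"),
            "description": field.get("description"),
            "arguments": [arg["name"] for arg in (field.get("args") or [])],
        }
        for field in fields
    ]


sample_response = {
    "data": {
        "__schema": {
            "queryType": {
                "fields": [
                    {
                        "name": "user",
                        "description": "Fetch one user.",
                        "type": {"kind": "OBJECT", "name": "User"},
                        "args": [{"name": "id", "type": {"kind": "NON_NULL", "name": None}}],
                    },
                    {
                        "name": "users",
                        "description": None,
                        "type": {"kind": "LIST", "name": None},
                        "args": [],
                    },
                ]
            }
        }
    }
}
resources = resources_from_introspection(sample_response)
```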
- class ds_protocol_graphql_py_lib.dataset.GraphqlDatasetSettings
Bases:
ds_resource_plugin_py_lib.common.resource.dataset.DatasetSettings
The object containing the settings of the dataset.
- url: str
The URL of the GraphQL endpoint to connect to. This is the base URL where the GraphQL API is hosted.
- primary_keys: list[str] | None = None
Optional list of column names that serve as primary keys for the dataset. This can be used for operations that require unique identification of rows.
- headers: dict[str, str] | None = None
Optional HTTP headers to include in requests to the GraphQL endpoint, such as authentication tokens or content type.
- read: GraphqlReadSettings | None = None
Settings for read operations.
- delete: GraphqlDeleteSettings | None = None
Settings for delete operations.
- create: GraphqlCreateSettings | None = None
Settings for create operations.
- class ds_protocol_graphql_py_lib.dataset.GraphqlReadSettings
Bases:
ds_common_serde_py_lib.Serializable
Settings specific to reading data from GraphQL API.
- query: str
The GraphQL query string to execute for reading data. This should be a valid GraphQL query that the endpoint can execute to return the desired data. For example: "{ users { id name email } }"
- variables: dict[str, Any] | None = None
Optional variables to include with the GraphQL query.
- operation_name: str | None = None
Optional operation name for the GraphQL query, used when the query contains multiple operations.
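The three read settings correspond to the usual JSON body of a GraphQL request over HTTP, which uses the keys query, variables, and operationName. The sketch below shows that mapping with a hypothetical helper, build_request_body. It assumes the library posts a body of this conventional shape, which the source does not state explicitly.

```python
import json
from typing import Any, Optional


def build_request_body(
    query: str,
    variables: Optional[dict[str, Any]] = None,
    operation_name: Optional[str] = None,
) -> str:
    """Serialize read settings into a conventional GraphQL-over-HTTP JSON body.

    Hypothetical helper for illustration; not part of the library.
    """
    body: dict[str, Any] = {"query": query}
    if variables is not None:
        body["variables"] = variables
    if operation_name is not None:
        # Needed when the document defines more than one operation.
        body["operationName"] = operation_name
    return json.dumps(body)


# A document with two operations; operation_name selects which one runs.
TWO_OPERATIONS = (
    "query ListUsers { users { id name } } "
    "query GetUser($id: ID!) { user(id: $id) { id name email } }"
)
request_body = build_request_body(TWO_OPERATIONS, {"id": "42"}, "GetUser")
simple_body = build_request_body("{ users { id name email } }")
```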