ds_stoa.fetch
=============

.. py:module:: ds_stoa.fetch

.. autoapi-nested-parse::

   This module is the entry point for fetching data from the GraspDP datalake.

   It exposes the `fetch` function, which retrieves data from the datalake
   through pre-signed URLs. The URLs are fetched in parallel, which can reduce
   total load time when many files are requested.

   `fetch` returns the combined data as a single pandas DataFrame, ready for
   analysis and manipulation.

   **Example usage**::

       from ds_stoa.fetch import fetch

       # Example pre-signed URLs (these would be provided by your data provider)
       pre_signed_urls = {
           "dataset1": "http://example.com/path/to/dataset1.parquet",
           "dataset2": "http://example.com/path/to/dataset2.parquet",
       }

       # Fetching data and loading it into a DataFrame
       dataframe = fetch(pre_signed_urls)
       print(dataframe)
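
   The parallel fetching described above can be pictured with a thread pool.
   The following is a minimal sketch under stated assumptions, not the
   implementation used by `fetch`: the names ``parallel_load`` and
   ``fetch_sketch``, the ``max_workers`` parameter, the use of
   ``pandas.read_parquet``, and the row-wise concatenation are all
   illustrative choices::

       from concurrent.futures import ThreadPoolExecutor
       from typing import Callable, Dict, TypeVar

       T = TypeVar("T")


       def parallel_load(
           urls: Dict[str, str],
           loader: Callable[[str], T],
           max_workers: int = 8,
       ) -> Dict[str, T]:
           """Call ``loader`` on every URL concurrently, keeping the dict keys."""
           if not urls:
               return {}
           workers = min(max_workers, len(urls))
           with ThreadPoolExecutor(max_workers=workers) as pool:
               # pool.map yields results in input order, so zip restores the keys
               results = pool.map(loader, urls.values())
               return dict(zip(urls.keys(), results))


       def fetch_sketch(urls: Dict[str, str]):
           """Hypothetical stand-in for fetch: load Parquet files, then concatenate."""
           # Imported lazily so parallel_load itself only needs the standard library
           import pandas as pd

           frames = parallel_load(urls, pd.read_parquet)
           # Assumes all files share a schema and that at least one URL was given
           return pd.concat(list(frames.values()), ignore_index=True)

   Threads suit this workload because downloads are I/O-bound. Passing the
   loader as an argument also lets the parallel step be exercised without
   network access, for example with a function that returns canned data.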


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/ds_stoa/fetch/_fetch/index


Functions
---------

.. autoapisummary::

   ds_stoa.fetch.fetch


Package Contents
----------------

.. py:function:: fetch(pre_signed_urls: Dict[str, str]) -> pandas.DataFrame

   Fetch data from a collection of pre-signed URLs in parallel and
   consolidate it into a single DataFrame.

   :param pre_signed_urls: A dictionary where keys are identifiers and values are pre-signed URLs.
   :type pre_signed_urls: Dict[str, str]
   :return: A consolidated Pandas DataFrame containing data from all fetched URLs.
   :rtype: pandas.DataFrame

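   **Example**::

       from ds_stoa.fetch import fetch

       # Map an identifier of your choice to each pre-signed URL
       pre_signed_urls = {
           "file1": "http://example.com/data1.parquet",
           "file2": "http://example.com/data2.parquet",
       }
       # Fetch both files in parallel and combine them into one DataFrame
       dataframe = fetch(pre_signed_urls)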
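
   Because ``pre_signed_urls`` must map string identifiers to URL strings, it
   can help to check the mapping before calling `fetch`. The helper below is a
   hypothetical sketch and not part of ``ds_stoa``; the name
   ``check_pre_signed_urls`` and the restriction to ``http``/``https`` schemes
   are assumptions made for illustration::

       from typing import Dict
       from urllib.parse import urlparse


       def check_pre_signed_urls(pre_signed_urls: Dict[str, str]) -> None:
           """Raise ValueError unless every entry maps a str key to an http(s) URL."""
           for key, url in pre_signed_urls.items():
               if not isinstance(key, str) or not isinstance(url, str):
                   raise ValueError(f"entry {key!r}: key and value must both be str")
               parts = urlparse(url)
               # A usable URL needs both a supported scheme and a host
               if parts.scheme not in ("http", "https") or not parts.netloc:
                   raise ValueError(f"entry {key!r} does not map to an http(s) URL")

   The error messages name only the key. Pre-signed URLs typically carry a
   signature in the query string, so keeping them out of logs and tracebacks is
   a reasonable precaution.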