ds_stoa.fetch
=============

.. py:module:: ds_stoa.fetch

.. autoapi-nested-parse::

   This module serves as the entry point for the data fetching functionality
   from the GraspDP datalake. It exposes the `fetch` function, which is
   designed to retrieve data from the datalake using pre-signed URLs. This
   function is capable of fetching data in parallel, significantly improving
   performance for large datasets.

   The `fetch` function returns the data as a Pandas DataFrame, making it
   immediately useful for data analysis and manipulation tasks.

   **Example usage**::

       from ds_stoa.fetch import fetch

       # Example pre-signed URLs (these would be provided by your data provider)
       pre_signed_urls = {
           "dataset1": "http://example.com/path/to/dataset1.parquet",
           "dataset2": "http://example.com/path/to/dataset2.parquet",
       }

       # Fetching data and loading it into a DataFrame
       dataframe = fetch(pre_signed_urls)
       print(dataframe)


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/ds_stoa/fetch/_fetch/index


Functions
---------

.. autoapisummary::

   ds_stoa.fetch.fetch


Package Contents
----------------

.. py:function:: fetch(pre_signed_urls: Dict) -> pandas.DataFrame

   Fetch data from a collection of pre-signed URLs in parallel and consolidate
   into a single DataFrame.

   :param pre_signed_urls: A dictionary where keys are identifiers and values
                           are pre-signed URLs.
   :type pre_signed_urls: Dict[str, str]

   :return: A consolidated Pandas DataFrame containing data from all fetched
            URLs.
   :rtype: pd.DataFrame

   **Example**::

       pre_signed_urls = {
           "file1": "http://example.com/data1.parquet",
           "file2": "http://example.com/data2.parquet",
       }
       dataframe = fetch(pre_signed_urls)
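
The parallel fetch-and-consolidate behaviour described for `fetch` can be pictured with a small standard-library sketch. This is not the `ds_stoa` implementation, which this page does not show. The names `fetch_all`, `load_one` and `fake_loader` are hypothetical, the thread pool is an assumed concurrency mechanism, and plain lists of row dictionaries stand in for DataFrames so the sketch runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Row = Dict[str, object]


def fetch_all(
    urls: Dict[str, str],
    load_one: Callable[[str], List[Row]],
    max_workers: int = 4,
) -> List[Row]:
    """Load every URL concurrently and concatenate the rows in key order.

    Hypothetical helper for illustration only; not part of ds_stoa.
    """
    if not urls:
        return []
    keys = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, regardless of
        # which load finishes first.
        parts = list(pool.map(load_one, (urls[k] for k in keys)))
    # Tag each row with the identifier it came from, then flatten.
    return [
        dict(row, source=key)
        for key, rows in zip(keys, parts)
        for row in rows
    ]


def fake_loader(url: str) -> List[Row]:
    """Stand-in for a real download: returns one row describing the URL."""
    return [{"url": url}]


pre_signed_urls = {
    "file1": "http://example.com/data1.parquet",
    "file2": "http://example.com/data2.parquet",
}
rows = fetch_all(pre_signed_urls, fake_loader)
```

In a DataFrame-based variant of this sketch, `load_one` would return a DataFrame per URL and the final flattening step would become a single `pandas.concat` call over the collected parts.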