ISOBase¶
Abstract base class for all ISO client implementations. ISOBase uses PEP 695 generics (ISOBase[RawT: StrEnum, ProcessedT: StrEnum]) so that each concrete ISO client is statically typed over its own raw and processed data-type enums.
Responsibilities¶
- Defines the five abstract methods that every ISO must implement: `download_raw_data`, `upload_raw_data`, `download_processed_data`, `upload_processed_data`, and `process_raw_data`.
- Provides concrete S3 key-building hooks (`_raw_s3_key`, `_processed_s3_key`, `_hive_output_path`) that produce the standardized hive-partitioned layout.
- Exposes timezone-aware `now()` and `today()` convenience methods.
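The timezone-aware convenience methods can be sketched as below. This is an assumption about their shape, not progridpy's implementation; in particular, the timezone name is a placeholder, since each ISO would use its own local zone.

```python
from datetime import date, datetime
from zoneinfo import ZoneInfo

# Illustrative sketch of timezone-aware now()/today() helpers.
# The zone name here is an assumption; a real client would pin
# its ISO's local timezone.
class TZAwareMixin:
    tz = ZoneInfo("America/New_York")

    def now(self) -> datetime:
        # Aware datetime: tzinfo is always set, never naive.
        return datetime.now(self.tz)

    def today(self) -> date:
        # Derive "today" from the aware clock, not the system-local one.
        return self.now().date()
```

Deriving `today()` from `now()` keeps the two consistent: both answer "what time/date is it in the ISO's zone," which can differ from the machine's local date near midnight.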
Hive Path Convention¶
Processed data is written to the S3 key produced by `_processed_s3_key`; raw data follows the key produced by `_raw_s3_key`. Both use the standardized hive-partitioned layout.
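For intuition, a hive-partitioned key generally encodes partition columns as `name=value` path segments. The sketch below is purely illustrative: the partition names and ordering are assumptions about a typical hive convention, not the layout `_raw_s3_key` / `_processed_s3_key` actually produce.

```python
from datetime import date

# Hypothetical hive-style key builder; segment names (iso=, data_type=,
# year=, month=, day=) are assumptions, not progridpy's actual scheme.
def hive_key(iso: str, data_type: str, d: date) -> str:
    return (
        f"iso={iso}/data_type={data_type}/"
        f"year={d.year}/month={d.month:02d}/day={d.day:02d}"
    )
```

Zero-padding month and day keeps keys lexically sortable, which matters for S3 listing and range scans.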
Classes¶
ISOBase
¶
Bases: ABC
Source code in src/progridpy/iso/base.py
Functions¶
download_raw_data
abstractmethod
¶
download_raw_data(start_date: str | datetime | None = None, end_date: str | datetime | None = None, data_types: RawT | list[RawT] | None = None, download_src: str | FileLocation = ISO, output_dir: str | Path | None = None, overwrite: bool = False, verbose: bool = False) -> list[Path]
Download raw data, either from the S3 bucket or directly from the ISO (per `download_src`). The start and end dates are inclusive.
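The inclusive date-range contract can be illustrated with a small helper. This is a hypothetical utility, not part of the library's API:

```python
from datetime import datetime, timedelta

def dates_between(start_date: str, end_date: str) -> list[str]:
    """Enumerate ISO-format dates, inclusive on BOTH ends,
    mirroring the contract documented above."""
    start = datetime.fromisoformat(start_date)
    end = datetime.fromisoformat(end_date)
    # +1 makes the end date inclusive.
    return [
        (start + timedelta(days=i)).strftime("%Y-%m-%d")
        for i in range((end - start).days + 1)
    ]
```

So a call spanning Jan 1 through Jan 3 covers three days of data, not two.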
upload_raw_data
abstractmethod
¶
upload_raw_data(start_date: str | datetime | None = None, end_date: str | datetime | None = None, data_types: RawT | list[RawT] | None = None, input_dir: str | Path | None = None, overwrite: bool = False, verbose: bool = False) -> list[str]
Upload raw data files to S3 bucket.
download_processed_data
abstractmethod
¶
download_processed_data(start_date: str | datetime | None = None, end_date: str | datetime | None = None, data_types: ProcessedT | list[ProcessedT] | None = None, output_dir: str | Path | None = None, overwrite: bool = False, verbose: bool = False) -> list[Path]
Download processed data from S3 bucket for the specified date range.
upload_processed_data
abstractmethod
¶
upload_processed_data(start_date: str | datetime | None = None, end_date: str | datetime | None = None, data_types: ProcessedT | list[ProcessedT] | None = None, input_dir: str | Path | None = None, overwrite: bool = False, verbose: bool = False) -> list[str]
Upload processed data files to S3 bucket.
process_raw_data
abstractmethod
¶
process_raw_data(start_date: str | datetime | None = None, end_date: str | datetime | None = None, data_types: ProcessedT | list[ProcessedT] | None = None, input_dir: str | Path | None = None, output_dir: str | Path | None = None, file_format: Literal['parquet', 'csv'] = 'parquet', overwrite: bool = False, verbose: bool = False) -> list[Path]
Process the downloaded raw data into a standardized format.
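One detail implementations must handle is the `file_format` parameter. The sketch below shows how an output path might be derived from it; the function name and layout are assumptions for illustration, not the library's logic.

```python
from pathlib import Path
from typing import Literal

# Hypothetical helper: choose the output file suffix from the
# file_format parameter documented in the signature above.
def processed_output_path(
    output_dir: Path,
    stem: str,
    file_format: Literal["parquet", "csv"] = "parquet",
) -> Path:
    # Literal[...] lets a type checker reject unsupported formats
    # before any file is written.
    return output_dir / f"{stem}.{file_format}"
```

Constraining `file_format` with `Literal` (as the real signature does) turns an invalid format into a static type error rather than a runtime surprise.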