S3 Handler¶
Concurrent S3 transfer handler built on boto3. Provides thread-pool-based parallel uploads and downloads with Rich progress bars, multipart transfer support, and server-side encryption.
Key Components¶
S3Handler -- Context-managed class that creates a boto3 session and S3 client on entry. Supports concurrent upload/download of S3ObjectRef lists, key-existence checks with configurable batch thresholds, and verbose logging via Loguru.
S3TransferConfig -- Frozen dataclass controlling region, pool size, concurrency, multipart thresholds, retry attempts, and encryption settings.
S3ObjectRef -- Frozen dataclass pairing an S3 key with a local Path.
extract_date_from_hive_path -- Parses year=YYYY/month=MM/day=DD segments from a hive-partitioned S3 key and returns a datetime.
Classes¶
S3Handler
¶
Source code in src/progridpy/aws/s3.py
Functions¶
download_objects
¶
download_objects(bucket: str, objects: list[S3ObjectRef], overwrite: bool = False, description: str | None = None) -> tuple[list[Path], list[Path], list[Path]]
Download S3 objects to local paths.
Returns: (downloaded, skipped, failed) paths.
Source code in src/progridpy/aws/s3.py
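The (downloaded, skipped, failed) return shape implies that existing local files are skipped unless overwrite=True. A sketch of that skip logic, using a stand-in ObjectRef dataclass (the real S3ObjectRef is defined in src/progridpy/aws/s3.py; plan_downloads is a hypothetical name, not part of the library):

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class ObjectRef:
    """Stand-in for S3ObjectRef: an S3 key paired with a local Path."""
    key: str
    local_path: Path

def plan_downloads(
    objects: list[ObjectRef], overwrite: bool = False
) -> tuple[list[ObjectRef], list[Path]]:
    """Split refs into (to_download, skipped): with overwrite=False,
    refs whose local file already exists are skipped."""
    to_download: list[ObjectRef] = []
    skipped: list[Path] = []
    for ref in objects:
        if ref.local_path.exists() and not overwrite:
            skipped.append(ref.local_path)
        else:
            to_download.append(ref)
    return to_download, skipped
```

In the real API the handler performs this check per object and downloads the remainder concurrently, collecting any transfer errors into the third (failed) list.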
upload_objects
¶
upload_objects(bucket: str, objects: list[S3ObjectRef], overwrite: bool = False, description: str | None = None) -> tuple[list[str], list[str], list[Path]]
Upload local files to S3.
Returns: (uploaded_keys, skipped_keys, failed_local_paths).
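Callers typically build the objects list by pairing local files with keys under a prefix. A minimal sketch of that pairing, using (key, path) tuples as stand-ins for S3ObjectRef (refs_for_upload is a hypothetical helper, not part of the library):

```python
from pathlib import Path

def refs_for_upload(root: Path, prefix: str) -> list[tuple[str, Path]]:
    """Pair every file under `root` with an S3 key under `prefix`,
    preserving the relative directory layout."""
    return [
        (f"{prefix.rstrip('/')}/{path.relative_to(root).as_posix()}", path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    ]
```

Each (key, path) pair would become an S3ObjectRef before being passed to upload_objects; keys already present in the bucket come back in skipped_keys when overwrite=False.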