ISO Module¶
The `progridpy.iso` package implements the data download, upload, and processing pipeline for US Independent System Operators (ISOs). Each ISO is a sub-package with a consistent internal structure: a client, types, and a registry.
Architecture¶
All ISO clients inherit from `ISOBase[RawT, ProcessedT]`, a PEP 695 generic abstract class parameterized over two `StrEnum` types: one for raw data categories and one for processed granularity levels.
Implementations¶
| ISO | Client Class | Raw Types | Processed Granularity | Timezone Handling |
|---|---|---|---|---|
| SPP | `SPP` | `SPPRawDataType` (9 members) | `NODAL`, `SYSTEM`, `ZONAL` | UTC canonical; local derived |
| MISO | `MISO` | `MISORawDataType` (18 members) | `NODAL`, `SYSTEM`, `REGIONAL` | Fixed EST market time |
| ERCOT | `ERCOT` | `ERCOTRawDataType` (7 members) | `NODAL`, `SYSTEM`, `WEATHER_ZONE`, `LOAD_ZONE` | Local + DSTFlag; UTC derived |
Internal Structure per ISO¶
Each ISO sub-package contains:
- `client.py` -- The `ISOBase` subclass implementing `download_raw_data`, `upload_raw_data`, `download_processed_data`, `upload_processed_data`, and `process_raw_data`.
- `types.py` -- `StrEnum` classes for raw and processed data types, reader types, join modes, processing bindings, and the ISO-specific `DataDefinition` subclass.
- `registry.py` -- Concrete `DataRegistry` subclasses mapping each data-type enum to its `DataDefinition`, including file metadata and processing bindings.
Registry-Driven Processing¶
Processing is declarative. Each raw data type carries `ProcessingBinding` tuples that declare which processed dataset it feeds, the join mode to use against the scaffold DataFrame, and which output columns it contributes. The processing flow per date:

- Load the scaffold binding (the one marked `required=True`) to establish the primary dimension (e.g., DA LMP provides the node dimension for `NODAL`).
- Iterate over the remaining bindings, load their raw files, and join each onto the scaffold using the declared `JoinMode`.
- Fill missing columns with NA and select the output columns from the processed registry definition.
- Write to the hive-partitioned output path.
Classes¶
ERCOT
¶
Bases: ISOBase[ERCOTRawDataType, ERCOTProcessedDataType]
Source code in src/progridpy/iso/ercot/client.py
Functions¶
clear_and_calculate_gain
¶
clear_and_calculate_gain(trade_df: DataFrame, processed_df: DataFrame, min_offer_price: float = -500, max_bid_price: float = 2000) -> DataFrame
Clear trades and calculate financial gains based on market prices.
`da_spp` is the clearing price; no separate clearing column is needed. Supply offers clear when `da_spp >= offer_price`. Demand bids clear when `da_spp <= bid_price`.
Source code in src/progridpy/iso/ercot/client.py
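The clearing rule can be sketched as a pair of plain functions. This is a sketch only: the `side` labels are assumptions, and the gain convention (a cleared supply offer sells day-ahead and buys back real-time; a cleared demand bid does the opposite) is an assumed reading of the DA/RT spread logic, not confirmed by the source:

```python
def clears(side: str, trade_price: float, da_price: float) -> bool:
    """Supply offers clear when the DA price is at or above the offer price;
    demand bids clear when it is at or below the bid price."""
    if side == "supply":
        return da_price >= trade_price
    return da_price <= trade_price


def gain(side: str, da_price: float, rt_price: float, mw: float = 1.0) -> float:
    # Assumed convention: cleared supply earns the DA-minus-RT spread,
    # cleared demand earns the RT-minus-DA spread.
    spread = da_price - rt_price
    return spread * mw if side == "supply" else -spread * mw
```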
process_trade
¶
process_trade(trade_dir: str | Path, processed_dir: str | Path, start_date: str | datetime | None = None, end_date: str | datetime | None = None) -> DataFrame
Process trade files and calculate gains based on DA/RT spread.
Source code in src/progridpy/iso/ercot/client.py
MISO
¶
Bases: ISOBase[MISORawDataType, MISOProcessedDataType]
Source code in src/progridpy/iso/miso/client.py
Functions¶
clear_and_calculate_gain
¶
clear_and_calculate_gain(trade_df: DataFrame, processed_df: DataFrame, min_offer_price: float = -500, max_bid_price: float = 2000) -> DataFrame
Clear trades and calculate financial gains based on market prices.
Supply offers clear when `clearing_lmp >= offer_price`. Demand bids clear when `clearing_lmp <= bid_price`. Dead nodes are excluded from clearing.
Source code in src/progridpy/iso/miso/client.py
process_trade
¶
process_trade(trade_dir: str | Path, processed_dir: str | Path, start_date: str | datetime | None = None, end_date: str | datetime | None = None) -> DataFrame
Process trade files and calculate gains based on market clearing.
Source code in src/progridpy/iso/miso/client.py
SPP
¶
Bases: ISOBase[SPPRawDataType, SPPProcessedDataType]
Source code in src/progridpy/iso/spp/client.py
Functions¶
download_raw_data
¶
download_raw_data(start_date: str | datetime | None = None, end_date: str | datetime | None = None, data_types: SPPRawDataType | list[SPPRawDataType] | None = None, download_src: str | FileLocation = ISO, output_dir: str | Path | None = None, overwrite: bool = False, verbose: bool = False) -> list[Path]
Download raw data from either the SPP website or the S3 bucket. The start and end dates are inclusive.

Arguments:

- `start_date` (str | datetime | None): the start date to fetch reports for, defaults to today. String formats supported: YYYYMMDD, YYYY/MM/DD, YYYY-MM-DD.
- `end_date` (str | datetime | None): the end date to fetch reports for, defaults to today. String formats supported: YYYYMMDD, YYYY/MM/DD, YYYY-MM-DD.
- `data_types` (SPPRawDataType | list[SPPRawDataType]): the data types to fetch, defaults to all.
- `download_src` (str | FileLocation): the source to download from, defaults to ISO. String values supported: "s3", "iso".
- `output_dir` (str | Path): the path to save the data to, defaults to data/spp/raw.
- `overwrite` (bool): whether to overwrite existing files, defaults to False.
- `verbose` (bool): whether to print verbose output, defaults to False.

Returns:

- `list[Path]`: A list of paths to the downloaded files.
Source code in src/progridpy/iso/spp/client.py
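The three accepted date-string formats can be handled with a small helper along these lines (a sketch of the parsing described in the docstring, not the library's actual parser):

```python
from datetime import datetime


def parse_report_date(value: str) -> datetime:
    # The docstring lists three accepted formats:
    # YYYYMMDD, YYYY/MM/DD, and YYYY-MM-DD.
    for fmt in ("%Y%m%d", "%Y/%m/%d", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unsupported date format: {value!r}")
```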
upload_raw_data
¶
upload_raw_data(start_date: str | datetime | None = None, end_date: str | datetime | None = None, data_types: SPPRawDataType | list[SPPRawDataType] | None = None, input_dir: str | Path | None = None, overwrite: bool = False, verbose: bool = False) -> list[str]
Upload raw data files to S3.
Scans `input_dir/{dir_name}/` for files matching the date range, then uploads them with deterministic S3 keys.
Source code in src/progridpy/iso/spp/client.py
download_processed_data
¶
download_processed_data(start_date: str | datetime | None = None, end_date: str | datetime | None = None, data_types: SPPProcessedDataType | list[SPPProcessedDataType] | None = None, output_dir: str | Path | None = None, overwrite: bool = False, verbose: bool = False) -> list[Path]
Download processed data from S3 using deterministic hive-partitioned keys.
Source code in src/progridpy/iso/spp/client.py
upload_processed_data
¶
upload_processed_data(start_date: str | datetime | None = None, end_date: str | datetime | None = None, data_types: SPPProcessedDataType | list[SPPProcessedDataType] | None = None, input_dir: str | Path | None = None, overwrite: bool = False, verbose: bool = False) -> list[str]
Upload processed parquet files to S3.
Scans input_dir for hive-partitioned data.parquet files, filters by date range, then uploads with deterministic S3 keys.
Source code in src/progridpy/iso/spp/client.py
process_raw_data
¶
process_raw_data(start_date: str | datetime | None = None, end_date: str | datetime | None = None, data_types: SPPProcessedDataType | list[SPPProcessedDataType] | None = None, input_dir: str | Path | None = None, output_dir: str | Path | None = None, file_format: Literal['parquet', 'csv'] = 'parquet', overwrite: bool = False, verbose: bool = False) -> list[Path]
Process the downloaded raw data into a standardized format.
Args:

- `start_date`: Start date to filter data (if None, process from the oldest available).
- `end_date`: End date to filter data (if None, process to the latest available).
- `data_types`: Data type(s) to process (defaults to all types).
- `input_dir`: Directory containing raw data files (defaults to `self.raw_dir`).
- `output_dir`: Directory to save processed files (defaults to `self.processed_dir`).
- `file_format`: Output file format, either "parquet" or "csv" (defaults to "parquet").
- `overwrite`: Whether to overwrite existing processed files.
- `verbose`: Whether to print verbose output.

Returns:

- `list[Path]`: List of paths to successfully processed files.
Source code in src/progridpy/iso/spp/client.py
download_rt_lmp_rolling
¶
download_rt_lmp_rolling(output_dir: str | Path | None = None, overwrite: bool = False) -> Path | None
Download today's real-time rolling LMP data by merging 5-minute interval files.
Source code in src/progridpy/iso/spp/client.py
download_archival_data
¶
download_archival_data(year: int, data_type: SPPRawDataType, output_dir: str | Path, overwrite: bool = False) -> Path
Download a yearly zip archive from SPP's file-browser API.
Args:

- `year`: The year to download (e.g. 2023).
- `data_type`: The raw data type to download the archive for.
- `output_dir`: Base directory where the zip will be saved under `spp/{zip_dir}/{year}.zip`.
- `overwrite`: Whether to overwrite an existing zip file.

Returns:

- `Path`: Path to the downloaded zip file.

Raises:

- `ValueError`: If the data type is not available as an archive.
Source code in src/progridpy/iso/spp/client.py
extract_archival_data
¶
extract_archival_data(input_dir: str | Path, output_dir: str | Path, data_type: SPPRawDataType, overwrite: bool = False) -> list[Path]
Extract and filter archival zip files into the daily file structure.
Reads yearly zip archives from `{input_dir}/spp/{zip_dir}/*.zip` and writes individual daily CSV files to `{output_dir}/{registry_dir_name}/`, matching the same output naming as `download_raw_data`.
Args:

- `input_dir`: Base directory containing `spp/{zip_dir}/*.zip` archives.
- `output_dir`: Base directory for extracted daily CSV files.
- `data_type`: The raw data type to extract.
- `overwrite`: Whether to overwrite existing output files.

Returns:

- `list[Path]`: List of paths to written output files.

Raises:

- `ValueError`: If the data type is not available as an archive.
Source code in src/progridpy/iso/spp/client.py
clear_and_calculate_gain
¶
clear_and_calculate_gain(trade_df: DataFrame, processed_df: DataFrame, min_offer_price: float = -500, max_bid_price: float = 2000) -> DataFrame
Clear trades and calculate financial gains based on market prices.
`da_lmp` is the clearing price. Supply offers clear when `da_lmp >= offer_price`. Demand bids clear when `da_lmp <= bid_price`.
Source code in src/progridpy/iso/spp/client.py
process_trade
¶
process_trade(trade_dir: str | Path, processed_dir: str | Path, start_date: str | datetime | None = None, end_date: str | datetime | None = None) -> DataFrame
Process trade files and calculate gains based on DA/RT spread.
Source code in src/progridpy/iso/spp/client.py
SPPProcessedDataType
¶
Bases: StrEnum
Defines the granularity of the processed dataset.