
Major Refactor - Explicit Pixel Grids & CF Dimensions - v0.1.0#275

Merged
jdbcode merged 65 commits into main from simplify_pixel_grid_params
Apr 7, 2026

@jdbcode
Member

@jdbcode jdbcode commented Nov 17, 2025

Major Refactor - Explicit Pixel Grids & CF Dimensions

This PR is a major refactor of the core of Xee's backend. It merges the simplify_pixel_grid_params branch into main. That branch holds months of changes: the significant changes to the Xee API have been through review, while doc changes and CI/CD Python version changes were pushed directly. The most significant changes are adopting explicit pixel grid parameters for opening datasets (see discussion) and updating the default dimension ordering to align with community standards (see discussion). After this PR is merged we should release v0.1.0 (from v0.0.x).

⚠️ This version contains breaking changes. All users must update their existing xarray.open_dataset calls when upgrading to this version. Please refer to the Migration Guide for detailed instructions.


💥 Breaking Changes & Rationale

1. Explicit Pixel Grid Definition

The previous, heuristic-based grid definition arguments (scale, geometry, and projection) have been removed from xr.open_dataset(..., engine='ee').

  • What Changed: The open_dataset function now requires three explicit parameters to define the pixel grid: crs, crs_transform (a 6-tuple affine matrix), and shape_2d (width/height pixel count).
  • Rationale: This shift forces users to explicitly define the output grid, eliminating ambiguity and ensuring archival reproducibility. This is crucial for precise, repeatable geospatial workflows.
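
To make the three parameters concrete, here is a minimal sketch (not Xee's implementation) of how crs_transform and shape_2d together pin down every pixel-center coordinate. It assumes the affine order (x_scale, x_shear, x_translate, y_shear, y_scale, y_translate) and that shape_2d is (width, height); the 1-degree global grid and the pixel_center_coords helper are hypothetical and exist only for illustration.

```python
from typing import List, Tuple

# Assumed affine order (an assumption of this sketch):
# (x_scale, x_shear, x_translate, y_shear, y_scale, y_translate)
Affine = Tuple[float, float, float, float, float, float]


def pixel_center_coords(
    crs_transform: Affine, shape_2d: Tuple[int, int]
) -> Tuple[List[float], List[float]]:
    """Return (x_centers, y_centers) for an unrotated pixel grid."""
    x_scale, x_shear, x_origin, y_shear, y_scale, y_origin = crs_transform
    if x_shear != 0 or y_shear != 0:
        raise ValueError("this sketch only handles unrotated grids")
    width, height = shape_2d  # (width, height) in pixels
    # Pixel centers sit half a pixel in from the grid origin (top-left corner).
    xs = [x_origin + (col + 0.5) * x_scale for col in range(width)]
    ys = [y_origin + (row + 0.5) * y_scale for row in range(height)]
    return xs, ys


# Hypothetical grid: global coverage at 1 degree per pixel in EPSG:4326.
grid = {
    "crs": "EPSG:4326",
    "crs_transform": (1.0, 0.0, -180.0, 0.0, -1.0, 90.0),
    "shape_2d": (360, 180),
}

xs, ys = pixel_center_coords(grid["crs_transform"], grid["shape_2d"])
print(xs[0], xs[-1])  # first and last pixel-center x values
print(ys[0], ys[-1])  # first and last pixel-center y values

# These are the same three keyword arguments the PR describes for
# xr.open_dataset(..., engine='ee', crs=..., crs_transform=..., shape_2d=...).
```

Because the grid is fully specified by these values, two runs with the same parameters describe the same pixels, which is the reproducibility argument made above.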

2. CF-friendly Dimension Ordering

The default order of spatial dimensions in the resulting Xarray objects has been changed.

  • What Changed: Datasets are now returned in the dimension order [time, y, x] instead of the old [time, x, y].
  • Rationale: This brings Xee into alignment with CF conventions and the expectations of most geospatial libraries (like rioxarray and cartopy), significantly reducing the need for manual .transpose() calls.
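
The practical effect of the reordering can be sketched in plain Python, without xarray: a grid with shape_2d of (width, height) now yields arrays shaped (time, height, width), and code that indexed frames as [x][y] must switch to [y][x]. The helper names below are hypothetical; with real xarray objects one would normally reorder dimensions by name with .transpose('time', 'y', 'x') instead.

```python
from typing import List, Sequence, Tuple


def expected_shape(n_time: int, shape_2d: Tuple[int, int]) -> Tuple[int, int, int]:
    """Array shape under the new [time, y, x] ordering."""
    width, height = shape_2d
    return (n_time, height, width)


def xy_to_yx(frame_xy: Sequence[Sequence[int]]) -> List[List[int]]:
    """Transpose one frame from the old [x, y] layout to the new [y, x] layout."""
    return [list(row) for row in zip(*frame_xy)]


# A frame 3 pixels wide and 2 pixels tall, stored in the old [x, y] layout:
# the outer index is x (3 entries), the inner index is y (2 entries).
frame_xy = [[10, 20], [11, 21], [12, 22]]
frame_yx = xy_to_yx(frame_xy)

print(expected_shape(5, (3, 2)))
print(frame_yx)
```

The same pixel is reached either way: frame_xy[2][1] under the old layout and frame_yx[1][2] under the new one.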

✨ New Features & Helper Utilities

To support the new explicit grid workflow, a new xee.helpers module has been added with key utilities:

  • extract_grid_params(ee_obj): Automatically derives the required crs, crs_transform, and shape_2d from an existing ee.Image or ee.ImageCollection.
    • Rationale: This is the primary way to achieve a "match source grid" workflow, and it makes it simple for reviewers to verify that the object's native grid parameters are used.
  • fit_geometry(...): Computes the required grid parameters to cover a specific shapely.geometry (AOI) at either a fixed scale/resolution or a fixed shape/pixel count.
  • Refined Transform Logic: The internal calculation for the affine transform now uses math.floor and math.ceil to precisely snap the grid to the bounding box extents, improving coordinate accuracy and preventing sub-pixel misalignment.
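
The floor/ceil snapping idea can be illustrated with a short sketch. This is one plausible reading of the description above, not Xee's actual code: the function name, its arguments, and the choice to snap to whole multiples of the pixel size are all assumptions made for illustration.

```python
import math
from typing import Dict, Tuple


def snap_bounds_to_grid(
    bounds: Tuple[float, float, float, float], scale: float
) -> Dict[str, tuple]:
    """Fit a north-up grid of square pixels around a bounding box.

    bounds is (min_x, min_y, max_x, max_y); scale is the positive pixel size.
    Edges are pushed outward to whole multiples of the pixel size, so the
    grid always covers the full box and never cuts a pixel in half.
    """
    min_x, min_y, max_x, max_y = bounds
    col_min = math.floor(min_x / scale)  # left edge, snapped outward
    col_max = math.ceil(max_x / scale)   # right edge, snapped outward
    row_min = math.floor(min_y / scale)  # bottom edge, snapped outward
    row_max = math.ceil(max_y / scale)   # top edge, snapped outward

    width = col_max - col_min
    height = row_max - row_min
    # Origin is the top-left corner; y_scale is negative for north-up grids.
    crs_transform = (scale, 0.0, col_min * scale, 0.0, -scale, row_max * scale)
    return {"crs_transform": crs_transform, "shape_2d": (width, height)}


# Hypothetical AOI bounding box, fitted at a pixel size of 0.5 units.
params = snap_bounds_to_grid((10.2, 4.7, 13.9, 7.1), scale=0.5)
print(params["crs_transform"])
print(params["shape_2d"])
```

Working in integer row and column indices, rather than subtracting snapped floating-point edges, keeps the pixel counts exact.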

📚 Documentation & Infrastructure Updates

  • Migration Guide Added: A new detailed guide, docs/migration-guide-v0.1.0.md, has been created to assist users in updating their code to the new v0.1.0 API.
  • Extensive Documentation Refactor: New Core Concepts (concepts.md) and a User Guide (guide.md) were added to clarify the philosophy behind the pixel grid parameters and collect common workflows. The main README.md and examples have also been fully updated.
  • CI/CD Updates: Python 3.10 support was removed from all CI workflows, and the default publish environment was updated to Python 3.11.

📋 Checklist

  • Implemented new PixelGridParams signature and removed old implicit arguments.
  • Updated default dimension ordering to [time, y, x].
  • Added core grid helper utilities: extract_grid_params, fit_geometry, and set_scale.
  • Updated documentation and added a dedicated Migration Guide.
  • Updated CI/CD infrastructure to target Python 3.11.
  • All tests in ext_integration_test.py and ext_integration_test.py pass.

tylere and others added 30 commits February 3, 2025 09:58
PiperOrigin-RevId: 712905652
Ensure shape_2d is a tuple
@jdbcode
Member Author

jdbcode commented Dec 9, 2025

The only blocker: Can we bring back python 3.10?

Yes, I've added 3.10 back in.

@jdbcode
Member Author

jdbcode commented Dec 9, 2025

@schwehr and @naschmitz I've addressed all of the comments. Can you please take another look?

@jdbcode jdbcode requested review from naschmitz and schwehr December 9, 2025 00:50
Collaborator

@schwehr schwehr left a comment

These are the

@tylere
Contributor

tylere commented Mar 6, 2026

These are the

@schwehr could you clarify what you were starting to say in your comment? Are these issues that still need to be addressed?

@schwehr
Collaborator

schwehr commented Mar 6, 2026

These are the

@schwehr could you clarify what you were starting to say in your comment? Are these issues that still need to be addressed?

I think it's basically ready for @jdbcode to submit

@cgmorton

I don't have anything technical to add but did want to say that I am really looking forward to this update. I had tried to use Xee but was having a lot of difficulty working with the combination of undocumented projection parameters. I tested out this version and it worked perfectly and exactly as I was expecting.

@jdbcode
Member Author

jdbcode commented Mar 10, 2026

@cgmorton, thanks for the encouragement! We have plans to pick this back up and get it merged and released to PyPI within the next couple of weeks.

@jakenotjay

@jdbcode just to say, we've taken a fork of this branch and have started using it for our workloads. It's much quicker without generating the coordinate array by API.

@jdbcode jdbcode merged commit 56c7923 into main Apr 7, 2026
6 checks passed
@jdbcode
Member Author

jdbcode commented Apr 7, 2026

Tomorrow (2026-04-08) I'll publish a pre-release to PyPI and update the readme and install instructions to point people at the pre-release for a few weeks, then promote it to a stable release.
