Python-geo
Python-geo is a collection of Python packages that facilitate the development of Python scripts for geoinformatics applications. It includes the following Python packages:
- access - for calculating the spatial accessibility of resources.
- async-tiff - fast reader for TIFF-files. NEW 2026
- boto3 - for working with files in S3 storage, for example Allas. Allas S3 example in CSC geocomputing Github.
- cartopy - for map plotting.
- cdsapi - access to Copernicus Climate Data Store. NEW 2026
- cfgrib - map GRIB files to the NetCDF Common Data Model
- contextily - to retrieve tile maps from the internet.
- copc-lib - reader and writer interface for Cloud Optimized Point Clouds (COPC)
- dask - provides advanced parallelism for analytics, enabling performance at scale, including dask-geopandas, Dask-ML and Dask JupyterLab extension.
- Dask parallelization example in CSC geocomputing Github.
- STAC example in CSC geocomputing Github.
- dask-image - image processing with Dask Arrays.
- datashader - for big data rendering. NEW 2026
- duckdb - to execute analytical SQL queries fast.
- esda - Exploratory Spatial Data Analysis.
- fiona - reads and writes spatial data files.
- geoalchemy2 - provides extensions to SQLAlchemy for working with spatial databases, primarily PostGIS.
- geocube - convert geopandas vector data into rasterized xarray data.
- geodatasets - download and cache spatial data example files.
- geopandas - GeoPandas extends the datatypes used by pandas.
- geoparquet-io - fast reader for GeoParquet files. NEW 2026
- geopy - client for several popular geocoding web services.
- geoviews - geographic visualizations for HoloViews. NEW 2026
- geo2ml - for preparing spatial data for machine learning.
- Google Earth Engine API - see how to set up GEE authentication.
- holoviews - plot big datasets. NEW 2026
- h3pandas - for hexagonal geospatial indexing system, with Pandas and GeoPandas.
- h3-py - Python bindings for H3, a hierarchical hexagonal geospatial indexing system.
- h5py - for HDF5 files. NEW 2026
- icechunk - cloud-native transactional tensor storage engine. NEW 2026
- igraph - for fast routing. Routing examples in CSC geocomputing Github
- laspy - for reading, modifying, and creating .LAS LIDAR files.
- leafmap - for geospatial analysis and interactive mapping in a Jupyter environment.
- lidar - for delineating the nested hierarchy of surface depressions in digital elevation models (DEMs).
- lonboard - fast, interactive geospatial data visualization in Jupyter. NEW 2026
- metpy - reading, visualizing, and performing calculations with weather data.
- movingpandas - for trajectory data
- networkx - for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. Routing examples in CSC geocomputing Github
- papermill - for parameterizing and executing Jupyter Notebooks. NEW 2026
- pot - solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning. NEW 2026
- pyproj - performs cartographic transformations and geodetic computations.
- pyogrio - vectorized spatial vector file format I/O using GDAL/OGR.
- obstore - fast access to S3, Google Cloud Storage and Azure Storage. NEW 2026
- odc-stac - STAC data to xarray, STAC example in CSC geocomputing Github. NEW 2026
- openeo - for connecting to Earth observation cloud back-ends in a simple and unified way.
- open3d - for 3D data processing
- osmnx - download spatial geometries and construct, project, visualize, and analyze street networks from OpenStreetMap's APIs. Routing examples in CSC geocomputing Github
- owslib - for retrieving data from Open Geospatial Consortium (OGC) web services
- pcraster - for spatio-temporal environmental modelling.
- planetary-computer - supports accessing data in Microsoft's Planetary Computer NEW 2026
- psycopg2 - PostgreSQL database adapter for Python.
- python-pdal - PDAL Python extension for lidar data
- pysal - spatial analysis functions.
- pdal - for lidar data
- pysheds - for watershed delineation.
- pystac-client - for working with STAC Catalogs and APIs. STAC example in CSC geocomputing Github.
- python-cdo - scripting interface to CDO (Climate Data Operators).
- rasterio - access to geospatial raster data.
- rasterstats - for summarizing geospatial raster datasets based on vector geometries. It includes functions for zonal statistics and interpolated point queries. rasterstats example in CSC geocomputing Github
- rio-cogeo - for Cloud Optimized GeoTIFF (COG) creation.
- rtree - spatial indexing and search.
- r5py - for rapid realistic routing on multimodal transport networks, see below how to set memory correctly for r5py.
- shap - for explaining the output of any machine learning model. NEW 2026
- sentinelhub - for working with new Sentinel Hub services.
- shapely - manipulation and analysis of geometric objects in the Cartesian plane.
- scikit-gstat - for variogram analysis. NEW 2026
- scikit-learn - machine learning for Python. Spatial machine learning scikit-learn (shallow learning) exercises
- skimage - algorithms for image processing.
- scipy - including pandas, numpy, matplotlib etc.
- sparse - for sparse arrays. NEW 2026
- spectral - for processing hyperspectral image data. NEW 2026
- stackstac - STAC data to xarray, STAC example in CSC geocomputing Github. Has not been updated lately, use odc-stac instead.
- swiftclient, keystoneclient - for working with SWIFT storage, for example Allas. Allas Swift example in CSC geocomputing Github.
- whiteboxtools - wide-scope processing of geospatial data, many tools operate in parallel, see CSC whiteboxtools page for details. Also Whitebox Workflows for Python.
- xarray - for multidimensional raster data, inc. rioxarray. STAC example in CSC geocomputing Github.
- cf_xarray - interpret Climate and Forecast metadata convention attributes present on xarray objects. NEW 2026
- flox - fast GroupBy reductions for Xarray. NEW 2026
- xarray-eopf - for reading the ESA EOPF data products in Zarr format. NEW 2026
- xarray-spatial - efficient common raster analysis functions for xarray. xarray-spatial example in CSC geocomputing Github
- xclim - for climate analysis. NEW 2026
- xgboost - Gradient Boosting machine learning algorithms. NEW 2026
- zarr - for reading and writing data to Zarr format. NEW 2026
And many more. To retrieve the full list, use:
list-packages
Additionally python-geo includes:
- jupyter - Jupyter Notebooks and JupyterLab. Use from web interface with Jupyter app. Includes Dask Extension and Resource usage Extension.
- spyder - Scientific Python Development Environment with graphical interface (similar to RStudio for R).
- GDAL/OGR commandline tools
- GMT The Generic Mapping Tools
- PDAL - Point Data Abstraction Library
Python has multiple packages for parallel computing, for example multiprocessing, joblib and dask. Our Puhti Python examples show how to use these parallelisation libraries.
If you think that some important GIS package for Python is missing from here, you can request its installation from the CSC Service Desk.
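As a minimal sketch of the multiprocessing approach, the example below spreads an independent per-item computation over worker processes using only the standard library. The function names and the bounding boxes are hypothetical stand-ins for real per-file work, such as processing one raster tile, and are not taken from the CSC examples.

```python
from multiprocessing import Pool


def bbox_area(bbox):
    """Return the area of a (minx, miny, maxx, maxy) bounding box."""
    minx, miny, maxx, maxy = bbox
    return (maxx - minx) * (maxy - miny)


def areas_in_parallel(bboxes, workers=2):
    """Compute the area of each bounding box using a pool of worker processes."""
    with Pool(processes=workers) as pool:
        # map() splits the inputs between the workers and keeps the input order.
        return pool.map(bbox_area, bboxes)


if __name__ == "__main__":
    # Hypothetical inputs; in practice these could be tile extents or file paths.
    boxes = [(0, 0, 2, 3), (1, 1, 4, 5), (10, 10, 11, 12)]
    print(areas_in_parallel(boxes))
```

In a batch job, the number of workers would normally be chosen to match the number of CPU cores reserved for the job.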
Available
The python-geo module is available:
- 3.14.3 (Python 3.14.3, PDAL 2.10.0, GDAL 3.12.2, created April 2026), in Roihu-CPU
The version number is the same as the Python version.
In Puhti, Mahti and LUMI, python-geo is named geoconda.
Usage
To use the Python packages and other tools listed above, load the module:
module load python-geo
By default the latest python-geo module is loaded. To load a specific version, add the version number of python-geo:
module load python-geo/3.14.3
To check the exact packages and versions included in the loaded module:
list-packages
You can add more Python packages to python-geo by following the instructions in our
Python usage guide.
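To confirm from inside Python which version of a package is active, for example after adding packages of your own, the standard library can query the installed metadata. This is a minimal sketch; the helper name `installed_version` is hypothetical, and the package names are simply picked from the list above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Package names taken from the list above; adjust to the ones your script needs.
for name in ("geopandas", "rasterio", "xarray"):
    found = installed_version(name)
    print(f"{name}: {found if found is not None else 'not installed'}")
```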
You can edit your Python code in the web interface or the LUMI web interface.
r5py memory settings
By default, r5py does not correctly detect how much memory it has available on a supercomputer, so it has to be defined manually. r5py uses Java in the background, so set an environment variable that defines the maximum memory available for Java:
- export _JAVA_OPTIONS="-Xmx4g" from the command line before starting Python, OR
- os.environ["_JAVA_OPTIONS"] = "-Xmx4g" at the beginning of your Python code.
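Instead of hard-coding 4 GB, the limit could be derived from the memory reserved for the batch job. The sketch below assumes that the batch system exports SLURM_MEM_PER_NODE in megabytes, which Slurm does when memory is requested per node; the helper name `java_heap_option` and the 80 % headroom are illustrative choices, not CSC recommendations.

```python
import os


def java_heap_option(default_gb=4, percent=80):
    """Build a Java -Xmx option from the job's memory reservation.

    Assumption: the batch system exports SLURM_MEM_PER_NODE in megabytes.
    If the variable is missing or not a plain number, fall back to default_gb.
    """
    mem_mb = os.environ.get("SLURM_MEM_PER_NODE", "")
    if mem_mb.isdigit() and int(mem_mb) > 0:
        # Leave some headroom for Python itself and non-heap Java memory.
        heap_mb = max(1, int(mem_mb) * percent // 100)
        return f"-Xmx{heap_mb}m"
    return f"-Xmx{default_gb}g"


# Set this at the beginning of the script, before r5py is imported.
os.environ["_JAVA_OPTIONS"] = java_heap_option()
print(os.environ["_JAVA_OPTIONS"])

# import r5py  # import only after the variable has been set
```

If the variable is not set, for example when memory was requested per CPU core, the function falls back to the same 4 GB value used in the example above.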
Google Earth Engine authentication set up
To use the Google Earth Engine (GEE) API with the earthengine-api package, a GEE account and project are needed. Before first use, also set up GEE authentication:
This prints out a long link and asks for a code. Copy the link to the web browser on your local laptop. Follow the instructions on the web page and finally copy the created code back to the terminal.
Using Allas or LUMI-O from Python
There are two Python libraries installed in Python-geo that can interact with Allas or LUMI-O. Swiftclient uses the Swift protocol and boto3 uses the S3 protocol. CSC provides examples of how to use both.
It is also possible to read and write files from and to Allas or other cloud object storage directly with GDAL-based packages such as geopandas and rasterio. Please check our Using geospatial files directly from cloud, inc Allas tutorial for instructions and examples.
With large quantities of raster data, consider using virtual rasters.
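GDAL-based packages address remote files through GDAL's virtual file system prefixes, such as /vsis3/ for S3 object storage and /vsicurl/ for plain HTTP(S) URLs. The sketch below shows how such paths are formed; the helper name `to_gdal_vsi_path`, the bucket and the object names are hypothetical.

```python
from urllib.parse import urlparse


def to_gdal_vsi_path(url):
    """Translate an s3:// or http(s):// URL into a GDAL virtual file system path."""
    parsed = urlparse(url)
    if parsed.scheme == "s3":
        # /vsis3/ expects "bucket/key"; netloc is the bucket name.
        return f"/vsis3/{parsed.netloc}{parsed.path}"
    if parsed.scheme in ("http", "https"):
        # /vsicurl/ takes the full URL and reads it with HTTP range requests.
        return f"/vsicurl/{url}"
    # Anything else is treated as a local path and returned unchanged.
    return url


# Hypothetical bucket and object names, for illustration only.
print(to_gdal_vsi_path("s3://my-bucket/rasters/dem.tif"))
print(to_gdal_vsi_path("https://example.org/rasters/dem.tif"))
```

The resulting string can then be passed to a GDAL-based reader such as rasterio.open() or geopandas.read_file(). For S3 access, the credentials and the storage endpoint still have to be configured as described in the tutorial mentioned above.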
License
All packages are licensed under various free and open source licenses (FOSS), see the linked pages above for exact details.
Citation
Please see the above linked package pages for citation information per package.
Acknowledgement
Please acknowledge CSC and Geoportti in your publications; it is important for project continuation and funding reports. As an example, you can write "The authors wish to thank CSC - IT Center for Science, Finland (urn:nbn:fi:research-infras-2016072531) and the Open Geospatial Information Infrastructure for Research (Geoportti, urn:nbn:fi:research-infras-2016072513) for computational resources and support".
Installation
Python-geo was installed to Roihu using Tykky's conda-containerize functionality. In LUMI, geoconda was installed using the LUMI container wrapper. The functionality of the tools is almost identical, with the --post option being --post-install in the LUMI container wrapper. The WhiteboxTools conda package installs only the WhiteboxTools installer, so a proper installation of WhiteboxTools required an additional post-installation command and a folder to wrap the command-line tools.
conda-containerize new --mamba \
--prefix install_dir --post download_wbt \
-w miniconda/envs/env1/lib/python3.11/site-packages/whitebox/WBT/whitebox_tools \
python-geo_3.11.10.yml
Python-geo conda environment files, as well as the download_wbt and start_wbt.py files needed for WhiteboxTools, are available in CSC's geocomputing repository. Note that for reproducibility, you'll need to define the package versions in the environment file. The versions can be checked with the list-packages command after loading the python-geo module.
References
- CSC Python parallelisation examples
- Multiprocessing Basics
- Automating GIS processes course materials by University of Helsinki
- Aalto Spatial Analytics course material by Henrikki Tenkanen / Aalto University
- Introduction to GIS Programming by Dr. Qiusheng Wu / University of Tennessee
- Geographic Data Science with Python by Sergio Rey, Dani Arribas-Bel, Levi Wolf
- Python Foundation for Spatial Analysis by Ujaval Gandhi