GDAL

GDAL (Geospatial Data Abstraction Library) is a GIS translator library for accessing and transforming geospatial data. It is most commonly used for converting between file formats and for changing coordinate systems.

Available

GDAL is available in Puhti in the following versions:

  • 3.0.4 via conda: geoconda-3.8
  • 3.0.2 via conda: geoconda-3.7
  • 2.4.3 via conda: snap
  • 2.4.2 via conda: mapnik
  • 2.4.1 via conda: solaris and Orfeo ToolBox
  • 3.0.1 stand-alone: gdal module
  • 2.4.2 stand-alone: gdal module and r-env. FORCE and Saga-GIS also use this GDAL, although those modules do not include the GDAL command-line tools.
  • 2.4.2 in the r-env-singularity Singularity container

Note

The stand-alone versions do not have Python bindings installed, so tools that depend on them, e.g. gdal_calc, work only in the conda installations. The supported file formats also vary slightly between the GDAL installations. For instance, the PostGIS driver is not available in gdal/3.0.1 but is included in the conda versions.

Usage

Using gdal

GDAL is included in the modules listed above, so it can be used whenever any of them is loaded, for example:

module load geoconda

If you need a stand-alone version of GDAL or plan to build software on top of GDAL, load it with:

module load gcc/9.1.0 gdal

By default the latest gdal module is loaded. To load a specific version, give the version number:

module load gcc/9.1.0 gdal/<VERSION>-omp

You can test whether GDAL loaded successfully with:

gdalinfo --version
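Because the supported drivers differ between the installations, it can be useful to check for a specific driver after loading a module. The following is a minimal sketch, not an official CSC tool: has_driver is a hypothetical helper name, and GTiff is used only as an example driver name.

```shell
# has_driver NAME: succeed if NAME occurs (case-insensitively) in a driver
# listing read from standard input. Hypothetical helper, not part of GDAL.
has_driver() {
    grep -i -- "$1" >/dev/null
}

if command -v gdalinfo >/dev/null 2>&1; then
    gdalinfo --version
    # gdalinfo lists raster drivers, ogrinfo lists vector drivers.
    if { gdalinfo --formats; ogrinfo --formats; } 2>/dev/null | has_driver "GTiff"; then
        echo "GTiff driver found"
    else
        echo "GTiff driver not found"
    fi
else
    echo "gdalinfo not found: load a module that provides GDAL first"
fi
```

Replace GTiff with the driver you need. The match is a plain text search over the driver listing, so a short name may match more than one driver.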

Using files directly from Allas

It is possible to read files from Allas directly with GDAL, but not to write them. Write results first to Puhti scratch and move them to Allas later. The virtual drivers described below are also supported by many GDAL-based tools: the setup is the same as below, but instead of running the example gdalinfo command, open the file from a Python or R script. In R and Python it is also possible to write to Allas directly from the script. We have tested successfully:

Reading data directly from Allas is slower than reading from scratch or other Puhti Lustre disks. For example, reading a ~500 MB file from scratch takes ~1 second, but from Allas ~10 seconds. In most cases, compared to the full duration of an analysis in Puhti, these seconds are not significant.

Public files in Allas can be read with vsicurl:

gdalinfo /vsicurl/https://a3s.fi/<name_of_your_bucket>/<name_of_your_file>

Private files can be read with the SWIFT or S3 API. SWIFT is more secure, but the credentials need to be updated after 8 hours. S3 has permanent keys and is therefore a little easier to use, but less secure. Both have a random reading and streaming API.

SWIFT. Set up the connection in Puhti and then read the files with the vsiswift driver:

module load allas
allas-conf
export SWIFT_AUTH_TOKEN=$OS_AUTH_TOKEN 
export SWIFT_STORAGE_URL=$OS_STORAGE_URL
gdalinfo /vsiswift/<name_of_your_bucket>/<name_of_your_file>

The export commands are needed because GDAL looks for different environment variables than the ones allas-conf sets. These commands need to be given each time you start working with Puhti, because the token is valid for 8 hours. Inside batch jobs use allas-conf -k.
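The two export commands can be wrapped in a small function that also checks that allas-conf has been run. This is only a sketch: set_gdal_swift_env is a hypothetical name, and the function relies only on the OS_AUTH_TOKEN and OS_STORAGE_URL variables mentioned above.

```shell
# set_gdal_swift_env: copy the variables written by allas-conf to the names
# that GDAL's /vsiswift/ driver reads. Hypothetical helper, not part of the
# allas module.
set_gdal_swift_env() {
    if [ -z "${OS_AUTH_TOKEN:-}" ] || [ -z "${OS_STORAGE_URL:-}" ]; then
        echo "OS_AUTH_TOKEN or OS_STORAGE_URL is not set: run allas-conf first" >&2
        return 1
    fi
    export SWIFT_AUTH_TOKEN="$OS_AUTH_TOKEN"
    export SWIFT_STORAGE_URL="$OS_STORAGE_URL"
}
```

After module load allas and allas-conf, calling set_gdal_swift_env before the gdalinfo command fails early with a message when the token variables are missing, instead of failing later inside GDAL.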

S3. Set up the connection in Puhti and then read the files with the vsis3 driver:

module load allas
allas-conf --mode s3cmd
gdalinfo /vsis3/<name_of_your_bucket>/<name_of_your_file>
  • module load allas sets the AWS_S3_ENDPOINT environment variable. It needs to be run each time S3 is used.
  • The allas-conf command saves your credentials to the .aws/credentials file in your home directory. It needs to be run only once, before first use or when you want to switch to another CSC project.
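Both parts of the S3 setup can be checked before GDAL is called. The sketch below assumes only what is stated above: that AWS_S3_ENDPOINT is set and that the credentials are in .aws/credentials in the home directory. s3_ready is a hypothetical helper name.

```shell
# s3_ready: succeed if the two pieces of S3 setup described above are present.
# Hypothetical helper, not part of GDAL or the allas module.
s3_ready() {
    if [ -z "${AWS_S3_ENDPOINT:-}" ]; then
        echo "AWS_S3_ENDPOINT is not set: run 'module load allas'" >&2
        return 1
    fi
    if [ ! -f "${HOME}/.aws/credentials" ]; then
        echo "${HOME}/.aws/credentials not found: run 'allas-conf --mode s3cmd'" >&2
        return 1
    fi
    return 0
}
```

The function checks only that the setup exists, not that the stored keys are valid for the current CSC project.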

Virtual rasters

With large quantities of raster data (also in Allas), the most convenient way to access them may be GDAL virtual rasters. More information here.
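As a rough illustration, a virtual raster over public files in Allas could be built from a list of /vsicurl/ paths. The sketch below only generates that list. make_vsicurl_list is a hypothetical helper name, and the URL pattern is the one used in the vsicurl example above.

```shell
# make_vsicurl_list BUCKET: read object names from standard input, one per
# line, and print the matching /vsicurl/ path for each. Empty lines are
# skipped. Hypothetical helper, not part of GDAL.
make_vsicurl_list() {
    while IFS= read -r object; do
        if [ -n "$object" ]; then
            printf '/vsicurl/https://a3s.fi/%s/%s\n' "$1" "$object"
        fi
    done
    return 0
}
```

With the output saved to a file, for example tiles.txt, the virtual raster could then be created with gdalbuildvrt -input_file_list tiles.txt mosaic.vrt, where both file names are placeholders. For private files the same approach would need /vsiswift/ or /vsis3/ paths and the corresponding setup.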

License and citing

GDAL/OGR is licensed under an MIT/X style license.

In your publications please also acknowledge oGIIR and CSC, for example: “The authors wish to acknowledge for computational resources CSC – IT Center for Science, Finland (urn:nbn:fi:research-infras-2016072531) and the Open Geospatial Information Infrastructure for Research (oGIIR, urn:nbn:fi:research-infras-2016072513).”

References