GDAL (Geospatial Data Abstraction Library) is a translator library for accessing and transforming geospatial data. It is most commonly used for converting between file formats or coordinate systems.
GDAL is available in Puhti in the following versions:
- 3.0.2 via conda: geoconda
- 2.4.3 via conda: snap
- 2.4.2 via conda: mapnik
- 2.4.1 via conda: solaris and Orfeo ToolBox
- 3.0.1 stand-alone: gdal module
- 2.4.2 stand-alone: gdal module and r-env. FORCE and Saga-GIS also use this GDAL, but the GDAL command line tools are not included in these modules.
GDAL is included in the modules listed above, so it can be used whenever any of them is loaded, for example:
module load geoconda
If you need a stand-alone version of gdal, or plan to build software on top of gdal, you can load it with:
module load gcc/9.1.0 gdal
By default, the latest gdal module is loaded. To load a specific version, add the version number:
module load gcc/9.1.0 gdal/<VERSION>-omp
You can test whether gdal loaded successfully by printing its version, for example with gdalinfo --version.
The stand-alone versions do not have Python bindings installed, so tools such as gdal_calc work only in the conda installations. The supported file formats also vary slightly between the gdal installations. For instance, the PostGIS driver is not available in gdal/3.0.1 but is included in the conda versions.
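Because the driver set and the Python bindings differ between installations, it can be worth checking the currently loaded one before starting a job. The following is a minimal sketch, not an official CSC tool: `formats_list_has_driver` is a hypothetical helper name, and the short names `GTiff` and `PostgreSQL` are only examples that should be checked against the `--formats` listing of your own installation.

```shell
# Hypothetical filter: succeed if a "--formats" listing read from stdin
# contains a driver whose short name equals $1 (case-insensitive).
formats_list_has_driver() {
    grep -i -q -- "^ *$1 "
}

# Report a few drivers for the currently loaded GDAL, if its tools are on PATH.
# gdalinfo lists raster drivers and ogrinfo lists vector drivers.
if command -v gdalinfo >/dev/null 2>&1 && command -v ogrinfo >/dev/null 2>&1; then
    for drv in GTiff PostgreSQL; do
        if { gdalinfo --formats; ogrinfo --formats; } 2>/dev/null \
                | formats_list_has_driver "$drv"; then
            echo "$drv: available"
        else
            echo "$drv: not available"
        fi
    done
fi

# Python bindings: the import fails in installations that lack them.
if command -v python3 >/dev/null 2>&1; then
    if python3 -c "from osgeo import gdal" >/dev/null 2>&1; then
        echo "Python bindings: available"
    else
        echo "Python bindings: not available"
    fi
fi
```

If no GDAL module is loaded, the driver check prints nothing, so the snippet can be run safely before and after `module load`.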
Using files directly from Allas
GDAL can read files directly from Allas, but cannot write to it. The virtual drivers described below are also supported by many GDAL-based tools. For those the setup is the same as below, but instead of running the example gdalinfo command, open the file from your Python or R script. From R and Python scripts it is also possible to write directly to Allas. We have tested successfully:
When working with GDAL itself, write results first to Puhti scratch and move them to Allas afterwards.
Public files in Allas can be read over HTTP(S) with GDAL's /vsicurl/ virtual file system.
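A minimal sketch of the public case, under stated assumptions: GDAL's `/vsicurl/` handler opens a file when its HTTP(S) URL is appended to the prefix. The URL layout `https://a3s.fi/<bucket>/<object>` is an assumption here, and `allas_public_path` is a hypothetical helper name, so check the actual public URL of your object in Allas.

```shell
# Hypothetical helper: build a /vsicurl/ path for a public Allas object.
# Assumes public objects are served at https://a3s.fi/<bucket>/<object>.
#   $1 = bucket name, $2 = object (file) name
allas_public_path() {
    printf '/vsicurl/https://a3s.fi/%s/%s\n' "$1" "$2"
}

# Usage (needs a loaded GDAL module; the names are placeholders):
#   gdalinfo "$(allas_public_path my_bucket my_file.tif)"
```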
Private files can be read through the SWIFT or the S3 API. SWIFT is more secure, but the credentials need to be renewed after 8 hours. S3 uses permanent keys, which makes it a little easier to use but less secure. Both APIs support random reading and streaming.
SWIFT. Set up the connection in Puhti or Taito and then read the files with:
module load allas
allas-conf
export SWIFT_AUTH_TOKEN=$OS_AUTH_TOKEN
export SWIFT_STORAGE_URL=$OS_STORAGE_URL
gdalinfo /vsiswift/<name_of_your_bucket>/<name_of_your_file>
The export commands are needed because GDAL looks for different environment variables than the ones allas-conf sets. These commands have to be given each time you start working with Puhti, because the token is valid for only 8 hours. Inside batch jobs, use allas-conf -k.
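The variable mapping can be wrapped in a small function so that an expired or missing session is noticed before GDAL is called. This is only a sketch: `allas_swift_env` is a hypothetical name, and it assumes that allas-conf has exported OS_AUTH_TOKEN and OS_STORAGE_URL into the current shell, as the export commands above do.

```shell
# Hypothetical helper: copy the variables set by allas-conf to the names
# GDAL's /vsiswift/ handler reads. Fails if allas-conf has not been run
# in this shell.
allas_swift_env() {
    if [ -z "${OS_AUTH_TOKEN:-}" ] || [ -z "${OS_STORAGE_URL:-}" ]; then
        echo "OS_AUTH_TOKEN or OS_STORAGE_URL is not set; run allas-conf first" >&2
        return 1
    fi
    export SWIFT_AUTH_TOKEN="$OS_AUTH_TOKEN"
    export SWIFT_STORAGE_URL="$OS_STORAGE_URL"
}

# Usage (the names are placeholders):
#   allas_swift_env && gdalinfo /vsiswift/my_bucket/my_file.tif
```

The function only checks that the variables exist. It cannot tell whether the token has already expired, so a failing read after several hours still means allas-conf has to be run again.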
S3. Create your S3 credentials with allas-conf in Puhti or Taito.
module load allas
allas-conf --mode s3cmd
Save your credentials in your home directory to .aws/credentials file like this:
[default]
AWS_ACCESS_KEY_ID=<access_key>
AWS_SECRET_ACCESS_KEY=<secret_key>
You only need to do these steps once.
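Since the S3 keys are permanent, it is sensible to keep the credentials file readable only by yourself. A minimal sketch of creating the file in the layout shown above: `write_aws_credentials` is a hypothetical helper, it overwrites any existing ~/.aws/credentials, and the two arguments stand for the keys printed by allas-conf.

```shell
# Hypothetical helper: write the access key ($1) and secret key ($2) to
# ~/.aws/credentials, readable and writable by the owner only.
# Note: an existing credentials file is overwritten.
write_aws_credentials() {
    mkdir -p "$HOME/.aws" || return 1
    (
        umask 077   # new files are created without group/other permissions
        printf '[default]\nAWS_ACCESS_KEY_ID=%s\nAWS_SECRET_ACCESS_KEY=%s\n' \
            "$1" "$2" > "$HOME/.aws/credentials"
    ) || return 1
    # Also tighten permissions if the file already existed.
    chmod 600 "$HOME/.aws/credentials"
}

# Usage (placeholders for your own keys):
#   write_aws_credentials "my_access_key" "my_secret_key"
```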
Set the service endpoint for Allas and read the file using the vsis3 driver:
export AWS_S3_ENDPOINT=a3s.fi
gdalinfo /vsis3/<name_of_your_bucket>/<name_of_your_file>
License and citing
GDAL/OGR is licensed under an MIT/X style license.
In your publications, please also acknowledge oGIIR and CSC, for example: “The authors wish to acknowledge for computational resources CSC – IT Center for Science, Finland (urn:nbn:fi:research-infras-2016072531) and the Open Geospatial Information Infrastructure for Research (oGIIR, urn:nbn:fi:research-infras-2016072513).”