GDAL library has an API which provides geocoding capabilities using webservices, supported providers are: OSM_NOMINATIM, MAPQUEST_NOMINATIM, YAHOO, GEONAMES or BING.
I’ve been busy with geocoding since I’ve developed a popular QGIS geocoding plugin, while GeoCoding plugin uses GeoPy, which is a pure-python implementation of geocoding webservices consumers, in this post I will explore GDAL geocoding capabilities using python.
Unfortunately, geocoding functions seems not to be exposed in the swig-generated python API, nor they are available in the pythonic gdal API provided by django (GeoDjango GDAL API), this time I’ll try to build an API using ctypes for bindings.
First geocoding test
This first test will load the shared library using ctypes and display the first result.
#!/bin/env python
"""
Testing geocoding GDAL implementation
"""
from ctypes import *
# This loads libgdal shared library
lgdal = CDLL('libgdal.so')
# Setup bindings telling ctypes what types are returned
# from the library calls.
lgdal.OGR_F_GetFieldAsString.restype = c_char_p # null terminated string
lgdal.OGR_Fld_GetNameRef.restype = c_char_p # null terminated string
# Tells ctypes that this function expects a pointer and a string buffer
# (still a pointer)
lgdal.OGR_G_ExportToWkt.argtypes = [c_void_p, POINTER(c_char_p)]
This first part of the program loads the shared library (libgdal.so) and setup some bindings.
This is important because python doesn’t know the types returned from the library functions or the type of the functions parameters, in order to return python strings from C null terminated char * we need to set c_char_p as restype (the returned type) and in order to pass a buffer pointer to OGR_G_ExportToWkt function we are specifying POINTER(c_char_p) as the second parameter.
# Starts a geocoding session with default values (no configuration)
hSession = lgdal.OGRGeocodeCreateSession(None)
# Search for the location "Milano" and put results in a layer
hLayer = lgdal.OGRGeocode(hSession, "Milano", None, None)
This piece of code creates a geocoding session, this session normally holds all configuration parameters (such as the geocoder to use, the user credentials and other geocoder-specific parameters), in this example we are just using the default configuration which uses Nominatim geocoder.
The geocoding results are stored in a layer, layer’s address is stored in hLayer.
# Get layer definitions
hFDefn = lgdal.OGR_L_GetLayerDefn(hLayer)
# Get the first feature from the results layer
hFeature = lgdal.OGR_L_GetNextFeature(hLayer)
# Read all fields
for iField in range (lgdal.OGR_FD_GetFieldCount(hFDefn)):
hFieldDefn = lgdal.OGR_FD_GetFieldDefn( hFDefn, iField )
print("%s\t" % lgdal.OGR_Fld_GetNameRef( hFieldDefn )),
print( "%s," % lgdal.OGR_F_GetFieldAsString( hFeature, iField))
Here we are reading field definitions, and the first feature from the results layer, looping over all its fields we are able to print field names and values.
We are still missing the geometry informations, that are available as an OGRGeometry object.
In the last part of the program we are going to extract the geometry and print as WKT string.
# Print geometry as WKT
hGeometry = lgdal.OGR_F_GetGeometryRef(hFeature)
buf = c_char_p()
lgdal.OGR_G_ExportToWkt(hGeometry, buf)
print( "%s" % buf.value )
This is the output:
place_id 97562386,
osm_type relation,
osm_id 44915,
place_rank 16,
boundingbox 45.3867378234863,45.5358505249023,9.04065799713135,9.27803993225098,
lat 45.4666211,
lon 9.1906166,
display_name Milano, MI, LOM, Italia,
class place,
type city,
importance 0.87585157580298,
icon http://nominatim.openstreetmap.org/images/mapicons/poi_place_city.p.20.png,
city Milano,
county MI,
state LOM,
country Italia,
country_code it,
village ,
hamlet ,
postcode ,
locality ,
POLYGON ((9.040658 45.4474642,9.0411095 45.4477548, ... ))
Configuring the geocoder
In this second example, we will use a different geocoder with a new session, but first we will abstract the feature printing process in its own function. Also, since ctypes does not support enums we are explicitely setting some GDAL constants.
# [...] load gdal as in previous example...
# ctypes does not support enums...
# enum OGRFieldType
OFTInteger = 0
OFTIntegerList = 1
OFTReal = 2
OFTRealList = 3
OFTString = 4
OFTStringList = 5
OFTWideString = 6
OFTWideStringList = 7
OFTBinary = 8
OFTDate = 9
OFTTime = 10
OFTDateTime = 11
def printGeoCodeResults(hLayer):
for iFeature in range(lgdal.OGR_L_GetFeatureCount( hLayer)):
hFDefn = lgdal.OGR_L_GetLayerDefn(hLayer)
hFeature = lgdal.OGR_L_GetNextFeature(hLayer)
print( "=" * 80)
print( "Fature #: %s" % iFeature)
print( "=" * 80)
for iField in range (lgdal.OGR_FD_GetFieldCount(hFDefn)):
hFieldDefn = lgdal.OGR_FD_GetFieldDefn( hFDefn, iField )
print("%s\t" % lgdal.OGR_Fld_GetNameRef( hFieldDefn )),
if lgdal.OGR_Fld_GetType(hFieldDefn) == OFTInteger:
print( "%d," % lgdal.OGR_F_GetFieldAsInteger( hFeature, iField ))
elif lgdal.OGR_Fld_GetType(hFieldDefn) == OFTReal:
print( "%.3f," % lgdal.OGR_F_GetFieldAsDouble( hFeature, iField))
elif lgdal.OGR_Fld_GetType(hFieldDefn) == OFTString :
print( "%s," % lgdal.OGR_F_GetFieldAsString( hFeature, iField) )
else:
print( "%s," % lgdal.OGR_F_GetFieldAsString( hFeature, iField) )
# Get geometry as WKT:
hGeometry = lgdal.OGR_F_GetGeometryRef(hFeature)
if hGeometry is not None:
buf = c_char_p()
lgdal.OGR_G_ExportToWkt(hGeometry, buf)
print( "%s" % buf.value )
else:
print( "no geometry" )
We can now print all retrieved features with this piece of code:
# Starts a geocoding session with default values
hSession = lgdal.OGRGeocodeCreateSession()
hLayer = lgdal.OGRGeocode(hSession, "Milan", None, None)
printGeoCodeResults(hLayer)
# Destroy the session freeing allocated memory and buffers
lgdal.OGRGeocodeFreeResult(hLayer)
lgdal.OGRGeocodeDestroySession(hSession)
Note that we are calling OGR cleanup functions that destroy session and free the result layer, this is not necessary in this short script (since python will release all memory when exits) but must be taken in consideration for long runnning processes or complex programs.
Let’s now start another session using Mapquest geocoder instead of the default Nominatim:
# Starts a geocoding session with no cache and
# Mapquest geocoder
config = ["SERVICE=MAPQUEST_NOMINATIM", "WRITE_CACHE=FALSE"]
# Holds config strings
sessionConfigType = c_char_p * len(config)
sessionConfig = sessionConfigType(*config)
# Pass by reference, the C parameter is: char ** papszOptions
hSession = lgdal.OGRGeocodeCreateSession(byref(sessionConfig))
hLayer = lgdal.OGRGeocode(hSession, "Milan", None, None)
printGeoCodeResults(hLayer)
# Destroy the session freeing allocated memory and buffers
lgdal.OGRGeocodeFreeResult(hLayer)
lgdal.OGRGeocodeDestroySession(hSession)
A full list of config options can be found on GDAL API documentation.
In the next article I’ll show how we can build a pythonic API around GDAL geocoding features: stay tuned!