In the last post we explained how to make a little more complex calculations with gpu.js. But, how efficient is?

The temperature calculation is a task I did many years ago, with pure python. Using pure python is a really bad idea in this case, having tools like numpy, cython, etc. The times were about 50 seconds or more, while gpu.js lasts about 1.5 seconds! More than an order of magnitude.

The code

I made an example script to test the timing. The result should be the same as in gpu.js, but I made the residuals interpolation calculations in different alternatives, two of them may be different.

To run the script you will need two things:


My pip list command returns this:

cycler (0.10.0)
Cython (0.28.5)
GDAL (2.3.1)
matplotlib (2.2.3)
numpy (1.15.1)
scikit-learn (0.19.2)
scipy (1.1.0)
sklearn (0.0)

Basically, scikit-learn, with numpy and scipy plus the cython library. Also, matplotlib to plot the data.

To compile the cython part, there is a file that has to be run by:

python build_ext --inplace

Now, by running


You will get all the benchmarks

Multi linear regression

To get the regression coefficients, I used scikit-learn:

def calculate_regression(data_file):
    regr = LinearRegression()

    with open(data_file) as f_p:
        data = load(f_p)
        temps = []
        predictors = []
        lats = []
        lons = []
        for station_data in data:
            predictors.append([station_data['alt'], station_data['dist']])
            lons.append(station_data['lon']), temps)
        score = regr.score(predictors, temps)
        residuals = regr.predict(predictors) - temps

        print("Multiple linear regression score: {}".format(score))
        return {'coefs': regr.coef_, 'intercept': regr.intercept_,
                'residuals': array(residuals),
                'lats': array(lats), 'lons': array(lons)}

Which is quite straightforward. Just prepare the data and follow the docs.

Note that the residuals are created applying the regression to the original data:

residuals = regr.predict(predictors) - temps

It’s a clean and fast way to do it and allows to access the results later in the script.

Applying the regression

Applying the regression results is easy with numpy, since it’s just adding several matrices:

def create_regression_field(regression, vars_file):
    d_s = gdal.Open(vars_file)
    distances = d_s.GetRasterBand(1).ReadAsArray()
    altitudes = d_s.GetRasterBand(2).ReadAsArray()
    temperature = (regression['intercept'] +
                   altitudes * regression['coefs'][0] +
                   distances * regression['coefs'][1])
    return temperature

Interpolating the residuals

Interpolating the residuals can be done in several ways. I’ve tested three, two after looking example around and the original I used both at my workplace and in the gpu.js example.


The radial basis function is the one most srecommended by scipy. The results can be a bit strange and the performance is poor, but:

def rbf(regression, dimensions):
    xi = linspace(regression['lons'].min(), regression['lons'].max(),
    yi = linspace(regression['lats'].min(), regression['lats'].max(),
    xi, yi = meshgrid(xi, yi)
    xi, yi = xi.flatten(), yi.flatten()
    interp = Rbf(regression['lons'], regression['lats'],
                 regression['residuals'], function='linear')

    residuals_field = interp(xi, yi).reshape(dimensions)
    return residuals_field

The code, basically prepares the data for the Rbf function.


The inverse of the distance weighted code is taken from a GitHub repo. It’s really efficient and the result is good, but more difficult to understand than the regular inverse of the distance. Also, maintains steep changes, which is not the best situation in our case, where we want a smooth residuals field all around, even if a single station has a different local value:

def idw(regression, dimensions):
    X1 = array(list(zip(regression['lons'], regression['lats'])))

    idw_tree = tree(X1, regression['residuals'])

    xi = linspace(regression['lons'].min(), regression['lons'].max(),
    yi = linspace(regression['lats'].min(), regression['lats'].max(),
    X2 = meshgrid(xi, yi)
    X2 = reshape(X2, (2, -1)).T
    z2 = idw_tree(X2)

    return z2.reshape(dimensions)

Again, the code is basically preparing the data for the function.

Inverse of the distance using cython

This is the original code I used, and the one in the previous post. Calculating it with pure numpy was a bit difficult, so I made the original algorithm optimized with cython, so it’s as fast as coded in C. The code to call it is:

def cython_id(regression, dimensions):

    data = {}

    for i in range(len(regression['lons'])):
        data[i] = {'x': regression['lons'][i],
                   'y': regression['lats'][i],
                   'value': regression['residuals'][i]}

    geotransform = [min(regression['lons']),
        (max(regression['lons']) - min(regression['lons']))/dimensions[1],
        (min(regression['lats']) - max(regression['lats']))/dimensions[0]

    result = interpolate_residuals(data, dimensions, geotransform)
    return result

Note that I used geotransform, which turns things properly.

The cython code is:

#cython: boundscheck=False, wraparound=False, nonecheck=False, cdivision=True

import numpy as np
cimport numpy as np
from libc.math cimport sqrt
from libc.math cimport pow
from cpython cimport array
import array

DTYPE = np.float64
ctypedef np.float64_t DTYPE_t

def interpolate_residuals(residues, size, geotransform):
    cdef array.array da = array.array('d', [])
    array.resize(da, size[0] * size[1])
    cdef double[:] cda = da

    xpos0 = []
    ypos0 = []
    values0 = []

    for key in residues.keys():

    cdef int N
    N = len(xpos0)
    cdef array.array xpos = array.array('d', xpos0)
    cdef double[:] cxpos = xpos
    cdef array.array ypos = array.array('d', ypos0)
    cdef double[:] cypos = ypos
    cdef array.array values = array.array('d', values0)
    cdef double[:] cvalues = values

    cdef int i, j
    cdef int xsize = size[1]
    cdef int ysize = size[0]
    cdef double y
    cdef double x

    cdef array.array geotransform0 = array.array('d', geotransform)
    cdef double[:] cgeotransform = geotransform0

    for j in range(ysize):
        y = cgeotransform[3] + j * cgeotransform[5]
        for i in range(xsize):
            x = cgeotransform[0] + i * cgeotransform[1]
            cda[i + j * xsize] = point_residue(x, y, cxpos, cypos, cvalues, N)

    data_array = np.array(cda)
    return data_array.reshape(size)

cdef float point_residue(double x, double y, double[:] xpos, double[:] ypos, double[:] values, int N):
    cdef int power = 2
    cdef int smoothing = 0
    cdef double numerator = 0
    cdef int i
    cdef double denominator
    denominator = 0

    for i in range(N):
        dist = sqrt((x - xpos[i]) ** 2 + (
            y - ypos[i]) ** 2 + smoothing * smoothing)

        if dist < 0.00000000001:
            return values[i]
        numerator = numerator + (values[i] / pow(dist, power))
        denominator = denominator + (1 / pow(dist, power))

    if denominator != 0:
        return numerator / denominator

You have to run

python build_ext --inplace

to compile it before running the script for the first time.


In my computer, which is not a new or powerful one, the times were, for the common steps:

Operation Elapsed time
Regression time 3 ms
Temperature field time 44 ms
Final field time 2 ms
Drawing time 402 ms

With the different methods, the times were:

Operation Residuals field time Total time
Rbf 4101 ms 4551 ms
idw 881 ms 1084 ms
cython 2571 ms 2775 ms

So, in the first place, the residuals interpolation is, by far, the most expensive step. The IDW method I found is the fastest option, although I’m not sure that the result is as good as the cython method with the classical inverse of the distance.

The original gpu.js method lasted:

Operation Elapsed time
Multiple linear regression 2 ms
Calculate the regression field 209 ms
Calculate the residuals field 1084 ms
Calculate the final field 52 ms
Draw the regression field 65 ms
Draw residuals field 70 ms
Draw final result 67 ms
Total time 1549 ms

So it’s a really good performance if you think that it’s run on the browser using a non compiled language (although using the GPU, of course!)

Finally, it would be nice to check the performance against python + GPU, but I have never worked with it.