# gpu.js performance

In the last post we explained how to make slightly more complex calculations with gpu.js. But how efficient is it?

The temperature calculation is a task I did many years ago with pure Python. Pure Python is a really bad idea in this case, when tools like numpy and cython are available. The times were about 50 seconds or more, while gpu.js takes about 1.5 seconds! That is more than 30 times faster, well over an order of magnitude.

## The code

I made an example script to test the timing. The result should be the same as with gpu.js, but I implemented the residuals interpolation in three alternative ways, and two of them may give different results.

To run the script you will need two things:

### Dependencies

My *pip list* command returns this:

```
cycler (0.10.0)
Cython (0.28.5)
GDAL (2.3.1)
matplotlib (2.2.3)
numpy (1.15.1)
scikit-learn (0.19.2)
scipy (1.1.0)
sklearn (0.0)
```

Basically, you need scikit-learn (with numpy and scipy) plus the cython library, and matplotlib to plot the data.

To compile the cython part, there is a *setup.py* file that has to be run with:

```
python setup.py build_ext --inplace
```

Now, by running

```
python calculate_temp.py
```

you will get all the benchmarks.
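One simple way to collect per-step timings like the ones reported in the results is a small context manager around `time.perf_counter`. This is only a sketch: the `timed` helper, the `timings` dictionary and the dummy workload are hypothetical names for illustration, not taken from the script.

```python
import time
from contextlib import contextmanager

timings = {}  # label -> elapsed milliseconds


@contextmanager
def timed(label):
    """Store the wall-clock time of the enclosed block, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = (time.perf_counter() - start) * 1000.0


# Dummy workload standing in for a real step such as the regression.
with timed("Dummy step"):
    total = sum(i * i for i in range(100_000))

for label, ms in timings.items():
    print(f"{label}: {ms:.0f} ms")
```

Wrapping each step in its own `with timed(...)` block keeps the measurement code out of the calculations themselves.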

### Multi linear regression

To get the regression coefficients, I used scikit-learn, which is quite straightforward: just prepare the data and follow the docs.

Note that the residuals are created applying the regression to the original data:

```
residuals = regr.predict(predictors) - temps
```

It’s a clean and fast way to do it, and it keeps the results available for later in the script.
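As a rough illustration of the regression and residuals steps, here is a minimal sketch. The station values and the two predictors (altitude and distance to the sea) are made up for the example. Only the `LinearRegression` calls and the residuals line follow the description above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical station data: one row per station, one column per predictor
# (altitude in metres, distance to the sea in km). Values are invented.
predictors = np.array([
    [10.0, 1.0],
    [250.0, 20.0],
    [800.0, 60.0],
    [1500.0, 90.0],
    [2100.0, 120.0],
])
temps = np.array([16.2, 14.1, 10.9, 6.0, 2.3])  # invented temperatures

regr = LinearRegression()
regr.fit(predictors, temps)

# Residuals: prediction at each station minus the observed value.
residuals = regr.predict(predictors) - temps

print("coefficients:", regr.coef_)
print("intercept:", regr.intercept_)
print("residuals:", residuals)
```

Keeping `regr` around means the coefficients (`regr.coef_`, `regr.intercept_`) can be reused when the regression is applied to the whole raster.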

### Applying the regression

Applying the regression results is easy with numpy, since it’s just a weighted sum of several matrices.
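A minimal sketch of that sum follows, assuming two predictor rasters of the same shape. The coefficient values and the tiny 2x2 grids are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical regression results (not the values from the script).
intercept = 15.0
coef_altitude = -0.006  # degrees per metre
coef_dist = -0.01       # degrees per km

# Hypothetical predictor rasters, all with the same (rows, cols) shape.
altitude = np.array([[0.0, 500.0], [1000.0, 2000.0]])
dist_sea = np.array([[0.0, 10.0], [50.0, 100.0]])

# Element-wise: each cell gets intercept + sum(coefficient * predictor).
temp_field = intercept + coef_altitude * altitude + coef_dist * dist_sea

print(temp_field)
```

Since numpy applies the operations element by element, no explicit loop over the raster cells is needed.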

### Interpolating the residuals

Interpolating the residuals can be done in several ways. I’ve tested three: two that I found after looking around for examples, and the original one that I used both at my workplace and in the gpu.js example.

#### rbf

The radial basis function is the option most recommended for scipy, although the results can be a bit strange and the performance is poor.

The code basically prepares the data for the *Rbf* function.
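A minimal sketch of that preparation, using `scipy.interpolate.Rbf`, could look like this. The station positions, the residual values, the grid size and the choice of the `linear` kernel are all assumptions for the example. The kernel actually used in the script may differ.

```python
import numpy as np
from scipy.interpolate import Rbf

# Hypothetical station coordinates and residuals.
x = np.array([0.0, 10.0, 0.0, 10.0, 5.0])
y = np.array([0.0, 0.0, 10.0, 10.0, 5.0])
residuals = np.array([0.5, -0.3, 0.1, -0.4, 0.2])

# Build the interpolator from the scattered points.
rbf = Rbf(x, y, residuals, function="linear")

# Evaluate it on a regular grid covering the area.
xi, yi = np.meshgrid(np.linspace(0, 10, 21), np.linspace(0, 10, 21))
residual_field = rbf(xi, yi)

print(residual_field.shape)
```

With the default `smooth=0`, the interpolated surface passes through the station values exactly.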

#### idw

The inverse distance weighting code is taken from a GitHub repo. It’s really efficient and the result is good, but it is more difficult to understand than the regular inverse of the distance. It also preserves steep changes, which is not ideal in our case, where we want a smooth residuals field all around, even if a single station has a different local value.

Again, the code basically prepares the data for the function.

#### Inverse of the distance using cython

This is the original code I used, and the one in the previous post. Calculating it with pure numpy was a bit difficult, so I optimized the original algorithm with cython, which makes it as fast as if it were coded in C.

Note that the calling code uses the geotransform, so that pixel positions are converted properly to map coordinates.

For the cython code, you have to run

```
python setup.py build_ext --inplace
```

to compile it before running the script for the first time.
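For reference, the classical inverse of the distance can be sketched in plain numpy as below. This is not the compiled cython code, only an illustration of the weighting idea. The function name `idw_grid`, the power of 2, the `eps` guard and the sample points are assumptions.

```python
import numpy as np


def idw_grid(xs, ys, values, grid_x, grid_y, power=2.0, eps=1e-12):
    """Weighted mean of the station values, with weights 1 / distance**power."""
    # Distances from every grid cell to every station: shape (rows, cols, n).
    dx = grid_x[..., np.newaxis] - xs
    dy = grid_y[..., np.newaxis] - ys
    dist = np.sqrt(dx ** 2 + dy ** 2)
    # eps avoids a division by zero when a cell sits exactly on a station.
    weights = 1.0 / (dist ** power + eps)
    return (weights * values).sum(axis=-1) / weights.sum(axis=-1)


# Two hypothetical stations and a one-row grid between them.
xs = np.array([0.0, 10.0])
ys = np.array([0.0, 0.0])
values = np.array([1.0, 3.0])
grid_x, grid_y = np.meshgrid(np.linspace(0, 10, 3), np.array([0.0]))

field = idw_grid(xs, ys, values, grid_x, grid_y)
print(field)
```

Note that this version builds a full (rows, cols, stations) array in memory, which can get heavy on a large raster. An explicit compiled loop avoids that intermediate array.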

## Results

On my computer, which is neither new nor powerful, the times for the common steps were:

Operation | Elapsed time |
---|---|
Regression time | 3 ms |
Temperature field time | 44 ms |
Final field time | 2 ms |
Drawing time | 402 ms |

With the different methods, the times were:

Method | Residuals field time | Total time |
---|---|---|
Rbf | 4101 ms | 4551 ms |
idw | 881 ms | 1084 ms |
cython | 2571 ms | 2775 ms |

First of all, the residuals interpolation is by far the most expensive step. The IDW method I found is the fastest option, although I’m not sure that its result is as good as that of the cython method with the classical inverse of the distance.

The original gpu.js method took:

Operation | Elapsed time |
---|---|
Multiple linear regression | 2 ms |
Calculate the regression field | 209 ms |
Calculate the residuals field | 1084 ms |
Calculate the final field | 52 ms |
Draw the regression field | 65 ms |
Draw residuals field | 70 ms |
Draw final result | 67 ms |
Total time | 1549 ms |

So the performance is really good, considering that it runs in the browser in a non-compiled language (although using the GPU, of course!).

Finally, it would be nice to check the performance against python + GPU, but I have never worked with it.