|8.1.2010||The year I started blogging (blogware)|
|9.1.2010||Linux initramfs with iSCSI and bonding support for PXE booting|
|9.1.2010||Using manually tweaked PTX assembly in your CUDA 2 program|
|9.1.2010||OpenCL autoconf m4 macro|
|9.1.2010||Mandelbrot with MPI|
|10.1.2010||Using dynamic libraries for modular client threads|
|11.1.2010||Creating an OpenGL 3 context with GLX|
|11.1.2010||Creating a double buffered X window with the DBE X extension|
|12.1.2010||A simple random file read benchmark|
|14.12.2011||Change local passwords via RoundCube safer|
|5.1.2012||Multi-GPU CUDA stress test|
|6.1.2012||CUDA (Driver API) + nvcc autoconf macro|
|29.5.2012||CUDA (or OpenGL) video capture in Linux|
|31.7.2012||GPGPU abstraction framework (CUDA/OpenCL + OpenGL)|
|7.8.2012||OpenGL (4.3) compute shader example|
|10.12.2012||GPGPU face-off: K20 vs 7970 vs GTX680 vs M2050 vs GTX580|
|4.8.2013||DAViCal with Windows Phone 8 GDR2|
|5.5.2015||Sample pattern generator|
Although I like theorizing about different sampling patterns, in actual implementations I only care about a handful of practical properties:
To nail the above points, this sample pattern generator works as follows:
Don't hate me, but this is again UNIX-only. OTOH, it depends only on libpng and runs fine on OS X. Tweak the defines in main.cpp to your liking and run: make && ./sampling
sampling-1.0.tar.gz
Enjoy your PNG!
Firstly, the initial sample set doesn't cover the area perfectly uniformly, i.e. it doesn't follow a maximal Poisson disk distribution. If this is a problem, a different initial distribution should be easy to drop in. Check out the literature; ways exist.
Secondly, the spatial sorting of the sets can maintain good coherence for all but the last couple of indices. This gives you optimal cache coherency for the majority of the indices at the expense of the last few. I reckon this is generally better than having uniform but suboptimal coherency for all indices, but it depends on the hardware and the cache characteristics of your algorithm.
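As for the first caveat, here is a rough idea of what a drop-in Poisson disk initial distribution could look like. This is not the code from the tarball, just a minimal sketch of naive dart throwing in the unit square. The `Sample` struct, the function name `poissonDiskDartThrowing` and its parameters are all made up for illustration. It is O(n²) and only approximately maximal, since it simply gives up after a number of consecutive rejected candidates.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical 2D sample type; not taken from the original main.cpp.
struct Sample {
    float x, y;
};

// Naive dart throwing: propose uniform random points in the unit square and
// keep a candidate only if it is at least minDist away from every sample
// accepted so far. Stops after maxFailures consecutive rejections, so the
// result is only approximately maximal.
std::vector<Sample> poissonDiskDartThrowing(float minDist, int maxFailures, unsigned seed)
{
    assert(minDist > 0.0f && maxFailures > 0);

    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> uni(0.0f, 1.0f);

    std::vector<Sample> accepted;
    const float minDist2 = minDist * minDist;
    int failures = 0;

    while (failures < maxFailures) {
        const Sample c{uni(rng), uni(rng)};

        // Brute-force distance check against all accepted samples.
        bool ok = true;
        for (const Sample& s : accepted) {
            const float dx = s.x - c.x;
            const float dy = s.y - c.y;
            if (dx * dx + dy * dy < minDist2) {
                ok = false;
                break;
            }
        }

        if (ok) {
            accepted.push_back(c);
            failures = 0;   // reset the give-up counter on success
        } else {
            ++failures;
        }
    }
    return accepted;
}
```

Calling something like `poissonDiskDartThrowing(0.05f, 5000, 1234u)` gives you a point set with a guaranteed minimum distance. For anything bigger than a few thousand samples you'd want a grid-accelerated method from the literature instead of the brute-force inner loop.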