8.1.2010 | The year I started blogging (blogware) |
9.1.2010 | Linux initramfs with iSCSI and bonding support for PXE booting |
9.1.2010 | Using manually tweaked PTX assembly in your CUDA 2 program |
9.1.2010 | OpenCL autoconf m4 macro |
9.1.2010 | Mandelbrot with MPI |
10.1.2010 | Using dynamic libraries for modular client threads |
11.1.2010 | Creating an OpenGL 3 context with GLX |
11.1.2010 | Creating a double buffered X window with the DBE X extension |
12.1.2010 | A simple random file read benchmark |
14.12.2011 | Change local passwords via RoundCube safer |
5.1.2012 | Multi-GPU CUDA stress test |
6.1.2012 | CUDA (Driver API) + nvcc autoconf macro |
29.5.2012 | CUDA (or OpenGL) video capture in Linux |
31.7.2012 | GPGPU abstraction framework (CUDA/OpenCL + OpenGL) |
7.8.2012 | OpenGL (4.3) compute shader example |
10.12.2012 | GPGPU face-off: K20 vs 7970 vs GTX680 vs M2050 vs GTX580 |
4.8.2013 | DAViCal with Windows Phone 8 GDR2 |
5.5.2015 | Sample pattern generator |
Although I like theorizing about different sampling patterns, in practical implementations I'm only concerned with a handful of properties:
To nail the above points, this sample pattern generator works as follows:
The whole process is visualized in ASCII. This example uses a 1/(1 + r^2) density function. First, the initial pool of samples is generated (a rough code sketch of these first stages follows the walkthrough):
Next the generated subsets are shown:
And after sorting, samples at each index for all subsets:
Finally, the full set. Each subset has its own color, and each index has its own symbol:
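The real code is in the tarball below, but as a rough sketch of the first two stages (my own illustration, not the shipped main.cpp; the constant names and the plain interleaved subset assignment are assumptions), generating a pool whose density follows 1/(1 + r^2) by rejection sampling and dealing it out into subsets could look like this:

// Minimal sketch (not the shipped main.cpp): build a pool of 2D samples on
// [-1,1]^2 whose density follows 1/(1 + r^2), then deal them into subsets.
// The names and the plain interleaved subset assignment are illustrative only.
#include <cstdio>
#include <random>
#include <vector>

struct Sample { float x, y; };

int main()
{
    const int numSubsets    = 8;   // assumed knobs; the real tool uses
    const int samplesPerSet = 64;  // defines in main.cpp
    const int poolSize      = numSubsets * samplesPerSet;

    std::mt19937 rng(1234);
    std::uniform_real_distribution<float> uni(-1.0f, 1.0f);
    std::uniform_real_distribution<float> uni01(0.0f, 1.0f);

    // Rejection sampling: accept a uniform candidate with probability
    // proportional to the target density 1/(1 + r^2).
    std::vector<Sample> pool;
    while ((int)pool.size() < poolSize) {
        Sample s{uni(rng), uni(rng)};
        float r2 = s.x * s.x + s.y * s.y;
        if (uni01(rng) < 1.0f / (1.0f + r2))
            pool.push_back(s);
    }

    // Deal the pool into subsets (here: simple interleaving).
    std::vector<std::vector<Sample>> subsets(numSubsets);
    for (int i = 0; i < poolSize; ++i)
        subsets[i % numSubsets].push_back(pool[i]);

    std::printf("%d subsets x %d samples\n", numSubsets, (int)subsets[0].size());
    return 0;
}

A real generator would want to pick subset members more carefully than this, so that each subset also stays well distributed on its own.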
Don't hate me, but this is again UNIX-only. On the other hand, its only dependency is libpng and it runs fine on OS X. Tweak the defines in main.cpp to your liking and: make && ./sampling
Download: sampling-1.0.tar.gz
Enjoy your PNG!
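For reference, the PNG output itself needs nothing beyond stock libpng; a minimal writer for an 8-bit RGB buffer (again only a sketch, not the rendering code in main.cpp) looks roughly like this:

// Sketch: dump a tightly packed 8-bit RGB buffer to a PNG file with libpng.
#include <png.h>
#include <cstdio>
#include <vector>

bool writePng(const char* path, const std::vector<unsigned char>& rgb, int w, int h)
{
    FILE* fp = std::fopen(path, "wb");
    if (!fp) return false;
    png_structp png = png_create_write_struct(PNG_LIBPNG_VER_STRING, nullptr, nullptr, nullptr);
    png_infop info = png_create_info_struct(png);
    if (!png || !info || setjmp(png_jmpbuf(png))) { std::fclose(fp); return false; }
    png_init_io(png, fp);
    png_set_IHDR(png, info, w, h, 8, PNG_COLOR_TYPE_RGB,
                 PNG_INTERLACE_NONE, PNG_COMPRESSION_TYPE_DEFAULT, PNG_FILTER_TYPE_DEFAULT);
    png_write_info(png, info);
    for (int y = 0; y < h; ++y)                       // one row of w RGB pixels at a time
        png_write_row(png, const_cast<png_bytep>(&rgb[y * w * 3]));
    png_write_end(png, nullptr);
    png_destroy_write_struct(&png, &info);
    std::fclose(fp);
    return true;
}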
Two caveats. Firstly, the initial sample set doesn't cover the area perfectly uniformly, i.e. it doesn't follow a maximal Poisson-disk distribution. If this is a problem, a different initial distribution should be easy to drop in. Check out the literature; ways exist.
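If you do want a more uniform starting pool, one drop-in possibility (purely illustrative: naive O(n^2) dart throwing, which still isn't maximal) would be along these lines:

// Illustrative only: naive dart-throwing toward a Poisson-disk-like pool.
// Quadratic in the sample count and not maximal, but it shows the shape of
// an alternative initial-distribution generator.
#include <random>
#include <vector>

struct Sample { float x, y; };

std::vector<Sample> dartThrow(int target, float minDist, unsigned seed = 1)
{
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> uni(-1.0f, 1.0f);
    std::vector<Sample> out;
    int attempts = 0;
    const int maxAttempts = target * 1000;   // give up eventually
    while ((int)out.size() < target && attempts++ < maxAttempts) {
        Sample c{uni(rng), uni(rng)};
        bool ok = true;
        for (const Sample& s : out) {        // reject candidates too close to accepted samples
            float dx = c.x - s.x, dy = c.y - s.y;
            if (dx * dx + dy * dy < minDist * minDist) { ok = false; break; }
        }
        if (ok) out.push_back(c);
    }
    return out;
}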
Secondly, the spatial sorting of the sets maintains good coherence for all but the last couple of indices. This gives you optimal cache coherency for the majority of indices at the expense of the last few. I reckon this is generally better than having uniform but suboptimal coherency across all indices, but it depends on the HW and the cache characteristics of your algorithm.
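I won't claim this is exactly what main.cpp does, but one way to get this kind of per-index coherence is to sort every subset along a Morton (Z-order) curve, so a given index lands in roughly the same region of the domain in each subset:

// One illustrative way (not necessarily the author's) to spatially sort a
// subset: order its samples by Morton/Z-order code so that nearby indices
// are nearby in space, and the same index is coherent across subsets.
#include <algorithm>
#include <cstdint>
#include <vector>

struct Sample { float x, y; };

// Spread the low 16 bits of v so they occupy the even bit positions.
static std::uint32_t spreadBits(std::uint32_t v)
{
    v &= 0xffff;
    v = (v | (v << 8)) & 0x00ff00ff;
    v = (v | (v << 4)) & 0x0f0f0f0f;
    v = (v | (v << 2)) & 0x33333333;
    v = (v | (v << 1)) & 0x55555555;
    return v;
}

static std::uint32_t mortonCode(const Sample& s)
{
    // Map [-1,1] to [0,65535] and interleave the x and y bits.
    std::uint32_t xi = (std::uint32_t)((s.x * 0.5f + 0.5f) * 65535.0f);
    std::uint32_t yi = (std::uint32_t)((s.y * 0.5f + 0.5f) * 65535.0f);
    return spreadBits(xi) | (spreadBits(yi) << 1);
}

void sortSubsetSpatially(std::vector<Sample>& subset)
{
    std::sort(subset.begin(), subset.end(),
              [](const Sample& a, const Sample& b) {
                  return mortonCode(a) < mortonCode(b);
              });
}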