OpenGL (4.3) compute shader example
7.8.2012


OpenGL 4.3 was released yesterday, and among the larger updates were compute shaders. Today, since I couldn't find a tutorial or example on Google, I'm going to show you how to use them.

Compute shaders in the pipeline

The important thing to note is that while the other shaders have a fixed execution order, compute shaders can essentially alter any data anywhere. Shader objects within a program object are implicitly pipelined one after another, and a program object is "ready to go" as it is. Compute shaders cannot be baked into a program object alongside other shaders because their execution order is not fixed. Instead, compute shaders have to be placed into program objects by themselves, and the application has to tell OpenGL the execution order explicitly by switching the compute shader program object on and off and calling DispatchCompute*() to run the compute shaders.

OpenGL compute shaders are written in GLSL and are similar to other shaders: you can read textures, images, and buffers, and write images and buffers. Just like with other GPGPU implementations, threads are grouped into work groups, and one compute shader dispatch processes a number of work groups. The work group size is specified along with the kernel source code, and the number of work groups launched is given by the application as arguments to DispatchCompute*().


You should know when to choose a compute shader over the other shaders for your algorithm (this is not one such example). The reasons to use GPGPU are universal and have nothing to do with OpenGL compute shaders specifically.

You can grab the full example program here, but the important files are main.cpp and opengl_cs.cpp. In main.cpp we create an OpenGL 4.3 context (I'm being strict and using a forward-compatible core profile, but you don't have to), a texture for the compute shader to write and the fragment shader to read, and two program objects. One object is for the compute shader and the other is for rendering (vertex + fragment shaders). After that we go into a loop where we update a counter in the compute shader, fill in the texture (as image2D), and blit the texture onto the screen.

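#include "opengl.h"

GLuint renderHandle, computeHandle;

void updateTex(int);
void draw();

int main() {
    // Create the window and the OpenGL 4.3 context here
    // (the call is elided in this excerpt).

    GLuint texHandle = genTexture();
    renderHandle = genRenderProg(texHandle);
    computeHandle = genComputeProg(texHandle);

    for (int i = 0; i < 1024; ++i) {
        updateTex(i);
        draw();
    }

    return 0;
}

void updateTex(int frame) {
    // Switch the compute program on; glUniform*() and glDispatchCompute()
    // both act on the currently active program object.
    glUseProgram(computeHandle);
    glUniform1f(glGetUniformLocation(computeHandle, "roll"), (float)frame*0.01f);
    glDispatchCompute(512/16, 512/16, 1); // 512^2 threads in blocks of 16^2
    checkErrors("Dispatch compute shader");
}

void draw() {
    // Switch to the rendering program and blit the texture as a quad.
    glUseProgram(renderHandle);
    glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
    // Present the frame here (the buffer swap call is elided in this excerpt).
    checkErrors("Draw screen");
}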

The compute shader set-up should look familiar, as it's just another shader. (There are some specifics, which are documented in the GLSL specification.)

#include "opengl.h"
#include <stdio.h>
#include <stdlib.h>

GLuint genComputeProg(GLuint texHandle) {
    // Creating the compute shader, and the program object containing the shader
    GLuint progHandle = glCreateProgram();
    GLuint cs = glCreateShader(GL_COMPUTE_SHADER);

    // In order to write to a texture, we have to introduce it as image2D.
    // The image is declared writeonly: without that qualifier GLSL requires a
    // format layout qualifier, and some compilers (e.g. Mesa) reject the shader.
    // local_size_x/y/z layout variables define the work group size.
    // gl_GlobalInvocationID is a uvec3 variable giving the global ID of the thread,
    // gl_LocalInvocationID is the local index within the work group, and
    // gl_WorkGroupID is the work group's index
    const char *csSrc[] = {
        "#version 430\n",
        "uniform float roll;\
         writeonly uniform image2D destTex;\
         layout (local_size_x = 16, local_size_y = 16) in;\
         void main() {\
             ivec2 storePos = ivec2(gl_GlobalInvocationID.xy);\
             float localCoef = length(vec2(ivec2(gl_LocalInvocationID.xy)-8)/8.0);\
             float globalCoef = sin(float(gl_WorkGroupID.x+gl_WorkGroupID.y)*0.1 + roll)*0.5;\
             imageStore(destTex, storePos, vec4(1.0-globalCoef*localCoef, 0.0, 0.0, 0.0));\
         }"
    };

    glShaderSource(cs, 2, csSrc, NULL);
    glCompileShader(cs);
    int rvalue;
    glGetShaderiv(cs, GL_COMPILE_STATUS, &rvalue);
    if (!rvalue) {
        fprintf(stderr, "Error in compiling the compute shader\n");
        GLchar log[10240];
        GLsizei length;
        glGetShaderInfoLog(cs, 10239, &length, log);
        fprintf(stderr, "Compiler log:\n%s\n", log);
        exit(EXIT_FAILURE);
    }
    glAttachShader(progHandle, cs);

    glLinkProgram(progHandle);
    glGetProgramiv(progHandle, GL_LINK_STATUS, &rvalue);
    if (!rvalue) {
        fprintf(stderr, "Error in linking compute shader program\n");
        GLchar log[10240];
        GLsizei length;
        glGetProgramInfoLog(progHandle, 10239, &length, log);
        fprintf(stderr, "Linker log:\n%s\n", log);
        exit(EXIT_FAILURE);
    }

    // glUniform*() sets uniforms of the active program, so activate it first.
    // destTex refers to image unit 0.
    glUseProgram(progHandle);
    glUniform1i(glGetUniformLocation(progHandle, "destTex"), 0);

    checkErrors("Compute shader");
    return progHandle;
}

[Image: compute shader demo]


But why did Khronos introduce compute shaders in OpenGL when they already had OpenCL and its OpenGL interoperability API? Well, OpenCL (and CUDA) are aimed at heavyweight GPGPU projects and offer more features. Also, OpenCL can run on many different types of hardware (apart from GPUs), which makes the API thick and complicated compared to lightweight compute shaders. Finally, the explicit synchronization between OpenGL and OpenCL/CUDA is troublesome to do without crudely blocking (some of the required extensions are not even supported yet). With compute shaders, however, OpenGL is aware of all the dependencies and can schedule things more intelligently. This reduction in overhead might, in the end, be the most significant benefit for graphics algorithms, which often execute for less than a millisecond.



Great article, thanks!!!
- Rich


Thank you very much!
- Aavci


Nice! Thank you!
- linsnos


Why do you pass texHandle as an argument to genRenderProg() and genComputeProg()? You haven't even used it internally. I don't know how it's supposed to work that way...
- Wonderer


Oh yeah you're right; I'm not using the parameter, so it's ignored.  There's no need to use it since it's bound to GL_TEXTURE0 during creation and kept bound throughout the program.
- wili


Thank you


Very helpful !!
- AB


to anyone having problems compiling/running this with a nvidia card, try -L/usr/lib/nvidia-xxx with g++ (xxx being your driver version) and change "uniform image2D destTex" in the shader code to "writeonly uniform image2D destTex"
- meepo


thank you @meepo
- nozam


- jimmi


Thank you!  This was easy to duplicate.  Well done.
- freeflyclone


FYI, getting the following error while trying to run: 

Window depth 24, 800x600
        vendor Intel Open Source Technology Center
        renderer Mesa DRI Intel(R) HD Graphics 5500 (Broadwell GT2) 
        version 4.6 (Core Profile) Mesa 19.3.4
        shader language 4.60
Extension "GL_ARB_compute_shader" found
Error in compiling the compute shader
Compiler log:
0:2(21): error: image uniforms not qualified with `writeonly' must have a format layout qualifier
- Misha


I changed to 
		 uniform writeonly image2D destTex;\
then it worked.
- Misha

