8.1.2010	The year I started blogging (blogware)
9.1.2010	Linux initramfs with iSCSI and bonding support for PXE booting
9.1.2010	Using manually tweaked PTX assembly in your CUDA 2 program
9.1.2010	OpenCL autoconf m4 macro
9.1.2010	Mandelbrot with MPI
10.1.2010	Using dynamic libraries for modular client threads
11.1.2010	Creating an OpenGL 3 context with GLX
11.1.2010	Creating a double buffered X window with the DBE X extension
12.1.2010	A simple random file read benchmark
14.12.2011	Change local passwords via RoundCube safer
5.1.2012	Multi-GPU CUDA stress test
6.1.2012	CUDA (Driver API) + nvcc autoconf macro
29.5.2012	CUDA (or OpenGL) video capture in Linux
31.7.2012	GPGPU abstraction framework (CUDA/OpenCL + OpenGL)
7.8.2012	OpenGL (4.3) compute shader example
10.12.2012	GPGPU face-off: K20 vs 7970 vs GTX680 vs M2050 vs GTX580
4.8.2013	DAViCal with Windows Phone 8 GDR2
5.5.2015	Sample pattern generator

8.1.2010

The year I started blogging (blogware)

Although I haven't figured out the value of public personal blogs so far, I did recently have to admit that for Unix hackers and programmers, technical blogs do seem to fill a certain gap between forum discussions and HOWTOs/white papers. So, I decided to join the movement and start a blog about hacks I wouldn't otherwise document, if only for future personal reference :-)

So, the first step was to find suitable blogware, but I instantly got confused by the choices. It didn't take me long to figure out that I wasn't really after anything fancy or bloated with features, but I did have an existing web site and would have preferred something with the exact same look. Also, I'm more comfortable creating the HTML content myself, as I can use the tools I'm already familiar with (e.g. for creating syntax highlighted code print-outs). So I decided to write a few scripts myself to serve as the blogware, and make my first blog entry about them. :-)

I hardly think these scripts are of much interest to anyone, to be perfectly honest. But if it so happens that you're in exactly the same situation as I was, and would like to save a day of quality time with your favourite scripting language, here you go.

Blog software

The idea is to have an existing HTML template tagged with anchor keywords, which the scripts then replace with blog content. For commenting, a form is created for each blog entry from a template. The form calls a cgi script, which sanitizes the input, adds the comment to the appropriate comment file, and calls a script to re-create the static HTMLs.
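
As a minimal sketch of the anchor idea, the snippet below fills two anchor keywords in a template string. The anchor names match the ones used by the scripts in this entry, but the helper fill_template and the tiny template are made up for illustration; the real scripts do the substitution line by line while reading the template file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Replace every anchor keyword found in a template string with its content.
# %$fill maps anchor keyword => replacement text.
# (fill_template is a hypothetical helper, not part of the blog scripts.)
sub fill_template {
    my ($template, $fill) = @_;
    foreach my $anchor (keys %$fill) {
        # \Q...\E keeps the anchor literal even if it contains regex characters
        $template =~ s/\Q$anchor\E/$fill->{$anchor}/g;
    }
    return $template;
}

# A made-up two-anchor template standing in for src/root_template.html
my $template = "<title>hack blog TITLEANCHOR</title>\n<body>\nBLOGANCHOR\n</body>\n";

my $page = fill_template($template, {
    TITLEANCHOR => " - My first entry",
    BLOGANCHOR  => "<h1>8.1.2010</h1><p>Hello</p>",
});
print $page;
```

Note that the sketch assumes the replacement text does not itself contain an anchor keyword; if it did, the result would depend on the order in which the anchors are processed.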

Let's start off with the main script that actually generates the HTMLs from the templates. When adding a new blog entry, just create a new directory under entries with the following files: description (date<newline>description), content.html (the actual HTML content of the blog entry), comments (empty at first). Then call the src/genroot.pl script manually in the root directory of your blog to create the initial HTMLs. Make sure that the HTTP service is allowed to create or change the HTML files, and to append to the comment files.
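
The entry layout described above could be created with something like the following sketch. The helpers write_file and new_entry, and the directory name 001_blogware, are assumptions made for illustration. Since the scripts list entries with a glob over entries/*, the directory names determine the entry order, so a sortable prefix seems like a reasonable choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Write $text to $path, replacing any previous content.
sub write_file {
    my ($path, $text) = @_;
    open(my $fh, '>', $path) or die "Couldn't write $path: $!";
    print $fh $text;
    close($fh) or die "Couldn't close $path: $!";
}

# Create <root>/entries/<name>/ with the three files genroot.pl expects.
# (new_entry is a hypothetical helper, not part of the blog scripts.)
sub new_entry {
    my ($root, $name, $date, $desc, $html) = @_;
    my $dir = "$root/entries/$name";
    make_path($dir);
    write_file("$dir/description", "$date\n$desc\n"); # date<newline>description
    write_file("$dir/content.html", $html);           # the entry body
    write_file("$dir/comments", "");                  # empty at first
    return $dir;
}

# Demo in a throwaway directory; in real use $root would be the blog root.
my $root = tempdir(CLEANUP => 1);
my $dir = new_entry($root, "001_blogware", "8.1.2010",
                    "The year I started blogging (blogware)",
                    "<p>Hello, world.</p>\n");
print "Created $dir\n";
```

After creating an entry this way, src/genroot.pl still has to be run in the blog root, and the comments file has to be writable by the HTTP service.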

This is the main script that creates the HTMLs:

#!/usr/bin/perl

use lib "src"; # This be the path where the modules are at
use gencontent;

# Whether to create individual pages for each blog entry, or to cram
# them all in one page.
my $page_separation = 1;
my $page_root = "."; # Put the entry.html files here

# This script finds an anchor from the specified root template HTML file,
# and replaces the anchor with the blog content.
my $root_template = "src/root_template.html";
my $blog_anchor = "BLOGANCHOR";
my $root_output_file = "index.html";

# We also set the title.  For root, it is "", and
# for separate entries, it's " - <description>".
my $title_anchor = "TITLEANCHOR";

open(ROOT_TEMPLATE_FILE, "<$root_template") or die "Couldn't open template file $root_template";
open(OUTPUT_FILE, ">$root_output_file") or die "Couldn't open output file $root_output_file for writing";

# We're using the same root template for individual entries
my $root_content;
if ($page_separation) {
    $root_content = getTOC("null");
    foreach my $entry (@entries) {
        # Rewinding the root file for each entry
        seek(ROOT_TEMPLATE_FILE, 0, 0);

        # Entries start with a TOC where the current entry is not a link
        my $entry_content = getTOC($entry) . "<br/><hr/>\n";
        $entry_content .= getEntryContent($entry);

        my $entry_name = (split("/", $entry))[1];
        open(ENTRY_FILE, ">$page_root/$entry_name.html") or die "Couldn't open entry html $entry_name.html for writing";

        # Extracting the description
        open(ENTRY_DESC, "<$entry/description") or die "No description file for entry $entry";
        <ENTRY_DESC>; my $desc = <ENTRY_DESC>; chomp $desc;

        while (<ROOT_TEMPLATE_FILE>) {
            s/$title_anchor/ - $desc/;
            s/$blog_anchor/$entry_content/;
            print ENTRY_FILE $_;
        }
    }
    seek(ROOT_TEMPLATE_FILE, 0, 0);
} else {
    $root_content = getBlogContent();
}

while (<ROOT_TEMPLATE_FILE>) {
    s/$title_anchor//;
    s/$blog_anchor/$root_content/;
    print OUTPUT_FILE $_;
}

exit 0;

genroot.pl

The above script delegates content creation to the following module:

# This module actually generates the blog content

my $entry_path = "entries";

# This is the template to the comment box, to be appended directly at the
# end of each entry.  We're going to find the anchor from the file,
# and replace the anchor with the name of the entry to be commented.
# (We need this to distinguish comments of different entries in comment.cgi)
my $comment_anchor = "COMMANCHOR";
my $comment_box_src = "src/comment_box.html";

@entries = <$entry_path/*>;

sub getBlogContent {
    my $content;
    $content .= getTOC();
    $content .= "<br/><hr/>\n";
    foreach my $entry (@entries) {
        $content .= getEntryContent($entry);
    }
    return $content;
}

sub getEntryContent {
    my $entry = $_[0];
    my $content;

    open(ENTRY_DESC, "<$entry/description") or die "No description file for entry $entry";
    open(CONTENT, "<$entry/content.html") or die "No content.html file for entry $entry";

    my $date = <ENTRY_DESC>;
    my $desc = <ENTRY_DESC>;
    chomp $date;
    chomp $desc;

    my $entry_name = (split("/", $entry))[1];
    $content .= "<a id=$entry_name></a>";
    $content .= "<h1>$date</h1>";
    $content .= "<h2>$desc</h2>\n";

    # We read in the content data
    while (<CONTENT>) { $content .= $_; }
    $content .= "\n";

    # Adding comments
    $content .= getCommentSection($entry);

    return $content;
}

sub getCommentSection {
    my $entry = $_[0];
    my $comment_section = "<h3>Comments</h3>\n";

    # Adding the actual comments, if comments file exists
    open(COMMENTS, "<$entry/comments") and $comment_section .= getComments(COMMENTS);

    # Adding the comment box
    $comment_section .= "<p>\n";
    open(COMMENT_BOX, "<$comment_box_src") or die "Couldn't load comment box src from $comment_box_src";
    while (<COMMENT_BOX>) { s/$comment_anchor/\"$entry\"/; $comment_section .= $_; }
    $comment_section .= "</p><br/><hr/>\n";

    return $comment_section;
}

sub getComments {
    my $FILE = $_[0];
    my $comments;
    my $fullfile;

    # Reading comment file contents to memory at once
    while (<$FILE>) { $fullfile .= $_; }

    # Fields are separated by " #"
    my @comment_array = split(/ #/, $fullfile);

    for (my $i = 0; $i < int(@comment_array/4); $i++) {
        my $date = $comment_array[$i*4 + 0];
        my $nick = $comment_array[$i*4 + 1];
        my $e_mail = $comment_array[$i*4 + 2];

        # Let's remove any whitespaces from beginning and end of these..
        $date =~ s/^\s*//g; $date =~ s/\s*$//g;
        $nick =~ s/^\s*//g; $nick =~ s/\s*$//g;
        $e_mail =~ s/^\s*//g; $e_mail =~ s/\s*$//g;

        # Cutting the message off at 8k characters in case it's spam
        my $message = substr($comment_array[$i*4 + 3], 0, 8192);

        # We're restoring all escaped #s
        $message =~ s/\\#/#/g;
        $nick =~ s/\\#/#/g;
        $e_mail =~ s/\\#/#/g;

        $comments .= "<h1><small>$date</small></h1><p>";
        $comments .= "<div id=blog_comment><pre wrap>\n$message</pre>";

        # Adding the commenter's nick/e-mail
        if (length $nick || length $e_mail) {
            my $name;
            if (length $nick) { $name = $nick; }
            else { $name = $e_mail; }

            if (length $e_mail) {
                $comments .= "- <a href=\"mailto:$e_mail\">$name</a>";
            } else {
                $comments .= "- $name";
            }
            $comments .= "\n<br/>";
        }
        $comments .= "</div></p>\n";
    }
    return $comments;
}

sub getTOC {
    my $TOC = "<h1>Table of contents</h1>\n<p>\n";
    my $passive_entry = $_[0]; # Can be empty

    $TOC .= "<table>\n";
    foreach my $entry (@entries) {
        open(ENTRY_DESC, "<$entry/description") or die "No description file for entry $entry";
        my $date = <ENTRY_DESC>;
        my $desc = <ENTRY_DESC>;
        chomp $date;
        chomp $desc;

        my $entry_name = (split("/", $entry))[1];
        if ($entry eq $passive_entry) {
            $TOC .= "<tr><td>$date</td><td>$desc</td></tr>\n";
        } else {
            # Okay this is a kludge I added when I decided to include separate
            # entry pages also.  passive_entry should be set to something non-existent
            # when root TOC is generated in "separate entries" mode.
            # This way it knows not to make the links aliases, but rather separate htmls.
            if ($passive_entry) {
                $TOC .= "<tr><td>$date</td><td><a href=\"$entry_name.html\">$desc</a></td></tr>\n";
            } else {
                $TOC .= "<tr><td>$date</td><td><a href=\"#$entry_name\">$desc</a></td></tr>\n";
            }
        }
    }
    $TOC .= "</table>\n</p>\n";

    return $TOC;
}

return 1;

gencontent.pm
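
The comments file format that getComments reads can be illustrated with a small round trip. This is only a sketch: make_record mirrors what comment.cgi appends (four fields, each followed by " #", with any # inside a field escaped as \#), and parse_records mirrors the split and un-escaping above. Both function names are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the separator character inside a field, as comment.cgi does.
sub escape_field {
    my ($s) = @_;
    $s =~ s/#/\\#/g;
    return $s;
}

# Build one record: date, nick, e-mail, message, each followed by " #".
sub make_record {
    return join("", map { escape_field($_) . " #" } @_);
}

# Split file contents on " #" and group the fields four at a time.
sub parse_records {
    my ($data) = @_;
    my @fields = split(/ #/, $data);
    my @records;
    while (@fields >= 4) {
        my ($date, $nick, $email, $message) = splice(@fields, 0, 4);
        # Trim the short fields, as gencontent.pm does
        for ($date, $nick, $email) { s/^\s+//; s/\s+$//; }
        # Restore the escaped #s
        for ($nick, $email, $message) { s/\\#/#/g; }
        push @records, { date => $date, nick => $nick,
                         email => $email, message => $message };
    }
    return @records;
}

# Two made-up comments; the first one has no e-mail, the second no nick
my $file = make_record("8.1.2010", "wili", "", "See issue #1 for details")
         . make_record("5.8.2010", "", 'a@example.org', "Second comment");

my @records = parse_records($file);
foreach my $r (@records) {
    print "$r->{date} [$r->{nick}] [$r->{email}] $r->{message}\n";
}
```

The escaping is what keeps a # typed by a commenter from being mistaken for a field boundary: " \#" never matches the " #" separator, and the backslash is removed again when the record is read.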

This is the CGI script that processes comments posted via HTTP:

#!/usr/bin/perl

# This file takes a comment POST as input, updates
# comment files and finally the HTML(s)

my $refresh = "src/genroot.pl";

use CGI;
my $cgi = new CGI;

# Make sure that no one is sending malicious file paths here..
if ($cgi->param("src") =~ m/^\// || $cgi->param("src") =~ m/\.\./) {
    die print "Content-type: text/html\n\nPath (" . $cgi->param("src") . ") was rigged, refusing to write";
}

# We find the comments file from a path derived from "src" input
my $filepath = $cgi->param("src")."/comments";
open(COMMENTS, ">>$filepath") or die print "Content-type: text/html\n\nCouldn't open $filepath";

if (-s $filepath > 500000) {
    die print "Content-type: text/html\n\nComments exceed 500kB, suspecting spam.";
}

print COMMENTS getDate()." #".getSaneName()." #".getSaneEmail()." #".getSaneComment()." #";

# We refresh the HTMLs and return back to the previous page
system($refresh);
print $cgi->redirect($ENV{HTTP_REFERER});

sub getDate {
    my @timedata = localtime(time);
    return "$timedata[3].".($timedata[4]+1).".".($timedata[5]+1900);
}

sub getSaneName {
    return clean($cgi->param("nick"));
}

sub getSaneEmail {
    return clean($cgi->param("email"));
}

sub getSaneComment {
    return clean($cgi->param("content"));
}

# As you can see, input sanitation is very crude.
# Plx comment if you see something especially worrying.
sub clean {
    return escapeSeparator(escapeAngleBrac($_[0]));
}

sub escapeAngleBrac {
    my $input = $_[0];
    $input =~ s/</&lt;/g;
    $input =~ s/>/&gt;/g;
    return $input;
}

sub escapeSeparator {
    my $input = $_[0];
    $input =~ s/#/\\#/g;
    return $input;
}

comment.cgi
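
Since the script itself calls its sanitation crude, here is a hedged sketch of a stricter variant. It is not what comment.cgi does: is_safe_src accepts only paths of the form entries/<name> (the form the comment box template is filled with) instead of rejecting a few bad patterns, and escape_html also escapes ampersands and quotes, which matters because the e-mail field ends up inside an href attribute. Both function names are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Accept only "entries/<name>", where <name> is a single path component made
# of letters, digits, "_", "-" and ".", and does not start with a dot.
sub is_safe_src {
    my ($src) = @_;
    return 0 unless defined $src;
    return $src =~ m{\Aentries/[A-Za-z0-9_-][A-Za-z0-9_.-]*\z} ? 1 : 0;
}

# Escape the characters that are special in HTML text and attribute values.
# The ampersand has to be handled first so that it isn't escaped twice.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    $s =~ s/'/&#39;/g;
    return $s;
}

foreach my $src ("entries/001_blogware", "entries/../src", "/etc/passwd") {
    printf "%-22s %s\n", $src, is_safe_src($src) ? "accepted" : "rejected";
}
print escape_html(q{<a href="x">Tom & 'Jerry'</a>}), "\n";
```

If something like escape_html were used in place of escapeAngleBrac, the separator escaping would still have to run after it, because the &#39; entity contains a # character.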

Grab this package to get all the example templates and such as well.

Comments

8.1.2010

This is an example comment.  It's plain text.  You can omit either nick or e-mail or both.

- wili

5.8.2010

Commenting has been offline for a while since I didn't have any kind of spam filtering.  Now I do and it's back online.

I simply consult spamassassin on my mail server to query the Bayesian spam probability of the message body.

If you want to do the same, this is the perl code I use:

###
use IPC::Open2;

my @spamcCmd = ('/usr/bin/ssh', '-i/your/login/key', 'spam@your.spamassassin.host', 'spamc');
# Or if you have it locally:
#my @spamcCmd = ('/usr/bin/spamc');

my $spamLimit = 0.5; # 50% bayesian probability

sub isSpam {
    my $input = $_[0];
    my $pid = open2(OUT, IN, @spamcCmd) || die "Content-type: text/html\n\nCan't contact spam filter";
    print IN $input;
    close(IN);

    my $score = 0.0;
    while (<OUT>) {
        if (s/[^\]]*\[score: ([^\]]+)\]/$1/) {
            $score = $_;
        }
    }
    #print "score: $score\n";
    waitpid $pid, 0;

    return $score >= $spamLimit;
}
###
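
The score extraction step on its own can be sketched like this. The sample text is made up and only assumed to resemble the filter's report, so check it against what your spamc actually prints. Unlike the loop above, which keeps the last match, the hypothetical extract_score returns the first match, and it returns undef when no score is present.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first "[score: N]" value found in the filter output,
# or undef if the output contains no such field.
sub extract_score {
    my ($report) = @_;
    if ($report =~ /\[score:\s*([0-9]*\.?[0-9]+)\]/) {
        return $1 + 0; # force a numeric value
    }
    return undef;
}

# Made-up sample output, assumed to resemble a Bayes line in the report
my $sample = "BAYES_80 BODY: Bayes spam probability is 80 to 95%\n"
           . "    [score: 0.8731]\n";

my $score = extract_score($sample);
print defined $score ? "score: $score\n" : "no score found\n";
```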

And if you want to integrate this into the commenting system described in this entry, change the comment writing section of the comment.cgi script to this:

###
my $comment = getSaneComment();
if (isSpam($comment)) {
    open(SPAM, ">>$filepath"."SPAM") or die print "Content-type: text/html\n\nCouldn't open spam DB for $filepath";
    print SPAM getDate()." #".getSaneName()." #".getSaneEmail()." #".getSaneComment()." #";
    close(SPAM);
    die print "Content-type: text/html\n\nThis message was considered spam.  Not posting.  If you are a real person, e-mail me and I re-train my filter (the message is stored).";
} else {
    print COMMENTS getDate()." #".getSaneName()." #".getSaneEmail()." #".getSaneComment()." #";
}
close(COMMENTS);
###

- wili

wili
Ville Timonen

hack blog

