Sculpin - Start

There are many static website creators out there. The most prominent ones in my eyes are Jekyll, Hugo, and Pelican. I have actually run two sites with Jekyll, which is also used by GitHub Pages. The people at the Bootstrap project had been using Jekyll for their website but announced that they are moving the project documentation to Hugo. I also had a closer look at Pelican but in the end didn't use it.

Update: here is a blog post about Sculpin and other PHP site generators.

For the two sites where I am using Jekyll, I wanted to extend the parser to understand some custom markup. However, I am not very familiar with Ruby or with writing Ruby gems, so I ended up writing a build script in bash that calls two Python scripts, which apply changes to the generated content after the Jekyll build. This works well, but I have to be careful not to use any markup that would be interpreted by the built-in parser before my custom scripts can handle it. Doing my parsing before the Jekyll parser does its job would leave me with the problem that I either change my source files permanently or have to keep a copy and restore the source files from it afterwards. This seemed like too much work to me, and I would also have trouble later remembering the exact processing steps whenever the custom markup changes. Therefore, this time I decided to stick to a technology that I know better. Since I am a professional PHP developer, I looked for a site generator written in PHP and found Sculpin.

Installing Sculpin and getting it running was no problem with the documentation. For a PHP developer, the project root with composer and the Symfony layout looks quite familiar. However, there were some tricky parts that cost me some time to figure out.

Yarn

Following the documentation, there should be no problem installing and running yarn to build the resources. This is important when changing JavaScript or the styles in Sass, so that the compiled files can be recreated. Unfortunately, my yarn didn't work. The reason was that the yarn on my system came from the Ubuntu package manager (actually it was already installed). Removing that package and then installing Node.js and yarn fixed my problems:

sudo apt remove cmdtest
sudo apt install npm
sudo npm install --global yarn

After these steps, in the root directory you may run composer yarn-watch that builds the resources once and then watches for changes in the files. Just building the resources again is done with yarn encore dev or yarn encore production. Also note that if you don't see the changes on the generated website, you may run php vendor/bin/sculpin generate so that the newly generated css and js files are written to the output directory.

Pages and no Posts

This site is not really a blog. I do not tend to write blog posts at a regular frequency. All the topics that I write about are ordered by theme. I also try to improve an article when I feel the need to complete its information, instead of writing a new blog post that somehow outdates the older one. Therefore, I tried to remove everything that looks like posts and created a new content element "page".

Classify articles by tags

Articles can be classified by various methods. Blogs usually use a concrete date that gives a classification by time and categories or tags. Technically there is not a big difference between tags and categories. The most significant difference of a category is maybe that an article usually resides in one category and that there are parent and child categories. This is mainly how the file system works as well. On the other hand, an article usually has several tags which have no parent or child hierarchy. I have chosen the latter for my site, so that when browsing via tags an article may appear on several pages.

On the tag page itself I wanted to have some information about the tag, that explains a little why an article has been tagged by this tag. I put the description of the tags in the sculpin_site.yml. There is an entry tags that contains the tag names as a child key. Each tag name has its description in the value of the key. The file looks like this:

tags:
    Bash:
          Some description of why I like the bash and
          what you can do with the shell.
    Python: Some programming language.
...

The description text may span several lines. The only thing to watch out for is that none of its characters break the YAML syntax. If that happens, try to avoid the special characters or escape them.

To display the tag description on the tags main page the template in source/pages/tags/tag.html is extended by this code snippet right under the headline:

{% if site.tags[page.tag] %}
    <p>{{site.tags[page.tag]|raw}}</p>
{% endif %}

Article abstract

Most articles in newspapers or blogs have an abstract to introduce the reader to the contents of the article. For nerds this could be the TL;DR section. Some blogs show the first paragraph of the article, others use a separate text block. I decided on the latter. My abstract is defined in the top section of the article, which contains the properties in YAML style. The section for this file looks like the following:

layout: page
title: Sculpin - Start
tags: [ "PHP" ]
date: 2020-06-10
abstract:
  Sculpin is a static website creator. I started to use it for this site. In this article I am describing my changes
  that I did to customize the base installation of Sculpin and some issues that I had and how I got them solved.

The abstract is not displayed on the article page itself but on the article listings. Therefore, I changed the index.html template in the source folder and added the following content right below the header section:

<div>
   {{ post.abstract|raw }}
</div>

In case a post does not contain an abstract, the <div> block will be there but empty.

Source code files

Code samples need to be enclosed in three backticks, which are translated into <code> HTML tags. If the code forms a whole paragraph (the backticks are on a line of their own), the code tag is wrapped in a <pre>. For the little tools that I program and use in my articles, I would always have to copy and paste the content of the script file into the Markdown file of the current article. This is not very convenient, especially when I want to update the script content. I would rather use the file directly, so that I can copy the script into my website project without touching the article file again. This also makes it easier for the reader to download the file content.

My idea was to use some "parser function" that takes the file content of the script and embeds it into the article. For that, I need to intervene in the parser when the site is created from the source files. The Sculpin documentation contains a brief section on how to extend Sculpin with Symfony bundles and links to the Symfony project for how to create new bundles.

For my needs I considered a bundle too complex. I just wanted to jump in where the parser comes into action, taking the content and performing some actions before or after the parser has transformed the source of an article.

To hook into the parser, I replaced the standard parser of Sculpin with my own class, which extends the standard parser. This is done in three steps. First, define a new parser class like this one:

MarkupParser.php
<?php

namespace Canjanix;

use Sculpin\Bundle\MarkdownBundle\PhpMarkdownExtraParser;

class MarkupParser extends PhpMarkdownExtraParser
{
    /**
     * Placeholder string
     * @var string
     */
    protected const PATTERN = '___MATCH::%d____';

    /**
     * Array with real replacements.
     * @var array
     */
    protected $replacements;

    /**
     * {@inheritdoc}
     */
    public function transform($content): string
    {
        $this->replacements = [];
        // Match any custom pattern and replace these with a placeholder string
        // that will remain untouched by the internal parser.
        $content = $this->handleSourceFile($content);
        $content = $this->handleCodeBlock($content);
        // do the real transformation
        $content = parent::transform($content);
        // replace the placeholders with the real content
        $content = $this->handlePlaceholders($content);
        return $content;
    }

    /**
     * This function is called after the internal parser has transformed the text. It replaces the placeholder
     * strings that were inserted by handleSourceFile() and handleCodeBlock() with the actual content.
     * @param string $content
     * @return string
     */
    protected function handlePlaceholders(string $content): string
    {
        foreach (\array_keys($this->replacements) as $i) {
            $pattern = sprintf(self::PATTERN, $i);
            // first try to replace placeholder enclosed in <p> tags.
            $content = str_replace(
                '<p>' . $pattern . '</p>',
                $this->replacements[$i],
                $content
            );
            // in case replacement didn't work, try it now without the <p> tags.
            $content = str_replace($pattern, $this->replacements[$i], $content);
        }
        return $content;
    }

    /**
     * Handle the custom markup {!--codeblockStart--} and {!--codeblockEnd--}
     *
     * @param string $content
     * @return string
     */
    protected function handleCodeBlock(string $content): string
    {
        if (preg_match_all(
            '~^\{\!\-\-codeblockStart\-\-\}(.*?)\{\!\-\-codeblockEnd\-\-\}~ms',
            $content,
            $matches
        )) {
            foreach (\array_keys($matches[0]) as $i) {
                $content = str_replace(
                    $matches[0][$i],
                    sprintf(self::PATTERN, count($this->replacements)),
                    $content
                );
                $this->replacements[] = '<pre><code>'
                    . str_replace(
                        ['&', '{', '}', '"', "'", '<', '>'],
                        ['&amp;', '&#123;', '&#125;', '&quot;', '&apos;', '&lt;', '&gt;'],
                        trim($matches[1][$i])
                    )
                    . PHP_EOL . '</code></pre>';
            }
        }
        return $content;
    }

    /**
     * Handle the custom markup {!--includeSourceFile(file, language)--}
     *
     * @param string $content
     * @return string
     */
    protected function handleSourceFile(string $content): string
    {
        if (preg_match_all(
            '~^\{\!\-\-includeSourceFile\(\s*(.*?)\s*(,\s*(.*?)\s*)?\)\-\-\}~m',
            $content,
            $matches
        )) {
            foreach (\array_keys($matches[0]) as $i) {
                $content = str_replace(
                    $matches[0][$i],
                    sprintf(self::PATTERN, count($this->replacements)),
                    $content
                );
                $this->replacements[] = $this->includeSourceFile(
                    $matches[1][$i],
                    isset($matches[3][$i]) ? $matches[3][$i] : ''
                );
            }
        }
        return $content;
    }

    /**
     * Replace custom parser function {!--includeSourceFile(<filename>, <lang>)--} with
     * code blocks like:
     * <div class="filesource"><span class="filename">filename</span>
     * <a class="download" download="filename" href="/path/to/file">Download</a>
     * <a class="view" href="#">View all</a></div>
     * <pre class="fold"><code class="language">
     * file content
     * </code></pre>
     *
     * @param string $file
     * @param string $lang
     * @return string
     */
    protected function includeSourceFile(string $file, string $lang = ''): string
    {
        $content = '';
        $realFile = dirname(__FILE__) . '/../../source/assets/files/' . $file;
        if (file_exists($realFile)) {
            $content = htmlspecialchars(trim(file_get_contents($realFile)));
            // Avoid twig parsing error because of unclosed comment.
            $content = str_replace('{#', '{&#35;', $content);
        } else {
            echo "\nFile $realFile does not exist\n";
            return $content;
        }
        return '<div class="filesource"><span class="filename">' . basename($file) . '</span> '
            . '<a class="download" download="' . basename($file) . '" href="/assets/files/'
            . $file . '">Download</a><a class="view" href="#">View all</a></div>' . PHP_EOL
            . '<pre class="fold"><code' . (!empty($lang) ? ' class="' . mb_strtolower($lang) . '"' : '') . '>'
            . $content . PHP_EOL
            . '</code></pre>' . PHP_EOL;
    }
}

I placed it in the project at app/Canjanix/MarkupParser.php. That's also the reason why I have chosen the namespace Canjanix.

Second, the settings in sculpin_kernel.yml need an entry so that the new parser is used when creating the site. This is done with the following configuration:

sculpin_markdown:
    parser_class: Canjanix\MarkupParser

Third, if it doesn't exist already, create a new class SculpinKernel.php in the app directory. This class is loaded by Sculpin, in case it's there. This is the main entry point for customization. All additional third party addons (bundles) will be registered here. My file has the following content:

SculpinKernel.php
<?php

use Sculpin\Bundle\SculpinBundle\HttpKernel\AbstractKernel;

class SculpinKernel extends AbstractKernel
{
    protected function getAdditionalSculpinBundles(): array
    {
        require_once(dirname(__FILE__) . '/Canjanix/MarkupParser.php');
        return [
        ];
    }
}

I needed to apply changes in this class because otherwise my custom parser class is not found on the file system. The class loader apparently looks in the vendor directory only. There is no way (except for the custom SculpinKernel) for a class to be loaded from the app directory. This is why I have the require_once statement in there. I tried to use the namespace App, among other things, with no success. For a bigger extension I would have to write my own class loader or create a bundle. For just this one class I am fine with the require_once statement.
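As a rough illustration of the "own class loader" idea mentioned above, the following sketch registers an autoloader for the Canjanix namespace. It is an assumption, not something tested inside Sculpin: the helper name canjanixClassToFile() is hypothetical, and the layout app/Canjanix/<Class>.php is taken from where the parser class was placed.

```php
<?php

/**
 * Hypothetical helper: map a class in the Canjanix namespace to a file
 * below the given base directory (assumed layout: <base>/Canjanix/<Class>.php).
 */
function canjanixClassToFile(string $class, string $baseDir): ?string
{
    $prefix = 'Canjanix\\';
    if (strncmp($class, $prefix, strlen($prefix)) !== 0) {
        return null; // not our namespace, leave it to other loaders
    }
    // Turn namespace separators into directory separators.
    $relative = str_replace('\\', '/', substr($class, strlen($prefix)));
    return rtrim($baseDir, '/') . '/Canjanix/' . $relative . '.php';
}

// Register the loader; it only requires files that really exist.
spl_autoload_register(function (string $class): void {
    $file = canjanixClassToFile($class, __DIR__);
    if ($file !== null && is_file($file)) {
        require_once $file;
    }
});

echo canjanixClassToFile('Canjanix\\MarkupParser', '/site/app'), PHP_EOL;
```

Placed at the top of SculpinKernel.php, such a loader could in principle replace the single require_once once more than one class lives below app/Canjanix.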

The custom parser hooks into the default parser by overriding the transform() method. This method gets the unparsed content of an article, does the transformation and returns the parsed result. I came up with some custom markup like {!--includeSourceFile(filename, language)--}, which tries to read the file in source/assets/files and places its content into the generated HTML file of the source file. Another markup that I use defines special code blocks. They look pretty much the same as the normal code blocks but do some more escaping. For this article I wanted to place a Twig template snippet into a code block. That was not possible with the standard parser. The block would remain empty because the Twig syntax would be parsed later as well, so that the file content is changed again before the final HTML output is written.

My parser first looks for any of the custom markup, replaces this markup with some placeholder strings that will not be modified by the internal parser. The real content will be stored in a replacement array. Then the parent::transform() method is called that does the real transformation. After the internal parser has done its job, the placeholder strings need to be replaced by the real content. This must not be parsed again by the internal parser, because that would break things.
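The placeholder idea can be tried out in isolation. The following is a minimal sketch, not the parser class itself: the function protectAndTransform() and the [[raw:...]] markup are hypothetical names for this illustration, and strtoupper stands in for the real Markdown transformation.

```php
<?php

/**
 * Minimal sketch of the placeholder technique. The function name and the
 * [[raw:...]] markup are hypothetical and only used for this illustration.
 */
function protectAndTransform(string $content, callable $transform): string
{
    $replacements = [];

    // 1. Swap every custom block for a placeholder and remember its real content.
    $content = preg_replace_callback(
        '~\[\[raw:(.*?)\]\]~s',
        function (array $m) use (&$replacements): string {
            $replacements[] = htmlspecialchars($m[1]);
            return sprintf('___MATCH::%d____', count($replacements) - 1);
        },
        $content
    );

    // 2. Run the real transformation; it only sees the placeholders.
    $content = $transform($content);

    // 3. Put the stored content back in place of the placeholders.
    foreach ($replacements as $i => $replacement) {
        $content = str_replace(sprintf('___MATCH::%d____', $i), $replacement, $content);
    }

    return $content;
}

// Stand-in for the Markdown parser: it simply upper-cases everything.
$result = protectAndTransform('hello [[raw:<b>keep me</b>]] world', 'strtoupper');
echo $result, PHP_EOL; // HELLO &lt;b&gt;keep me&lt;/b&gt; WORLD
```

The protected part comes out escaped but otherwise untouched, while the rest of the text has gone through the transformation. The placeholder has to be chosen so that the transformation leaves it alone, which is why it consists only of underscores, upper-case letters, colons and digits here.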

Resources on the Internet

Sculpin doesn't seem to have a big community. In my searches I didn't find many references to Sculpin. There are a few blog posts listed at <?PHPDeveloper.org and a lengthy article from Andrew Marcinkevičius, who moved his blog from Octopress to Sculpin.

PHP 8.0

Sculpin 3.0.0 doesn't work with PHP 8.0. I tried to upgrade the packages with composer, but that failed because some doctrine packages do not allow PHP > 7.x. Then I started to fix the code in the vendor directory myself, running the "generate" task until it executed successfully without any error output on the console. In the end, 4 files had to be modified slightly. The main problem was in the parser, which used lots of $token{0} to retrieve the first character of a string. PHP 8 does not allow this notation anymore. Instead, I used substr($token, 0, 1), which does not look as elegant as the old style but does its job. The changes in the other two files were minor. This patch file contains all changes.
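The kind of change involved can be sketched as follows. This is an illustration and not a part of the patch; the value of $token is made up.

```php
<?php

$token = '#heading'; // made-up example value

// Old style, no longer accepted by PHP 8: $first = $token{0};

// The replacement used in the patch:
$first = substr($token, 0, 1);

// Square brackets are still valid and give the same result here:
$alsoFirst = $token[0];

// The two differ for an empty string: on PHP 8 substr() returns '',
// while ''[0] raises a warning about an uninitialized string offset.
$empty = substr('', 0, 1);

var_dump($first, $alsoFirst, $empty);
```

Using substr() has the small advantage that it also behaves quietly when the token happens to be empty.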

New setup with Sculpin 3.1

Because of a change of my system I had to reinstall Sculpin on the new machine and restored the site with all the content that I had inside the source folder. The good news is that PHP 8 is now fully supported. The patch from above is not needed anymore.

The bad news was that I was missing some config files, which I had to restore. Therefore, I will list them here so that one can follow the steps described above.

In general, to create a new content type (in this case pages), you need to run the following command:

vendor/bin/sculpin content:create -b -t tags pages

This command creates all the skeleton pages (i.e. the html templates that are used to render the final content pages) and displays the config to be added to the sculpin_kernel.yml config file. These settings register the new content type in the Sculpin parser so that you may use the layout in your newly created blog elements (in this case page elements).

The sculpin_site.yml basically contains the site name, subtitle and the tags (I have described it above). The sculpin_kernel.yml is listed here:

sculpin_content_types:
    pages:
        type: path
        path: _pages
        singular_name: page
        layout: page
        enabled: true
        permalink: pages/:title
        taxonomies:
            - tags

sculpin_markdown:
    parser_class: Canjanix\MarkupParser

In order to fully remove any of the blog posts, the setting:

   posts:
       enabled: false

did not work. I had to remove it completely.

I also had issues with yarn. My node version was 18.x, but yarn insisted that it should not be newer than 14.x. Here the nvm script comes in handy. This is a tool that lets you switch between several node versions instantly. If a node version is not installed on your system, you can also install it using nvm.

You would circumvent all these issues by using a version control system. However, I don't want to have my content distributed across several locations on the internet. So I don't use a hosting service like GitHub or GitLab for this site, nor do I have a personal git server running.