New Blog: The Saga Continues

Posted on Sat 12 September 2015 in Uncategorized

I ended up switching to a new blog yet again.

This time I'm using Pelican, a static site/blog generator with the amazing Flex theme by Alexandre Vicenzi. A link to the source can also be found within the main navigation menu.


Plugin Discovery Using Entry Points

Posted on Sun 21 December 2014 in Programming • Tagged with Python

I'm of the belief that just about anyone should have the right to expand on the functionality of a FLOSS tool, but I'm also really particular about code style, so I'd rather not deal with a ton of pull requests. I'm a difficult person.

With this in mind, a short while ago I was looking for ways to provide users with the ability to create modules which my tool could then load in at runtime.

Luckily, it turned out $deity had provided us mortals with Python's setuptools library, which happened to contain the exact features I was looking for in the form of entry points.

Entry points are magical little things which allow library authors to "plug in" to your framework/utility, as long as they know your entry point group's name.

Here's a setup.py file for an imaginary text-tool module, which will plug in to our sweet-tool utility:

#!/usr/bin/env python3
"""Setup module."""
from setuptools import setup, find_packages

setup(
    name='text-tool',
    version='0.0.1',
    description='A tool which performs text transformations!',
    long_description='A tool which performs text transformations!',
    author='John Doe',
    author_email='john.doe@example.com',
    url='https://github.com/john.doe/text-tool',
    license='MIT',
    classifiers=[
        'Development Status :: 2 - Pre-Alpha',
        'License :: OSI Approved :: MIT License',
    ],
    packages=find_packages(),
    entry_points={
        'sweet.modules': ['text = text_tool:TextTool']
    }
)

In the code above, we're defining our awesome text module, which can plug its TextTool class from the text_tool module into the sweet.modules entry point group, identified by the name text.

Frameworks or pluggable tools can then load these entry points by specifying an entry point group name.

Here's some example code which will instantiate modules by iterating over a list of entry points and loading the referenced classes:

from pkg_resources import iter_entry_points


for entry_point in iter_entry_points(group='sweet.modules'):
    print('Loading entry point %s' % entry_point.name)

    module_class = entry_point.load()
    module_instance = module_class()
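
On newer Python versions, the standard library's importlib.metadata module can do the same kind of discovery without pkg_resources. What follows is a minimal sketch rather than a drop-in replacement: the helper names iter_group and load_plugins are made up for illustration, and the hasattr check is an assumption-driven workaround for the fact that entry_points() returns different types depending on the Python version.

```python
"""Entry point discovery using the standard library (a sketch)."""
from importlib.metadata import entry_points


def iter_group(group):
    """Return the entry points registered under the given group name."""
    found = entry_points()

    # Newer Python versions return an object with a select() method,
    # while 3.8/3.9 return a plain dict keyed by group name.
    if hasattr(found, 'select'):
        return list(found.select(group=group))

    return list(found.get(group, []))


def load_plugins(group):
    """Load every entry point in a group, keyed by entry point name."""
    return {
        entry_point.name: entry_point.load()
        for entry_point in iter_group(group)
    }


if __name__ == '__main__':
    for name, plugin_class in load_plugins('sweet.modules').items():
        print('Loaded entry point %s: %r' % (name, plugin_class))
```

If no installed package registers anything under the group, both helpers simply return empty containers.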

Though a simple loop like this is useful for illustrating the power of entry points, a real tool also needs to be able to disable/enable specific modules, as well as reload them by removing old instances of the module from the Python interpreter.

With that in mind, here's the module loader for our sweet-tool:

"""Sweet tool module-loading module."""
import sys

from pkg_resources import iter_entry_points


class ModuleManager:

    """The ModuleManager manages the loading of modules."""

    @property
    def modules(self):
        """Return modules."""
        return self._modules

    @modules.setter
    def modules(self, value):
        """Set modules."""
        self._modules = value

    def load_single(self, identifier):
        """Load a module by its identifier."""
        for entry_point in iter_entry_points(group='sweet.modules', name=identifier):
            self.modules[identifier] = entry_point.load()()

    def unload_single(self, identifier):
        """Unload a module by its identifier."""
        # We need to remove the module from the python interpreter in
        # order for live codebase updates to work.
        sys.modules.pop(self.modules[identifier].__module__)

        del self.modules[identifier]
        self.modules.pop("", None)
        self.modules.pop(None, None)

    def load(self):
        """Load all modules."""
        for entry_point in iter_entry_points(group='sweet.modules'):
            self.load_single(entry_point.name)

    def unload(self):
        """Unload all modules."""
        # We need to copy the list of identifiers, because unloading a
        # module removes it from the modules dict
        identifiers = list(self.modules.keys())
        for identifier in identifiers:
            self.unload_single(identifier)

    def start_single(self, identifier):
        """Start a module by its identifier."""
        self.modules[identifier].start()

    def stop_single(self, identifier):
        """Stop a module by its identifier."""
        self.modules[identifier].stop()

    def start(self):
        """Start all modules."""
        self.load()

        for identifier in self.modules.keys():
            self.start_single(identifier)

    def stop(self):
        """Stop all modules."""
        for identifier in self.modules.keys():
            self.stop_single(identifier)

        self.unload()

    def __init__(self):
        """Constructor."""
        self.modules = {}

Using the loader above, we can load, unload, start and stop all or specific modules using only a few lines of code. Furthermore, because we remove the imported module from sys.modules while unloading a module, reloading module code without restarting the parent program becomes possible.

Here's a sample:

from sweet.modules import ModuleManager


modules = ModuleManager()

# Load and start all modules
modules.start()

# Stop and unload all modules
modules.stop()

# Load a specific module by identifier
modules.load_single('text')

# Unload a specific module by identifier
modules.unload_single('text')

# Reload a module without having to shut down the program
modules.stop_single('text')
modules.unload_single('text')
modules.load_single('text')
modules.start_single('text')
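
The reload trick hinges on one thing: once a module is gone from sys.modules, the next import executes its source again. Here's a small standalone sketch of that behaviour, not part of sweet-tool itself. The module name sweet_demo_plugin and the helper names are hypothetical, and bytecode writing is switched off for the duration of the demo so a cached .pyc can't mask the change on disk.

```python
"""Show that removing a module from sys.modules forces a fresh import."""
import importlib
import os
import sys
import tempfile


def write_module(directory, name, body):
    """Write a tiny module file into the given directory."""
    with open(os.path.join(directory, name + '.py'), 'w') as handle:
        handle.write(body)


def fresh_import(name):
    """Import a module, discarding any previously loaded copy first."""
    sys.modules.pop(name, None)
    importlib.invalidate_caches()
    return importlib.import_module(name)


def demo():
    """Import a throwaway module, change it on disk and import it again."""
    name = 'sweet_demo_plugin'  # hypothetical module name
    previous_setting = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # avoid stale .pyc files during the demo

    with tempfile.TemporaryDirectory() as directory:
        sys.path.insert(0, directory)
        try:
            write_module(directory, name, 'VERSION = 1\n')
            first = fresh_import(name).VERSION

            write_module(directory, name, 'VERSION = 2\n')
            second = fresh_import(name).VERSION
        finally:
            # Clean up everything the demo touched
            sys.path.remove(directory)
            sys.modules.pop(name, None)
            sys.dont_write_bytecode = previous_setting

    return first, second


if __name__ == '__main__':
    print(demo())
```

Keep in mind that objects created from the old module keep referring to the old code, which is why the loader stops and drops its instances before loading them again.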

Conclusion: entry points are amazingly useful. I think the xkcd python comic pretty much sums up how I felt after writing the code above for the first time.


Docker vs. LXC: There Can Only Be One

Posted on Thu 17 July 2014 in Uncategorized • Tagged with Docker, LXC, Horrible

My sincere apologies for the title.

We've all heard of the (not so) new kid on the block: Docker. This application shipping tool has been garnering lots of attention for a while now, resulting in quite the amazing community. It's also pretty much directly responsible for breathing life into the "containerize all the things" movement once again.

Docker has an extensive, yet easy to use set of features. With a user-friendliness level several times that of things like LXC, the amazing layered filesystem/snapshotting functionality, hosted image repositories and, last but not least, the promise that you'll be able to ship the exact same application you developed right to production without modifying a single file, it would be strange for people not to be all over Docker. The CentOS devs seem to agree with me, since there are in fact official CentOS 7 Docker images out there.

Yet, while Docker's awesome and all, I - and most likely you, the reader - don't have a real use-case for it beyond running some tests with Jenkins CI (or Bamboo if you hate freedom).

An OS is not an artifact

The title for this section was provided by Kris Buytaert, an angry bearded man much like myself. We work together, and he has actually written a blog post about Docker before. The difference between me and him is that he actually knows what he's talking about.

An interesting thing that was brought up by Kris is that a Docker image cannot be considered to be an artifact of your application. Instead, it's an artifact of a minimal Ubuntu/whatever installation with your application deployed inside of it. This musing of his was so cryptic that I barely managed to draw meaning out of it.

Now why is this important, you may ask? Well, the way I see it, it's because artifacts are supposed to be consistent across your infrastructure. Thus, the Docker approach essentially forces you to go stateless, which is just not viable in most cases.

Containerize all the.. thing?

The number one reason why I will most likely never be using Docker in production is that a Docker container is not intended to be used as a cheap VM, but rather as a sort of executable magic package.

In a perfect world, a Docker container executes a single process, sends all of its log output to stdout/stderr and all blobs (think DBs) are stored in separate volumes. That would then be the extent of the container's state. Sadly, we don't live in a world like that.

When I deploy an application, I want its logs to be rotated, perhaps even shipped to logstash. I want to be able to log into the machine the application is running on, just in case. I want to be able to run cron jobs, tweak iptables rules, #puppetize and a lot more. Docker doesn't want me to do any of that.

PID 1

In a recent blog post by a friend and (as of very recently) vocal Docker supporter Kenny Rasschaert, he goes into detail on the various advantages of using Docker to deploy single applications. He also mentions a workaround for using a Docker container as a cheap VM instead of a package on steroids. But in the end, that's all it is: a workaround.

As I mentioned before, Docker containers are intended to run a single application and nothing more. Got logs? Use a volume. Got blobs? Use a volume. Need ssh access? Use nsenter. Need cron jobs? Too bad.

The guys over at Phusion seem to think otherwise. Since you're supposed to run one process inside a Docker container, the Phusion guys made it run systemd. Armed with a real init system (and ntp server, log guzzler, and a bunch of other things), their Docker baseimage defies all preconceptions of normality and grants you sshd, crond and more. With their image you can then build your own container: one that's quite similar to an LXC container.

Hail LXC

So now I'll ask you this: If you're going to cast aside Docker's original purpose - that is, to provide a Platform for your application (think PaaS) - and are going to treat it as if it were LXC by using workarounds, why not just use LXC instead?

Much like Phusion's baseimage, LXC allows you to create containers which you can treat as if they were VMs with nearly no overhead, abiding by the JeOS principle (I really hope I used that right). Furthermore, LXC containers can be nested, their resources can be limited using cgroups, and more.

As long as you are using a somewhat recent version of the Linux kernel, you'll be able to have those CentOS/Fedora/Debian/Arch containers up and running in next to no time at all.

Closing thoughts

I feel like I'd rather use a tool like LXC than abuse Docker to achieve the same thing. Though Docker's AuFS functionality would be nice to have, it's not something I need. As for LXC: though its community certainly is smaller, that's not a deal-breaker for me by any stretch of the imagination.

For now I'll continue to power my personal infrastructure using LXC, and perhaps I'll take another look at Docker in the near future.

To the people out there with contrasting viewpoints, or the people who wish to point out inconsistencies in my reasoning (there are more holes in this article than there are in Swiss cheese): I look forward to hearing your thoughts. ☺


Getting fail2ban to work with Symfony2 the proper(?) way

Posted on Sat 21 June 2014 in Web development • Tagged with PHP, Symfony2, Security, fail2ban

NOTE: This is a repost of an old article. I noticed that it was generating a bunch of 404s for some people, so I figured I'd dig it up.

Described as "a script kiddie's worst nightmare", fail2ban is a tool that reads log files, tries to match lines to predefined rules or filters, extracts an IP address from those lines and "bans" the host that IP address belongs to, in a certain way.

It is extremely customizable, can send email notifications and do even more cool stuff. If you want to learn more about fail2ban, you can visit its homepage here.

This blog post will be focusing on making this wonderful tool work properly with Symfony2, so you can automatically ban anyone trying to gain access to your application in a harmful manner for a certain period.

Creating the Authentication Failure Handler

Fail2ban needs a host or IP address so it can ban any offenders. This makes a lot of sense, since generally it'll be creating temporary iptables rules that drop packets coming from the offending IP address on ports 80 and 443.

Symfony however does not log the offender's IP address when authentication fails, so we'll have to add that functionality ourselves by extending the Default Authentication Failure Handler.

The only functionality we'll add is the logging of IP addresses when authentication fails, in order to be able to extract data from our logs (stored in app/logs/) using fail2ban.

<?php

# src/Your/ExampleBundle/EventHandler/AuthenticationFailureHandler.php
namespace Your\ExampleBundle\EventHandler;

use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\Security\Core\Exception\AuthenticationException;
use Symfony\Component\Security\Http\Authentication\DefaultAuthenticationFailureHandler;

class AuthenticationFailureHandler extends DefaultAuthenticationFailureHandler
{
    public function onAuthenticationFailure(Request $request, AuthenticationException $exception)
    {
        if (null !== $this->logger && null !== $request->getClientIp()) {
            $this->logger->error(sprintf('Authentication failure for IP: %s', $request->getClientIp()));
        }

        return parent::onAuthenticationFailure($request, $exception);
    }
}

As you can see, the only real functionality we've added is the logging. Now for the matching service definition:

# src/Your/ExampleBundle/Resources/config/services.yml
services:
    your.examplebundle.authenticationfailurehandler:
        class: Your\ExampleBundle\EventHandler\AuthenticationFailureHandler
        arguments: ["@http_kernel", "@security.http_utils", {}, "@logger"]
        tags:
            - { name: 'monolog.logger', channel: 'security' }

You'll also need to tell Symfony to use your handler by specifying a failure_handler in your security.yml, like so:

# app/config/security.yml
    firewalls:
        main:
            pattern: ^/
            form_login:
                provider: fos_userbundle
                csrf_provider: form.csrf_provider
                failure_handler: your.examplebundle.authenticationfailurehandler
            logout:       true
            anonymous:    true

Alright, that's it for the Symfony part. You should be seeing the IP address of any offenders in your app/logs/%kernel.environment%.log file, like so:

[2013-11-03 23:24:55] security.INFO: Authentication request failed: Bad credentials [] []
[2013-11-03 23:24:55] security.ERROR: Authentication failure for IP: 127.0.0.1 [] []

Next up: the fail2ban filter!

Creating a custom fail2ban filter for Symfony2

To create a new filter for fail2ban, we'll create a file in /etc/fail2ban/filter.d/symfony.conf with the following contents:

[Definition]
failregex = Authentication\sfailure\sfor\sIP:\s<HOST>\s

That was easy, right? We should create a jail in /etc/fail2ban/jail.local which uses our new filter. The definition for this jail will depend on your configuration, but a basic one could look like this:

[symfony]
enabled   = true
filter    = symfony
logpath   = /var/www/my-project/app/logs/prod.log
port      = http,https
bantime   = 600
banaction = iptables-multiport
maxretry  = 3

Now all that remains is to execute service fail2ban reload and to test your new setup.
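
fail2ban ships a fail2ban-regex utility for testing a filter against a log file, which is the proper way to verify the setup. For a quick sanity check of the expression itself, here's a rough Python approximation. It assumes a heavily simplified stand-in for the <HOST> tag (fail2ban's real expansion is more elaborate), and the find_host helper is made up for illustration.

```python
"""Rough check of the Symfony failregex against a sample log line."""
import re

FAILREGEX = r'Authentication\sfailure\sfor\sIP:\s<HOST>\s'

# Assumption: a simplified stand-in for fail2ban's <HOST> tag, which in
# reality expands to a more elaborate expression.
HOST_PATTERN = r'(?P<host>\S+)'


def find_host(line, failregex=FAILREGEX):
    """Return the host captured from a log line, or None without a match."""
    pattern = failregex.replace('<HOST>', HOST_PATTERN)
    match = re.search(pattern, line)
    return match.group('host') if match else None


if __name__ == '__main__':
    sample = (
        '[2013-11-03 23:24:55] security.ERROR: '
        'Authentication failure for IP: 127.0.0.1 [] []'
    )
    print(find_host(sample))
```

The trailing \s in the expression means the IP address has to be followed by whitespace, which holds for the log format shown earlier thanks to the empty context brackets at the end of the line.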


A Song of Pickle and Python

Posted on Tue 17 June 2014 in Programming • Tagged with Horrible, Python

So here's the setting: you're a handsome, extremely charismatic all-rounder tech-type person. Apparently you're also a bit of a narcissist. On a warm and cosy evening you're sitting in front of your fireplace in your leather armchair sipping on a fine alcoholic beverage of your choice. As the individual hairs that make up your beard are swaying in the breeze that managed to find its way in through your open window, you are struck by inspiration! Opening $EDITOR (which is clearly the greatest of all text editors) after rushing your way to your battlestation, you are greeted by this Python3 (because you don't live in the past, y'know) module:

"""This is the Thing module."""


class Thing:

    """This class represents a Thing."""

    @property
    def a_prop(self):
        """Return a prop."""
        return self._a_prop

    @a_prop.setter
    def a_prop(self, value):
        """Set a prop."""
        self._a_prop = value

    def __init__(self):
        """Instantiate Thing."""
        self.a_prop = 'value'

"Damn, that's some nice pep8/pep257 conformity," you think to yourself. "I really am the best." Cursing yourself for letting your mind wander, you hurriedly open the file you were looking for. After a bit of buffering that you're not really aware of, you are presented with yet another module, this time containing two functions:

"""This module handles Thing storage and loading."""
import pickle
import storage


def store_thing(thing):
    """Store an instance of Thing in $BACKEND_STORE."""
    serialized_thing = pickle.dumps(thing)
    storage.store(serialized_thing)


def load_things_from_store():
    """Load and return all instances of Thing in $BACKEND_STORE."""
    serialized_things = storage.load_all()
    things = []

    for serialized_thing in serialized_things:
        thing = pickle.loads(serialized_thing)
        things.append(thing)

    return things

You instantly recognize these functions. You wrote them, after all. You remember store_thing takes an instance of Thing as a parameter and stores it in $BACKEND_STORE. load_things_from_store fetches all instances of Thing you've previously stored in $BACKEND_STORE and returns them.

Glancing over the code, your eyes stop on that familiar word: pickle. pickle is the library you use for serializing Thing instances in order to be able to save them and recreate them later. "Ah pickle, bane of my existence, why must you torment me so?" you lament. "Were it not for your ease of use and hilarious name, I would have never had to suffer so!" A few days ago, you decided to add the property a_prop to your Thing class. At the time you didn't know your change would completely break the unpickling of your saved Thing instances that were being loaded from $BACKEND_STORE. "But, that all ends today!"

You decide that you're going to solve your pickling issues by invoking two of the darkest magicks in your arsenal: inheritance and functions. Recalling that Drupal - a PHP CMS/Framework - seems to be alive and well despite the fact that its users don't know what anything but functions are, you reckon you should be fine if you take this approach.

Deciding not to procrastinate too much, you manage to quickly add two methods to your Thing class, making the improved version look like this:

"""This is the Thing module."""


class Thing:

    """This class represents a Thing."""

    @property
    def a_prop(self):
        """Return a prop."""
        return self._a_prop

    @a_prop.setter
    def a_prop(self, value):
        """Set a prop."""
        self._a_prop = value

    def dump(self):
        """Turn this instance of Thing into plain data."""
        return self.__dict__

    @classmethod
    def load(cls, data):
        """Recreate and populate a Thing from existing data."""
        # Create a new Thing without calling __init__
        instance = cls.__new__(cls)

        # Populate Thing with data
        for key, value in data.items():
            setattr(instance, key, value)

        return instance

    def __init__(self):
        """Instantiate Thing."""
        self.a_prop = 'value'

You wipe away the tears that had appeared in the corner of your eye. Looking toward Dropbox HQ, you solemnly perform a salute and thank Guido van Rossum for creating such a beautiful work of art. After contemplating what a world without Python would look like for a moment, you turn to look at the masterpiece you've written.

The dump method will return a dict containing all of the attributes of a Thing instance. The load class method can be called without instantiating a new Thing manually, and will populate a new Thing instance with existing data without incurring overhead by triggering the constructor. Furthermore, if you move load and dump into a class of their own, you can have Thing inherit that class and then override the methods if needed. This would be useful in the case of classes where attributes tend to disappear and appear randomly. For instance, you could implement versioning logic in the load method based on a version attribute.
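
To make that last idea a bit more concrete, here's a minimal sketch of what such a base class could look like. Everything new in it is an assumption made for illustration: the Serializable name, the VERSION attribute, the upgrade hook and the 'version' key are not part of the code above, and the migration shown is just one way to handle an attribute that appeared later.

```python
"""Sketch of a reusable dump/load base class with versioning."""


class Serializable:

    """Hypothetical base class providing dump() and load()."""

    VERSION = 1

    def dump(self):
        """Return a copy of the instance data, tagged with a version."""
        data = dict(self.__dict__)
        data['version'] = self.VERSION
        return data

    @classmethod
    def upgrade(cls, data, version):
        """Hook for subclasses: bring older data up to date."""
        return data

    @classmethod
    def load(cls, data):
        """Recreate an instance from plain data without calling __init__."""
        data = dict(data)
        version = data.pop('version', 0)

        if version < cls.VERSION:
            data = cls.upgrade(data, version)

        instance = cls.__new__(cls)
        for key, value in data.items():
            setattr(instance, key, value)

        return instance


class Thing(Serializable):

    """A Thing which gained a_prop in (hypothetical) version 2."""

    VERSION = 2

    @property
    def a_prop(self):
        """Return a prop."""
        return self._a_prop

    @a_prop.setter
    def a_prop(self, value):
        """Set a prop."""
        self._a_prop = value

    @classmethod
    def upgrade(cls, data, version):
        """Give data saved before version 2 a default a_prop."""
        data.setdefault('_a_prop', 'value')
        return data

    def __init__(self):
        """Instantiate Thing."""
        self.a_prop = 'value'
```

With something like this in place, data stored before a_prop existed can still be loaded: Thing.load({}) goes through upgrade and comes out with the default value.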

At this point, you make a mental note to never use pickle for versioned objects again.

It's getting late now. You're someone who finishes what they've started though, so you decide you should see this through until the end.

You decide to update the functions that manage storage of Thing instances in $BACKEND_STORE. After adding the correct method calls and replacing pickle with the superior msgpack, your fingers finally relax as you save your module.

"""This module handles Thing storage and loading."""
import msgpack
import storage
from thing import Thing


def store_thing(thing):
    """Store an instance of Thing in $BACKEND_STORE."""
    serialized_thing = msgpack.dumps(thing.dump())
    storage.store(serialized_thing)


def load_things_from_store():
    """Load and return all instances of Thing in $BACKEND_STORE."""
    serialized_things = storage.load_all()
    things = []

    for serialized_thing in serialized_things:
        thing = Thing.load(msgpack.loads(serialized_thing))
        things.append(thing)

    return things

You retire to your chambers for the night, after you quickly write a completely over the top blog post on the completely trivial and boring thing you just did.


Improving Jenkins Pipeline Speeds By Lowering Quiet Periods

Posted on Thu 12 June 2014 in Uncategorized • Tagged with Performance, Continuous delivery/integration/deployment, Jenkins CI, Horrible

Disclaimer: If you've ever looked at the "quiet period" feature before, this blog post will be extremely boring to you.

So today was the first time I took a real look at the quiet period setting available to Jenkins jobs.

I had created a pipeline using the build flow plugin consisting of two jobs (three if you include the actual pipeline). According to the build graph view, all of the individual jobs took only 2.4 seconds to run, combined. Yet, the pipeline took a whopping 10 whole seconds to run. Here's a screenshot:

[Screenshot: pipeline view - before]

After a bit of snooping around I stumbled upon the single most useless feature to ever get included in any application (imvho): "quiet period". You can find that on your job configuration page, under "Advanced Project Options".

The quiet period feature basically prevents a job from running for x seconds after it's been scheduled. This would be useful in a variety of situations, according to some credible sources. The default value is set to 5. 5 whole seconds. Per job. That you're never getting back.

Here's my pipeline after I'd set the quiet period to 0 seconds:

[Screenshot: pipeline view - after]

You're welcome. Feel free to tell me why I did a bad thing.


HA Symfony2: Manipulating Database Sessions

Posted on Tue 27 May 2014 in Web development • Tagged with Symfony2, PHP

During the ongoing quest for high performance and high availability for your Symfony2 project, at some point you're going to want to stick your sessions into your database. Why, you ask? Well, consider the following scenario:

  • You make a webapp named "thing" and deploy it on machines A and B
  • User Bob logs in and starts using thing on machine A
  • Machine A goes down and the service IP switches to machine B
  • User Bob now has to log in again

Storing sessions in the database also scales better, and so on and so forth.

Luckily, Symfony2 has got you covered. Though sessions live somewhere in the app/cache/<env>/ directory by default, there's a short and comprehensive cookbook article that explains how and why to stop abusing I/O.

Once you follow the steps in that article, your sessions will all live happily ever after. In your database.

Manipulating session data

You don't want a gazillion session records in your database. Though the next generation will surely enjoy the fact that humans/bots were already accessing and using your application in the year 2014, there's no real need for you to keep that many sessions. The solution here is to delete all of the things... or most of 'em anyway.

Making a command that deletes old sessions from the database is pretty straightforward. Here's an example:

<?php

// src/Your/Bundle/Command/SessionsPurgeCommand.php
namespace Your\Bundle\Command;

use Symfony\Bundle\FrameworkBundle\Command\ContainerAwareCommand;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;

class SessionsPurgeCommand extends ContainerAwareCommand
{
    protected function configure()
    {
        $this
          ->setName('sessions:purge')
          ->setDescription('Deletes old sessions from the database');
    }

    protected function execute(InputInterface $input, OutputInterface $output)
    {
        $threshold = 86400; // Maximum seconds of inactivity (86400s = 1 day)
        $limit = time() - $threshold; // Time limit, we'll purge older sessions

        $em = $this->getContainer()->get('doctrine.orm.entity_manager');

        $dql = 'select s from YourBundle:Session s
                where s.sessionTime < ?1';
        $query = $em->createQuery($dql);
        $query->setParameter(1, $limit);
        $sessions = $query->getResult();

        foreach ($sessions as $session) {
            $em->remove($session);
        }

        $em->flush();
    }
}

There. Just throw that into a cron job somewhere, and you're good to go.

You can also decode, access and modify user session data easily, since it's now stored in the database. This means you could get stats from logged in users, queue notifications for users, check certain types of history.. stuff like that.

Here's an example command which prints out a list and count of users who have been active in the last 10 minutes:

<?php

// src/Your/Bundle/Command/SessionsCheckCommand.php
namespace Your\Bundle\Command;

use Symfony\Bundle\FrameworkBundle\Command\ContainerAwareCommand;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;

class SessionsCheckCommand extends ContainerAwareCommand
{
    protected function configure()
    {
        $this
          ->setName('sessions:check')
          ->setDescription('Checks user activity for the past couple of minutes and prints out some stats');
    }

    protected function execute(InputInterface $input, OutputInterface $output)
    {
        $threshold = 600; // Maximum seconds for last activity
        $limit = time() - $threshold;

        $em = $this->getContainer()->get('doctrine.orm.entity_manager');

        $dql = 'select s from YourBundle:Session s
            where s.sessionTime >= ?1
            order by s.sessionTime desc';
        $query = $em->createQuery($dql);
        $query->setParameter(1, $limit);
        $sessions = $query->getResult();

        $active_users = array();                // Names of active users
        $total_active_count = count($sessions); // Total active users
        $total_active_auth_count = 0;           // Total active logged in users

        foreach ($sessions as $session) {
            $data = base64_decode($session->getSessionValue());
            $data = str_replace('_sf2_attributes|', '', $data);
            $data = unserialize($data);

            // If this is a session belonging to an anonymous user, do nothing
            if (!array_key_exists('_security_main', $data)) continue;

            // User is logged in, increment counter
            $total_active_auth_count++;

            // Grab security data
            $data = $data['_security_main'];
            $data = unserialize($data);

            // Add username to activity list
            $active_users[] = $data->getUser()->getUsername();
        }

        $output->writeln('The following users were active in the past few minutes:');
        $output->writeln(join(', ', $active_users));

        $output->writeln(sprintf(
            '%s user(s) were active, and %s of them was/were logged in.',
            $total_active_count,
            $total_active_auth_count
        ));
    }
}

I'm not entirely sure what I wanted to achieve by writing this piece of blog padding, but at the very least I'll never lose my session purging code again.

If you've read this far, you might want to buy Bastion on Steam since it's a great game with a godly OST and you'll save 85% at the time of writing. Seriously, hurry.