Blog entries published in 2010

Hadoop lesson learnt: Restart datanodes after modifying dfs.balance.bandwidthPerSec

Published: 2010-09-10 13:17 UTC. Tags: hadoop

I was rebalancing one of the Hadoop clusters I run at work. It was not running very fast, so I modified the appropriate setting:

  <!-- 100Mbit/s -->

I restarted the namenode and thought that would do the trick. But no, you also need to restart all your datanodes for the setting to take effect. Now I can see some action on my network graphs :-).
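
For reference, dfs.balance.bandwidthPerSec is given in bytes per second, so a target like the 100Mbit/s in the comment above has to be converted first. A minimal sketch of that arithmetic, assuming decimal megabits (10^6 bits); the function name is made up for illustration:

```python
def mbit_to_bytes_per_sec(mbit_per_sec):
    """Convert decimal megabits per second to bytes per second."""
    # 1 Mbit = 1,000,000 bits (assumption: decimal units), 8 bits per byte.
    return mbit_per_sec * 1_000_000 // 8


# Value to put in the property for a 100Mbit/s balancing limit.
print(mbit_to_bytes_per_sec(100))
```

If you prefer binary units, swap 1,000,000 for 1,048,576; either way the number that goes into the property is a plain byte count.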


Whenever You Need a Random Password

Published: 2010-04-14 20:28 UTC. Tags: open source

apt-get install pwgen
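
If pwgen is not at hand, something similar can be sketched with the Python standard library. This is not pwgen's algorithm (pwgen aims for pronounceable passwords by default); it just picks characters at random, and the function name, length and alphabet are assumptions for illustration:

```python
import secrets
import string


def random_password(length=16, alphabet=string.ascii_letters + string.digits):
    """Return a random password drawn from alphabet using the OS CSPRNG."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # secrets.choice is meant for security-sensitive random selection.
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(random_password())
```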


Command Line Copy and Paste in Gnome Terminal

Published: 2010-04-10 11:08 UTC. Tags: linux

In the category "Stuff I really should have learned several years ago": I now know that the keyboard combinations for copying and pasting in gnome-terminal are Shift-Control-C and Shift-Control-V.

Now, if I could only find out how to select text without using the mouse...


Forsberg's Law on Cron Jobs

Published: 2010-02-19 09:45 UTC. Tags: software

They never work as intended the first four times you run them.


Backup of MySQL via phpMyAdmin

Published: 2010-02-06 19:51 UTC. Tags: misctools python

My girlfriend runs a blog on a cheap hosting firm that doesn't provide any way of doing proper SQL dumps of the MySQL database used by the blogging software.

There are plugins for WordPress that can do full backups, but I prefer doing raw SQL dumps + a filesystem backup. That way you know what you get, and you don't have to trust the backup plugin author to do it right.

The hosting firm does provide access to a phpMyAdmin installation which you can use to download SQL dumps. The trick is of course to do this automatically, as good backups need to be unattended.

I wrote a Python program that can do this, using what turned out to be an excellent library for programmatic web browsing: mechanize.

The backup script is available in my misctools project on GitHub.


Easy Update of Slicehost DNS Entries

Published: 2010-02-06 18:43 UTC. Tags: misctools python

This website runs on a virtual machine I buy from Slicehost. I've also chosen to use their DNS servers for my domain - the service is stable and included in the price.

The Slicehost DNS can be modified using the Slicehost API. I wrote two small scripts for easy modification of Slicehost DNS entries from the command line or from other scripts.

  • update_entry, for adding new entries or updating existing ones.
  • dhclient_update_hook, which can very easily be used to update an entry from a dhclient script, to keep records that point to dynamic addresses updated automatically.

Both are available by cloning my misctools project on GitHub.


PostgreSQL/Python/psycopg2: Confusing error, port setting required for socket connections

Published: 2010-02-06 13:43 UTC. Tags: django python

When trying to get my local development copy of this website running after upgrading Ubuntu, I got the following confusing error message from the psycopg2 Python module:

psycopg2.OperationalError: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket  "/var/run/postgresql/.s.PGSQL.5432"?

My Django settings file was correct:

DATABASE_ENGINE = 'postgresql_psycopg2'  # 'postgresql_psycopg2', 'postgresql', 'mysql', 'sqlite3' or 'oracle'.
DATABASE_NAME = 'dbname'                 # Or path to database file if using sqlite3.
DATABASE_USER = 'dbuser'                 # Not used with sqlite3.
DATABASE_PASSWORD = 'dbpassword'         # Not used with sqlite3.
DATABASE_HOST = ''                       # Set to empty string for localhost. Not used with sqlite3.
DATABASE_PORT = ''                       # Set to empty string for default. Not used with sqlite3.

Confusing, since my Postgres server was running and I could connect using psql:

psql -U dbuser -W dbname

This turned out to be one of those problems where Google is of no help: others had the same problem, but I could only find posts where people asked the question, none where the actual solution was found.

The cause of the problem was that my PostgreSQL installation was configured to listen on port 5433 instead of the default 5432, and as seen in the error message, the port number is part of the path to the Unix socket. The different port was probably set up when I upgraded Ubuntu, since that installed PostgreSQL 8.4 without completely removing PostgreSQL 8.3. The old 8.3 installation is the one configured to listen on the default port.
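
Since the port is part of the socket file name (.s.PGSQL.<port>), a quick way to spot this kind of mismatch is to look at which socket files actually exist. A minimal sketch, assuming the sockets live in /var/run/postgresql as in the error message above; the function name is made up for illustration:

```python
import os
import re

# Socket files are named ".s.PGSQL.<port>"; the ".lock" companions are skipped.
SOCKET_NAME = re.compile(r"^\.s\.PGSQL\.(\d+)$")


def postgres_socket_ports(socket_dir="/var/run/postgresql"):
    """Return sorted port numbers taken from .s.PGSQL.<port> file names."""
    try:
        names = os.listdir(socket_dir)
    except OSError:
        # Directory missing or unreadable: report no ports.
        return []
    ports = []
    for name in names:
        match = SOCKET_NAME.match(name)
        if match:
            ports.append(int(match.group(1)))
    return sorted(ports)


print(postgres_socket_ports())
```

If the list contains 5433 but not 5432, a client that falls back to the default port will look for a socket file that is not there, which matches the error above.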

The solution is either to configure the running PostgreSQL to listen on port 5432 by modifying /etc/postgresql/8.4/main/postgresql.conf, or to modify the Django configuration by setting the port: