Andrew Stacey


About
Andrew Stacey
Information about my research, teaching, and other interests.

By: Andrew Stacey
Contact details


Andrew Stacey


blosxom icon


Fri, 14th Aug 2009 (HowDidIDoThat :: Mathematical)

Setting up the n-Lab

The nlab is in the process of migrating to a new host. This is a record of the steps needed to get it working.


New Host

The new host we chose is Rails Playground. They get good reviews and are specialists in Ruby on Rails. We chose their baseline VPS package (256) which had enough space (meaning all of storage, bandwidth, memory) for the current usage on the n-lab (plus a little extra). Moreover, expanding later seems a fairly painless procedure.

Signing up was straightforward. I chose the 'Ubuntu 8.10 with Apache, Passenger, MySQL' image. I already had a spare domain name, mathforge.org, so chose to use that (since the plan is to shift the ncatlab.org name over once it is all ready). I quickly had access to the VPS control panel where I could add this hostname. The domain name was registered through a different company so I had to change the nameservers with that company and then add the domain name to the DNS Manager on Rails Playground. I also added the subdomain nlab.mathforge.org to experiment with virtual hosts.

Installing Instiki

I have full ssh access to the VPS. The first thing to do was to add a new user to run instiki as.

useradd -m instiki

(Perhaps nlab would have been more sensible.)

I wanted to use bzr to keep the instiki installation up to date but unfortunately bzr wasn't installed. Fortunately, it's a VPS and I have root access so a simple

aptitude install bzr

was sufficient. Then I "became" instiki and pulled the itex2mml and instiki code over to the VPS, as per the instructions on the instiki with bazaar page.

bzr branch http://golem.ph.utexas.edu/~distler/code/itexToMML/ itexToMML
bzr branch http://golem.ph.utexas.edu/~distler/code/instiki/svn/ instiki

Compiling itexToMML required a couple of additional packages:

aptitude install swig flex bison

(As this is intended as a complete record of everything I've done on this host, I ought in all honesty to confess to installing zsh as well and changing the login shells of root and instiki to zsh. Because it's worth it.)

That was sufficient for compiling itexToMML. I did install it as root, but for some reason (presumably to do with paths), instiki didn't detect it. Rather than mucking about with environments, I figured it was easier to follow the instructions on this page and install itexToMML locally in the instiki installation.

That was sufficient to get instiki up and running. Launching it by itself now worked:

instiki -d

produced a fresh instiki process on port 2500

Migrating the n-lab

Copying over the database from the n-lab was not difficult. What was mildly complicated was converting it to MySQL. There appears to be a standard procedure for doing this, the instructions here seem fairly standard. However, rails can't cope with converting large databases. The n-lab database is over 100Mb and that's just too big (to check, I tried the conversion with a smaller database and that worked just fine). So I had to do it manually.

The first step is to create the database and user. I followed the instructions from this article on the RailsPlayground knowledge base, taking note of the comment at the bottom. Actually, in the end I discovered that I also had to add INDEX ability to the user. Thus I did (or should have done):

mysql
create database instiki;
create user 'instiki'@'localhost' IDENTIFIED BY 'if you think I'm
putting the password here ...';
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, INDEX, LOCK TABLES, ALTER, RELOAD ON instiki.*
TO 'instiki'@'localhost';
quit

(I don't think I'm revealing state secrets here if I say that the password can be found from the instiki configuration files.)

Update September 4th: Added LOCK TABLES, ALTER, RELOAD to the above.

The basic procedure is simple: export the database from sqlite3 and reimport it to mysql. Unfortunately, sqlite3 and mysql speak different dialects of sql (think a geordie talking to a cockney). So a little tweaking was necessary. I found scripts online which puported to do this conversion (one perl and one python) but none got it quite right. I suspect that I don't have it quite right yet either, but the following code seems to work.

#!/usr/bin/perl

while (<>) {

    /^BEGIN TRANSACTION/ and next;
    /^COMMIT/ and next;
    if (/^CREATE TABLE/) {
        s/(?:\(|, )\K"(\S*)"/`\1`/g;
        s/PRIMARY KEY NOT NULL/PRIMARY KEY NOT NULL AUTO_INCREMENT/;
        s/^CREATE TABLE (\S*)/DROP TABLE IF EXISTS \1;\nCREATE TABLE IF NOT EXISTS `\1`/;
    }
    s/^INSERT INTO "(\S*)"/INSERT INTO \1/;
    /^CREATE INDEX/ and s/"/`/g;
    s/\\/\\\\/g;

    print;
}

(Update: 1st September: the \K syntax for perl regular expressions seems to be reasonably new. It's in perl v5.10.0 (on mathforge) but not in v5.8.8. For older version of perl, replace the line s/(?:\(|, )... by the routine

while (/(\(|, )"/) {
    s/(\(|, )"(\S*)"/\1`\2`/g;
}

for the equivalent behaviour.)

So, assuming that that is in sqlite3toMySQL.pl one does

sqlite3 production.db.sqlite3 '.dump' | perl sqlite3toMySQL.pl >
nlab.mysql

to export the database, and then

mysql -u instiki -p instiki < nlab.mysql

to import it. It prompts for the password.

I then shifted instiki over to mysql by editing config/database.yml. I removed the lines

production:
  adapter: sqlite3
  database: db/production.db.sqlite3

and replaced them by

production:
  adapter: mysql
  database: instiki
  username: instiki
  password: yeah, right
  host: 127.0.0.1
  port: 3306

I've also tried using the mysql socket. Instead of the host and port the syntax is:

socket: /var/run/mysqld/mysqld.sock

Presumably one is faster than the other, but I haven't investigated yet.

Update: 19th August: sqlite3 and MySQL have different attitudes to data types. Basically, sqlite3 has an "anything goes" approach whilst MySQL says "if it says duck on the tin, it had better go quack". (Refs: sqlite3 and mysql.) The stylesheet tweaks ('additional_style') on the nlab are defined as varchar(255) but have grown rather larger than that. I've changed them to text, but maybe increasing the size of the varchar would be better. A quick scan through of the other varchar columns in the various databases doesn't reveal any others that might be in danger of outgrowing their restrictions so this seems to be the only one to keep an eye on. It needs manually redoing whenever the database is converted.

Update: 1st September: Spoke too soon. We have page titles of up to 95 characters whereas the official limit is 60.

Update: 1st September: there's a malformed URL in the original database that the old software was fine with but the new has issues with (it seems to be a problem with the different versions of ruby). The URL in question contains unescaped percent signs which need to be converted to '%26's. The relevant code is:

cat nlab.mysql | perl -lpe '/codecogs/ && s/%/%26/g' >  nlab.codecogs.mysql

Update: 1st September: according to this post on the mysql mailing list, the unix socket is the faster protocol (it also suggests a test to run, that's on the TODO list!).

Update: 1st September: A further difference between the two database types is to do with comparisons. In MySQL, the default is for comparisions to be case insensitive. Relevant websites are here and here.

Update 4th September: Here's the procedure I'm using for fixing the tables.

  1. Backup the data. A simply mysqldump instiki > nlab.$(date +%F).mysql should suffice for this.

  2. Modify the tables. For this we use the ALTER TABLE command.

    ALTER TABLE wiki_references MODIFY referenced_name varchar(100) DEFAULT '' NOT NULL; ALTER TABLE pages MODIFY name varchar(100) DEFAULT NULL; ALTER TABLE revisions MODIFY content longtext DEFAULT '' NOT NULL;

    Doing a new backup and diffing the result with the original revealed that no data had been changed.

  3. Make string comparisons case sensitive. This can be done at several levels: server, database, table, and column. I chose "database" as it was the highest that the instiki user had access to.

    ALTER DATABASE instiki COLLATE latin1_bin;

    I don't know how to figure out what collations are available, so pure experiment led me to that choice.

    This didn't work. However, changing the collation at the column level did work. I changed it for name in pages and referenced_name in wiki_references. I shall leave the question as to whether it should be changed for other columns to another day.

  4. Enable binary logs (for backup and recovery), and slow query logs (for seeing what to optimise). To do this, I uncommented the lines log_bin and log_slow_queries in /etc/mysql/my.cnf and restarted the mysqld server (/etc/init.d/mysql restart). I also set long_query_time to 5 (seconds) rather than the default 10.

  5. Reload the long pages back into the database. First, I had to find the long pages. Due to the truncation, these were guaranteed to have length 65535.

    % mysql mysql> tee LongPages; mysql> SELECT id, LENGTH(content) FROM revisions WHERE LENGTH(content)=65535; mysql> quit % cat LargePages nlab.mysql | perl -ne '/^\| (\d+)/ and {$1} = 1; if (/^INSERT INTO revisions VALUES((\d+)/) { if ({$1}) { = 1 } else { = 0; print ; = "" } }; if () { .= $_}; ' > nlab.longentries.mysql

    Then just replace INSERT INTO with REPLACE INTO in the file so that the revisions are overwritten. After that, it's a simple

    mysql < nlab.longentries.mysql

    to replace the revisions by their longer versions.

Incidentally, the instiki user now has a .my.cnf file in its home directory. The contents are:

[client]
user=instiki
password=like I'm going to put that here!

[mysql]
database=instiki

This means that the mysql command automatically connects to the instiki database and doesn't need a password.

Passenger

Following the instructions on the knowledge base article, I then set up the virtual host. The file is /etc/apache2/sites-available/instiki. Use whatever editor you like. It contains the lines:

<VirtualHost *:80>
ServerName nlab.mathforge.org
DocumentRoot /home/instiki/instiki/public
</VirtualHost>

which says that the plugging http://nlab.mathforge.org into a browser gets one directed to the directory /home/instiki/instiki/public. To enable this, type

a2ensite instiki
/etc/init.d/apache2 reload

Future modifications to this file merely require the second of these to reread the configuration.

Within /home/instiki/instiki/public is the .htaccess file that redirects all requests to the instiki process. By default it uses the cgi dispatcher, as prompted in that file I redirected to the fastcgi one by changing dispatch.cgi to dispatch.fcgi.

An important point to record is that whenever changes are made to either the apache configurations or to instiki then stuff has to be reloaded. Reloading the apache configuration has to be done as root. To reload it, do

/etc/init.d/apache2 reload

Restarting the instiki process has to be done as instiki. To restart it, do

touch ~/instiki/tmp/restart.txt

Update: 19th August: I've now added an exception to the sudoers file that allows the user instiki to issue the apache restart command in case of a server crash. The relevant command is:

sudo /etc/init.d/apache2 restart

Note that restart is the only argument allowed. If it is not already running then it is equivalent to start.

I have not enabled reload for instiki since a reload is only relevant if the configuration has changed and only root can do that, so if someone has access to change the configuration then they have access to reload it as well.

So far, so good

At this point, most of instiki seems to work. What does not work is anything requiring a POST directive. Testing by running instiki on its own shows that it is not instiki that is the problem. Rather POST directives are not getting as far as instiki. Unfortunately, it is proving tricky to track down the errors.

According to a comment on the RailsPlayground forum, the most recent version of rails requires a newer version of Passenger than comes with the default Ubuntu image. Of course, the RailsPlayground people aren't idiots and have a compatible version of Rails on their system. Unfortunately, instiki comes with its own rails bundled in and that is the newest version (ish). So we try installing the latest passenger. As root:

gem install passenger
passenger-install-apache2-module

This concludes by telling us to add some lines to the apache configuration. The file to add them to is /etc/apache2/conf.d/modrails and in fact they comprise the entirity of that file. The lines are:

LoadModule passenger_module /usr/lib/ruby/gems/1.8/gems/passenger-2.2.4/ext/apache2/mod_passenger.so
PassengerRoot /usr/lib/ruby/gems/1.8/gems/passenger-2.2.4
PassengerRuby /usr/bin/ruby1.8

Lo, and behold! Success. I can now scribble all over the n-lab.

As part of this investigation, I tried seeing if there was a conflict between using the fastcgi and passenger. I did this by moving the .htaccess file out of the way in the instiki installation. It didn't seem to do anything so I'm still a little foggy as to what the process is for loading pages and what does what. I ought to try to understand this a little better to make the best choice as to what configuration to use - there's no point in having things turned on that never get used.

Other Files

A comparison of file lists showed that all of the extra files in the original installation are in the webs directory. Simply tarring that and transferring it over seems sufficient to enable them on the new hosts. Oh, and there's the favicon.ico in the public directory.

Crontab

Using crontab -e, I set things up so that instiki checks for updates every night and then indicates to passenger that it would like to be restarted. The relevant line is:

0 0 * * * cd instiki; bzr pull; touch tmp/restart.txt

It's possible that I should use absolute paths for those. I'll see what error messages that produces first.

Still To Do

Turns out that copying over the database isn't quite everything that one needs to do to clone an instiki installation. There appear to be some other files as well that may have been modified. It may take a little while to track them down, but for the moment, I'm happy to have a working system.

Memory Usage

Shortly after announcing the test site, the whole thing crashed as it ran out of memory. Seems as though everyone flooded over to the site and tried loading it all in one go. On the one hand, this is possibly unusual activity but on the other, it's a good test of how the system handles itself under pressure. In the end, I had to reboot it.

In response to my question on the excellent Rails Playground Forum, it was suggested that I install ruby enterprise edition as that has some optimisations for memory usage over ordinary ruby. Also they're the people responsible for phusion passenger. After installing that, I reinstalled the apache module as instructed and reloaded the apache configuration (/etc/init.d/apache2 reload). The other suggestion was to remove the innodb module from mysql. Turns out that Instiki uses MyISAM by default anyway so this was no hardship. Doing this involved commenting out a line in /etc/mysql/my.cnf about skipping innodb (instructions from the reply to my question on the forum above).

This does seem to have brought the memory usage right down. Before doing this, top reported about 300Mb usage. Afterwards it was down to 120Mb. We'll see how it copes with usage, though. For comparison, the nlab reports usage of about 250Mb.

Permissions

I keep running into difficulties with permissions. It's my own fault: I copied my zsh configuration files from my user directory to the root directory. As a user, having umask 077 is good. As root, it's daft because root often installs stuff for others to use and so having umask 077 means that they can't use 'em. I've changed it back to the more sensible umask 022, but there may be places I've not noticed something's been installed but is unreadable.

Encoded Slashes

Following experiments with 3/2-colimits and finding a discussion of encoded slashes here I've added the AllowEncodedSlashes on directive to the nlab.mathforge.org virtual site (in /etc/apache2/sites-enabled/instiki).

mod_xsendfile

Turns out that the default Apache installation on the Ubuntu image doesn't come with mod_xsendfile so I had to download and install it.

wget http://tn123.ath.cx/mod_xsendfile/mod_xsendfile-0.9.tar.gz
tar xzf mod_xsendfile-0.9.tar.gz
cd mod_xsendfile-0.9
apxs2 -cia mod_xsendfile.c

At this point it complained about httpd.conf being empty. This is because Debian has a more modular approach to configuration files (and Ubuntu is built on Debian). The two necessary files are xsendfile.load and xsendfile.conf in /etc/apache2/mods-available.

xsendfile.load should contain:

LoadModule xsendfile_module /usr/lib/apache2/modules/mod_xsendfile.so

xsendfile.conf should contain:

XSendFile On
XSendFileAllowAbove On

Then it's a simple

a2enmod xsendfile

to enable it.

Now cached files just zoom past.

Update: 1st September Exactly why the cached files zoom past is a little bit of a mystery since it's not completely clear that mod_xsendfile is being used, or if it is, exactly when.

Memory Usage and Monitoring

Under it's first load test, the server crashed. Following advice on the Rails Playground Forum, I've limited the PassengerMaxPoolSize to 4 (down from 6). (Note: the advice was merely to change this, I chose 4 as an experiment. The advice on the Passenger User Guide was to set it to 2 for a VPS with 256Mb. We have 256Mb guaranteed and can access up to 512Mb, also we're using Ruby Enterprise Edition which is meant to reduce memory usage. A value of 2 is still twice as many as would be available when running Instiki through a proxy!)

Passenger has its own set of tools for memory monitoring. Basically, there's a difference between the memory that a process reserves and that it actually uses. The tools to see how much is actually used are passenger-status and passenger-memory-stats.

(Although I appreciate the difference, when it comes to making use of memory then I don't see much difference between "reserved" and "used". Surely both are unavailable for other processes?)

Things to do

  1. Implement a database backup strategy. Using mysqldump (or possibly mysqlhotdump as we're using MyISAM), do a full backup every Sunday at 6am (UTC) followed by incremental backups every day at 6am (UTC). Seems that the relevant commands are:

    mysqldump --single-transaction --flush-logs --master-data=2 \ --all-databases --delete-master-logs > backupsunday1_PM.sql mysqladmin flush-logs

    The second command doesn't actually produce a backup file. What it does is start a new binary log, so that the old binary log is the relevant backup file. Thus the old binary log needs to be copied somewhere.

    Also, to do this, binary logging needs to be turned on. Apparantly, that's only a 1% hit in speed so worth it.

    Relevant pages in the MySQL documentation: example strategy, mysqldump, database backups.

  2. Change the types of table columns (and reinstate any truncated data). The page names and wiki redirect names need to be bigger, current usage indicates 100 characters is just enough. The length of the pages also needs to be bigger. MEDIUMTEXT should be sufficient, but perhaps LONGTEXT would be best. The difference in actual storage usage is minimal so we may as well go for the max.

    As this involves poking around in the database, I'd like to ensure that I have a full backup first.

  3. Change the setting so that string comparisions are case sensitive.

  4. Install monit to keep an eye on memory usage. We are still hitting our ceiling and while this indicates that really we need more memory, it seems that sometimes a process goes 'rogue' and needs killing off. It is possible that this occurs when we first hit the ceiling, cause and effect are a little hard to distinguish here. This needs testing and monitoring, hence monit.

  5. Figure out the Instiki logs. Having multiple instances writing to the same log file seems not to work too well. When a new instance gets started up, it seems to rotate the log file. When two instances are working at the same time, their log messages get interweaved. It'd be better to clean this up a little. A little searching throws up this link and that link which may help.

Logging

I was wondering why the logs weren't getting rotated. Tracing things through, it seems that cron ought to have been doing it, but cron was being polite and letting anacron deal with daily, weekly, and monthly tasks. However, anacron isn't running in the default runlevel on the server, and since the server is meant to be always-on then the advantage of anacron over cron isn't apparent. But cron merely checks for the existence of the anacron binary - it doesn't check if it's actually running. Daft. So I've modified the system crontab to remove its dependency on anacron. See if that works!

Instiki Logging

I've installed the SyslogLogger gem and am trying it out for logging the instiki processes. Since we're using REE rather than the installed ruby, that's the one that has to install the gem.

~# /opt/ruby-enterprise-1.8.6-20090610/bin/gem install SyslogLogger

By default, SyslogLogger uses the user facility. I decided that it would be best to dedicate one of the local facilities to rails apps to separate them from all other logs. To do this, I installed this patch which meant that I could alter the facility. I also had to modify the syslog.conf file, which also revealed one or two mistakes in that file! A couple of the log files were miswritten, so that /var/log/auth.log became /var/logauth.log. I'd been wondering why those files were in strange places. I removed the local0 facility from everywhere that it seemed to be logging to (i.e. the defaults) and sent all log messages on that handle to /var/log/ruby/instiki.log. This seems to work, though it does produce a lot of log messages.

The relevant change in the instiki code is to config/environments/production.rb:

require 'syslog_logger'
config.logger = RAILS_DEFAULT_LOGGER = SyslogLogger.new('instiki_dev',Syslog::LOG_LOCAL0)

#config.action_controller.logger = Logger.new("#{RAILS_ROOT}/log/#{RAILS_ENV}.log", 25, 1024000)

I also programmed logrotate to do ... err ... log rotation. The configuration is in /etc/logrotate.d/nlab and read:

/var/log/ruby/*.log {
   rotate 14
   daily
   dateext
   delaycompress
   compress
   missingok
   create 644 root adm
   sharedscripts
   postrotate
     /usr/bin/touch /home/instiki/instiki/tmp/restart.txt
   endscript
}

[Full link]
Last modified on:
Tue, 8th Sep 2009