Friday, June 21, 2013

@reboot jobs will be run at computer's startup.

"@reboot jobs will be run at computer's startup."
What on earth does that mean?

These days RedHat use cronie as the system cron daemon. Described as 'based on the original cron' I think it is a fork of vixi-cron which was used until EL5.

For some time, both of these cron daemons have had an @reboot syntax which allows you to run scripts at (more or less) boot time (when the cron daemon is started). This allows users to start long running processes without the sysadmin having to write an initscript.

It also happens that from time to time, the cron daemon crashes. This is not ideal because cron is the very tool which can be used to periodically confirm that a daemon has not crashed. For now I have added a check to puppet to ensure that the cron daemon is running.

When the cron daemon is started, it logs some messages, one of which is the cryptic:
@reboot jobs will be run at computer's startup.
message. I understand that it is trying to tell me something but it fals short of conveying the message. The internet did not have much to say on the topic either so I had to resort to the source code.

The source of the message is from within the run_reboot_jobs function. The function first checks the existence of a (so called) lock file. The file is
/var/run/cron.reboot
If this file is present, the message is printed out and none of the @reboot jobs are run. If the file is absent, it is created and then the @reboot jobs are queued up to be run.

Perhaps the message should read:
Lock file /var/run/cron.reboot present. @reboot jobs have already been run. skipping.

So that is the mystery almost solved. The remaining details are that during boot, the rc.sysinit script removed a number of stale lock files, including the contents of /var/run. This ensures that at boot time, the cron daemon runs the @reboot jobs.

If you wanted to re-run the @reboot jobs without rebooting your server, you can easily trick it with:
rm /var/run/cron.reboot
service crond restart

Perhaps that could also be added to the init script so you run
service crond restart-boot




Wednesday, May 29, 2013

chkconfig priorities

I recently had a problem on a CentOS-6 machine where my dhcp server did not start at boot because it was serving a virtual network interface which did not yet exist.

The best solution to this problem would be for the dhcp server to start up and wait for the network interface to come up. Many other network tools do this successfully so I don't know why dhcpd should be different. That problem is however too big for me to fix on my server so I need a more simple approach.

To solve this, I would like to adjust my dhcp server to start after the virtual network interface service. CentOS-6 uses the (old) RedHat style init scripts (with some LSB configuration too). RedHat like to proclaim that they use upstarts now but all the upstarts do is make a call to the /etc/rc.d/rc script, just like init used to do.

So, the problem is still based around the System V style scripts which have magic comments which define when to start & stop the scripts.

This is the relevant parts of the header for the dhcp server (/etc/rc.d/init.d/dhcpd):
### BEGIN INIT INFO
# Provides: dhcpd
# Default-Start:
# Default-Stop:
# Should-Start: portreserve
# Required-Start: $network
# Required-Stop:
# Short-Description: Start and stop the DHCP server
# Description: dhcpd provides the Dynamic Host Configuration Protocol (DHCP)
#              server.
### END INIT INFO
#
# The fields below are left around for legacy tools (will remove later).
#
# chkconfig: - 65 35


Despite the comments about being legacy, when installed using chkconfig, the priorities used are indeed Start 65, Kill 35. We can confirm this with a simple ls
ls /etc/rc.d/rc?.d/*dhcpd

The virtual network is also controlled by an init script with a more modest header (/etc/rc.d/init.d/vand):
# chkconfig: 2345 95 05
# description: Virtual Area Network Deamon


Not surprisingly, this will Start at 95 and Kill at 05.

The first idea I had was to modify the dhcpd init scrip directly. Unfortunately this script is not marked as config file (in the rpm) so when a new version comes out, my changes will be lost. I need something better than that.

It is not immediately obvious but chkconfig does have the ability to alter the priorities. This is called override (but not to be confused with the command line parameter --override).


I was unable to find and documentation on the override but I did work out that his minimal config file was all that was needed:


/etc/chkconfig.d/dhcp:
### BEGIN INIT INFO
# Required-Start: vand
### END INIT INFO

When chkconfig reads the headers from the dhpcd init script, it will also read this file (because it has the same basename) and override the values from the init script.

All that is needed is to apply the settings with:
chkconfig dhcpd on

Some information about LSB init scripts can be found here http://refspecs.linuxbase.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/initscrcomconv.html

The final job is to make puppet aware of my changes. Perhaps in my next post.

Friday, May 24, 2013

What happened to java -client?

On our shared server, users are limited to 25 processes. A login script further reduces the soft limit by another 5 processes. This safety margin allows the user to log in an kill something which is using up all their processes. If they want they can raise their soft limit back up to the hard limit.

This works well for most things but java insists on using lots of processes. Confusingly, it prints this message:
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Cannot create GC thread. Out of system resources.
# An error report file with more information is saved as:
# ./hs_err_pid27297.log
The log file is not of much use. It repeats the claim about 'insufficient memory' which is not true and none of the listed solutions will help.

The problem is that the default garbage collector (GC) scales up with the number of CPU cores. Our old server had 8 but the new server has 24. This is easily pushing the users over the 20 thread soft limit.

I don't if the log file lists the number of java threads. If it does I can't work out how to read it. If it could be found it could be compared with the number of available processes:
expr $(ulimit -u) - $(ps x | wc --lines)

In the past I used to solve this problem by adding this to the login script:
alias java="java -client"

but that does not work any more. I don't know when it was removed but it leaves me with the headache of finding a new workaround.

After some research I found this page https://wiki.csiro.au/pages/viewpage.action?pageId=545034311 which suggested using
-XX:ParallelGCThreads=1 -XX:+UseParallelGC
or even just
-XX:+UseSerialGC

This seems to solve the immediate problem but still is not all that useful. To specify the flag to run java you need this syntax:

java -XX:+UseSerialGC

but for javac you have to specify like this:
javac -J-XX:+UseSerialGC


and I don't expect anyone should have to remember that. One simple solution is to alias both those commands but there are many other tools which are part of java which would also need aliases. These other tools may (and do) use their own unique command line parameters which I would rather not have to learn.


Luckily there is also an environment variable you can set like this:
export _JAVA_OPTIONS="-XX:ParallelGCThreads=1 -XX:+UseParallelGC"

however, that makes all the java programs print out this annoying message:
Picked up _JAVA_OPTIONS: -XX:ParallelGCThreads=1 -XX:+UseParallelGC
And the word on the internet is that there is no way to suppress that message.

My solution to use an alias for the most common tools java and javac while keeping the environment variable for all other programs. At first I tried to use this in my alias:
_JAVA_OPTIONS= /usr/bin/java -XX:+UseSerialGC
but an empty variable still shows the message (and best I can tell, bash won't unset a variable like that). I needed to unset it so I had to combine the env command like this:
env -u _JAVA_OPTIONS -XX:+UseSerialGC

My final profile script looks like this:
export _JAVA_OPTIONS="-XX:ParallelGCThreads=1 -XX:+UseParallelGC"
#export _JAVA_OPTIONS="-XX:+UseSerialGC"
alias java="env -u _JAVA_OPTIONS $(which --skip-alias java) $_JAVA_OPTIONS"
alias javac="env -u _JAVA_OPTIONS $(which --skip-alias javac)$(for i in $_JAVA_OPTIONS ; do echo -n " -J$i" ; done)"
You can select which GC you want. I have demonstrated with the more complex example which has two options to be prefixed for javac.

The java users can now happily compile away without java needing the entire machine resources for every invocation.

Sunday, May 12, 2013

Can a KVM guest find out who it's host is?

Our puppet configuration performs a number of checks on our asset database to make sure things are recorded correctly.

One of the properties we record for virtual machines is their location (which host they are running on).

By default, KVM does not expose this information even though there are many ways it could technically be done.

Using bits and pieces from around the internet I have come up with a process where I can pass the serial number of the host into the guest.

qemu permits you to specify many DMI values on the command line like this:
qemu-kvm ... -smbios type=1,serial="MY-SERIAL"
The virtual machine will see this value and puppet will automatically create a fact with this value.

Unfortunately, libvirt does not use this mechanism and the serial number is blank in the virtual machine. The libvirt specification permits setting a value in the .xml config file but it is still not used.

I would like to insert a value which is the same as the host serial but with a prefix to indicate that this is indeed a virtual machine and then a suffix to make sure the serial is unique (such as the vm name).

Initially I used dmidecode to get the host serial number
dmidecode -s system-serial-number
but to run that you must be root and using sudo from the qemu user turned out to be a PITA. In the end I settled for
/usr/bin/hal-get-property --udi /org/freedesktop/Hal/devices/computer --key system.hardware.serial
which uses dbus but means I can run it without being root.

The next step was a shim around qemu-kvm which could add the command line parameters. On RHEL/CentOS 6, the binary lives in /usr/libexec so I put my wrapper in /usr/local/libexec (I think that is the first time I have ever used that directory). When a vm is being started, the first parameter is -name and then the machine name is specified (at least in the current EL6 version. This was not the case in a previous release so it could change). I check to see if that has been specified because qemu-kvm is also invoked by libvirt to check it's configuration/capabilities which does not require the dmi serial (although it does not hurt).

/usr/local/libexec/qemu-kvm
#!/bin/bash
# This is a wrapper around qemu which will supply
# DMI information
if [ "$1" = "-name" ] ; then
    SERIAL=$(/usr/bin/hal-get-property --udi /org/freedesktop/Hal/devices/computer --key system.hardware.serial)
    exec /usr/libexec/qemu-kvm "$@" -smbios type=1,serial="KVM-$SERIAL-$2"
else
    exec /usr/libexec/qemu-kvm "$@"
fi


The final step is to tell libvirt to use my new shim rather than the qemu-kvm binary directly. This also does not seem optimal but can be done by editing every guest and setting the <emulator> path to /usr/local/libexec/qemu-kvm
(either using virsh edit or your favourite XML editor).

Wednesday, May 1, 2013

ProLiant Virtual Serial Port

While working through the configuration of iLO3 for our HP DL380 G7 servers, I found a few pages talking about the Virtual Serial Port (VSP).

This sounded interesting so I though I would try it out. I was using this page as a guide http://www.fatmin.com/2011/06/redirect-linux-console-to-hp-ilo-via-ssh.html but it was a bit out of date.

The first thing to mention was that this does indeed work with iLO3. When configuring your iLO3, remember that in order to ssh to iLO3 you must have a PTY (-t -t), use protocol version 2 (-2) and use DSA keys (ssh-keygen -t dsa). My old configuration for older versions of iLO used exactly the opposite for all of these settings which took some time to figure out (and HP only recently identified/fixed the bugs with ssh keys).

Once you can ssh to your iLO interface, you can issue the command
VSP
and it will start a terminal emulator (actually, it just passes everything through to your terminal program so you can use xterm if you want but safer to assume a vt102). To exit, press ESCape and then ( and you will return to the iLO prompt.

For the OS configuration, I found that I had to make a few alterations. First, RHEL6/CentOS6 uses upstarts and not inittab. If you are not booting with a serial console you must create your own init script:
/etc/init/ttyS1.conf
# ttyS1 - agetty
#
# This service maintains a agetty on ttyS1 for iLO3 VSP.

stop on runlevel [S016]
start on runlevel [235]

respawn
exec agetty -8 -L -w /dev/ttyS1 115200 vt102


You can then enable the console with the command
start ttyS1
Next time you reboot it should start automatically.

In order to log in as root you must add the terminal it to the /etc/securetty file:
echo '/dev/ttyS1' >>  /etc/securetty

So that was all good. Now I want to roll this out to all my servers. To do this I needed some puppet magic. Not surprising, puppet let me down on most of the steps for this.
# Install the init script. This one is easy, just dump in the file 

file { "/etc/init/ttyS1.conf":
...
}

# Add the entry to securetty
augeas { "securetty_ttyS1":
# Ha ha. securetty has a special lens which is different to most other config files
context => "/files/etc/securetty",
changes => [ "ins 0 before /files/etc/securetty/1",
             "set /files/etc/securetty/0 ttyS1",
],
onlyif  => "match *[.='ttyS1'] size == 0",
}


# Enable & start the service. My version of puppet does not support upstarts on CentOS so I can't do this:
service { "ttyS1":
provider => upstart,
ensure => running,
enable => true,
}
# Instead I have created my own type
define upstart($ensure = "running", enable = "true") {
        service { "upstart-$name":
                provider => 'base',
                ensure => $ensure,
                enable => $enable,
                hasstatus => true,
                start  => "/sbin/initctl start $name",
                stop   => "/sbin/initctl stop $name",
                status => "/sbin/initctl status $name | /bin/grep -q '/running'",
        }
}
upstart{"ttyS1": }


And finally, I need to make sure the server BIOS is configured with the VSP.

package { "hp-health": ensure => present } ->
service { "hp-health":
        ensure => running,
        hasstatus => true,
        hasrestart => true,
        enable => true,
} ->

exec { "vsp":
        logoutput => "true",
        path => ["/bin", "/sbin", "/usr/bin", "/usr/sbin"],
        command => "/sbin/hpasmcli -s 'SET SERIAL VIRTUAL COM2'",
        unless => "/sbin/hpasmcli -s 'SHOW SERIAL VIRTUAL' | grep 'The virtual serial port is currently COM2'",
        require => Class['hp_drivers::service'],
} ->

# while we are messing with the serial ports, make COM1 work as the physical device
exec { "com1":
        logoutput => "true",
        path => ["/bin", "/sbin", "/usr/bin", "/usr/sbin"],
        command => "/sbin/hpasmcli -s 'SET SERIAL EMBEDDED PORTA COM1'",
        unless => "/sbin/hpasmcli -s 'SHOW SERIAL EMBEDDED' | grep 'Embedded serial port A: COM1'",
        require => Class['hp_drivers::service'],
}

Friday, April 19, 2013

Visual Studio 2010

I am not a fan of Visual Studio. Unfortunately I must use it for some projects. Recently I was forced to upgrade to Windows 7 and Visual Studio 2010. Not wanting to duplicate all my files, I decided to leave them on a network drive and just access them via the network.

Seems like a good idea, after all, why have a network if you store everything locally? Well, it seems that Visual Studio does not like that.

For some settings it will decide to find a suitable local directory for you. For some other settings it leaves you high and dry.

For example, when I try and build my program I get the error:

Error    1    error C1033: cannot open program database '\\server\share\working\project\debug\vc100.pdb'    \\server\share\working\project\stdafx.cpp    1    1    project

The internet was of little use which is why I thought I would put it in my blog.

This goes a long way to explain my criticism of Visual Studio. After installing several gig of software, is that the best error message it can come up with? Well I will try the help, Oh, that is online only. After jumping through some hoops, the help tells me that:

This error can be caused by disk error.

Well, no disks errors here. Perhaps it means that it does not like saving the .pdb on a network share. What is a .pdb anyway???

In the end, my solution (can I call it that or as Visual Studio hijacked that word?) was to save intermediate files locally:
  • Open Project -> Properties...
  • Select Configuration Properties\General
  • Select Intermediate Directory
  • Select <Edit...>
  • Expand Macros>>
  • Edit the value (by double clicking on the macros or just typing in):
$(TEMP)\$(ProjectName)\Debug\
  • Select OK
  • Select OK

And that seems to sort it out.

Thursday, April 4, 2013

Graphing the inputs of PCF8591

As discussed in previous posts, the PCF8591 provides up to four analog inputs. Displaying these inputs as numbers makes it hard to visualise what is really going on so here I will present some code which can graph the values in real time.

I am still using my demo board from I2C Analog to Digital Converter. (Thanks to Martin X for finding the YL-40 the schematic on the The BrainFyre Blog)
DX pcf8591-8-bit-a-d-d-a-converter-module-150190 YL-40 schematic

It is not clear from the schematic (or looking at the board) but the four inputs are:
  • AIN0 - Jumper P5 - Light Dependent Resistor (LDR)
  • AIN1 - Jumper P4 - Thermistor
  • AIN2 - Not connected
  • AIN3 - Jumper P6 - Potentiometer
It also seems that the board has I2C pull-up resistors which are not required because the Raspberry Pi already has them. This does not appear to cause any problems.

So my plan is to graph the four inputs so I can visualise them responding to changes.

Once again, I don't want to set out to teach C programming but I will be introducing a library called curses (actually ncurses) which makes it easy to display a text interface in a terminal window (virtual terminal).

It will also be able to adjust the analog output value using the + and - keys.

I will only add notes where I am doing something new from the example shown in Programming I2C.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/i2c-dev.h>

We need a new header file
#include <ncurses.h>

int main( int argc, char **argv )
{
        int i;
        int r;
        int fd;
        unsigned char command[2];
        unsigned char value[4];
        useconds_t delay = 2000;

        char *dev = "/dev/i2c-1";
        int addr = 0x48;

        int j;
        int key;

Here we do some ncurses setup. This will allow us to check for a keyboard key press without having to wait for one to be pressed.
        initscr();
        noecho();
        cbreak();
        nodelay(stdscr, true);
        curs_set(0);


This will print out a message on the screen (at the current cursor location which is the top left)
        printw("PCF8591");


This will print out some labels for our graph bars. The text is printed at the specified location (row, column)
        mvaddstr(10, 0, "Brightness");
        mvaddstr(12, 0, "Temperature");
        mvaddstr(14, 0, "?");
        mvaddstr(16, 0, "Resistor");


We must now call refresh which will cause ncurses to update the screen with our changes
        refresh();
        fd = open(dev, O_RDWR );
        if(fd < 0)
        {
                perror("Opening i2c device node\n");
                return 1;
        }

        r = ioctl(fd, I2C_SLAVE, addr);
        if(r < 0)
        {
                perror("Selecting i2c device\n");
        }

        command[1] = 0;
        while(1)
        {
                for(i = 0; i < 4; i++)
                {
                        command[0] = 0x40 | ((i + 1) & 0x03); // output enable | read input i
                        r = write(fd, &command, 2);
                        usleep(delay);
                        // the read is always one step behind the selected input
                        r = read(fd, &value[i], 1);
                        if(r != 1)
                        {
                                perror("reading i2c device\n");
                        }
                        usleep(delay);

 The full range of the analog value 0 - 255 would not fit on most screens so we scale down by a factor of 4. This should fit on a 80x25 terminal nicely.
                        value[i] = value[i] / 4;


Position the cursor at the start of the bar
                        move(10 + i + i, 12);
For each position in the graph, either draw a * to show the value or a space to remove any * that might be there from a previous value
                        for(j = 0; j < 64; j++)
                        {
                                if(j < value[i])
                                {
                                        addch('*');
                                }
                                else
                                {
                                        addch(' ');
                                }
                        }
                }

                refresh();

 Check the keyboard and process the keypress
                key = getch();
                if(key == 43)
                {
                        command[1]++;
                }
                else if(key == 45)
                {
                        command[1]--;
                }
                else if(key > -1)
                {
                        break;
                }
        }


Shutdown ncurses
        endwin();
        close(fd);
        printf("%d\n", key);
        return(0);
}



To compile this program you need to use a new flag -l which says to link with the spcecified library (ncurses)

gcc -Wall -o pcf8591d-graph pcf8591d-graph.c -lncurses

When you run the program you should see something like this:
PCF8591


Brightness  *****************************************************

Temperature *******************************************************

?           *********************

Resistor    *****************************************


While it is running the graphs should move as you change the inputs. Try for example shining a torch on the LDR or adjusting the Pot.

You can adjust the green LED with + and - (hint, use - to go from 0 to 255 for maximum effect). Any other key will cause the program to quit.

The graph shows nicely how the inputs can change but it also shows how the value can fluctuate without any input changes. I can't explain exactly why but I would expect much of the fluctuation is because of the lack of a stable external clock/oscillator and/or instability in the reference voltage. Needless to say this is a low cost demo board and may not be exploiting the full potential of the PCF8591.

Here is the complete source code (pcf8591d-graph.c):
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/i2c-dev.h>

#include <ncurses.h>

int main( int argc, char **argv )
{
        int i;
        int r;
        int fd;
        unsigned char command[2];
        unsigned char value[4];
        useconds_t delay = 2000;

        char *dev = "/dev/i2c-1";
        int addr = 0x48;

        int j;
        int key;


        initscr();
        noecho();
        cbreak();
        nodelay(stdscr, true);
        curs_set(0);

        printw("PCF8591");

        mvaddstr(10, 0, "Brightness");
        mvaddstr(12, 0, "Temperature");
        mvaddstr(14, 0, "?");
        mvaddstr(16, 0, "Resistor");

        refresh();
        fd = open(dev, O_RDWR );
        if(fd < 0)
        {
                perror("Opening i2c device node\n");
                return 1;
        }

        r = ioctl(fd, I2C_SLAVE, addr);
        if(r < 0)
        {
                perror("Selecting i2c device\n");
        }

        command[1] = 0;
        while(1)
        {
                for(i = 0; i < 4; i++)
                {
                        command[0] = 0x40 | ((i + 1) & 0x03); // output enable | read input i
                        r = write(fd, &command, 2);
                        usleep(delay);
                        // the read is always one step behind the selected input
                        r = read(fd, &value[i], 1);
                        if(r != 1)
                        {
                                perror("reading i2c device\n");
                        }
                        usleep(delay);


                        value[i] = value[i] / 4;
                        move(10 + i + i, 12);

                        for(j = 0; j < 64; j++)
                        {
                                if(j < value[i])
                                {
                                        addch('*');
                                }
                                else
                                {
                                        addch(' ');
                                }
                        }
                }

                refresh();

                key = getch();
                if(key == 43)
                {
                        command[1]++;
                }
                else if(key == 45)
                {
                        command[1]--;
                }
                else if(key > -1)
                {
                        break;
                }
        }


        endwin();
        close(fd);
        printf("%d\n", key);
        return(0);
}