Friday, April 24, 2009

Histograms in gnuplot

I'm pretty new to gnuplot, but have found it to be a handy tool for simple plots.  

Today I needed to generate a simple histogram.  Rather than creating a permanent script I've just been piping a string into gnuplot from echo.

echo "set terminal png;\
set output 'timings.png';\
set style fill solid 0.5 border -1;\
set xlabel 'chunks';\
set ylabel 'time to complete (m)';\
set key autotitle columnhead;\
set title 'timings on 20 nodes';\
set auto x;\
plot 'timings.tsv' u 2:xticlabels(1) with boxes lt 2;"\
|  gnuplot

and out pops a nice histogram (sorry, can't post this example as it's work related).

The input file has a header row, and its first column is used for the x-axis tick labels (that's what the xticlabels(1) call above does); the second column holds the values that get plotted.
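Since I can't share the real data, here is a rough sketch of the file layout that plot command expects. The column names and numbers are made up purely for illustration; only the shape (a header row, labels in column 1, values in column 2, tab separated) comes from the description above.

```shell
# Hypothetical sample data: printf reuses the format for each pair of
# arguments, so this writes a header row followed by three data rows.
printf '%s\t%s\n' \
    chunk minutes \
    1 42 \
    2 37 \
    3 45 > timings.tsv

# Show the result; column 1 becomes the x tick labels, column 2 the box heights.
cat timings.tsv
```

With a file like this in the current directory, the echo-into-gnuplot one-liner should have something to draw.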

Thursday, April 23, 2009

Stacktrace from a running process

There have been a few occasions where I've been very confused as to why a process is taking a long time to run (usually Java processes).  It's often handy to request a stacktrace from the running process; for a Java process you can do this without terminating it by issuing 'kill -3 <pid>', which sends SIGQUIT and makes the JVM print a thread dump to its own stdout.  Good, eh?
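As a small sketch of the same thing in script form: the helper name dump_threads and the pid in the usage line are placeholders of mine, not anything standard. Note that this is only safe for processes that handle SIGQUIT themselves (as the JVM does); most other programs will simply die when they receive it.

```shell
# Look up the name of signal number 3 (SIGQUIT).
kill -l 3

# Hypothetical helper: ask the JVM with the given pid for a thread dump.
# The dump appears on the JVM's stdout, not in this terminal.
dump_threads() {
    kill -QUIT "$1"    # equivalent to: kill -3 <pid>
}

# Usage (12345 is a placeholder pid):
#   dump_threads 12345
```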

Unix Join with tabs

This one had me scratching my head for a while yesterday.  When you use the unix 'join' command with the default separators (whitespace) then you can join tab delimited files, but the output is space delimited.

I tried specifying the join character with -t "\t" or -t '\t' and even the desperate -t\t, but none of that works.  It turns out you have to use an *actual* quoted tab character. I found this through a quick Google search (the solution was here, thank you JJinuxLand).

You can insert a tab on the command line with the following key combo "ctrl-v <tab>".

Update: the easier way of doing this is to use the special quoting construct $'string' e.g. join -t $'\t'
See the "QUOTING" section of the bash man page for details.
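Here's a quick sketch of the whole thing end to end. The two input files are invented for the example, and I've captured the tab with printf so the snippet also runs in shells that lack the $'string' construct; in bash, tab=$'\t' does the same job.

```shell
# Hypothetical sample files, already sorted on the first (join) field.
printf '1\tapple\n2\tbanana\n' > left.tsv
printf '1\tred\n2\tyellow\n'   > right.tsv

# Capture a literal tab character; in bash, tab=$'\t' is equivalent.
tab=$(printf '\t')

# Join on field 1, reading and writing tab-delimited fields.
join -t "$tab" left.tsv right.tsv > joined.tsv
cat joined.tsv
```

The output keeps its tabs, so it can be fed straight into another join or cut.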

print "Hello World!"


I'm intending for this blog to just be a brain export of any useful/interesting or confusing things I come across in the course of my work.  It'll probably contain items on MySQL, relational databases in general, Unix/Linux, Perl, Java, bash, svn, eclipse and the like.  It may also end up having some interesting or surprising biology/biochemistry/bioinformatics items as well.