Tuesday, July 21, 2009

Command line R

Sometimes I just want to print out a histogram from a file, or create a simple summary of some numeric data. It's good to be able to just bash off an R script from the command line to do this for you.

There are a couple of easy ways to invoke R to run in 'batch' mode (i.e. non-interactive):
1. R CMD BATCH <scriptname>
2. R --vanilla --slave < <scriptname>

With option 1 the output is saved to a file named after the script with .Rout appended (so foo.R gives foo.Rout); with option 2 the output goes to STDOUT, which, for me, is much more useful.
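As a rough sketch of the two invocations side by side: the script name summary.R and its contents are made up for illustration, and the check for R is only there so the snippet does nothing harmful on a machine without R installed.

```shell
# Hypothetical two-line script; summary.R is an illustrative name
cat > summary.R <<'EOF'
x <- c(1, 2, 3, 4)
print(mean(x))
EOF

if command -v R >/dev/null 2>&1; then
    # 1. Output is written to summary.Rout in the current directory
    R CMD BATCH --vanilla summary.R
    cat summary.Rout

    # 2. Output goes straight to STDOUT
    R --vanilla --slave < summary.R
fi
```

The .Rout file from option 1 also echoes the commands and some timing information, which is part of why the STDOUT form tends to be handier for quick jobs.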

Also, you can use the standard shell 'tricks' (here, a bash here-string, <<<) to create quick scripts without having to save a script file:


1. Create a numeric summary of the input data (in this case for a file with lines of the form "name,value"; only the summary of the value column is shown):

R --vanilla --slave <<< "d=read.table('data.scores', sep=',');summary(d);q()"
Min. :1.333
1st Qu.:4.037
Median :4.651
Mean :4.634
3rd Qu.:5.282
Max. :8.000

2. Create a histogram for the data and save it to a PNG (don't forget to escape any special shell characters, such as the $ that the shell would otherwise expand inside double quotes).

R --vanilla --slave <<< "d=read.table('data.scores', sep=',');png('data.png');hist(d\$V2);q()"
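If the escaping gets fiddly, one alternative is a quoted heredoc, which the shell passes through untouched. This is a minimal sketch, not the only way to do it: the sample data values are invented, the value is assumed to sit in the second column (V2) as the "name,value" layout implies, and dev.off() is used here to close the PNG explicitly.

```shell
# Sample data in the assumed "name,value" layout (made-up values)
printf 'a,1.5\nb,4.0\nc,8.0\n' > data.scores

# Quoting 'EOF' stops the shell from expanding $V2, so no backslash is needed
script=$(cat <<'EOF'
d <- read.table('data.scores', sep=',')
png('data.png')
hist(d$V2)
dev.off()
EOF
)

# Feed the R code to R on STDIN; skipped when R is not installed
if command -v R >/dev/null 2>&1; then
    printf '%s\n' "$script" | R --vanilla --slave
fi
```

The same R code can also be fed directly with R --vanilla --slave <<'EOF' ... EOF; holding it in a variable first just makes it easy to inspect what R will actually receive.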

