Sunday, February 13, 2011

Reading stdin from a pipe for command-line R shenanigans

I think I rely on command-line tomfoolery a little bit too much. Today I wanted to pipe some data into R and have the commands to run defined on the command line as well. This is the kind of thing I do with Perl or bash to get some quick answers and I'd love to add R to the repertoire.

So I tried a number of things, all of which failed. This is how I got it to work for me:
> perl -le 'printf "%.4f\t%.4f\n", rand(), rand() for 1 .. 20' \
 | R --vanilla --slave -e\
 "data=read.delim(pipe('cat /dev/stdin'), header=F);\
  cor.test(data\$V1, data\$V2)"

You have to remember to escape any shell-special characters in the R script (the $ in data$V1 and data$V2, in this case), since the whole R expression is wrapped in double quotes.

3 comments:

  1. Thanks, this is nice, but how do I read LINE BY LINE, run a check on each line, and, IF it passes that check, add it to the data? I don't want to read the whole table into R at once, since my data is big.

    ReplyDelete
  2. Depends on the complexity of your check. If it's a simple text match then piping the data through 'grep' before piping into R would work. For a more complex check I'd pre-process the data with perl/awk or similar.

    But you can do it in R. Take a look at the following StackOverflow thread for a couple of examples:

    http://stackoverflow.com/questions/4106764/what-is-a-good-way-to-read-line-by-line-in-r

    ReplyDelete
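A pre-filter of the kind suggested above might look like this (a sketch, not from the original discussion; the first-column-greater-than-0.5 condition is made up for illustration):

```shell
# Generate two random columns, keep only rows whose first field exceeds 0.5,
# then hand the survivors to R. The filtering happens in awk, outside R,
# so R never loads the rejected lines.
perl -le 'printf "%.4f\t%.4f\n", rand(), rand() for 1 .. 20' \
  | awk -F'\t' '$1 > 0.5' \
  | R --vanilla --slave -e \
    "data=read.delim(pipe('cat /dev/stdin'), header=F); summary(data)"
```

Any condition awk can express on the fields works in place of `$1 > 0.5`, and for plain text matches a `grep` stage is even simpler.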
  3. I had read that page, and it does have good examples. But my problem is:
    I want to use the R code in a Pig script. The R code I have written works when I do cat Myexampl.txt | R --vanilla --slave -f MyCode.R
    However, when I call it in Pig (using DEFINE and then STREAM ... THROUGH), it returns all NA values. I am confused as to why.

    ReplyDelete