This one had me scratching my head for a while yesterday. When you use the Unix 'join' command with the default separators (whitespace), you can join tab-delimited files, but the output is space-delimited.
I tried specifying the join character with -t "\t" or -t '\t' and even the desperate -t\t, but none of that works. Turns out you have to use an *actual* quoted tab character (which I found through a quick Google search, the solution was here, thank you JJinuxLand).
You can insert a tab on the command line with the following key combo: ctrl-v <tab>
Update: the easier way of doing this is to use the special quoting construct $'string'
e.g. join -t $'\t'
See the "QUOTING" section of the bash man page for details.
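To make the difference concrete, here is a minimal sketch in bash. The file names and their contents are made up for illustration; they are not from the original problem.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: two tab-delimited files, sorted on the first field.
workdir=$(mktemp -d)
printf 'a\t1\nb\t2\n' > "$workdir/left.tsv"
printf 'a\tx\nb\ty\n' > "$workdir/right.tsv"

# Default separators: the tab-delimited input is accepted,
# but the output fields are separated by single spaces.
default_out=$(join "$workdir/left.tsv" "$workdir/right.tsv")

# $'\t' is expanded by bash into a real tab before join sees it,
# so the output stays tab-delimited.
tab_out=$(join -t $'\t' "$workdir/left.tsv" "$workdir/right.tsv")

printf '%s\n' "$default_out"
printf '%s\n' "$tab_out"

rm -rf "$workdir"
```

Note that $'...' is a bash feature (some other shells have it too), so this may not work under a plain /bin/sh.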
Great!
You saved me. :D
Thanks! I really didn't expect to find the answer to this problem in a 2009 post
Thank you, I struggle with this every time it comes up. Which is often. It should've been simpler.
ack, doesn't work on a mac! ctrl-v, before or in conjunction with hitting the tab key, doesn't produce the desired effect. Command-v (Command is sometimes used in place of Ctrl on macs) is assigned to the Paste function.
Help again! Please!
Tim, I've not got a mac (and have never used one), so I can't really help with the control key confusion. However, in this kind of situation I end up resorting to Perl (is Perl installed/available?).
To split a tab delimited file and re-join the first and second fields I'd do this:
perl -F'\t' -lane 'print join("\t", @F[0,1])'
There's quite a lot going on in the background with this command. Perl is auto-splitting each line of the input file into the @F array (the split delimiter is specified by the -F flag). I'm then re-joining each line with the perl join command (the first parameter is the join character; all the remaining parameters are joined together with this character). The -l flag removes newlines from each input line but also adds them to each print statement so you don't have to worry about them.
very nice, exactly what I needed, thanks!
Totally weird on a Mac with a PC keyboard - if you use ctrl-V and quickly hit the tab key, it works. However, if you hold down ctrl-V too long you get "^V" repeated until you release the keys.
Cheers! Thanks for saving me that frustrating afternoon fixing the problem! Now to re-run my code on 7GB of files--argh!!
Awesome tip, saved me hours of frustration! (AIX 6.1)
Very nice post! it puts me on the right way.
I had the further problem of using join *in a script* with tab separators.
Suppose your editor is set to not insert tabs...
In bash you can do
tab=`echo -e '\t'`
join -t "$tab"
but then I found an even smarter solution
http://stackoverflow.com/questions/1722353/unix-join-separator-char
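For scripts that might run under a shell where neither echo -e nor $'\t' is available, one alternative (my own suggestion here, not necessarily what the linked answer recommends) is to build the tab with printf. A minimal sketch, with made-up file names and contents:

```shell
#!/bin/sh
# Build a tab without typing one in the editor and without echo -e.
# Command substitution only strips trailing newlines, so the tab survives.
tab=$(printf '\t')

# Hypothetical sample data, sorted on the join field.
workdir=$(mktemp -d)
printf 'k1\tapple\nk2\tpear\n' > "$workdir/a.tsv"
printf 'k1\tred\nk2\tgreen\n' > "$workdir/b.tsv"

# Quote "$tab" so the shell does not discard it as whitespace.
joined=$(join -t "$tab" "$workdir/a.tsv" "$workdir/b.tsv")
printf '%s\n' "$joined"

rm -rf "$workdir"
```

The quoting around "$tab" matters: unquoted, the tab would be treated as a word separator and join would see no argument for -t.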
Thanks.. This saved my time.
This blog post: http://www.52nlp.com/error-join-multi-character-tab-t-for-using-join-tab/
says you can use: $'\t'
I've been using $'\t' for a while as well:
http://geekbraindump.blogspot.co.uk/2013/06/better-tab-use-with-bash.html
Maybe I should update this post.
Thank you. That works!