"Execute failed: Duplicate column name "<colname>" at positions 1 and 2" - (the column names and reported positions depend on the format of the input data).
I tried to fix up the columns, thinking this was a whitespace issue or some such, but had no joy. For me the solution was to read the molecules without extracting the data and then use the "SDF Extractor" node to grab the data. This worked with no issues.
Monday, December 16, 2013
KNIME: duplicate column issue
Tuesday, November 19, 2013
ShrewSoft VPN Manager window does not show under 64Bit Windows 7
I'm using version 2.2.2 of the ShrewSoft VPN client on my 64bit Win7 laptop. For some reason, the 'Manage' window doesn't show in this version - which bit me when my VPN details changed. The .pcf configuration files aren't re-read if you edit them and so you have to have access to the 'manage' dialog in order to update your connection - you can't even add a connection without accessing this dialog.
So, the solution for me was actually quite simple -
- try to bring up the 'Manager' dialog (right click the ShrewSoft icon in the bottom right of the taskbar and click on 'Manage')
- bring up the task manager (Ctrl-Shift-Esc is a handy short-cut for that) - the VPN manager should be present in the "Applications" window - right click this and select 'maximize'
Thursday, October 24, 2013
Compare a local and a remote file using 'diff' and process substitution.
rsync -ilvrn <localdir> <remotedir>
) and then use the above diff command to compare them.
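A minimal sketch of the diff-plus-process-substitution idea from the title, assuming the remote copy is reachable over ssh. The function name diff_remote, and the host and paths in the usage note, are hypothetical.

```shell
#!/usr/bin/env bash
# Compare a local file with a file on a remote host without a temporary copy.
# Usage: diff_remote <local file> <host> <remote file>
diff_remote() {
    local local_file=$1 host=$2 remote_file=$3
    # <(...) presents the output of 'ssh ... cat' to diff as if it were a file.
    # Remote paths containing spaces would need extra quoting for the remote shell.
    diff "$local_file" <(ssh "$host" cat "$remote_file")
}
```

Something like "diff_remote notes.txt somehost /tmp/notes.txt" would then print the differences, with diff's usual exit status (0 when the files match).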
Friday, June 21, 2013
Better tab use with bash
One of the most viewed posts on this blog is the unix-join-with-tabs one where I describe the "Ctrl-v <tab>" trick for commands such as
cut -f':' --output-delimiter=<tab>
where the output delimiter needs to be a tab.
However, there's a much nicer/easier way of specifying a tab character in bash: $'\t'. Take a look at the QUOTING section of the bash manpage for all the details.
Here's a simple example of it in action:
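A small sketch of $'\t' in use, assuming GNU cut (for --output-delimiter) and a made-up line of colon-separated input. The function name colon_to_tab is hypothetical.

```shell
#!/usr/bin/env bash
# Convert colon-delimited fields on stdin to tab-delimited fields.
# bash expands $'\t' to a literal tab before cut ever sees the argument.
colon_to_tab() {
    cut -d ':' -f 1- --output-delimiter=$'\t'
}

# Made-up input, just to show the call
printf 'alice:42:london\n' | colon_to_tab
```

The same $'\t' quoting works anywhere a literal tab is needed, for example as the separator argument to join or sort.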
Tuesday, June 4, 2013
Using 'find' to list files with multiple suffixes
The 'find' command can make this all a lot less painful.
Here's the command for matching a single suffix:
But editing this command-line to match the next suffix of interest becomes tedious very quickly. Thankfully, you can chain together file tests like so (note the grouping):
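A minimal sketch of such a chained test. The .tsv suffix comes from the discussion in this post; the .csv suffix and the function name list_tables are assumptions for illustration.

```shell
# A single suffix would simply be:  find . -type f -name '*.tsv'
#
# List regular files under a directory that end in .csv or .tsv.
# The escaped parentheses group the two -name tests so that whatever
# follows them (here -print) applies to both suffixes.
list_tables() {
    find "${1:-.}" -type f \( -name '*.csv' -o -name '*.tsv' \) -print
}
```

Further suffixes can be added inside the parentheses with more "-o -name '*.xyz'" tests.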
Then acting on these files is easy - just update the -exec action to what you want (e.g. "-exec bzip2 {} \;" - you probably want to use xargs or "-exec bzip2 {} +" for this to reduce the number of command invocations).
An interesting note here is that the following command isn't executed as you might expect. The 'ls' command is only executed on the *.tsv files due to the way the expression is evaluated: from left to right, with the implicit '-and' between the second '-name' and '-exec' having higher precedence than the '-or' between the two '-name' tests.
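The precedence pitfall can be reproduced with a small sketch. The first suffix (.csv), the function names and the use of echo in place of ls are assumptions made for illustration.

```shell
# Without grouping: the implicit -and binds tighter than -o, so -exec is
# attached to the second -name only. echo runs for *.tsv files, not *.csv.
ungrouped() {
    find "$1" -name '*.csv' -o -name '*.tsv' -exec echo {} \;
}

# With grouping: -exec applies to files matching either suffix.
grouped() {
    find "$1" \( -name '*.csv' -o -name '*.tsv' \) -exec echo {} \;
}
```

Running both functions against a directory holding one .csv and one .tsv file shows the difference: the first prints only the .tsv file, the second prints both.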
Wednesday, April 10, 2013
Finding (and fixing) files with undesirable permissions
chmod -R g+rw *
), but this isn't always what you want.
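A more selective alternative can be sketched with find, assuming that "undesirable" here means a regular file missing group read or write permission. The function names are hypothetical.

```shell
# List regular files under a directory that lack group read or write permission.
# '-perm -g=rw' matches files with both bits set; '!' inverts the test.
find_not_group_rw() {
    find "${1:-.}" -type f ! -perm -g=rw -print
}

# Fix only those files, leaving directories and everything else untouched.
fix_not_group_rw() {
    find "${1:-.}" -type f ! -perm -g=rw -exec chmod g+rw {} +
}
```

Listing first and fixing second makes it easy to check what will change before any permissions are touched.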
Friday, March 8, 2013
Merge images
The montage program (part of the ImageMagick suite of tools) is really simple and effective for this:
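A sketch of a montage call matching the layout described in this post (two columns, spacing, drop shadow). The 10 pixel spacing, the wrapper function merge_plots and the file names in the usage note are assumptions.

```shell
# Merge the given images into a single montage laid out in two columns.
# Usage: merge_plots <output image> <input image>...
merge_plots() {
    local output=$1
    shift
    # -tile 2x          two columns, as many rows as needed
    # -geometry +10+10  keep each image's size, add 10px of spacing around it
    # -shadow           drop shadow behind each tile
    montage "$@" -tile 2x -geometry +10+10 -shadow "$output"
}
```

For example, "merge_plots merged.png density_*.png" would combine every matching plot into merged.png.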
Hey presto, a set of merged density plots laid out into two columns (regardless of the number of images) with some spacing between them and a bit of drop shadow to visually separate the plots.
Monday, February 18, 2013
OpenBabel: Convert SDF to SMILES and keep the data!
> babel test1.sdf --append "cLogD7.4 cLogP model_score1 model_score2 some_other_property" test1.smi
In order to end up with a tab delimited file (my favourite) you have to prefix the argument to 'append' with the desired character. I used "Ctrl-v <tab>" to get a tab in my string. Seems odd that tabs wouldn't be the default delimiter since there's still a tab used to separate the SMILES string from the molecule name in the standard conversion.
Friday, February 15, 2013
simple parallel processing with make
I'll let the code do the talking; here's the basic bash script (stored as an executable):
#!/bin/bash
if [[ $# -ne 1 ]]; then
    echo "Usage: cat commands.txt | $(basename $0) <num processes>"
    exit 1
fi
(while read line; do echo -e "$((++i)):\n\t$line"; done; echo "all:" $(seq 1 $i)) | make -B -j $1 -f <(cat -) all
This uses a couple of clever tricks. I especially like the use of process substitution in the make command (substituting the 'cat -' for the input makefile).
This approach allows the commands in commands.txt to redirect their own output as they need to (using '>', '2>', '&>', etc.)
Monday, February 11, 2013
Bash: while [[ -e /proc/$process_id ]]
> some_interesting_job.py &
> process_id=$(ps -o "%p %c" | grep "some_interesting_job" | cut -f 1 -d ' ');\
while [[ -e /proc/$process_id ]];\
do ps -o "%z";\
sleep 5;\
done
This will report on the virtual memory size (in KiB - see 'ps' manpage for more details) that the process is taking up. The while loop will terminate when the process completes (or is killed).
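A variant of the same loop, sketched under two assumptions that differ from the commands above: the PID is taken from $! (the PID of the last background job) rather than parsed out of ps output, and the size is read from /proc/<pid>/status (Linux only) rather than from ps. The function name watch_vsz is hypothetical.

```shell
# Report a process's virtual memory size until the process exits.
# Usage: watch_vsz <pid> [interval in seconds]
watch_vsz() {
    local pid=$1 interval=${2:-5}
    # /proc/<pid> exists only while the process does
    while [ -e "/proc/$pid" ]; do
        # VmSize is given in kB; the line is absent once the process is a zombie
        awk '/^VmSize:/ {print $2}' "/proc/$pid/status" 2>/dev/null
        sleep "$interval"
    done
}

# Typical use: start the job in the background, then pass $! to the function
#   some_interesting_job.py &
#   watch_vsz $!
```

Using $! avoids depending on how ps pads its PID column, and restricting the report to one PID keeps other processes in the session out of the output.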
Friday, January 25, 2013
commandlinefu.com
- Commandlinefu - main page
- Commandlinefu_byvote - sorted by vote count
There are some real gems in there! For example:
> python -m SimpleHTTPServer
Serve current directory tree at http://$HOSTNAME:8000/
Useful for tidying up your workspace whilst keeping jobs running:
> disown -a && exit
Close shell keeping all subprocess running
I do love a bit of process substitution:
> diff <(sort file1) <(sort file2)
diff two unsorted files without creating temporary files
Handy:
> rm !(*.foo|*.bar|*.baz)
Delete all files in a folder that don't match a certain file extension
And a "I should have thought of this; it's so obvious now!" trick:
> some_very_long_and_complex_command # label
Easy and fast access to often executed commands that are very long and complex. When using reverse-i-search you have to type some part of the command that you want to retrieve. However, if the command is very complex it might be difficult to recall the parts that will uniquely identify this command. Using the above trick it's possible to label your commands and access them easily by pressing ^R and typing the label (should be short and descriptive).
Thursday, January 24, 2013
R: formatting numbers for output
The function 'format' is pretty good for this. Note that the 'format' call returns a character vector (which is fine if you're only going to write the number to file or the console).
Here's a simple example just using a randomly generated number:
> num <- rnorm(1, mean=10)
> num
[1] 10.24339
We call format with digits=4 (show 4 significant digits) and nsmall=4 (display at least 4 digits after the decimal - for real/complex numbers in non-scientific format):
> format(num, digits=4, nsmall=4)
[1] "10.2434"
You can see that the format command rounds the numbers. This uses the IEC 60559 standard - 'go to the even digit'. So 0.5 is rounded to 0 and 1.5 is rounded to 2...
Of course, if you're used to sprintf style commands then you can also use the sprintf function for this:
> sprintf("%.4f", num)
[1] "10.2434"
Wednesday, January 2, 2013
Simple parallel processing with xargs
It's as simple as:
> ls *.bz2 | xargs -n 1 -P 6 bunzip2
This will set off bunzip2 on all bz2 files in the current directory.
The '-n 1' flag tells xargs to only provide one argument (file) per command line; the '-P 6' tells xargs how many concurrent processes to run.
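A slightly more defensive sketch of the same idea, assuming an xargs and find that support -0 / -print0 and -maxdepth (GNU and BSD versions do). The wrapper function parallel_bz2 is hypothetical; it swaps ls for find so that file names containing spaces survive the pipe.

```shell
# Run a command on every *.bz2 file in a directory, one file per invocation,
# with up to <jobs> invocations running at the same time.
# Usage: parallel_bz2 <dir> <jobs> <command> [args...]
parallel_bz2() {
    local dir=$1 jobs=$2
    shift 2
    # -print0 / -0 pass names NUL-delimited, so spaces and newlines are safe
    find "$dir" -maxdepth 1 -name '*.bz2' -print0 | xargs -0 -n 1 -P "$jobs" "$@"
}
```

The equivalent of the one-liner above would be "parallel_bz2 . 6 bunzip2"; substituting echo for bunzip2 is a harmless way to preview which files would be processed.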