GeekBrainDump: 2012

Wednesday, November 21, 2012

hg over ssh

I'm working with a repo that I can't access over http, so it's just as well that Mercurial also works fine over ssh - it just takes a little tweaking of the basic configuration to get things working smoothly.

Add the following to the client .hgrc:

[ui]
remotecmd=<path_to_hg>

Where <path_to_hg> is the path to the hg executable on the machine hosting the main repo (this is needed when hg is not on the default PATH of the remote ssh session).

This was from StackOverflow: cloning-a-mercurial-repository-over-ssh

Now, I'm working very remotely from the main repo (6000 miles) so it is worth using compression. Just add the following to the [ui] section of .hgrc:

ssh = ssh -C

You could also configure ssh to use compression by default (see details in link below).
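As a rough sketch of that alternative, an entry along these lines in the client's ~/.ssh/config turns compression on for a single host. The alias hgserver and the host name hg.example.com are placeholders of mine, not details from my actual setup; Compression itself is a standard ssh_config option.

```
# ~/.ssh/config on the client (host alias and name are hypothetical)
Host hgserver
    HostName hg.example.com
    Compression yes
```

With an entry like this, the ssh -C setting in .hgrc should no longer be needed for that host.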

This page has a good overview of using ssh with mercurial: collaborating-with-other-people

Thursday, October 18, 2012

bash while loops and pipes

In bash, if you pipe into a while loop, the while loop is run in a subshell. This means that you're going to be very disappointed if you were hoping to capture data/variables within the loop.

The workaround is to get rid of the pipe, which is possible through process substitution.

e.g.

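# val is assigned inside the loop, which the pipe runs in a subshell
echo -e "one\ntwo\nthree" |
 while read -r name; do
  val=$name
  echo "$val"
 done
# back in the parent shell, val was never set
echo "$val"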

This outputs:

one
two 
three

Notice that the final value "three" is not printed twice: 'val' was only set inside the subshell, so the last echo prints an empty line.

And the workaround:

while read -r name; do
 val=$name
 echo "$val"
done < <(echo -e "one\ntwo\nthree")
# the loop ran in the current shell, so val is still set here
echo "$val"

which now outputs:

one
two
three
three

Success! (or, if you're a "Bill and Ted's Bogus Journey" fan - Station!).
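To see the subshell effect without reading echo output, here is a small sketch that counts lines both ways. The variable names piped_count and subst_count are mine, and it assumes bash with default settings (in particular, the lastpipe option left off).

```bash
#!/usr/bin/env bash

# Pipe version: the loop body runs in a subshell, so the increments are lost.
piped_count=0
printf 'one\ntwo\nthree\n' | while read -r name; do
  piped_count=$((piped_count + 1))
done

# Process substitution version: the loop runs in the current shell.
subst_count=0
while read -r name; do
  subst_count=$((subst_count + 1))
done < <(printf 'one\ntwo\nthree\n')

# prints "pipe: 0, process substitution: 3"
echo "pipe: $piped_count, process substitution: $subst_count"
```

The same pattern applies to anything accumulated in the loop (arrays, flags, running totals), not just a single variable.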

Monday, October 8, 2012

Mercurial file patterns

I wanted to add all the scripts in a directory tree to the local Mercurial repository. I was going to do something like this:

> find . -name \*.sh | xargs hg add

Which is nice enough (find and xargs go well together), but you can also do it using just Mercurial using patterns. e.g.

> hg add 'glob:**.sh'

The two most useful patterns (for me) are:

'*' matches any text within the current directory only (it does not cross a path separator)
'**' matches any text, including path separators, so it covers the entire tree

The patterns are much richer than this though; see hg help patterns for more on what's available (including regexes).
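If you do stick with the find and xargs route, one thing to watch is file names containing spaces, which the plain pipe splits apart. A minimal sketch of the NUL-delimited variant follows; it builds a throwaway directory with made-up file names and uses printf as a stand-in for hg add, so it runs without a repository.

```bash
#!/usr/bin/env bash

# Build a throwaway tree so the example is self-contained (hypothetical names).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub dir"
touch "$demo_dir/top.sh" "$demo_dir/sub dir/nested script.sh" "$demo_dir/notes.txt"

# -print0 and -0 pass NUL-delimited names, so the space in a name survives.
# Inside a real repository, 'printf' would be replaced by 'hg add'.
matched=$(cd "$demo_dir" && find . -name '*.sh' -print0 | xargs -0 printf '%s\n' | sort)

# prints:
#   ./sub dir/nested script.sh
#   ./top.sh
echo "$matched"

# Clean up the throwaway tree.
rm -rf "$demo_dir"
```

The glob: pattern avoids the issue entirely, since Mercurial does the matching itself and no file names pass through the shell.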

Tuesday, June 5, 2012

Removing VMWare Player - blank grey dialogue box

I've just spent way too long trying to upgrade VMWare Player on my laptop. The issue was that, when I tried to uninstall the incumbent version, I was presented with an unhelpful blank grey dialogue box, with a distinct lack of, well, anything.

The box didn't go away, even after a couple of hours of waiting (I left it because I wondered whether something was being unpacked in the background). This kind of information is frustratingly difficult for me to get at on a Windows machine... Eventually, I had to use Task Manager to kill it. I went through a few iterations of this, trying different ways of uninstalling or running the newer installer. I even tried setting different default browsers, since the contents of the box turned out to be HTML and I thought it might be a compatibility issue - but computer says "no".

I looked around for solutions and came across the following, which worked for me:

http://superuser.com/questions/245424/vmware-workstation-install-problem

The most pertinent advice was:

To uninstall any old version, go to C:\Windows\Installer
Add the "Authors" column and sort by it
One of the .msi files will have a "VMware" author
Double-click it and follow through with the uninstall steps

After uninstalling the older VMWare Player using this method, I was then able to install the latest version and get playing with my brand spanking new ACE image. Success!

Friday, March 9, 2012

Timing your R code

Ever wanted to find out which of a set of methods is faster in R? Well, there's a very easy way to time your code: system.time.

For example, I wanted to compare the speed of using subset's "select" option against restricting the full returned data.frame afterwards.

Here are examples showing the comparison I mean. Assume that "molecule_data" is a data.frame with at least one field (name) and that name_list is a vector of molecule names that I'm interested in.

Here is an example of using subset's "select" restriction mechanism:

mol_names <-
   unique( subset(molecule_data, name %in% name_list, select="name") )

Here is an example of restricting to a single column after subsetting:

mol_names <-
   unique( subset(molecule_data, name %in% name_list)$name )

I found out that, for my data, using subset's select option was ~50x faster.

system.time(
   mol_names <-
      unique(subset(molecule_data, name %in% name_list, name)))

 user  system elapsed
0.001  0.000  0.001

system.time(
   mol_names <- 
      unique(subset(molecule_data, name %in% name_list)$name))

 user  system elapsed
0.055  0.000  0.056

These timings are unreliable given how small they are (especially the first one), so let's run the operation a hundred times to get a better estimate:

system.time(
   for(i in 1:100){
     mol_names <- unique(subset(molecule_data, name %in% name_list, name))
   }
)

 user  system elapsed
0.131  0.000  0.135

system.time(
   for(i in 1:100){
      mol_names <- unique(subset(molecule_data, name %in% name_list)$name)
   }
)

 user  system elapsed
5.607  0.161  5.802

You can see that the time difference holds up over multiple runs. Subset's "select" is the clear winner!
