Sunday, November 28, 2010

EMC2, G64 and sharp corners

I've cut a number of things on my CNC machine using EMC2. I've been perplexed by the fact that some of the corners in my test shapes were not sharp, but were rounded off.

I've just been designing and cutting some new leadnut holders and found that the hexagonal depression I was cutting was also plagued with these curved corners (not good when you need a hex nut to fit in them). It turns out this is to do with the G64 command in the G-Code which tells the trajectory planner to sacrifice path following accuracy in order to keep the feed rate up (i.e. cut/round corners). A good description is available in the EMC2 documentation section covering trajectory control.

I manually edited my G-code (this one coming from CamBam, but past ones were from Inkscape and dxf2gcode) so that the G64 command now read "G64 P0.001" and the hexagon cut with much sharper corners.

Friday, November 12, 2010

Fun with process substitution

I was introduced to process substitution recently at work. It's a great way to avoid temporary files in some simple cases.

For example, say I wanted to know what the column header changes were between two files (same data, different code used to extract them). The files are tab delimited and I have a script I use (frequently) that prints out the index and name for the headers in an input file -

So, if I want to see what's different between two files, in the past I'd create two output files using my script and then use diff, kompare or comm to compare them.

You can eliminate the temporary files using a technique called process substitution.

> diff <( file1) <( file2)
> 0 blarg
> 1 blorg
< 0 blorg
< 1 blarg
In this case we see two columns have been swapped between file1 and file2.

Take a look at the Advanced Bash Scripting Guide for more examples.
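Here's a self-contained sketch of the same idea that you can paste into a bash shell. The headers function is a made-up stand-in for my script, and the two sample files are invented for the example:

```shell
#!/usr/bin/env bash
# headers: hypothetical stand-in for the header-listing script.
# Prints "index<TAB>name" for each tab-delimited column on the first line.
headers() {
    head -1 "$1" | tr '\t' '\n' | awk '{ printf "%d\t%s\n", NR - 1, $0 }'
}

# Two invented files with the first two columns swapped.
tmpdir=$(mktemp -d)
printf 'blarg\tblorg\tid\n1\t2\t3\n' > "$tmpdir/file1"
printf 'blorg\tblarg\tid\n2\t1\t3\n' > "$tmpdir/file2"

# diff reads the output of each command as if it were a file.
# diff exits non-zero when its inputs differ, hence the "|| true".
diff <(headers "$tmpdir/file1") <(headers "$tmpdir/file2") || true

rm -rf "$tmpdir"
```

No temporary header listings are written to disk. Only the two input files exist.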

The script basically does this (but allows user specified delimiters):
> head -1 <input file> | perl -F'\t' -lane 'print $n++,"\t$_" for @F'

Friday, September 17, 2010

Bash command line parameter parsing (getopts)

If I only need a single optional parameter I'll often just check to see if the input positional parameter of interest is set (e.g. script that accepts one required field and one optional field):

if [[ -n "$2" ]]; then
    echo "I received the second parameter:$2"
fi

But if you want to do something a bit more complex, getopts is your friend.

For example, say you want to have the user input their first name, last name and a "keep my data private" flag you could do something like this:

while getopts f:l:p flag; do
    case $flag in
        f) FIRSTNAME=$OPTARG ;;
        l) LASTNAME=$OPTARG ;;
        p) PRIVATE=1 ;;
        *) echo "$0 -f <first name> -l <last name> -p"
           echo -e "\t-p [flag] keep my data private"; exit 1 ;;
    esac; done

The getopts command is fairly straightforward (see help getopts or man getopts for more details). If an option requires an argument then a colon is placed after its letter designation ('f' and 'l' in the above example).

You can check for required parameters by looking at which variables were set:

if [[ -z "$FIRSTNAME" || -z "$LASTNAME" ]]; then
    echo "missing required parameter"
fi

Wrap that all up into a neat script with a subroutine that outputs a usage statement and you're home free:

function usage_and_exit() {
    echo "$0 -f <first name> -l <last name> -p"
    echo -e "\t-p [flag] keep my data private"
    exit 1
}

PRIVATE=0
while getopts f:l:p flag; do
    case $flag in
        f) FIRSTNAME=$OPTARG ;;
        l) LASTNAME=$OPTARG ;;
        p) PRIVATE=1 ;;
        *) usage_and_exit ;;
    esac
done

if [[ -z "$FIRSTNAME" || -z "$LASTNAME" ]]; then
    echo "missing a required parameter (firstname and lastname are required)"
    usage_and_exit
fi

if [[ $PRIVATE -ne 0 ]]; then
    echo "protecting private data"
fi
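One thing the script above doesn't show is what happens to any arguments left over after the options. Here's a rough sketch of how I'd handle that. The parse_name function and its sample arguments are made up for illustration:

```shell
#!/usr/bin/env bash
# parse_name: hypothetical helper that parses -f, -l and -p, then reports
# whatever positional arguments are left over.
parse_name() {
    # A local OPTIND lets getopts start afresh on every call.
    local OPTIND=1 flag first="" last="" private=0
    while getopts f:l:p flag; do
        case $flag in
            f) first=$OPTARG ;;
            l) last=$OPTARG ;;
            p) private=1 ;;
            *) return 2 ;;
        esac
    done
    # OPTIND is the index of the first argument that is not an option.
    shift $((OPTIND - 1))
    echo "first=$first last=$last private=$private extra=$*"
}

parse_name -f Ada -l Lovelace -p notes.txt
```

That should print first=Ada last=Lovelace private=1 extra=notes.txt.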

Wednesday, August 18, 2010

MySQL: the mystery of unsetable global variables

I just tried updating our mysql server to accept very long connections. I have a ton of jobs running, so I wanted to set the wait_timeout variable (via the mysql shell) to something reasonable for these jobs. The default of 8 hours is not sufficient in some rare cases so I tried to set the timeout higher:

mysql> SHOW VARIABLES LIKE 'wait_timeout';
| Variable_name | Value |
| wait_timeout | 28800 |
1 row in set (0.00 sec)

mysql> SET GLOBAL wait_timeout=86400;
mysql> SHOW VARIABLES LIKE 'wait_timeout';
| Variable_name | Value |
| wait_timeout | 28800 |
1 row in set (0.00 sec)

What? Why wasn't it set?

Well, the reason is that the SHOW VARIABLES command defaults to the session variables. So, the local session wait_timeout is still 28800, but the global wait_timeout was actually updated correctly:

mysql> SHOW GLOBAL VARIABLES LIKE 'wait_timeout';
| Variable_name | Value |
| wait_timeout | 86400 |
1 row in set (0.00 sec)

Wednesday, July 14, 2010

Basic MySQL 'top' command

Ever wanted to keep an eye on the running processes in a MySQL database? How about something that works a little like the top command, but without any of the bells and whistles?

Well, here you go:

> while [[ 1 ]]; do
mysql -u <username> -p<password> -h <host> -e "show full processlist";
sleep 1;
done

Remember to replace the <username>, <password> and <host> placeholders with the values for your database. Also, if you don't like the bounding box on the mysql output, you can get cleaner output by feeding the query in on standard input (here with a here-string) instead of using the -e flag:

mysql -u <username> -p<password> -h <host> <<< "show full processlist";

Wednesday, June 16, 2010

Case insensitive regex in bash

By default regexes in [[ ]] are case sensitive. If you want to match in a case insensitive way you have to set the shell option nocasematch.

mytext="eXistenZ"   # example mixed-case value (assumed)
if [[ $mytext =~ existenz ]]; then
    echo "yep"
else
    echo "nope"
fi

If you run the above script you should get "nope" as the output. For case insensitive matching just insert this into the script prior to the regex:

shopt -s nocasematch;

You can unset the nocasematch shell option using the following: shopt -u nocasematch

Here's a more complete example:

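function testText
{
    if [[ $mytext =~ existenz ]]; then echo "yep"; else echo "nope"; fi
}

mytext="eXistenZ"   # example mixed-case value (assumed)
testText
shopt -s nocasematch
testText

If you run that script then the output is:

nope
yep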
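If you'd rather not leave nocasematch switched on for the rest of the script, one option is to set it inside a subshell. This is only a sketch, and the imatch helper is a name I've made up:

```shell
#!/usr/bin/env bash
# imatch: hypothetical helper; succeeds if $1 matches the regex $2,
# ignoring case. The ( ... ) body runs in a subshell, so the shopt
# setting never reaches the calling shell.
imatch() (
    shopt -s nocasematch
    [[ $1 =~ $2 ]]
)

if imatch "eXistenZ" "existenz"; then echo "yep"; else echo "nope"; fi

# nocasematch is still unset out here, so this match is case sensitive.
if [[ "eXistenZ" =~ existenz ]]; then echo "yep"; else echo "nope"; fi
```

The first test prints "yep" and the second prints "nope".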


Sunday, June 13, 2010

R related links

These are just a few R links I think are interesting and worth a further read:

  1. Colour palettes in R.

  2. Using R for introductory statistics

  3. Web-friendly visualisations in R

  4. A different way to view probability densities

  5. Thoughts on Making Data Work

Also of note is Google's announcement (on June 2nd) that they were working with the USPTO to make all granted patents, trademarks and published applications freely available for download - see this post.

Tuesday, June 1, 2010

I love subshells

Sometimes it's the little things in life that give us the greatest pleasures. I love subshells. There, I admitted it.

Say you want to do something with the output of a program but would like to prepend some output/text that also needs to be operated on. The standard idiom would be to create a file with the prepended output, append your output to this and then operate on the file, but you can remove the intermediate files using a subshell. Here's a contrived example:

> (echo -e "some\toutput\tto\tprepend"; perl -lane 'print join("\t", log($F[0])/log(10), @F[1 .. $#F])' ;) | a2ps

Something else I use subshells for a lot is to launch programs in a different directory without having to cd there and then back again:

> (cd ~/workspace/; eclipse &)

Like I said, it's the little things.
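To convince yourself that the cd really does stay inside the subshell, here's a small sketch. The run_in helper is hypothetical, and /tmp is just an example directory:

```shell
#!/usr/bin/env bash
# run_in: hypothetical helper; runs a command in directory $1 inside a
# subshell, so the caller's working directory is left alone.
run_in() (
    cd "$1" || exit 1
    shift
    "$@"
)

before=$PWD
run_in /tmp pwd        # prints the directory the command ran in
[[ $PWD == "$before" ]] && echo "still in $before"
```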

Tuesday, May 18, 2010

delete all empty files in a directory

Automatically generating files? Annoyed at all the empty ones? Here's how to purge them:

> for file in *; do if [ -f "$file" ] && [ ! -s "$file" ]; then rm -f "$file"; fi; done

Of course you could just get over your fear of find and take a look at that man page. It's perfect for a recursive search and delete all in one (-type f limits the match to files, since -empty also matches empty directories):

> find . -type f -empty -delete

simple, no?
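If you're nervous about deleting things, here's a sketch you can run in a throwaway directory first. All of the file names are invented. It lists the matches with -print before deleting, and adds -type f so that only files (not empty directories) are considered:

```shell
#!/usr/bin/env bash
# Build a scratch directory to try the command on.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/sub"
: > "$tmpdir/empty.log"            # empty file
: > "$tmpdir/sub/also empty.log"   # empty file with a space in its name
echo "data" > "$tmpdir/keep.log"   # non-empty file

# -print lists what would be removed, so you can check it first.
find "$tmpdir" -type f -empty -print

# Same test, this time deleting.
find "$tmpdir" -type f -empty -delete

remaining=$(find "$tmpdir" -type f | wc -l)
echo "files left: $((remaining))"
rm -rf "$tmpdir"
```

Only keep.log survives, so the last line reports one file left.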

Friday, May 14, 2010

escaping quotes in bash variables

I often need to escape quotation characters or other special characters that are being piped into a bash process:

Say you have a mysql table which contains protein names, and some of these have the ' character in them (e.g. "Inosine 5' monophosphate dehydrogenase"). If you want to do some bulk processing on these names, you could do something like this:

> mysql -u paulbo -N <<< "SELECT name from proteins" | while read -r protein_name; do mysql -u paulbo -N -e "SELECT count(*) FROM data INNER JOIN proteins ON data.protein_id = proteins.protein_id WHERE proteins.name = '${protein_name//\'/\\\'}'"; done

Ah, three backslashes. Why didn't I think of that?
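You can watch the substitution at work without a database. A minimal sketch, using the protein name from above as the test value:

```shell
#!/usr/bin/env bash
name="Inosine 5' monophosphate dehydrogenase"

# Pattern \' matches a literal single quote; replacement \\\' is a
# backslash followed by a single quote.
escaped=${name//\'/\\\'}
echo "$escaped"
```

This prints Inosine 5\' monophosphate dehydrogenase. Note that the expansion here is not wrapped in double quotes, unlike in the mysql command above.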

And sometimes you just want to output the text with the special characters all converted into something a bit more amenable (like the old and trusted '_').

> mysql -u paulbo -N <<< "SELECT name from proteins" | while read protein_name; do echo ${protein_name//[-\/\:\'\"\(\) ]/_};done

Thursday, May 6, 2010

Excluding certain files from a directory listing

Ok, well, this will be obvious to anyone who's really read the ls man page, but I only came across it a couple of days ago.

Say you have a directory with tons of files in it, mostly with a single extension, and you want to see what else is in there. Sure, you can use grep, but you can also use ls's inbuilt flag --hide.

> ls -l
... # and a ton more *.txt files
> ls -l --hide=*.txt

Isn't that useful?

Friday, April 30, 2010

batch renaming files with spaces

Why do people keep giving me tons of files with spaces in the filename?

Anyway, here's a good way to get rid of those pesky spaces:

ls * | while read -r file; do mv "$file" "${file// /_}"; done

First I tried using "for file in `ls *`" but of course the whitespace came back to bite me... The same was true for the mv command: you have to quote "$file" so that the whitespace-ridden filename is recognised as a single argument rather than several.
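A variation that avoids parsing the output of ls altogether is to let the shell glob do the work. This is a sketch only. The despace helper and the file names are made up:

```shell
#!/usr/bin/env bash
# despace: hypothetical helper; renames every file in directory $1 whose
# name contains a space, swapping the spaces for underscores.
despace() {
    local dir=$1 path name
    for path in "$dir"/*\ *; do
        [[ -e $path ]] || continue      # the glob matched nothing
        name=${path##*/}                # strip the directory part
        mv -- "$path" "$dir/${name// /_}"
    done
}

tmpdir=$(mktemp -d)
touch "$tmpdir/my file one.txt" "$tmpdir/another file.txt" "$tmpdir/fine.txt"
despace "$tmpdir"
ls "$tmpdir"
rm -rf "$tmpdir"
```

Because the loop variable is always quoted, each filename stays in one piece however many spaces it contains.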

Tuesday, April 27, 2010

bash array size

Strangely this caught me off guard. If you use ${#array[@]} to get the 'size' of an array, it actually only returns the number of assigned elements in the array.

> array[23]=123
> echo ${#array[@]}
1

Hmm... only 1? Not 24?

Bash arrays are sparse, so the count only covers elements that have actually been assigned. Just don't expect it to equal the highest index plus one, and fill your arrays wisely.
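If you do need to know which slots are in use, bash can list an array's indices with ${!array[@]}. A small sketch with made-up values:

```shell
#!/usr/bin/env bash
array=()
array[23]=123
array[5]=42

echo "assigned elements: ${#array[@]}"
echo "indices in use: ${!array[*]}"

# Indices of an indexed array come back in ascending order,
# so the last one is the highest.
indices=("${!array[@]}")
last=${indices[${#indices[@]}-1]}
echo "highest index: $last"
```

This reports 2 assigned elements, indices 5 and 23, and a highest index of 23.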

While we're on bash arrays, remember that you can change the 'join' character for naive printing of arrays by manipulating the IFS (Internal Field Separator) variable. Below I also show that the quotation context is important for this:

> array[0]=1;array[1]=2;array[2]=3;
> echo ${array[*]}
1 2 3
> echo "${array[*]}"
1 2 3
> IFS=","
> echo ${array[*]}
1 2 3
> echo "${array[*]}"
1,2,3

Note: It's best practice to store and then restore the original IFS variable.

> OLDIFS="$IFS"
> IFS=","
... do stuff
> IFS="$OLDIFS"
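Another way to avoid the save/restore dance is to change IFS only inside a function. A sketch, where join_by is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# join_by: hypothetical helper; joins its remaining arguments using the
# first character of $1. "local" confines the IFS change to the function.
join_by() {
    local IFS=$1
    shift
    echo "$*"
}

array=(1 2 3)
join_by , "${array[@]}"    # 1,2,3
join_by : "${array[@]}"    # 1:2:3
echo "${array[*]}"         # 1 2 3 (IFS is untouched out here)
```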

Thursday, April 15, 2010

Java: Heap Dump on OutOfMemoryError

You can request that the JVM create a heap dump when an OutOfMemoryError is thrown. This is handy if you have a process that consumes a ton of RAM and you don't know why. Set the max heap size to something around 500M or less (it needs to be fairly small if you're going to inspect the heap with 'jhat'). Use the -XX:+HeapDumpOnOutOfMemoryError flag to request the heap dump. This will output to java_pid<pid>.hprof by default. You can set the output filename manually using -XX:HeapDumpPath=<filename>.

> java -Xmx100m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dump.hprof com.geekbraindump.MyMemoryHoggingClass

Tuesday, March 23, 2010

Linux: command line cut and paste

It's always annoyed me that I have to open up a file in order to cut and paste the contents into a web browser (I use this a lot for capturing information on an internal wiki). As of 2010-03-23 there are no default command line access utilities for this (on CentOS anyway).

However, download and install xclip and the clipboard is yours to command.

xclip allows access to both the PRIMARY (middle mouse button) and CLIPBOARD (standard copy/paste) selections.

By default piping into xclip puts the text in the PRIMARY selection (middle mouse button).

> echo $RANDOM | xclip
> xclip -o

You can define which selection to input to. Say you want to store text in the CLIPBOARD selection (accessed using standard cut and paste commands):

> echo $RANDOM | xclip -sel 'clipboard'
> xclip -o -sel 'clipboard'

You can now use edit->paste to output the text. Note that the random number was the same as before. I guess this is some shell caching mechanism. For a new random number each time you have to use a new shell:

> (echo $RANDOM) | cat

Sunday, February 28, 2010

AVR: storing a 2d array in PROGMEM

I've just been fangling with a UV Painter project (row of LEDs to 'paint' on a glow-in-the-dark-wall). I quickly ran out of RAM when adding patterns to the system and learned how to add variables to the Flash program memory instead. Pretty simple and very useful:

#include <avr/pgmspace.h>

const uint8_t mCylonScan[10][N_LED] PROGMEM = {

Then to access the data you just do the following (where 'i' and 'j' are loop variables):

data = pgm_read_byte(&(mCylonScan[i][j]));

See the AVR libc docs for more details.

Friday, February 26, 2010

Manipulating Bash Strings

I'm finding myself looking these up quite a lot, so here's a little cheat sheet of the basics.

Using the string "ABCDEFG12345" as an example:

> string="ABCDEFG12345"
> echo $string
ABCDEFG12345

> #Replacement:
> echo ${string/ABC/___}
___DEFG12345

> #Replacement with character class
> echo ${string/[ABC4]/_}
_BCDEFG12345

> #Replace all occurrences:
> echo ${string//[ABC4]/_}
___DEFG123_5

> #Extract from a defined position in the string
> echo ${string:7}
12345
> echo ${string:7:3}
123

> #substring removal (from the front of the string)
> echo ${string#ABC}
DEFG12345
> echo ${string##ABC} #strips the longest substring match
DEFG12345
> string2="abcABCabc123ABCabc"
> echo ${string2#a*C}
abc123ABCabc
> echo ${string2##a*C}
abc
> # use the % sign to match from the end
> # % for shortest and %% for longest substring match
> echo ${string%45}
ABCDEFG123
> echo ${string2%C*}
abcABCabc123AB
> echo ${string2%%C*}
abcAB

Thursday, February 11, 2010

Use csplit to split SDF files (or contextually split any file)

Say you want to split an SDF into individual entities. You could write a Perl script/one-liner (which is what I've been doing for a long time) or you could just use csplit. Thanks to Pat and Jessen for pointing this one out.

e.g. say you had an SDF, test_mols.sdf, with 8 molecules in it and you wanted individual mol files:

> csplit -kzsf "test_mols" -b %02d.mol test_mols.sdf /\$\$\$\$/+1 {*}

This would result in 8 files called test_mols00.mol through test_mols07.mol. Unfortunately these would still contain the SDF delimiter at the end of the file (so, technically these are still SDFs). That's pretty easy to clean up with something like:

> perl -ni -e 'print unless /\$\$\$\$/' *.mol

See the csplit manpage for more details.

Tuesday, February 2, 2010

autolinkification in bugzilla

I often refer to other bugs in a Bugzilla comment. These are 'autolinkified' by bugzilla. I've only just learned that you can refer to a comment in a bug as well and have this 'autolinkified'.

See the Bugzilla hints and tips page.

Short answer:

Bug autolink: "bug 1234"
Comment autolink: "bug 1234, comment 12"
Attachment autolink: "bug 1234, attachment 4"

Friday, January 29, 2010

Regular expressions in bash

You can perform regular expression matching on a variable within an extended test command (see the Conditional Constructs part of the bash manual).


prompt> name=foobar.blarg;if [[ $name =~ foo ]]; then echo yep; else echo nope; fi
yep
prompt> name=foobar.blarg;if [[ $name =~ foo[a-c] ]]; then echo yep; else echo nope; fi
yep
prompt> name=foobar.blarg;if [[ $name =~ foo[d-z] ]]; then echo yep; else echo nope; fi
nope
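The match can also capture pieces of the string. Bash puts them in the BASH_REMATCH array. Here's a minimal sketch, where the regex and the variable name re are my own choices for illustration:

```shell
#!/usr/bin/env bash
name=foobar.blarg

# Keeping the regex in a variable sidesteps quoting surprises.
re='^([a-z]+)\.([a-z]+)$'

# Parenthesised groups land in BASH_REMATCH: index 0 is the whole
# match, 1 and up are the groups.
if [[ $name =~ $re ]]; then
    echo "base: ${BASH_REMATCH[1]}"
    echo "extension: ${BASH_REMATCH[2]}"
else
    echo "no match"
fi
```

For foobar.blarg this prints "base: foobar" and "extension: blarg".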

Tuesday, January 26, 2010

Convert tab data into HTML tables with a Perl one-liner

Quick one-liner for generating an HTML table from tab delimited input. Either pipe in your data or include the file as a command line argument.

perl -F'\t' -lane 'BEGIN{print "<table border=1 cellpadding=3 cellspacing=0>"}print "<tr>", (map {"<td>$_</td>"} @F), "</tr>";END{print "</table>"}'

The map is in parentheses so that the closing '</tr>' tag is not slurped in as part of its input array.

Saturday, January 16, 2010

Setting AVR Clock Speed

I'm using gcc/WinAVR to program my AVRs and have just been discovering how to program the clock speed. I'm playing with an ATtiny13 and an ATmega8. Both of these ship with their clocks set to 1MHz by default but both can be clocked to a higher speed.

There are a few things to note:

1. F_CPU is used by the compiler for calculating timings (most obvious example is in the delay.h routines). Setting it has no effect on the actual clock speed. This needs to be set to the correct value.

2. The easiest way (for me) to set the clock speed is to program the relevant fuse bits. This was different for the two chips I've been using (ATtiny13 and ATmega8). Note: for fuse bits 1 = unprogrammed and 0 = programmed (this is due to the nature of EEPROM).

This step is made easy when using the Eclipse AVR plugin. There's a GUI/wizard for setting them in Project->Properties->AVR->AVRDude->Fuses, accessible by selecting the "direct hex values" radio option and then clicking the "start editor" button.

3. For some MCUs you can dynamically adjust the clock speed in software (I know this is true for the ATtiny13 at least). However, this has to be done within 4 clock cycles of setting the CLKPCE bit (again, this is for the ATtiny13). See this forum post and pg. 28 of the datasheet.

Here's a good overview of setting the clock for the ATmega8: Electrons - AVR Fuses HOWTO Guide.

Tuesday, January 12, 2010

Bash: reading lines from a file

I guess I don't do this enough to remember it:

When reading a file in a for loop in bash, the following idiom will read each word (whitespace delimited):
for word in `cat file.txt`; do echo $word; done

If you want to grab the whole line then you can do this:
while read line; do echo $line; done < file.txt

or this:
cat file.txt | while read line; do echo $line; done
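The plain read above trims leading whitespace and eats backslashes. If you need each line exactly as it appears in the file, a slightly more defensive version looks something like this (the sample file content is invented):

```shell
#!/usr/bin/env bash
tmpfile=$(mktemp)
printf '  leading spaces\nback\\slash\nlast line' > "$tmpfile"

# IFS= keeps leading/trailing whitespace, -r keeps backslashes literal,
# and "|| [[ -n $line ]]" picks up a final line with no trailing newline.
count=0
while IFS= read -r line || [[ -n $line ]]; do
    count=$((count + 1))
    printf '%d: [%s]\n' "$count" "$line"
done < "$tmpfile"

rm -f "$tmpfile"
```

All three lines come through, including the indented one and the unterminated last one.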

Friday, January 8, 2010

Essential Eclipse Keyboard Shortcuts (navigation)

There are a few shortcuts I use all the time for navigating in Eclipse and I've just learned a few more useful ones (I was looking for quick ways to jump between editor windows).

Here's the essential list (IMO):

Ctrl+E (go to other open editors - opens selection box)
Ctrl+Q (jump to last edit location)
Ctrl+O (jump to any member/method/inner-class in the current editor)
Ctrl+shift+T (open any type)
Ctrl+shift+R (open any file)
Ctrl+L (jump to a particular line number)
Ctrl+T (go to a supertype/subtype - multiple presses toggle between super/sub)
Alt+left/right arrow (jump through visited files)
Ctrl+. Ctrl+, (navigate up and down through error/warning locations)

Sunday, January 3, 2010

C rand() and random() functions

There are two primary random functions to be aware of in stdlib.h: random() and rand(). The main difference is in the range of values returned by the two functions.

random() returns a pseudo random number in the range 0 -> 0x7FFFFFFF = 0 -> 2,147,483,647 (RANDOM_MAX).
rand() returns values in the range 0 -> 0x7FFF = 0 -> 32,767 (RAND_MAX).

I had mistakenly been using one in place of the other in some micro-controller code and had spent some time wondering why it wasn't behaving as expected...