Thursday, December 24, 2009

Updating Windows Vista file permissions with Icacls

I'm not much of a Windows guy, but I have a Vista machine at home as the main PC. I recently bought a NAS to back up important files and found that my backup jobs were failing due to file permission issues (there are multiple accounts on the PC). I wanted to do the equivalent of chmod +r (or even 777) on a few files but didn't want the hassle of using Windows Explorer to adjust the file perms one at a time. I'd already tried changing permissions for the root folder and applying these to the contained files and sub-dirs, but it didn't work; I guess it was something to do with the fact that the permissions weren't uniform for the files in the dirs.

Anyway, it seems you use Icacls for updating file permissions and you can do it recursively with the /T flag. So for a single file you do this:

> Icacls <filename> /grant <user>:<perm>

e.g. This grants full control to user 'Paul' on file 'test.txt'

c:\> Icacls test.txt /grant Paul:F

and for a dir you do this:

c:\> Icacls <dirname> /T /grant <user>:<perm>

e.g. This grants full control to user 'Paul' on 'c:\testdir' and everything below it

c:\> Icacls c:\testdir /T /grant Paul:F

Monday, December 21, 2009

Mysql multiple counts on one line (reports)

I like to poll the contents of multiple tables over time in order to track the progress of certain processing tasks I do. I used to perform individual counts for each table and kept doing that out of habit. There's a much nicer way to do this and have the report in a single result. I tend to keep track of the elapsed time using the unix_timestamp() function (this returns the number of elapsed seconds since Jan 1st 1970 as an unsigned integer).

First set up the initial variables (those to compare against - this is time 0)

PaulBo@test_db:SELECT @time:=UNIX_TIMESTAMP(), @apples:=(SELECT COUNT(*) FROM apples), @oranges:=(SELECT COUNT(*) FROM oranges);

Then, let some time pass and poll the tables for changes:

PaulBo@test_db:SELECT UNIX_TIMESTAMP() - @time as elapsed_time, (SELECT COUNT(*) FROM apples) - @apples as d_apples, (SELECT COUNT(*) FROM oranges) - @oranges as d_oranges;
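+--------------+----------+-----------+
| elapsed_time | d_apples | d_oranges |
+--------------+----------+-----------+
|          435 |      230 |     12887 |
+--------------+----------+-----------+
1 row in set (0.00 sec)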

Friday, December 11, 2009

Redirecting the output from a "here document"

This had me confused for a few minutes, so I thought I'd post.

It's common to use "here documents" to simplify input to a program in a script.


> cat <<EOF
> This is a random number:
> EOF
This is a random number:

But what if you want to capture the output of this? The naive attempt would be to redirect after the second EOF, but this is incorrect as the termination string has to be on a line all by itself.

This is the answer:

> cat <<EOF > /tmp/data.txt
> This is a random number:
> EOF
> cat /tmp/data.txt
This is a random number:
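
Here's the same thing as a small script, as a rough sketch. It assumes bash (for $RANDOM, which I'm using purely as an illustrative value) and a scratch file from mktemp; the variable name 'out' is made up for the example. The point is that the redirect lives on the line that opens the here document, and the terminator stays alone on its own line.

```shell
#!/usr/bin/env bash
# Scratch file for the demo; 'out' is a hypothetical name.
out=$(mktemp)

# The redirect goes on the opening line, right after the <<EOF marker.
cat <<EOF > "$out"
This is a random number: $RANDOM
EOF

# Show what was captured in the file.
cat "$out"
```

A pipe works the same way: put it on the opening line (cat <<EOF | wc -c), not after the closing EOF.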

samples from large datasets in R

I have a dataset I want to plot (say 5,000,000 data-points). This can be very slow to plot in R, so you want to take a sample of this data instead.

Say I have a tab delimited file with two columns, say 'time' and 'count'. The columns have these as headers. There are 5M rows and I'd like a simple overview of the count over time.

> data = read.delim('filename', header=T) #read in the tsv (tab separated value) data file
> s = length(data$time) # calculate the number of data-points
> n = 1000 # this is my sample size
> N = sort(sample(1:s, n)) #create a set of indices sampled from the vector 1:s
> plot(data$time[N], data$count[N]) # use the indices to sample from the set

The magic is in the sort(sample(1:s, n)). This takes n samples, without replacement, from the space 1 -> s. Unless a probability vector is provided, each element in the input vector (1:s) has an equal probability of being selected. We sort the output of sample so that the indices are in their original order. Actually I just tried this without the sort and the plot looks the same: plot() draws a scatter plot by default, and the order of the points doesn't matter for that. The sort only matters if you connect the points, e.g. with type='l'.
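
If the file is too big to read into R comfortably, the same kind of sampling can be done before R ever sees it. This is only a sketch, not what I did above: it assumes GNU coreutils (for shuf) and a single header line, and the function name sample_tsv is made up for illustration.

```shell
#!/usr/bin/env bash
# sample_tsv FILE N
# Print the header line, then N data rows chosen at random without
# replacement, sorted numerically on the first column (assumed to be 'time').
sample_tsv() {
    local file=$1 n=$2
    head -n 1 "$file"                                   # keep the header
    tail -n +2 "$file" | shuf -n "$n" | sort -n -k1,1   # sample, then restore order
}
```

Usage would be something like sample_tsv filename 1000 > sample.tsv, and then read.delim('sample.tsv', header=T) as before.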

Thursday, December 3, 2009

Infinite loop in bash

I needed an infinite loop for polling a database table for a while (killed with Ctrl-C).

while [ 1 ];
do mysql -u paul -h test_database -ppass test_paul data_table <<< "show processlist";
sleep 10;