10,000 Monkeys Typing…with a Unix/sh challenge…

July 27, 2010 | Cryptography, Programming Languages | By: Mark VandeWettering

I was testing some code that I wrote for analyzing cryptograms, and decided that the easiest way to do so would be to get some random text, drawn from the letters A-Z. A moment's thought yielded this method, without even programming anything:

tr -d -c A-Z < /dev/urandom | dd ibs=10000 count=1

The tr generates the required data, and the dd truncates it to the desired number of characters. But for tidiness, I'd like to have the output broken up so that each line consists of 50 characters, with spaces inserted between every 5 characters (I won't begrudge you if you leave a dangling space at the end of each line). I couldn't figure out a simple way to get that to happen all in one command line and using standard utilities. I can of course write a little Python utility, or even perl, but can anyone think of a clever short way to do this?

Addendum: Tom pointed out something interesting about the command that I listed above: it doesn’t work the way I thought it did. Apparently ibs is the input block size, which dd dutifully allocates, and count is the number of read system calls that dd issues. For reasons which escape me, dd does not try to make sure that it actually received a full input block: it will happily return short blocks if it gets them, and doesn’t bother retrying to get more. Hence, it works rather erratically when using a pipe as input, particularly when the writes from the upstream process may flush at odd intervals.
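
One way to sidestep the short-read behaviour described in the addendum is to let a tool that counts bytes, rather than read calls, do the truncation. The sketch below is only an illustration: the function name random_letters is made up for this example, and it assumes a head that supports -c, which POSIX does not require but which the GNU, BSD and macOS versions provide.

```shell
# random_letters N: print exactly N random uppercase letters.
# Assumption: head supports -c (byte count); this is not required by POSIX.
# LC_ALL=C keeps tr from rejecting random bytes as invalid multibyte text.
random_letters() {
    LC_ALL=C tr -d -c A-Z < /dev/urandom | head -c "$1"
}

# Count the bytes produced for the 10,000-letter case from the post.
random_letters 10000 | wc -c
```

Because head keeps reading until it has the requested number of bytes, the result should not depend on how tr happens to flush its output.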


Comments

Comment from Shannon Nelson
Time 7/27/2010 at 8:21 pm

Sed is a magical tool, and cut comes in handy at the last:

tr -d -c A-Z < /dev/urandom | dd ibs=10000 count=1 | sed -e 's/\(.\{5\}\)/\1 /g' -e 's/\(.\{60\}\)/\1\n/g' | cut -c 1-59

Comment from Shannon Nelson
Time 7/27/2010 at 8:32 pm

Of course, this points out one of the problems with specs. Note that I didn’t follow your specs to the letter, I interpreted them and came up with an answer that made sense to me: you asked for lines of 50 characters, and I gave you lines of 59 characters. Was I wrong in not following the given spec?

Comment from VK5FNET
Time 7/27/2010 at 9:16 pm

#!/usr/bin/perl
#use strict;
#use warnings;

my $limit = 250; # ten lines of five groups of five
my @cypher; # where we put the plain text pad
while (<>) { # read line by line from STDIN
    my @chars = split(//, $_); # split the line into a list of characters
    #print "|";
    for my $char (@chars) {
        if ($char =~ m/[A-Z]/ ) { # if the character is A-Z
            push @cypher, $char;
            #print ".";
            if ($limit <= scalar(@cypher)) {
                #print "\n";
                #print scalar(@cypher);
                my $offest = 0;
                while (@cypher) {
                    for (1..5) { # GROUP
                        print " ";
                        for (1..5) { #LETTERS
                            print pop @cypher;
                        }
                    }
                    print "\n";
                }
                exit;
            }
        }
    }
    #print scalar(@cypher)."\n";
}

Comment from VK5FNET
Time 7/27/2010 at 9:24 pm

Ok, how about I tidy that up a bit. So 10 groups of 5 characters per line, totalling 10,000. All comments and debug removed. No doubt there are smaller and faster ways to do this in perl, but you need to be able to read it first to understand it…

This code reads the standard input with the while (<>), converts each line to an array, and iterates over the array to pull out the characters we care about. When our array is full, it prints it out.

perl -e 'my $limit = 10_000; my @cypher;while (<>){my @chars=split(//, $_);for my $char (@chars){if ($char =~ m/[A-Z]/ ){push @cypher, $char;if($limit<=scalar(@cypher)){my $offest = 0;while(@cypher){for(1..10){print " ";for(1..5){print pop @cypher;}}print "\n";}exit;}}}}' < /dev/urandom

Comment from Tom Duff
Time 7/27/2010 at 9:44 pm

This works on Linux but not on Mac OS X (no -w flag to od):

tr -d -c A-Z < /dev/urandom |
dd ibs=1 count=10000 |
od -cvw50 |
sed 's/^[0-7]* *//;s/ //g;s/...../& /g'

The trick is to find a command that doesn't look for newlines in its input; od is the obvious choice. Unfortunately od's -w flag is non-posix.

dd ibs=10000 doesn't actually work how you'd expect when its input is a pipe, for complicated reasons.
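
The short-read behaviour can be made visible with a deliberately slow writer. This is a rough sketch, not a rigorous test: slow_writer is a hypothetical helper invented for the demonstration, the exact byte count from plain dd depends on timing, and iflag=fullblock is assumed to be available (it is a GNU coreutils extension, so other dd implementations may reject it).

```shell
# slow_writer: hypothetical helper that writes two 4-byte chunks with a
# pause in between, imitating an upstream process that flushes in pieces.
slow_writer() {
    printf 'AAAA'
    sleep 1
    printf 'BBBB'
}

# Plain dd: count=1 means a single read() call, which returns as soon as
# any data is available in the pipe, even if that is less than ibs bytes.
short=$(slow_writer | dd ibs=8 count=1 2>/dev/null)

# Assumption: GNU dd. iflag=fullblock keeps reading until the block is full.
full=$(slow_writer | dd ibs=8 count=1 iflag=fullblock 2>/dev/null)

echo "plain dd copied ${#short} bytes, fullblock copied ${#full} bytes"
```

On a typical system the plain dd is likely to report fewer than 8 bytes, because only the first chunk has arrived when its one read returns.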

Comment from Alan Yates
Time 7/27/2010 at 10:08 pm

tr -d -c A-Z < /dev/urandom | fold -w 50 | sed 's/\(.\{5\}\)/\1 /g'

Truncate to the desired length… I am assuming you mean 50 non-space chars?

If you dislike sed you can try this abomination:

tr -d -c A-Z < /dev/urandom | fold -w 5 | tr '12' ' ' | fold -w 60

Comment from VK5FNET
Time 7/27/2010 at 10:38 pm

hmmm, comment code is stripping the < and > characters.
so make sure the while statement looks like this:

while (<>) {

anyhow, shared the challenge at lunch with other programmers. here is another solution.

perl -e '$limit=shift; $group=5; $i=0; while($i<$limit) { read STDIN, $b, 1; $o=ord $b; next if $o>=234; print chr(65+($o % 26)), (++$i % $group ? "" : " ") } print "\n"' 150 < /dev/urandom | fold -w 60

Comment from VK5FNET
Time 7/27/2010 at 10:42 pm

Alan, the fold command is nice, and I’m assuming that there should be a backslash in front of the 12 there.

So that it's the form-feed char, not the carriage-return 13, or line-feed 10 char?

Or is that 12 in octal?

Comment from kiwimonster
Time 7/28/2010 at 2:19 am

sed is for wimps. hmm.. how to make it clearer, while still being obtuse…

tr -d -c A-Z < /dev/urandom | fold -w 5 | xargs -n 10 echo | head -200

Comment from Tom Duff
Time 7/28/2010 at 9:38 am

As usual for this sort of thing, it comes out shorter & easier to read in C than in Perl:

#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]){
int c, i, n=argc<2?10000:atoi(argv[1]);
if(n<0) fprintf(stderr, "%s: %s?\n", argv[0], argv[1]), exit(1);
FILE *f=fopen("/dev/urandom", "r");
if(f==0) perror("/dev/urandom"), exit(1);
for(i=0;i!=n;i++){
do c=getc(f); while(c<'A' || 'Z'<c);
putchar(c);
if(i%5==4) putchar(i%50==49?'\n':' ');
}
if(i%50!=0) putchar('\n');
return 0;
}

(Hope this formats right.)

Comment from Tom Duff
Time 7/28/2010 at 9:39 am

Hmm, maybe this formats correctly:

#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]){
    int c, i, n=argc<2?10000:atoi(argv[1]);
    if(n<0) fprintf(stderr, "%s: %s?\n", argv[0], argv[1]), exit(1);
    FILE *f=fopen("/dev/urandom", "r");
    if(f==0) perror("/dev/urandom"), exit(1);
    for(i=0;i!=n;i++){
        do c=getc(f); while(c<'A' || 'Z'<c);
        putchar(c);
        if(i%5==4) putchar(i%50==49?'\n':' ');
    }
    if(i%50!=0) putchar('\n');
    return 0;
}

Comment from Elwood Downey
Time 7/28/2010 at 12:57 pm

Well done, kiwimonster.

Comment from Pádraig Brady
Time 7/29/2010 at 3:42 am

dd doesn’t read full blocks by default for backwards compat reasons.
We added the iflag=fullblock option to coreutils recently, so you can do:

tr -d -c A-Z < /dev/urandom |
dd ibs=10000 iflag=fullblock count=1 cbs=50 conv=unblock |
sed 's/\(.\{5\}\)/\1 /g; s/ $//;'

Trading a little efficiency for clarity,
one could replace the `dd` above with:

fold -w50 | head -n200
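
Putting the fold, head and sed pieces from this thread together gives roughly the following. It is a sketch under stated assumptions rather than anyone's definitive answer: the function name group_letters is invented here, and the sed step runs before head instead of after it, which should produce the same lines.

```shell
# group_letters: read a stream of letters on stdin and print them
# 50 per line, with a space between each group of 5.
# The second sed command drops the dangling space at the end of full lines.
group_letters() {
    fold -w 50 | sed 's/\(.\{5\}\)/\1 /g; s/ $//'
}

# 10,000 letters is 200 lines of 50; head -n 200 ends the endless stream.
LC_ALL=C tr -d -c A-Z < /dev/urandom | group_letters | head -n 200
```

Because fold counts columns instead of waiting for newlines, it can break up the unbounded output of tr, and head -n 200 terminates the pipeline without involving dd at all.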

Comment from Alan Yates
Time 7/29/2010 at 6:31 pm

@VK5FNET yes that was slosh-zero-one-two (aka newline) before it got mangled.

Comment from rking
Time 10/27/2010 at 3:17 pm

ruby -e '204.times do 8.times do 5.times do print ("A".."Z").to_a[rand(26)] end; print " "; end; puts end'

or a little more idiomatically (and it buys us 4 extra lines by leaving off the ending space):

ruby -e '208.times { puts (1..8).map { (1..5).map { ("A".."Z").to_a[rand(26)]}.join }.join " " }'

If I’m going to use shell, it’s:

pwgen -C0Bs 5 2000 | tr '[:lower:]' '[:upper:]'

Pingback from junk code | VK5FJ
Time 6/19/2013 at 12:20 am

[...] whats this for? There was a challenge a while back on the Brainwagon blog, using /dev/urandom to generate a list of ‘random’ characters to fit a pattern, ten [...]
