------------------------------------------------------------------------
revive failed jobs
------------------------------------------------------------------------

Michael Coughlin is the Project Scientist for the Kitt Peak EMCCD
Demonstrator project. Every night the robotic telescope observes
targets, weather permitting.  This results in a master job list,
"jobslist.txt", with one job per line.  The job numbers of the failed
jobs are summarized in the file "failed_jobs.txt".  Job numbers start
at 0, so the job number is the line number in "jobslist.txt" minus
one.  Michael wishes to identify the jobs that have failed so that he
can restart them.


We will idealize this problem by generating the two files as follows.


$ cat failed_jobs.txt
0,1,5,9

$ cat jobslist.txt
hello
kitty
mitty
gitty
zitty
nitty
fatty
nutty
butty
rusty
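
To follow along, the two sample files can be recreated with a few
lines of shell.  This is only a sketch: the contents are the toy data
shown above, not real telescope job lists.

```shell
# Recreate the two sample files used in this note (toy data only).
printf '0,1,5,9\n' > failed_jobs.txt
printf '%s\n' hello kitty mitty gitty zitty \
              nitty fatty nutty butty rusty > jobslist.txt

# Sanity check: the job list should have 10 lines.
wc -l < jobslist.txt
```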

We first create a version of jobslist.txt that has line numbers.

$ cat jobslist.txt | nl -n ln -s " " 
1      hello
2      kitty
3      mitty
4      gitty
5      zitty
6      nitty
7      fatty
8      nutty
9      butty
10     rusty

The failed jobs [0,1,5,9] correspond to the jobs on lines [1,2,6,10].
The offset of 1 arises because line numbers start at 1 whereas the
job index starts at 0.  Our goal is to re-run the failed jobs.  This
means joining the two files on the key "jobindex+1".

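The unix utility "join" joins two files which are both sorted on a
common key.  Here the common key is the line number (the job number
plus one).  "join" expects the keys in lexical order, as produced by
plain "sort", not in numeric order, which is why 10 sorts before 2 in
the listings that follow.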

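So we first split failed_jobs.txt into one job number per line, add 1
to each job number, and sort the result.  (gsed is GNU sed, as it is
commonly named on macOS; on Linux it is plain sed.)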

$ gsed 's/,/\n/g' failed_jobs.txt | awk '{print $1+1}' | sort
1
10
2
6
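
If GNU sed is not available, the same key list can be sketched with
"tr", which does the comma-to-newline split on any POSIX system.  This
is an alternative to the gsed step, not the command used above; the
variable name failed_keys is only for illustration.

```shell
printf '0,1,5,9\n' > failed_jobs.txt   # toy data from this note

# tr turns each comma into a newline, awk adds 1 to convert a job
# index into a line number, and sort puts the keys in lexical order.
failed_keys=$(tr ',' '\n' < failed_jobs.txt | awk '{print $1+1}' | sort)
printf '%s\n' "$failed_keys"
```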

Next, we number and sort the job list file.

$ cat jobslist.txt | nl -n ln -s " " | sort
1      hello
10     rusty
2      kitty
3      mitty
4      gitty
5      zitty
6      nitty
7      fatty
8      nutty
9      butty

We are now ready to join the two files:

$ join <(gsed 's/,/\n/g' failed_jobs.txt | awk '{print $1+1}' | sort) \
       <(cat jobslist.txt | nl -n ln -s " " | sort)
1 hello
10 rusty
2 kitty
6 nitty
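
The joined output is in lexical key order.  A minimal sketch of
putting it back into job order is to pipe it through "sort -n".  The
version below uses temporary files instead of <( ) so that it also
runs in a plain POSIX sh; the file names failed_keys.tmp and
jobs_keyed.tmp are made up for this example, and "sort -k1,1" is used
so that only the key column decides the order.

```shell
# Toy data from this note.
printf '0,1,5,9\n' > failed_jobs.txt
printf '%s\n' hello kitty mitty gitty zitty \
              nitty fatty nutty butty rusty > jobslist.txt

# Build both key files, sorted lexically on the key as join requires.
tr ',' '\n' < failed_jobs.txt | awk '{print $1+1}' | sort > failed_keys.tmp
nl -n ln -s " " jobslist.txt | sort -k1,1 > jobs_keyed.tmp

# Join on the first field, then restore numeric (job) order.
revived=$(join failed_keys.tmp jobs_keyed.tmp | sort -n)
printf '%s\n' "$revived"

rm -f failed_keys.tmp jobs_keyed.tmp
```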


Notes: "join" is extremely picky. You need to sort the key column
in exactly the same way for both files. This means that the line
numbers should be left-justified rather than right-justified (hence,
nl -n ln) and separated from the text by a single space rather than
the default tab (hence, nl -s " ").
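
Because "join" is this picky, it can help to cross-check the result
with a method that needs no sorting at all.  The sketch below uses a
single awk pass instead of "join"; it is a different technique from
the one described in this note, and the variable name crosscheck is
only for illustration.

```shell
# Toy data from this note.
printf '0,1,5,9\n' > failed_jobs.txt
printf '%s\n' hello kitty mitty gitty zitty \
              nitty fatty nutty butty rusty > jobslist.txt

# First file: split on commas and remember each job index + 1.
# Second file: print every line whose line number was remembered.
crosscheck=$(awk -F, '
    NR == FNR { for (i = 1; i <= NF; i++) failed[$i + 1] = 1; next }
    (FNR in failed) { print FNR, $0 }
' failed_jobs.txt jobslist.txt)
printf '%s\n' "$crosscheck"
```

The lines come out in job order, so they can be compared directly
with the joined output once that has been passed through "sort -n".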