--------------------------------------------------------------------------------
join
--------------------------------------------------------------------------------

joins two files (file "1" and file "2") based on a common key which
by default is that specified in column one.
	#IFS (input field separator) by default is \s or \t
	#by default the first column of the output is the field which
	#which is used  for match (join)

	#by default, only records which are paired are printed
-t "IFS" change IFS. OFS follows IFS
-i ignore case 
-a 1 print unmatched lines in file 1 
-o over-rides the output column order

I. Simplest Example

$ cat a         
1 Protein
2 Carbohydrate
3 Fat

$ cat b		#note that the first column is sorted and has more entries
1 Cheese 
2 Potato
3 Butter
4 Air 
5 Water 

$ join a b	#only common lines are joined  
1 Protein Cheese
2 Carbohydrate Potato
3 Fat Butter

$ join -v 1 a b	#print lines which do not have counterpart in file 1

$ join -v 2 a b 
0 Air
5 Water

II. Joining on specific columns
	#remember: the specific columns must be sorted
$ cat c
Air 0
Cheese 1
Potato 2
Butter 3
Water 5

$ join -1 2 -2 1 c a    #first file use column 2, second file use column 1
1 Cheese Protein
2 Potato Carbohydrate
3 Butter Fat

III. Sorting before joining 

$ cat wine.txt
White Reisling Germany
Red Riocha Spain
Red Beaunes France

$ cat reviews.txt
Riocha Meh
Beaunes Great!
Reisling Terrible!

Fortunately bash allows sub-shells be treated as files. For reasons
I do not understand you have to use GNU join

$ gjoin -1 2 -2 1 <(sort -k 2 wine.txt) <(sort reviews.txt)
Beaunes Red France Great!
Reisling White Germany Terrible!
Riocha Red Spain Meh

IV. Sepcifying a different delimiter or IFS
	#done using -t option

$ join  -1 2 -2 1 -t, file1 file2


V. Grand Combo: specify a specific output sequence

$ cat names.csv
1,John Smith,London
2,Arthur Dent, Newcastle
3,Sophie Smith,London

$ cat transactions.csv
£1234,Deposit,John Smith
£4534,Withdrawal,Arthur Dent
£4675,Deposit,Sophie Smith

$ join -1 2 -2 3 -t , -o 1.1,1.2,1.3,2.2,2.1 names.csv transactions.csv
1,John Smith,London,Deposit,£1234
2,Arthur Dent, Newcastle,Withdrawal,£4534
3,Sophie Smith,London,Deposit,£4675

Of course you could have used awk to re-sequence the output. 

Ref: https://shapeshed.com/unix-join/

------------------------------------------------------------------------
join whilst preserving the order of one of the input files
------------------------------------------------------------------------

A limitation of join is that the input files have to be sorted. Sometimes
you wish to have the output to carry the same order as one of the input files.

$ cat rabbits1
Parvi   Brown
Sarsi   Black
Lakshmi White
Rusty   Tan
Malli   Gray 
Kuro    Blue

$ cat rabbits2
Kuro    Parsley
Malli   Coriander
Rusty   Celery
Lakshmi Lettuce
Sarsi   Carrots
Parvi   Hay

The key word here is name of the rabbit (column 1) and the other
attributes are color and food choice.

Say we wish to preserve ther order of the name in "rabbits1" (because
these correspond to temporal order of my rabbits). The solution is
to add a numerical index to rabbits1 file.

$ cat -n rabbits1 
     1	Parvi   Brown
     2	Sarsi   Black
     3	Lakshmi White
     4	Rusty   Tan
     5	Malli   Gray
     6	Kuro    Blue

We sort on the rabbit names and rewrite the output so that the first
column is the number column

$ join -1 2 -2 1 -o1.1 -o1.2 -o 1.3 -o 2.2 <(cat -n rabbits1 | sort -k2) <(sort rabbits2) 
6 Kuro Blue Parsley
3 Lakshmi White Lettuce
5 Malli Gray Coriander
1 Parvi Brown Hay
4 Rusty Tan Celery
2 Sarsi Black Carrots

Next we re-sort on the first column (the numerical index)

$ join -1 2 -2 1 -o1.1 -o1.2 -o 1.3 -o 2.2 <(cat -n rabbits1 | sort -k2) <(sort rabbits2) | sort -k1 -n
1 Parvi Brown Hay
2 Sarsi Black Carrots
3 Lakshmi White Lettuce
4 Rusty Tan Celery
5 Malli Gray Coriander
6 Kuro Blue Parsley

Finally, lop off the number column

$ join -1 2 -2 1 -o1.1 -o1.2 -o 1.3 -o 2.2 <(cat -n rabbits1 | sort -k2) <(sort rabbits2) | sort -k1n | awk '{$1="";print}'
 Parvi Brown Hay
 Sarsi Black Carrots
 Lakshmi White Lettuce
 Rusty Tan Celery
 Malli Gray Coriander
 Kuro Blue Parsley

We now have sorted list but the ordering is determined by rabbits1

To make the output pretty use "column"

$ join -1 2 -2 1 -o1.1 -o1.2 -o 1.3 -o 2.2 <(cat -n rabbits1 | sort -k2) <(sort rabbits2) | sort -k1n | awk '{$1="";print}' | column -t
Parvi    Brown  Hay
Sarsi    Black  Carrots
Lakshmi  White  Lettuce
Rusty    Tan    Celery
Malli    Gray   Coriander
Kuro     Blue   Parsley