------------------------------------------------------------------------
Unique PTF names (unique names but preserving order)
------------------------------------------------------------------------

Anna Ho has a file of PTF sources

$ more PTFSources.txt
PTF_D 17
PTF_A 17.2
PTF_C 16
PTF_A 18
PTF_B 19
PTF_C 18.5
PTF_B 19.3

She wants to extract all unique PTF names while preserving the order in
which they are first listed. Thus Anna would like to see the following
output (she does not care about the second column, which is the variable
mag)

PTF_D
PTF_A
PTF_C
PTF_B

Naively you may think "sort | uniq" or, equivalently, "sort -u" will be
fine. (Note that "uniq -u" is something else: it prints only the lines
that are never repeated.) However, sort of any sort will sort the data
and thus scramble the input order.

Here is the solution!

$ awk '{print NR,$1}' PTFSources.txt |   # 1
    sort -k2,2 -k1,1n |                  # 2
    awk 'a!=$2;{a=$2}' |                 # 3
    sort -n |                            # 4
    awk '{print $2}'                     # 5
PTF_D
PTF_A
PTF_C
PTF_B

1: print the sequence number (number of records, NR) and the PTF name.

2: sort on the new second column ("-k2,2"). Recall that this is now the
   PTF name. Ties are broken numerically on the sequence number
   ("-k1,1n") so that the first occurrence of each name comes first.
   A plain "sort -k2" breaks ties by comparing whole lines as text, so
   line 10 would sort ahead of line 2 and the wrong occurrence could be
   kept once the file has ten or more lines.

3: now the modestly clever bit. The pattern "a!=$2" has no action, so
   awk applies the default action and prints the line whenever the
   second field, $2 (the PTF name), differs from "a". Next, regardless
   of the outcome, the block {a=$2} sets a=$2. In awk un-initialized
   strings are empty. So on the first line "a" is not equal to $2 and
   the line is printed, and then "a" is set to $2. On the next line, if
   a==$2 nothing is printed; if not, the line is printed.

4: sort on the sequence number (now you are going back to the original
   sequence).

5: strip off the sequence number. Print the PTF name.

ps. "cut" is very picky about delimiters. Use it with caution. In
contrast, awk allows delimiters to be regular expressions.
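
For comparison, the same order-preserving de-duplication can be sketched in a single awk pass with an associative array. This is an alternative idiom, not the pipeline described above. The array name "seen" and the shell variables "input" and "unique_names" are arbitrary choices made for this illustration. The sample data is inlined so the snippet runs on its own.

```shell
# Sample data copied from PTFSources.txt as shown in the text.
input='PTF_D 17
PTF_A 17.2
PTF_C 16
PTF_A 18
PTF_B 19
PTF_C 18.5
PTF_B 19.3'

# seen[$1]++ evaluates to 0 (false) the first time a name appears and to
# a positive count afterwards, so !seen[$1]++ is true only for the first
# occurrence. Input order is preserved because nothing is sorted.
unique_names=$(printf '%s\n' "$input" | awk '!seen[$1]++ {print $1}')

printf '%s\n' "$unique_names"
```

On the real file the equivalent call would be awk '!seen[$1]++ {print $1}' PTFSources.txt. The trade-off is memory: the array holds every distinct name, whereas the sort-based pipeline leaves the heavy lifting to sort, which can spill to disk for very large inputs.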