------------------------------------------------------------------------
Multi-line record
------------------------------------------------------------------------

In some databases, a logical record is spread over several lines, and
the number of fields per record may vary. It helps to consolidate the
fields of a given record on a single line (so it can be fed to
line-oriented tools such as sed or awk).

Consider the file "Kitty.txt":

$ cat Kitty.txt
1 hello kitty
hello mitty
2 hello ditty
3 hello hello hello
goodbye
night night

We define a record as starting on a line that begins with a number and
including every following line until the next line that begins with a
number.

The sed code is easy once we get going: lines which do not match
"/^[0-9]+/" are appended to the hold buffer. When a line matching
"/^[0-9]+/" is encountered, exchange the pattern space with the hold
buffer, replace each \n with a space and write out the pattern space.
The trick, as always, is how you handle the first and last line. The
ordering of commands matters.

$ gsed -E -n '1{h;b};${H;x;s/\n/ /g;p;b};/^[0-9]+/!{H;b};/^[0-9]+/{x;s/\n/ /g;p}' Kitty.txt
1 hello kitty hello mitty
2 hello ditty
3 hello hello hello goodbye night night

Now let us dissect this command.

$ gsed -E -n '1{h;b};${H;x;s/\n/ /g;p;b};/^[0-9]+/!{H;b};/^[0-9]+/{x;s/\n/ /g;p}' Kitty.txt
              #1     #2                  #3              #4

#1 .. hold the first line (assumed to be a numbered line).
      Notice the use of "b": with no label it branches to the end of
      the script, so the next line is read without going through the
      rest of the commands (efficient!).
#2 .. deal with the last line: append it to Hold, exchange, replace \n
      with a space and print. This must come before #3, which would
      otherwise branch away on the last line. sed quits after the last
      line, so the final record would never be printed.
      The closing "b" keeps the joined record (which starts with a
      number) from falling through to #4.
      "p" is a separate command rather than a flag on "s": the flag
      prints only when a substitution was made, which would silently
      drop a record that occupies a single line.
#3 .. non-numbered line: append to Hold. The branch ends the cycle
      (efficient!).
#4 .. numbered line: exchange pattern space with Hold, replace \n with
      a space and print.

In any case, this took a lot of time, but in the process I learned how
to solve the "last line" problem and the value of "b" for improved
efficiency.
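
As a cross-check on the buffer-and-flush logic described above, here is a
minimal sketch of the same idea in awk. It is an alternative technique, not
the sed method itself. The helper name "join_records", its argument order
and the line layout of the sample file are assumptions made for
illustration. The variable "rec" plays the role of the sed hold buffer.
Because the final record is flushed in END, this sketch does not need the
last line to be a continuation line, which the sed command does assume.

```shell
#!/bin/sh
# Sample input; the exact line breaks are an assumption made for illustration.
cat > Kitty.txt <<'EOF'
1 hello kitty
hello mitty
2 hello ditty
3 hello hello hello
goodbye
night night
EOF

# join_records REGEX SEP FILE   (hypothetical helper, not part of the original notes)
# REGEX marks the first line of a record; SEP replaces the line breaks.
join_records() {
    awk -v rectok="$1" -v sep="$2" '
    $0 ~ rectok {                  # this line starts a new record
        if (have) print rec        # flush the record collected so far
        rec = $0; have = 1
        next
    }
    {                              # continuation line: append to the buffer
        if (have) rec = rec sep $0; else rec = $0
        have = 1
    }
    END { if (have) print rec }    # flush the final record
    ' "$3"
}

join_records '^[0-9]+' ' ' Kitty.txt
# prints:
# 1 hello kitty hello mitty
# 2 hello ditty
# 3 hello hello hello goodbye night night
```

Comparing this output with the sed output on the same file is a quick way to
confirm that both handle the first and last records the same way.
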
I think the one-liner may be so useful that I converted it into a
stand-alone utility.

$ cat multi2one
#!/bin/bash
# multi2one ... build a single logical record from several lines of fields
# multi2one rectok fs file1
# The start of a logical record is given by "rectok" (regular expression without the slashes)
# fs substitutes for \n when lines are combined
# rectok, if present in a line, must be at the beginning of the line
# example:
# multi2one "^[0-9]+" "\t" Kitty.txt
#
rectok="$1"; shift
fs="$1"; shift
# refuse to run unless the first line starts a record
head -n 1 "$1" | awk -v rectok="$rectok" \
  '$0!~rectok{ print "exiting - first line does not start with record token"; exit 1}'
# "p" is a separate command so that single-line records are printed too
[ $? = 0 ] && gsed -E -n \
  '1{h;b};${H;x;s/\n/'"$fs"'/g;p;b};/'"$rectok"'/!{H;b};/'"$rectok"'/{x;s/\n/'"$fs"'/g;p}' "$1"
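
Because multi2one splices "fs" directly into the replacement part of an
s/// command, a separator containing "/" or "&" would break the command or
be read as "the whole match". Below is a minimal sketch of one way to guard
against that. The helper name "escape_repl" is hypothetical and not part of
multi2one. It escapes only those two characters, so a separator such as
"\t" passes through unchanged.

```shell
#!/bin/sh
# escape_repl STRING   (hypothetical helper, not part of multi2one)
# Backslash-escape "/" and "&" so STRING is safe as an s/// replacement.
# "|" is used as the delimiter here so that "/" needs no special handling.
escape_repl() {
    printf '%s' "$1" | sed 's|[/&]|\\&|g'
}

fs=' / '
fs_safe=$(escape_repl "$fs")

# Join two lines with the escaped separator.
printf 'a\nb\n' | sed -n 'N;s/\n/'"$fs_safe"'/;p'
# prints: a / b
```

In multi2one, the escaped value would take the place of "$fs" inside the
sed command. The record token would need similar care for "/" only, since
it is used as a pattern rather than a replacement.
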