LFCS File Manipulation

Jarret B

The Linux Foundation Certified System Administrator (LFCS) exam lists ‘Analyze text files’ in its Domains and Competencies. This article covers six commands used to analyze the contents of a text file: wc, cat, head, tail, more and less.

Make sure you understand these commands for the LFCS exam, and practice them until you are familiar with their use.

wc

The ‘wc’ command is used to count words in text-based files. Despite the name, it can count things other than words, such as lines and characters. The basic syntax is:

wc [options] [file]

There are three main options to be familiar with for the ‘wc’ command.

The first option is ‘-m’ which is used to specify the character count of the file. Each character is counted in the full count.

The second option is ‘-l’. The ‘-l’ option counts the number of lines in the specified file. The count is based on the number of newline characters in the file, so a last line with no trailing newline is not counted.

The third option is ‘-w’ which specifies to count the number of words.

If the command line includes multiple options, be aware that the results are not printed in the order the options were listed. The ‘wc’ command always prints the counts in a fixed order: lines first, then words, then characters. The command ‘wc -m -w -l test’ therefore gives the same output as ‘wc -l -w -m test’. The line count is usually the smallest number and the character count the largest, but this depends on the file itself, so rely on the fixed order when reading the output rather than on the size of the numbers.

An example would be to count the words in a file named ‘test’. The command would be:

wc -w test
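To try the three options without touching your own files, here is a small sketch. It assumes a throwaway folder made with ‘mktemp’ and a made-up three-line file named ‘test’; the variable names are only for illustration.

```shell
# Work in a throwaway folder so nothing in the current folder is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a three-line sample file named 'test' (the contents are made up).
printf 'one two three\nfour five\nsix\n' > test

wc -l test    # count lines
wc -w test    # count words
wc -m test    # count characters

# Reading from standard input leaves the file name out of the output,
# which makes the number easy to store in a variable.
line_count=$(wc -l < test)
word_count=$(wc -w < test)
char_count=$(wc -m < test)

# The options are listed "backwards" here, but the counts are still
# printed as lines, words, characters.
wc -m -w -l test
combined=$(wc -m -w -l < test)
```

Redirecting the file with ‘<’ is handy in scripts because ‘wc’ then prints only the numbers.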

For more information on the ‘wc’ command and its options type the command ‘wc --help’ in a terminal.

cat

The ‘cat’ command is used to concatenate files, but its most common use is to copy a file to the standard output, which is the screen.

The standard syntax is:

cat [options] [file1] [file2]…

To concatenate, or join, two files the command would look like:

cat [file1] [file2] > [new-file]

Here, ‘file1’ will be copied into the ‘new-file’. When ‘file1’ is completely copied then ‘file2’ is copied into ‘new-file’. The two files are now made into one.

If the command were not ‘redirected’ (>) into a new file then the files would simply be displayed on the screen one after the other. Remember, if the data is not redirected to another file then the data is displayed on the screen.
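The difference between displaying and redirecting can be seen with a short sketch. The file names ‘file1’, ‘file2’ and ‘new-file’ follow the syntax line above; their contents and the throwaway folder are assumptions made for the example.

```shell
# Work in a throwaway folder with two small sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'alpha\nbeta\n' > file1
printf 'gamma\n' > file2

# Without redirection both files are printed on the screen, one after the other.
cat file1 file2

# With redirection the same output goes into new-file instead.
# Note that '>' overwrites new-file if it already exists.
cat file1 file2 > new-file

# Display the joined file.
cat new-file
```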

The ‘cat’ command has two main options to keep in mind.

The first option is ‘-n’. The ‘-n’ option numbers the lines as they are printed on the screen or redirected to a file. For example, to display and number the lines in the file named ‘test’ the command would be ‘cat -n test’. Of course, this assumes that the file ‘test’ is in the current folder.

The second option is the ‘-E’ which places a dollar sign ($) at the end of each line. If the line is wrapped to the next line because of the terminal window width you will be able to see the actual end-of-the-line.
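Both options can be tried on a small file. This sketch assumes the GNU version of ‘cat’ (the ‘-E’ option is not available in every implementation) and a made-up two-line file named ‘test’.

```shell
# Work in a throwaway folder with a two-line sample file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'first line\nsecond line\n' > test

# Number every line.
cat -n test

# Mark the real end of every line with a dollar sign (GNU cat).
cat -E test

# Keep pieces of the output for a closer look.
first_marked=$(cat -E test | head -n 1)
second_number=$(cat -n test | awk 'NR == 2 { print $1 }')
echo "$first_marked"
echo "$second_number"
```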

You can also use ‘cat’ to create your own text file. In a folder look at the current files and use a file name not already in use, such as ‘test1’. Type the command ‘cat > test1’. Enter all of the text you want, pressing ENTER at the end of each line, and when you are finished press CTRL+D on an empty line to end the file. CTRL+D signals the end of the input, while CTRL+C interrupts the ‘cat’ command instead. The file should now be created and contain the text you entered. Use the command ‘cat test1’ to see the contents of the file.
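Typing text at the keyboard cannot be done from a script, so a common variation is to feed ‘cat’ the text with a here-document. This is a sketch of that variation, not the interactive method described above; the marker word ‘EOF’ and the sample text are arbitrary choices.

```shell
# Work in a throwaway folder so an existing 'test1' is never overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Everything between the two EOF markers becomes the input of 'cat',
# which writes it into test1.
cat > test1 <<'EOF'
This is the first line.
This is the second line.
EOF

# Show what was written.
cat test1
```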

For more information on the ‘cat’ command type ‘cat --help’ in a terminal.

head


The command ‘head’ is used to display only the beginning of a specified file. The default is to show the first ten lines.

The standard syntax is:

head [option] [file1] [file2]…

When using the ‘head’ command multiple files can be listed. Each file will be displayed according to the options used or the defaults. Each file is preceded by a header naming the file being shown, in the form ‘==> file <==’, and is separated from the previous file by a blank line.

There are three options you should know.

The first option is ‘-n’. The ‘-n’ option will let you specify the number of lines to be displayed. For example, to show the first 20 lines of the file ‘test’ the command would be ‘head -n 20 test’.
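A quick way to check the default and the ‘-n’ option is to use a file where every line holds its own line number. The sketch below assumes the ‘seq’ command is available to build such a file; the file name ‘test’ matches the example above.

```shell
# Work in a throwaway folder with a 30-line file: line N contains the number N.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 30 > test

# Default: the first ten lines.
head test

# The first 20 lines.
head -n 20 test

# Count what each command printed and look at the last line shown.
default_count=$(head test | wc -l)
last_of_twenty=$(head -n 20 test | tail -n 1)
echo "default shows $default_count lines"
echo "head -n 20 ends at line $last_of_twenty"
```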

To specify the number of lines you can use a plain number as shown in the example. If the number is large then you can add a multiplier suffix to it instead of typing it out in full. The suffixes are as follows:
  • b 512
  • kB 1000
  • K 1024
  • MB 1000x1000
  • M 1024x1024
  • GB 1000x1000x1000
  • G 1024x1024x1024
  • The convention continues with T, P, E, Z and Y

So, to specify the first 2048 lines of the file ‘test’ the command would be ‘head -n 2K test’.
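The suffixes can be checked by counting the lines that come back. This sketch assumes the GNU version of ‘head’, since other implementations may not accept suffixes, and uses ‘seq’ to build a 3000-line file.

```shell
# Work in a throwaway folder with a 3000-line sample file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 3000 > test

# 2K means 2 x 1024 lines; 1kB means 1 x 1000 lines.
lines_2K=$(head -n 2K test | wc -l)
lines_1kB=$(head -n 1kB test | wc -l)

echo "head -n 2K  printed $lines_2K lines"
echo "head -n 1kB printed $lines_1kB lines"
```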

The second option is ‘-q’, which prevents the file name from being displayed as a header. By default headers are shown only when you list multiple files, so you know the file name of the file being displayed. To show the default of ten lines, but no headers, for the files ‘test’ and ‘test1’ the command is ‘head -q test test1’.

The third option is ‘-v’ which is ‘verbose’. The option displays the file name in the header even when only one file is listed. If both ‘-q’ and ‘-v’ are given, the GNU version of ‘head’ uses whichever one appears last on the command line.
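The effect of the header options is easiest to see with two tiny files. The sketch assumes GNU ‘head’ and its ‘==> name <==’ header format; the file contents are made up.

```shell
# Work in a throwaway folder with two one-line files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'from test\n' > test
printf 'from test1\n' > test1

# Default with two files: a header for each file and a blank line between them.
head -n 1 test test1

# -q: the same lines without any headers.
head -q -n 1 test test1

# -v: a header even though only one file is listed.
head -v -n 1 test

# Keep some of the output for inspection.
default_lines=$(head -n 1 test test1 | wc -l)
quiet_lines=$(head -q -n 1 test test1 | wc -l)
verbose_header=$(head -v -n 1 test | head -n 1)
```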

For more information on the ‘head’ command type ‘head --help’ in a terminal.

tail

The ‘tail’ command is the opposite of the ‘head’ command. Instead of displaying the beginning lines of a file the last lines of a file are shown. The default is to show the last ten lines of the specified file.

The basic syntax for the ‘tail’ command is:

tail [options] [file1] [file2]…

If multiple files are listed then the ends of all of the files will be listed with a header specifying which file is being displayed. Of course, this assumes no options are being used to override the default output.

There are five output parameters you may want to be familiar with for the ‘tail’ command.

The first parameter is ‘-n’. Like the command ‘head’ the ‘-n’ parameter allows you to set the number of lines to be displayed. The same convention can be used for printing a large number of lines (b, kB, K, M, MB, etc).

If you want to start printing at a certain line and continue to the end of the file then use the parameter ‘-n +#’ where ‘#’ is the line number you wish to start with. For example, to show the file ‘test’ from line 17 to the end the command would be ‘tail -n +17 test’.
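The same numbered test file used for ‘head’ makes the ‘tail’ options easy to verify. This is a sketch that assumes ‘seq’ is available and builds a 30-line file named ‘test’ in a throwaway folder.

```shell
# Work in a throwaway folder with a 30-line file: line N contains the number N.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 30 > test

# Default: the last ten lines (21 through 30).
tail test

# The last 5 lines (26 through 30).
tail -n 5 test

# Everything from line 17 to the end of the file.
tail -n +17 test

# Record where each output starts and how long the last one is.
default_start=$(tail test | head -n 1)
last_five_start=$(tail -n 5 test | head -n 1)
from_17_start=$(tail -n +17 test | head -n 1)
from_17_count=$(tail -n +17 test | wc -l)
```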

The next parameter, ‘-f’ (follow), is used for files which are still being written to by a program. A plain ‘tail -n 5 test’ shows the last 5 lines of the file ‘test’ as it was last saved and then exits. If lines are being added to the file then those 5 lines will not be the final last 5 lines when the file is done being edited. With ‘-f’ the ‘tail’ command stays running and prints new lines as they are added. The ‘--pid=#’ parameter ties ‘tail’ to the program which has the file opened, so you first need to find the PID of that program. You can use the program ‘top’, or if it is installed, ‘htop’. Once you have the PID number, for example 5829, you could issue the command ‘tail -n 5 -f --pid=5829 test’. Note the two dashes in ‘--pid’. The current last 5 lines will be displayed, followed by any new lines as they are saved. If the program rewrites the whole file when it saves, ‘tail’ may report the file as truncated and list its contents again. The command will remain active until you press CTRL+C, close the Terminal, or the program with the PID 5829 is closed.

If you want to just watch the added lines as they are saved then do not use the ‘--pid=#’ parameter. For example, to watch the file ‘test’ the command is ‘tail -f test’. As the file is saved the new lines will be listed. If the program which has the file opened is closed the ‘tail’ command will keep running until you stop it with CTRL+C. Specifying the ‘--pid=#’ parameter makes ‘tail’ stop on its own when the program with that PID is closed.
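Following a file can be tried without a real editor by letting a background job play the part of the program writing the file. This sketch assumes GNU ‘tail’ (for ‘--pid’); the background writer, its timing and the file ‘followed.txt’ are made up for the example, and it takes about three seconds to run.

```shell
# Work in a throwaway folder with a file that already holds one line.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'existing line\n' > test

# A stand-in "program" that appends two lines, waits a moment, then exits.
( sleep 1; echo 'new line 1' >> test
  sleep 1; echo 'new line 2' >> test
  sleep 1 ) &
writer_pid=$!

# Follow the file. Because of --pid, tail stops on its own once the
# writer has exited, so no CTRL+C is needed.
tail -f --pid="$writer_pid" test > followed.txt

# Show everything tail printed while it was following the file.
cat followed.txt
```

Without ‘--pid’ the ‘tail -f’ line would keep running after the writer had finished, and the script would never reach the final ‘cat’.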

The fourth parameter is ‘-q’ which will prevent headers from being displayed. The header is the file name of the file being shown. It is best to not prevent headers especially when multiple files are being shown.

The last parameter to know is ‘-v’. The ‘-v’ parameter forces the headers to be shown, even for a single file. If both ‘-q’ and ‘-v’ are given, the GNU version of ‘tail’ uses whichever one appears last on the command line.

For more information on the ‘tail’ command type ‘tail --help’ in a terminal.

more

When viewing a file which has more content than can fit on the screen it is difficult to read the lines as they scroll past. The ‘more’ command is used to show a specified file one screen at a time.

The basic syntax for the ‘more’ command is:

more [options] [file1] [file2]…

By default ‘[file1]’ will be displayed with a header. You can use the ‘spacebar’ to scroll ahead a single page. Once the end of ‘[file1]’ is reached and a scroll forward key is used then ‘[file2]’ is shown. If ‘[file2]’ was not specified on the command-line then the command will end. The ‘ENTER’ key can be used to move forward one line at a time.

There are three basic parameters to know for the ‘more’ command.

The first parameter is the ‘-#’ option. Here, the number specifies how many lines are considered a full screen. By default this follows the size of the Terminal window, so the larger the window the more lines there are for viewing. You can specify a number of lines that is more than a screen or less than a screen. If no parameter is given then the number of lines used is whatever fits a full screen.

The second parameter is the ‘+#’ option. The number given in the option is the line number to start displaying the file. To skip the first ten lines of the file ‘test’ the command would be ‘more +11 test’.

The last parameter is for dealing with a search string. The option is ‘+/string’. Here, ‘string’ is the text being searched for in a file. The display starts at the first match to ‘string’. For example, if you needed to display the totals in the file ‘test’ you could search for ‘Total’ by using the command ‘more +/Total test’. Remember that the value for ‘string’ is case-sensitive.

For more information on the ‘more’ command type ‘more --help’ in a terminal.

less

The ‘less’ command works in a similar manner as the ‘more’ command. The big difference is that the ‘less’ command has capabilities that ‘more’ does not.

To start, let’s look at the general syntax for the ‘less’ command:

less [options] [file]…

Once a file is loaded on the screen you should have a display of the file’s contents on the screen. There are key commands to move about the document. The key presses are as follows:
  • e – forward one line
  • y – backward one line
  • f – forward one screen
  • b – backward one screen
  • d – forward half a screen
  • u – backward half a screen
With the ‘less’ command a search can be performed within the loaded file. The search commands within the ‘less’ display are:

  • /string – search forward for the ‘string’
  • ?string – search backward for the ‘string’
  • &string – display only the lines with ‘string’
You can also maneuver a little faster within the document. The keys to use for these quicker jumps are:
  • g – first line
  • G – last line
These simple parameters can help work around a loaded document. When needing to look through large files for certain information the ‘less’ command can come in very handy.

Once you are finished with the ‘less’ command type ‘q’ to quit the program.

Let’s look at an example. Open a Terminal and type the following commands:
  • cd /tmp
  • less --help > less.txt
  • less less.txt
Now you should be in the ‘/tmp’ folder where you created a file named ‘less.txt’ which has the contents of the help information of the ‘less’ command.

First, let’s do a search for the word ‘LeftArrow’. Type ‘/LeftArrow’ without the single quotes and press ENTER. You should move down in the document to the first occurrence of the word ‘LeftArrow’. You can search backwards using the ‘?’ key. So, type in ‘?marked’ and you should move backwards in the document. You can try to perform a few more searches and moving about the document using the keystrokes. When you are done press ‘q’ to quit the ‘less’ command.

As in the example, you can see the information for the ‘less’ command by typing ‘less --help’ in a terminal. The screen which is then displayed is inside the ‘less’ command itself so you can use the same shortcut keys discussed in this section.

Remember, practice these commands and be familiar with them. Whether you are going to take the LFCS exam or not, the commands are useful.
 