Linux AWK command with examples
The awk command in Linux is used for text processing and manipulation. Here are its basic syntax and some of the most commonly used options:
- Basic Syntax:
awk 'pattern { action }' file
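- Options:
-F fs: Use fs as the field separator (default is whitespace).
-v var=value: Define a variable for use within the awk program.
-f file: Read the awk program from a file instead of specifying it on the command line.
-i inplace: Edit the file in place (GNU awk only; this loads gawk's inplace extension).
-W option: Pass an implementation-specific option; in GNU awk, -W option is the POSIX-style spelling of the long option --option. It does not assign environment variables; those are read inside the program through the ENVIRON array.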
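As a rough illustration of how -F and -v work together, the sketch below filters colon-separated records against a threshold. The sample data and the variable name min are made up for this example.

```shell
# Hypothetical sample: name:age pairs separated by colons.
# -F: sets the field separator; -v min=28 defines an awk variable before the program runs.
result=$(printf 'alice:30\nbob:25\ncarol:41\n' | awk -F: -v min=28 '$2 >= min {print $1}')
echo "$result"
```

Only the names whose second field is at least min are printed.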
Here are a few simple examples of using awk:
- Print every line of a file
Suppose we have a file named "data.txt" that contains some text. We want to use awk to print out every line of this file. The awk command to accomplish this would be:
awk '{print}' data.txt
Here, we are using the print statement with no arguments, which tells awk to print the entire line. This is equivalent to running cat data.txt, but it is a useful base to build on when we want to add a pattern or transform the line before piping the output to another command.
- Print the number of lines in a file
Suppose we have a file named "data.txt" that contains some text. We want to use awk to print out the number of lines in this file. The awk command to accomplish this would be:
awk 'END {print NR}' data.txt
Here, we are using the NR variable, which holds the number of records (lines) read so far. The END keyword tells awk to perform this action after it has processed all of the lines in the file, at which point NR equals the total number of lines.
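To make the behaviour of NR concrete, here is a small sketch that prints NR on every line and again in the END block. The three-line input is hypothetical.

```shell
# NR grows by one per input line; in END it holds the total count.
result=$(printf 'a\nb\nc\n' | awk '{print NR ": " $0} END {print "total: " NR}')
echo "$result"
```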
- Print the sum of a column of numbers
Suppose we have a file named "data.txt" that contains a column of numbers. We want to use awk to print out the sum of these numbers. The awk command to accomplish this would be:
awk '{sum += $1} END {print sum}' data.txt
Here, we are using a variable named sum to accumulate the values of the first field on each line. The END keyword tells awk to print the final value of sum after it has processed all of the lines in the file.
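A quick way to try this without a file is to pipe a few numbers in. The sketch below uses made-up values and also shows one small, optional variation: printing sum + 0, so that empty input yields 0 rather than a blank line.

```shell
# Hypothetical column of numbers, one per line.
total=$(printf '10\n20\n12.5\n' | awk '{sum += $1} END {print sum + 0}')
echo "$total"

# With no input lines, sum is never set; adding 0 forces a numeric 0.
empty=$(printf '' | awk '{sum += $1} END {print sum + 0}')
echo "$empty"
```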
- Print lines that match a pattern
Suppose we have a file named "data.txt" that contains some text. We want to use awk to print out only the lines that contain the word "hello". The awk command to accomplish this would be:
awk '/hello/ {print}' data.txt
Here, we are using the /hello/ pattern to match lines that contain the text "hello" anywhere in the line, including inside longer words. The {print} action tells awk to print these matching lines.
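Because /hello/ is a regular expression match, it is case-sensitive and also matches inside longer words. The sketch below, using a made-up three-line input, compares the plain pattern with two possible variations: requiring a whitespace-separated field to be exactly hello, and ignoring case with tolower().

```shell
# Hypothetical input: "hello" as a word, inside another word, and capitalized.
input='hello world
othello is a play
say Hello'

# Plain regex match: also matches "othello", but not "Hello".
substring=$(printf '%s\n' "$input" | awk '/hello/')

# Print the line only if some field is exactly "hello".
exact=$(printf '%s\n' "$input" | awk '{for (i = 1; i <= NF; i++) if ($i == "hello") {print; next}}')

# Case-insensitive match by lowercasing the line before testing it.
anycase=$(printf '%s\n' "$input" | awk 'tolower($0) ~ /hello/')

echo "$substring"
echo "$exact"
echo "$anycase"
```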
- Print fields that match a pattern
Suppose we have a file named "data.txt" that contains some data. We want to use awk to print out only the fields that contain the word "hello". The awk command to accomplish this would be:
awk '{for(i=1; i<=NF; i++) if($i ~ /hello/) print $i}' data.txt
Here, we are using a for loop to iterate over each field in the line (NF is a built-in variable that represents the number of fields on the line). The if statement checks if the current field matches the /hello/ pattern, and if it does, we use print to output the field.
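Run against a small, made-up input, the loop prints each matching field on its own line:

```shell
# Hypothetical input; only fields containing "hello" are printed.
fields=$(printf 'hello there\nsay hello-world to othello\nnothing here\n' |
    awk '{for (i = 1; i <= NF; i++) if ($i ~ /hello/) print $i}')
echo "$fields"
```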
- Calculate the average of a column of numbers
Suppose we have a file named "data.txt" that contains two columns of numbers. We want to use awk to print out the average of the second column. The awk command to accomplish this would be:
awk '{sum += $2} END {print sum/NR}' data.txt
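Dividing by NR stops with a division-by-zero error when the file is empty. One possible guard, sketched below with hypothetical data, keeps its own counter n and only divides when at least one line was read. The name n and the "no data" message are illustrative choices.

```shell
# Hypothetical two-column data: a label and a value.
avg=$(printf 'a 10\nb 20\nc 60\n' |
    awk '{sum += $2; n++} END {if (n > 0) print sum / n; else print "no data"}')
echo "$avg"

# Empty input: the guard avoids dividing by zero.
none=$(printf '' |
    awk '{sum += $2; n++} END {if (n > 0) print sum / n; else print "no data"}')
echo "$none"
```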
- Extract a specific column
Suppose we have a file named "data.txt" that contains several columns of data, separated by whitespace. We want to use awk to extract only the second column of data. The awk command to accomplish this would be:
awk '{print $2}' data.txt
Here, we are using the print statement to output the second field of each line ($2). This is a common operation in text processing, and awk makes it easy to extract specific columns of data.
- Use a different field separator
Suppose we have a file named "data.txt" that contains several columns of data, separated by commas instead of whitespace. We want to use awk to extract only the third column of data. The awk command to accomplish this would be:
awk -F, '{print $3}' data.txt
Here, we are using the -F option to specify that the field separator is a comma instead of whitespace. This allows awk to correctly identify and extract the fields in the input file.
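The input separator set with -F has an output counterpart, the built-in variable OFS, which print places between comma-separated arguments. The sketch below reads made-up comma-separated data and writes two of the columns separated by a tab.

```shell
# Hypothetical CSV input; -F, splits on commas, OFS sets the output separator to a tab.
csv=$(printf 'id,name,city\n1,Ann,Oslo\n2,Raj,Pune\n' | awk -F, -v OFS='\t' '{print $3, $1}')
echo "$csv"
```

Note that splitting on a bare comma does not understand quoted CSV fields that themselves contain commas.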
- Replace text in a file
Suppose we have a file named "data.txt" that contains some text, and we want to use awk to replace all occurrences of the word "apple" with the word "pear".
The awk command to accomplish this would be:
awk '{gsub(/apple/, "pear"); print}' data.txt
Here, we are using the gsub() function to perform a global search and replace operation on each input line. The first argument specifies the pattern to search for (/apple/), and the second argument specifies the replacement text ("pear"). The print statement outputs the modified lines to standard output; the file itself is not changed.
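Since awk writes its result to standard output, one common way to actually update the file is to redirect to a new file and then move it over the original. The sketch below does this in a throwaway directory with made-up content, and also uses the return value of gsub(), which is the number of substitutions made.

```shell
# Work in a temporary directory so no real file is touched.
dir=$(mktemp -d)
printf 'apple pie\npineapple and apple\nplum\n' > "$dir/data.txt"

# Write the modified text to a new file, then replace the original.
awk '{gsub(/apple/, "pear"); print}' "$dir/data.txt" > "$dir/data.txt.new" &&
    mv "$dir/data.txt.new" "$dir/data.txt"

replaced=$(cat "$dir/data.txt")
echo "$replaced"

# gsub() returns how many replacements it made on the line; sum them up.
count=$(printf 'apple pie\npineapple and apple\nplum\n' |
    awk '{n += gsub(/apple/, "pear")} END {print n + 0}')
echo "$count"

rm -rf "$dir"
```

The pattern matches inside longer words as well, so "pineapple" becomes "pinepear" here.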
These are just a few simple examples. Below are some more detailed examples with awk.
- Suppose we have a file named "data.txt" that contains the following data:
apple orange banana
carrot tomato broccoli
We want to use awk to print out only the first column of this file. The awk command to accomplish this would be:
awk '{print $1}' data.txt
Here's a breakdown of how this command works:
'{print $1}': Specifies the action to perform on each line of the input file. In this case, we use the print statement to output the first field of each line (fields are separated by whitespace). $1 represents the first field.
data.txt: Specifies the input file to process.
When we run this command, we get the following output:
apple
carrot
- Suppose we have a CSV file named "data.csv" that contains the following data:
Name, Age, Gender
John, 32, M
Jane, 25, F
Bob, 47, M
Alice, 19, F
We want to use awk to extract the names and ages of all the people in the file, and print them out in the format "Name (Age)".
The awk command to accomplish this would be:
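awk -F', *' 'NR > 1 {printf "%s (%s)\n", $1, $2}' data.csv
When we run this command, we get the following output:
John (32)
Jane (25)
Bob (47)
Alice (19)
Here's a breakdown of how this command works:
-F', *': Specifies that the field separator is a comma followed by any number of spaces, so the spaces after the commas do not end up in the fields.
NR > 1: Skips the first line, which holds the column headers.
'{printf "%s (%s)\n", $1, $2}': Here we use the printf statement to format the output string as "Name (Age)", where $1 represents the first field (Name) and $2 represents the second field (Age). The \n at the end of the string adds a newline character to the output.
data.csv: Specifies the input file to process.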
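In this file each comma is followed by a space, and splitting on a bare comma leaves that space at the start of the field. One possible alternative, sketched below on a shortened copy of the data, keeps -F, and trims each field with gsub() before printing. This is only an illustration, not the only way to handle it.

```shell
# Shortened sample of the CSV data above; the header line is skipped with NR > 1.
out=$(printf 'Name, Age, Gender\nJohn, 32, M\nJane, 25, F\n' | awk -F, 'NR > 1 {
    for (i = 1; i <= NF; i++) gsub(/^ +| +$/, "", $i)   # trim spaces around each field
    printf "%s (%s)\n", $1, $2
}')
echo "$out"
```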
- Count the number of lines in a file
Suppose we have a file named "data.txt" that contains some text. We want to use awk to count the number of lines in the file. The awk command to accomplish this would be:
awk 'END {print NR}' data.txt
Here, we are using the NR built-in variable to keep track of the number of lines processed by awk. The END pattern matches after all lines have been processed, and the print statement outputs the final count.
- Convert all text to uppercase
Suppose we have a file named "data.txt" that contains some text, and we want to use awk to convert all the text to uppercase. The awk command to accomplish this would be:
awk '{print toupper($0)}' data.txt
Here, we are using the toupper() function to convert the entire input line ($0) to uppercase. The print statement outputs the modified lines.
- Combine fields into a single string
Suppose we have a file named "data.txt" that contains several columns of data, separated by whitespace. We want to use awk to combine the second and third columns into a single string, separated by a hyphen. The awk command to accomplish this would be:
awk '{print $2 "-" $3}' data.txt
Here, we rely on awk's string concatenation, which has no explicit operator: placing $2, the string "-", and $3 side by side joins them into a single string. The print statement outputs the combined string for each line.
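The same result can also be produced by setting the output field separator OFS to a hyphen and separating the fields with a comma in print. The sketch below runs both forms on made-up input so they can be compared.

```shell
# Concatenation by juxtaposition: $2, the literal "-", and $3 are joined directly.
joined=$(printf 'x alpha beta\ny gamma delta\n' | awk '{print $2 "-" $3}')

# Alternative: let print insert OFS between comma-separated arguments.
with_ofs=$(printf 'x alpha beta\ny gamma delta\n' | awk -v OFS=- '{print $2, $3}')

echo "$joined"
echo "$with_ofs"
```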
- Suppose we have a log file named "access.log" that contains records of website access. Each record consists of several fields, separated by spaces. Here is an example of a record:
192.168.1.1 - - [01/Mar/2023:10:00:01 -0500] "GET /index.html HTTP/1.1" 200 1234
We want to extract the IP address, timestamp, HTTP status code, and number of bytes transferred for each record, and print them out in a formatted table.
The awk command to accomplish this would be:
awk '{printf "%-15s %-26s %-10s %-10s\n", $1, substr($4, 2), $9, $10}' access.log
Here's a breakdown of how this command works:
'{printf "%-15s %-26s %-10s %-10s\n", $1, substr($4, 2), $9, $10}': Specifies the action to perform on each line of the input file. In this case, we use the printf statement to format the output as a table with columns for the IP address, timestamp (extracted from the fourth field), HTTP status code, and number of bytes transferred. The %-prefixed codes are placeholders for each field, and the - flag specifies left alignment for each field. The substr function extracts the timestamp from the fourth field, starting from the second character (to remove the leading bracket).
access.log: Specifies the input file to process.
When we run this command, we get output like this (the timezone offset, -0500, falls in the fifth field and so is not printed):
192.168.1.1     01/Mar/2023:10:00:01       200        1234
192.168.1.2     01/Mar/2023:10:00:02       404        0
192.168.1.3     01/Mar/2023:10:00:03       200        5678
192.168.1.4     01/Mar/2023:10:00:04       200        9012
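To see why $9 and $10 are the status code and byte count, it can help to print every field of the sample record next to its number. The sketch below does that, and then shows one possible way to append the timezone offset from the fifth field to the timestamp. The variable names are illustrative.

```shell
# The sample record from above.
record='192.168.1.1 - - [01/Mar/2023:10:00:01 -0500] "GET /index.html HTTP/1.1" 200 1234'

# Print each whitespace-separated field with its field number.
fields=$(printf '%s\n' "$record" | awk '{for (i = 1; i <= NF; i++) printf "$%d = %s\n", i, $i}')
echo "$fields"

# Join $4 (minus the leading bracket) and $5 (minus the trailing bracket).
with_tz=$(printf '%s\n' "$record" | awk '{print substr($4, 2) " " substr($5, 1, length($5) - 1)}')
echo "$with_tz"
```

With the default separator, the bracketed timestamp is split across $4 and $5, and the quoted request is split across $6, $7 and $8.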
These are just a few examples of how to use the awk command in Linux.