AWK Command in Linux

Pawan Kumar Yadav
3 min read · Jun 29, 2024

AWK Introduction:

AWK is a powerful text-processing tool and scripting language used in Unix and Linux systems. It’s named after its creators: Aho, Weinberger, and Kernighan.

AWK is particularly useful for:

  1. Processing text files
  2. Extracting and manipulating data
  3. Generating formatted reports

Basic AWK Structure:

An AWK command typically follows this structure:

awk 'pattern { action }' input_file
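  • pattern: Optional. A condition that selects the lines the action runs on. If omitted, the action runs on every line.
  • action: Optional. The operation to perform when the pattern matches. If omitted, the matching line is printed.
  • input_file: The file to process. If omitted, AWK reads standard input.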
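
To make the structure concrete, here is a small sketch that runs a pattern-only, an action-only, and a pattern-plus-action program over the same input. The three input lines and the shell variable names are made up for illustration.

```shell
# Hypothetical three-line input, used only for illustration.
input='apple 3
banana 12
cherry 7'

# Pattern only: the default action prints the whole matching line.
pattern_only=$(printf '%s\n' "$input" | awk '$2 > 5')

# Action only: with no pattern, the action runs on every line.
action_only=$(printf '%s\n' "$input" | awk '{ print $1 }')

# Pattern and action together: print the first field of matching lines.
both=$(printf '%s\n' "$input" | awk '$2 > 5 { print $1 }')

printf '%s\n' "$pattern_only" "---" "$action_only" "---" "$both"
```

The first program prints the full "banana" and "cherry" lines, the second prints all three names, and the third prints only "banana" and "cherry".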

Key Concepts:

  1. Records and Fields:
  • AWK processes input line by line, with each line called a “record”.
  • Each record is automatically split into “fields” based on whitespace (by default).
  • Fields are accessed using $1, $2, $3, etc. $0 represents the entire line.

2. Built-in Variables:

  • NR: Current record number
  • NF: Number of fields in the current record
  • FS: Field separator (default: runs of spaces and tabs)
  • OFS: Output field separator (default: a single space)

3. Patterns:

  • Can be regular expressions, comparisons, or special patterns like BEGIN and END.

4. Actions:

  • Enclosed in curly braces {}
  • Can include print statements, calculations, control structures, etc.
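
As a minimal sketch of how records, fields, and the built-in variables behave, the snippet below recreates the sample employee lines inline and prints NR, NF, and the last field, then rebuilds each record with a different OFS. The shell variable names are only for this illustration.

```shell
# The article's sample data, recreated inline so the snippet is self-contained.
employees='John Doe 50000
Jane Smith 55000
Bob Johnson 48000'

# NR is the record number, NF the field count, and $NF the last field.
numbered=$(printf '%s\n' "$employees" | awk '{ print NR, NF, $NF }')

# Assigning to any field makes awk rebuild $0 with OFS between fields.
rejoined=$(printf '%s\n' "$employees" |
    awk 'BEGIN { OFS = "," } { $1 = $1; print }')

printf '%s\n' "$numbered" "$rejoined"
```

The `$1 = $1` assignment changes nothing by itself; it is a common idiom for forcing the record to be rejoined with the new output separator.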

Example:

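Let’s say we have a file named “employees.txt” with this content. The terminal sessions shown later run the same commands against emp.txt, which holds the same three lines: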

John Doe 50000
Jane Smith 55000
Bob Johnson 48000

Patterns in AWK:

  1. Regular Expressions:
awk '/John/' employees.txt  # Prints lines containing "John" (both "John Doe" and "Bob Johnson")

2. Relational Expressions:

awk '$3 > 50000' employees.txt  # Prints lines where the 3rd field is greater than 50000

3. Special Patterns:

  • BEGIN: Executed before processing any input
  • END: Executed after processing all input
awk 'BEGIN {print "Employee List:"} {print $0} END {print "End of list."}' employees.txt
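
The pattern types can also be restricted to one field or combined. The following is a sketch under the assumption that the sample data is fed in on standard input; the shell variable names are illustrative only.

```shell
# The article's sample data, recreated inline.
employees='John Doe 50000
Jane Smith 55000
Bob Johnson 48000'

# Match the regex against one field only: first names starting with "J".
first_j=$(printf '%s\n' "$employees" | awk '$1 ~ /^J/')

# Combine conditions with && (and), || (or), and ! (not).
mid_range=$(printf '%s\n' "$employees" |
    awk '$3 >= 48000 && $3 < 55000 { print $1, $2 }')

# END runs once after the last record, so NR holds the total line count.
total=$(printf '%s\n' "$employees" | awk 'END { print NR }')

printf '%s\n' "$first_j" "$mid_range" "$total"
```

Unlike a bare /John/, the field-anchored test `$1 ~ /^J/` does not match "Bob Johnson", because the regex is applied to the first field only.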

Actions in AWK:

  1. Print Statement:
awk '{print $1, $3}' employees.txt  # Prints first and third fields

2. Formatted Print:

awk '{printf "Name: %-15s Salary: $%d\n", $1 " " $2, $3}' employees.txt

In the sessions below, %-15s left-aligns the name in a 15-character column, %-20s widens that column to 20 characters, and %20s right-aligns the name instead:

root@MMLITPAWANY /tmp# awk '{printf "Name: %-15s Salary: $%d\n", $1 " " $2, $3}' emp.txt
Name: John Doe        Salary: $50000
Name: Jane Smith      Salary: $55000
Name: Bob Johnson     Salary: $48000
root@MMLITPAWANY /tmp# awk '{printf "Name: %-20s Salary: $%d\n", $1 " " $2, $3}' emp.txt
Name: John Doe             Salary: $50000
Name: Jane Smith           Salary: $55000
Name: Bob Johnson          Salary: $48000
root@MMLITPAWANY /tmp# awk '{printf "Name: %20s Salary: $%d\n", $1 " " $2, $3}' emp.txt
Name:             John Doe Salary: $50000
Name:           Jane Smith Salary: $55000
Name:          Bob Johnson Salary: $48000

3. Conditional Statements:

awk '{if ($3 > 50000) print $1 " " $2 " is highly paid"}' employees.txt
root@MMLITPAWANY /tmp# awk '{if ($3 > 50000) print $1 " " $2 " is highly paid"}' emp.txt
Jane Smith is highly paid
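
A conditional can also have else branches. Here is a hedged sketch that labels each salary; the band names and thresholds are arbitrary choices for this example, not something the sample data defines.

```shell
# The article's sample data, recreated inline.
employees='John Doe 50000
Jane Smith 55000
Bob Johnson 48000'

# if / else if / else chain that picks a label for each salary.
bands=$(printf '%s\n' "$employees" | awk '{
    if ($3 > 50000) {
        band = "high"
    } else if ($3 == 50000) {
        band = "mid"
    } else {
        band = "low"
    }
    print $1, band
}')

printf '%s\n' "$bands"
```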

Advanced AWK Concepts:

  1. Custom Field Separator:
awk -F: '{print $1}' /etc/passwd  # Uses colon as field separator
root@MMLITPAWANY /tmp# awk -F: '{print $1}' /etc/passwd 
root
bin
daemon
adm
lp
sync
shutdown
halt
mail
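
The separator can also be set inside the program. This sketch assumes two made-up colon-separated records in the style of /etc/passwd, so it does not depend on the real file.

```shell
# Hypothetical colon-separated records, invented for illustration.
records='alice:x:1001:/home/alice
bob:x:1002:/home/bob'

# Setting FS in BEGIN has the same effect as -F: on the command line.
# OFS is what print places between its comma-separated arguments.
pairs=$(printf '%s\n' "$records" |
    awk 'BEGIN { FS = ":"; OFS = " -> " } { print $1, $4 }')

printf '%s\n' "$pairs"
```

Setting FS in a BEGIN block keeps the separator with the program, which is convenient once the AWK code lives in its own file.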

2. Array Usage:

awk '{count[$1]++} END {for (name in count) print name, count[name]}' employees.txt
root@MMLITPAWANY /tmp# awk '{count[$1]++} END {for (name in count) print name, count[name]}' emp.txt 
Bob 1
John 1
Jane 1
root@MMLITPAWANY /tmp# awk '{count[$2]++} END {for (name in count) print name, count[name]}' emp.txt
Johnson 1
Smith 1
Doe 1
root@MMLITPAWANY /tmp# awk '{count[$3]++} END {for (name in count) print name, count[name]}' emp.txt
48000 1
50000 1
55000 1
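
Arrays are just as useful for totals as for counts. The sketch below sums a value per key, using hypothetical department and salary pairs that are not part of the sample file. Because AWK does not define the order in which `for (key in array)` visits keys, the result is piped through sort.

```shell
# Hypothetical department/salary pairs, invented for illustration.
payroll='sales 50000
ops 48000
sales 55000
ops 52000'

# Accumulate a total per department, then sort for a stable order.
totals=$(printf '%s\n' "$payroll" |
    awk '{ sum[$1] += $2 } END { for (dept in sum) print dept, sum[dept] }' |
    sort)

printf '%s\n' "$totals"
```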

3. Built-in Functions:

  • length(), substr(), tolower(), toupper(), etc.
awk '{print tolower($1), length($0)}' employees.txt
root@MMLITPAWANY /tmp# awk '{print tolower($1), length($0)}' emp.txt 
john 14
jane 16
bob 17
root@MMLITPAWANY /tmp# awk '{print toupper($0), length($0)}' emp.txt
JOHN DOE 50000 14
JANE SMITH 55000 16
BOB JOHNSON 48000 17
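
A few of the other string functions, sketched on the same sample data. The variable names inside the program (initials, pos, n, parts) are illustrative.

```shell
# The article's sample data, recreated inline.
employees='John Doe 50000
Jane Smith 55000
Bob Johnson 48000'

# substr(s, start, len) uses 1-based positions.
# index(s, t) returns the position of t in s, or 0 if it is absent.
# split(s, arr, sep) fills arr and returns the number of pieces.
summary=$(printf '%s\n' "$employees" | awk '{
    initials = substr($1, 1, 1) substr($2, 1, 1)
    pos = index($0, $3)
    n = split($0, parts, " ")
    print initials, pos, n
}')

printf '%s\n' "$summary"
```

Each output line holds the initials, the column where the salary starts, and the number of pieces split() produced.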

4. User-Defined Functions:

awk '
function capitalize(string) {
    return toupper(substr(string, 1, 1)) substr(string, 2)
}
{print capitalize($1), capitalize($2)}
' employees.txt
root@MMLITPAWANY /tmp# awk '
function capitalize(string) {
    return toupper(substr(string, 1, 1)) substr(string, 2)
}
{print capitalize($1), capitalize($2)}
' emp.txt
John Doe
Jane Smith
Bob Johnson
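
Since the sample names are already capitalized, the effect of a function like this is easier to see on input in mixed case. The following sketch uses made-up names and a variant that also lowercases the rest of the word, which is an addition for this example.

```shell
# Hypothetical input in mixed case, invented for illustration.
names='john doe
jANE smith'

titled=$(printf '%s\n' "$names" | awk '
function capitalize(s) {
    # Uppercase the first character and lowercase the rest.
    return toupper(substr(s, 1, 1)) tolower(substr(s, 2))
}
{ print capitalize($1), capitalize($2) }')

printf '%s\n' "$titled"
```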

5. Multi-line AWK Scripts:

You can write more complex AWK scripts in a separate file and execute them:

Save this as process_employees.awk:

#!/usr/bin/awk -f
BEGIN { print "Processing file:" }
{
    print "Line " NR ": " $0
    total += $3
}
END {
    print "Total salary: " total
    print "Average salary: " total/NR
}

Then run it:

awk -f process_employees.awk employees.txt
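
One caveat with averaging in END is that NR is 0 for an empty input, so the division fails. A possible guard is sketched below; the script name summary.awk, the temporary directory, and the output wording are assumptions made for this example.

```shell
# Work in a temporary directory so no existing files are touched.
workdir=$(mktemp -d)

# summary.awk is a hypothetical script name used only in this sketch.
cat > "$workdir/summary.awk" <<'EOF'
# Total and average of the third field, guarded against empty input.
{ total += $3 }
END {
    if (NR > 0) {
        printf "Total: %d, Average: %.2f\n", total, total / NR
    } else {
        print "No records"
    }
}
EOF

printf 'John Doe 50000\nJane Smith 55000\nBob Johnson 48000\n' > "$workdir/employees.txt"
: > "$workdir/empty.txt"

report=$(awk -f "$workdir/summary.awk" "$workdir/employees.txt")
empty_report=$(awk -f "$workdir/summary.awk" "$workdir/empty.txt")

printf '%s\n' "$report" "$empty_report"
rm -rf "$workdir"
```

Using printf with %.2f also keeps the average at a fixed number of decimals instead of relying on AWK's default number formatting.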
