Awk Command in Linux with Examples - MS TV Life.COM

Awk is a general-purpose scripting language designed for advanced text processing. It is mostly used as a reporting and analysis tool.

Unlike most other programming languages, which are procedural, awk is data-driven, which means that you define a set of actions to be performed against the input text. It takes the input data, transforms it, and sends the result to standard output.

This article covers the essentials of the awk programming language. Knowing the basics of awk will significantly improve your ability to manipulate text files on the command line.

How awk Works #

There are several different implementations of awk. We'll use the GNU implementation of awk, which is called gawk. On most Linux systems the awk interpreter is just a symlink to gawk.

Records and fields #

Awk can process textual data, either from files or streams. The input data is divided into records and fields. Awk operates on one record at a time until the end of the input is reached. Records are separated by a character called the record separator. The default record separator is the newline character, which means that each line in the text data is a record. A new record separator can be set using the RS variable.

Records consist of fields, which are separated by the field separator. By default, fields are separated by whitespace, including one or more tab, space, and newline characters.

The fields in each record are referenced by the dollar sign ($) followed by the field number, beginning with 1. The first field is represented with $1, the second with $2, and so on. The last field can also be referenced with the special variable $NF. The entire record can be referenced with $0.

Here is a visual representation showing how to reference records and fields:

tmpfs      788M  1.8M  786M   1% /run/lock 
/dev/sda1  234G  191G   31G  87% /
|-------|  |--|  |--|   |--| |-| |--------| 
   $1       $2    $3     $4   $5  $6 ($NF) --> fields
|-----------------------------------------| 
                    $0                     --> record
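
As a minimal sketch of the field references described above, assuming a POSIX shell and any awk implementation, the command below feeds one line of the illustration to awk. The variable names `line` and `fields` are only illustrative.

```shell
# One line of df-style output, copied from the illustration above.
line='/dev/sda1  234G  191G   31G  87% /'

# NF is the number of fields, $1 the first field, $NF the last one.
fields=$(printf '%s\n' "$line" | awk '{ print NF, $1, $5, $NF }')
echo "$fields"   # 6 /dev/sda1 87% /
```

Runs of spaces count as a single separator here, which is why the record has six fields regardless of the column padding.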

Awk program #

To process text with awk, you write a program that tells the command what to do. The program consists of a series of rules and user-defined functions. Each rule contains one pattern and action pair. Rules are separated by newlines or semicolons (;). Typically, an awk program looks like this:

pattern { action }
pattern { action }
...

When awk processes data, if the pattern matches the record, it performs the specified action on that record. When the rule has no pattern, all records (lines) are matched.

An awk action is enclosed in braces ({}) and consists of statements. Each statement specifies the operation to be performed. An action can have more than one statement separated by newlines or semicolons (;). If the rule has no action, it defaults to printing the whole record.

Awk supports different types of statements, including expressions, conditionals, input, output statements, and more. The most common awk statements are:

  • exit – Stops the execution of the whole program and exits.
  • next – Stops processing the current record and moves to the next record in the input data.
  • print – Prints records, fields, variables, and custom text.
  • printf – Gives you more control over the output format, similar to C and bash printf.
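
The following sketch shows how next and exit from the list above might be combined. It assumes a POSIX shell and holds the sample standings from this article in an illustrative shell variable named `teams`, so it does not depend on a file.

```shell
# Sample standings, kept in a variable so the sketch is self-contained.
teams='Bucks Milwaukee 60 22 0.732
Raptors Toronto 58 24 0.707
76ers Philadelphia 51 31 0.622
Celtics Boston 49 33 0.598
Pacers Indiana 48 34 0.585'

# next skips the 76ers record; exit stops at the first team under 49 wins.
result=$(printf '%s\n' "$teams" |
  awk '$1 == "76ers" { next } $3 < 49 { exit } { print $1 }')
echo "$result"
```

With this data the command prints Bucks, Raptors, and Celtics, then stops when it reaches the Pacers record.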

When writing awk programs, everything after the hash mark (#) and until the end of the line is considered to be a comment. Long lines can be broken into multiple lines using the continuation character, backslash (\).
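
A small sketch of both features, assuming a POSIX shell. The program runs entirely in a BEGIN rule, so it needs no input, and the names `sum` and `total` are illustrative.

```shell
total=$(awk 'BEGIN {
  # Everything after the hash mark on this line is ignored.
  sum = 1 + 2 \
      + 3          # the backslash joins this line to the previous one
  print sum
}')
echo "$total"   # 6
```

The backslash has to be the very last character on its line for the continuation to work.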

Executing awk programs #

An awk program can be run in several ways. If the program is short and simple, it can be passed directly to the awk interpreter on the command line:

awk 'program' input-file...

When running the program on the command-line, it should be enclosed in single quotes (''), so the shell doesn’t interpret the program.

If the program is large and complex, it is best to put it in a file and use the -f option to pass the file to the awk command:

awk -f program-file input-file...

In the examples below we will use a file named “teams.txt” that looks like the one below:

Bucks Milwaukee    60 22 0.732 
Raptors Toronto    58 24 0.707 
76ers Philadelphia 51 31 0.622
Celtics Boston     49 33 0.598
Pacers Indiana     48 34 0.585

Awk Patterns #

Patterns in awk control whether the associated action should be executed or not.

Awk supports different types of patterns, including regular expression, relational expression, range, and special expression patterns.

When the rule has no pattern, each input record is matched. Here is an example of a rule containing only an action:

awk '{ print $3 }' teams.txt

The program will print the third field of each record:

60
58
51
49
48

Regular expression patterns #

A regular expression or regex is a pattern that matches a set of strings. Awk regular expression patterns are enclosed in slashes (//):

/regex pattern/ { action }

The most basic example is a literal character or string matching. For example, to display the first field of each record that contains “0.5” you would run the following command. The dot is escaped because an unescaped dot in a regex matches any single character:

awk '/0\.5/ { print $1 }' teams.txt
Celtics
Pacers

The pattern can be any type of extended regular expression. Here is an example that prints the first field if the record starts with two or more digits:

awk '/^[0-9][0-9]/ { print $1 }' teams.txt
76ers
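
Extended regular expressions also support grouping and alternation. The sketch below, which assumes a POSIX shell and keeps the sample data in an illustrative variable named `teams`, prints the city of every record that starts with one of two team names.

```shell
teams='Bucks Milwaukee 60 22 0.732
Raptors Toronto 58 24 0.707
76ers Philadelphia 51 31 0.622
Celtics Boston 49 33 0.598
Pacers Indiana 48 34 0.585'

# ^ anchors the match at the start of the record; (A|B) matches either name.
cities=$(printf '%s\n' "$teams" | awk '/^(Bucks|Celtics) / { print $2 }')
echo "$cities"
```

For this data the output is Milwaukee followed by Boston.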

Relational expression patterns #

Relational expression patterns are generally used to match the content of a specific field or variable.

By default, regular expression patterns are matched against the records. To match a regex against a field, specify the field and use the “contain” comparison operator (~) against the pattern.

For example, to print the first field of each record whose second field contains “ia” you would type:

awk '$2 ~ /ia/ { print $1 }' teams.txt
76ers
Pacers

To match fields that do not contain a given pattern use the !~ operator:

awk '$2 !~ /ia/ { print $1 }' teams.txt
Bucks
Raptors
Celtics

You can compare strings or numbers for relationships such as, greater than, less than, equal, and so on. The following command prints the first field of all records whose third field is greater than 50:

awk '$3 > 50 { print $1 }' teams.txt
Bucks
Raptors
76ers
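
As a further sketch of relational patterns, assuming a POSIX shell and the sample data held in an illustrative variable named `teams`, the first command compares a field with a decimal number and the second tests a field for string equality.

```shell
teams='Bucks Milwaukee 60 22 0.732
Raptors Toronto 58 24 0.707
76ers Philadelphia 51 31 0.622
Celtics Boston 49 33 0.598
Pacers Indiana 48 34 0.585'

# Numeric comparison: teams that won at least 70% of their games.
strong=$(printf '%s\n' "$teams" | awk '$5 >= 0.7 { print $1 }')

# String comparison: the team whose city is exactly "Boston".
boston=$(printf '%s\n' "$teams" | awk '$2 == "Boston" { print $1 }')

echo "$strong"
echo "$boston"
```

The first command selects Bucks and Raptors, the second selects Celtics.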

Range patterns #

A range pattern consists of two patterns separated by a comma:

pattern1, pattern2

All records starting with a record that matches the first pattern until a record that matches the second pattern are matched.

Here is an example that will print the first field of all records starting from the record including “Raptors” until the record including “Celtics”:

awk '/Raptors/,/Celtics/ { print $1 }' teams.txt
Raptors
76ers
Celtics

The patterns can also be relational expressions. The command below will print all records starting from the one whose fourth field is equal to 31 until the one whose fourth field is equal to 33:

awk '$4 == 31, $4 == 33 { print $0 }' teams.txt
76ers Philadelphia 51 31 0.622
Celtics Boston     49 33 0.598

Range patterns cannot be combined with other pattern expressions.
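
A range can also be built from record numbers. The sketch below assumes a POSIX shell and the sample data in an illustrative variable named `teams`; it selects records two through four by comparing the built-in NR variable.

```shell
teams='Bucks Milwaukee 60 22 0.732
Raptors Toronto 58 24 0.707
76ers Philadelphia 51 31 0.622
Celtics Boston 49 33 0.598
Pacers Indiana 48 34 0.585'

# The range opens on record 2 and closes on record 4, both inclusive.
middle=$(printf '%s\n' "$teams" | awk 'NR == 2, NR == 4 { print NR, $1 }')
echo "$middle"
```

This prints the record number and team name for Raptors, 76ers, and Celtics.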

Special expression patterns #

Awk includes the following special patterns:

  • BEGIN – Used to perform actions before records are processed.
  • END – Used to perform actions after records are processed.

The BEGIN pattern is generally used to set variables and the END pattern to process data from the records, such as calculations.

The following example will print “Start Processing.”, then print the third field of each record and finally “End Processing.”:

awk 'BEGIN { print "Start Processing." }; { print $3 }; END { print "End Processing." }' teams.txt
Start Processing.
60
58
51
49
48
End Processing.

If a program has only a BEGIN pattern, actions are executed and the input is not processed. If a program has only an END pattern, the input is processed before performing the rule actions.

The GNU version of awk also includes two more special patterns, BEGINFILE and ENDFILE, which allow you to perform actions when processing files.
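
As a sketch of the typical division of labour between BEGIN, the main rule, and END, the program below prints a header, accumulates the third field, and reports the average at the end. It assumes a POSIX shell; the variable names `teams`, `report`, and `sum` are illustrative.

```shell
teams='Bucks Milwaukee 60 22 0.732
Raptors Toronto 58 24 0.707
76ers Philadelphia 51 31 0.622
Celtics Boston 49 33 0.598
Pacers Indiana 48 34 0.585'

report=$(printf '%s\n' "$teams" | awk '
  BEGIN { print "Team wins" }                  # runs once, before any input
  { sum += $3; print $1, $3 }                  # runs for every record
  END { printf "Average: %.1f\n", sum / NR }   # runs once, after the last record
')
echo "$report"
```

In the END rule NR still holds the number of the last record read, so it can serve as the record count.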

Combining patterns #

Awk allows you to combine two or more patterns using the logical AND operator (&&) and logical OR operator (||).

Here is an example that uses the && operator to print the first field of those records whose third field is greater than 50 and whose fourth field is less than 30:

awk '$3 > 50 && $4 < 30 { print $1 }' teams.txt
Bucks
Raptors
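
The || operator works the same way. In this sketch, which assumes a POSIX shell and the sample data in an illustrative variable named `teams`, a record is selected when either condition holds.

```shell
teams='Bucks Milwaukee 60 22 0.732
Raptors Toronto 58 24 0.707
76ers Philadelphia 51 31 0.622
Celtics Boston 49 33 0.598
Pacers Indiana 48 34 0.585'

# Match teams with at least 60 wins OR at least 34 losses.
picked=$(printf '%s\n' "$teams" | awk '$3 >= 60 || $4 >= 34 { print $1 }')
echo "$picked"
```

Only the Bucks (60 wins) and the Pacers (34 losses) satisfy one of the two conditions.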

Built-in Variables #

Awk has a number of built-in variables that contain useful information and allow you to control how the program is processed. Below are some of the most common built-in variables:

  • NF – The number of fields in the record.
  • NR – The number of the current record.
  • FILENAME – The name of the input file that is currently processed.
  • FS – Field separator.
  • RS – Record separator.
  • OFS – Output field separator.
  • ORS – Output record separator.

Here is an example showing how to print the file name and the number of lines (records):

awk 'END { print "File", FILENAME, "contains", NR, "lines." }' teams.txt
File teams.txt contains 5 lines.
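
Several of these variables can be used together. The sketch below assumes a POSIX shell and the sample data in an illustrative variable named `teams`; it sets OFS so that print joins its arguments with commas, then prints the record number, the field count, and the first field.

```shell
teams='Bucks Milwaukee 60 22 0.732
Raptors Toronto 58 24 0.707
76ers Philadelphia 51 31 0.622
Celtics Boston 49 33 0.598
Pacers Indiana 48 34 0.585'

# OFS replaces the default single space between comma-separated print items.
summary=$(printf '%s\n' "$teams" | awk 'BEGIN { OFS = "," } { print NR, NF, $1 }')
echo "$summary"
```

The first output line is 1,5,Bucks and the last is 5,5,Pacers.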

Variables in AWK can be set at any line in the program. To define a variable for the entire program, it should be set in a BEGIN pattern.

Changing the Field and Record Separator #

The default value of the field separator is any number of space or tab characters. It can be changed by setting the FS variable.

For example, to set the field separator to . you would use:

awk 'BEGIN { FS = "." } { print $1 }' teams.txt
Bucks Milwaukee    60 22 0
Raptors Toronto    58 24 0
76ers Philadelphia 51 31 0
Celtics Boston     49 33 0
Pacers Indiana     48 34 0

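The field separator can also be set to more than one character. In that case awk treats the value as a regular expression, so the two dots below match any two characters rather than two literal dots: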

awk 'BEGIN { FS = ".." } { print $1 }' teams.txt

When running awk one-liners on the command-line, you can also use the -F option to change the field separator:

awk -F "." '{ print $1 }' teams.txt
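
The -F option is convenient for delimited data such as colon-separated records. This is a minimal sketch assuming a POSIX shell; the `entry` line is made-up sample data, not taken from a real file.

```shell
# A hypothetical colon-separated record: name, numeric id, home directory.
entry='alice:1001:/home/alice'

# Split on ":" and print the first and last fields.
home=$(printf '%s\n' "$entry" | awk -F ':' '{ print $1, $NF }')
echo "$home"   # alice /home/alice
```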

By default, the record separator is a newline character and can be changed using the RS variable.

Here is an example showing how to change the record separator to .:

awk 'BEGIN { RS = "." } { print $1 }' teams.txt
Bucks
732
707
622
598
585

Because the newline is no longer the record separator, it counts as ordinary whitespace between fields, so the first field of each later record is the text that followed the dot.
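
Changing RS is most useful when the input really is delimited by another character. The sketch below assumes a POSIX shell and uses made-up, semicolon-delimited input held in an illustrative variable named `pairs`.

```shell
# Hypothetical input: three records separated by semicolons.
pairs='a 1;b 2;c 3'

# Each semicolon ends a record; fields are still split on whitespace.
out=$(printf '%s' "$pairs" | awk 'BEGIN { RS = ";" } { print NR ": " $1 }')
echo "$out"
```

The command prints one numbered line per record: 1: a, 2: b, and 3: c.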

Awk Actions #

Awk actions are enclosed in braces ({}) and executed when the pattern matches. An action can have zero or more statements. Multiple statements are executed in the order they appear and must be separated by newlines or semicolons (;).

There are several types of action statements that are supported in awk:

  • Expressions, such as variable assignment, arithmetic operators, increment, and decrement operators.
  • Control statements, used to control the flow of the program (if, for, while, switch, and more)
  • Output statements, such as print and printf.
  • Compound statements, to group other statements.
  • Input statements, to control the processing of the input.
  • Deletion statements, to remove array elements.
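
To illustrate a few of these statement types together, here is a sketch that uses an assignment, a for loop, and an if/else inside one action. It assumes a POSIX shell; the names `teams`, `stats`, `games`, `high`, and `low` are illustrative.

```shell
teams='Bucks Milwaukee 60 22 0.732
Raptors Toronto 58 24 0.707
76ers Philadelphia 51 31 0.622
Celtics Boston 49 33 0.598
Pacers Indiana 48 34 0.585'

stats=$(printf '%s\n' "$teams" | awk '
{
  games = 0
  for (i = 3; i <= 4; i++) games += $i   # wins + losses
  if ($3 >= 50) high++; else low++       # count teams on each side of 50 wins
}
END { print games, high, low }
')
echo "$stats"   # 82 3 2
```

Every team in the sample played 82 games, three of them won 50 or more, and two won fewer.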

The print statement is probably the most used awk statement. It prints a formatted output of text, records, fields, and variables.

When printing multiple items, you need to separate them with commas. Here is an example:

awk '{ print $1, $3, $5 }' teams.txt

The printed items are separated by single spaces:

Bucks 60 0.732
Raptors 58 0.707
76ers 51 0.622
Celtics 49 0.598
Pacers 48 0.585

If you don’t use commas, there will be no space between the items:

awk '{ print $1 $3 $5 }' teams.txt

The printed items are concatenated:

Bucks600.732
Raptors580.707
76ers510.622
Celtics490.598
Pacers480.585

When print is used without an argument, it defaults to print $0. The current record is printed.

To print a custom text, you must quote the text with double-quote characters:

awk '{ print "The first field:", $1}' teams.txt
The first field: Bucks
The first field: Raptors
The first field: 76ers
The first field: Celtics
The first field: Pacers

You can also print special characters such as newline:

awk 'BEGIN { print "First line\nSecond line\nThird line" }'
First line
Second line
Third line

The printf statement gives you more control over the output format. Here is an example that inserts line numbers:

awk '{ printf "%3d. %s\n", NR, $0 }' teams.txt

printf doesn’t create a newline after each record, so we are using \n:

  1. Bucks Milwaukee    60 22 0.732 
  2. Raptors Toronto    58 24 0.707 
  3. 76ers Philadelphia 51 31 0.622
  4. Celtics Boston     49 33 0.598
  5. Pacers Indiana     48 34 0.585

The following command calculates the sum of the values stored in the third field in each line:

awk '{ sum += $3 } END { printf "%d\n", sum }' teams.txt
266
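
printf format specifiers can also pad and round values. The sketch below assumes a POSIX shell and the sample data in an illustrative variable named `teams`; it left-aligns the team name in an eight-character column and prints the winning ratio as a percentage with one decimal place.

```shell
teams='Bucks Milwaukee 60 22 0.732
Raptors Toronto 58 24 0.707
76ers Philadelphia 51 31 0.622
Celtics Boston 49 33 0.598
Pacers Indiana 48 34 0.585'

# %-8s left-aligns a string in 8 columns, %5.1f right-aligns a number
# in 5 columns with 1 decimal, and %% prints a literal percent sign.
table=$(printf '%s\n' "$teams" | awk '{ printf "%-8s %5.1f%%\n", $1, $5 * 100 }')
echo "$table"
```

The first line of the output reads "Bucks     73.2%", with the percentages lined up in a column.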

Here is another example showing how to use expressions and control statements to print the squares of numbers from 1 to 5:

awk 'BEGIN { i = 1; while (i < 6) { print "Square of", i, "is", i*i; ++i } }'
Square of 1 is 1
Square of 2 is 4
Square of 3 is 9
Square of 4 is 16
Square of 5 is 25

One-line commands such as the one above are harder to understand and maintain. When writing longer programs, you should create a separate program file:

prg.awk
BEGIN { 
  i = 1
  while (i < 6) { 
    print "Square of", i, "is", i*i; 
    ++i 
  } 
}

Run the program by passing the file name to the awk interpreter:

awk -f prg.awk

You can also run an awk program as an executable by using the shebang directive and setting the awk interpreter:

prg.awk
#!/usr/bin/awk -f
BEGIN { 
  i = 1
  while (i < 6) { 
    print "Square of", i, "is", i*i; 
    ++i 
  } 
}

Save the file and make it executable:

chmod +x prg.awk

You can now run the program by entering:

./prg.awk

Using Shell Variables in Awk Programs #

If you are using the awk command in shell scripts, chances are that you’ll need to pass a shell variable to the awk program. One option is to enclose the program in double instead of single quotes and substitute the variable in the program. However, this option will make your awk program more complex, as you’ll need to escape the awk variables.

The recommended way to use shell variables in awk programs is to assign the shell variable to an awk variable. Here is an example:

num=51
awk -v n="$num" 'BEGIN {print n}'
51
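
The same technique can be used inside a pattern. This sketch assumes a POSIX shell and the sample data in an illustrative variable named `teams`; the shell variable `min` is passed to awk under the same name and compared against the third field.

```shell
teams='Bucks Milwaukee 60 22 0.732
Raptors Toronto 58 24 0.707
76ers Philadelphia 51 31 0.622
Celtics Boston 49 33 0.598
Pacers Indiana 48 34 0.585'

min=50

# -v assigns the shell value to the awk variable before the program starts.
winners=$(printf '%s\n' "$teams" | awk -v min="$min" '$3 > min { print $1 }')
echo "$winners"
```

Because the program stays inside single quotes, the field references such as $3 need no escaping.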

Conclusion #

Awk is one of the most powerful tools for text manipulation.

This article barely scratches the surface of the awk programming language. To learn more about awk, check out the official Gawk documentation.
