search lines based on column values
awk -F\: '$2 ~/User/ && $1 ~/B2BProd/ {print $3}' /opt/www/htdocs/server.lists
- print column 3 (separated by ":") where column 2 contains "User" AND column 1 contains "B2BProd" from the file
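A runnable sketch with made-up sample data (the real server.lists columns may differ; host:role:port is assumed here):

```shell
# Hypothetical sample in the same colon-separated shape as server.lists
cat > /tmp/server.lists <<'EOF'
B2BProd01:UserStore:1389
B2BProd01:ConfigStore:4444
B2BDev01:UserStore:1389
EOF

# print column 3 where column 2 contains "User" AND column 1 contains "B2BProd"
awk -F: '$2 ~ /User/ && $1 ~ /B2BProd/ {print $3}' /tmp/server.lists
# -> 1389
```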
search ldif for entries lacking a description
gawk 'BEGIN { IGNORECASE=1; RS="\n\n"; } !/description/ {print ; print "";}' groups_dnOwnerDescription_only.ldif
- ignore case
- set the Record Separator (RS) to "\n\n" (to parse each LDIF entry as one record)
- find records (LDIF entries) that DON'T contain "description"
- print the found record (LDIF entry) & an extra blank line (to keep the entries double-return separated)
Split file into multiple files based on record count.
awk 'NR%1000 == 1{ file = "outputfile" i++ } { print > file }' ORS= RS=------ test.list
- outputs the first 1000 records to outputfile0, the next 1000 to outputfile1, etc.
- the trailing RS=------ / ORS= assignments split the input on "------" and print records back out with no separator; drop them for ordinary newline-delimited records
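The same idea scaled down to a demo: 5 newline-delimited records, 2 per output file (so the RS/ORS overrides for "------"-delimited input are omitted; the /tmp paths are placeholders):

```shell
# five one-line records; every 2 records start a new numbered output file
printf 'a\nb\nc\nd\ne\n' > /tmp/test.list
awk 'NR % 2 == 1 { file = "/tmp/outputfile" i++ } { print > file }' /tmp/test.list
# /tmp/outputfile0 holds a,b; outputfile1 holds c,d; outputfile2 holds e
```

i is uninitialized, so the post-increment yields 0 for the first file name, then counts up.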
Count occurrences of a line in a file:
awk '{count[$1]++} END {for(j in count) print j "\t" count[j]}' FS='\t'
- quote the tab: an unquoted FS=\t reaches awk as FS=t; with no filename given, input is read from stdin
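Sketch with a throwaway tab-separated file:

```shell
# count how many times each value appears in column 1
printf 'apple\t1\nbanana\t2\napple\t3\n' > /tmp/fruit.tsv
awk '{ count[$1]++ } END { for (j in count) print j "\t" count[j] }' FS='\t' /tmp/fruit.tsv
# -> apple 2 and banana 1 (for-in order is unspecified)
```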
Format comma-separated fields to fixed widths
awk -F"," '{printf("%-1s,%-5s,%-6s\n", $1,$2,$3)}' filename.txt
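Sketch with made-up rows; each %-Ns left-justifies and pads the field to N characters:

```shell
printf 'a,bb,ccc\nx,y,z\n' > /tmp/fields.csv
awk -F, '{ printf("%-1s,%-5s,%-6s\n", $1, $2, $3) }' /tmp/fields.csv
# -> columns padded to widths 1, 5, and 6 respectively
```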
Split a file into two based on data
awk -F, '{if($2<=500)print > "500L.txt";else print > "500G.txt"}' file1
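Sketch (file names under /tmp are placeholders): rows whose second column is <= 500 go to one file, the rest to the other:

```shell
printf 'a,100\nb,600\nc,500\n' > /tmp/file1
awk -F, '{ if ($2 <= 500) print > "/tmp/500L.txt"; else print > "/tmp/500G.txt" }' /tmp/file1
# /tmp/500L.txt gets a,100 and c,500; /tmp/500G.txt gets b,600
```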
split into files named based on column #4
awk -F, '{print > "somefile_"$4}' <input_file>
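Sketch with invented data; the parentheses around the filename concatenation avoid a parsing ambiguity in the redirection target that some awks reject:

```shell
printf 'a,b,c,red\nd,e,f,blue\ng,h,i,red\n' > /tmp/in.csv
awk -F, '{ print > ("/tmp/somefile_" $4) }' /tmp/in.csv
# creates /tmp/somefile_red (2 lines) and /tmp/somefile_blue (1 line)
```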
count the number of occurrences of an attribute in an LDIF file & print a status every 100,000 lines
awk '
# set field separator to :
BEGIN{FS=":"}
# if NR mod 100000 == 1 (i.e., once every 100,000 records), print NR divided by
# the total line count of the file (precomputed here as 36654988), times 100,
# as a percentage complete
NR%100000==1{print NR/36654988*100"%"}
# add 1 to the count of occurrences of the first column
{count[$1]++}
# when finished, print out the counts
END {for(j in count) print j "\t" count[j]}
' entries.ldif
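The same pattern scaled down to a demo: 4 made-up lines, progress every 2 records, with the total line count precomputed as 4:

```shell
printf 'cn: a\ncn: b\nsn: c\ncn: d\n' > /tmp/entries.ldif
awk 'BEGIN { FS=":"; total=4 }
     NR % 2 == 1 { printf "%.0f%%\n", NR/total*100 }
     { count[$1]++ }
     END { for (j in count) print j "\t" count[j] }' /tmp/entries.ldif
# -> 25% and 75% progress lines, then counts: cn 3, sn 1
```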
find the gaps in a sorted list of numbers (prints the last number seen before each gap)
awk 'NR-1{if($1!=(_+1))print _}{_=$1}' conns_sorted.list
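Sketch with an invented sorted list that has gaps after 3 and after 7:

```shell
printf '1\n2\n3\n6\n7\n10\n' > /tmp/conns_sorted.list
# skip the first line (NR-1 is 0/false); if the current number is not the
# previous number plus one, print the previous number; always remember $1
awk 'NR-1 { if ($1 != _+1) print _ } { _=$1 }' /tmp/conns_sorted.list
# -> 3 and 7
```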
find etimes greater than 1000 ms (1 second) in an OpenDJ access log
awk 'BEGIN {FS="etime=";} /etime/ && $2>1000 {print;}' access.20160322175201Z
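Sketch with two invented log lines (real OpenDJ access-log lines carry more fields; only the trailing etime= matters for the split):

```shell
cat > /tmp/access.log <<'EOF'
SEARCH conn=1 op=2 result=0 etime=2
SEARCH conn=2 op=3 result=0 etime=1500
EOF
# splitting on "etime=" makes $2 the etime value; compare it numerically
awk 'BEGIN { FS="etime=" } /etime/ && $2 > 1000 { print }' /tmp/access.log
# -> only the conn=2 line (etime 1500 > 1000)
```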
use awk to find/replace data from one file into another.
- FNR – record number in the current file
- NR – total record number (across all files)
awk -F',|, ' 'NR==FNR{a[$1]=$2} NR>FNR{$3=a[$3];print}' OFS=',' "country-list.csv" "Forensic_report_for_04-02-17.csv"
- set the field separator (-F) to the regex ",|, " (a comma, or a comma followed by a space)
- if NR==FNR then store $2 in hash 'a' under key $1
- This will occur for the first file passed in only as NR==FNR can only be true for the first file
- if NR>FNR then set the value of $3 to the value from hash 'a' with key $3 and print
- This will occur for files AFTER the first, since NR does not exceed FNR until we have switched to the second file and FNR has been reset.
- Set the Output Field Separator (OFS) to ","
- first file passed is the data to be used in the find/replace hash
- other files will receive the replacement.
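Sketch with invented lookup and report files (the real column layouts will differ):

```shell
# hypothetical lookup file (code,name) and a report whose 3rd field is a code
printf 'US,United States\nDE,Germany\n' > /tmp/country-list.csv
printf '1,alice,US\n2,bob,DE\n' > /tmp/report.csv
awk -F',|, ' 'NR==FNR{a[$1]=$2} NR>FNR{$3=a[$3]; print}' OFS=',' /tmp/country-list.csv /tmp/report.csv
# -> 1,alice,United States
#    2,bob,Germany
```

Assigning $3 rebuilds $0 with OFS, which is why the output stays comma-delimited.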
use awk to read a list of CNs & then print out those entries from an LDIF file
gawk 'NR==FNR{a[$1]; next} {match($0, /([0-9]+)/, cn); if (cn[1] in a) print $0 "\n"}' find.list RS='\n\n' testBak.ldif
- the first file (find.list, read line-at-a-time) loads the wanted values into hash 'a'
- RS='\n\n' is assigned between the file arguments, so only the LDIF is read one entry per record
- match() with an array third argument is gawk-only; it captures the first run of digits, and entries whose number is a key in 'a' are printed with a trailing blank line (concatenation in awk is juxtaposition: $0 "\n")