Learning AWK, Part 4


How do I print selected columns with awk?

Print all but the first three columns
Too cumbersome:

awk '{print " "$4" "$5" "$6" "$7" "$8" "$9" "$10" "$11" "$12" "$13}' things

A solution that does not add extra leading or trailing whitespace:

awk '{for(i=4;i<NF;i++)printf "%s",$i OFS; if (NF) printf "%s",$NF; printf ORS}'

Demo:

$ echo '1 2 3 4 5 6 7' |
  awk '{for(i=4;i<NF;i++) printf"%s",$i OFS;if(NF)printf"%s",$NF;printf ORS}' |
  tr ' ' '-'
4-5-6-7

Another approach using the ternary operator is Sudo_O’s solution:

$ echo '1 2 3 4 5 6 7' |
  awk '{for(i=4;i<=NF;i++)printf "%s",$i (i==NF?ORS:OFS)}' | tr ' ' '-'
4-5-6-7

And EdMorton gives a solution that preserves the original whitespace between fields:

$ echo '1 2 3 4 5 6 7' |
  awk '{sub(/([^ ]+ +){3}/,"")}1' | tr ' ' '-'
4---5----6-7

The solution given by larsr in the comments is almost correct:

$ echo '1 2 3 4 5 6 7' |
  awk '{for (i=3;i<=NF;i++) $(i-2)=$i; NF=NF-2; print $0}' | tr ' ' '-'
3-4-5-6-7

This is the fixed and parametrized version of larsr's solution:

$ echo '1 2 3 4 5 6 7' |
  awk '{for(i=n;i<=NF;i++)$(i-(n-1))=$i;NF=NF-(n-1);print $0}' n=4 | tr ' ' '-'
4-5-6-7

All other answers are nice but add extra spaces:

Example of answer adding extra leading spaces:

$ echo '1 2 3 4 5 6 7' | awk '{$1=$2=$3=""}1' | tr ' ' '-'
---4-5-6-7

Example of an answer adding extra trailing spaces:

$ echo '1 2 3 4 5 6 7' |
  awk '{for(i=4;i<=13;i++)printf "%s ",$i;printf "\n"}' | tr ' ' '-'
4-5-6-7-------

====

$ awk '{for(i=1;i<4;i++) $i="";print}' file

Use cut (it splits on tabs by default; add -d' ' for input separated by single spaces):

$ cut -f4-13 file

Or, if you insist on awk and $13 is the last field:

$ awk '{$1=$2=$3="";print}' file

Otherwise:

$ awk '{for(i=4;i<=13;i++)printf "%s ",$i;printf "\n"}' file
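
Because cut treats every single delimiter as a field boundary, input separated by runs of spaces usually needs a little preparation first. A minimal sketch, assuming the lines have no leading spaces (cut_from is a hypothetical helper name, not part of the original answer):

```shell
# cut_from N: keep fields N onward from space-separated stdin.
# Hypothetical helper; assumes no leading spaces on the line.
cut_from() {
  # tr -s squeezes runs of spaces so cut sees exactly one delimiter per gap
  tr -s ' ' | cut -d' ' -f"$1"-
}

echo '1 2  3 4   5 6 7' | cut_from 4
```

Note that squeezing the spaces also changes the spacing between the fields that remain, so this is not suitable when the original spacing must be preserved.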

====
The correct way to do this is with an RE interval because it lets you simply state how many fields to skip, and retains inter-field spacing for the remaining fields.

For example, to skip the first 3 fields without affecting the spacing between the remaining fields, given the input format discussed in this question:

$ echo '1 2 3 4 5 6' | awk '{sub(/([^ ]+ +){3}/,"")}1'
4 5 6

If you want to accommodate leading whitespace and whitespace other than blanks (e.g. tabs), again with the default FS, then it's:

$ echo ' 1 2 3 4 5 6' | awk '{sub(/[[:space:]]*([^[:space:]]+[[:space:]]+){3}/,"")}1'
4 5 6

If you have an FS that's an RE you can't negate in a character set, you can convert it to a single char first (RS is ideal if it's a single char, since an RS cannot appear within a field; otherwise consider SUBSEP), then apply the RE interval substitution, then convert to the OFS. For example, if chains of "."s separated the fields:

$ echo '1...2.3.4...5....6' | awk -F'[.]+' '{gsub(FS,RS);sub("([^"RS"]+["RS"]+){3}","");gsub(RS,OFS)}1'
4 5 6

Obviously if OFS is a single char AND it can’t appear in the input fields you can reduce that to:

$ echo '1...2.3.4...5....6' | awk -F'[.]+' '{gsub(FS,OFS); sub("([^"OFS"]+["OFS"]+){3}","")}1'
4 5 6

Then you have the same issue as with all the loop-based solutions that reassign the fields: the FSs are converted to OFSs. If that's an issue, you need to look into GNU awk's patsplit() function.
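
patsplit() is specific to GNU awk. As a portable alternative (a different technique, not patsplit()), the leading fields can be cut off one separator at a time with match() and substr(), which leaves the original separators between the remaining fields untouched. A rough sketch, assuming the line does not start with a separator; skip_fields and its two arguments are hypothetical names introduced here:

```shell
# skip_fields N SEP: drop the first N fields from each line of stdin,
# where SEP is an ERE matching one separator run.
# Hypothetical helper; assumes lines do not begin with a separator.
# Note: awk -v processes backslash escapes in SEP.
skip_fields() {
  awk -v n="$1" -v sep="$2" '{
    rest = $0
    # each match() finds the next separator; keep only what follows it
    for (i = 0; i < n && match(rest, sep); i++)
      rest = substr(rest, RSTART + RLENGTH)
    print rest
  }'
}

echo '1...2.3.4...5....6' | skip_fields 3 '[.]+'
```

If a line has fewer than N separators, the loop simply stops early and prints whatever is left, so the last field survives on short lines.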

====
Pretty much all the answers currently add either leading spaces, trailing spaces or some other separator issue. To select from the fourth field where the separator is whitespace and the output separator is a single space using awk would be:

$ awk '{for(i=4;i<=NF;i++)printf "%s",$i (i==NF?ORS:OFS)}' file

To parametrize the starting field you could do:

$ awk '{for(i=n;i<=NF;i++)printf "%s",$i (i==NF?ORS:OFS)}' n=4 file

And also the ending field:

$ awk '{for(i=n;i<=m=(m>NF?NF:m);i++)printf "%s",$i (i==m?ORS:OFS)}' n=4 m=10 file
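
As far as I can tell, the last one-liner assigns the clamped value back to m, so after a short line the upper bound stays lowered for all later lines. A sketch of the same idea that clamps into a separate per-line variable instead (print_fields and last are hypothetical names, not from the original answer):

```shell
# print_fields N M: print fields N through M (inclusive), joined by OFS.
# Hypothetical helper; lines with fewer than N fields produce no output,
# as with the one-liners above.
print_fields() {
  awk -v n="$1" -v m="$2" '{
    last = (m > NF ? NF : m)        # clamp per line, leaving m itself unchanged
    for (i = n; i <= last; i++)
      printf "%s%s", $i, (i == last ? ORS : OFS)
  }'
}

printf '%s\n' '1 2 3 4 5 6 7' 'a b c d e f g h i j k l' | print_fields 4 10
```

With this input the first line yields fields 4 to 7 and the second line still yields fields 4 to 10.
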
Actual test results:
# echo '1 2 3 4 5 6 7'
1 2 3 4 5 6 7
# echo '1 2 3 4 5 6 7' | awk '{for(i=1;i<4;i++) $i="";print}'
   4 5 6 7
# echo '1 2 3 4 5 6 7' | awk '{$1=$2=$3="";print}'
   4 5 6 7
#
# echo '1 2 3 4 5 6 7' | awk '{for(i=4;i<NF;i++)printf "%s",$i OFS; if (NF) printf "%s",$NF; printf ORS}'
4 5 6 7
#
# echo '1 2 3 4 5 6 7' | awk '{sub(/([^ ]+ +){3}/,"")}1'
4 5 6 7
#
# echo '1 2 3 4 5 6 7' | awk '{ for (i=3; i<=NF; i++) print $i }'
3
4
5
6
7
# echo '1 2 3 4 5 6 7' | awk '{$1=$2=$3=""}sub("^"FS"*","")'
4 5 6 7
Reference:

Print all but the first three columns

==

How do I remove leading/trailing spaces from output (similar to trim)?

This will remove all spaces:

# echo " test test test " | tr -d ' '
testtesttest

This will remove trailing spaces:

# echo " test test test " | sed 's/ *$//'
 test test test

This will remove leading spaces:

# echo " test test test " | sed 's/^ *//'
test test test

This will remove both trailing and leading spaces:

# echo " test test test " | sed -e 's/^ *//' -e 's/ *$//'
test test test
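
Since these notes are about awk, the same trim can also be done in awk itself. A minimal sketch (trim is a hypothetical helper name); unlike the sed patterns above, it also strips tabs:

```shell
# trim: strip leading and trailing spaces/tabs from each line of stdin.
trim() {
  awk '{ gsub(/^[ \t]+|[ \t]+$/, ""); print }'
}

# tr makes the remaining spaces visible
echo "  test test test  " | trim | tr ' ' '-'
```

Whitespace between the words is left as it is; only the two ends of each line are touched.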

Search keywords: http://search.aol.com/aol/search?q=linux+how+to+trim+whitespace+of+output

==

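Deduplicating with awk (a way to get two kinds of deduplicated totals)
array2=$(awk '{type["all"]+=$3; type[$2]+=$3} END {for (i in type) print i, type[i]}' filename.txt)

Use a single array, but distinguish the entries by subscript (here "all" and $2: different subscripts, same array), then print everything in an END block. You could also use several arrays, but how to relate them to each other is a matter of technique and depends on the specific requirements.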
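
To make the idea concrete, here is a small sketch with made-up input, where column 2 is a category and column 3 a number (the sample rows and the name sum_by_type are assumptions for illustration only). The output is piped through sort because awk does not guarantee any order for `for (k in array)`:

```shell
# sum_by_type [FILE...]: sum column 3 overall ("all") and per value of column 2.
# Hypothetical helper; reads stdin when no file is given.
sum_by_type() {
  awk '{ total["all"] += $3; total[$2] += $3 }
       END { for (k in total) print k, total[k] }' "$@" | sort
}

# made-up sample rows: name, type, amount
printf '%s\n' 'a x 1' 'b y 2' 'c x 3' | sum_by_type
```

One caveat of sharing a single array: if column 2 ever contains the literal value "all", its sum is mixed into the grand total, in which case two separate arrays are the safer choice.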


1 comment on "AWK学习_4" (Learning AWK, Part 4)

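  1. Notes on using awk's FS and OFS – Simple awk command issue (FS, OFS related)
    https://stackoverflow.com/questions/16203336/simple-awk-command-issue-fs-ofs-related

    $ head rivalHost.txt | awk -F'\t' OFS='\t' 'NR>1 && $(11)=="200" {print $1,$6,$7,$(11)}'
    awk: syntax error at source line 1
     context is
         >>> OFS=\ <<< t
    awk: bailing out at source line 1

    For the OFS assignment to take effect, OFS='\t' has to go at the end, after the program text; otherwise awk takes it as the program and fails with the error above. (The commenter reports that setting it in a BEGIN block did not work either.) Passing it as -v OFS='\t' before the program is another option.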
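
The two placements that awk does accept can be sketched as follows, using a made-up tab-separated line rather than the commenter's rivalHost.txt (the function names are hypothetical):

```shell
# Option 1: -v assigns OFS before the program starts.
tab_cols_v() {
  awk -F'\t' -v OFS='\t' '{print $1, $3}'
}

# Option 2: a var=value operand placed after the program text.
tab_cols_operand() {
  awk -F'\t' '{print $1, $3}' OFS='\t'
}

# cat -v is not needed here; tr turns the tabs into visible commas
printf 'a\tb\tc\n' | tab_cols_v | tr '\t' ','
printf 'a\tb\tc\n' | tab_cols_operand | tr '\t' ','
```

The first non-option argument is always taken as the program, which is why an OFS='\t' placed before the quoted script ends up being parsed as awk code.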
