Learning to write awk scripts through an example


=Start=

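Background:

This post works through one case, parsing the output of the sfltool dumpbtm command with awk, as a way to pick up some awk programming. (The text to parse is not in a regular line-oriented format; it comes in paragraph-like blocks, and after a little thought it seemed that writing this in Python would actually not be that easy...)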
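For block-shaped text like this, awk also offers a "paragraph mode", which is worth knowing even though the script in this post uses a flag variable instead. The following is only a minimal sketch: the `sample` text is a hypothetical, shortened excerpt in the same shape as the real output, and `paragraph_summary` is a name made up for illustration.

```shell
# Hypothetical, shortened excerpt in the same shape as the dumpbtm output.
sample=' #1:
                 UUID: 7CEC80FD-6488-474B-A782-0E59FB8726A0
                 Name: (null)

 #2:
                 UUID: 26267948-139D-4480-9633-25054FE1DFC8
                 Name: Broadcom Inc
'

# RS="" switches awk to paragraph mode: each blank-line-separated block is one record.
# FS="\n" then makes every line of the block a separate field.
paragraph_summary=$(printf '%s' "$sample" | awk 'BEGIN { RS=""; FS="\n" }
{
    name = ""
    for (i = 1; i <= NF; i++) {
        if ($i ~ /^ +Name:/) { name = $i; sub(/^ +Name: /, "", name) }
    }
    printf "record %d: %d lines, name=%s\n", NR, NF, name
}')
echo "$paragraph_summary"
```

Each item becomes one record, so NR counts items and NF counts the lines inside an item.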

Main text:

Reference solution:
$ sudo sfltool dumpbtm

$ sudo sfltool dumpbtm >~/dumpbtm.$(date '+%F')

# How to use the script
$ chmod u+x ./sh_parse_dumpbtm_v2.sh
$ ./sh_parse_dumpbtm_v2.sh ~/dumpbtm.2024-04-11 | awk -F'\t' '{print $1,$2,$3,$NF}'
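
The trailing awk command in the usage line only picks columns out of the script's tab-separated output. A small sketch of what it does, using a hypothetical row (the names, the type value, and the `row`, `picked`, `picked_tsv` variables are made up for illustration):

```shell
# Hypothetical tab-separated row with the same nine columns the script prints.
row=$(printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s' \
    'Example App (Example Dev)' 'com.example.helper' '(9)' 'legacy daemon (0x10010)' \
    'Example App' '-' '-' 'file:///Library/LaunchDaemons/com.example.helper.plist' '18')

# -F'\t' splits only on tabs, so spaces inside a column do not break the fields.
# The commas in print join the selected fields with OFS (a single space by default).
picked=$(printf '%s\n' "$row" | awk -F'\t' '{ print $1, $2, $3, $NF }')
echo "$picked"

# Setting OFS keeps the output tab-separated for further processing.
picked_tsv=$(printf '%s\n' "$row" | awk -F'\t' -v OFS='\t' '{ print $1, $2, $3, $NF }')
echo "$picked_tsv"
```

Here $NF is the last column, which in the script's output is the Generation value.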

# Sample output of the sfltool dumpbtm command:

========================
 Records for UID -2 : FFFFEEEE-DDDD-CCCC-BBBB-AAAAFFFFFFFE
========================

 ServiceManagement migrated: true
 SharedFileList migrated: false

 Items:
...
 #7:
                 UUID: E9907828-7AB0-4B5B-985E-57C8B4B6F958
                 Name: com.xk72.charles.ProxyHelper
       Developer Name: Charles
      Team Identifier: 9A5PCU4FSD
                 Type: curated legacy daemon (0x90010)
          Disposition: [enabled, disallowed, visible, notified] (9)
           Identifier: com.xk72.charles.ProxyHelper
                  URL: file:///Library/LaunchDaemons/com.xk72.charles.ProxyHelper.plist
      Executable Path: /Library/PrivilegedHelperTools/com.xk72.charles.ProxyHelper
           Generation: 18
    Assoc. Bundle IDs: [com.xk72.Charles ]
    Parent Identifier: Charles
...

========================
 Records for UID 0 : FFFFEEEE-DDDD-CCCC-BBBB-AAAA00000000
========================

 ServiceManagement migrated: false
 SharedFileList migrated: false

 Items:

 #1:
                 UUID: 7CEC80FD-6488-474B-A782-0E59FB8726A0
                 Name: (null)
       Developer Name: (null)
                 Type: developer (0x20)
          Disposition: [disabled, allowed, visible, not notified] (2)
           Identifier: Unknown Developer
                  URL: (null)
           Generation: 45

 #2:
                 UUID: 26267948-139D-4480-9633-25054FE1DFC8
                 Name: Broadcom Inc
       Developer Name: Broadcom Inc
                 Type: developer (0x20)
          Disposition: [disabled, allowed, visible, not notified] (2)
           Identifier: Broadcom Inc
                  URL: (null)
           Generation: 2

 #3:
                 UUID: 7B61DE75-36CA-4B66-AB06-3AA565C54A50
                 Name: Charles
       Developer Name: Charles
                 Type: curated developer (0x80020)
          Disposition: [disabled, allowed, visible, not notified] (2)
           Identifier: Charles
                  URL: (null)
           Generation: 1
...
========================
 Records for UID 501 : FFFFEEEE-00CD-42B1-9BF6-AAAAFFFFFFFE
========================

 ServiceManagement migrated: true
 SharedFileList migrated: true

 Items:
...
 #5:
                 UUID: 0230D4A4-E37D-42A2-992C-ADF8B390908A
                 Name: Charles
       Developer Name: Charles
                 Type: curated developer (0x80020)
          Disposition: [disabled, allowed, visible, notified] (10)
           Identifier: Charles
                  URL: (null)
           Generation: 1
  Embedded Item Identifiers:
    #1: com.xk72.charles.ProxyHelper
...
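
In the sample above, every item starts with a ` #N:` line and ends at a blank line. That pattern can be sketched in a stripped-down form as follows. This is a simplified illustration, not the full script: the `sample` text is a hypothetical excerpt, `emit_item` is a helper name made up here, and the END block is an addition that assumes the last item may not be followed by a blank line.

```shell
# Hypothetical, shortened excerpt; note that the last item has no trailing blank line.
sample=' Items:

 #1:
                 UUID: 7CEC80FD-6488-474B-A782-0E59FB8726A0
                 Name: (null)

 #2:
                 UUID: 26267948-139D-4480-9633-25054FE1DFC8
                 Name: Broadcom Inc'

items=$(printf '%s\n' "$sample" | awk '
# Print the collected item, then clear the per-item variables.
function emit_item() {
    if (uuid != "") { printf "%s\t%s\n", uuid, name }
    uuid = name = ""
}
/^ #/  { capture = 1; next }          # an item header such as " #1:" starts a record
/^$/   { capture = 0; emit_item() }   # a blank line ends the record
capture && /UUID:/    { uuid = $2 }
capture && /^ +Name:/ { name = $0; sub(/^ +Name: /, "", name) }
END    { emit_item() }                # emit a final item that has no blank line after it
')
echo "$items"
```

The flag variable plays the role of a tiny state machine: lines are only interpreted as fields while awk is "inside" an item.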

# Contents of sh_parse_dumpbtm_v2.sh:
#!/bin/zsh

# Usage:
# ./sh_parse_dumpbtm_v2.sh ~/dumpbtm.2024-04-11 | awk -F'\t' '{print $1,$2,$3,$NF}'

# https://apple.stackexchange.com/a/465945/100302
# https://github.com/luckman212/login-items-dump

# sudo sfltool dumpbtm |

# Read the input from a file; the first command-line argument is used as the file name
cat "$1" |
/usr/bin/awk 'BEGIN { capture=0; u501=0 }

function f(field, dv) {
    if (field && field != "(null)") {
        return field
    } else {
        return (dv=="" ? "-" : dv)
    }
}

function fe(field) {
    if (field && field != "(null)") {
        return field
    }
}

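# The u501 variable is a switch for parsing only the current user's records; it is turned on only once that section is reached
# (UID 501 is hard-coded here, and the switch stays on for any UID sections that follow)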
/^ Records for UID 501 :/ { u501=1 }

/^ #/ { capture=1; next }
/^$/ { capture=0 }

# capture {
capture && u501 {
    if (/Disposition:/) { status=$NF; disposition=$0; sub(/^.*Disposition: /, "", disposition) }
    if (/Generation:/) { generation=$2 }
    if (/UUID:/) { uuid=$2 }
    if (/URL:/) { url=$2 }
    if (/Type:/) { type=$0; sub(/^.*Type: /, "", type) }
    if (/^[[:space:]]+Identifier:/) { identifier=$0; sub(/^.*Identifier: /, "", identifier) }
    if (/Bundle Identifier:/) { bid=$3 }
    if (/Parent Identifier:/) { pname=$0; sub(/^.*Parent Identifier: /, "", pname) }
    if (/Team Identifier:/) { tid=$3 }
    if (/^[[:space:]]+Name:/) { name=$0; sub(/^.*Name: /, "", name) }
    if (/Developer Name:/) { devname=$0; sub(/^.*Developer Name: /, "", devname) }
    if (/Executable Path:/) { execpath=$0; sub(/^.*Executable Path: /, "", execpath) }
}

/^$/ {
    if (fe(name) && fe(devname)) {
        cname = sprintf("%s (%s)", name, devname)
    } else {
        cname = sprintf("%s%s", fe(name), fe(devname))
    }
    if (type ~ /daemon/) { cname = "👿" cname }

    # if (identifier!="|") { printf "%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n", cname, identifier, status, type, pname, f(tid), f(bid), f(url) }

    ue = fe(url) "|" fe(execpath)
    # if (ue!="|" && a[ue]=="") {
    if (ue!="|" && a[ue]=="" && status!="(2)") {
        # printf "%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n", cname, identifier, status, type, pname, f(tid), f(bid), f(url), f(execpath), disposition
        printf "%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n", cname, identifier, status, type, pname, f(tid), f(bid), f(url), generation
        a[ue]=1
    }

    uuid=url=type=bid=tid=name=devname=execpath=cname=identifier=ue=status=pname=disposition=generation=""
}

# END {for (i in a) print i, a[i]}
' | /usr/bin/sort -t $'\t' -k2   # $'\t' passes a real tab character to sort; a plain '\t' would pass a backslash and a t
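
One detail in the script that is easy to miss: some fields are read with `$2` or `$NF`, while others copy `$0` and strip the label with sub(). The difference shows up on values that contain spaces. A minimal sketch using the Disposition line from the sample output (the shell variable names are made up for illustration):

```shell
line='          Disposition: [enabled, disallowed, visible, notified] (9)'

# $2 is only the first whitespace-separated word after the label.
first_word=$(printf '%s\n' "$line" | awk '{ print $2 }')

# Copying $0 and deleting everything up to the label keeps the whole value.
full_value=$(printf '%s\n' "$line" | awk '{ v = $0; sub(/^.*Disposition: /, "", v); print v }')

# $NF is the last field, which the script stores as the status code.
status_code=$(printf '%s\n' "$line" | awk '{ print $NF }')

echo "$first_word"
echo "$full_value"
echo "$status_code"
```

This is why the script uses `$2` for single-word values such as UUID and URL, and the sub() form for Name, Type, and the other values that may contain spaces.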

References:

Parse the output of sfltool dumpbtm
https://github.com/luckman212/login-items-dump

How to determine details of backgound process?
https://apple.stackexchange.com/questions/465920/how-to-determine-details-of-backgound-process

awk: a programming language for processing text and data
https://wangchujiang.com/linux-command/c/awk.html

How AWK works
https://www.runoob.com/w3cnote/awk-work-principle.html

Mastering awk (series)
https://www.junmajinlong.com/shell/awk/index/

=END=

