Using the parallel command on Ubuntu 12.04 (32-bit)
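One evening, while tidying up notes and articles I had collected earlier, I came across one titled "Use multiple CPU Cores with your Linux commands — awk, sed, bzip2, grep, wc, etc." It looked interesting, so I decided to try it and see whether it really delivers the speedup the article describes.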
So I started with its help output, # parallel -h, which failed with:
The program 'parallel' is currently not installed. You can install it by typing:
apt-get install moreutils
I installed it as suggested and ran the command again:
# parallel -h
parallel [OPTIONS] command -- arguments
        for each argument, run command with argument, in parallel
parallel [OPTIONS] -- commands
        run specified commands in parallel
o(╯□╰)o That is rather abstract, so on to the manual page:
# man -t parallel | ps2pdf - > parallel_manual.pdf
The program 'ps2pdf' is currently not installed. You can install it by typing:
apt-get install ghostscript
First, generate a test file:
# perl -e 'for(1..1000000){print "$_\n"}' > num1000000
Then:
# cat num1000000 | parallel --pipe wc
parallel: invalid option -- '-'
parallel [OPTIONS] command -- arguments
        for each argument, run command with argument, in parallel
parallel [OPTIONS] -- commands
        run specified commands in parallel
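The "invalid option" error is the hint that this binary is not GNU Parallel: moreutils ships its own, unrelated parallel. A minimal sketch of a check, assuming GNU Parallel answers --version with a first line beginning "GNU parallel" (the helper name is_gnu_parallel is made up for this example):

```shell
#!/bin/sh
# Return 0 if the given command identifies itself as GNU Parallel.
# Assumption: GNU Parallel prints "GNU parallel <version>" as the first
# line of --version, while the moreutils tool rejects long options.
is_gnu_parallel() {
    cmd=${1:-parallel}
    # Fail early if the command cannot be found at all.
    command -v "$cmd" >/dev/null 2>&1 || return 1
    "$cmd" --version 2>/dev/null | head -n 1 | grep -q '^GNU parallel'
}

if is_gnu_parallel parallel; then
    echo "GNU Parallel found"
else
    echo "parallel is missing or is not GNU Parallel"
fi
```

The function takes the command name or path as its argument, so it can also be pointed at a specific file such as ./src/parallel in an unpacked source tree.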
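The parallel shipped in moreutils is a different program from GNU Parallel and has no --pipe option, so download the GNU Parallel source and install that instead: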
# wget http://ftp.gnu.org/gnu/parallel/parallel-20140622.tar.bz2
# wget http://ftp.gnu.org/gnu/parallel/parallel-20140622.tar.bz2.sig
Verify the download:
# gpg --verify parallel-20140622.tar.bz2.sig parallel-20140622.tar.bz2
gpg: Signature made Mon 23 Jun 2014 09:27:19 AM CST using RSA key ID 88888888
gpg: Can't check signature: public key not found
The public key has to be imported first:
# gpg --recv-keys 88888888
gpg: requesting key 88888888 from hkp server keys.gnupg.net
gpg: /root/.gnupg/trustdb.gpg: trustdb created
gpg: key 88888888: public key "Ole Tange <[email protected]>" imported
gpg: no ultimately trusted keys found
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
# gpg --verify --verbose parallel-20140622.tar.bz2.sig parallel-20140622.tar.bz2
gpg: armor header: Version: GnuPG v1.4.12 (GNU/Linux)
gpg: Signature made Mon 23 Jun 2014 09:27:19 AM CST using RSA key ID 88888888
gpg: using PGP trust model
gpg: Good signature from "Ole Tange <[email protected]>"
gpg:                 aka "Ole Tange <[email protected]>"
gpg:                 aka "[jpeg image of size 6001]"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: CDA0 1A42 08C4 F745 0610 7E7B D1AB 4516 8888 8888
gpg: binary signature, digest algorithm SHA512
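"Good signature" only says the file was signed by that key; the warning is a reminder that nothing yet ties the key to its claimed owner. One common way to close that gap is to compare the reported fingerprint with a copy obtained through a channel you trust. A minimal sketch of that comparison follows; the helper names are hypothetical, and the "expected" value is simply the fingerprint from the output above standing in for one you would fetch yourself:

```shell
#!/bin/sh
# Strip whitespace and upper-case the hex digits so that differently
# formatted copies of the same fingerprint compare equal.
normalize_fpr() {
    printf '%s' "$1" | tr -d '[:space:]' | tr 'a-f' 'A-F'
}

# Return 0 if both fingerprints are identical after normalization.
same_fingerprint() {
    [ "$(normalize_fpr "$1")" = "$(normalize_fpr "$2")" ]
}

# Fingerprint as printed by gpg in the transcript above.
reported='CDA0 1A42 08C4 F745 0610 7E7B D1AB 4516 8888 8888'
# Placeholder: in practice this comes from a source you trust.
expected='cda01a4208c4f74506107e7bd1ab451688888888'

if same_fingerprint "$reported" "$expected"; then
    echo "fingerprints match"
else
    echo "fingerprints differ" >&2
fi
```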
Install:
# tar jxf parallel-20140622.tar.bz2
# cd parallel-20140622/
# ./configure && make && make install
After the installation, running parallel directly still failed with the same error as before. Later, after a reboot, I located the executable parallel script in the src directory of the source tree and ran it from there:
root@hi:~/download/parallel-20140622# cat /root/tmp/num1000000 | ./src/parallel --pipe -j4 wc
When using programs that use GNU Parallel to process data for publication please cite:
O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
;login: The USENIX Magazine, February 2011:42-47.
This helps funding further development; and it won't cost you a cent.
To silence this citation notice run 'parallel --bibtex' once or use '--no-notice'.
 165668  165668 1048571
 149797  149797 1048579
 149796  149796 1048572
 149797  149797 1048579
  85349   85349  597444
 149797  149797 1048579
 149796  149796 1048572
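With --pipe the input is split into blocks (about 1 MB each by default, which matches the byte counts above), and wc reports one line per block rather than a single total. A small sketch for adding the per-block counts back together with awk; the function name sum_wc is made up for this example:

```shell
#!/bin/sh
# Sum the "lines words bytes" triples that wc prints for each block.
sum_wc() {
    awk '{ lines += $1; words += $2; bytes += $3 }
         END { print lines, words, bytes }'
}

# Per-block output copied from the run above.
sum_wc <<'EOF'
165668 165668 1048571
149797 149797 1048579
149796 149796 1048572
149797 149797 1048579
85349 85349 597444
149797 149797 1048579
149796 149796 1048572
EOF
# prints: 1000000 1000000 6888896
```

The totals agree with what a plain wc on the one-million-line test file reports. With GNU Parallel installed, the function can be appended directly to the pipeline, for example cat num1000000 | parallel --pipe -j4 wc | sum_wc.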
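From here you can follow the methods in that article and enjoy the speedup parallel brings (provided the machine has multiple cores; my VPS is single-core, so there is no speedup for me to enjoy ::>_<:: )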
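Whether any speedup is possible depends on how many cores the machine exposes. A small sketch for checking that first, assuming getconf supports _NPROCESSORS_ONLN (it does on glibc-based Linux); cpu_count is a hypothetical helper name:

```shell
#!/bin/sh
# Print the number of online CPU cores, falling back to 1 when it
# cannot be determined.
cpu_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    # Guard against empty or non-numeric output.
    case $n in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

cores=$(cpu_count)
if [ "$cores" -gt 1 ]; then
    echo "$cores cores online: parallel -j$cores may help"
else
    echo "single core: expect no speedup from parallel"
fi
```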
# find / -name parallel
/root/download/parallel-20140622/src/parallel
/usr/include/c++/4.6/parallel
/usr/local/bin/parallel
/usr/local/share/doc/parallel
/usr/bin/parallel
# file /usr/bin/parallel
/usr/bin/parallel: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, BuildID[sha1]=0xc39a37f03a41bc1b1096aa6175a2c427b28bfb23, stripped
# file /usr/local/bin/parallel
/usr/local/bin/parallel: a perl script, ASCII text executable, with escape sequences
# which parallel
/usr/local/bin/parallel
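As you can see, GNU Parallel (/usr/local/bin/parallel) is really just an executable Perl script, unlike the compiled moreutils binary in /usr/bin. If you feel like it, you could even try writing one yourself `(*∩_∩*)′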
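One possible explanation for parallel still failing right after make install, offered as a guess rather than something verified on that machine: the shell caches the location of each command it has run, so it may have kept calling /usr/bin/parallel even though a new /usr/local/bin/parallel had appeared earlier in PATH. The sketch below reproduces that effect with two stub scripts in temporary directories (the stubs and directory names are stand-ins, not the real programs):

```shell
#!/usr/bin/env bash
# Stand-in directories; on the real system these would be
# /usr/local/bin (searched first) and /usr/bin.
tmp=$(mktemp -d)
mkdir "$tmp/local_bin" "$tmp/bin"

# Only the "moreutils" stub exists at first.
printf '#!/bin/sh\necho moreutils-stub\n' > "$tmp/bin/parallel"
chmod +x "$tmp/bin/parallel"
PATH="$tmp/local_bin:$tmp/bin:$PATH"

parallel > "$tmp/run1"   # found in bin/, and that location is now cached

# "make install" drops a second parallel earlier on PATH.
printf '#!/bin/sh\necho gnu-stub\n' > "$tmp/local_bin/parallel"
chmod +x "$tmp/local_bin/parallel"

parallel > "$tmp/run2"   # the cached location is still used
hash -r                  # forget all cached command locations
parallel > "$tmp/run3"   # PATH is searched again from the start

cat "$tmp/run1" "$tmp/run2" "$tmp/run3"
```

If this was indeed the cause, running hash -r or opening a new shell would have had the same effect as the reboot. In bash, type -a parallel lists every match on PATH in lookup order.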
Further reading:
- Use multiple CPU Cores with your Linux commands — awk, sed, bzip2, grep, wc, etc. | RankFocus – Systems and Data
- Download location for parallel
- Download location for the quick install script
- Generating and verifying digital signatures with .sig files
- GNU Parallel Tutorial
- Gnu Parallel – Parallelize Serial Command Line Programs Without Changing Them
Some quick-start tutorial videos for the parallel command:
- https://www.youtube.com/watch?v=OpaiGYxkSuQ&index=2&list=PL284C9FF2488BC6D1
- https://www.youtube.com/watch?v=P40akGWJ_gY&index=3&list=PL284C9FF2488BC6D1
- https://www.youtube.com/watch?v=1ntxT-47VPA&index=4&list=PL284C9FF2488BC6D1
- https://www.youtube.com/watch?v=fOX1EyHkQwc&index=5&list=PL284C9FF2488BC6D1
====
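Verifying signatures with .sig signature files on Linux
Some downloads are published together with a detached signature file, named after the download with ".sig" appended, which is used to check the integrity of the download.
Take grub as an example: the latest version at the time of writing is 2.00, available from ftp://ftp.gnu.org/gnu/grub/ as two files: grub-2.00.tar.gz.sig and grub-2.00.tar.gz.
To verify: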
$ gpg --verify grub-2.00.tar.gz.sig grub-2.00.tar.gz
gpg: Signature made Thu 28 Jun 2012 08:11:54 AM CST using DSA key ID E82E4209
gpg: Can't check signature: public key not found
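This means the matching public key could not be found. The output also shows the ID of the key that made the signature, E82E4209, so import the public key by that ID:
$ gpg --recv-keys E82E4209
gpg: requesting key E82E4209 from hkp server keys.gnupg.net
gpg: key E82E4209: public key "Vladimir 'phcoder' Serbinenko <[email protected]>" imported
gpg: no ultimately trusted keys found
gpg: Total number processed: 1
gpg:               imported: 1
$ gpg --verify --verbose grub-2.00.tar.gz.sig grub-2.00.tar.gz
gpg: Signature made Thu 28 Jun 2012 08:11:54 AM CST using DSA key ID E82E4209
gpg: using PGP trust model
gpg: Good signature from "Vladimir 'phcoder' Serbinenko <[email protected]>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: E53D 497F 3FA4 2AD8 C9B4 D1E8 35A9 3B74 E82E 4209
gpg: binary signature, digest algorithm SHA512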
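In a script, the text output matters less than the exit status: gpg --verify exits non-zero when the signature is bad or the public key is missing. A minimal sketch of a wrapper that unpacks an archive only when the check passes; verify_then is a hypothetical helper name, and the archive.sig naming follows the convention described above:

```shell
#!/bin/sh
# Run the given command only if the detached signature verifies.
# Usage: verify_then ARCHIVE COMMAND [ARGS...]
# Assumes the signature file is named ARCHIVE.sig.
verify_then() {
    archive=$1
    shift
    # gpg's own messages are silenced here; only its exit status is used.
    if gpg --verify "$archive.sig" "$archive" >/dev/null 2>&1; then
        "$@"
    else
        echo "signature check failed for $archive" >&2
        return 1
    fi
}

# Example (file names taken from the text above):
#   verify_then grub-2.00.tar.gz tar xzf grub-2.00.tar.gz
```

A zero exit status still only means the signature matches some key in the keyring; it does not establish who owns that key, which is what the warning in the verbose output is about.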
====
Addendum: several ways to install parallel
GNU Parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially. This makes it possible to use output from GNU Parallel as input for other programs.
= 10 seconds installation =
The 10 seconds installation will try to do a full installation; if that fails, a personal installation; if that fails, a minimal installation.
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
This will literally install faster than reading the rest of this document.
= Full installation =
Full installation of GNU Parallel is as simple as:
wget http://ftpmirror.gnu.org/parallel/parallel-20140622.tar.bz2
bzip2 -dc parallel-20140622.tar.bz2 | tar xvf -
cd parallel-20140622
./configure && make && make install
= Personal installation =
If you are not root you can add ~/bin to your path and install in ~/bin and ~/share:
wget http://ftpmirror.gnu.org/parallel/parallel-20140622.tar.bz2
bzip2 -dc parallel-20140622.tar.bz2 | tar xvf -
cd parallel-20140622
./configure --prefix=$HOME && make && make install
Or, if you would rather not install or your system lacks ‘make’, you can simply copy src/parallel src/sem src/niceload src/sql from the unpacked source to a directory in your $PATH.
= Minimal installation =
If you just need parallel and do not have ‘make’ installed (maybe the system is old or Microsoft Windows):
wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem
mv parallel sem dir-in-your-$PATH/bin/
= Test the installation =
After this you should be able to do:
parallel -j0 ping -nc 3 ::: foss.org.my gnu.org freenetproject.org
This will send 3 ping packets to 3 different hosts in parallel and print the output when they complete.
Watch the intro video for a quick introduction:
https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.
When using programs that use GNU Parallel to process data for publication please cite:
O. Tange (2011): GNU Parallel – The Command-Line Power Tool,
;login: The USENIX Magazine, February 2011:42-47.
1 comment on "Using the parallel command on Ubuntu 12.04 (32-bit)"
Shell black magic: parallelizing tasks with anonymous functions
https://my.oschina.net/leejun2005/blog/917455
Parallelizing batch jobs with a Bash script
http://jerkwin.github.io/2013/12/14/Bash%E8%84%9A%E6%9C%AC%E5%AE%9E%E7%8E%B0%E6%89%B9%E9%87%8F%E4%BD%9C%E4%B8%9A%E5%B9%B6%E8%A1%8C%E5%8C%96/