[Translation] An Unconventional Cheat-Sheet for Password Crackers

Original article: unix-ninja :: A cheat-sheet for password crackers

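A few words before the article itself. I'm honestly rather excited: such a good article, and I'm only seeing it now. The willingness of people abroad to share is impressive. I suspect that in China material this good would rarely show up in public (people would quietly use it themselves and never release it, or pass it around only within a small circle), and someone like me would certainly never get to see it ::>_<::

I first came across the article on 360 Security News (360安全播报), but the translation there was rather careless. I still owe them thanks for their diligent collecting; without it I would not have seen this article now, or for quite a while. Worse, for an article like this, where the example code matters most, nobody had checked for possible escaping problems, which led to quite a few errors. So I went back to the English original to compare and retranslate. This is really my first translation of this kind and my English is long out of practice; I spent some time proofreading, but I can only hope the mistakes are not too many. I then checked everything against my own limited experience. There are bound to be errors left, so please bear with me~~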

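In this article I will share some Bash commands and regular expressions that I have found useful for password cracking. Most of the time we find hashes to crack through sites like Pastebin. Separating hashes by hand is very time-consuming, so here we will use regular expressions to make our lives easier!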

Extract MD5 hashes (egrep)

# egrep -oE '(^|[^a-fA-F0-9])[a-fA-F0-9]{32}([^a-fA-F0-9]|$)' *.txt | egrep -o '[a-fA-F0-9]{32}' > md5-hashes.txt

Alternatively, with sed

# sed -rn 's/.*[^a-fA-F0-9]([a-fA-F0-9]{32})[^a-fA-F0-9].*/\1/p' *.txt > md5-hashes

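Note: the regular expressions above can also extract SHA1, SHA256 and other unsalted hashes represented in hex. The only thing you need to change is the 32, replacing it with the string length of the hash type in question (for example, 40 for SHA1 or 64 for SHA256).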
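
To make the length substitution concrete, here is a minimal sketch of the egrep pipeline wrapped in a shell function that takes the hash length as a parameter. The function name extract_hex_hashes is made up for illustration and is not from the original article; the sketch assumes GNU grep (for -o and -h).

```shell
# Hypothetical helper: extract_hex_hashes LENGTH FILE...
# Prints every standalone run of exactly LENGTH hex digits, one per line.
extract_hex_hashes() {
    len=$1
    shift
    # -h drops the "file:" prefix so only the matched text is passed on.
    grep -ohE "(^|[^a-fA-F0-9])[a-fA-F0-9]{$len}([^a-fA-F0-9]|\$)" "$@" |
        grep -oE "[a-fA-F0-9]{$len}"
}

# Usage: extract_hex_hashes 32 *.txt > md5-hashes.txt
#        extract_hex_hashes 40 *.txt > sha1-hashes.txt
```

Because the first grep requires a non-hex character (or a line boundary) on both sides, a 40-digit SHA1 string is not reported as an MD5 hash when the length is 32.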

Extract valid MySQL-Old hashes

# grep -e "[0-7][0-9a-f]\{7\}[0-7][0-9a-f]\{7\}" *.txt > mysql-old-hashes.txt

Extract blowfish hashes

# grep -oE '\$2a\$08\$[./A-Za-z0-9]{53}' *.txt > blowfish-hashes.txt

Extract Joomla hashes

# egrep -o "([0-9a-zA-Z]{32}):(\w{16,32})" *.txt > joomla.txt

Extract vBulletin hashes

# egrep -o "([0-9a-zA-Z]{32}):(\S{3,32})" *.txt > vbulletin.txt

Extract phpBB3-MD5 hashes

# egrep -o '\$H\$\S{31}' *.txt > phpBB3-md5.txt

Extract WordPress-MD5 hashes

# egrep -o '\$P\$\S{31}' *.txt > wordpress-md5.txt

Extract Drupal 7 hashes

# egrep -o '\$S\$\S{52}' *.txt > drupal-7.txt

Extract old Unix-md5 hashes

# egrep -o '\$1\$[./A-Za-z0-9]{1,8}\$[./A-Za-z0-9]{22}' *.txt > md5-unix-old.txt

Extract md5-apr1 hashes

# egrep -o '\$apr1\$[./A-Za-z0-9]{1,8}\$[./A-Za-z0-9]{22}' *.txt > md5-apr1.txt

Extract sha512crypt, SHA512(Unix) hashes

# egrep -o '\$6\$[./A-Za-z0-9]{1,16}\$[./A-Za-z0-9]{86}' *.txt > sha512crypt.txt

Extract e-mail addresses from text files

# grep -E -o '\b[a-zA-Z0-9.#?$*_-]+@[a-zA-Z0-9.#?$*_-]+\.[a-zA-Z0-9.-]+\b' *.txt > e-mails.txt

Extract URLs from text files

# grep -shoP 'http.*?[" >]' *.txt > http-urls.txt

Extract HTTPS, FTP and other URL formats

# grep -E '(((https|ftp|gopher)|mailto)[.:][^ >"\t]*|www\.[-a-z0-9.]+)[^ .,;\t>">\):]' *.txt > urls.txt

Note: if grep returns "Binary file (standard input) matches", use one of the following approaches

# cat *.log | tr '[\000-\011\013-\037\177-\377]' '.' | grep -E "Your_Regex"
or
# cat -v *.log | egrep -o "Your_Regex"
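
With GNU grep, another option is the -a (--text) flag, which makes grep treat a binary file as if it were text, with no preprocessing step. The sketch below is only an illustration: the wrapper name grep_binary is hypothetical, and the pattern you would pass as "Your_Regex" goes in as the first argument.

```shell
# Hypothetical wrapper: grep_binary PATTERN FILE...
# -a  process binary files as text (GNU grep)
# -o  print only the matched part
# -h  do not prefix matches with the file name
grep_binary() {
    pattern=$1
    shift
    grep -a -o -h -E "$pattern" "$@"
}

# Usage: grep_binary "Your_Regex" *.log
```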

Extract floating point numbers

# grep -E -o "^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$" *.txt > floats.txt

Extract Visa credit card data

# grep -E -o "4[0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}" *.txt > visa.txt

Extract MasterCard data

# grep -E -o "5[0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}" *.txt > mastercard.txt

Extract American Express data

# grep -E -o "\b3[47][0-9]{13}\b" *.txt > american-express.txt

Extract Diners Club data

# grep -P -o "\b3(?:0[0-5]|[68][0-9])[0-9]{11}\b" *.txt > diners.txt

Extract Discover data

# grep -E -o "6011[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}" *.txt > discover.txt

Extract JCB data

# grep -P -o "\b(?:2131|1800|35\d{3})\d{11}\b" *.txt > jcb.txt

Extract AMEX data

# grep -E -o "3[47][0-9]{2}[ -]?[0-9]{6}[ -]?[0-9]{5}" *.txt > amex.txt

Extract Social Security Numbers (SSN)

# grep -E -o "[0-9]{3}[ -]?[0-9]{2}[ -]?[0-9]{4}" *.txt > ssn.txt

Extract Indiana driver license numbers

# grep -E -o "[0-9]{4}[ -]?[0-9]{2}[ -]?[0-9]{4}" *.txt > indiana-dln.txt

Extract US passport card numbers

# grep -E -o "C0[0-9]{7}" *.txt > us-pass-card.txt

Extract US passport numbers

# grep -E -o "[23][0-9]{8}" *.txt > us-pass-num.txt

Extract US phone numbers

# grep -Po '\d{3}[\s\-_]?\d{3}[\s\-_]?\d{4}' *.txt > us-phones.txt

Extract ISBN numbers

# grep -a -o -P '\bISBN(?:-1[03])?:? (?=[0-9X]{10}$|(?=(?:[0-9]+[- ]){3})[- 0-9X]{13}$|97[89][0-9]{10}$|(?=(?:[0-9]+[- ]){4})[- 0-9]{17}$)(?:97[89][- ]?)?[0-9]{1,5}[- ]?[0-9]+[- ]?[0-9]+[- ]?[0-9X]\b' *.txt > isbn.txt

========

Some handy tricks with sed/awk/grep

Remove the space character with sed

# sed -i 's/ //g' file.txt
OR (to drop lines that are empty or contain only whitespace)
# egrep -v "^[[:space:]]*$" file.txt

Remove the last character of each line with sed (for example, a trailing space)

# sed -i s/.$// file.txt

Sort by word length

# awk '{print length, $0}' rockyou.txt | sort -n | cut -d " " -f2- > rockyou_length-list.txt
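
As a small worked example of how that pipeline behaves, the sketch below runs the same three stages on a handful of sample words instead of rockyou.txt. The function name sort_by_length and the sample words are assumptions made for illustration.

```shell
# Hypothetical helper: reads lines on stdin, prints them shortest first.
sort_by_length() {
    awk '{print length($0), $0}' |  # prefix each line with its length
        sort -n |                   # numeric sort on that prefix
        cut -d " " -f2-             # strip the prefix again
}

# Sample input standing in for a real wordlist.
printf '%s\n' 'password' 'abc' 'letmein' 'qwerty12345' | sort_by_length
```

The length prefix exists only so that sort has a numeric key; cut removes it again so the output is a plain wordlist.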

Convert uppercase to lowercase and the opposite

# tr 'A-Z' 'a-z' < file.txt > lower-case.txt
# tr 'a-z' 'A-Z' < file.txt > upper-case.txt

Remove blank lines with sed

# sed -i '/^$/d' List.txt

Remove a custom character with sed

# sed -i "s/'//g" file.txt

Remove a string with sed

# echo 'This is a foo test' | sed -e 's/\<foo\>//g'

Replace characters with tr

# tr '@' '#' < emails.txt
OR
# sed 's/@/#/g' file.txt

Print specific columns with awk or cut

# awk -F "," '{print $3}' infile.csv > outfile.csv
or
# cut -d "," -f 3 infile.csv > outfile.csv

Note: if you want to keep column 3 and every column after it, use

# cut -d "," -f 3- infile.csv > outfile.csv
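
To show the difference between -f 3 and -f 3- on one line of input, here is a short illustration. The sample row (user name, e-mail, hash, date) is invented for the example.

```shell
# Hypothetical CSV row: name, e-mail, MD5 hash, date.
row='alice,alice@example.com,5f4dcc3b5aa765d61d8327deb882cf99,2015-01-01'

printf '%s\n' "$row" | cut -d "," -f 3     # third column only
printf '%s\n' "$row" | cut -d "," -f 3-    # third column through the last
```

The first command prints just the hash; the second prints the hash and the date, still separated by a comma.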

Generate random passwords with urandom

# tr -dc 'a-zA-Z0-9._!@#$%^&*()' < /dev/urandom | fold -w 8 | head -n 500000 > wordlist.txt
# tr -dc 'a-zA-Z0-9-_!@#$%^&*()_+{}|:<>?=' < /dev/urandom | fold -w 12 | head -n 4
# base64 /dev/urandom | tr -d '/+' | cut -c1-10 | head -n 2
# tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w 10 | head -n 4
# tr -dc 'a-zA-Z0-9-_!@#$%^&*()_+{}|:<>?=' < /dev/urandom | fold -w 12 | head -n 4 | grep -i '[!@#$%^&*()_+{}|:<>?=]'
# tr -dc '[:print:]' < /dev/urandom | fold -w 10| head -n 10
# tr -cd '[:alnum:]' < /dev/urandom | fold -w30 | head -n2
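
The one-liners above all follow the same pattern: filter /dev/urandom down to an allowed character set, fold the stream into fixed-width lines, and take as many lines as needed. A minimal sketch of that pattern as a reusable function follows; the name gen_passwords and its two parameters are assumptions made for illustration.

```shell
# Hypothetical helper: gen_passwords LENGTH COUNT
# Prints COUNT random alphanumeric passwords of LENGTH characters each.
gen_passwords() {
    # LC_ALL=C makes tr work byte by byte, which avoids "illegal byte
    # sequence" errors on systems with a multibyte default locale.
    LC_ALL=C tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w "$1" | head -n "$2"
}

gen_passwords 12 5
```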

Remove parentheses with tr

# tr -d '()' < in_file > out_file

Generate wordlists from your file names

# ls -A | sed 's/regexp/&\n/g'

Process text files when cat is unable to handle strange characters

# sed 's/\([[:alnum:]]*\)[[:space:]]*(.)\(\..*\)/\1\2/' *.txt

Generate length-based wordlists with awk

# awk 'length == 10' file.txt > 10-length.txt
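
The same idea extends from one exact length to a range of lengths. The sketch below is one way to do it; the function name filter_by_length and its min/max parameters are hypothetical.

```shell
# Hypothetical helper: filter_by_length MIN MAX
# Reads lines on stdin and prints those whose length is between MIN and MAX.
filter_by_length() {
    awk -v min="$1" -v max="$2" 'length($0) >= min && length($0) <= max'
}

# Usage: filter_by_length 8 10 < file.txt > 8-to-10-length.txt
```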

Merge two different txt files

# paste -d' ' file1.txt file2.txt > new-file.txt

Faster sorting (with sort --parallel)

# LC_ALL=C sort --parallel=<number_of_cpu_cores> -S <amount_of_memory>G -u file.txt > new-file.txt

Mac to Unix

# tr '\015' '\012' < in_file > out_file

DOS to Unix

# dos2unix file.txt

Unix to DOS

# unix2dos file.txt

Remove from one file what is in another file

# grep -F -v -f file1.txt -w file2.txt > file3.txt
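
A typical use is stripping already-cracked words out of a wordlist. The sketch below wraps the same grep flags in a function; the name remove_known is made up for illustration. Note that -w only removes lines where an entry matches as a whole word, so "letmein" in the first file removes the line "letmein" but keeps "letmein1".

```shell
# Hypothetical helper: remove_known KNOWN_FILE WORDLIST
# Prints the lines of WORDLIST that contain no entry of KNOWN_FILE as a whole word.
# -F  treat each line of KNOWN_FILE as a fixed string, not a regex
# -v  invert the match (keep the lines that do NOT match)
# -w  match whole words only
remove_known() {
    grep -F -v -w -f "$1" "$2"
}

# Usage: remove_known file1.txt file2.txt > file3.txt
```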

Isolate specific line numbers with sed

# sed -n '1,100p' test.file > file.out

Create wordlists from PDF files

# pdftotext file.pdf file.txt

Find the line number of a string inside a file

# awk '{ print NR, $0 }' file.txt | grep "string-to-grep"
or
# grep -n "string-to-grep" file.txt

====
Faster filtering with the silver searcher
https://github.com/ggreer/the_silver_searcher

For faster grepping, use all the above grep regular expressions with the command ag: replace grep with ag and keep the regular expression unchanged. The following is a proof of concept of its speed:

# time ack-grep -o "\b[a-zA-Z0-9.#?$*_-]+@[a-zA-Z0-9.#?$*_-]+\.[a-zA-Z0-9.-]+\b" *.txt > /dev/null
real 1m2.447s
user 1m2.297s
sys 0m0.645s

# time egrep -o "\b[a-zA-Z0-9.#?$*_-]+@[a-zA-Z0-9.#?$*_-]+\.[a-zA-Z0-9.-]+\b" *.txt > /dev/null
real 0m30.484s
user 0m30.292s
sys 0m0.310s

# time ag -o "\b[a-zA-Z0-9.#?$*_-]+@[a-zA-Z0-9.#?$*_-]+\.[a-zA-Z0-9.-]+\b" *.txt > /dev/null
real 0m4.908s
user 0m4.820s
sys 0m0.277s

====

Useful Use of Cat

Contrary to what many veteran Unix users may believe, this happens to be one of the rare opportunities where using cat can actually make your searches faster. The SilverSearcher utility is (at the time of this writing) not quite as efficient as cat when it comes to reading from file handles. Therefore, you can pipe output from cat into ag to see nearly a 2x real time performance gain. {Translator's note: the gain shows up when cat opens a large number of files and feeds them to ag, instead of ag reading them one by one through its own file handles.}

$ time ag -o '(^|[^a-fA-F0-9])[a-fA-F0-9]{32}([^a-fA-F0-9]|\$)' *.txt | ag -o '[a-fA-F0-9]{32}' > /dev/null
real 0m10.851s
user 0m13.069s
sys 0m0.092s

$ time cat *.txt | ag -o '(^|[^a-fA-F0-9])[a-fA-F0-9]{32}([^a-fA-F0-9]|\$)' | ag -o '[a-fA-F0-9]{32}' > /dev/null
real 0m6.689s
user 0m7.881s
sys 0m0.424s
