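Question: how can crontab visit a web page automatically at regular intervals?
For example, http://ixyzero.com/zhihu/ is a page that runs a scraping task and updates the database only when it is visited. Can crontab be set up to visit this page automatically at a given hour every day?
Solution: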
lynx -source http://ixyzero.com/zhihu/ >/dev/null 2>&1
or
wget -q --spider http://ixyzero.com/zhihu/
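Either tool works, but not every machine has both installed, and cron often runs with a very short PATH. As a rough sketch (the function name `fetch_command_for` is hypothetical and not part of the original solution), the following prints the command to use for a given URL, based on whichever of the two tools `command -v` can find:

```shell
#!/bin/sh
# fetch_command_for URL
# Prints a command line that visits URL, preferring lynx and falling back to wget.
# Hypothetical helper: the name and the preference order are assumptions.
fetch_command_for() {
    url="$1"
    if path=$(command -v lynx 2>/dev/null); then
        # lynx prints the page source, so discard it
        echo "$path -source $url >/dev/null 2>&1"
    elif path=$(command -v wget 2>/dev/null); then
        # wget in quiet spider mode prints nothing and saves nothing
        echo "$path -q --spider $url"
    else
        echo "neither lynx nor wget found" >&2
        return 1
    fi
}
```

Running `fetch_command_for http://ixyzero.com/zhihu/` in an interactive shell normally prints the tool's full path, which can then be pasted into the crontab entry.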
Applying this to crontab
For example, to visit the page below automatically at 10:00 every day:
#crontab -e
0 10 * * * lynx -source http://ixyzero.com/zhihu/index.php >/dev/null 2>&1
or
0 10 * * * wget -q --spider http://ixyzero.com/zhihu/index.php
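Since the job runs unattended, it can be useful to know when a visit fails. The following is a minimal sketch, not part of the original solution: a hypothetical wrapper (the name `visit_page`, the log path `/tmp/visit_page.log`, and the 30-second timeout are all assumptions) that calls wget once with a timeout and appends a line to a log file when wget exits with a non-zero status.

```shell
#!/bin/sh
# Hypothetical wrapper for the cron job; the function name, log path and
# timeout are assumptions chosen for illustration.

LOG_FILE="${LOG_FILE:-/tmp/visit_page.log}"

# visit_page URL
# Requests URL once with wget (30 s timeout, one try) and returns wget's
# exit status. On failure, appends a timestamped line to $LOG_FILE.
visit_page() {
    url="$1"
    wget -q --spider --timeout=30 --tries=1 "$url"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') wget exit $status for $url" >> "$LOG_FILE"
    fi
    return "$status"
}

# Only contact the network when a URL is passed on the command line.
if [ "$#" -gt 0 ]; then
    visit_page "$1"
fi
```

Saved as, say, /usr/local/bin/visit_page.sh (a hypothetical path), it could be scheduled with `0 10 * * * /bin/sh /usr/local/bin/visit_page.sh http://ixyzero.com/zhihu/index.php`.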
-----
Explanation of the commands and their options:
lynx -source http://ixyzero.com/zhihu/ >/dev/null 2>&1
# -source
works the same as dump but outputs HTML source instead of formatted text.
# -dump
dumps the formatted output of the default document or those specified on the command line to standard output. Unlike interactive mode, all documents are processed. This can be used in the following way:
lynx -dump http://www.subir.com/lynx.html
wget -q --spider http://ixyzero.com/zhihu/
# -q [--quiet]
Turn off Wget’s output.
# --spider
Wget will behave as a Web spider, which means that it will not download the pages, just check that they are there. (This feature needs much more work for Wget to get close to the functionality of real web spiders.)
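Because --spider only checks that the page is there, it may not be enough for a page whose work is triggered by a normal page load: depending on the wget version, the check may be sent as a HEAD request, which some server-side scripts handle differently from a GET. If the task does not seem to run, one hedged alternative is to do a full fetch and discard the body with `-O /dev/null`. The function name `fetch_and_discard` below is hypothetical.

```shell
#!/bin/sh
# fetch_and_discard URL
# Performs a normal wget download of URL and throws the response body away.
# Hypothetical helper: -q silences wget, -O /dev/null sends the body nowhere.
fetch_and_discard() {
    wget -q -O /dev/null "$1"
}
```

The equivalent crontab entry would be `0 10 * * * wget -q -O /dev/null http://ixyzero.com/zhihu/index.php`.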