=Start=
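Background:
Running a certain find operation on a server drove memory usage up; the memory usage of the business processes also kept climbing and was slow to recover. Initial investigation suggested that growth in proc_inode_cache was the cause.
Main text:
Reference answer:
From [Linux Used内存到底哪里去了? | http://blog.yufeng.info/archives/2456] (roughly, "Where did Linux used memory actually go?") I learned that "memory mainly goes to three places: 1. process consumption; 2. slab consumption; 3. page table consumption."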
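struct page is allocated at boot time, sized according to the amount of physical memory. It takes about 1.56% of memory on the 18 kernel and about 2.3% on the 32 kernel because of cgroup (presumably the 2.6.18 and 2.6.32 kernels).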
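The slab and page table shares can be read directly from /proc/meminfo. The following is a minimal sketch, assuming the standard `Slab`, `SReclaimable`, `SUnreclaim`, and `PageTables` fields are present; the function name `mem_breakdown` and its optional file argument are made up for illustration, and process consumption is not covered here.

```shell
# Hypothetical helper: print slab and page table totals from a meminfo-style file.
# The optional argument defaults to /proc/meminfo.
mem_breakdown() {
    awk '$1 == "Slab:" || $1 == "SReclaimable:" || $1 == "SUnreclaim:" || $1 == "PageTables:" {
        print $1, $2, $3
    }' "${1:-/proc/meminfo}"
}
```

Calling `mem_breakdown` with no argument reads the live system. A large `SReclaimable` value would be consistent with dentry and inode caches holding the memory.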
How to check slab usage:
$ slabtop
[OR]
$ cat /proc/slabinfo | awk '{print $1,$3*$4/1024,"KB"}' | sort -k2 -n | tail
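The awk one-liner multiplies the object count by the object size for each cache. A slightly tidier variant is sketched below, assuming the usual /proc/slabinfo layout (name, active_objs, num_objs, objsize, ...); the name `slab_top` and both of its arguments are hypothetical. It skips the two header lines and sorts largest first.

```shell
# Hypothetical helper: list the largest slab caches by num_objs * objsize.
# $1 = number of rows to show (default 10), $2 = slabinfo-style file.
# Reading the real /proc/slabinfo usually requires root.
slab_top() {
    awk '$1 != "slabinfo" && $1 !~ /^#/ {
        printf "%s %d KB\n", $1, $3 * $4 / 1024
    }' "${2:-/proc/slabinfo}" |
        sort -k2,2nr | head -n "${1:-10}"
}
```

For example, `slab_top 5` would show the five largest caches. If proc_inode_cache or dentry sits at the top, that matches the symptom described in the background.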
A more dangerous approach:
sync # run first: drop_caches only frees clean objects, so flush dirty data beforehand
echo 1 > /proc/sys/vm/drop_caches # free pagecache
[OR]
echo 2 > /proc/sys/vm/drop_caches # free dentries and inodes
[OR]
echo 3 > /proc/sys/vm/drop_caches # free pagecache, dentries and inodes
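Since these writes discard caches system-wide, it may be worth wrapping them so that only the three documented values are accepted and dirty data is flushed first. This is a sketch only; the wrapper name `drop_caches` and its optional file argument are assumptions, not an existing tool.

```shell
# Hypothetical wrapper: flush dirty data, then write 1, 2, or 3 to drop_caches.
# $1 = mode, $2 = target file (defaults to the real knob, which needs root).
drop_caches() {
    case $1 in
        1|2|3) ;;
        *) echo "usage: drop_caches 1|2|3 [file]" >&2; return 2 ;;
    esac
    sync    # dirty objects are not freeable, so write them out first
    echo "$1" > "${2:-/proc/sys/vm/drop_caches}"
}
```

Run as root, `drop_caches 2` would then release dentries and inodes only, leaving the page cache alone.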
A gentler approach:
echo 300 > /proc/sys/vm/vfs_cache_pressure # Increasing vfs_cache_pressure beyond 100 causes the kernel to prefer to reclaim dentries and inodes
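When changing this value it helps to record what it was before. The sketch below does that; the function name `set_cache_pressure` and its optional file argument are hypothetical, and 300 is simply the value used above rather than a recommendation.

```shell
# Hypothetical helper: set vfs_cache_pressure and report the old and new values.
# $1 = new value, $2 = target file (defaults to the real knob, which needs root).
set_cache_pressure() {
    new=$1
    file=${2:-/proc/sys/vm/vfs_cache_pressure}
    case $new in
        ''|*[!0-9]*) echo "value must be a non-negative integer" >&2; return 1 ;;
    esac
    old=$(cat "$file") || return 1
    echo "$new" > "$file" || return 1
    echo "vfs_cache_pressure: $old -> $new"
}
```

A change made this way does not survive a reboot. To keep it, a line such as `vm.vfs_cache_pressure = 300` in /etc/sysctl.conf is the usual route.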
Reference links:
- Reducing inode and dentry caches to keep OOM killer at bay
https://major.io/2008/12/03/reducing-inode-and-dentry-caches-to-keep-oom-killer-at-bay/
- slabinfo
http://man7.org/linux/man-pages/man5/slabinfo.5.html
- drop_caches
http://linuxinsight.com/proc_sys_vm_drop_caches.html
- OOM Killer
https://linux-mm.org/OOM_Killer
=END=
8 comments on "How to reduce inode and dentry caches to help avoid OOM"
The art of peeking: understanding the nature of OOM through the Linux source code!
http://rdc.hundsun.com/portal/article/748.html
https://linux-mm.org/OOM
`
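There are two ways to keep an important process from being killed by the OOM mechanism:
▪ Civilized and graceful: set the /proc/PID/oom_adj parameter.
▪ Simple and crude: disable the Linux kernel's OOM mechanism.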
`
A piece of code in auditd, the Linux audit daemon, that keeps the process from being killed by the OOM mechanism (static void avoid_oom_killer(void){}):
https://github.com/linux-audit/audit-userspace/blob/master/src/auditd.c#L324
`
static void avoid_oom_killer(void)
{
	int oomfd, len, rc;
	char *score = NULL;

	/* New kernels use different technique */
	if ((oomfd = open("/proc/self/oom_score_adj", O_NOFOLLOW | O_WRONLY)) >= 0) {
		score = "-1000";
	} else if ((oomfd = open("/proc/self/oom_adj", O_NOFOLLOW | O_WRONLY)) >= 0) {
		score = "-17";
	} else {
		audit_msg(LOG_NOTICE, "Cannot open out of memory adjuster");
		return;
	}

	len = strlen(score);
	rc = write(oomfd, score, len);
	if (rc != len)
		audit_msg(LOG_NOTICE, "Unable to adjust out of memory score");
	close(oomfd);
}
`
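The same fallback (try oom_score_adj, then the older oom_adj) can be sketched in shell for an arbitrary PID. This is an illustrative rewrite and not part of auditd; the optional second argument, a proc root directory, is an assumption added so the function can be exercised without touching a real process.

```shell
# Hypothetical helper: lower the OOM priority of one process as far as possible.
# $1 = PID, $2 = proc root (defaults to /proc). Writing negative values needs root.
avoid_oom_killer() {
    dir=${2:-/proc}/$1
    if [ -w "$dir/oom_score_adj" ]; then
        printf '%s\n' -1000 > "$dir/oom_score_adj"    # newer kernels
    elif [ -w "$dir/oom_adj" ]; then
        printf '%s\n' -17 > "$dir/oom_adj"            # older kernels
    else
        echo "Cannot open out of memory adjuster for PID $1" >&2
        return 1
    fi
}
```

Something like `avoid_oom_killer "$(pidof -s sshd)"`, run as root, would then exempt that one process. The process name here is only an example.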
Understanding and configuring the OOM Killer on Linux
http://www.vpsee.com/2013/10/how-to-configure-the-linux-oom-killer/
`
dmesg
grep -i "Kill process" /var/log/messages
# vi oomscore.sh
#!/bin/bash
for proc in $(find /proc -maxdepth 1 -regex '/proc/[0-9]+'); do
    printf "%2d %5d %s\n" \
        "$(cat "$proc/oom_score")" \
        "$(basename "$proc")" \
        "$(cat "$proc/cmdline" | tr '\0' ' ' | head -c 50)"
done 2>/dev/null | sort -nr | head -n 10
# chmod +x oomscore.sh
# ./oomscore.sh
`
The Linux OOM Killer
https://linux.cn/article-5824-1.html
How to check for OOM messages on CentOS
https://stackoverflow.com/questions/624857/finding-which-process-was-killed-by-linux-oom-killer
https://unix.stackexchange.com/questions/128642/debug-out-of-memory-with-var-log-messages
`
grep -i 'killed process' /var/log/messages
dmesg | egrep -i 'killed process'
grep oom /var/log/*
grep total_vm /var/log/*
dstat --top-oom # show the process most likely to be killed first by the OOM killer
`
Where do the OOM logs go on CentOS 7?
https://www.reddit.com/r/linuxadmin/comments/3zun74/whered_the_oom_logs_go_in_centos7_systemd/
https://unix.stackexchange.com/questions/128642/debug-out-of-memory-with-var-log-messages/
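To go from raw log lines to a list of victims, a small filter can pull out the PID and command name. This is a sketch that assumes the kernel message contains the text `Killed process <pid> (<name>)`; the exact wording differs between kernel versions, and the function name `oom_victims` is made up.

```shell
# Hypothetical helper: print "PID NAME" for every "Killed process" line.
# Reads the given log file(s), or standard input when none are given.
oom_victims() {
    sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p' "$@"
}
```

Typical use would be `oom_victims /var/log/messages` or `dmesg | oom_victims`. On systemd-based hosts, where kernel messages may live in the journal rather than in /var/log/messages, `journalctl -k | oom_victims` should serve the same purpose.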
An introduction to Linux cgroups
http://blog.jobbole.com/114311/
https://www.cnblogs.com/sparkdev/p/8296063.html
An introduction to Linux Control Groups
http://wsfdl.com/linux/2015/05/21/%E7%90%86%E8%A7%A3control_group.html
A brief look at Cgroups
https://mp.weixin.qq.com/s/3a5k3YA6ALri3BrQWQbOpw
A brief look at Cgroups V2
https://mp.weixin.qq.com/s/8Rr-hxKQyHpT7L-Zx7PkcA
Linux cgroup features (2): a detailed introduction to resource limits in cgroup v1 and cgroup v2
https://www.lijiaocn.com/%E6%8A%80%E5%B7%A7/2019/01/28/linux-tool-cgroup-detail.html
Understanding Linux Cgroup in depth (1): basic concepts
https://mp.weixin.qq.com/s/JAh3hN-vOCPLE_iHs92J3Q
Understanding Linux Cgroup in depth (2): working with CPU
https://mp.weixin.qq.com/s/dKWut8zEdJDs9OxVZAhE4A
Understanding Linux Cgroup in depth (3): memory
https://mp.weixin.qq.com/s/TJ864CBkS2v-COhOkg-JpQ