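I have recently been learning to program packet capture and reassembly. Once the code was written, I needed to analyze its capture performance, and one of the most important metrics is the packet loss rate: if it is too high, the design may have to be redone. But how do you find out a program's packet loss rate under high-speed network I/O? One obvious approach is to compare against what tcpdump captures, but tcpdump is built on libpcap and also drops packets when network I/O is heavy, so its accuracy is questionable. Another approach is to read certain files on the Linux system to get the relevant figures. This is probably both simpler and fairly accurate. It is simple because all you do is read the files and do some arithmetic. It is only fairly accurate, rather than accurate, because those files hold aggregate totals and cannot be filtered by rules the way tcpdump can, so the numbers come out somewhat higher than the traffic you actually care about. This post records that method:
Search keywords: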
- http://search.aol.com/aol/search?q=linux+%2Fproc%2Fnet%2Fdev
- http://search.aol.com/aol/search?q=linux+%2Fsys%2Fclass%2Fnet%2Feth0%2Fstatistics%2Frx_packets
Reference keywords:
- /proc/net/dev
- /proc/net/raw -- RAW socket information
- /proc/net/tcp -- TCP socket information
- /proc/net/udp -- UDP socket information
- /sys/class/net/eth0/statistics/rx_bytes
- /sys/class/net/eth0/statistics/rx_packets
- /sys/class/net/eth0/statistics/tx_bytes
- /sys/class/net/eth0/statistics/tx_packets
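To get a feel for what /proc/net/dev holds, here is a minimal sketch that pulls the RX/TX byte, packet and drop counters out of it. The function name `dev_summary` is made up for illustration, and the column positions are an assumption based on the usual header of that file (RX bytes, packets, errs, drop, ... then TX bytes, packets, errs, drop, ...), so check them against the header on your own system.

```shell
#!/bin/bash
# dev_summary is a hypothetical helper, not part of any tool.
# Usage: dev_summary [FILE]   (FILE defaults to /proc/net/dev)
# Prints one line per interface:
#   name rx_bytes rx_packets rx_drop tx_bytes tx_packets tx_drop
dev_summary() {
    awk -F: 'NR > 2 {                    # the first two lines are headers
        name = $1; gsub(/ /, "", name)   # interface name, padding removed
        split($2, f, " ")                # counters after the colon
        # Assumed layout: f[1..8] are RX columns, f[9..16] are TX columns.
        print name, f[1], f[2], f[4], f[9], f[10], f[12]
    }' "${1:-/proc/net/dev}"
}
```

Calling `dev_summary` with no argument reads the live file. Passing a saved copy of /proc/net/dev instead makes it easy to compare two snapshots taken some seconds apart.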
==
There are many traffic monitoring tools available on Linux, which can monitor and classify network traffic and report real-time traffic statistics in fancy user interfaces. Most of these tools (e.g., ntopng, iftop) are powered by libpcap, a packet capture library used to monitor network traffic in user space. Despite their versatility, however, libpcap-based network monitoring tools cannot scale to handle traffic on multi-gigabit network interfaces, due to the overhead associated with user-space packet capture.
In this tutorial, I will present simple shell scripts that can monitor network traffic on a per-interface basis, without relying on the slow libpcap library. These scripts are fast enough to support multi-gigabit rates, but they are only suitable if you are interested in "aggregate" network statistics per interface.
The secret of the scripts lies in the sysfs virtual filesystem, which the kernel uses to export device- and driver-related information to user space. Network interface statistics are exported via /sys/class/net/<ethX>/statistics.
For example, the statistics on eth0 interface are found in these files:
- /sys/class/net/eth0/statistics/rx_packets: number of packets received
- /sys/class/net/eth0/statistics/tx_packets: number of packets transmitted
- /sys/class/net/eth0/statistics/rx_bytes: number of bytes received
- /sys/class/net/eth0/statistics/tx_bytes: number of bytes transmitted
- /sys/class/net/eth0/statistics/rx_dropped: number of packets dropped while received
- /sys/class/net/eth0/statistics/tx_dropped: number of packets dropped while transmitted
The numbers stored in the files are automatically refreshed in real-time by the kernel. Therefore, you can write scripts that calculate traffic statistics based on these files.
The following are two such scripts (thanks to joemiller). The first counts the number of packets per second received (RX) or sent (TX) on an interface, while the second measures the bandwidth of incoming (RX) and outgoing (TX) traffic on an interface.
Reference:
http://xmodulo.com/measure-packets-per-second-throughput-high-speed-network-interface.html
    #!/bin/bash
    # Report packets/s and KB/s, sent and received, on the busiest interface.

    interval=1  # update interval in seconds (must be a whole number)

    # Pick the non-loopback interface with the most received bytes.
    # /proc/net/dev starts with two header lines; each remaining line is
    # "<name>: <rx_bytes> <rx_packets> ...".
    interface_name=$(awk -F: 'NR > 2 {
        name = $1; gsub(/ /, "", name); split($2, f, " ")
        if (name != "lo" && f[1] + 0 >= max) { max = f[1] + 0; best = name }
    } END { print best }' /proc/net/dev)
    stats=/sys/class/net/$interface_name/statistics

    while true
    do
        # First sample of the packet and byte counters.
        R1=$(cat "$stats/rx_packets")
        T1=$(cat "$stats/tx_packets")
        rb1=$(cat "$stats/rx_bytes")
        tb1=$(cat "$stats/tx_bytes")
        sleep "$interval"
        # Second sample, one interval later.
        R2=$(cat "$stats/rx_packets")
        T2=$(cat "$stats/tx_packets")
        rb2=$(cat "$stats/rx_bytes")
        tb2=$(cat "$stats/tx_bytes")
        # Counter deltas divided by the interval give per-second rates.
        TXPPS=$(( (T2 - T1) / interval ))
        RXPPS=$(( (R2 - R1) / interval ))
        TKBPS=$(( (tb2 - tb1) / interval / 1024 ))
        RKBPS=$(( (rb2 - rb1) / interval / 1024 ))
        echo -e "Send $interface_name: $TXPPS pkts/s\tRecv $interface_name: $RXPPS pkts/s"
        echo -e "Send $interface_name: $TKBPS KB/s\tRecv $interface_name: $RKBPS KB/s"
    done
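Since the original question was about packet loss, the same counters can also give a rough drop percentage. The sketch below is only an illustration under stated assumptions: the helper names `drop_percent` and `sample_drop_rate` are made up, and it assumes that rx_packets does not already include the packets counted in rx_dropped, which depends on the driver. It also only reflects drops counted at the interface level, not packets lost inside a capture library's own buffers.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not part of the original scripts.

# drop_percent DROPPED RECEIVED
# Prints dropped / (received + dropped) as a percentage with two decimals.
# Assumption: RECEIVED does not already include DROPPED.
drop_percent() {
    awk -v d="$1" -v r="$2" 'BEGIN {
        total = d + r
        if (total == 0) { print "0.00"; exit }   # no traffic in the interval
        printf "%.2f\n", 100 * d / total
    }'
}

# sample_drop_rate STATS_DIR INTERVAL
# Reads rx_packets and rx_dropped twice, INTERVAL seconds apart,
# and prints the drop percentage over that interval.
sample_drop_rate() {
    local dir=$1 interval=$2
    local p1 d1 p2 d2
    p1=$(cat "$dir/rx_packets"); d1=$(cat "$dir/rx_dropped")
    sleep "$interval"
    p2=$(cat "$dir/rx_packets"); d2=$(cat "$dir/rx_dropped")
    drop_percent "$((d2 - d1))" "$((p2 - p1))"
}
```

For example, `sample_drop_rate /sys/class/net/eth0/statistics 1` would print the percentage for eth0 over one second, assuming an interface with that name exists.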
==
Reference links:
- How to measure packets per second or throughput on high speed network interface – Xmodulo
- http://serverfault.com/questions/533513/how-to-get-tx-rx-bytes-without-ifconfig
- http://stackoverflow.com/questions/349576/linux-retrieve-per-interface-sent-received-packet-counters-ethernet-ipv4-ipv
- http://stackoverflow.com/questions/3521678/linux-what-are-means-of-fields-in-proc-net-dev
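10 comments on "How do you calculate the RX/TX packet rate under high-speed network I/O on Linux?"
Tencent Security Up Close, "Dayan" (大眼): a large-scale network traffic analysis system, software part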
https://security.tencent.com/index.php/blog/msg/40
`
Kernel hook (netfilter hook)
libpcap
libpfring
Dedicated hardware (Tilera/Cavium/…)
DPDK
`
DPDK analysis
http://www.jianshu.com/p/0ff8cb4deaef
`
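Analyzes the traditional server architecture in use today and the problems it may have, which motivates the requirements
Explains how the DPDK development kit gets past operating-system limits
Then analyzes the overall structure of DPDK
Finally extends the discussion to related technologies and scenarios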
`
Common Linux TCP network errors and their analysis
http://blog.51cto.com/welcomeweb/1975357
Common Linux network monitoring items and errors
https://jin-yang.github.io/post/linux-monitor-network.html
`
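Introduction
I originally meant to cover errors and monitoring separately, but the two turned out to be tightly coupled: errors are found through monitoring items, and locating the cause of an error also depends on them, so I merged the two.
Checking for packet loss
The operating system cannot keep up and drops data
The application cannot keep up, so the operating system drops data
Out of memory
Insufficient memory
orphan sockets
Summary
The procfs file system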
`
The netfilter.org “libnetfilter_conntrack” project
https://netfilter.org/projects/libnetfilter_conntrack/
`
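Introduction:
libnetfilter_conntrack is a userspace library that provides a programming interface (API) to the in-kernel connection tracking state table. It was previously known as libnfnetlink_conntrack and libctnetlink. The library is currently used by conntrack-tools, among many other applications.
Dependencies: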
libnetfilter_conntrack requires libnfnetlink and a kernel that includes the nfnetlink_conntrack subsystem (initial support >= 2.6.14, recommended >= 2.6.18).
`
https://stackoverflow.com/questions/tagged/netfilter
https://github.com/vishvananda/netlink/blob/master/conntrack_linux.go
Methods and implementation of per-process network traffic accounting on Linux
https://blog.didiyun.com/index.php/2018/11/07/linux/
[Translation] A Deep Dive into iptables and Netfilter Architecture
https://arthurchiao.github.io/blog/deep-dive-into-iptables-and-netfilter-arch-zh/
https://www.digitalocean.com/community/tutorials/a-deep-dive-into-iptables-and-netfilter-architecture
https://upload.wikimedia.org/wikipedia/commons/3/37/Netfilter-packet-flow.svg
Overview of Apache Spot, an open-source traffic analysis system
https://mp.weixin.qq.com/s/DQdcByiuMNlUMhK7uHAdCA
`
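1. What is Apache Spot?
2. What can Apache Spot do?
3. How does Apache Spot do it?
4. Key features of Apache Spot
5. Apache Spot system overview
1) System architecture
2) Component view
3) Data flow view
4) Service view
6. Apache Spot components
1) Configuration component: Spot-Setup
2) Data ingestion component: Spot-Ingest
3) Machine learning component: Spot-ML
4) Interaction component: Spot-OA
7. Apache Spot environment & deployment
8. Apache Spot data ingestion
1) Proxy data ingestion
2) Flow data ingestion
3) DNS data ingestion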
`
A DPDK technical walkthrough for beginners
https://mp.weixin.qq.com/s/RW0GO8hNxoE7upAeAExWHg
"TCP packet loss analysis" lab walkthrough (1): the proc file system
https://mp.weixin.qq.com/s/3AiZLXjMmgC07SYlRZW_qQ
"TCP packet loss analysis" lab walkthrough (2): kprobe and tracepoint
https://mp.weixin.qq.com/s/bQsComXRa7N6Ojlp2CaesA
"TCP packet loss analysis" lab walkthrough (3): how the driver receives packets
https://mp.weixin.qq.com/s/lO15IRYYq5YLstukh93Eng
Terminal bandwidth utilization tool
https://github.com/imsnif/bandwhich
`
This is a CLI utility for displaying current network utilization by process, connection and remote IP/hostname.
`