=Start=
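Background:
For Nginx log analysis, and whitelisting in particular, it is safer to work at a fine granularity when you do not yet fully understand what a service does. Whitelisting by Content-Type, for example, is finer-grained than whitelisting by domain, and more convenient and accurate than whitelisting by URI. These are brief notes for later reference.
Main text:
Reference answer:
Naming format and types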
A media type consists of a type and a subtype, which is further structured into a tree. A media type can optionally define a suffix and parameters:
mime-type = type "/" [tree "."] subtype ["+" suffix]* [";" parameter];
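The grammar above can be turned into a small parser. The following is a minimal sketch rather than a complete implementation: the names `MEDIA_TYPE_RE` and `parse_media_type` are made up for illustration, only a single `+suffix` is handled, and the tree prefixes are assumed to be `vnd`, `prs` and `x` (the standards tree has no prefix).

```python
import re

# type "/" [tree "."] subtype ["+" suffix]; parameters are split off separately.
MEDIA_TYPE_RE = re.compile(
    r"^(?P<type>[a-z0-9!#$&^_-]+)/"
    r"(?:(?P<tree>vnd|prs|x)\.)?"
    r"(?P<subtype>[a-z0-9!#$&^_.-]+?)"
    r"(?:\+(?P<suffix>[a-z0-9!#$&^_.-]+))?$"
)


def parse_media_type(value):
    """Split a Content-Type value into its parts; return None if it does not match."""
    essence, _, param_str = value.partition(";")
    match = MEDIA_TYPE_RE.match(essence.strip().lower())
    if match is None:
        return None
    params = {}
    for item in param_str.split(";"):
        key, sep, val = item.partition("=")
        if sep:  # skip empty or malformed parameter fragments
            params[key.strip().lower()] = val.strip().strip('"')
    parts = match.groupdict()
    parts["parameters"] = params
    return parts


if __name__ == "__main__":
    print(parse_media_type("application/vnd.apple.installer+xml"))
    print(parse_media_type("Application/JSON; charset=UTF-8"))
```

Type and subtype are lowercased because media type names are case-insensitive; parameter values are left as written.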
Type
The "type" part defines the broad use of the media type. As of November 1996, the registered types were: application, audio, image, message, multipart, text and video. By December 2020, the registered types included the foregoing, plus font, example, and model.
An unofficial top-level type in common use is chemical.
* application
* audio
* image
* message
* multipart
* text
* video
* font
* example
* model
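Common Content-Type values to watch or exclude in day-to-day analysis
Production traffic carries too many types to handle at once. Start with a whitelist (look only at specific Content-Type values), then move to a blacklist (look only at Content-Type values that have not been seen before).
Also, judging by the Content-Type field in the Nginx log is generally more general and more accurate than judging by the file extension in the URI.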
,lower(substring_index(http_content_type,';',1)) as http_content_type
and lower(http_content_type) not rlike 'image|video|audio|font|css|javascript'
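For offline checks outside the SQL engine, the two fragments above can be mirrored in Python. This is only a sketch: the function names are hypothetical, and unlike the SQL it also strips surrounding whitespace, which is an added assumption about how the field is logged.

```python
import re

# Same alternation as the rlike filter above; it matches anywhere in the value.
STATIC_RE = re.compile(r"image|video|audio|font|css|javascript")


def normalize_content_type(raw):
    """Mirror lower(substring_index(http_content_type, ';', 1))."""
    return raw.split(";", 1)[0].strip().lower()


def is_static_resource(raw):
    """True when the normalized value matches the exclusion pattern."""
    return STATIC_RE.search(normalize_content_type(raw)) is not None


if __name__ == "__main__":
    for value in ("application/json;charset=utf-8", "text/css;charset=utf-8"):
        print(value, "->", normalize_content_type(value), is_static_resource(value))
```

Because the pattern is unanchored, `font` also catches legacy values such as `application/font-woff`.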
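# JSON string
application/json;charset=utf-8
application/json
# Arbitrary binary data
application/octet-stream
# TSV, a tab-delimited file format, is very similar to CSV, except that CSV separates fields with commas rather than tabs. Both are delimiter-separated value formats.
text/tab-separated-values;charset=utf-8
# Text
text/plain
text/json # not the IANA-registered JSON type, but browsers still recognize and handle it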
text/html
text/xml
……
# MIME types to exclude
# JS/CSS
text/javascript;charset=utf-8
text/css;charset=utf-8
# Images
image/gif
image/jpeg
image/png
image/webp
# Video
video/mp4
video/mp2t
# Audio
audio/m4a
# Fonts
font/woff2
font/woff
font/ttf
application/font-woff
application/font-woff2
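The two phases described earlier (watch a known set first, then look only at types not seen before) could be sketched as follows. The sets contain only the values listed above and are not exhaustive; `classify` and `summarize` are illustrative names, not part of any existing tool.

```python
from collections import Counter

# Types worth inspecting, taken from the list above (not exhaustive).
WATCHED = {
    "application/json",
    "application/octet-stream",
    "text/tab-separated-values",
    "text/plain",
    "text/json",
    "text/html",
    "text/xml",
}
# Static resources to drop, also taken from the list above.
EXCLUDED_PREFIXES = ("image/", "video/", "audio/", "font/")
EXCLUDED = {
    "text/javascript",
    "text/css",
    "application/font-woff",
    "application/font-woff2",
}


def classify(raw):
    """Return 'watched', 'excluded', or 'new' for a raw Content-Type value."""
    essence = raw.split(";", 1)[0].strip().lower()
    if essence in WATCHED:
        return "watched"
    if essence in EXCLUDED or essence.startswith(EXCLUDED_PREFIXES):
        return "excluded"
    return "new"  # not seen before: review it and add it to one of the sets


def summarize(values):
    """Count how many log values fall into each bucket."""
    return Counter(classify(v) for v in values)


if __name__ == "__main__":
    print(summarize(["application/json", "image/png", "application/pdf"]))
```

In the whitelist phase only the `watched` bucket is analyzed; in the later phase the `new` bucket is the interesting one.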
What vnd means
vnd indicates vendor-specific MIME types: types introduced by companies or other corporate bodies rather than by, for example, an Internet standards consortium.
.docx application/vnd.openxmlformats-officedocument.wordprocessingml.document
.ico image/vnd.microsoft.icon
.mpkg application/vnd.apple.installer+xml #Apple Installer Package
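A quick way to flag vendor-tree types in a log is to check whether the subtype starts with `vnd.`. The sketch below assumes exactly that; `is_vendor_type` and `EXAMPLES` are illustrative names, and the mapping simply repeats the three entries above.

```python
# Extension-to-type pairs copied from the examples above.
EXAMPLES = {
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".ico": "image/vnd.microsoft.icon",
    ".mpkg": "application/vnd.apple.installer+xml",
}


def is_vendor_type(content_type):
    """True when the subtype is in the vendor tree (starts with 'vnd.')."""
    essence = content_type.split(";", 1)[0].strip().lower()
    _, _, subtype = essence.partition("/")
    return subtype.startswith("vnd.")


if __name__ == "__main__":
    for ext, mime in EXAMPLES.items():
        print(ext, mime, is_vendor_type(mime))
```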
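Binary data
RFC 2046 defines application/octet-stream as "arbitrary binary data". This overlaps with entities whose sole purpose is to be saved to disk, after which they are no longer web content at all. Put another way, the only safe thing to do with application/octet-stream is to save it to a file and hope someone else knows what it is for.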
MIME types for JSON and JSONP
For JSON text:
application/json
For JSONP (runnable JavaScript) with callback:
application/javascript
==
IANA has registered the official MIME Type for JSON as application/json.
When asked about why not text/json, Crockford seems to have said JSON is not really JavaScript nor text and also IANA was more likely to hand out application/* than text/*.
A lot of stuff got put into the text/* section in the early days that would probably be put into the application/* section these days.
The most widely supported non-standard media types are text/json or text/javascript. But some big names even use text/plain.
==
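For log analysis this means a response may carry JSON even when its Content-Type is not application/json. A minimal sketch of that check, using only the types named in the quoted answer; the function names and the three-way result are illustrative assumptions:

```python
OFFICIAL_JSON = "application/json"
# Non-standard types that the quoted answer says are used for JSON in practice.
NONSTANDARD_JSON = {"text/json", "text/javascript", "text/plain"}


def json_status(raw):
    """Return 'official', 'possible', or 'no' for a raw Content-Type value."""
    essence = raw.split(";", 1)[0].strip().lower()
    if essence == OFFICIAL_JSON:
        return "official"
    if essence in NONSTANDARD_JSON:
        return "possible"  # the body has to be inspected to be sure
    return "no"


def response_type(is_jsonp):
    """Pick the type to send: JSONP with a callback is runnable JavaScript."""
    return "application/javascript" if is_jsonp else OFFICIAL_JSON


if __name__ == "__main__":
    print(json_status("application/json;charset=utf-8"))
    print(json_status("text/plain"))
    print(response_type(True))
```

A `possible` result only says the type is sometimes used for JSON; confirming it requires looking at the response body.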
Reference links:
Common MIME types
https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types
MIME types (IANA media types)
https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types
Media Types
https://www.iana.org/assignments/media-types/media-types.xhtml
What is the meaning of “vnd” in MIME types?
https://stackoverflow.com/questions/5351093/what-is-the-meaning-of-vnd-in-mime-types
Do I need Content-Type: application/octet-stream for file download?
https://stackoverflow.com/questions/20508788/do-i-need-content-type-application-octet-stream-for-file-download
Which JSON content type do I use?
https://stackoverflow.com/questions/477816/which-json-content-type-do-i-use
Media type
https://en.wikipedia.org/wiki/Media_type
=END=
1 comment on "Nginx log analysis - MIME types"
`
and uri not rlike 'css|js|ico|svg|woff2|png|woff$'
`