Nginx Log Analysis: MIME Types


=Start=

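Background:

When analyzing Nginx logs, and especially when allowlisting, it is better to work at a fine granularity if you do not fully understand what a given feature does or is for. For example, allowlisting by Content-Type first is finer-grained than allowlisting by domain, and more convenient and accurate than allowlisting by URI. These are brief notes for future reference.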

Main text:

Reference notes:

Naming format and types

A media type consists of a type and a subtype, which is further structured into a tree. A media type can optionally define a suffix and parameters:

mime-type = type "/" [tree "."] subtype ["+" suffix]* [";" parameter];
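
The grammar above can be turned into a small parser. The following is a minimal sketch, not a full RFC-compliant implementation: the name `parse_mime` is hypothetical, only the `vnd`, `prs`, and `x` trees are recognized, and at most one `+suffix` is handled (values with several suffixes are rejected and return `None`).

```python
import re

# Rough regex for: type "/" [tree "."] subtype ["+" suffix] [";" parameter]
# Assumption: only the vnd, prs and x trees are recognized, and at most one suffix.
MIME_RE = re.compile(
    r"^(?P<type>[a-z0-9!#$&^_-]+)/"
    r"(?:(?P<tree>vnd|prs|x)\.)?"
    r"(?P<subtype>[a-z0-9!#$&^_.-]+?)"
    r"(?:\+(?P<suffix>[a-z0-9!#$&^_.-]+))?"
    r"\s*(?:;\s*(?P<params>.*))?$"
)


def parse_mime(value):
    """Split a Content-Type value into its parts; return None if it does not match."""
    m = MIME_RE.match(value.strip().lower())
    return m.groupdict() if m else None


if __name__ == "__main__":
    print(parse_mime("application/vnd.apple.installer+xml"))
    print(parse_mime("application/json;charset=utf-8"))
```

Parts that are absent (for example the tree in `application/json`) come back as `None` in the returned dictionary.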

Type
The "type" part defines the broad use of the media type. As of November 1996, the registered types were: application, audio, image, message, multipart, text and video. By December 2020, the registered types included the foregoing, plus font, example, and model.
An unofficial top-level type in common use is chemical.

* application
* audio
* image
* message
* multipart
* text
* video
* font
* example
* model
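
When scanning logs, one cheap sanity check is whether the top-level type of a Content-Type is in the list above. A minimal sketch, assuming the set below is just a copy of that list (it is not fetched from IANA) and that the helper names are hypothetical:

```python
# Top-level types copied from the list above (as of December 2020).
REGISTERED_TOP_LEVEL = {
    "application", "audio", "image", "message", "multipart",
    "text", "video", "font", "example", "model",
}


def top_level_type(content_type):
    """Return the part before the first '/', lowercased."""
    return content_type.split("/", 1)[0].strip().lower()


def is_registered_top_level(content_type):
    """True if the top-level type is in the copied list."""
    return top_level_type(content_type) in REGISTERED_TOP_LEVEL


if __name__ == "__main__":
    for value in ["image/png", "chemical/x-pdb", "Text/HTML; charset=utf-8"]:
        print(value, is_registered_top_level(value))
```

Values that fail this check, such as the unofficial `chemical` type, are not necessarily wrong, but they are worth a manual look.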

Common Content-Types to watch for and to exclude in day-to-day analysis

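Far too many types are in use in production. Start with an allowlist approach (look only at specific Content-Types); later you can switch to a denylist approach (exclude the known types and look only at new Content-Types).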
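
The two phases can be sketched as follows. This is an illustration only: the record layout (a dict with a `content_type` key holding an already-normalized value) and the function names are assumptions, not part of any real log schema.

```python
def allowlist_filter(records, watched):
    """Phase 1: keep only records whose Content-Type is explicitly watched."""
    return [r for r in records if r["content_type"] in watched]


def new_types(records, known):
    """Phase 2: report Content-Types that are not in the known set yet."""
    seen = {r["content_type"] for r in records}
    return sorted(seen - known)


if __name__ == "__main__":
    # Hypothetical, already-normalized log records.
    logs = [
        {"uri": "/api/a", "content_type": "application/json"},
        {"uri": "/img/b", "content_type": "image/png"},
        {"uri": "/dl/c", "content_type": "application/x-custom"},
    ]
    print(allowlist_filter(logs, {"application/json"}))
    print(new_types(logs, {"application/json", "image/png"}))
```

In the second phase, the known set grows over time as each newly reported type is reviewed and classified.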

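In addition, judging by the Content-Type field in the Nginx log is generally more broadly applicable and more accurate than judging by the file extension in the URI.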

,lower(substring_index(http_content_type,';',1)) as http_content_type

and lower(http_content_type) not rlike 'image|video|audio|font|css|javascript'
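
For offline checks outside the SQL engine, the same two steps (drop the parameters after the first `;`, then exclude by an unanchored pattern) can be mirrored in Python. A minimal sketch, with hypothetical function names; unlike the SQL fragment it also strips surrounding whitespace:

```python
import re

# Same alternatives as the rlike pattern above; search() matches anywhere in the string.
EXCLUDE_RE = re.compile(r"image|video|audio|font|css|javascript")


def normalize(raw):
    """Keep the part before the first ';', trimmed and lowercased."""
    return raw.split(";", 1)[0].strip().lower()


def keep(raw):
    """True if the Content-Type is NOT one of the excluded static-asset types."""
    return EXCLUDE_RE.search(normalize(raw)) is None


if __name__ == "__main__":
    for value in ["application/json; charset=utf-8", "Text/JavaScript;charset=utf-8"]:
        print(normalize(value), keep(value))
```

Because the pattern is unanchored, `font` also excludes `application/font-woff`, and `javascript` also excludes `application/javascript`, which is the type used for JSONP responses.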


# JSON strings
application/json;charset=utf-8
application/json

# Arbitrary binary data
application/octet-stream

# TSV (tab-separated values) is very similar to CSV, except that CSV fields are separated by commas rather than tabs. Both are delimiter-separated value formats.
text/tab-separated-values;charset=utf-8

# Text
text/plain
text/json # not the IANA-registered type for JSON, but browsers can still recognize and handle it
text/html
text/xml
……

# Some MIME types to exclude

# js/css
text/javascript;charset=utf-8
text/css;charset=utf-8

# Images
image/gif
image/jpeg
image/png
image/webp

# Video
video/mp4
video/mp2t

# Audio
audio/m4a

# Fonts
font/woff2
font/woff
font/ttf
application/font-woff
application/font-woff2

What does vnd mean?

vnd indicates vendor-specific MIME types, which means they are MIME types that were introduced by corporate bodies rather than e.g. an Internet consortium.

.docx    application/vnd.openxmlformats-officedocument.wordprocessingml.document
.ico     image/vnd.microsoft.icon
.mpkg    application/vnd.apple.installer+xml #Apple Installer Package
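
To pick vendor-specific types out of a log, it is usually enough to check whether the subtype starts with `vnd.`. A minimal sketch; the function name is hypothetical, and types in other trees (such as `prs.` or `x.`) are deliberately not treated as vendor types here.

```python
def is_vendor_type(content_type):
    """True if the subtype is in the vendor tree, i.e. starts with 'vnd.'."""
    essence = content_type.split(";", 1)[0].strip().lower()
    _, _, subtype = essence.partition("/")
    return subtype.startswith("vnd.")


if __name__ == "__main__":
    samples = [
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        "image/vnd.microsoft.icon",
        "application/vnd.apple.installer+xml",
        "application/json",
    ]
    for value in samples:
        print(value, is_vendor_type(value))
```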

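Binary data

RFC 2046 defines application/octet-stream as "arbitrary binary data". It overlaps with entities whose sole purpose is to be saved to disk and which, from then on, are no longer part of anything "web". Put another way, the only safe thing to do with application/octet-stream is to save it to a file and hope someone else knows what it is for.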

MIME types for JSON and JSONP

For JSON text:
application/json

For JSONP (runnable JavaScript) with callback:
application/javascript
==
IANA has registered the official MIME Type for JSON as application/json.

When asked about why not text/json, Crockford seems to have said JSON is not really JavaScript nor text and also IANA was more likely to hand out application/* than text/*.

A lot of stuff got put into the text/* section in the early days that would probably be put into the application/* section these days.

The most widely supported non-standard media types are text/json or text/javascript. But some big names even use text/plain.
==
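
Given the mix of standard and non-standard types described above, a log classifier may want to treat several values as "probably JSON". A minimal sketch under stated assumptions: the function names are hypothetical, and treating any `+json` suffix as JSON is this sketch's own choice, based on the suffix part of the naming grammar.

```python
JSON_TYPES = {"application/json", "text/json"}
JSONP_TYPES = {"application/javascript", "text/javascript"}


def essence(content_type):
    """Content-Type without parameters, trimmed and lowercased."""
    return content_type.split(";", 1)[0].strip().lower()


def is_json_like(content_type):
    """True for application/json, text/json, or any type with a +json suffix."""
    e = essence(content_type)
    return e in JSON_TYPES or e.endswith("+json")


def may_be_jsonp(content_type):
    """True for JavaScript types, which may carry JSONP (or plain scripts)."""
    return essence(content_type) in JSONP_TYPES


if __name__ == "__main__":
    for value in ["application/json;charset=utf-8", "text/json", "text/javascript"]:
        print(value, is_json_like(value), may_be_jsonp(value))
```

Responses labelled text/javascript or text/plain stay ambiguous: the header alone cannot tell JSON apart from a script or ordinary text, so those cases need a look at the response body.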

Reference links:

Common MIME types
https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types

MIME types (IANA media types)
https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types

Media Types
https://www.iana.org/assignments/media-types/media-types.xhtml

What is the meaning of “vnd” in MIME types?
https://stackoverflow.com/questions/5351093/what-is-the-meaning-of-vnd-in-mime-types

Do I need Content-Type: application/octet-stream for file download?
https://stackoverflow.com/questions/20508788/do-i-need-content-type-application-octet-stream-for-file-download

Which JSON content type do I use?
https://stackoverflow.com/questions/477816/which-json-content-type-do-i-use

Media type
https://en.wikipedia.org/wiki/Media_type

=END=

