Nginx Log Analysis: MIME types


=Start=

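Background:

When analyzing Nginx logs, and especially when building allowlists, it is better to work at a fine granularity if you are not yet sure what a given service does. Allowlisting by Content-Type first is finer-grained than allowlisting by domain, and more convenient and accurate than allowlisting by URI. These are brief notes for later reference.

Main text:

Reference answer:

Naming format and types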

A media type consists of a type and a subtype, which is further structured into a tree. A media type can optionally define a suffix and parameters:

mime-type = type "/" [tree "."] subtype ["+" suffix]* [";" parameter];
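As a rough illustration of the grammar above, the following Python sketch splits a Content-Type value into its parts. The names `MediaType`, `parse_media_type`, and `KNOWN_TREES` are hypothetical, and the sketch makes two simplifying assumptions: only the `vnd`, `prs`, and `x` prefixes are treated as trees, and only the last `+suffix` is extracted.

```python
from typing import Dict, NamedTuple, Optional

# Registration-tree prefixes recognised by this sketch (an assumption;
# any other leading facet is treated as part of the subtype).
KNOWN_TREES = {"vnd", "prs", "x"}


class MediaType(NamedTuple):
    type: str
    tree: Optional[str]
    subtype: str
    suffix: Optional[str]
    params: Dict[str, str]


def parse_media_type(value: str) -> MediaType:
    """Split a Content-Type value into type, tree, subtype, suffix, params."""
    essence, *raw_params = [part.strip() for part in value.split(";")]
    top, slash, rest = essence.lower().partition("/")
    if not (top and slash and rest):
        raise ValueError(f"not a media type: {value!r}")

    # Suffix: the text after the last "+", e.g. "xml" in "installer+xml".
    suffix = None
    if "+" in rest:
        rest, _, suffix = rest.rpartition("+")

    # Tree: a known prefix before the first ".", e.g. "vnd" in "vnd.ms-excel".
    tree = None
    head, dot, tail = rest.partition(".")
    if dot and head in KNOWN_TREES:
        tree, rest = head, tail

    # Parameters: "key=value" pairs after the first ";".
    params = {}
    for item in raw_params:
        key, eq, val = item.partition("=")
        if eq:
            params[key.strip().lower()] = val.strip().strip('"')
    return MediaType(top, tree, rest, suffix, params)
```

For example, `parse_media_type("application/vnd.apple.installer+xml")` yields type `application`, tree `vnd`, subtype `apple.installer`, and suffix `xml`.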

Type
The "type" part defines the broad use of the media type. As of November 1996, the registered types were: application, audio, image, message, multipart, text and video. By December 2020, the registered types included the foregoing, plus font, example, and model.
An unofficial top-level type in common use is chemical.

* application
* audio
* image
* message
* multipart
* text
* video
* font
* example
* model

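Common Content-Types to watch or exclude in day-to-day analysis

Production traffic carries too many types to enumerate up front. Start with an allowlist (look only at specific Content-Types); later, switch to a blocklist approach (look only at Content-Types that have not been seen before).

Also, judging by the Content-Type field in the nginx log is generally more broadly applicable and more accurate than judging by the file extension in the URI.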

,lower(substring_index(http_content_type,';',1)) as http_content_type

and lower(http_content_type) not rlike 'image|video|audio|font|css|javascript'
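The two SQL fragments above can be mirrored outside the query engine, for instance when spot-checking raw log lines. The Python sketch below is one possible equivalent, assuming the Content-Type value has already been extracted from the log line; `normalize_content_type` and `is_static_asset` are hypothetical names. Unlike the SQL, it also trims surrounding whitespace.

```python
import re

# Same alternation as the rlike pattern above; search() matches anywhere
# in the string, like rlike does.
STATIC_PATTERN = re.compile(r"image|video|audio|font|css|javascript")


def normalize_content_type(raw: str) -> str:
    """Mirror lower(substring_index(http_content_type, ';', 1))."""
    return raw.split(";", 1)[0].strip().lower()


def is_static_asset(raw: str) -> bool:
    """True when the value would be dropped by the `not rlike` filter."""
    return STATIC_PATTERN.search(normalize_content_type(raw)) is not None


if __name__ == "__main__":
    samples = [
        "application/json;charset=utf-8",
        "text/css;charset=utf-8",
        "image/webp",
        "text/plain",
    ]
    # Keep only the values worth analysing.
    print([normalize_content_type(s) for s in samples if not is_static_asset(s)])
```

Because the pattern is a substring match, it also drops types such as application/font-woff and application/javascript, which may or may not be what you want for JSONP endpoints.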


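# JSON strings
application/json;charset=utf-8
application/json

# Arbitrary binary data
application/octet-stream

# TSV (tab-separated values) is very similar to CSV, except that CSV separates fields with commas rather than tabs. Both are delimiter-separated value formats.
text/tab-separated-values;charset=utf-8

# Text
text/plain
text/json # not the IANA-registered type for JSON, but browsers still recognize and support it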
text/html
text/xml
……

# MIME types to exclude

# JS/CSS
text/javascript;charset=utf-8
text/css;charset=utf-8

# Images
image/gif
image/jpeg
image/png
image/webp

# Video
video/mp4
video/mp2t

# Audio
audio/m4a

# Fonts
font/woff2
font/woff
font/ttf
application/font-woff
application/font-woff2
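The two lists above map naturally onto the two phases described earlier: analyze only the listed types at first, then later look only at types that appear in neither list. The following Python sketch shows one way to express that triage. The sets contain just the types listed in this post, and the names `KEEP`, `SKIP`, and `triage` are hypothetical.

```python
# Types worth analysing (from the first list above, parameters stripped).
KEEP = frozenset({
    "application/json",
    "application/octet-stream",
    "text/tab-separated-values",
    "text/plain",
    "text/json",
    "text/html",
    "text/xml",
})

# Static assets to exclude (from the second list above).
SKIP = frozenset({
    "text/javascript", "text/css",
    "image/gif", "image/jpeg", "image/png", "image/webp",
    "video/mp4", "video/mp2t",
    "audio/m4a",
    "font/woff2", "font/woff", "font/ttf",
    "application/font-woff", "application/font-woff2",
})


def triage(raw: str) -> str:
    """Return 'analyze', 'skip', or 'new' for a raw Content-Type value."""
    content_type = raw.split(";", 1)[0].strip().lower()
    if content_type in KEEP:
        return "analyze"
    if content_type in SKIP:
        return "skip"
    # Seen in neither list: this is what the later phase should surface.
    return "new"
```

In the first phase only the `analyze` bucket is inspected; in the later phase the `new` bucket is the interesting one, and each newly seen type is then moved into one of the two sets.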

What vnd means

vnd indicates a vendor-specific MIME type, that is, one introduced by a corporate body rather than by, for example, an Internet consortium.

.docx    application/vnd.openxmlformats-officedocument.wordprocessingml.document
.ico     image/vnd.microsoft.icon
.mpkg    application/vnd.apple.installer+xml #Apple Installer Package
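When scanning logs it can help to flag vendor-tree types separately, since they often indicate document or installer downloads. The sketch below is a minimal Python helper under that assumption. `registration_tree` and `TREE_NAMES` are hypothetical names, and the `prs.` (personal) and `x.` (unregistered) prefixes are included alongside `vnd.` because RFC 6838 defines them as the other prefixed registration trees.

```python
from typing import Optional

# Prefixed registration trees defined in RFC 6838.
TREE_NAMES = {"vnd": "vendor", "prs": "personal", "x": "unregistered"}


def registration_tree(content_type: str) -> Optional[str]:
    """Return the tree name for a prefixed subtype, or None if unprefixed."""
    essence = content_type.split(";", 1)[0].strip().lower()
    _, _, subtype = essence.partition("/")
    prefix, dot, _ = subtype.partition(".")
    if dot and prefix in TREE_NAMES:
        return TREE_NAMES[prefix]
    return None


if __name__ == "__main__":
    for value in ("image/vnd.microsoft.icon", "application/json"):
        print(value, "->", registration_tree(value))
```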

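Binary data

RFC 2046 defines application/octet-stream as "arbitrary binary data". It overlaps somewhat with entities whose sole purpose is to be saved to disk, which are therefore not really "web" content. Put another way, the only safe thing to do with application/octet-stream is to save it to a file and hope someone knows what it is for.

MIME types for JSON and JSONP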

For JSON text:
application/json

For JSONP (runnable JavaScript) with callback:
application/javascript
==
IANA has registered the official MIME Type for JSON as application/json.

When asked about why not text/json, Crockford seems to have said JSON is not really JavaScript nor text and also IANA was more likely to hand out application/* than text/*.

A lot of stuff got put into the text/* section in the early days that would probably be put into the application/* section these days.

The most widely supported non-standard media types are text/json or text/javascript. But some big names even use text/plain.
==
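Since JSON shows up under several labels in practice, a log filter probably should not rely on application/json alone. The Python sketch below groups the variants mentioned above. `json_kind` is a hypothetical name, and treating any `+json` structured suffix (for example application/problem+json) as JSON is an assumption of this sketch rather than something stated in the quoted answer.

```python
# Registered and commonly seen non-standard JSON labels.
JSON_TYPES = frozenset({"application/json", "text/json"})
# JavaScript labels, under which JSONP responses are typically served.
SCRIPT_TYPES = frozenset({"application/javascript", "text/javascript"})


def json_kind(content_type: str) -> str:
    """Classify a Content-Type as 'json', 'script' (possible JSONP), or 'other'."""
    essence = content_type.split(";", 1)[0].strip().lower()
    if essence in JSON_TYPES or essence.endswith("+json"):
        return "json"
    if essence in SCRIPT_TYPES:
        return "script"
    # text/plain lands here: it may carry JSON, but the header cannot tell.
    return "other"
```

A `script` result cannot distinguish a JSONP response from an ordinary static JavaScript file, so those rows likely need a second signal such as a callback parameter in the URI.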

Reference links:

Common MIME types
https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types

MIME types (IANA media types)
https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types

Media Types
https://www.iana.org/assignments/media-types/media-types.xhtml

What is the meaning of “vnd” in MIME types?
https://stackoverflow.com/questions/5351093/what-is-the-meaning-of-vnd-in-mime-types

Do I need Content-Type: application/octet-stream for file download?
https://stackoverflow.com/questions/20508788/do-i-need-content-type-application-octet-stream-for-file-download

Which JSON content type do I use?
https://stackoverflow.com/questions/477816/which-json-content-type-do-i-use

Media type
https://en.wikipedia.org/wiki/Media_type

=END=


"Nginx Log Analysis: MIME types" has 4 comments

  1. default_type application/octet-stream;

    nginx pitfalls: mime.types
    https://juejin.cn/post/7259582611181240377
    `
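    About application/octet-stream
    This is the default for application files. It means an unknown application file, which browsers generally will not execute automatically or offer to execute. Browsers treat such files as if the HTTP header Content-Disposition were set to attachment, that is, they trigger a download.

    application/octet-stream is a generic MIME type for binary data. In a MIME type, application denotes an application or binary data type, and octet-stream denotes a stream of bytes, that is, unknown binary data.

    When the server or system cannot identify a file's actual content type, it usually sets the MIME type to application/octet-stream. This tells the receiving client ==not to try to interpret== the content, and instead to save it to the local file system or let the user choose how to open it. Because no specific content type is given, the client cannot open or display the file correctly on its own; the user has to pick a suitable application.

    application/octet-stream is commonly used to transfer all kinds of binary files: files in unknown formats, archives (such as ZIP and RAR), executables, font files, and other files that cannot be clearly assigned a specific content type. Such files may contain any kind of data, so the server or sender has no reliable way to label the content type.

    Note that when the server does know the file's content type, it should use the specific MIME type rather than application/octet-stream. This lets the client handle and display the content correctly and gives a better user experience. In some cases, such as transferring binary data of unknown type or letting users download arbitrary files, application/octet-stream is a reasonable choice.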
    `

  2. Common MIME types
    https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types
    `
    Common MIME types (only the types commonly seen when downloading office documents are listed here; archives such as zip/rar are not included)

    .doc Microsoft Word application/msword
    .docx Microsoft Word (OpenXML) application/vnd.openxmlformats-officedocument.wordprocessingml.document

    .xls Microsoft Excel application/vnd.ms-excel
    .xlsx Microsoft Excel (OpenXML) application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

    .ppt Microsoft PowerPoint application/vnd.ms-powerpoint
    .pptx Microsoft PowerPoint (OpenXML) application/vnd.openxmlformats-officedocument.presentationml.presentation

    .csv Comma-separated values (CSV) text/csv

    .pdf Adobe Portable Document Format (PDF) application/pdf

    application/octet-stream
    `

  3. What is a correct MIME type for .docx, .pptx, etc.?
    https://stackoverflow.com/questions/4212861/what-is-a-correct-mime-type-for-docx-pptx-etc
    `
    Here are the correct Microsoft Office MIME types for HTTP content streaming:

    Extension MIME Type
    .doc application/msword
    .dot application/msword

    .docx application/vnd.openxmlformats-officedocument.wordprocessingml.document
    .dotx application/vnd.openxmlformats-officedocument.wordprocessingml.template
    .docm application/vnd.ms-word.document.macroEnabled.12
    .dotm application/vnd.ms-word.template.macroEnabled.12

    .xls application/vnd.ms-excel
    .xlt application/vnd.ms-excel
    .xla application/vnd.ms-excel

    .xlsx application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
    .xltx application/vnd.openxmlformats-officedocument.spreadsheetml.template
    .xlsm application/vnd.ms-excel.sheet.macroEnabled.12
    .xltm application/vnd.ms-excel.template.macroEnabled.12
    .xlam application/vnd.ms-excel.addin.macroEnabled.12
    .xlsb application/vnd.ms-excel.sheet.binary.macroEnabled.12

    .ppt application/vnd.ms-powerpoint
    .pot application/vnd.ms-powerpoint
    .pps application/vnd.ms-powerpoint
    .ppa application/vnd.ms-powerpoint

    .pptx application/vnd.openxmlformats-officedocument.presentationml.presentation
    .potx application/vnd.openxmlformats-officedocument.presentationml.template
    .ppsx application/vnd.openxmlformats-officedocument.presentationml.slideshow
    .ppam application/vnd.ms-powerpoint.addin.macroEnabled.12
    .pptm application/vnd.ms-powerpoint.presentation.macroEnabled.12
    .potm application/vnd.ms-powerpoint.template.macroEnabled.12
    .ppsm application/vnd.ms-powerpoint.slideshow.macroEnabled.12

    .mdb application/vnd.ms-access
    `
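The Office tables in the comments above suggest a simple way to spot document downloads in the logs. The Python sketch below is one hedged take on that. `is_office_document`, `OFFICE_EXACT`, and `OFFICE_PREFIXES` are hypothetical names, and the prefixes are spelled out per product rather than as a bare `application/vnd.ms-` so that unrelated vendor types such as application/vnd.ms-fontobject are not caught.

```python
# Exact types from the tables above that have no shared prefix.
OFFICE_EXACT = frozenset({"application/msword", "application/pdf", "text/csv"})

# Prefixes covering the OpenXML and macro-enabled variants listed above.
OFFICE_PREFIXES = (
    "application/vnd.openxmlformats-officedocument.",
    "application/vnd.ms-word",
    "application/vnd.ms-excel",
    "application/vnd.ms-powerpoint",
    "application/vnd.ms-access",
)


def is_office_document(content_type: str) -> bool:
    """True when the Content-Type matches one of the office types listed above."""
    essence = content_type.split(";", 1)[0].strip().lower()
    return essence in OFFICE_EXACT or essence.startswith(OFFICE_PREFIXES)
```

Responses served as application/octet-stream are not detected by this check, so a download endpoint that uses the default type would still need to be identified some other way.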
