net/http: wrong detected MIME type with UTF-8 BOM #20912

Closed
unixpickle opened this issue Jul 5, 2017 · 1 comment

unixpickle commented Jul 5, 2017

What version of Go are you using (go version)?

go version go1.8.3 darwin/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/alex/Documents/Code/Go"
GORACE=""
GOROOT="/usr/local/Cellar/go/1.8.3/libexec"
GOTOOLDIR="/usr/local/Cellar/go/1.8.3/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/23/qy5hclf52mdfnx7xgn1ddk6r0000gn/T/go-build891556908=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"

What did you do?

Run this code, which is on the playground:

package main

import (
	"fmt"
	"net/http"
)

func main() {
	document := "<!DOCTYPE html><html></html>"
	bom := "\xef\xbb\xbf"
	fmt.Println(http.DetectContentType([]byte(document)))
	fmt.Println(http.DetectContentType([]byte(bom+document)))
}

What did you expect to see?

text/html; charset=utf-8
text/html; charset=utf-8

What did you see instead?

text/html; charset=utf-8
text/plain; charset=utf-8

It seems from sniff.go that a BOM automatically triggers a text/plain MIME type. Ideally, htmlSig would detect UTF-8 BOMs and skip past them.
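Until then, a caller-side workaround can be sketched as follows. This assumes it is acceptable to drop a leading UTF-8 BOM before sniffing; detectIgnoringBOM and utf8BOM are hypothetical names for illustration, not part of net/http.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// utf8BOM is the three-byte UTF-8 encoding of U+FEFF.
var utf8BOM = []byte("\xef\xbb\xbf")

// detectIgnoringBOM is a hypothetical helper (not part of net/http).
// It strips one leading UTF-8 BOM, if present, and then delegates to
// http.DetectContentType on the remaining bytes.
func detectIgnoringBOM(data []byte) string {
	return http.DetectContentType(bytes.TrimPrefix(data, utf8BOM))
}

func main() {
	document := "<!DOCTYPE html><html></html>"
	bom := "\xef\xbb\xbf"
	fmt.Println(detectIgnoringBOM([]byte(document)))     // text/html; charset=utf-8
	fmt.Println(detectIgnoringBOM([]byte(bom + document))) // text/html; charset=utf-8
}
```

This only changes what the caller passes in; http.DetectContentType itself still reports text/plain for input that begins with a BOM.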

bradfitz (Contributor) commented Jul 5, 2017

  1. UTF-8 BOMs are unnecessary and often cause pain for little to no benefit. You should avoid them.

  2. http.DetectContentType implements https://mimesniff.spec.whatwg.org/ which does not seem to suggest that any textual content type can have a UTF-8 BOM in front of it.

So it looks like this is working as intended.

Let me know if I misread the mimesniff spec, though.

bradfitz closed this as completed on Jul 5, 2017
golang locked and limited conversation to collaborators on Jul 5, 2018