Doctype & Charset Checker: Verify HTML5 and UTF-8 on Any Page

RankNibbler verifies the doctype declaration and character encoding on any URL. Catch missing doctypes that force quirks-mode rendering, legacy XHTML or HTML4 declarations, and encoding mismatches that corrupt special characters. Free, instant, no signup.


What Is a Doctype?

The doctype declaration is the first line of any HTML document: <!DOCTYPE html>. It tells the browser which rendering mode to apply. The HTML5 form, <!DOCTYPE html>, carries no version information and exists only to trigger standards mode; older standards (HTML 4.01, XHTML 1.0) used long declarations that referenced a DTD.

Without a doctype, browsers fall back to "quirks mode", a legacy rendering mode that emulates the non-standard behaviour of 1990s browsers. In quirks mode, sizing and layout rules can differ from what the CSS specifications define, so a page may not render the way your stylesheet intends and can look different from one browser to the next. Every modern webpage should start with <!DOCTYPE html>.

Doctype Declarations Compared

Doctype | Status | Declaration
HTML5 | Current: use this | <!DOCTYPE html>
HTML 4.01 Strict | Legacy | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"...>
HTML 4.01 Transitional | Legacy | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"...>
XHTML 1.0 Strict | Legacy | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"...>
XHTML 1.0 Transitional | Legacy | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"...>
No doctype | Quirks mode: fix immediately | (missing)

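If your site uses a legacy doctype rather than <!DOCTYPE html>, the page will still render, but depending on the exact declaration the browser may choose standards, almost-standards, or quirks mode. Switching to <!DOCTYPE html> is a one-line change that reliably selects standards mode.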
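
The comparison above can be turned into a rough automated check. The following is a minimal sketch, not the checker's actual implementation: the function name classify_doctype is hypothetical, and it assumes the doctype is the first thing in the markup (apart from whitespace or a byte-order mark).

```python
import re

# Matches a doctype at the very start of the markup, allowing a BOM and whitespace.
_DOCTYPE_RE = re.compile(r"^\ufeff?\s*<!doctype\s+([^>]*)>", re.IGNORECASE)


def classify_doctype(html: str) -> str:
    """Return 'html5', 'legacy', or 'missing' for the given markup."""
    match = _DOCTYPE_RE.match(html)
    if match is None:
        return "missing"
    body = match.group(1).strip().lower()
    # The HTML5 doctype is just the word "html" with no public identifier.
    # Anything longer (HTML 4.01, XHTML 1.0, ...) is reported as legacy.
    return "html5" if body == "html" else "legacy"
```

For example, classify_doctype("<html><head></head></html>") returns "missing", which corresponds to the quirks-mode row of the table.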

What Is Character Encoding?

Character encoding tells the browser which character set the page uses — that is, how bytes map to letters and symbols. UTF-8 is the universal modern standard because it supports every character in every language, from ASCII to Chinese to emoji, in a variable-width format that is backwards compatible with ASCII.

The declaration lives in the <head>:

<meta charset="UTF-8">

It should be the first thing after the opening <head> tag, before any other meta tag or content. Why first? Because the browser has to know the encoding before it can parse any other text. The HTML standard requires the declaration to fall entirely within the first 1024 bytes of the document; if it comes later, the browser may already have committed to a different encoding and misread the text.
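
To check placement programmatically, one option is to look for the declaration in the raw bytes and confirm it ends within the first 1024 bytes, the limit set by the HTML standard. This is a simplified sketch: the name early_meta_charset is hypothetical, and the pattern only recognises the short <meta charset> form, not the older http-equiv variant.

```python
import re
from typing import Optional

# Simplified pattern for <meta charset="..."> with optional quotes.
_META_CHARSET_RE = re.compile(rb"""<meta\s+charset\s*=\s*["']?([\w-]+)""", re.IGNORECASE)

PRESCAN_LIMIT = 1024  # bytes within which the declaration is expected to appear


def early_meta_charset(raw: bytes) -> Optional[str]:
    """Return the declared charset if it ends within the first 1024 bytes, else None."""
    match = _META_CHARSET_RE.search(raw)
    if match is None or match.end() > PRESCAN_LIMIT:
        return None
    return match.group(1).decode("ascii").lower()
```

A return value of None means the declaration is either missing or placed too late to be reliable.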

Common Character Encoding Problems

Missing Charset Declaration

Without <meta charset>, browsers rely on the charset in the HTTP Content-Type header if the server sends one; if it does not, they guess from content heuristics. Guesses are usually correct for English content but unreliable for international characters. Always declare explicitly.

Using a Legacy Encoding

ISO-8859-1, Windows-1252, Shift_JIS, and other legacy encodings still exist in old content. They work for the languages they were designed for but break whenever a character outside their repertoire appears. Migrate by re-saving the files as UTF-8 and updating the declaration to match.
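
If there are too many files to re-save by hand, the conversion can be scripted. The sketch below uses a hypothetical helper, convert_to_utf8, and assumes you already know the encoding each file was actually saved in; it does not update the <meta charset> declaration for you.

```python
from pathlib import Path


def convert_to_utf8(path: Path, source_encoding: str) -> None:
    """Rewrite a file in place as UTF-8 (without BOM)."""
    # Decoding can raise UnicodeDecodeError when the bytes are invalid for
    # source_encoding. Single-byte encodings such as ISO-8859-1 accept any
    # byte, so a wrong guess there passes silently: spot-check the result.
    text = path.read_bytes().decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
```

Work on a copy or under version control, since the file is overwritten.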

Encoding Mismatch

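The HTTP header says one encoding, the meta tag says another, and the file was actually saved as a third. Browsers resolve the conflict by precedence (a byte-order mark first, then the HTTP header, then the meta tag), and if the winner does not match the bytes in the file, characters break. The fix is to ensure your server, file, and declaration all agree: standardise on UTF-8 across the stack.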
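
A mismatch of this kind can be detected by comparing the three sources directly. The sketch below is illustrative only: header_charset, is_valid_utf8, and find_mismatches are hypothetical names, and the caller is assumed to supply the Content-Type header value, the meta charset value, and the raw response bytes.

```python
import re
from typing import List, Optional


def header_charset(content_type: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    match = re.search(r'charset\s*=\s*"?([\w-]+)', content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def is_valid_utf8(raw: bytes) -> bool:
    """Report whether the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_mismatches(content_type: str, meta_charset: Optional[str], raw: bytes) -> List[str]:
    """List disagreements between the header, the meta tag, and the actual bytes."""
    problems = []
    from_header = header_charset(content_type)
    declared = meta_charset.lower() if meta_charset else None
    if from_header and declared and from_header != declared:
        problems.append(f"header says {from_header}, meta says {declared}")
    if "utf-8" in (from_header, declared) and not is_valid_utf8(raw):
        problems.append("declared UTF-8 but bytes are not valid UTF-8")
    return problems
```

An empty list means the three sources are at least consistent with each other; it does not prove the text is correct, because bytes saved in a legacy encoding can occasionally also be valid UTF-8.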

BOM (Byte-Order Mark) Issues

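Some editors save UTF-8 files with a byte-order mark (the bytes EF BB BF) at the start. Browsers accept a BOM in HTML, but other tools in the pipeline may not: a template or include that begins with a BOM can emit stray invisible bytes ahead of the doctype or the rest of the output. Save UTF-8 without BOM whenever possible.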
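
Because the BOM is invisible in most editors, it is easier to detect from the raw bytes. A minimal sketch using Python's standard codecs module (the helper names are hypothetical):

```python
import codecs


def has_utf8_bom(raw: bytes) -> bool:
    """Report whether the data starts with the UTF-8 byte-order mark (EF BB BF)."""
    return raw.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(raw: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM; other input is returned unchanged."""
    return raw[len(codecs.BOM_UTF8):] if has_utf8_bom(raw) else raw
```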

Mojibake / Garbled Characters

"Â£" showing up where "£" should appear, or "â€™" instead of an apostrophe, are classic encoding problems. Both appear when UTF-8 bytes are decoded as a legacy single-byte encoding such as Windows-1252. The underlying cause is almost always a mismatch somewhere between file, server, and declaration.

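Mojibake is easy to reproduce, which helps when diagnosing it. The sketch below simulates the most common case, UTF-8 bytes decoded as Windows-1252, and reverses it. The function names are hypothetical, and repair only works when the text was garbled exactly once in this way.

```python
def garble(text: str) -> str:
    """Simulate mojibake: encode as UTF-8, then wrongly decode as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(garbled: str) -> str:
    """Undo garble(); assumes a single UTF-8 -> Windows-1252 mix-up."""
    return garbled.encode("cp1252").decode("utf-8")


print(garble("\u00a3"))   # Â£   (pound sign)
print(garble("\u2019"))   # â€™  (curly apostrophe)
print(repair("Â£"))       # £
```

Repairing text after the fact is a stopgap; the lasting fix is still to make file, server, and declaration agree.
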
Why This Matters for SEO

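Doctype and charset are not direct ranking factors, but they affect rankings indirectly: quirks-mode rendering can break layouts, and encoding problems can garble the text that visitors and search engines see.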

How to Fix Doctype and Encoding Issues

  1. Add HTML5 doctype. Ensure <!DOCTYPE html> is the first line.
  2. Add UTF-8 charset. Place <meta charset="UTF-8"> as the first element inside <head>.
  3. Save files as UTF-8. Configure your editor (VS Code, Sublime, etc.) to save files as UTF-8 without BOM by default.
  4. Configure server Content-Type. Ensure your web server sends Content-Type: text/html; charset=utf-8 headers.
  5. Test. Use this checker to verify, then view-source to manually confirm.
  6. Audit sitewide. Use Site Audit to flag any page missing the declarations.
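
How step 4 is done depends on your server. As one hedged illustration, assuming a hypothetical Python WSGI application (the names app and PAGE are invented for this sketch, and nginx, Apache, and other servers are configured differently), the header can be set explicitly next to a body that is encoded to match:

```python
from typing import Callable, Iterable

# Hypothetical page: HTML5 doctype first, charset first inside <head>.
PAGE = (
    "<!DOCTYPE html>\n"
    '<html><head><meta charset="UTF-8"><title>Demo</title></head>'
    "<body>Price: \u00a310</body></html>"
)


def app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """Serve PAGE with a Content-Type header that matches its real encoding."""
    body = PAGE.encode("utf-8")  # the bytes actually sent are UTF-8
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),  # header agrees with the bytes
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The point of the sketch is that the header, the meta tag, and the encoded bytes all name the same encoding.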
