[Draft] Day 8: “What's wrong with this HTML, and is it valid?” / Patrick Brosset #200
Open
captainbrosset wants to merge 12 commits into matuzo:advent25 from captainbrosset:whats-wrong
- e37060a first draft (captainbrosset)
- 2d2d77d Improvements (captainbrosset)
- ec6886b Mostly finalized (captainbrosset)
- 4b81ae9 Apply suggestions from code review (captainbrosset)
- a5b0d12 Addressed some review comments (captainbrosset)
- 1dbda72 Added quirks mode image and examples (captainbrosset)
- de5f9e4 dom tree drawing (captainbrosset)
- 24cc084 Update hell/adventcalendar/2025/8/index.md (captainbrosset)
- 441a1df Update hell/adventcalendar/2025/8/index.md (captainbrosset)
- 25de646 Update hell/adventcalendar/2025/8/index.md (captainbrosset)
- 2dc780e code block (captainbrosset)
- b124094 Final touch ups. (captainbrosset)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,35 +1,175 @@ | ||
| --- | ||
| title: "What's wrong with this HTML, and is it valid?" | ||
| author: "Patrick Brosset" | ||
| author_bio: "I'm in love with the web platform and developer tools. I have been working on browsers for more than 10 years. Currently a product manager on the Edge team at Microsoft, I used to work at Mozilla, and was a web developer before that. I do developer relations, technical documentation, and work on a wide range of web platform technologies and tools. Always happy to chat on social media!" | ||
| date: 2025-12-08 | ||
| author_links: | ||
| - label: "Site" | ||
| url: "https://patrickbrosset.com" | ||
| link_label: "patrickbrosset.com" | ||
| - label: "Mastodon" | ||
| url: "https://mas.to/@patrickbrosset" | ||
| link_label: "@patrickbrosset@mas.to" | ||
| - label: "Bluesky" | ||
| url: "https://bsky.app/profile/patrickbrosset.com" | ||
| link_label: "@patrickbrosset.com" | ||
| intro: "<p>Short introductory text</p>" | ||
| image: "advent25_8" | ||
| --- | ||
| Behold this magnificent HTML document: | ||
|
|
||
| ```html | ||
| <html> | ||
| <body marginheight=150 marginwidth=300 bgcolor=black text=white> | ||
| <marquee> | ||
| <b>Hello <i>HTML</b> World!</i> | ||
| </marquee> | ||
| ``` | ||
|
|
||
| To try it in your browser, copy the following line and paste it into the address bar: | ||
|
|
||
| `data:text/html,<html><body marginheight=150 marginwidth=300 bgcolor=black text=white><marquee><b>Hello <i>HTML</b> World!</i></marquee>` | ||
|
||
|
|
||
| ## What's wrong with it? | ||
|
|
||
| Everything? I mean, this HTML looks like it was written in 1998! | ||
|
|
||
| Here's what's wrong with it: | ||
|
||
|
|
||
| 1. The document is in _quirks mode_ because it lacks a proper [`DOCTYPE` preamble](https://developer.mozilla.org/docs/Glossary/Doctype). | ||
|
|
||
| If you've never heard of quirks mode, then you're probably lucky enough to have started your web development career after it stopped being something you needed to know about. Suffice it to say, it's weird. Here are some of the ways quirks mode impacts HTML documents: | ||
|
||
|
|
||
| * The box model behaves differently, which affects layout and spacing. | ||
|
||
| * Some CSS properties don't really work as you'd expect. | ||
| * Certain inline elements don't vertically align the way you think they should. | ||
| * Font sizes don't inherit on table elements. | ||
|
|
||
| 1. The `<head>` tag is missing, which means the document has no `<title>` either, which is bad for accessibility. | ||
|
||
|
|
||
| A common thing that assistive technology users do is read the title of a page first to know if they want to spend more time reading the page's content. Without a descriptive title, folks are forced to start reading more of the content to know if that's what they were looking for in the first place, which is time-consuming and potentially confusing. | ||
|
|
||
| 1. The `<body>` tag uses deprecated attributes: `marginheight`, `marginwidth`, `bgcolor`, and `text`. | ||
|
|
||
| These attributes are [obsolete and discouraged by the spec itself](https://html.spec.whatwg.org/multipage/obsolete.html#obsolete). | ||
|
|
||
| 1. The `<marquee>` tag is [obsolete](https://html.spec.whatwg.org/multipage/obsolete.html#the-marquee-element) and should be avoided in favor of CSS animations. | ||
|
|
||
| Plus, if you really must animate scrolling text, then please use the [`prefers-reduced-motion` media query](https://developer.mozilla.org/docs/Web/CSS/@media/prefers-reduced-motion) to respect user preferences. | ||
|
|
||
| 1. The `<b>` and `<i>` tags look like they're used for styling. That's wrong, right? | ||
|
|
||
| More on that later. | ||
|
|
||
| 1. The `<b>` and `<i>` tags are improperly nested. The nesting is `<b><i></b></i>` which is out of order. | ||
|
|
||
| 1. The closing `</body>` and `</html>` tags are missing. | ||
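|
|
||
| For comparison, here is one way the same page could be written so that it avoids every issue in the list above. Treat it as a sketch rather than the one true fix: the `lang` value, the title text, the `ticker` class, the `ticker-scroll` keyframes name, and the 10-second duration are all assumptions made for the example. | ||
|
|
||
| ```html | ||
| <!DOCTYPE html> | ||
| <html lang="en"> | ||
|   <head> | ||
|     <meta charset="utf-8"> | ||
|     <title>Hello HTML World</title> | ||
|     <style> | ||
|       body { | ||
|         margin: 150px 300px; /* replaces marginheight and marginwidth */ | ||
|         background-color: black; /* replaces bgcolor */ | ||
|         color: white; /* replaces text */ | ||
|       } | ||
|       /* Hypothetical class standing in for <marquee> */ | ||
|       .ticker { overflow: hidden; white-space: nowrap; } | ||
|       .ticker span { | ||
|         display: inline-block; | ||
|         padding-left: 100%; /* start just outside the right edge */ | ||
|         animation: ticker-scroll 10s linear infinite; | ||
|       } | ||
|       @keyframes ticker-scroll { | ||
|         to { transform: translateX(-100%); } | ||
|       } | ||
|       /* No scrolling for users who asked for less motion */ | ||
|       @media (prefers-reduced-motion: reduce) { | ||
|         .ticker span { animation: none; padding-left: 0; } | ||
|       } | ||
|     </style> | ||
|   </head> | ||
|   <body> | ||
|     <p class="ticker"> | ||
|       <span><b>Hello <i>HTML</i></b><i> World!</i></span> | ||
|     </p> | ||
|   </body> | ||
| </html> | ||
| ``` | ||
|
|
||
| The `<i>` element is closed before `</b>` and reopened afterwards, so the bold and italic ranges are the same as in the original markup, but correctly nested. | ||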
|
|
||
| ## Is this valid HTML? | ||
|
|
||
| Well, yes and no: | ||
|
|
||
| * No: if you send this to the [W3C HTML validator](https://validator.w3.org/nu/), it'll be pretty angry at you and will list the errors I mentioned earlier. | ||
| * But also, yes: the resulting page just loads and works fine in browsers. | ||
|
|
||
| Before discussing each point in detail, don't you think this is just beautiful? HTML is so self-correcting that making a browser fail only by using HTML is really hard to achieve, and HTML that looks like it was written two decades ago still works! I mean, take a look at [www.spacejam.com](https://www.spacejam.com/1996/), [this old bar website](https://www.thecrystalcornerbar.com/), or even [the very first web page that was ever created](https://info.cern.ch/hypertext/WWW/TheProject.html). | ||
|
||
|
|
||
| Now let's go over the list of issues I mentioned earlier one more time, but this time, let's talk about why they're not actually causing any issues: | ||
|
|
||
| 1. Sure, quirks mode can lead to weird rendering issues if you don't know that you're using it, but it's still implemented in browsers and perfectly ok to use. | ||
|
|
||
| Even if quirks mode was added for [historical reasons](https://quirks.spec.whatwg.org/#history), to support web pages that were made before the CSS specification was fully fleshed out, the code in browser engines which detects the document mode and renders it accordingly is here to stay. There really is no reason for browsers to ever remove it, unless, one day, all quirks mode documents disappear from the web. | ||
|
|
||
| Judging by Chrome's [QuirksModeDocument use counter](https://chromestatus.com/metrics/feature/timeline/popularity/2034), showing that about 30% of page loads in Chrome use quirks mode, I don't think that's going to happen anytime soon. I think a lot of these page loads are due to ads creating iframes without a proper `DOCTYPE`, but still, that's a lot of pages. | ||
|
||
|
|
||
| If you're encountering weird rendering issues that you can't explain, double check that you have a `DOCTYPE` in your HTML document. You can also run the following line of code in the browser console: `document.compatMode`. If it returns `BackCompat`, then you're in quirks mode. | ||
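|
|
||
| To see this for yourself, here's a minimal sketch of a page that logs its own mode (the title text is just a placeholder). As written, it should log `CSS1Compat`. Remove the first line, reload, and it should log `BackCompat` instead. | ||
|
|
||
| ```html | ||
| <!DOCTYPE html> | ||
| <title>Which mode am I in?</title> | ||
| <script> | ||
|   // "CSS1Compat": standards mode. "BackCompat": quirks mode. | ||
|   console.log(document.compatMode); | ||
| </script> | ||
| ``` | ||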
|
|
||
| 1. The `<head>` tag can definitely be omitted. Neither the [HTML specification](https://html.spec.whatwg.org/multipage/semantics.html#the-head-element), nor browser implementations require the tag to be present. | ||
|
|
||
| It's bad for accessibility reasons if you omit it, again because you probably also won't have a `<title>` tag, but it still works. | ||
|
|
||
| In fact, you can omit the `<html>` and `<body>` tags too. Personally, I commonly use this to quickly test things out in the browser. Instead of creating a new HTML file on my computer, which takes a bit more time, I just type some HTML in the address bar directly. For example: `data:text/html,<div>something`. No `<html>`, no `<head>`, no `<body>` tags. | ||
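|
|
||
| If you want a page title as well, a sketch like the following still skips all three tags (the title text is an arbitrary placeholder). The parser creates the `html`, `head`, and `body` elements on its own, and puts the `<title>` in the head. | ||
|
|
||
| ```html | ||
| <!DOCTYPE html> | ||
| <title>Quick test</title> | ||
| <div>something | ||
| ``` | ||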
|
||
|
|
||
| 1. `marginheight`, `marginwidth`, `bgcolor`, and `text` are deprecated _presentational attributes_. But, even if they're deprecated and discouraged, they're still implemented in browsers, for backward compatibility reasons. | ||
|
|
||
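| In fact, `<body>` has other similar attributes: `link`, `alink`, and `vlink`. They are all still reflected by deprecated `document` properties too: `bgColor`, `fgColor`, `linkColor`, `alinkColor`, and `vlinkColor`. | ||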
|
|
||
| If you're as old as I am, you might have used these attributes a long time ago, perhaps when creating sites in FrontPage or Dreamweaver. | ||
|
|
||
| Anyway, these presentational attributes act as 0-specificity CSS properties, which means that any CSS property you set in a stylesheet will override them. | ||
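|
|
||
| A small sketch of what that means in practice (the colors and text are arbitrary choices for the example): the stylesheet sets a background color, so `bgcolor` loses, while nothing in the stylesheet sets `color`, so the `text` attribute still applies. | ||
|
|
||
| ```html | ||
| <!DOCTYPE html> | ||
| <title>Presentational attributes versus CSS</title> | ||
| <style> | ||
|   /* Any author rule beats the bgcolor attribute */ | ||
|   body { background-color: navy; } | ||
| </style> | ||
| <body bgcolor="black" text="white"> | ||
|   <p>The background is navy, and the text is still white. | ||
| ``` | ||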
|
|
||
| 1. The `<marquee>` element still animates text in browsers. In fact, if you want to go crazy with it, try nesting two `<marquee>` elements, like this: | ||
|
|
||
| ```html | ||
| <marquee | ||
| direction="down" | ||
| width="200" | ||
| height="200" | ||
| behavior="alternate"> | ||
| <marquee behavior="alternate">This text will bounce</marquee> | ||
| </marquee> | ||
| ``` | ||
|
|
||
| Take a look at [the example on codepen](https://codepen.io/captainbrosset/pen/dPGvrMQ?editors=1100). | ||
|
||
|
|
||
|
|
||
| 1. Using `<b>` and `<i>` is perfectly valid. They used to be meant for making the text bold and italic, hence their names. HTML 4 discouraged them in favor of style sheets, and HTML5 later redefined what they mean. The `<b>` tag now means _bring attention_ and the `<i>` tag now means _idiomatic text_. | ||
|
|
||
| `<b>` is now used to mark up keywords, product names, or other spans of text whose typical presentation would be boldfaced, without conveying any special importance. | ||
|
||
|
|
||
| `<i>` is now used to mark up text that is set off from the normal prose for readability reasons. | ||
|
||
|
|
||
| More semantic tag names have since been invented too: `<strong>`, `<em>`, or `<mark>`, which convey slightly different semantics. | ||
|
|
||
| If there's no semantic aspect to the piece of text you want to make bold or italic, don't use `<b>` or `<i>`, use CSS `font-weight` and `font-style` instead. | ||
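|
|
||
| To make the distinction concrete, here's an illustrative snippet. The product name, the sentences, and the `decorative` class are all invented for the example. | ||
|
|
||
| ```html | ||
| <style> | ||
|   /* Purely visual styling, no meaning attached */ | ||
|   .decorative { font-weight: bold; font-style: italic; } | ||
| </style> | ||
| <!-- b: draws attention to a product name, without extra importance --> | ||
| <p>The <b>SuperWidget 3000</b> comes in three colors.</p> | ||
| <!-- i: a phrase in another language, set off from the surrounding prose --> | ||
| <p>The design has a certain <i lang="fr">je ne sais quoi</i>.</p> | ||
| <!-- strong and em: importance and stress emphasis --> | ||
| <p><strong>Warning:</strong> do <em>not</em> unplug it while it updates.</p> | ||
| <!-- No semantics at all: use CSS --> | ||
| <p><span class="decorative">Season's greetings!</span></p> | ||
| ``` | ||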
|
|
||
| 1. Misnested tags can sometimes happen in HTML, and when they do, the page doesn't break! | ||
|
|
||
| That's the beauty of HTML once again. If you're coming from an XML background, you might be surprised by the forgiveness of HTML. But, in the vast majority of cases, HTML parsers just figure things out on their own and get you what you want. | ||
|
|
||
| In our example, the markup is `<b><i></b></i>`, which feels obviously wrong because the closing `</b>` tag should appear after the closing `</i>` tag, to respect nesting. This particular markup creates the following DOM tree: | ||
|
|
||
| * b | ||
|   * i (child of b) | ||
| * i (sibling of b) | ||
|
|
||
| This behavior is actually specified in the HTML spec, and called the _adoption agency algorithm_. I think we owe it to [Chris Wilson](https://cwilso.com/) for thinking about this in the first place. Chris, if you ever find traces of old discussions about this, or care to write the backstory, I would be very interested! | ||
|
|
||
| Of course, I'm not saying you should do this. It's still important to create correctly nested HTML markup. But there are historical reasons for things like this to work. You have to remember that, back in the early days, browser engines didn't always agree on how to parse and render HTML. So, in order to ensure that as much of the web as possible was supported across all browsers, it was sometimes easier to just support how other browsers did things. And that's how things like misnested tags ended up being supported. | ||
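|
|
||
| If you want to check the resulting tree yourself, one option is to let the parser do its work and then read the markup back. In this sketch, the `demo` id only exists for the example. | ||
|
|
||
| ```html | ||
| <!DOCTYPE html> | ||
| <title>Misnested tags</title> | ||
| <div id="demo"><b>Hello <i>HTML</b> World!</i></div> | ||
| <script> | ||
|   // Serializes the DOM that the parser built from the misnested markup. | ||
|   console.log(document.getElementById("demo").innerHTML); | ||
|   // logs: <b>Hello <i>HTML</i></b><i> World!</i> | ||
| </script> | ||
| ``` | ||
|
|
||
| The first `<i>` ends up inside `<b>`, and the parser opens a second `<i>` after it for the remaining text. | ||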
|
||
|
|
||
| 1. Missing end tags are fine. The HTML parser is able to close most of them on its own. | ||
|
|
||
| For example, a list item doesn't need to be closed if what follows is another list item or the end of the list. So, this works fine: | ||
|
|
||
| ```html | ||
| <ul> | ||
| <li>Item 1 | ||
| <li>Item 2 | ||
| <li>Item 3 | ||
| </ul> | ||
| ``` | ||
|
|
||
| The same is true for paragraphs. You can omit the closing `</p>` tag if what follows is another paragraph, a heading, a list, and a whole lot of other elements: | ||
|
|
||
| ```html | ||
| <section> | ||
| <p>This is a paragraph | ||
| <p>This is another paragraph | ||
| <h2>This is a heading</h2> | ||
| <p>This is yet another paragraph | ||
| </section> | ||
| ``` | ||
|
|
||
| You can find out more about these examples, and others, in the [Optional tags section of the HTML spec](https://html.spec.whatwg.org/multipage/syntax.html#optional-tags). | ||
|
|
||
| Also, think about it: you're probably already using this without realizing. Have you ever closed an `<img>`, `<input>`, or `<link>` tag? Probably not, and that's fine. The HTML spec defines a whole set of void elements, which never take a closing tag: `<base>`, `<link>`, `<meta>`, `<hr>`, `<br>`, `<source>`, `<img>`, `<input>`, and others. | ||
|
||
|
|
||
| ## So, what's the moral of the story? | ||
|
|
||
| HTML can be very forgiving, and browsers implement things that may seem obscure or weird, but they do so for a very good reason: backward compatibility! | ||
|
|
||
| The web is the only platform where sites that were written years ago can still work fine today. This isn't to say that things never get removed, though. They do, and probably more often than you realize. Remember AppCache, WebSQL, module import assertions, or special rules that apply to the font-size of `<h1>` elements when nested inside certain elements? | ||
|
|
||
|
|
||
|
|
||
| This is both a blessing and a curse. The fact that so many of the languages we use are so forgiving and time-enduring made the web what it is today: a welcoming platform that doesn't take much effort to get used to, and that kind of just works. But this also means that old features and bad practices can linger on for a long time and, if they're used by many sites and users, can't really ever be removed. | ||
Whaaat!? I didn't know that was possible, cool!