Are you a coder? Please take our new survey (it's short and fun) about how you use AI at work.
HTML is deceptive. It looks easy. And easy HTML is easy. With a few tags you can write your name on a webpage, make it bigger or smaller, add “is awesome” in bold or italics, and even—for those of us who came of age in better times—make it blink or scroll across the screen.
Because HTML looks easy and lacks features like formal conditional logic and Turing-completeness, it’s often dismissed as not a programming language. “That’s not real code; it’s just markup” is a common refrain. Now, I’m no stranger to the austere beauty of the command line, from automating scripts to training machine-learning models. But underestimating HTML is a mistake.
HTML is the most significant computing language, programming or otherwise, ever developed. Every other programming language has to grapple with how HTML has redefined computing over the past 30-plus years. So many “pure” programming languages automate the production of more and more HTML.
When haters deny HTML’s status as a programming language, they’re showing they don’t understand what a language really is. Language is not instructing an interlocutor what to do in a way that leaves no room for other interpretations; it is better and richer than that. Like human language, HTML is conversational. It is remarkably adept at adapting to context. It can take a different shape on any machine, from a desktop browser or an e-reader screen to a mobile app or a screen reader for the blind (so long as that device is built to present hypertext).
HTML is somehow simultaneously paper and the printing press for the electronic age. It’s both how we write and what we read. It’s the most democratic computer language and the most global. It’s the medium we use to connect with each other and publish to the world. It makes perfect sense that it was developed to serve as a library—an archive, a directory, a set of connections—for all digital knowledge.
Like many readers my age, I encountered HTML in 1994, during those heady days when Netscape’s browser brought the web to much more of the world. In my case, the internet was available on just a few machines at my high school—I didn’t have a computer at home, let alone dial-up—but I quickly made myself an amateur expert in navigating the strange new space that Tim Berners-Lee and his colleagues had brought into being.
An older friend showed me how he built his own homepage, complete with animated “under construction” GIFs, an interactive Magic 8-Ball, and a handful of bandwidth-devouring photos. Since I didn’t have an email address, he set up a text entry field on his site where I could send him messages, with stern instructions that the textbox was “only for TIM.” He’d post his own responses directly on the website, so I (and theoretically anyone) could read them when I next got access to a machine. This is what passed for information security for two low-risk targets in those days.
The idea that you could build something fun for your friends, but potentially anyone in the world could discover it, was intoxicating. So was the language itself. Most raw HTML then was easy for humans, even newcomers, to understand. You didn’t need a primer in formal logic to grasp that most tags needed to be opened and closed (like parentheses in a math statement or ordinary sentence) to function as intended. And you didn’t need to be a mischief-minded teenager to see the possibilities in what happened when HTML “broke.”
To learn any programming language is to learn how to debug it. But a malformed command in Python usually returns an error message that keeps the code from running, not something that fails brilliantly yet monstrously, outpacing its creators’ intentions. With HTML, we are all Doctor Frankenstein.
One of my favorite websites of all time is the Embroidery Troubleshooting Guide. These days it’s available only via the Internet Archive, unless (like me) you have a local copy. At the top, it looks like a typical, if somewhat old-fashioned, small-business website. But when you glance down, you immediately notice something strange about it. The text, all center-aligned in alternating red and blue Arial, gradually gets bigger and bigger, with phrases forced to wrap lines or reach the edge mid-word, filling up the screen like Alice trying to squeeze through smaller and smaller doors in Wonderland.
When you view the source code (have any other programs made it so easy to view source like a website?), you’ll quickly discover what’s gone wrong. Each line of centered text begins with <h2> or <h3> header tags that never close. Each header tag—which only establishes a relative size, not an absolute one, part of the semantic richness of the web’s flexible grammar—builds on the last, creating progressively larger nesting dolls. The tag designed for defining textual hierarchy runs amok, creating chaos. The fact that the words themselves are about how and why threads can break makes it poetry.
On its own, the Embroidery Troubleshooting Guide would be a clever enough piece of found conceptual art. But by viewing the source, downloading the file, and replacing the instructions for troubleshooting common sewing problems with any text you like, you can make that artwork your own. I like to put in my favorite poetry, decontextualizing it and forcing myself to read it with new eyes.
“Broken” sites like these upend the great achievement of semantic HTML. As it developed, semantic HTML increasingly separated structure from presentation: Instead of <i> tags, which strictly specify that a text be presented in italics, we use <em> tags to identify emphasis (or <cite> tags for titles of books or movies, etc.). These elements can then be presented as italics on a computer screen but be read with a different intonation by a screen reader. The Embroidery Troubleshooting Guide hijacks a semantic tag and makes it present something unexpected. The same building blocks that allow a single website to be displayed responsively on a tiny phone or enormous television screen can make a website fundamentally undisplayable. This is delightful.
I appreciate the utility of content management systems and complex sites that generate HTML dynamically, but there’s a joy in building sites out of simple HTML files you can edit by hand. I still edit my own website this way, tidying it up so I can see every tag, section, and paragraph break. I even love editing my own ebooks, turning PDFs into nicely formatted HTML-based EPUB files that never get published to anyone: my own private library of self-contained websites. During the height of the pandemic, editing these files and their style sheets by hand was a balm.
Ultimately, even as HTML has become the province of professionals, it cannot be gatekept. This is what makes so many programmers so anxious about the web, and sometimes pathetically desperate to maintain the all-too-real walls they’ve erected between software engineers and web developers. But people who write HTML know that hierarchies were made to be blown up. All it takes is a tag that doesn’t close where you’d expect it to.
What other programmers might say dismissively is something HTML lovers embrace: Anyone can do it. Whether we’re using complex frameworks or very simple tools, HTML’s promise is that we can build, make, code, and do anything we want.