Responsive Images | Software Designing

On Sep 21, 2018

I come here not to bury img, but to praise it.

Well, mostly.

Historically, I like img just fine. It’s refreshingly uncomplicated, on the surface: it fires off a request for the file in its src attribute, renders the contents of that file, and provides assistive technologies with an alternative narration. It does so quickly, efficiently, and seamlessly. For most of the web’s life, that’s all img has ever had to do—and thanks to years and years of browsers competing on rendering performance, it keeps getting better at it.

But there’s a fine line between “reliable” and “stubborn,” and I’ve known img to come down on both sides of it.

Though I admit to inadvertently hedging my bets a little by contributing to the jQuery Mobile Project—a framework originally dedicated to helping produce “mobile sites”—I’ve always come down squarely in the responsive web design (RWD) camp. For me, the appeal of RWD wasn’t in building a layout that adapted to any viewport—though I do still think that’s pretty cool. The real appeal was in finding a technique that could adapt to the unknown-unknowns. RWD felt—and still feels—like a logical and ongoing extension of the web’s strengths: resilience, flexibility, and unpredictability.

That said, I would like to call attention to one thing that m-dot sites (dedicated mobile versions of sites, usually found at a URL beginning with the letter m followed by a dot) did have over responsively designed websites, back in the day: specially tailored assets.

Tailoring assets

In a responsive layout, just setting a max-width: 100% in your CSS ensures that your images will always look right—but it also means using image sources that are at least as large as the largest size at which they’ll be displayed. If an image is meant to be displayed anywhere from 300 pixels wide to 2000 pixels wide, that same 2000-pixel-wide image is getting served up to users in all contexts. A user on a small, low-resolution display gets saddled with all of the bandwidth costs of massive, high-resolution images, but ends up with none of the benefits. A high-resolution image on a low-resolution display looks like any other low-resolution image; it just costs more to transfer and takes longer to appear.
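As a minimal sketch of that rule in practice: the selector, the height: auto declaration, and the file name below are illustrative assumptions, not taken from any particular site.

```html
<style>
  /* Let every image shrink to fit its container, but never scale up
     past its natural width. height: auto (an addition here) keeps the
     aspect ratio intact when the width is constrained. */
  img {
    max-width: 100%;
    height: auto;
  }
</style>

<!-- Hypothetical 2000-pixel-wide source. Every visitor downloads this
     same file, whether it ends up rendered at 300 or 2000 pixels wide. -->
<img src="lighthouse-2000.jpg" alt="A lighthouse on a rocky shore at dusk">
```

The layout problem is solved in one declaration; the transfer cost is not, because the src never changes.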

Even beyond optimization, it wasn’t uncommon to show or hide entire blocks of content, depending on the current viewport size, during those early days of RWD. Though the practice became less common as we collectively got the hang of working responsively, img came with unique concerns when serving disparate content across breakpoints: our markup was likely to be parsed long before our CSS, so an img would have no way of knowing whether it would be displayed at the current viewport size. Even an img (or its container) set to display: none would trigger a request, by design. More bandwidth wasted, with no user-facing benefit.
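A rough illustration of that waste, assuming a hypothetical promo block that is hidden on narrow viewports (the class name, breakpoint, and file name are made up for the example):

```html
<style>
  /* Hide the promo block on narrow viewports. */
  @media (max-width: 40em) {
    .promo {
      display: none;
    }
  }
</style>

<div class="promo">
  <!-- The browser requests promo-wide.jpg when it encounters this img,
       regardless of whether the rule above ends up hiding the container.
       On a narrow viewport the file is downloaded and never shown. -->
  <img src="promo-wide.jpg" alt="Seasonal subscription offer">
</div>
```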

Our earliest attempts

I am fortunate enough to have played a tiny part in the history of RWD, having worked alongside Filament Group and Ethan Marcotte on the Boston Globe website back in 2011.

It was, by any measure, a project with weight. The Globe website redesign gave us an opportunity to prove that responsive web design was not only a viable approach to development, but that it could scale beyond the “it might be fine for a personal blog” trope—it could work for a massive news organization’s website. It’s hard to imagine that idea has ever needed proving, looking back on it now, but this was a time when standalone m-dot sites were widely considered a best practice.

While working on the Globe, we tried developing a means of delivering larger images to devices with larger screens, beginning with the philosophy that the technique should err on the side of mobile: start with a mobile-sized and -formatted image, then swap that with a larger version depending on the user’s screen size. This way, if anything should break down, we’re still erring on the side of caution. A smaller—but still perfectly representative—image.

The key to this was getting the screen’s width in JavaScript, in the head of the document, and relaying that information to the server in time to defer requests for images farther down the page. At the time, that JavaScript would be executed prior to any requests in body being made; we used that script to set a cookie about the user’s viewport size, which would be carried along with those img requests on the same page load. A bit of server-side scripting would read the cookie and determine which asset to send in response.
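The client-side half of that approach might have looked roughly like the sketch below. The cookie name, image path, and page structure are assumptions for illustration, not the Globe's actual code, and the server-side script that reads the cookie is not shown.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Cookie-based image selection (sketch)</title>
  <script>
    // Runs while the head is being parsed. The technique depends on this
    // happening before the browser issues any image requests for the body.
    // "viewport-width" is a hypothetical cookie name.
    document.cookie = "viewport-width=" + window.screen.width + "; path=/";
  </script>
</head>
<body>
  <!-- The request for this file carries the cookie set above. A server-side
       script (not shown) would read it and respond with a mobile-sized or
       larger version of the image from the same URL. -->
  <img src="/images/lead-photo.jpg" alt="Crowds gathered outside the statehouse">
</body>
</html>
```

If the cookie is missing for any reason, the server can fall back to the smallest image, which keeps the technique erring on the side of mobile.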

It worked well, but it was squarely in the realm of “clever hack”—that parsing behavior wasn’t explicitly defined in any specifications. And in the end, as even the cleverest hacks are wont to do, it broke.

Believe it or not, that was good news.

Prefetching—or “speculative preparsing”—is a huge part of what makes browsers feel fast: before we can even see the page, the browser starts requesting assets so they’re closer to “ready” by the time the page appears. Around the time the Globe’s site launched, several major browsers made changes to the way they handled prefetching. Part of those changes meant that an image source might be requested before we had a chance to apply any of our custom logic.

Now, when browsers compete on performance, users win—those improvements to speculative preparsing were great news for performance, improving load times by as much as 20 percent. But there was a disconnect here—the fastest request is the one that never gets made. Good ol’ reliable img was single-mindedly requesting the contents of its src faster than ever, but often the contents of those requests were inefficient from the outset, no matter how quickly the browser managed to request, parse, and render them—the assets were bigger than they’d ever need to be. The harm was being done over the wire.

So we set out to find a new hack. What followed was a sordid tale of noscript tags and dynamically injected base tags, of document.write and eval—of rendering all of our page’s markup in a head element, to break preparsing altogether.

For some of you, the preceding lines will require no explanation, and for that you have my sincerest condolences. For everyone else: know that it was the stuff of scary developer campfire stories (or, I guess, scary GIF-of-a-campfire stories). Messy, hard-to-maintain hacks all the way down, relying entirely on undocumented, unreliable browser quirks.

Worse than those means, though, were the ends: none of it really worked. We were always left with compromises we’d be foisting on a whole swath of users—wasted requests for some, blurry images for others. It was a problem we simply couldn’t solve with sufficiently clever JavaScript; even if we had been able to, it would’ve meant working around browser-level optimizations rather than taking advantage of them. We were trying to subvert browsers’ improvements, rather than work with them. Nothing felt like the way forward.

We began hashing out ideas for a native solution: if HTML5 offered us a way to solve this, what would that way look like?

A native solution

What began in a shared text file eventually evolved into one of the first and largest of the W3C’s Community Groups—places where developers could build consensus and offer feedback on evolving specifications. Under the banner of the “Responsive Images Community Group,” we—well, at the risk of ruining the dramatic narrative, we argued on mailing lists.

One such email, from Bruce Lawson, proposed a markup pattern for delivering context-appropriate images that fell in line with the existing rich-media elements in HTML5—like the video tag—even borrowing the media attribute. He called it picture; image was already taken as an ancient alias of img, after all.

What made this proposal special was the way it used our reliable old friend img. Rather than a standalone element, picture came to exist as a wrapper, and a decision engine, for an inner img element.
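In outline, the pattern looks something like the following sketch. It uses the syntax as it was eventually standardized, which may differ in detail from the original proposal; the media conditions, file names, and alt text are placeholders.

```html
<picture>
  <!-- Each source carries a condition. The browser walks the list in
       order and uses the first source whose condition matches. -->
  <source media="(min-width: 60em)" srcset="harbor-large.jpg">
  <source media="(min-width: 30em)" srcset="harbor-medium.jpg">

  <!-- The inner img is both the fallback and the element that actually
       renders. picture itself draws nothing. -->
  <img src="harbor-small.jpg" alt="Fishing boats moored in a small harbor">
</picture>
```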

That img inside picture would give us an incredibly powerful fallback pattern—it wouldn’t be the sort of standard where we have to wait for browser support to catch up before we could make use of it. Browsers that didn’t understand picture and its source elements would ignore it and still render the inner img. Browsers that did understand picture could use criteria attached to source elements to tell the inner img which source file to request.

Most important of all, though, it meant we didn’t have to recreate all of the features of img on a brand-new element: because picture didn’t render anything in and of itself, we’d still be leaning on the performance and accessibility features of that img.

This made a lot of sense to us, so we took it to the Web Hypertext Application Technology Working Group (WHATWG), one of the two groups responsible for the ongoing development of HTML.

If you’ve been in the industry for a few years, this part of the story may sound a little familiar. Some of you may have caught whispers of a fight between the WHATWG’s srcset and the picture element put forth by a scrappy band of web-standards rebels and their handsome, charismatic, and endlessly humble Chair. Some of you read the various calls to arms, or donated when we raised funds to hire Yoav Weiss to work full-time on native implementations. Some of you have RICG T-shirts, which—I don’t mind saying—were rad.

A lot of dust needed to settle, and when it finally did, we found ourselves with more than just one new element; edge cases begat use cases, and we discovered that picture alone wouldn’t be enough to suit all of the image needs of our increasingly complex responsive layouts. We got an entire suite of enhancements to the img element as well: native options for dealing with high-resolution displays, with the size of an image in a layout, with alternate image formats—things we had never been able to do natively, prior to that point.
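To make those three enhancements concrete, here is a brief sketch of each using the standardized srcset, sizes, and type attributes. All file names, widths, and breakpoints are invented for the example.

```html
<!-- High-resolution displays: x descriptors let the browser choose a
     source that suits the device's pixel density. -->
<img src="logo.png"
     srcset="logo.png 1x, logo-2x.png 2x"
     alt="Example Company logo">

<!-- Size of the image in the layout: w descriptors state each file's
     width, and sizes tells the browser how wide the image will be
     rendered, so it can pick a candidate before layout happens. -->
<img src="photo-800.jpg"
     srcset="photo-400.jpg 400w, photo-800.jpg 800w, photo-1600.jpg 1600w"
     sizes="(min-width: 60em) 50vw, 100vw"
     alt="A cyclist crossing a bridge in morning fog">

<!-- Alternate image formats: a browser that supports the declared type
     uses that source; others fall through to the inner img. -->
<picture>
  <source type="image/webp" srcset="photo.webp">
  <img src="photo.jpg" alt="A cyclist crossing a bridge in morning fog">
</picture>
```

In each case the src on the img remains as the fallback, so browsers without support for the newer attributes still get a working image.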