Native Audio with HTML5

Emily P. Lewis | October 12, 2011

 

Once upon a time, audio on the web lived primarily in the world of third-party browser plug-ins like Flash, QuickTime and Silverlight. This was not a bad world, but it had its issues.

For one, most plug-ins require the user to install them, but not all users are willing (or able) to install them. Also, many players built with these plug-ins are inaccessible, making it difficult for folks who use assistive technologies to access the audio or alternative content.

Then there are the front-end design hassles like trying to get a dropdown menu to display on top of a plug-in-based player. And let’s not forget that building a custom player with one of these plug-ins requires knowledge of, and expertise in, that plug-in’s SDK.

Enter HTML5

Today, we have another option: HTML5 <audio>. This new element allows you to deliver audio files directly through the browser, without the need for any plug-ins. It works much like the tried-and-true <img> element, embedding the audio file into a web page via the src attribute:

<audio src="audio.mp3"></audio>

Not only does native audio deliver independence from plug-ins, it can be targeted with CSS and JavaScript. This means that creating a custom player is simply a matter of writing HTML, CSS and JS. It also means more front-end control for responsive, dynamic designs and potentially better accessibility.
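To give a taste of what targeting <audio> with JavaScript looks like, here is a minimal sketch of a play/pause toggle. The function name togglePlayback and the ids player and toggle are my own placeholders, not anything the browser defines; the paused property and the play() and pause() methods are the standard ones every media element exposes.

```javascript
// Toggle playback on an <audio> element (or any object exposing the same
// `paused`, `play()` and `pause()` members as a media element).
// Returns the state that was requested: "playing" or "paused".
function togglePlayback(audio) {
  if (audio.paused) {
    const attempt = audio.play();
    // Newer browsers return a promise from play(); ignore a refusal here
    // so this sketch does not raise an unhandled rejection.
    if (attempt && typeof attempt.catch === "function") {
      attempt.catch(() => {});
    }
    return "playing";
  }
  audio.pause();
  return "paused";
}

// Hypothetical wiring for a page containing <audio id="player"> and
// <button id="toggle">. Both ids are assumptions made for this example.
if (typeof document !== "undefined") {
  const player = document.getElementById("player");
  const button = document.getElementById("toggle");
  if (player && button) {
    button.addEventListener("click", () => {
      button.textContent =
        togglePlayback(player) === "playing" ? "Pause" : "Play";
    });
  }
}
```

Because the button is ordinary HTML, it can be styled with CSS like anything else on the page.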

As far as browser support goes, <audio> enjoys support by all of today’s latest browsers, including mobile browsers for iOS 4+, Android 2.3+ and Opera Mobile 11+.

Sound good? Then let’s get started adding embedded audio in our web pages!

A Basic Audio Player

To add a simple audio player to your web page, all you need is a single line of markup:

<audio src="audio.mp3" controls></audio>

This includes the src attribute I already discussed, which embeds the specified audio file into the page. It also includes the controls attribute, which tells the browser to use its default control interface for audio.

As you can see in Figure 2, each browser has a different default for player controls but all include the basics: play/pause toggle, timeline progress bar and volume control.

More Attributes

Beyond src and controls, <audio> has several other attributes you can utilize to further modify how your audio file will load and play.

autoplay

<audio src="audio.mp3" controls autoplay></audio>

The Boolean autoplay attribute is one that I don’t recommend using because it specifies that the audio begin playing as soon as the page loads. This is a usability no-no for most scenarios, so exercise restraint in using this attribute.

If you do decide to utilize autoplay, please be sure to include the controls attribute (or roll your own custom controls) so that your users can stop the audio or reduce volume.
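If you would rather enforce that rule than remember it, a small script can do the checking. This is only a sketch: ensureControls is a name I made up, and it relies on the standard autoplay and controls properties that mirror the attributes.

```javascript
// If an element is set to autoplay but shows no native controls, switch the
// browser's default controls on so the listener can stop or quiet the audio.
function ensureControls(audio) {
  if (audio.autoplay && !audio.controls) {
    audio.controls = true;
  }
  return audio;
}

// Possible use in a page (skip this if you supply custom controls):
//   document.querySelectorAll("audio").forEach((el) => ensureControls(el));
```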

crossorigin

<audio src="audio.mp3" controls crossorigin="anonymous"></audio>

crossorigin tells the browser how to request an audio file that is served from a different domain. This is a very new attribute introduced for media elements (it applies to <video> and <img> too) to address playback issues with Cross-Origin Resource Sharing (CORS).

Depending on the scenario, crossorigin can be declared with an empty string or with one of the CORS settings attribute keywords: use-credentials or anonymous.
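The same setting is available from script through the element’s crossOrigin property. Here is a small sketch, assuming helper names of my own (corsKeyword, loadCrossOrigin) and a made-up media URL. Note that the specification spells the credentialed keyword use-credentials.

```javascript
// Map a simple yes/no choice onto the two CORS settings keywords.
function corsKeyword(sendCredentials) {
  return sendCredentials ? "use-credentials" : "anonymous";
}

// Set the CORS mode before assigning the source, so the request for the
// file is made with the intended mode.
function loadCrossOrigin(audio, url, sendCredentials = false) {
  audio.crossOrigin = corsKeyword(sendCredentials);
  audio.src = url;
  return audio;
}

// Possible use in a page (the URL is a placeholder):
//   loadCrossOrigin(document.querySelector("audio"),
//                   "https://media.example.com/audio.mp3");
```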

loop

<audio src="audio.mp3" controls loop></audio>

Another Boolean attribute, loop, tells the browser to loop the audio when playing. Like autoplay, I’m not a particular fan of this attribute because it takes control away from the user. But if you must use it, I recommend including the controls attribute alongside loop.

mediagroup

<audio src="audio.mp3" controls mediagroup="AnyName"></audio>

mediagroup is another relatively new attribute that is used to tie together multiple media files for synchronized playback. Each media element with the same keyword value for mediagroup is, essentially, linked and can be manipulated for playback via the DOM.

This attribute is valid for all media elements, so it is possible to link audio to audio, as well as audio to video and video to video.

muted

<audio src="audio.mp3" controls muted></audio>

The Boolean attribute muted does just what it says: mutes the audio file upon initial play. The user can then override this if volume controls are provided.
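If you roll your own volume control, the override happens through the muted and volume properties. A rough sketch follows, with clampVolume and setVolume as names of my own choosing. It assumes the requested level is a finite number, and it clamps the value because browsers throw an error when volume is set outside the range 0.0 to 1.0.

```javascript
// Keep a requested level inside the 0.0-1.0 range that `volume` accepts.
function clampVolume(level) {
  return Math.min(1, Math.max(0, level));
}

// Apply a volume chosen by the listener and lift the initial mute.
// Returns the volume that was actually applied.
function setVolume(audio, level) {
  audio.volume = clampVolume(level);
  audio.muted = false;
  return audio.volume;
}
```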

preload

<audio src="audio.mp3" controls preload></audio>

The preload attribute suggests how the browser should buffer the audio, according to the specified value:

  • preload="auto" (same as the Boolean preload in the example) leaves it up to the browser to decide whether to begin downloading.
  • preload="metadata" tells the browser to download information like tracks and duration, but to wait to buffer the audio until the user selects play.
  • preload="none" tells the browser that no audio information should be downloaded until the user activates the controls.
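The three values above, plus the bare attribute, can be summed up in a few lines of script. This is just an illustration of the list, not how any browser implements it: preloadHint is a hypothetical helper, and I have it return null for unrecognized values because the choice is then left to the browser.

```javascript
const PRELOAD_KEYWORDS = ["none", "metadata", "auto"];

// Return the buffering hint an attribute value asks for, or null when the
// value is not one of the keywords and the decision is left to the browser.
function preloadHint(value) {
  if (value === "") {
    return "auto"; // the bare Boolean-style `preload` attribute
  }
  const keyword = String(value).toLowerCase(); // keywords are case-insensitive
  return PRELOAD_KEYWORDS.includes(keyword) ? keyword : null;
}
```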

Make note that not all browsers support all of these attributes and that the specification itself is still changing. That means you have to experiment, test and stay up-to-date on the spec.

Fallback Content

As I already mentioned, <audio> is well supported by modern browsers. But what about users who aren’t on modern browsers? Depending on your audience, there could be a fair percentage of your users who can’t access your audio content. For those users, <audio> offers fallback content, which is contained within the opening and closing <audio> tags:

<audio src="audio.mp3" controls>
    <p>Your browser does not support native audio, but you can <a href="audio.mp3">download this MP3</a> to listen on your device.</p>
</audio>

For browsers that don’t support <audio>, this fallback content is what displays to the user, while browsers that do support native audio ignore the fallback and display the player.

In this example, I chose to include some explanatory text and a link to download the audio file for my fallback content. But you can pretty much include any content you want to serve to those users, including HTML.
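If you also need to know from script whether a browser supports native audio (say, to decide whether to build a custom player at all), a common feature test is to create an <audio> element and look for its canPlayType() method. Here is a sketch; supportsAudio is my own name, and it takes the document as a parameter only so it can be exercised outside a browser.

```javascript
// True when the given document can create an <audio> element that exposes
// canPlayType(), which browsers without native audio do not provide.
function supportsAudio(doc) {
  const probe = doc.createElement("audio");
  return Boolean(probe && typeof probe.canPlayType === "function");
}

// Possible use in a page:
//   if (supportsAudio(document)) { /* enhance the player */ }
```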

Not a Perfect World

HTML5 <audio> makes it (arguably) easier for an average front-end developer like me to add audio to web pages. And it opens up a world of possibilities for better media accessibility as the specification evolves.

Unfortunately, like the world of plug-ins, native audio has its issues too.

No Single Codec

To keep audio content for the web at reasonable sizes for streaming and download, audio data is compressed/decompressed using codecs. Different codecs transform the audio into different formats that offer good quality with minimum bitrates.

So far, there is no single standard for audio codecs in the HTML5 specification. This means that some browsers support some formats, while other browsers support others.
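To see which formats a given browser says it can play, you can ask it directly with canPlayType(), which answers "probably", "maybe" or an empty string for no support. The sketch below wraps that in a helper I have called supportReport; the list of MIME types in the usage comment is just an example.

```javascript
// Ask an audio element (or a stand-in with the same method) about each MIME
// type and collect the answers. An empty answer is reported as "no".
function supportReport(audio, types) {
  const report = {};
  for (const type of types) {
    report[type] = audio.canPlayType(type) || "no";
  }
  return report;
}

// Possible use in a page:
//   supportReport(document.createElement("audio"),
//                 ["audio/ogg", "audio/mpeg", "audio/wav"]);
```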

Multiple Audio Files

Fortunately, <audio> is set up to handle multiple file formats:

<audio controls>
    <source src="audio.ogg">
    <source src="audio.mp3">
</audio>

As you can see in this example, to declare multiple audio files, you first drop the src attribute from <audio>. Next, you nest child <source> elements inside <audio>, each of which specifies a different file format via the src attribute.

A browser will read the first-listed <source> and, if it supports the specified file format, the audio player will render on the page. If the browser doesn’t, it moves on to the next <source> element.

In the event the browser doesn’t find a <source> file format it can support, it will fail and playback won’t be possible.

Fallback content still has a place here for browsers that don’t support <audio> at all. It must be nested within <audio> and after all <source> elements:

<audio controls>
    <source src="audio.ogg">
    <source src="audio.mp3">
    <p>Your browser does not support native audio, but you can <a href="audio.mp3">download this MP3</a> to listen on your device.</p>
</audio>

In this example, a browser will first check if it supports <audio>. If it doesn’t, it goes straight to the fallback content.

If it does support <audio>, it next checks for support of file formats, starting with the first <source> and proceeding until it reaches a supported format. In the event no listed formats are supported, playback fails. Per the HTML specification, browsers that support <audio> do not display the fallback content, so it cannot rescue this case.
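The walk through the <source> elements can be pictured as a simple loop. What follows is a simplified sketch, not the browser’s actual algorithm: pickSource is a hypothetical function, sources are plain objects with src and type, and the canPlayType argument stands in for the browser’s own check.

```javascript
// Walk the sources in document order and return the first one that looks
// playable, or null when nothing in the list is playable.
function pickSource(sources, canPlayType) {
  for (const source of sources) {
    // A source without a type cannot be ruled out up front, so it is tried.
    // A typed source is skipped only when the answer is "" (no support).
    if (!source.type || canPlayType(source.type) !== "") {
      return source;
    }
  }
  return null;
}
```

For example, given an OGG source followed by an MP3 source, a browser stand-in that only answers "maybe" for MP3 ends up with the second entry.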

Files & Order

In terms of which file formats to include, it isn’t necessary to have all formats listed in Figure 3. Including just MP3 and OGG will cover all your bases for modern browsers supporting HTML5 <audio>.

Regarding source order, it technically doesn’t matter which audio file format is listed first. That said, I usually include my OGG <source> first. It is the higher-quality file, compared to MP3, and I want browsers that support both to get the OGG first. Also, there was a bug in older versions of Firefox where if the first <source> format was MP3 it failed, so listing OGG first can avoid triggering this bug.

MIME Types

In addition to specifying multiple audio formats, it is also good practice to specify MIME types for each audio file:

<audio controls>
    <source src="audio.ogg" type="audio/ogg">
    <source src="audio.mp3" type="audio/mpeg">
    <p>Your browser does not support native audio, but you can <a href="audio.mp3">download this MP3</a> to listen on your device.</p>
</audio>

Specifying a MIME type for each audio format helps the browser know what type of content it will be dealing with. This can speed up <audio> rendering because the browser won’t have to download the files to determine content type.
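If your markup is generated by a script or template, the type can be looked up from the file extension. Here is a minimal sketch; the table and the helper names (mimeTypeFor, sourceTag) are my own, and it assumes trusted file names since nothing is escaped. Note that the registered MIME type for MP3 files is audio/mpeg.

```javascript
// Extension-to-MIME lookup for a few common web audio formats.
const AUDIO_MIME_TYPES = new Map([
  ["ogg", "audio/ogg"],
  ["oga", "audio/ogg"],
  ["mp3", "audio/mpeg"],
  ["m4a", "audio/mp4"],
  ["wav", "audio/wav"],
]);

// Return the MIME type for a file name, or null if the extension is unknown.
function mimeTypeFor(fileName) {
  const extension = fileName.split(".").pop().toLowerCase();
  return AUDIO_MIME_TYPES.get(extension) || null;
}

// Build a <source> tag, adding the type attribute only when it is known.
function sourceTag(fileName) {
  const type = mimeTypeFor(fileName);
  return type
    ? `<source src="${fileName}" type="${type}">`
    : `<source src="${fileName}">`;
}
```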

Also, some browsers won’t play audio without the correct MIME type. For example, Safari 5.1 (at least as of this writing) will fail to play any audio if the first-listed <source> is an unsupported format like OGG without a specified MIME type.

Server Support

During my experiments with <audio> I encountered one of the more frustrating aspects of delivering native media: server support for MIME types. Though you can specify the MIME type for each audio format directly in your markup as seen in the example above, this doesn’t guarantee that your web server is configured to serve the files with those MIME types.

And if your server doesn’t send the correct MIME type for a given format, you may not have playback … something I discovered (not quickly enough) when an <audio> implementation that worked on my local system failed on the live web server.

I am the furthest thing from an expert on server configurations, but I have found success circumventing these MIME type issues by updating my sites’ .htaccess files to reference the correct file types. And the HTML5 Boilerplate .htaccess file is a fantastic template to start with.
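One way to confirm what the live server actually sends is to request just the headers for a file and look at Content-Type. The sketch below is a rough diagnostic, not part of any library: contentTypeMatches and serverSendsType are names I invented, and it assumes an environment where fetch() is available and allowed to reach the file.

```javascript
// Compare a Content-Type header value with the expected MIME type,
// ignoring case and any parameters such as "; charset=...".
function contentTypeMatches(headerValue, expectedType) {
  if (!headerValue) {
    return false;
  }
  const mediaType = headerValue.split(";")[0].trim().toLowerCase();
  return mediaType === expectedType.toLowerCase();
}

// Ask the server for the headers only and report whether the type matches.
// `fetchImpl` defaults to the global fetch(); pass a stand-in to test.
async function serverSendsType(url, expectedType, fetchImpl = fetch) {
  const response = await fetchImpl(url, { method: "HEAD" });
  return contentTypeMatches(response.headers.get("content-type"), expectedType);
}

// Possible use from a browser console on your own site:
//   serverSendsType("audio.ogg", "audio/ogg").then(console.log);
```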

Making the Transition

HTML5 is still new to so many developers. So maybe you aren’t quite ready to take the leap headfirst into <audio>? Or perhaps you have concerns about your users on browsers without <audio> support?

I completely understand wanting the best possible experience for all your users, regardless of their browsers. Fortunately, you can ease into HTML5 <audio> and gracefully degrade the experience for users on older browsers.

Flash Fallback

As I mentioned, <audio> fallback content can include HTML. And that means it can include a Flash <object> for browsers that don’t support <audio>:

<audio controls>
    <source src="audio.ogg" type="audio/ogg">
    <source src="audio.mp3" type="audio/mpeg">
    <object data="mediaplayer.swf?audio=audio.mp3">
        <param name="movie" value="mediaplayer.swf?audio=audio.mp3">
    </object>
</audio>

In this example, the browser will first check if it supports <audio>. If it doesn’t, it will fall back to the Flash audio player (provided the plug-in is installed).

If the browser does support <audio>, it will proceed through the <source> elements until it finds a supported format. In the event no supported format is listed, playback simply fails: a browser that supports <audio> ignores the fallback content, Flash player included.

Fallback for the Fallback

Now, what if Flash isn’t supported? That’s when you use the fallback’s fallback:

<audio controls>
    <source src="audio.ogg" type="audio/ogg">
    <source src="audio.mp3" type="audio/mpeg">
    <object data="mediaplayer.swf?audio=audio.mp3">
        <param name="movie" value="mediaplayer.swf?audio=audio.mp3">
        <p>Your browser does not support native audio or Flash, but you can <a href="audio.mp3">download this MP3</a> to listen on your device.</p>
    </object>
</audio>

Simply nest your Flash fallback content within the <object> and after all <param>s. Browsers that don’t support HTML5 audio or Flash will fall back to this content, in this case some explanatory text and a link to download the audio.

Pre-Built Players

Another way you can transition to HTML5 audio is to use a pre-built player. Many players today let you choose from different skins and even create your own skin via CSS. Additionally, several HTML5 media players are already built with Flash fallback content. Here are a few to check out:

Further Reading

To say this is the tip of the iceberg when it comes to native audio is an understatement. This article focuses on the core markup and syntax for embedding audio into your web pages. But the true power of native audio is the ability to target it using JS and CSS.

You can make your own custom player. You can visualize audio. You can generate audio on the fly. And these are just some of the early experiments.

As you experiment further with <audio>, please check out these resources:

P.S. You’ve got the foundation for <video> now!

Much of what this article discusses for HTML5 <audio> applies equally to <video>. As media elements, they share many of the same attributes and follow a similar syntax. <video> is also subject to many of the same issues as <audio> — specifically multiple file formats and MIME types — and benefits from the same solutions.

 

About the Author

Emily Lewis is a freelance web designer of the standardista variety, which means she gets geeky about things like semantic markup and CSS, usability and accessibility. As part of her ongoing quest to spread the good word about standards, she writes about web design on her blog, A Blog Not Limited, and is the author of Microformats Made Simple and a contributing author for the HTML5 Cookbook. She’s also a guest writer for Web Standards Sherpa, .net magazine and MIX Online.

In addition to loving all things web, Emily is passionate about community building and knowledge sharing. She co-founded and co-manages Webuquerque, the New Mexico Adobe User Group for Web Professionals, and is a co-host of The ExpressionEngine Podcast. Emily also speaks at conferences and events all over the country, including SXSW, MIX, In Control, Voices That Matter, New Mexico Technology Council, InterLab and the University of New Mexico.
