
Around the WWWorld: Web MIDI, Web Audio and what the web does best

So, you don’t use the Web Audio and Web MIDI APIs in your day job - does that mean they have nothing important to tell us about the nature of the Web? Think again! Katie Fenn takes us on a tour of the Web Audio and Web MIDI APIs by creatively coding Daft Punk’s 1997 classic, “Around the World”. The talk will reflect on what the Web is good at, and the enduring value of unlicensed standards.

Katie Fenn is a software engineer at The Financial Times. She works with all aspects of the web, particularly JavaScript, CSS, Node.js and ops. When not at her desk, she is usually in the pool or on her bike in the Peak District.

Transcript

Katie Fenn 0:16 So I’m Katie, and my Twitter handle is Katie underscore Fenn, but you can find me on Bluesky now at Katie Fenn. And a content warning: this talk contains loud electronic music.

So, I’m Katie and I really like Daft Punk, but you don’t have to like Daft Punk to watch this talk. I promise this talk isn’t really about Daft Punk. It’s more about the technology of electronic music, a whirlwind tour of what the Web Audio API and the Web MIDI API are capable of, and a reflection on what the web is for towards the end.

Just in case you don’t know who Daft Punk are: they’re a Grammy Award-winning electronic music duo from Paris, and they’re famous for dressing as robots while performing. This is the image that perhaps most people are familiar with. But what many people don’t know is that their career spans nearly 30 years. They were pioneers of French house music, and this is the image that they selected for their first album in 1997. Early on in their career, they used synthesizers that were already old hat in the 90s, including the Roland TR-909 drum machine, which you can see here and which was released in 1983, and the Roland Juno-106, which was released in 1984. These machines could be found for under 100 pounds secondhand, because they were considered old technology at the time — they were considered passé. But they did have one new technology which would change the world of electronic music forever, and that technology was MIDI.

This is what MIDI cables and ports look like. Nearly all synthesizers have them now, and MIDI cables are very cheap and very easy to use. MIDI lets musicians synchronize machines of different types and brands together. You could prepare performances at home on your sequencer and then mix them into a live performance on your MIDI-enabled synthesizer and drum machine in a club. Synthesizers were a common sight in prog rock bands in the 70s, and slowly they went from being a small part in a bigger band to being the band. Bands like Kraftwerk, Pet Shop Boys, Erasure, Orbital, the Chemical Brothers, the Prodigy and Daft Punk all used synthesizers to change the way that music was made and performed. And now the Web Audio API and the Web MIDI API integrate analog synthesis and MIDI into the web. These APIs are mere footnotes compared to our everyday work if you do front-end development, but they are very powerful, they’re very creative, and they’re a lot of fun to use.

So let’s explore these technologies, and let’s make a full song. Let’s make Daft Punk’s 1997 single “Around the World”. And for those of you who are watching now, this is what it sounds like.

So where do we start? Let’s start with this. This is my Arturia KeyStep Pro. This is the thing that you can’t see — it’s out of frame, it’s on my desk here. This is a MIDI controller and a sequencer. It looks a little bit like a keyboard you’d have in your bedroom to make music on in the 80s and 90s, but it’s slightly different: this makes absolutely no sound of its own. It only sends MIDI data to my computer. MIDI data tells computers, synthesizers and drum machines which notes to play. MIDI data can even be used to control lighting and visual effects in a live performance, and that really goes to show that MIDI is about message passing and nothing else.

So let’s hook this up to our browser using the Web MIDI API, and then we can see what MIDI data looks like. I’m going to be using normal Chrome DevTools, and I’m trying to get to the Sources panel, but it’s right behind my screen-sharing controls — so I’m going to move those out of the way so I don’t drop off the stream. Okay, so we are looking at this.
Okay, so I’m going to try and get some MIDI data into my slides. The first thing that I’m going to do is get the MIDI access from the navigator, and this is like the gateway to the whole Web MIDI API. So I’m going to type let access = navigator.requestMIDIAccess(), and we’re going to await that as well, because it returns a promise. Next, we’re going to iterate through all the connected devices. I’ve only got one device connected, but if you have a big home studio you might have more, so we need to iterate through them. We’re going to iterate with for device of access.inputs.values(), and that’s going to return to us all the devices. Then we check whether we have a device with the name “KeyStep Pro”.

Then what we’re going to do is create a callback function which gets executed every time that I push a key down, turn a control, or change anything on here. So we’re going to set device.onmidimessage equal to a new callback, and that gets a variable passed in with the MIDI message — we’re going to call that message. Next, because the message data which gets passed in is a Uint8Array, we want to turn that into a normal array so we get access to all the normal array methods that we can use to start manipulating it. So we’re going to create a variable called data, and we’re going to use Array.from to create a new normal array. Then I’m going to use document.querySelector to get an element with this class here, and I’m going to set its innerText equal to data.slice — I just want the first three items of the MIDI message. Hopefully, if I run that — I might need to open the console just to keep an eye on any errors that get thrown up — okay, let’s see if this works. Fantastic.
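
For readers following along, here is a minimal sketch pulling together the Web MIDI code described in the last two paragraphs. The requestMIDIAccess call, the inputs iteration, the device-name check and the onmidimessage handler come from the talk; the ".midi-data" selector and the exact formatting of the output are hypothetical stand-ins for the element in the slides.

```js
// Minimal Web MIDI sketch, assuming a context where top-level await works
// (a module or the DevTools console) and that the browser grants MIDI permission.
const access = await navigator.requestMIDIAccess();

for (const device of access.inputs.values()) {
  // "KeyStep Pro" is the controller used in the talk; match your own device name here.
  if (device.name.includes('KeyStep Pro')) {
    device.onmidimessage = (message) => {
      // message.data is a Uint8Array; copy it into a plain array
      // so the usual Array methods are available.
      const data = Array.from(message.data);

      // '.midi-data' is a hypothetical element for display purposes:
      // show the first three bytes — status, note number, velocity.
      document.querySelector('.midi-data').innerText = data.slice(0, 3).join(' ');
    };
  }
}
```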

Okay, so we can see some MIDI data here. Let’s see what this looks like, and I’ll tell you what each bit is. The first value is the status byte, and it’s often used to identify the control that’s being pressed. So if I press one key down, then we get 146. Some controls have different values depending on their context, so if I lift the key back up again, we should get a different number — we get 130. So 146 and 130, or 153 and 137 if I’m on the right setting here. So this has context, and depending on the control that you are using, it might return one value or the other. The second is the data byte, which carries information such as the note that’s being pressed. So if I press bottom C, then we get 36. If I go up a semitone, then we get 37, 38, 39, 40, 41, and so on.

Additional parameters then follow, and here it’s the velocity of the key that I’m pressing. What velocity means is that if I press a key very slowly, then I get a very low number, and if I push it very fast, then I get a very high number. You can use this in your code to accentuate or quieten some of the sounds that you want to make, like a real-life piano would — when you push one of the keys down very hard, you get a much louder, brighter sound. So that’s what velocity is for. Or, if I move some of the other controls on here, then we might get a number which goes up and down.

Okay. So the brilliant thing about MIDI data is that you don’t need API documentation to use these values, and, counter-intuitively, that makes them easier to understand. They’re designed to be learned intuitively, and professional audio software works the same way. If I wanted to hook this MIDI controller up to a professional digital audio workstation like Logic Pro or Ableton, I would select the setting that I want one of these controls to change, put it into a learning mode, and then the next control that I move gets assigned to it. And you write your code in the same way. This means that you don’t need any API documentation to learn how each different device works. As long as you perform the same action in the same way twice, you’ll get the same MIDI data, and that makes it really easy to use — you just write your code to anticipate the kind of data that you get in.

So now that we know how to tell the computer to make a noise, we need to figure out how to make the computer make a noise that we want to hear. This is what the Web Audio API is for. The workflow for the Web Audio API works by connecting nodes together, and you’re going to see me write a lot of code, but this is what it boils down to: we take an input, we transform it using different effects, and then we send it out to a destination, which is usually your default system audio out, and we connect these nodes together. This workflow models the way that a real hardware synthesizer works, where you patch an oscillator into an output in order to make a noise.

So let’s start with an oscillator and see if we can make one. This is what oscillators sound like. Oscillators create an electronic signal which makes a noise when you connect it to a loudspeaker. I’ve hooked up my synthesizer in this video to an oscilloscope so that you can see the shape of the electronic signal and hear the sound that it makes.

So a ramp wave sounds soft, a sawtooth wave sounds sharp, a square wave sounds like a mix of the two, and a pulse wave sounds kind of like a sharp square wave — you can hear me adjusting the pulse width to make it sound sharper. By increasing or decreasing the frequency, you can play different notes. When you put this all together, you have the foundations of all electronic music.

So let’s see if we can make an oscillator. I’m going back to DevTools again, I’m going to increase the size of the window to make it easy to see, and I’m going to go to my oscillator demo. Cool, right? So, like we got the MIDI access as the gateway to the Web MIDI API, we have a similar concept for the Web Audio API: we’re going to create something called an AudioContext — a new AudioContext — and this is our gateway to the Web Audio API.

Like I said, we join different nodes together, so what we’re going to do is create a series of nodes that we’re going to join together that describe the sound that we want to make. We’re going to create an OscillatorNode: oscillator = new OscillatorNode. The first thing that we do when we create a new node is pass in the audio context, and that coordinates all the nodes that we’re going to make. The second argument that we’re passing in is an options object, and one of the things that we can set is the type of wave that we want to create — and we want a nice, sharp sawtooth wave.

Oscillators, by default, start making a signal as soon as you start them, whether you’ve got a key down on your MIDI controller or not. And of course, we only want to make a sound when we push a key down, and we want it to stop when we lift the key up. So we need to create a gate, and to do that we’re going to create a new type of node called a GainNode. I’m going to type new GainNode, we’re going to pass in the audio context again, and we’re going to set the gain to zero. We use gain nodes to amplify the sound, or to silence it if we want to, and we pass in a gain of zero because we want it to be silenced by default — we don’t want it to make a sound just as it is, only when we push a key down.

Then we’re going to start joining our nodes together. We’re going to type oscillator.connect and connect that to the gain node, then we’re going to connect the gain to context.destination — and the destination is, by default, configured to be our default system audio out — and then we’re going to start the oscillator.

Next, we need to open that gate when we push a key down, so we need to write some code that responds to a key-down event. I have this callback here, and we’re getting MIDI data passed in, which contains the number of the key that I’m pressing, and I want to set the frequency of the oscillator to the right pitch in hertz. So I’m going to set oscillator.frequency.value equal to — I’m going to use a helper function which I created earlier, called midiToFrequency, and that takes the MIDI data value, which is an integer, and turns it into the pitch in hertz that I want to play. Next, I’m going to use gain.gain.linearRampToValueAtTime, and we want to ramp the gate up to one so we’ll be able to hear the noise of the oscillator.
And we want to ramp it up at a time which is the current time plus 0.05 seconds, so I’m going to pop that there. Conversely, when we lift a key up, we want to ramp the value back down to zero. So let’s see if this works, keeping one eye on that.
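
Here is a sketch of the oscillator-and-gate setup just described. The node wiring and the ramp to one over 0.05 seconds follow the talk; the midiToFrequency helper is reconstructed from the standard MIDI-note-to-hertz formula rather than taken from the slides, and the handler names are illustrative.

```js
const context = new AudioContext();
// Note: browsers typically require a user gesture before the context will produce sound.

// A sawtooth oscillator, gated by a gain node that starts silent.
const oscillator = new OscillatorNode(context, { type: 'sawtooth' });
const gain = new GainNode(context, { gain: 0 });

oscillator.connect(gain);
gain.connect(context.destination);
oscillator.start();

// Standard conversion from a MIDI note number to a pitch in hertz
// (the talk uses a pre-written helper; this is the usual formula).
const midiToFrequency = (note) => 440 * Math.pow(2, (note - 69) / 12);

// Called on a MIDI "note on" message — data[1] is the note number.
function onKeyDown(data) {
  oscillator.frequency.value = midiToFrequency(data[1]);
  gain.gain.linearRampToValueAtTime(1, context.currentTime + 0.05);
}

// Called on a MIDI "note off" message — close the gate again.
function onKeyUp() {
  gain.gain.linearRampToValueAtTime(0, context.currentTime + 0.05);
}
```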

We can hear the sound of the oscillator now: if I push a key down, we hear a sound, and if I lift the key up, then it silences again. Fantastic. So that’s oscillators. Now let’s see if we can turn this into something that’s recognizable from the song — let’s see if we can create the bass line. Next, we need to learn about filters. Filters are a new type of node — a new type of component inside a synthesizer — and they’re used to quieten, remove and accentuate certain frequencies of sound. They’re often used in electronic music bass lines. This video shows me patching the output of an oscillator into my filter, and then patching my filter into the output,

and you should be able to hear the sound get softer and then disappear completely.

What it’s doing is filtering out all the frequencies of sound that are above a certain point. When I turn the knob down, it decreases the filter cutoff — it decreases the frequency in hertz above which all the sound is being filtered out.

So let’s see if we can create a bass line. We’re going back to DevTools and we’re going to go to the bass filter demo, and this code is very similar to the code that we’ve just written. We have an oscillator, we have a gain node which creates a gate, and what we’re going to do next is create a filter node: let filter = new BiquadFilterNode. We’re going to pass in the audio context, and we’re going to set the frequency to 144 hertz, and that’s going to filter out all the frequencies from our oscillator above 144 hertz. So if I run this, we should be able to hear a nice, thick, creamy bass line — a much, much softer sound. That is a classic electronic music bass line.
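
A sketch of that bass-line filter chain might look like the following. The 144 Hz cutoff comes from the talk; the low-pass type and the rest of the wiring are a plausible reconstruction of the demo rather than its exact code.

```js
const context = new AudioContext();

const oscillator = new OscillatorNode(context, { type: 'sawtooth' });
const gain = new GainNode(context, { gain: 0 }); // the key-down/key-up gate, as before

// A low-pass filter that removes frequencies above 144 Hz,
// leaving the soft, thick bass sound described in the talk.
const filter = new BiquadFilterNode(context, { type: 'lowpass', frequency: 144 });

// oscillator -> filter -> gate -> speakers
oscillator.connect(filter);
filter.connect(gain);
gain.connect(context.destination);
oscillator.start();
```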

Next, let’s see if we can make the lead, which plays the melody in the song. We’re going to learn about something called a low-frequency oscillator. Low-frequency oscillators, or LFOs, are not designed to make noise themselves like other oscillators; they’re designed to automate changes to the parameters of other sounds. This video shows me patching an LFO into the pitch of an oscillator for an accentuated tremolo bass-drop effect. What it’s going to do is modulate the pitch: the pitch is going to go up and down, slowly at first, and then it’s going to get faster as I turn the frequency of the LFO up. What it’s doing is modulating — automating — changes to the parameter of another thing.

So let’s see if we can make this in Web Audio. I’m going to go back to Chrome DevTools, go back to Sources, and go into our demo for the lead — let’s see, down here. Okay, so we’re going to create a low-frequency oscillator. One of the brilliant things about the Web Audio API is that a low-frequency oscillator is just an OscillatorNode like the other ones that we have created, and brilliantly, we can just reuse it and set a low frequency. It is literally a normal oscillator with a very low frequency, and we’re going to set it to five — we want it to oscillate with a frequency of five hertz to begin with. Then what we’re going to do is plug it into a gain node. Oscillators, by default, oscillate between values of zero and one, and we want this oscillator to modulate the cutoff frequency of a filter — and if we modulate the cutoff frequency of a filter by just one hertz, you’re not going to be able to hear the difference. What we want is to modulate it by 5000 hertz. So we’re going to create a new gain node called lfoGain, pass in the context, and pass in a gain of 5000. So instead of oscillating between zero and one, this oscillator, when connected to this gain node, will oscillate between zero and 5000,

so we’re going to connect the LFO to the LFO gain, then connect the LFO gain to the filter’s frequency property, and then start the LFO oscillator. So let’s see if this works. Oh — okay, connect. I have made my first mistake, there’s a typo. Okay, hopefully it’ll work.
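
Here is a sketch of that LFO wiring: a normal OscillatorNode running at 5 Hz, scaled by a gain of 5000 and connected to the filter’s frequency parameter so the cutoff sweeps up and down. It assumes the context and filter from the bass-line sketch above; the variable names are illustrative.

```js
// Assumes `context` and `filter` (a BiquadFilterNode) already exist,
// as in the bass-line example above.
const lfo = new OscillatorNode(context, { frequency: 5 }); // five sweeps per second
const lfoGain = new GainNode(context, { gain: 5000 });     // scale the modulation depth to ~5000 Hz

// lfo -> lfoGain -> filter cutoff. Connecting a node to an AudioParam adds the
// signal to the parameter's value, so the cutoff moves up and down around its base frequency.
lfo.connect(lfoGain);
lfoGain.connect(filter.frequency);
lfo.start();
```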

You should be able to hear that filter opening and closing about five times a second. And if I change my code — if I reduce it to just one hertz — you should hear that changing much, much slower, or I can make it faster. What that’s essentially doing is just modulating the parameter of something else, and that’s what LFOs are for. So that’s the lead.

Okay — samples. So, Daft Punk didn’t actually use samples in “Around the World”. I think a lot of early YouTube videos that analysed it thought they did, particularly sampling a track called “Good Times” by Chic, but they didn’t: they used a synthesizer and they used a drum machine. The drum machine which they used was itself based on samples of real-life instruments. The Roland TR-909 was made famous by acid house, and it was particularly prized for the sound of its cymbals, which were sampled from real-life hi-hat and crash cymbals.

So let’s see if we can get some samples into the Web Audio API. There we go. The first thing that we’re going to do is load a file, like we do lots of other files: we’re going to await fetch, and we want to fetch this file here. Then we want to take the response and turn it into an array buffer, and that will give us an array-like object — an array of numbers which describes the sound waves of that sample. That’s brilliant if you want to mess around with it in code, because it’s just a series of numbers that you can transform if you know what you’re doing. We are just going to leave it as is, and we’re going to take the buffer and use the audio context to decode audio data from it: context.decodeAudioData, from the buffer. There we go.

Next, we want this function to play our sample whenever I push a key down on the keyboard, so I’m going to write some code to do that. We need to create an AudioBufferSourceNode — source = new AudioBufferSourceNode — and pass in the context, and we’re going to pass in the audio buffer as well. Then we need to connect the source node to the destination, and then we’re going to start the source node: source.start. One of the pitfalls which I fell into when I was first starting to write this talk and find out about the Web Audio API was that I assumed that you could reuse AudioBufferSourceNodes — that you’d somehow be able to wind them back and play your sample from the beginning — but as far as I know, you can’t do that. So on every key press, we are creating a new AudioBufferSourceNode, using it, and then discarding it as soon as this function is finished. I don’t think there are any memory leak problems associated with that, but as far as I know, that is how you need to use them. (There’s a sketch of this sample playback just after this section.)

So let’s see if we can run this code and get the sound of a Roland TR-909 clap. Let’s see. There we go — that is our sample, that’s the sound of a Roland TR-909. Fantastic.

Right, let’s see if we can put everything together and see what it sounds like in a complete song. A quick reminder: while I’m using this MIDI controller to cue the sounds in the browser, it is the browser that is making all the noise that you’re going to hear. So let’s go back to DevTools, go to Sources, and go down here, and let’s hope this works. I’m going to open the console so you’ll be able to see all the MIDI data coming in. Here we go.
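
Here is the sample-loading and playback code sketched out, as mentioned above. The fetch/arrayBuffer/decodeAudioData flow and the one-shot AudioBufferSourceNode per key press follow the approach described in the talk; the file path is a placeholder.

```js
const context = new AudioContext();

// Fetch the sample and decode it into an AudioBuffer.
// 'tr909-clap.wav' is a placeholder path for illustration.
const response = await fetch('tr909-clap.wav');
const arrayBuffer = await response.arrayBuffer();
const audioBuffer = await context.decodeAudioData(arrayBuffer);

// AudioBufferSourceNodes are one-shot: create a fresh one for every key press,
// play it, and let it be garbage collected afterwards.
function playSample() {
  const source = new AudioBufferSourceNode(context, { buffer: audioBuffer });
  source.connect(context.destination);
  source.start();
}
```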

That’s it — that’s the Web Audio API and the Web MIDI API. So how do we sum up? The Web Audio API requires a bit of domain knowledge to get started — it helps if you know how a real-life hardware synthesizer works — but it’s not too hard to get going. Easy things are easy-ish and hard things are possible; the problem for beginners is that it’s often hard to know which is which. What are some of the drawbacks? It’s not a digital audio workstation — it’s not going to offer you the same kind of power that you get out of Ableton Live or Logic Pro — but it is excellent for education and experimentation, and it’s well supported in Chrome, Edge, Safari and Firefox.

The Web MIDI API is well supported in musical hardware: almost every single piece of professional audio equipment, any kind of sequencer, any kind of synthesizer, whether it’s targeted towards beginners or towards really high-end professionals, will have MIDI ports, either traditional MIDI or MIDI over USB. The only drawback is that it’s not supported in Safari — it’s supported in Firefox, Chrome and Edge, but not in Safari. But it is excellent for hobbyist development where it is found.

So what does the web do best? Should we be thinking of integrating the Web Audio API into our websites? Well, no, I’m not going to tell you to do that — I think that’s probably a terrible idea. And is it a replacement for serious production tools? Also no: it’s not going to offer you the same kind of power and control and the sheer wealth of audio tools that something like Ableton or Logic will give you. But that doesn’t mean that the Web Audio API and the Web MIDI API are worthless. There’s a huge community of hobbyists and experimental musicians out there — these people are called circuit benders, and circuit benders make electronic music out of scrap toys and broken synthesizers. If they can make music out of rubbish, out of trash, then you can bet that someone will make something cool with a web browser.

What makes MIDI so great is that it’s used by practically everyone — amateurs and professionals alike. I knew absolutely nothing about electronic music before I started writing this talk, and that really goes to show what an amazing educational tool the Web Audio and Web MIDI APIs can be. The web should be the software platform of the amateur, of the hobbyist, of those doing something that nobody has ever conceived of before. The web’s compatibility, its accessibility, its approachability make it the ideal creative platform. And this is a call for Apple to find a way to support the Web MIDI API on iPads and iPhones as well — the web and MIDI belong together.

This is Dave Smith. Dave Smith was an engineer and the founder of a company called Sequential Circuits, and he’s also the creator of the Prophet-5, the famous polyphonic synthesizer. Together with the president of Roland, Ikutaro Kakehashi, he created MIDI. To this day, MIDI remains completely unlicensed — no fees have ever been collected for the creation of MIDI. This has led to its widespread adoption and near-universal compatibility. Whole industries and art forms exist because of MIDI, and it remains easy to use for beginners and experts alike. And if that sounds familiar to you, then it should — if you’re thinking about the web, then you’d be right. Since its creation, Tim Berners-Lee has not collected any fees for creating the World Wide Web.
Free and Open Standards create wealth for us all. The web has changed our lives for the better. It’s critical that the web remains free and easy to use and endures as the software platform for the common good. MIDI and the web are naturally allied to each other and deserve to endure together. Thank you very much.

Brian Rinaldi 35:31 Thanks so much, Katie, that was fabulous. First of all, I’ve loved doing MIDI stuff since back in high school, when I got into using all kinds of synthesizers. And it’s amazing how all these decades have passed and, like, literally it all just works the same, right? It hasn’t really changed, which goes to show the power of having a standard like that — one that works kind of indefinitely. And you had one of the coolest demos of any talk; I love that song, so really great stuff.

One of the things you brought up that I had a question about — because I’ve talked about this before when I’ve brought up the Web Audio API and things like that — is that most of the time, outside of videos, the web is kind of quiet, right? You go to a website and you don’t tend to have audio. But I was making the case years ago that in a lot of other types of software you get audio feedback, like in games and other media where clicks, you know, give you audio feedback — and then you go to the web and suddenly you get none of that. And oftentimes that audio feedback is helpful in recognizing, oh hey, this was that action, because I heard a different kind of click than I hear from different actions. So do you think this is something people should explore and use the Web Audio API for? Or, I mean, what’s your feeling on this silence we have across much of our web apps and interactions in general?

Katie Fenn 37:16 No, I don’t think you should start baking sound and noise into your web apps, and that’s for a very good reason. There are a lot of people out there who rely on screen readers to read out websites, and if you start putting your own sound into your website, you’re going to disrupt their experience. This is also a very important principle: one of the reasons why you don’t have auto-playing videos on websites is that if you have auto-playing video, then someone who’s using a screen reader can’t find that video and silence it. So I think perhaps in some very limited circumstances, where you’re trying to recreate a specific interface for people who are used to sound in those interfaces, possibly — but in general, no, I wouldn’t recommend it, for accessibility reasons.

Brian Rinaldi 38:15 Okay, all right, that makes sense. I mean, I think some sites use it mostly for small things like notification sounds and things like that. And actually — did you use any of the libraries at all, or do you just prefer to use straight Web Audio? Because when I’ve done it before, I’ve noticed that for some things, like getting samples, there are libraries that really make it easy to just drop things in. Do you have any that you’d recommend?

Katie Fenn 38:47 So the prototype for this talk was written using a library called Tone.js, and I found that to be very helpful. For the demo, I wanted to explore the native Web Audio API as it is in browsers, but Tone.js is very similar: it has very similar nodes that you can connect together and it works in a very similar way, but it gives you a wealth of other types of nodes on top. For instance, I think the native Web Audio API only gives you a triangle, a sawtooth, a sine wave and a square wave by default. It doesn’t give you pulse waves, which are a really important part of electronic music, but that’s something that Tone.js gives you. For a beginner, I would recommend actually getting started with Tone.js before trying to experiment with the native Web Audio API. I don’t think you’re losing anything, I don’t think you’re getting any less control — it just gives you more toys out of the toy box to play with, and I don’t think that’s a bad thing. So, yeah, definitely go and check out Tone.js.
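
For anyone who wants to try the library route mentioned here, a minimal Tone.js sketch might look something like this. The exact synth settings are illustrative, but Tone.start, Tone.Synth, toDestination and triggerAttackRelease are part of the library’s documented API.

```js
import * as Tone from 'tone'; // or load Tone.js via a script tag in the page

// Audio can only start after a user gesture, so wire it to a click.
document.addEventListener('click', async () => {
  await Tone.start();

  // A simple synth routed straight to the speakers.
  const synth = new Tone.Synth({ oscillator: { type: 'sawtooth' } }).toDestination();

  // Play a low C for an eighth note.
  synth.triggerAttackRelease('C2', '8n');
});
```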

Brian Rinaldi 40:02 Awesome. Yeah, okay, I don’t think I’ve tried that one — I’m curious, I’ll give it a shot. Well, this was great. I love the talk, I love the choice of music and everything else. So this was really awesome. Thank you, Katie.

Katie Fenn 40:18 You’re welcome. Thanks for inviting me.
