Wav files are simple enough when you're manipulating them in software. With modern DAWs, we can do amazing things to wav files without having to understand exactly what's under the hood. But what if you want to write the next great DAW? What if you just want to experiment with audio data? Then you need to understand how to read and write wav files.
In this article, I'm going to explain the WAV file specification, and give you the source code needed to get started using WAV files in your own software. I made two source files. One is in C++, and the other is in VB.net, so almost everyone should be happy.
WAV File Specification
Like most files, WAV files have two basic parts, the header and the data. The data is just one giant chunk of bytes that represents your audio. Your program has to read the header so that it can understand how to interpret the data.
Before we get ahead of ourselves, it's important to note that all files are made up of just ones and zeros. The data is meaningless until your software gives it meaning. In modern programming languages, bits are grouped into sections called bytes, which are just 8 consecutive bits. In other words, when I refer to a byte, I am just referring to a chunk of 8 ones and zeros that your program will read as a group, and then turn into useful information (integers, characters ... etc).
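To make that concrete, here is a small sketch of the same four bytes being given two different meanings. The function names `bytesAsText` and `bytesAsUint32` are made up for this illustration and are not part of the downloadable source.

```cpp
#include <cstdint>
#include <string>

// Interpret a run of bytes as characters.
inline std::string bytesAsText(const unsigned char* bytes, int count) {
    return std::string(reinterpret_cast<const char*>(bytes), count);
}

// Interpret four bytes as one unsigned 32-bit integer, lowest byte first.
inline std::uint32_t bytesAsUint32(const unsigned char* bytes) {
    return static_cast<std::uint32_t>(bytes[0]) |
           (static_cast<std::uint32_t>(bytes[1]) << 8) |
           (static_cast<std::uint32_t>(bytes[2]) << 16) |
           (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

Given the bytes 0x52 0x49 0x46 0x46, the first function produces the text "RIFF" and the second produces a large number. Neither is more correct than the other; your program decides which reading makes sense.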
In normal wav files (or at least the basic ones that we will address here), the header is the first 44 bytes of the file. Everything that you need to know about the file is contained within those first 44 bytes. Here's how they break down (there is a better graphic for this here):
| Position (in bytes) | Field Name | Field Size (in bytes) | Description |
| 0 | Chunk ID | 4 | This should just contain "RIFF" |
| 4 | File Size | 4 | The size of the rest of the file after this field. Entire File Size - 8 |
| 8 | File Format | 4 | This should just contain "WAVE" |
| 12 | SubChunk1 ID | 4 | This should just contain "fmt " (note the trailing space) |
| 16 | SubChunk1 Size | 4 | For PCM files, this will be 16. If this is something else, you may need to look at a more complete specification than this one. |
| 20 | Audio Format | 2 | Should be 1 for uncompressed audio. If this is something other than 1, you may have a compressed file. |
| 22 | Number of Channels | 2 | 1 for mono, 2 for stereo. Higher values mean multichannel audio. |
| 24 | Sample Rate | 4 | CD quality would be 44100. |
| 28 | Byte Rate | 4 | SampleRate * NumChannels * BitsPerSample / 8 |
| 32 | Block Alignment | 2 | Channels * BitsPerSample / 8 |
| 34 | Bits Per Sample | 2 | 8, 16, 24 ... |
| 36 | SubChunk2 ID | 4 | This should just contain "data" |
| 40 | Data Size | 4 | The number of bytes following the header. The size of the data. |
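One way to picture the table is as a plain struct whose fields line up with those 44 bytes. This is only a sketch: the struct and function names (`WavHeader`, `computeByteRate`, `computeBlockAlign`) are my own for this example and are not the ones used in the downloadable source. It assumes the basic 44-byte PCM layout described above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct mirroring the header table, field by field.
struct WavHeader {
    char          chunkId[4];      // "RIFF"
    std::uint32_t fileSize;        // entire file size - 8
    char          format[4];       // "WAVE"
    char          subChunk1Id[4];  // "fmt "
    std::uint32_t subChunk1Size;   // 16 for PCM
    std::uint16_t audioFormat;     // 1 for uncompressed audio
    std::uint16_t numChannels;
    std::uint32_t sampleRate;
    std::uint32_t byteRate;
    std::uint16_t blockAlign;
    std::uint16_t bitsPerSample;
    char          subChunk2Id[4];  // "data"
    std::uint32_t dataSize;        // number of bytes of audio data
};

// Compile-time checks that the layout matches the table.
static_assert(sizeof(WavHeader) == 44, "header must be 44 bytes");
static_assert(offsetof(WavHeader, audioFormat) == 20, "Audio Format at byte 20");
static_assert(offsetof(WavHeader, dataSize) == 40, "Data Size at byte 40");

// Byte Rate = SampleRate * NumChannels * BitsPerSample / 8
inline std::uint32_t computeByteRate(std::uint32_t sampleRate,
                                     std::uint16_t numChannels,
                                     std::uint16_t bitsPerSample) {
    return sampleRate * numChannels * bitsPerSample / 8;
}

// Block Alignment = NumChannels * BitsPerSample / 8
inline std::uint16_t computeBlockAlign(std::uint16_t numChannels,
                                       std::uint16_t bitsPerSample) {
    return static_cast<std::uint16_t>(numChannels * bitsPerSample / 8);
}
```

For CD-quality stereo (44100 Hz, 2 channels, 16 bits), the byte rate works out to 176400 and the block alignment to 4.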
So how do we get that information into your software?
Code
Reading and writing files in C++, Java and VB is very similar. C++ can get a little tricky because you have to use pointers, but for the most part, you will take advantage of similar classes in each language. Basically, you want to open a file stream, and stream all of the data out of the file into a useful container.
You can think of file streams like you think of internet streams. You open a stream, and then you can get data from it. The only real difference is that in code, you can also write data to a stream.
In both C++ and VB, opening a stream is easy.
In C++: ifstream inFile( myPath, ios::in | ios::binary);
In VB: Dim wavFileStream As System.IO.FileStream = New System.IO.FileInfo(myPath).OpenRead()
Then, you need to read each variable from the header, sequentially, and convert the data into a usable format.
In C++, simply define the variable with the right type, and call the read function: inFile.read( (char*) &myFormat, sizeof(short) );
In VB, you will use the BitConverter class on a byte array that holds the header bytes (called header here): myFormat = System.BitConverter.ToInt16(header, 20)
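Putting the C++ pieces together, reading the format fields might look roughly like the sketch below. The names `WavFormat`, `readField` and `readWavFormat` are invented for this example and are not the ones in the WaveFileForIO class. It assumes the basic 44-byte header and a little-endian machine (which covers typical desktop PCs), and it skips most error checking.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical container for the fields this sketch reads.
struct WavFormat {
    std::uint16_t audioFormat = 0;
    std::uint16_t numChannels = 0;
    std::uint32_t sampleRate = 0;
    std::uint32_t byteRate = 0;
    std::uint16_t blockAlign = 0;
    std::uint16_t bitsPerSample = 0;
};

// Reads sizeof(T) raw bytes straight into value,
// the same idea as inFile.read((char*) &x, sizeof(x)).
template <typename T>
void readField(std::ifstream& in, T& value) {
    in.read(reinterpret_cast<char*>(&value), sizeof(T));
}

// Returns false if the file cannot be opened or is too short.
bool readWavFormat(const std::string& path, WavFormat& fmt) {
    std::ifstream inFile(path, std::ios::in | std::ios::binary);
    if (!inFile) return false;

    inFile.seekg(20);  // Audio Format starts at byte 20
    readField(inFile, fmt.audioFormat);
    readField(inFile, fmt.numChannels);
    readField(inFile, fmt.sampleRate);
    readField(inFile, fmt.byteRate);
    readField(inFile, fmt.blockAlign);
    readField(inFile, fmt.bitsPerSample);
    return static_cast<bool>(inFile);  // false if any read failed
}
```

The fields are read in the same order they appear in the table, so each read picks up exactly where the previous one stopped.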
ASIDE: Are you starting to see how similar these implementations are? Java is similar as well. It is kind of an amalgamation of C and VB, taking stylistic elements from both. If one of my readers wants to port this code over to Java, I would love to include it here as well.
Writing the file back out is very similar. In both cases, you simply open the stream in output mode, and call the appropriate write function.
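As a rough illustration of the output side, here is a sketch that writes 16-bit samples with a basic 44-byte header. Again, `writeField` and `writeWav16` are names I made up for this example, not functions from the downloadable class, and the code assumes a little-endian machine.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Writes the raw bytes of value to the stream.
template <typename T>
void writeField(std::ofstream& out, const T& value) {
    out.write(reinterpret_cast<const char*>(&value), sizeof(T));
}

// Writes interleaved 16-bit PCM samples behind a 44-byte header.
// Returns false if the file could not be opened or written.
bool writeWav16(const std::string& path,
                const std::vector<std::int16_t>& samples,
                std::uint16_t numChannels,
                std::uint32_t sampleRate) {
    std::ofstream outFile(path, std::ios::out | std::ios::binary);
    if (!outFile) return false;

    const std::uint16_t bitsPerSample = 16;
    const std::uint16_t blockAlign =
        static_cast<std::uint16_t>(numChannels * bitsPerSample / 8);
    const std::uint32_t byteRate = sampleRate * blockAlign;
    const std::uint32_t dataSize =
        static_cast<std::uint32_t>(samples.size() * sizeof(std::int16_t));

    outFile.write("RIFF", 4);
    writeField(outFile, static_cast<std::uint32_t>(36 + dataSize));  // file size - 8
    outFile.write("WAVE", 4);
    outFile.write("fmt ", 4);
    writeField(outFile, std::uint32_t{16});  // SubChunk1 size for PCM
    writeField(outFile, std::uint16_t{1});   // 1 = uncompressed audio
    writeField(outFile, numChannels);
    writeField(outFile, sampleRate);
    writeField(outFile, byteRate);
    writeField(outFile, blockAlign);
    writeField(outFile, bitsPerSample);
    outFile.write("data", 4);
    writeField(outFile, dataSize);
    outFile.write(reinterpret_cast<const char*>(samples.data()), dataSize);
    return static_cast<bool>(outFile);
}
```

Notice that the header is written in exactly the order given in the table, followed by the audio data itself.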
At this point, you should start perusing the source code for whichever language you prefer. In both languages, I created a class called WaveFileForIO. This class is an input/output shell for WAV files. To use it, simply instantiate the class with a filename, then you can call getSummary() to get a printout of the header, or you can peruse the data directly.
Here is the code for reading and writing wav files in C++, with an example.
Here is the code for reading and writing wav files in VB.NET, with an example project.
Here is just the class for reading and writing wav files in VB.NET (sample usage in comments at the top).
If you have any questions, post them in the comments. I am not an expert, but I will try to answer them. Also, check this Stanford page that was probably written by someone smarter than me. It has a lot of good info.
UPDATE:
I may have failed to mention this previously, so I want to note that the example code will not successfully read EVERY wav file. It is a basic working example that will read a lot of wav files. If you want fool-proof WAV IO, you will need to examine the format in depth. I left out a lot of the checks that verify the data is formatted properly. I also left out everything about handling endianness (don't worry if you don't know what this is). If you are building a commercial application, you will need to handle these issues.
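To give a flavor of what those missing pieces might look like, here is a minimal sketch. WAV files store their numbers little-endian (lowest byte first), so building each value byte by byte gives the same answer on any machine. The names `readLE16`, `readLE32` and `looksLikePcmWav` are hypothetical, and the checks only cover the basic 44-byte PCM header from the table, not the many other valid variations.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assemble a 16-bit value from two bytes, lowest byte first.
inline std::uint16_t readLE16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Assemble a 32-bit value from four bytes, lowest byte first.
inline std::uint32_t readLE32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Very basic sanity checks on the first 44 bytes of a file.
bool looksLikePcmWav(const unsigned char* header, std::size_t length) {
    if (length < 44) return false;
    if (std::memcmp(header, "RIFF", 4) != 0) return false;
    if (std::memcmp(header + 8, "WAVE", 4) != 0) return false;
    if (std::memcmp(header + 12, "fmt ", 4) != 0) return false;
    if (readLE32(header + 16) != 16) return false;  // SubChunk1 size for PCM
    if (readLE16(header + 20) != 1) return false;   // uncompressed audio
    if (std::memcmp(header + 36, "data", 4) != 0) return false;
    return true;
}
```

A file that fails these checks is not necessarily broken. It may just use extra chunks or a compressed format that this simple layout does not cover.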
Thanks to the guys in the KVR Audio Forums for reminding me to make this more explicit.
If you want to explore the intricacies of all of the possible variations on wav files, you could start with this explanation of all of the possible chunks.