WAV File Format

http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html

Audio File Format Specifications

File Description: WAVE or RIFF WAVE sound file
File Extension: Commonly .wav, sometimes .wave
File Byte Order: Little-endian

P. Kabal, TSP Lab, ECE, McGill University. Last update: 2006-06-19

WAVE Specifications

The WAVE file specifications came from Microsoft. The WAVE file format uses RIFF chunks, each chunk consisting of a chunk identifier, a chunk length and the chunk data.

  • WAVE specifications, Version 1.0, 1991-08: riffmci.rtf
       Local copy: Multimedia Programming Interface and Data Specifications 1.0 (see pages 56-65)
  • WAVE update (Revision: 3.0), 1994-04-15: Multimedia Registration Kit Revision 3.0 (Q120253)
       Local copy: New Multimedia Data Types and Data Techniques (see pages 12-22)
  • Multi-channel / high bit resolution formats, 2001-12-04: Multiple Channel Audio Data and WAVE Files
       Local copy: Multiple Channel Audio Data and WAVE Files

Data Types

The data in WAVE files can be of many different types. Data format codes are listed in the following:

  • Internet RFC, Codec registrations, 1998-06: ftp://ftp.isi.edu/in-notes/rfc2361.txt
       Local copy: rfc2361.txt
  • Microsoft include files (part of the MSVC compiler or the DirectX SDK, from the Microsoft Download Center)
       Local copy: MMREG.H (Version 1.46)
       Local copy: ksmedia.h

Wave File Format

Wave files have a master RIFF chunk which includes a WAVE identifier followed by sub-chunks. The data is stored in little-endian byte order.

Field        Length  Contents
ckID         4       Chunk ID: "RIFF"
cksize       4       Chunk size: 4 + n
WAVEID       4       WAVE ID: "WAVE"
WAVE chunks  n       Wave chunks containing format information and sampled data
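
The layout above can be walked with a short reader. The following is a minimal sketch, not a full parser: the function name iter_chunks is hypothetical, and it assumes the stream is positioned at the start of the RIFF header and that every odd-sized chunk is followed by one padding byte.

```python
import io
import struct

def iter_chunks(stream):
    """Yield (chunk_id, payload) for each sub-chunk of a RIFF/WAVE stream."""
    riff_id, riff_size, wave_id = struct.unpack("<4sI4s", stream.read(12))
    if riff_id != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    remaining = riff_size - 4  # cksize = 4 + n; the 4 is the WAVEID field
    while remaining >= 8:
        header = stream.read(8)
        if len(header) < 8:
            break  # truncated file: stop rather than guess
        ck_id, ck_size = struct.unpack("<4sI", header)
        payload = stream.read(ck_size)
        pad = ck_size & 1
        stream.read(pad)  # skip the padding byte after an odd-sized chunk
        remaining -= 8 + ck_size + pad
        yield ck_id, payload

# Usage: a tiny in-memory stream holding a single 2-byte "data" chunk.
demo = (b"RIFF" + struct.pack("<I", 4 + 8 + 2) + b"WAVE"
        + b"data" + struct.pack("<I", 2) + b"\x01\x02")
for ck_id, payload in iter_chunks(io.BytesIO(demo)):
    print(ck_id, len(payload))
```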

Format Chunk

The Format chunk specifies the format of the data. There are 3 variants of the Format chunk for sampled data. These differ in the extensions to the basic Format chunk.

Field                Length  Contents
ckID                 4       Chunk ID: "fmt " (with a trailing space)
cksize               4       Chunk size: 16, 18 or 40
wFormatTag           2       Format code
nChannels            2       Number of interleaved channels
nSamplesPerSec       4       Sampling rate (blocks per second)
nAvgBytesPerSec      4       Data rate
nBlockAlign          2       Data block size (bytes)
wBitsPerSample       2       Bits per sample
cbSize               2       Size of the extension (0 or 22)
wValidBitsPerSample  2       Number of valid bits
dwChannelMask        4       Speaker position mask
SubFormat            16      GUID, including the data format code

The standard format codes for waveform data are given below. The references above give many more format codes for compressed data, a good fraction of which are now obsolete.

Format Code  PreProcessor Symbol     Data
0x0001       WAVE_FORMAT_PCM         PCM
0x0003       WAVE_FORMAT_IEEE_FLOAT  IEEE float
0x0006       WAVE_FORMAT_ALAW        8-bit ITU-T G.711 A-law
0x0007       WAVE_FORMAT_MULAW       8-bit ITU-T G.711 µ-law
0xFFFE       WAVE_FORMAT_EXTENSIBLE  Determined by SubFormat
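
As an illustration of the three variants (16, 18 or 40 bytes), the Format chunk payload could be decoded roughly as follows. This is a sketch under stated assumptions: parse_fmt and FORMAT_NAMES are hypothetical names, the input is the chunk payload without the 8-byte chunk header, and only the five format codes from the table are named.

```python
import struct

FORMAT_NAMES = {
    0x0001: "WAVE_FORMAT_PCM",
    0x0003: "WAVE_FORMAT_IEEE_FLOAT",
    0x0006: "WAVE_FORMAT_ALAW",
    0x0007: "WAVE_FORMAT_MULAW",
    0xFFFE: "WAVE_FORMAT_EXTENSIBLE",
}

def parse_fmt(payload):
    """Decode a Format chunk payload of 16, 18 or 40 bytes into a dict."""
    if len(payload) not in (16, 18, 40):
        raise ValueError("unexpected Format chunk size: %d" % len(payload))
    keys = ("wFormatTag", "nChannels", "nSamplesPerSec",
            "nAvgBytesPerSec", "nBlockAlign", "wBitsPerSample")
    fmt = dict(zip(keys, struct.unpack("<HHIIHH", payload[:16])))
    if len(payload) >= 18:
        # Extended variant: cbSize gives the size of what follows (0 or 22).
        (fmt["cbSize"],) = struct.unpack("<H", payload[16:18])
    if len(payload) == 40:
        # Extensible variant: valid bits, speaker mask, then the 16-byte GUID.
        valid_bits, mask = struct.unpack("<HI", payload[18:24])
        fmt["wValidBitsPerSample"] = valid_bits
        fmt["dwChannelMask"] = mask
        fmt["SubFormat"] = payload[24:40]
    fmt["name"] = FORMAT_NAMES.get(fmt["wFormatTag"], "unknown")
    return fmt

# Usage: a basic 16-byte chunk for 16-bit stereo PCM at 44100 blocks per second.
print(parse_fmt(struct.pack("<HHIIHH", 1, 2, 44100, 176400, 4, 16)))
```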

PCM Format

The first part of the Format chunk is used to describe PCM data.

  • For PCM data, the Format chunk in the header declares the number of bits/sample in each sample (wBitsPerSample). The original documentation (Revision 1) specified that the number of bits per sample is to be rounded up to the next multiple of 8 bits. This rounded-up value is the container size. This information is redundant in that the container size (in bytes) for each sample can also be determined from the block size divided by the number of channels (nBlockAlign/nChannels).
    • This redundancy has been appropriated to define new formats. For instance, Cool Edit uses a format which declares a sample size of 24 bits together with a container size of 4 bytes (32 bits) determined from the block size and number of channels. With this combination, the data is actually stored as 32-bit IEEE floats. The normalization (full scale 2^23) is, however, different from the standard float format.
  • PCM data is two's-complement except for resolutions of 1-8 bits, which are represented as offset binary.
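
The container size rule and the two sample representations can be sketched as below. The function names are hypothetical, and the sketch assumes each sample fills its container (it does not handle reduced precision within the container).

```python
def container_bytes(n_block_align, n_channels):
    """Container size per sample, in bytes: nBlockAlign / nChannels."""
    return n_block_align // n_channels

def decode_pcm_sample(raw):
    """Decode one little-endian PCM sample from a container of len(raw) bytes."""
    if len(raw) == 1:
        # Up to 8 bits: offset binary, so 0x80 represents zero.
        return raw[0] - 128
    # More than 8 bits: two's complement.
    return int.from_bytes(raw, "little", signed=True)

# Usage: 24-bit stereo has a 6-byte block, hence a 3-byte container.
print(container_bytes(6, 2), decode_pcm_sample(b"\x80"), decode_pcm_sample(b"\xff\xff"))
```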

Non-PCM Formats

An extended Format chunk is used for non-PCM data. The cbSize field gives the size of the extension.

  • For all formats other than PCM, the Format chunk must have an extended portion. The extension can be of zero length, but the size field (with value 0) must be present.
  • For float data, full scale is 1. The bits/sample would normally be 32 or 64.
  • For the log-PCM formats (µ-law and A-law), the Rev. 3 documentation indicates that the bits/sample field (wBitsPerSample) should be set to 8 bits.
  • The non-PCM formats must have a Fact chunk.

Extensible Format

The WAVE_FORMAT_EXTENSIBLE format code indicates that there is an extension to the Format chunk. The extension has one field which declares the number of "valid" bits/sample (wValidBitsPerSample). Another field (dwChannelMask) contains bits which indicate the mapping from channels to loudspeaker positions. The last field (SubFormat) is a 16-byte globally unique identifier (GUID).

  • With the WAVE_FORMAT_EXTENSIBLE format, the original bits/sample field (wBitsPerSample) must match the container size (8 * nBlockAlign / nChannels). This means that wBitsPerSample must be a multiple of 8. Reduced precision within the container size is now specified by wValidBitsPerSample.
  • The number of valid bits (wValidBitsPerSample) is informational only. The data is correctly represented in the precision of the container size. The number of valid bits can be any value from 1 to the container size in bits.
  • The loudspeaker position mask uses 18 bits, each bit corresponding to a speaker position (e.g. Front Left or Top Back Right), to indicate the channel to speaker mapping. More details are in the document cited above. This field is informational. An all-zero field indicates that channels are mapped to outputs in order: first channel to first output, second channel to second output, etc.
  • The first two bytes of the GUID form the sub-code specifying the data format code, e.g. WAVE_FORMAT_PCM. The remaining 14 bytes contain a fixed string, "\x00\x00\x00\x00\x10\x00\x80\x00\x00\xAA\x00\x38\x9B\x71".
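
The SubFormat layout described in the last item can be sketched as follows. The helper names are hypothetical. The 14-byte tail is the fixed string quoted above, and the format code is stored little-endian like the other fields.

```python
import struct

# Fixed 14-byte tail of the SubFormat GUID, as quoted in the text.
GUID_TAIL = bytes.fromhex("000000001000800000AA00389B71")

def make_subformat(format_code):
    """Build the 16-byte SubFormat field for a data format code."""
    return struct.pack("<H", format_code) + GUID_TAIL

def subformat_code(guid):
    """Extract the data format code from a 16-byte SubFormat field."""
    if len(guid) != 16 or guid[2:] != GUID_TAIL:
        raise ValueError("not a SubFormat GUID with the standard tail")
    return struct.unpack("<H", guid[:2])[0]

# Usage: round-trip the PCM format code (0x0001).
print(make_subformat(0x0001).hex(), subformat_code(make_subformat(0x0001)))
```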

The WAVE_FORMAT_EXTENSIBLE format should be used whenever:

  • PCM data has more than 16 bits/sample.
  • The number of channels is more than 2.
  • The actual number of bits/sample is not equal to the container size.
  • The mapping from channels to speakers needs to be specified.
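
The four conditions above can be expressed as a single predicate. This is only a sketch of the rule as listed. The function and parameter names are hypothetical.

```python
def needs_extensible(n_channels, valid_bits, container_bits,
                     is_pcm=True, needs_speaker_map=False):
    """Return True if any of the listed conditions calls for WAVE_FORMAT_EXTENSIBLE."""
    return (
        (is_pcm and valid_bits > 16)       # PCM data with more than 16 bits/sample
        or n_channels > 2                  # more than 2 channels
        or valid_bits != container_bits    # actual bits/sample differs from container size
        or needs_speaker_map               # channel-to-speaker mapping must be specified
    )

# Usage: 16-bit stereo PCM does not need it; 24-bit stereo PCM does.
print(needs_extensible(2, 16, 16), needs_extensible(2, 24, 24))
```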

Fact Chunk

All (compressed) non-PCM formats must have a Fact chunk (Rev. 3 documentation). The chunk contains at least one value, the number of samples in the file.

Field           Length  Contents
ckID            4       Chunk ID: "fact"
cksize          4       Chunk size: minimum 4
dwSampleLength  4       Number of samples (per channel)
  • The Rev. 3 documentation states that the Fact chunk "is required for all new WAVE formats", but "is not required for the standard WAVE_FORMAT_PCM files". One presumes that files with IEEE float data (introduced after the Rev. 3 documentation) need a Fact chunk.
  • The number of samples field is redundant for sampled data, since the Data chunk indicates the length of the data. The number of samples can be determined from the length of the data and the container size as determined from the Format chunk.
  • There is an ambiguity as to the meaning of "number of samples" for multichannel data. The implication in the Rev. 3 documentation is that it should be interpreted to be "number of samples per channel". The statement in the Rev. 3 documentation is:
    "The <nSamplesPerSec> field from the wave format header is used in conjunction with the <dwSampleLength> field to determine the length of the data in seconds."
    With no mention of the number of channels in this computation, this implies that dwSampleLength is the number of samples per channel.
  • There is a question as to whether the Fact chunk should be used for WAVE_FORMAT_EXTENSIBLE files (including those with PCM data). One example of a WAVE_FORMAT_EXTENSIBLE file with PCM data, from Microsoft, does not have a Fact chunk.
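
The two ways of obtaining the length, from the Fact chunk and from the Data chunk, can be compared with a small calculation. The function names are hypothetical, and the sketch takes dwSampleLength to be the number of samples per channel, the interpretation argued for above.

```python
def duration_from_fact(dw_sample_length, n_samples_per_sec):
    """Length in seconds, taking dwSampleLength as samples per channel."""
    return dw_sample_length / n_samples_per_sec

def blocks_from_data(data_cksize, n_block_align):
    """Number of blocks (samples per channel) implied by the Data chunk size."""
    return data_cksize // n_block_align

# Usage: 2 seconds of 16-bit stereo at 44100 blocks per second.
blocks = blocks_from_data(352800, 4)
print(blocks, duration_from_fact(blocks, 44100))
```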

Data Chunk

The Data chunk contains the sampled data.

Field         Length  Contents
ckID          4       Chunk ID: "data"
cksize        4       Chunk size: n
sampled data  n       Samples
pad byte      0 or 1  Padding byte if n is odd
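
A minimal sketch of how a chunk with the padding rule might be serialized. The name pack_chunk is hypothetical. Note that cksize records n, the unpadded length.

```python
import struct

def pack_chunk(ck_id, payload):
    """Serialize one chunk: 4-byte ID, little-endian size, payload, optional pad byte."""
    if len(ck_id) != 4:
        raise ValueError("chunk ID must be exactly 4 bytes")
    pad = b"\x00" if len(payload) % 2 else b""  # pad only when n is odd
    return ck_id + struct.pack("<I", len(payload)) + payload + pad

# Usage: a 3-byte payload gets one padding byte, for 12 bytes in total.
print(len(pack_chunk(b"data", b"\x01\x02\x03")))
```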

 

Examples

Consider sampled data with the following parameters,

  • Nc channels
  • The total number of blocks is Ns. Each block consists of Nc samples.
  • Sampling rate F (blocks per second)
  • Each sample is M bytes long

PCM Data

Field            Length   Contents
ckID             4        Chunk ID: "RIFF"
cksize           4        Chunk size: 4 + 24 + (8 + M*Nc*Ns + (0 or 1))
WAVEID           4        WAVE ID: "WAVE"
ckID             4        Chunk ID: "fmt "
cksize           4        Chunk size: 16
wFormatTag       2        WAVE_FORMAT_PCM
nChannels        2        Nc
nSamplesPerSec   4        F
nAvgBytesPerSec  4        F*M*Nc
nBlockAlign      2        M*Nc
wBitsPerSample   2        rounds up to 8*M
ckID             4        Chunk ID: "data"
cksize           4        Chunk size: M*Nc*Ns
sampled data     M*Nc*Ns  Nc*Ns channel-interleaved M-byte samples
pad              0 or 1   Padding byte if M*Nc*Ns is odd
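
The PCM layout above can be assembled in a few lines. This is a sketch under stated assumptions rather than a complete writer: build_pcm_wav is a hypothetical name, samples are given as channel-interleaved integers that fit the container, and wBitsPerSample is simply set to 8*M.

```python
import struct

def build_pcm_wav(samples, n_channels, rate, m):
    """Return the bytes of a PCM WAVE file; m is the container size in bytes."""
    if m == 1:
        body = bytes(s + 128 for s in samples)  # 8-bit PCM is offset binary
    else:
        body = b"".join(s.to_bytes(m, "little", signed=True) for s in samples)
    pad = len(body) & 1
    fmt = struct.pack("<4sIHHIIHH", b"fmt ", 16, 0x0001, n_channels, rate,
                      rate * m * n_channels,  # nAvgBytesPerSec = F*M*Nc
                      m * n_channels,         # nBlockAlign = M*Nc
                      8 * m)                  # wBitsPerSample
    data = struct.pack("<4sI", b"data", len(body)) + body + b"\x00" * pad
    riff_size = 4 + len(fmt) + len(data)      # 4 + 24 + (8 + M*Nc*Ns + pad)
    return struct.pack("<4sI4s", b"RIFF", riff_size, b"WAVE") + fmt + data

# Usage: two blocks of 16-bit stereo at 8000 blocks per second.
blob = build_pcm_wav([0, 1, -1, 32767], n_channels=2, rate=8000, m=2)
print(len(blob))
```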

Notes

  • WAVE files often have information chunks that precede or follow the sound data (Data chunk). Some programs (naively) assume that for PCM data, the file header is exactly 44 bytes long and that the rest of the file contains sound data. This is not a safe assumption.
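
Rather than assuming a 44-byte header, a reader can scan the chunk headers until it reaches the Data chunk. A minimal sketch follows. The name find_chunk is hypothetical, and the whole file is assumed to be in memory as a bytes object.

```python
import struct

def find_chunk(blob, wanted=b"data"):
    """Return (payload_offset, payload_size) of the first chunk with ID `wanted`."""
    if blob[:4] != b"RIFF" or blob[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # first sub-chunk starts right after the WAVEID field
    while pos + 8 <= len(blob):
        ck_id, ck_size = struct.unpack_from("<4sI", blob, pos)
        if ck_id == wanted:
            return pos + 8, ck_size
        pos += 8 + ck_size + (ck_size & 1)  # skip payload and any pad byte
    raise KeyError(wanted)

# Usage: a LIST chunk between "fmt " and "data" pushes the samples past byte 44.
fmt = b"fmt " + struct.pack("<I", 16) + bytes(16)
lst = b"LIST" + struct.pack("<I", 5) + b"hello" + b"\x00"
dat = b"data" + struct.pack("<I", 4) + b"\x01\x02\x03\x04"
body = b"WAVE" + fmt + lst + dat
print(find_chunk(b"RIFF" + struct.pack("<I", len(body)) + body))
```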

Non-PCM Data

Field            Length   Contents
ckID             4        Chunk ID: "RIFF"
cksize           4        Chunk size: 4 + 26 + 12 + (8 + M*Nc*Ns + (0 or 1))
WAVEID           4        WAVE ID: "WAVE"
ckID             4        Chunk ID: "fmt "
cksize           4        Chunk size: 18
wFormatTag       2        Format code
nChannels        2        Nc
nSamplesPerSec   4        F
nAvgBytesPerSec  4        F*M*Nc
nBlockAlign      2        M*Nc
wBitsPerSample   2        8*M (float data) or 16 (log-PCM data)
cbSize           2        Size of the extension: 0
ckID             4        Chunk ID: "fact"
cksize           4        Chunk size: 4
dwSampleLength   4        Nc*Ns
ckID             4        Chunk ID: "data"
cksize           4        Chunk size: M*Nc*Ns
sampled data     M*Nc*Ns  Nc*Ns channel-interleaved M-byte samples
pad              0 or 1   Padding byte if M*Nc*Ns is odd
  • Microsoft Windows Media Player will not play non-PCM data (e.g. µ-law data) if the Format chunk does not have the extension size field (cbSize) or a Fact chunk is not present.

Extensible Format

Field                Length   Contents
ckID                 4        Chunk ID: "RIFF"
cksize               4        Chunk size: 4 + 48 + 12 + (8 + M*Nc*Ns + (0 or 1))
WAVEID               4        WAVE ID: "WAVE"
ckID                 4        Chunk ID: "fmt "
cksize               4        Chunk size: 40
wFormatTag           2        WAVE_FORMAT_EXTENSIBLE
nChannels            2        Nc
nSamplesPerSec       4        F
nAvgBytesPerSec      4        F*M*Nc
nBlockAlign          2        M*Nc
wBitsPerSample       2        8*M
cbSize               2        Size of the extension: 22
wValidBitsPerSample  2        at most 8*M
dwChannelMask        4        Speaker position mask: 0
SubFormat            16       GUID (first two bytes are the data format code)
ckID                 4        Chunk ID: "fact"
cksize               4        Chunk size: 4
dwSampleLength       4        Nc*Ns
ckID                 4        Chunk ID: "data"
cksize               4        Chunk size: M*Nc*Ns
sampled data         M*Nc*Ns  Nc*Ns channel-interleaved M-byte samples
pad                  0 or 1   Padding byte if M*Nc*Ns is odd
  • The Fact chunk can be omitted if the sampled data is in PCM format.
  • Microsoft Windows Media Player enforces the use of the WAVE_FORMAT_EXTENSIBLE format code. For instance, a file with 24-bit data declared as a standard WAVE_FORMAT_PCM format code will not play, but a file with 24-bit data declared as a WAVE_FORMAT_EXTENSIBLE file with a WAVE_FORMAT_PCM subcode can be played.
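
The RIFF chunk size for the three example layouts follows one pattern: 4 bytes for the WAVE ID, then 8 bytes of header plus payload for each chunk. The arithmetic can be sketched as below. The names riff_cksize and FMT_PAYLOAD are hypothetical, and the Fact chunk is assumed to be present by default for the non-PCM and extensible layouts only.

```python
# Format chunk payload size for each of the three example layouts.
FMT_PAYLOAD = {"pcm": 16, "non_pcm": 18, "extensible": 40}

def riff_cksize(layout, m, nc, ns, with_fact=None):
    """RIFF cksize for M-byte samples, Nc channels and Ns blocks."""
    data_bytes = m * nc * ns
    pad = data_bytes & 1              # one padding byte if the data length is odd
    if with_fact is None:
        with_fact = layout != "pcm"   # examples include a Fact chunk except for plain PCM
    size = 4                          # WAVEID
    size += 8 + FMT_PAYLOAD[layout]   # Format chunk: 24, 26 or 48 bytes
    if with_fact:
        size += 8 + 4                 # Fact chunk: 12 bytes
    size += 8 + data_bytes + pad      # Data chunk
    return size

# Usage: 100 blocks of 16-bit stereo PCM.
print(riff_cksize("pcm", m=2, nc=2, ns=100))
```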
