Mike Slinn

Extracting Audio from an MP4 as 32-bit WAV

Published 2021-11-04.
Time to read: 1 minute.

This page is part of the av_studio collection.

My Sony Alpha 7 Mark III camera creates MP4 files with good-quality stereo audio. I wanted to extract the audio to a 32-bit WAV file so I could work on it further in Pro Tools. Here is a bash script I wrote to do that:

mp4ToWav
#!/bin/bash

# $1 Input file path

function help {
  # Print the optional error message verbatim; never use it as a printf format
  if [ "$1" ]; then printf '%s\n\n' "$1"; fi
  echo "$(basename "$0") - Extract audio stream from an mp4 file and save as 32-bit wav

Usage: $(basename "$0") filename
"
  exit 1
}

if [ -z "$1" ]; then help "Error: no media file name specified"; fi

if [ ! -f "$1" ]; then help "Error: '$1' not found"; fi

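# Split the input path into directory and base name without its extension
filename="$( basename -- "$1" )"
path="$( dirname -- "$1" )"
filename="${filename%.*}"

# -vn drops the video stream, pcm_f32le writes 32-bit float PCM,
# -ar 44100 resamples to 44.1 kHz, and -ac 2 keeps two (stereo) channels
ffmpeg \
  -i "$1" \
  -vn \
  -acodec pcm_f32le \
  -ar 44100 \
  -ac 2 \
  "$path/$filename.wav"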

This is a sample usage:

Shell
$ mp4ToWav "Video Files/Descending C to G djembe.mp4"
ffmpeg version 4.3.2-0+deb11u1ubuntu1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 10 (Ubuntu 10.2.1-20ubuntu1)
  configuration: --prefix=/usr --extra-version=0+deb11u1ubuntu1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55fc3d8d1f80] st: 0 edit list: 1 Missing key frame while searching for timestamp: 1001
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55fc3d8d1f80] st: 0 edit list 1 Cannot find an index entry before timestamp: 1001.
Guessed Channel Layout for Input Stream #0.1 : stereo
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Video Files/Descending C to G djembe.mp4':
  Metadata:
    major_brand     : XAVC
    minor_version   : 16785407
    compatible_brands: XAVCmp42iso2
    creation_time   : 2021-10-31T19:00:25.000000Z
  Duration: 00:09:06.55, start: 0.000000, bitrate: 51575 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709/bt709/iec61966-2-4), 1920x1080 [SAR 1:1 DAR 16:9], 49492 kb/s, 59.94 fps, 59.94 tbr, 60k tbn, 119.88 tbc (default)
    Metadata:
      creation_time   : 2021-10-31T19:00:25.000000Z
      handler_name    : Video Media Handler
      encoder         : AVC Coding
    Stream #0:1(und): Audio: pcm_s16be (twos / 0x736F7774), 48000 Hz, stereo, s16, 1536 kb/s (default)
    Metadata:
      creation_time   : 2021-10-31T19:00:25.000000Z
      handler_name    : Sound Media Handler
    Stream #0:2(und): Data: none (rtmd / 0x646D7472), 491 kb/s (default)
    Metadata:
      creation_time   : 2021-10-31T19:00:25.000000Z
      handler_name    : Timed Metadata Media Handler
      timecode        : 07:09:43:54
Stream mapping:
  Stream #0:1 -> #0:0 (pcm_s16be (native) -> pcm_f32le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'Video Files/Descending C to G djembe.wav':
  Metadata:
    major_brand     : XAVC
    minor_version   : 16785407
    compatible_brands: XAVCmp42iso2
    ISFT            : Lavf58.45.100
    Stream #0:0(und): Audio: pcm_f32le ([3][0][0][0] / 0x0003), 44100 Hz, stereo, flt, 2822 kb/s (default)
    Metadata:
      creation_time   : 2021-10-31T19:00:25.000000Z
      handler_name    : Sound Media Handler
      encoder         : Lavc58.91.100 pcm_f32le
size=  188304kB time=00:09:06.55 bitrate=2822.4kbits/s speed= 153x
video:0kB audio:188304kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000059%
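
The log shows the WAV landing next to the input file, with only the extension changed. That name comes from splitting the input path with `basename`, `dirname`, and the `${var%.*}` expansion. As a minimal sketch of just that step, the function below prints the path the WAV would be written to, without running ffmpeg. The name `wav_path_for` is hypothetical and is not part of the script:

```shell
#!/bin/bash

# Hypothetical helper, for illustration only: prints the .wav path
# that corresponds to the media file path given as $1.
wav_path_for() {
  local name dir
  name="$( basename -- "$1" )"   # strip leading directories
  dir="$( dirname -- "$1" )"     # directory part, or "." if there is none
  name="${name%.*}"              # drop the last extension, if any
  printf '%s\n' "$dir/$name.wav"
}

wav_path_for "Video Files/Descending C to G djembe.mp4"
```

This prints `Video Files/Descending C to G djembe.wav`. For a file in the current directory, such as `clip.mp4`, the result is `./clip.wav`, because `dirname` returns `.` when the path has no directory part.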

The original MP4 was 3.4 GB, and the output WAV was 188 MB.
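
The WAV size follows from the output format that ffmpeg reported: uncompressed PCM takes sample rate × channels × bytes per sample × duration. Here is a rough check in bash integer arithmetic, assuming the 44100 Hz, stereo, 32-bit (4-byte) output and the 00:09:06.55 duration shown in the log. The function name `pcm_size_bytes` is made up for this sketch, and the duration is passed in hundredths of a second because bash has no floating point:

```shell
#!/bin/bash

# Hypothetical helper: size in bytes of raw PCM audio.
# $1 sample rate (Hz), $2 channels, $3 bytes per sample,
# $4 duration in hundredths of a second
pcm_size_bytes() {
  echo $(( $1 * $2 * $3 * $4 / 100 ))
}

# 00:09:06.55 = 9*60 + 6.55 s = 546.55 s = 54655 hundredths of a second
bytes=$(pcm_size_bytes 44100 2 4 54655)
echo "$bytes bytes"              # 192822840 bytes
echo "$(( bytes / 1024 )) KiB"   # 188303 KiB
```

The result is within rounding of the `188304kB` that ffmpeg printed, which appears to be counted in units of 1024 bytes and includes a small WAV header. The same figures give 44100 × 2 × 32 = 2,822,400 bits per second, matching the reported 2822.4 kbits/s.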
