Is your feature request related to a problem? Please describe.
Implement a wasm version of ffprobe.

Describe the solution you’d like
ffprobe is the necessary companion to ffmpeg; it is needed to analyze a media file before processing.

Describe alternatives you’ve considered
In this simple case I’m using the command-line ffprobe via execFile to probe the file:

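// cp is Node's built-in child_process module
const cp = require('child_process');

probe = function (fpath) {
      var self = this;
      return new Promise((resolve, reject) => {
        var loglevel = self.logger.isDebug() ? 'debug' : 'warning';
        const args = [
          // '-v' is an alias of '-loglevel', so the level is passed only once
          '-loglevel', loglevel,
          '-print_format', 'json',
          '-show_format',
          '-show_streams',
          '-i', fpath
        ];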
        const opts = {
          cwd: self._options.tempDir
        };
        const cb = (error, stdout) => {
          if (error)
            return reject(error);
          try {
            const outputObj = JSON.parse(stdout);
            return resolve(outputObj);
          } catch (ex) {
            self.logger.error("MediaHelper.probe failed %s", ex);
            return reject(ex);
          }
        };
        cp.execFile('ffprobe', args, opts, cb)
          .on('error', reject);
      });
    }//probe

Or, in this case, to seek to a position in the media file:

seek = function (fpath, seconds) {
      var self = this;
      return new Promise((resolve, reject) => {
        var loglevel = self.logger.isDebug() ? 'debug' : 'panic';
        const args = [
          '-hide_banner',
          '-loglevel', loglevel, // '-v' is an alias of '-loglevel', so the level is passed only once
          '-show_frames', // Display information about each frame
          '-show_entries', 'frame=pkt_pos', // Display only the byte position
          '-read_intervals', seconds + '%+#1', // Read only 1 packet after seeking to `seconds`
          '-print_format', 'json', // '-of' is an alias of '-print_format'; the callback parses JSON
          '-i', fpath
        ];
        const opts = {
          cwd: self._options.tempDir
        };
        const cb = (error, stdout) => {
          if (error)
            return reject(error);
          try {
            const outputObj = JSON.parse(stdout);
            return resolve(outputObj);
          } catch (ex) {
            self.logger.error("MediaHelper.seek failed %s", ex);
            return reject(ex);
          }
        };
        cp.execFile('ffprobe', args, opts, cb)
          .on('error', reject);
      });
    }//seek
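
Both functions above resolve with the object parsed from ffprobe's JSON output. As a minimal sketch of how that object might be consumed, the helpers below build the -read_intervals value and read the duration, stream types, and first packet position. The helper names (readInterval, summarizeProbe, firstPacketPosition) are hypothetical and not part of the issue; the field names (format.duration, streams[].codec_type, frames[].pkt_pos) are the ones ffprobe emits for these options.

```javascript
// Hypothetical helpers; the names are illustrative, not from the issue.

// Build a -read_intervals value: seek to `seconds`, then read `packets` packets.
function readInterval(seconds, packets = 1) {
  return `${seconds}%+#${packets}`;
}

// Summarize the object parsed from `-show_format -show_streams -print_format json`.
function summarizeProbe(outputObj) {
  const format = outputObj.format || {};
  const streams = outputObj.streams || [];
  // ffprobe reports numeric fields such as duration as strings
  const duration = Number.parseFloat(format.duration);
  return {
    durationSeconds: Number.isFinite(duration) ? duration : null,
    streamTypes: streams.map((s) => s.codec_type),
  };
}

// Read the byte offset of the first frame returned by
// `-show_frames -show_entries frame=pkt_pos -print_format json`.
function firstPacketPosition(outputObj) {
  const frames = outputObj.frames || [];
  if (frames.length === 0 || frames[0].pkt_pos === undefined) return null;
  const pos = Number.parseInt(frames[0].pkt_pos, 10);
  return Number.isNaN(pos) ? null : pos;
}
```

With these, a caller could write something like probe(fpath).then(summarizeProbe) or seek(fpath, 83).then(firstPacketPosition).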

Additional context
Probe media files before processing; seek to a position in the media.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 22
  • Comments: 13

Top GitHub Comments

17 reactions
crazoter commented, Apr 28, 2022

Recently I had a use case where I needed to check the duration of media files in the browser before uploading them to the server. I’ll describe the approach I took while building my POC, as it is somewhat related.

Use case

For context, my use case is as follows:

  • User uploads a video/audio file into a file input.
  • Browser takes file and somehow derives its duration.

ffprobe-wasm is not the full ffprobe program

  • My first idea was to use ffprobe-wasm, but I quickly discovered that the program being executed is not actually ffprobe but ffprobe-wasm-wrapper.cpp, which is essentially an attempt to rewrite ffprobe to be more emscripten-friendly and contains only a fraction of the functionality that ffprobe offers. The application as-is was insufficient for my use case, as I needed to verify audio files as well.
  • I decided against enhancing ffprobe-wasm-wrapper.cpp at the time because it would essentially mean manually porting ffprobe to wasm, something I lacked both the time and the expertise to do. Instead, I explored compiling the entire ffprobe program to wasm, which I managed to accomplish. My fork of the ffprobe-wasm repo can be found here: https://github.com/crazoter/ffprobe-wasm. The messy, uncleaned steps I took are as follows:
  1. First, I cloned https://github.com/alfg/ffprobe-wasm. If you’re trying to replicate the steps, you should refer to my fork, which includes the updated dockerfile.
  2. I noticed that the ffmpeg version was very old. I updated the dockerfile to use the latest ffmpeg (now fetched with git instead of the snapshotted tarball). I then built their wasm module via docker-compose run ffprobe-wasm make.
  3. I then jumped into the running docker container with an interactive bash shell. I navigated to the ffmpeg directory that had been downloaded into the tmp directory and manually compiled fftools/ffprobe.o and fftools/cmdutils.o. If I remember correctly, I executed:
emmake make fftools/ffprobe.o fftools/cmdutils.o
emcc --bind \
	-O3 \
	-L/opt/ffmpeg/lib \
	-I/opt/ffmpeg/include/ \
	-s EXTRA_EXPORTED_RUNTIME_METHODS="[FS, cwrap, ccall, getValue, setValue, writeAsciiToMemory]" \
	-s INITIAL_MEMORY=268435456 \
	-lavcodec -lavformat -lavfilter -lavdevice -lswresample -lswscale -lavutil -lm -lx264 \
	-pthread \
	-lworkerfs.js \
	-o ffprobe.js \
	ffprobe.o cmdutils.o
  • You can also perform emmake make build to build everything.
  • I am not too familiar with the flags to be honest, so this may be sub-optimal.
  • The instructions above may not be completely correct as I did not refine the process and rerun it to verify it. As a precaution I added the resulting files into the repo in my-dist.
  4. To test the files, I adapted emscripten’s https://github.com/emscripten-core/emscripten/blob/main/src/shell_minimal.html to use the compiled ffprobe_g and modified the generated JS code to call the main function directly (ffprobe_g.max.js is a beautified version of ffprobe_g; I did not investigate why both ffprobe_g and ffprobe exist). To run the file locally, I used Servez.
  • However, as I was not proficient with wasm, there were a few problems with my prototype. I imagine someone with more experience will be able to resolve these issues:
    1. There is no way to “reset” the args passed into ffprobe. Once args have been passed into the application, passing a non-empty args array into subsequent calls to main (without refreshing the page) will “stack” the new args on top of the old ones, causing issues. My workaround was to use the same file name and flags for all main calls and not pass args in subsequent calls, which worked for our use case. YMMV.
    2. The interface was not as clean as ffmpeg-wasm’s, as the logs are async and there is no indicator of when the application has finished running.
    3. The generated code assumed the existence of SharedArrayBuffer even though it was not needed to process most files. It is thus necessary to guard parts of the code with typeof SharedArrayBuffer !== "undefined" so that the code does not fail when you want to use ffprobe without changing your HTTP response headers.
  • Still, for anyone interested in porting ffprobe to wasm, I think this is a step in the right direction and can be worth exploring. I am actually quite curious why the original authors of ffprobe-wasm didn’t just compile the whole file.
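
The SharedArrayBuffer guard mentioned above can be sketched as follows. This is only an illustration of the typeof check, not the actual patch applied to the generated ffprobe code; allocateBuffer and its scope parameter are hypothetical names introduced here so the fallback path can be exercised. In browsers, SharedArrayBuffer is generally only exposed on cross-origin isolated pages (the Cross-Origin-Opener-Policy and Cross-Origin-Embedder-Policy headers), which is why the guard matters.

```javascript
// Hypothetical helper: use SharedArrayBuffer only when the runtime exposes it.
// `scope` defaults to the global object and exists so the fallback can be tested.
function allocateBuffer(byteLength, scope = globalThis) {
  if (typeof scope.SharedArrayBuffer !== 'undefined') {
    return { shared: true, buffer: new scope.SharedArrayBuffer(byteLength) };
  }
  // Fall back to a plain ArrayBuffer, which is sufficient for single-threaded use.
  return { shared: false, buffer: new ArrayBuffer(byteLength) };
}
```

Calling allocateBuffer(1024, {}) simulates a page without SharedArrayBuffer and returns a plain ArrayBuffer.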

What I ended up using

  • Due to the uncertain reliability of my (somewhat successful) ffprobe prototype, I decided to go with using ffmpeg-wasm instead.
  • Handling async issues with ffmpeg-wasm was easier, even though the data arrives separately through the logger, because you can await the run call until execution has completed.
  • The concern however is that I’d have to read the entire file into memory. Using 1GB of memory to read a 1GB audio file on the browser is unacceptable for my use case, even if the memory is released immediately afterward. This is a problem independent of ffmpeg-wasm, but instead caused by how the emscripten file system is used. After all, we’d have to somehow bring the file into MEMFS before ffmpeg can even start processing it, and normally we just bring the whole file into MEMFS. What if we just bring in a slice of that?
  • So I decided to instead use the Blob.slice API to obtain the first 5MB (arbitrary number) of data from the file, and then pass that into the emscripten file system using the fetchFile API provided by ffmpeg. The idea is that the metadata would be at the start of the file, and then we’ll have some excess data for ffmpeg to guess the format of the file if necessary.
// Toy example
const maxSliceLength = Math.min(1024*1024*5, oFiles[nFileId].size);
const slicedData = oFiles[nFileId].slice(0, maxSliceLength);
(async () => {
  ffmpeg.FS('writeFile', 'testfile', await fetchFile(slicedData));
  await ffmpeg.run('-i', 'testfile', '-hide_banner');
  ffmpeg.FS('unlink', 'testfile');
})();
  • This resolved the memory issue, but introduced a new problem; since ffmpeg is only seeing the first 5MB of the file, it has to guess the duration of some files using bitrate. This thus involved a bit more engineering to identify if the estimation is performed, and if so, perform the estimation ourselves using the actual file size:
    • One way is to estimate by bitrate. Personally this is a last resort because the difference in estimated & actual file size can be ridiculous.
    • The second (more reliable) way is to take the estimated duration from ffmpeg and multiply it by file.size / maxSliceLength, since ffmpeg’s estimate covers only the slice it was given.
  • Edit: ffmpeg and ffprobe will throw an error for some files if they can’t read the whole file (e.g. mp4, 3gp). More specifically, the dreaded “Invalid data found when processing input”. For these types of files, there are 2 options currently available:
  • I settled for this solution as I didn’t need an exact value for the duration (a malicious actor would be able to bypass a browser-based check anyway).
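
A minimal sketch of the duration handling described above, assuming the ffmpeg log text has already been collected into a string. It relies on ffmpeg's "Duration: HH:MM:SS.xx" log line and on the "Estimating duration from bitrate" warning that ffmpeg prints when it guesses; the function names are hypothetical. Because the estimate covers only the slice, it is scaled up by the ratio of the full file size to the slice length.

```javascript
// Hypothetical helpers; the names are illustrative, not from the comment above.

// Extract "Duration: HH:MM:SS.xx" from ffmpeg's log text and return seconds.
function parseDurationSeconds(logText) {
  const m = /Duration:\s*(\d+):(\d{2}):(\d{2}(?:\.\d+)?)/.exec(logText);
  if (!m) return null; // e.g. "Duration: N/A" or no duration line at all
  return Number(m[1]) * 3600 + Number(m[2]) * 60 + Number(m[3]);
}

// ffmpeg logs this warning when it derives the duration from the bitrate.
function wasEstimatedFromBitrate(logText) {
  return logText.includes('Estimating duration from bitrate');
}

// Return the duration for the whole file, scaling up slice-based estimates.
function durationForWholeFile(logText, sliceLength, fileSize) {
  const seconds = parseDurationSeconds(logText);
  if (seconds === null) return null;
  if (!wasEstimatedFromBitrate(logText) || sliceLength >= fileSize) {
    return seconds; // the duration came from metadata or the whole file was read
  }
  return seconds * (fileSize / sliceLength);
}
```

The result is approximate by design, which matches the goal of a rough browser-side check rather than an exact measurement.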

Hopefully this write-up will benefit someone looking for a similar solution, or someone hoping to port ffprobe to wasm.

7 reactions
brunomsantiago commented, Jan 4, 2022

Any plans to support this? It would be awesome to get format and stream metadata in the browser. Something like this:

ffprobe -hide_banner -loglevel fatal -show_error -show_format -show_streams -print_format json video.mp4

ffprobe is so much faster than ffmpeg because it doesn’t try to read the entire file.
