- 04 Aug, 2014 19 commits
-
-
Ben Avison authored
The previous implementation of the parser made four passes over each input buffer (reduced to two if the container format already guaranteed the input buffer corresponded to frames, such as with MKV). But these buffers are often 200K in size, certainly enough to flush the data out of L1 cache, and for many CPUs, all the way out to main memory. The passes were:

1) locate frame boundaries (not needed for MKV etc)
2) copy the data into a contiguous block (not needed for MKV etc)
3) locate the start codes within each frame
4) unescape the data between start codes

After this, the unescaped data was parsed to extract certain header fields, but because the unescape operation was so large, this was usually also effectively operating on uncached memory. Most of the unescaped data was simply thrown away and never processed further. Only step 2 - because it used memcpy - was using prefetch, making things even worse.

This patch reorganises these steps so that, aside from the copying, the operations are performed in parallel, maximising cache utilisation. No more than the worst-case number of bytes needed for header parsing is unescaped. Most of the data is, in practice, only read in order to search for a start code, for which optimised implementations already existed in the H264 codec (notably the ARM version uses prefetch, so we end up doing both remaining passes at maximum speed). For MKV files, we know when we've found the last start code of interest in a given frame, so we are able to avoid doing even that one remaining pass for most of the buffer.

In some use-cases (such as the Raspberry Pi) video decode is handled by the GPU, but the entire elementary stream is still fed through the parser to pick out certain elements of the header which are necessary to manage the decode process. As you might expect, in these cases, the performance of the parser is significant.
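The idea of handling each start code as soon as it is found, and unescaping only a bounded prefix of the data after it, can be sketched roughly as follows. This is an illustrative Python sketch, not the patch's C code: the function names, the 16-byte limit and the exact escaping rule (a 0x03 byte after two zero bytes and before a byte below 0x04 is dropped) are assumptions made for the example.

```python
START_PREFIX = b"\x00\x00\x01"  # three-byte start code prefix


def find_start_code(buf: bytes, pos: int = 0) -> int:
    """Return the index of the next start code prefix at or after pos, or -1."""
    return buf.find(START_PREFIX, pos)


def unescape_prefix(buf: bytes, start: int, end: int, limit: int) -> bytes:
    """Unescape buf[start:end], stopping once `limit` output bytes exist.

    Assumed rule: a 0x03 byte that follows two zero bytes and precedes
    a byte below 0x04 is an escape byte and is dropped.
    """
    out = bytearray()
    i = start
    while i < end and len(out) < limit:
        b = buf[i]
        is_escape = (
            b == 3
            and i - start >= 2
            and buf[i - 1] == 0
            and buf[i - 2] == 0
            and i + 1 < end
            and buf[i + 1] < 4
        )
        if not is_escape:
            out.append(b)
        i += 1
    return bytes(out)


def scan_headers(buf: bytes, max_header_bytes: int = 16):
    """Walk the buffer once, yielding (start code byte, unescaped header prefix).

    Everything beyond the first max_header_bytes of each unit is only
    read by the start code search and is never unescaped or copied.
    """
    units = []
    pos = find_start_code(buf)
    while pos != -1 and pos + 3 < len(buf):
        code = buf[pos + 3]
        payload = pos + 4
        nxt = find_start_code(buf, payload)
        end = nxt if nxt != -1 else len(buf)
        units.append((code, unescape_prefix(buf, payload, end, max_header_bytes)))
        pos = nxt
    return units
```

A caller would pass the raw buffer to `scan_headers()` and parse header fields from the short prefixes it returns; the bulk of each unit is touched only by the search.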
To measure parser performance, I used the same VC-1 elementary stream in either an MPEG-2 transport stream or a MKV file, and fed it through avconv with -c:v copy -c:a copy -f null. These are the gperftools counts for those streams, both filtered to only include vc1_parse() and its callees, and unfiltered (to include the whole binary). Lower numbers are better:

                Before           After
File  Filtered  Mean    StdDev   Mean    StdDev  Confidence  Change
M2TS  No        861.7   8.2      650.5   8.1     100.0%      +32.5%
MKV   No        868.9   7.4      731.7   9.0     100.0%      +18.8%
M2TS  Yes       250.0   11.2     27.2    3.4     100.0%      +817.9%
MKV   Yes       149.0   12.8     1.7     0.8     100.0%      +8526.3%

Yes, that last case shows vc1_parse() running 86 times faster! The M2TS case does show a larger absolute improvement though, since it was worse to begin with.

This patch has been tested with the FATE suite (albeit on x86 for speed).

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
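The Change column appears consistent with the ratio of the two means minus one, which is also where the "86 times faster" figure comes from (+8526.3% is a ratio of about 86). A minimal Python sketch of that arithmetic, assuming this is how the column was derived; the function name is made up for the example:

```python
def change_percent(before: float, after: float) -> float:
    """Relative speed-up in percent, assuming change = before / after - 1."""
    return (before / after - 1.0) * 100.0


# Unfiltered rows of the table above: (mean before, mean after).
unfiltered = {"M2TS": (861.7, 650.5), "MKV": (868.9, 731.7)}

for name, (before, after) in unfiltered.items():
    print(f"{name}: {change_percent(before, after):+.1f}%")
# prints "M2TS: +32.5%" and "MKV: +18.8%"
```

Recomputing the filtered rows from the rounded means gives only roughly the published values, presumably because the published changes were calculated from unrounded measurements.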
-
Ben Avison authored
Initialise VC1DSPContext for parser as well as for decoder. Note, the VC-1 code doesn't actually use the function pointer yet. Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
-
Ben Avison authored
This permits re-use with parsers for codecs which use similar start codes. Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Vittorio Giovara authored
Every supported format is converted to RGB.
-
Carl Eugen Hoyos authored
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
-
Vittorio Giovara authored
-
Vittorio Giovara authored
-
Vittorio Giovara authored
The decoder was producing different results when ASM was disabled. Based on a long debug session with Kostya.
-
Vittorio Giovara authored
Based on a long debug session with Kostya.
-
Vittorio Giovara authored
-
Vittorio Giovara authored
The rationale is that you have a packed format in the form <greyscale sample> <alpha sample> <greyscale sample> <alpha sample>, and shortening greyscale to 'G' might make one think of Greenscale instead. An alias pixel format and color space name are provided for compatibility.
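The packed layout described above can be illustrated with a small Python sketch that separates the interleaved samples into a greyscale plane and an alpha plane. It assumes one byte per sample, which the message does not state, and the function name is hypothetical:

```python
def split_grey_alpha(packed: bytes) -> tuple[bytes, bytes]:
    """Split <grey> <alpha> <grey> <alpha> ... into two planes.

    Assumes 8-bit samples, so each pixel occupies exactly two bytes.
    """
    if len(packed) % 2:
        raise ValueError("packed grey+alpha data must have an even length")
    grey = packed[0::2]   # samples at even offsets
    alpha = packed[1::2]  # samples at odd offsets
    return grey, alpha


# Two pixels: grey 0x10 fully opaque, grey 0x80 half transparent.
grey, alpha = split_grey_alpha(bytes([0x10, 0xFF, 0x80, 0x7F]))
```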
-
Vittorio Giovara authored
-
Luca Barbato authored
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
-
Kostya Shishkov authored
Bug-Id: 772 CC: libav-stable@libav.org Found-By: Justin Ruggles <justin.ruggles@gmail.com> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
-
Janne Grunau authored
This makes the default of '1' more explicit than defaulting to '1' in fate-run.sh and regression-funcs.sh if THREADS is not set. Fixes the reported thread count in fate-cpu if THREADS is not set.
-
Marvin Scholz authored
Icecast is basically a convenience wrapper around the HTTP protocol. Signed-off-by: Martin Storsjö <martin@martin.st>
-
Diego Biurrun authored
-
- 03 Aug, 2014 19 commits
-
-
Kieran Kunhya authored
Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
Diego Biurrun authored
Bug-Id: CVE-2013-0868 inspired by a patch from Michael Niedermayer <michaelni@gmx.at> Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind Signed-off-by: Diego Biurrun <diego@biurrun.de> CC: libav-stable@libav.org
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Anton Khirnov authored
Signed-off-by: Diego Biurrun <diego@biurrun.de>
-
Janne Grunau authored
-
Janne Grunau authored
llvm's integrated assembler does not accept spaces as macro argument delimiters when targeting darwin. Using an explicit delimiter is a good idea in principle since it makes cases like 'macro 4 -2' vs 'macro 4 - 2' clear.
-
Janne Grunau authored
Add CPU count and number of threads as informative values for fate.
-
Janne Grunau authored
libavutil/cpu-test prints raw and effective cpu flags to STDERR. Detected cpu flags can be useful for debugging fate errors. The result is not compared against an expected result, since that would require references specific to each fate config.
-
Luca Barbato authored
Split return value handling from the actual opening. Incidentally fixes the https -> http redirect issue reported by Compn on behalf of rcombs. CC: libav-stable@libav.org
-
Justin Ruggles authored
This treats mono as planar internally within libavresample rather than changing the sample format. Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Diego Biurrun authored
Also add missing mem.h header for av_freep().
-
Diego Biurrun authored
-
Diego Biurrun authored
-
Diego Biurrun authored
This is cleaner and avoids a cast plus a related const qualifier warning.
-
- 02 Aug, 2014 2 commits
-
-
Diego Biurrun authored
-
Diego Biurrun authored
-