Ganesh Ajjanagadde authored
qsort is called indirectly from filter_frame, which suggests that it is performance critical. AV_QSORT is substantially faster because the comparison callback is inlined, so the gain in performance should be worth the increase in binary size. This optimization is just low-hanging fruit; trac ticket 1430 is a request for an improved deshake filter.

Sample benchmark (x86-64, Haswell, GNU/Linux):
File: original from https://trac.ffmpeg.org/ticket/1430
command: ffmpeg -stream_loop 8 -i file.webm -vf deshake=rx=64:ry=64 -f null -
Timer truncated at 1024 runs.

new:
  28260 decicycles in qsort,    1 runs, 0 skips
  35570 decicycles in qsort,    2 runs, 0 skips
  39010 decicycles in qsort,    4 runs, 0 skips
  46897 decicycles in qsort,    8 runs, 0 skips
  40442 decicycles in qsort,   16 runs, 0 skips
  41611 decicycles in qsort,   32 runs, 0 skips
  40345 decicycles in qsort,   64 runs, 0 skips
  38967 decicycles in qsort,  128 runs, 0 skips
  38647 decicycles in qsort,  256 runs, 0 skips
  40238 decicycles in qsort,  512 runs, 0 skips
  39676 decicycles in qsort, 1024 runs, 0 skips

old:
1740280 decicycles in qsort,    1 runs, 0 skips
 923560 decicycles in qsort,    2 runs, 0 skips
 511330 decicycles in qsort,    4 runs, 0 skips
 309720 decicycles in qsort,    8 runs, 0 skips
 194900 decicycles in qsort,   16 runs, 0 skips
 142686 decicycles in qsort,   32 runs, 0 skips
 112516 decicycles in qsort,   64 runs, 0 skips
  98166 decicycles in qsort,  128 runs, 0 skips
  88147 decicycles in qsort,  256 runs, 0 skips
  88706 decicycles in qsort,  512 runs, 0 skips
  86783 decicycles in qsort, 1024 runs, 0 skips

Reviewed-by: Nicolas George <george@nsup.org>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
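The effect of inlining the comparison can be illustrated with a minimal sketch. This is not the code from this commit and not FFmpeg's AV_QSORT: INLINE_SORT, CMP_DOUBLE, sort_with_callback and sort_inlined are hypothetical names, the element type double is an assumption, and a plain insertion sort stands in for the real algorithm. The only point is that the libc version reaches the comparison through a function pointer, while the macro version expands it at the call site.

```c
#include <stddef.h>
#include <stdlib.h>

/* Callback for libc qsort: reached through a function pointer on every
 * comparison, so the compiler usually cannot inline it. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Comparison written as a macro taking two element pointers; it is
 * expanded textually wherever the sort macro uses it. */
#define CMP_DOUBLE(a, b) ((*(a) > *(b)) - (*(a) < *(b)))

/* Hypothetical stand-in for a macro-based sort (insertion sort for
 * brevity). The element type and the comparison are macro parameters,
 * so each use generates code specialized for that type. */
#define INLINE_SORT(p, num, type, cmp)                                \
    do {                                                              \
        type *base_ = (p);                                            \
        size_t n_ = (num);                                            \
        for (size_t i_ = 1; i_ < n_; i_++) {                          \
            type key_ = base_[i_];                                    \
            size_t j_ = i_;                                           \
            while (j_ > 0 && cmp(&base_[j_ - 1], &key_) > 0) {        \
                base_[j_] = base_[j_ - 1];                            \
                j_--;                                                 \
            }                                                         \
            base_[j_] = key_;                                         \
        }                                                             \
    } while (0)

/* Sort via libc qsort and the function-pointer callback. */
static void sort_with_callback(double *values, size_t count)
{
    qsort(values, count, sizeof(*values), cmp_double);
}

/* Sort via the macro; the comparison is compiled into the loop body. */
static void sort_inlined(double *values, size_t count)
{
    INLINE_SORT(values, count, double, CMP_DOUBLE);
}
```

Both helpers produce the same ordering; the macro version trades a larger binary (one expansion per use) for the removal of the indirect call, which is the trade-off the message above describes.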
7910a2c2