- 07 Sep, 2017 2 commits
-
-
Timo Rothenpieler authored
-
Timo Rothenpieler authored
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
- 02 Sep, 2017 1 commit
-
-
Timo Rothenpieler authored
Interlaced encoding profits from it, or might even need it in some players. No harm in enabling it unconditionally. Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
- 01 Sep, 2017 2 commits
-
-
Timo Rothenpieler authored
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
Timo Rothenpieler authored
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
- 02 Jun, 2017 1 commit
-
-
Ganapathy Kasi authored
Hardware-accelerated transcoding (h264_cuvid -> h264_nvenc with -hwaccel cuvid) broke after the filtergraph initialization was changed to initialize the decoder first, followed by the encoder (commit af1761f7). When the encoder is initialized with B-frames, it allocates local buffers internally, which fails because no CUDA context is available. Pushing the correct CUDA context before encoder initialization fixes the issue. Also add push/pop of the CUDA context around create/destroy/map/unmap of resources and around destroying the encoder session. Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
- 01 Jun, 2017 1 commit
-
-
Timo Rothenpieler authored
-
- 23 May, 2017 1 commit
-
-
Timo Rothenpieler authored
Fixes #6260
-
- 10 May, 2017 2 commits
-
-
Sumit Agarwal authored
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
Ben Chang authored
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
- 09 May, 2017 2 commits
-
-
Timo Rothenpieler authored
-
Timo Rothenpieler authored
-
- 07 May, 2017 2 commits
-
-
Timo Rothenpieler authored
-
Timo Rothenpieler authored
-
- 26 Apr, 2017 1 commit
-
-
Ben Chang authored
This patch aims to reduce the number of input/output surfaces NVENC allocates per session. Previously, the default was 32 allocated surfaces (unless the user specified a value or lookahead was involved). A large number of surfaces consumes extra video memory, especially for higher-resolution encoding. The patch changes the surface calculation for the default, B-frames, and lookahead scenarios respectively. The other change involves surface selection. Previously, if a session allocated x surfaces, only x-1 surfaces were used (due to the combination of output delay and lock toggle logic). To prevent unused surfaces, surface rotation now uses a predefined FIFO. Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
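The FIFO-based surface rotation described above can be illustrated with a minimal sketch. This is not the code from the patch: `SurfaceFifo`, `fifo_init`, `fifo_acquire`, `fifo_release`, and `MAX_SURFACES` are hypothetical names, and surfaces are represented only by their integer index. It shows how a queue of free indices lets a session hand out all x allocated surfaces rather than x-1.

```c
#include <stddef.h>

/* Hypothetical upper bound, chosen only for this illustration. */
#define MAX_SURFACES 64

/* Ring buffer holding the indices of surfaces that are currently free. */
typedef struct SurfaceFifo {
    int    slots[MAX_SURFACES];
    size_t head;      /* position of the next free index to hand out */
    size_t count;     /* number of free indices currently queued */
    size_t capacity;  /* total number of surfaces in the session */
} SurfaceFifo;

/* Queue every surface index once, so all of them are usable. */
static int fifo_init(SurfaceFifo *f, size_t nb_surfaces)
{
    if (nb_surfaces == 0 || nb_surfaces > MAX_SURFACES)
        return -1;
    f->head     = 0;
    f->count    = nb_surfaces;
    f->capacity = nb_surfaces;
    for (size_t i = 0; i < nb_surfaces; i++)
        f->slots[i] = (int)i;
    return 0;
}

/* Take the oldest free surface index, or -1 if none is free. */
static int fifo_acquire(SurfaceFifo *f)
{
    if (f->count == 0)
        return -1;
    int idx = f->slots[f->head];
    f->head = (f->head + 1) % f->capacity;
    f->count--;
    return idx;
}

/* Return a surface index to the back of the queue; -1 if already full. */
static int fifo_release(SurfaceFifo *f, int idx)
{
    if (f->count == f->capacity)
        return -1;
    size_t tail = (f->head + f->count) % f->capacity;
    f->slots[tail] = idx;
    f->count++;
    return 0;
}
```

With four surfaces, four consecutive `fifo_acquire` calls succeed before the queue reports that it is empty, and a released index becomes available again in release order.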
-
- 23 Mar, 2017 2 commits
-
-
Timo Rothenpieler authored
-
Timo Rothenpieler authored
-
- 20 Mar, 2017 1 commit
-
-
Clément Bœsch authored
-
- 17 Mar, 2017 1 commit
-
-
Konda Raju authored
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
- 01 Mar, 2017 2 commits
-
-
Konda Raju authored
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
Ganapathy Raman Kasi authored
qmin and qmax are not necessary for nvenc vbr. Enforcing this constraint prevents the user from using vbr 2 pass mode without explicitly setting the qmin and qmax options. Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
- 14 Feb, 2017 1 commit
-
-
Timo Rothenpieler authored
Thanks to Miroslav Slugeň for figuring out what was going on here.
-
- 13 Feb, 2017 1 commit
-
-
Timo Rothenpieler authored
-
- 20 Jan, 2017 2 commits
-
-
Timo Rothenpieler authored
-
Timo Rothenpieler authored
-
- 17 Jan, 2017 1 commit
-
-
Luca Barbato authored
Make sure that NVENC does not misbehave if the application uses CUDA elsewhere.
-
- 01 Jan, 2017 3 commits
-
-
Miroslav Slugen authored
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
Miroslav Slugen authored
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
Miroslav Slugeň authored
Round qpIntra and qpInter calculation instead of old floor behavior. Adopted from vaapi_encode_h264.c Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
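The difference between the old floor behavior and rounding can be sketched as follows. This is an illustration under assumptions, not the code from the commit: `clip_qp`, `derive_qp_floor`, and `derive_qp_round` are hypothetical names, the 0..51 range is the usual H.264 QP range, and the inputs are assumed to be non-negative so that integer truncation matches floor.

```c
/* Clamp a QP value to the 0..51 range (assumed here, as for H.264). */
static int clip_qp(int v)
{
    return v < 0 ? 0 : (v > 51 ? 51 : v);
}

/* Old behavior: the fractional part is simply dropped. */
static int derive_qp_floor(int qp_p, double factor, double offset)
{
    return clip_qp((int)(qp_p * factor + offset));
}

/* New behavior: add 0.5 before truncating, so the result is rounded
 * to the nearest integer for non-negative values. */
static int derive_qp_round(int qp_p, double factor, double offset)
{
    return clip_qp((int)(qp_p * factor + offset + 0.5));
}
```

For a P-frame QP of 23 and a factor of 1.25 the exact product is 28.75, so the floor variant yields 28 while the rounding variant yields 29.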
-
- 26 Dec, 2016 1 commit
-
-
Ruta Gadkari authored
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
-
- 21 Dec, 2016 1 commit
-
-
Ruta Gadkari authored
By default it is -1. Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
-
- 30 Nov, 2016 3 commits
-
-
Timo Rothenpieler authored
-
Miroslav Slugeň authored
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
Philip Langdale authored
When input surfaces are cuda frames, we will not know what the actual underlying format (nv12, p010, etc) is at surface allocation time. On the other hand, we will know when the input frames are actually registered and associated with a surface. So, let's delay format discovery until registration time, which is actually how we handle other frame properties, such as dimensions. By itself, this change doesn't allow for transcoding of 10bit content from cuvid, but it reduces the problem to the hardcoding of the sw format in ffmpeg_cuvid.c Signed-off-by: Philip Langdale <philipl@overt.org> Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
- 25 Nov, 2016 1 commit
-
-
Philip Langdale authored
This dubious behaviour in nvenc was finally removed by nvidia, and as we refuse to run on anything older than 7.0, we don't need to keep it around for old versions.
-
- 22 Nov, 2016 2 commits
-
-
Miroslav Slugeň authored
User-selectable surfaces are not working correctly: if you set the number of surfaces on the command line, it will always use a minimum of 32 or 48, depending on the selected resolution, but nvenc does not need that many surfaces. From now on you can define as few as 1 surface and nvenc will still work; it will of course lower GPU memory usage by 95% and async_delay to zero. That was the easy part, now a little bit more...
The next part of this patch makes rc_lookahead always take priority over the user-defined surfaces value when determining the number of surfaces. The maximum rc_lookahead in the nvidia documentation is 32, but it could increase in future generations, so there is no limit for this yet. The async_depth value is still accepted and preferred over rc_lookahead. There was also a bug when requesting rc_lookahead > 31: it would always be set to a maximum of 31, because the surface number recalculation came after setting lookahead. This is now fixed.
Results: if you set -rc_lookahead 32 and -bf 3, it will now use only 40 surfaces and lower GPU memory usage by 20%; it will also increase PSNR by 0.012dB.
Two more comments:
1. From my internal tests, I don't understand the addition of 4 more surfaces when lookahead is calculated. I didn't use this and everything works as it does with those 4 extra surfaces; does anybody know what is going on there? It looks like it was used for B-frames, which are calculated separately, because the B-frame maximum is 4.
2. rc_lookahead defaults to -1, but the test condition if (ctx->rc_lookahead), which sets lookahead, is then true. I don't know if this is intended behavior, so by default lookahead is always on! This is the default condition when rc_lookahead is -1 (not defined on the command line), which is maybe not intended: ctx->encode_config.rcParams.enableLookahead = 1; ctx->encode_config.rcParams.lookaheadDepth = 0; ctx->encode_config.rcParams.disableIadapt = 0; ctx->encode_config.rcParams.disableBadapt = 0; Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
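The second comment in the message above comes down to how C evaluates an integer in a condition: any non-zero value, including -1, counts as true. The sketch below illustrates this with hypothetical helper names (`lookahead_enabled_truthy`, `lookahead_enabled_positive`). The stricter `> 0` variant is only an assumption about what an intended check might look like, not the fix that was applied.

```c
/* Mirrors a plain `if (rc_lookahead)` test: true for any non-zero value,
 * so the "unset" default of -1 also enables lookahead. */
static int lookahead_enabled_truthy(int rc_lookahead)
{
    return rc_lookahead != 0;
}

/* Hypothetical stricter test: only an explicit positive depth enables
 * lookahead, so the -1 default leaves it off. */
static int lookahead_enabled_positive(int rc_lookahead)
{
    return rc_lookahead > 0;
}
```

Both helpers agree for 0 and for positive depths; they differ only for the -1 default, which is the case the commit message questions.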
-
Timo Rothenpieler authored
-
- 05 Nov, 2016 1 commit
-
-
Matt Oliver authored
Signed-off-by: Matt Oliver <protogonoi@gmail.com>
-
- 19 Oct, 2016 1 commit
-
-
Sven C. Dack authored
Adds a check to see if the hardware supports temporal aq. Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
-
- 12 Oct, 2016 1 commit
-
-
Timo Rothenpieler authored
-