Commits · 8446318502bf21347a4867a5a1fcd8d9bfbd6a41 · Linshizhi / ffmpeg.wasm-core

20 Mar, 2019 2 commits
- swscale/ppc: Add av_unused to template vars only used in one includer · 6b5ea90e
  Lauri Kasanen authored 5 years ago
  
  6b5ea90e
- swscale/ppc: Clean up some mixed decl warnings · ac3062f1
  Lauri Kasanen authored 5 years ago
  
  ac3062f1
05 Feb, 2019 1 commit

libswscale/ppc: VSX-optimize 9-16 bit yuv2planeX · 8522d219

Lauri Kasanen authored 6 years ago

./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p16be \
-s 1920x1728 -f null -vframes 100 -v error -nostats -

9-14 bit funcs get about 6x speedup, 16-bit gets about 15x.
Fate passes, each format tested with an image to video conversion.

Only POWER8 includes 32-bit vector multiplies, so POWER7 is locked out
of the 16-bit function. This includes the vec_mulo/mule functions too,
not just vmuluwm.

With TIMER_REPORT skips disabled:
yuv420p9le
  12412 UNITS in planarX,  131072 runs,      0 skips
  73136 UNITS in planarX,  131072 runs,      0 skips
yuv420p9be
  12481 UNITS in planarX,  131072 runs,      0 skips
  73410 UNITS in planarX,  131072 runs,      0 skips
yuv420p10le
  12322 UNITS in planarX,  131072 runs,      0 skips
  72546 UNITS in planarX,  131072 runs,      0 skips
yuv420p10be
  12291 UNITS in planarX,  131072 runs,      0 skips
  72935 UNITS in planarX,  131072 runs,      0 skips
yuv420p12le
  12316 UNITS in planarX,  131072 runs,      0 skips
  72708 UNITS in planarX,  131072 runs,      0 skips
yuv420p12be
  12319 UNITS in planarX,  131072 runs,      0 skips
  72577 UNITS in planarX,  131072 runs,      0 skips
yuv420p14le
  12259 UNITS in planarX,  131072 runs,      0 skips
  72516 UNITS in planarX,  131072 runs,      0 skips
yuv420p14be
  12440 UNITS in planarX,  131072 runs,      0 skips
  72962 UNITS in planarX,  131072 runs,      0 skips
yuv420p16le
  10548 UNITS in planarX,  131072 runs,      0 skips
  73429 UNITS in planarX,  131072 runs,      0 skips
yuv420p16be
  10634 UNITS in planarX,  131072 runs,      0 skips
 150959 UNITS in planarX,  131072 runs,      0 skips
Signed-off-by: Lauri Kasanen <cand@gmx.com>

8522d219
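
The POWER8 restriction described in 8522d219 amounts to a capability check when the scaler picks its functions. A minimal sketch of that gating follows. The flag and enum names here are hypothetical and chosen only for illustration; FFmpeg's real CPU flags are declared in libavutil/cpu.h and are not used in this sketch.

```c
/* Hypothetical capability bits, for illustration only. */
enum { CPU_VSX = 1 << 0, CPU_POWER8 = 1 << 1 };

typedef enum { IMPL_C, IMPL_VSX } Impl;

/* Choose an implementation of the 9-16 bit vertical scaler.
 * Only the cases named in the commit message are modelled:
 * 9-14 bit output needs VSX, 16-bit output additionally needs POWER8
 * because it relies on 32-bit vector multiplies. */
static Impl pick_planeX_impl(unsigned cpu_flags, int dst_bits)
{
    if (!(cpu_flags & CPU_VSX))
        return IMPL_C;
    if (dst_bits == 16)
        return (cpu_flags & CPU_POWER8) ? IMPL_VSX : IMPL_C;
    if (dst_bits >= 9 && dst_bits <= 14)
        return IMPL_VSX;
    return IMPL_C; /* other depths are outside the scope of this sketch */
}
```

With these assumptions, a POWER7-class machine (VSX without POWER8) would still get the vector path for 9-14 bit output but fall back to C for 16-bit output.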

04 Dec, 2018 1 commit

swscale/ppc: Move VSX-using code to its own file · 78c7ff7d

Lauri Kasanen authored 6 years ago

Passes fate on LE (with "lavc/jrevdct: Avoid an aliasing violation" applied).
Signed-off-by: Lauri Kasanen <cand@gmx.com>
Tested-by: Michael Kostylev on BE
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>

78c7ff7d

26 Nov, 2018 1 commit

swscale/output: Altivec-optimize yuv2plane1_8 · 46c5693e

Lauri Kasanen authored 6 years ago

./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p \
-f null -vframes 100 -v error -nostats -

1158 UNITS in planar1,   65528 runs,      8 skips

-cpuflags 0

19082 UNITS in planar1,   65533 runs,      3 skips

16.48 speedup ratio. On x86, SSE2 is ~7. Curiously, the Power C version
takes as many cycles as the x86 SSE2 version, yikes it's fast.

Note that this function uses VSX instructions, but is not marked so.
This is because several existing functions also make that mistake.
I'll submit a patch moving them once this is reviewed.
Signed-off-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>

46c5693e
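
The "16.48 speedup ratio" quoted in 46c5693e follows directly from the two TIMER_REPORT figures in that message (19082 UNITS with `-cpuflags 0`, 1158 UNITS with the Altivec path). A small sketch of the arithmetic, with a hypothetical helper name:

```c
/* Ratio of baseline cost to optimized cost; both in the same units. */
static double speedup(double baseline_units, double optimized_units)
{
    return baseline_units / optimized_units;
}
```

Calling `speedup(19082, 1158)` gives roughly 16.48, matching the figure in the commit message.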

14 Aug, 2018 1 commit
- libswscale: Adds conversions from/to float gray format. · 582bc5a3
  Sergey Lavrushkin authored 6 years ago
```
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
  582bc5a3
09 Nov, 2016 1 commit
- swscale: Drop is9_OR_10BPS() use, its name is not correct · d736b52a
  Michael Niedermayer authored 8 years ago
```
Found-by: Luca Barbato
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
```
  d736b52a
12 Oct, 2016 1 commit
- swscale: Add input support for 12-bit formats · 328ea6a9
  Michael Niedermayer authored 12 years ago
```
Implemented for AV_PIX_FMT_GBRP12.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
```
  328ea6a9
27 Sep, 2016 1 commit
- swscale: Rename is9_OR_10 to match what it does · 2b5b1e1e
  Luca Barbato authored 8 years ago
```
It is used to select functions that work with 9-15bits.
```
  2b5b1e1e
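
The range named in 2b5b1e1e can be written as a simple predicate. This is a sketch only: the helper name is hypothetical, and the commit message states just the 9-15 bit range, not how the check is implemented.

```c
#include <stdbool.h>

/* Hypothetical helper: true for sample depths of 9 through 15 bits. */
static bool is_9_to_15_bps(int bits_per_sample)
{
    return bits_per_sample >= 9 && bits_per_sample <= 15;
}
```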
31 Mar, 2016 1 commit

swscale: cleanup unused code · 6de58b49

Pedro Arthur authored 8 years ago

Removed previous swscale code under '#ifndef NEW_FILTER'
and removed unused fields of SwsContext

6de58b49

31 May, 2015 1 commit
- ppc: Restrict some Altivec implementations to Big Endian · da60b99a
  Luca Barbato authored 9 years ago
```
In Little Endian the vec_ld/vec_st operations work as
expected only for byte-vectors.
```
  da60b99a
27 Apr, 2015 1 commit

swscale/ppc/swscale_altivec.c: POWER LE support in yuv2planeX_8() delete macro... · 603c8393

Rong Yan authored 9 years ago

swscale/ppc/swscale_altivec.c: POWER LE support in yuv2planeX_8() delete macro GET_VF() it was wrong

GCC had a bug in how it interpreted PPC intrinsics, which has been fixed in GCC 4.9.1. This bug led to
errors in two of our previous patches. We found this when we updated our GCC tools to 4.9.1 and by
reading the related info on the GCC website. We fix our previous errors in two separate commits.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

603c8393

14 Mar, 2015 1 commit
- ppc: libswscale: use LOCAL_ALIGNED instead of DECLARE_ALIGNED · 5d38c628
  Christophe Gisquet authored 9 years ago
```
The latter may yield incorrect code for on-stack variables.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
```
  5d38c628
12 Nov, 2014 1 commit

libswscale/ppc/swscale_altivec.c : fix hScale_altivec_real()... · e74e1460

Rong Yan authored 10 years ago

libswscale/ppc/swscale_altivec.c : fix hScale_altivec_real() yuv2planeX_16_altivec() yuv2planeX_8() for little endian

add macros GET_LS() GET_VF() LOAD_FILTER() LOAD_L1() GET_VF4() FIRST_LOAD() UPDATE_PTR() LOAD_SRCV() LOAD_SRCV8() GET_VFD() for POWER LE
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

e74e1460

29 Aug, 2013 1 commit
- swscale: ppc: Hide arch-specific initialization details · c2503d9c
  Diego Biurrun authored 12 years ago
```
Also give consistent names to init functions.
```
  c2503d9c
08 Oct, 2012 1 commit
- Replace PIX_FMT_* -> AV_PIX_FMT_*, PixelFormat -> AVPixelFormat · 716d413c
  Anton Khirnov authored 12 years ago
  
  716d413c
05 Oct, 2012 1 commit

ppc: swscale: rework yuv2planeX_altivec() · 07eb7e20

Mans Rullgard authored 12 years ago

This gets rid of the variable-length scratch buffer by filtering 16
pixels at a time and writing directly to the destination.  The extra
loads this requires to load the source values are compensated by not
doing a round-trip to memory before shifting.
Signed-off-by: Mans Rullgard <mans@mansr.com>

07eb7e20
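
The idea in 07eb7e20 (filter 16 pixels at a time and store straight to the destination, with no full-width scratch buffer) can be modelled in scalar C. This is a sketch, not the Altivec code: the function name is hypothetical, and the fixed-point constants (dither shifted left by 12, result shifted right by 19) are an assumption based on the scalar 8-bit reference in later FFmpeg versions.

```c
#include <stdint.h>

static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Scalar model of a blocked vertical filter. Each block of up to 16
 * output pixels is accumulated in locals (standing in for vector
 * registers) and written directly to dest. */
static void plane_x_blocked(const int16_t *filter, int filter_size,
                            const int16_t *const *src, uint8_t *dest,
                            int dst_w, const uint8_t *dither, int offset)
{
    for (int x = 0; x < dst_w; x += 16) {
        int n = dst_w - x < 16 ? dst_w - x : 16;
        int32_t acc[16];

        /* Start each accumulator from the dither value. */
        for (int i = 0; i < n; i++)
            acc[i] = dither[(x + i + offset) & 7] << 12;

        /* Re-load the source rows for this block, one filter tap at a time. */
        for (int j = 0; j < filter_size; j++)
            for (int i = 0; i < n; i++)
                acc[i] += src[j][x + i] * filter[j];

        /* Shift, clip and store without a round-trip through memory. */
        for (int i = 0; i < n; i++)
            dest[x + i] = clip_uint8(acc[i] >> 19);
    }
}
```

The trade-off named in the commit message shows up in the loop order: source rows are read once per block rather than once per row over the whole line, but the accumulators never leave the block.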

22 Jul, 2012 1 commit
- swscale: Mark all init functions as av_cold · 5a6e3c03
  Diego Biurrun authored 12 years ago
  
  5a6e3c03
04 Jul, 2012 1 commit

sws: support 12&14 bit planar colorspaces · fa36f334

Michael Niedermayer authored 12 years ago

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

fa36f334

06 Mar, 2012 1 commit

swscale: make filterPos 32bit. · 2254b559

Ronald S. Bultje authored 12 years ago

Fixes overflows for large image sizes.

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org

2254b559
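
The overflow fixed in 2254b559 can be illustrated with a range check. This sketch assumes the earlier `filterPos` entries were 16-bit signed values, which the commit title implies but does not state; the helper name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a source position can be stored in a 16-bit signed entry. */
static bool fits_in_int16(int32_t pos)
{
    return pos >= INT16_MIN && pos <= INT16_MAX;
}
```

Under that assumption, a position such as 1919 fits, while a position of 40000 in a very wide image does not, which is why a 32-bit type is needed.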

21 Feb, 2012 1 commit
- swscale: K&R formatting cosmetics for PowerPC code (part I/II) · 04217de4
  Diego Biurrun authored 12 years ago
  
  04217de4
25 Jan, 2012 1 commit
- cosmetics: Remove some unnecessary block braces. · 33ad8c3c
  Diego Biurrun authored 13 years ago
  
  33ad8c3c
22 Oct, 2011 2 commits
- swscale: update altivec yuv2planeX asm to new per-plane API. · f48b12e0
  Ronald S. Bultje authored 13 years ago
  
  f48b12e0
- Split up yuv2yuvX functions · ff7913ae
  Kieran Kunhya authored 13 years ago
```
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
```
  ff7913ae
25 Sep, 2011 1 commit

ppc: fix some pointer to integer casts · d853e571

Mans Rullgard authored 13 years ago

Use uintptr_t instead of plain int.  Without this change, the
comparisons will come out wrong for pointers in certain ranges.
Fixes random failures on ppc64.  Also fixes some compiler warnings.
Signed-off-by: Mans Rullgard <mans@mansr.com>

d853e571
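
The fix in d853e571 relies on `uintptr_t` being wide enough to hold a pointer, where a plain `int` on a 64-bit target is not. A minimal sketch of pointer checks written that way (helper names are hypothetical):

```c
#include <stdint.h>

/* Alignment test on the full pointer value, not a truncated int. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/* Address comparison that stays correct for any pointer range. */
static int ptr_below(const void *a, const void *b)
{
    return (uintptr_t)a < (uintptr_t)b;
}
```

Casting to `int` instead would discard the upper address bits on ppc64 and could turn a high address into a negative number, which is consistent with the wrong comparisons the commit message describes.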

18 Aug, 2011 1 commit

swscale: split hScale() function pointer into h[cy]Scale(). · 3f04ab4f

Ronald S. Bultje authored 13 years ago

This allows using more specific implementations for chroma/luma, e.g.
we can make assumptions on filterSize being constant, thus avoiding
that test at runtime.

3f04ab4f
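
The split described in 3f04ab4f can be sketched as two function pointers that are selected independently at init time. All names below are hypothetical, and the arithmetic (shift right by 7, clip to 15 bits) is an assumption modelled on the 8-bit scalar horizontal scaler rather than taken from the commit.

```c
#include <stdint.h>

typedef void (*HScaleFn)(int16_t *dst, int dst_w, const uint8_t *src,
                         const int16_t *filter, const int32_t *filter_pos,
                         int filter_size);

/* Upper clip only, as in the assumed reference behaviour. */
static int16_t clip15(int v) { return (int16_t)(v > 32767 ? 32767 : v); }

/* Generic version: filter_size is tested in the inner loop. */
static void hscale_generic(int16_t *dst, int dst_w, const uint8_t *src,
                           const int16_t *filter, const int32_t *filter_pos,
                           int filter_size)
{
    for (int i = 0; i < dst_w; i++) {
        int val = 0;
        for (int j = 0; j < filter_size; j++)
            val += src[filter_pos[i] + j] * filter[filter_size * i + j];
        dst[i] = clip15(val >> 7);
    }
}

/* Specialized version: assumes exactly 4 taps, so no inner-loop test. */
static void hscale_4tap(int16_t *dst, int dst_w, const uint8_t *src,
                        const int16_t *filter, const int32_t *filter_pos,
                        int filter_size)
{
    (void)filter_size;
    for (int i = 0; i < dst_w; i++) {
        const uint8_t *s = src + filter_pos[i];
        const int16_t *f = filter + 4 * i;
        int val = s[0] * f[0] + s[1] * f[1] + s[2] * f[2] + s[3] * f[3];
        dst[i] = clip15(val >> 7);
    }
}

typedef struct {
    HScaleFn hy_scale; /* luma */
    HScaleFn hc_scale; /* chroma */
} ScalerFns;

/* Luma and chroma may have different filter sizes, so each gets its own pick. */
static ScalerFns init_scaler(int luma_taps, int chroma_taps)
{
    ScalerFns fns;
    fns.hy_scale = luma_taps == 4 ? hscale_4tap : hscale_generic;
    fns.hc_scale = chroma_taps == 4 ? hscale_4tap : hscale_generic;
    return fns;
}
```

With a single shared pointer, the specialized version could only be used when both planes happened to have the same filter size.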

12 Aug, 2011 2 commits
- swscale: add dithering to yuv2yuvX_altivec_real · 3304a1e6
  Luca Barbato authored 13 years ago
```
It just does that part in scalar form; I doubt using a vector store
over 2 arrays would speed it up particularly.

The function should be written to not use a scratch buffer.
```
  3304a1e6
- swscale: use 15-bit intermediates for 9/10-bit scaling. · 28c1115a
  Ronald S. Bultje authored 13 years ago
  
  28c1115a
11 Jul, 2011 1 commit

swscale: for >8bit scaling, read in native bit-depth. · 948ccdad

Ronald S. Bultje authored 13 years ago

For 9/10bit, it means we don't have to upscale to 16bit before
actual scaling or pixel format conversion, and thus a performance
gain.

948ccdad

01 Jul, 2011 1 commit

swscale: for >8bit scaling, read in native bit-depth. · 8a8d0ce2

Ronald S. Bultje authored 13 years ago

For 9/10bit, it means we don't have to upscale to 16bit before
actual scaling or pixel format conversion, and thus a performance
gain.

8a8d0ce2

30 Jun, 2011 1 commit

swscale: implement >8bit scaling support. · 45f6ffe5

Ronald S. Bultje authored 13 years ago

This means that precision is retained when scaling between sample
formats with >8 bits per component (48bit RGB, 16bit grayscale,
9/10/16bit YUV).

45f6ffe5

29 Jun, 2011 1 commit

swscale: implement >8bit scaling support. · ef1ee362

Ronald S. Bultje authored 13 years ago

This means that precision is retained when scaling between sample
formats with >8 bits per component (48bit RGB, 16bit grayscale,
9/10/16bit YUV).

ef1ee362

28 Jun, 2011 3 commits

PPC: swscale: disable altivec functions for unsupported formats · 635930d4

Mans Rullgard authored 13 years ago

Signed-off-by: Mans Rullgard <mans@mansr.com>

635930d4

swscale: change prototypes of scaled YUV output functions. · 13a09979

Ronald S. Bultje authored 13 years ago

Remove unused variables "flags" and "dstFormat" in yuv2packed1,
merge source rows per plane for yuv2packed[12], and make every
source argument int16_t (some were invalidly set to uint16_t).
This prevents stack pollution and is part of the Great Evil Plan
to simplify swscale.

13a09979

swscale: split yuv2packedX_altivec in smaller functions. · dc179ec8

Ronald S. Bultje authored 13 years ago

This will likely lead to a considerable performance boost,
since it removes a branch from the inner loop. Part of the
Great Evil Plan to simplify swscale.

dc179ec8

26 Jun, 2011 1 commit
- swscale: remove unused xInc/srcW arguments from hScale(). · 97535ffb
  Ronald S. Bultje authored 13 years ago
  
  97535ffb
07 Jun, 2011 2 commits
- swscale: extract SWS_FULL_CHR_H_INT conditional into init code. · ca364a5b
  Ronald S. Bultje authored 13 years ago
  
  ca364a5b
- swscale: un-special-case yuv2yuvX16_c(). · bda9b20f
  Ronald S. Bultje authored 13 years ago
```
Make yuv2yuvX16_c a function pointer for yuv2yuvX(), so that the
function pointer becomes bitdepth-independent.
```
  bda9b20f
03 Jun, 2011 2 commits
- swscale: enable hScale_altivec_real. · 075d0ae7
  Ronald S. Bultje authored 13 years ago
  
  075d0ae7
- swscale: split out ppc _template.c files from main swscale.c. · 67d80a54
  Ronald S. Bultje authored 13 years ago
  
  67d80a54