Commits · 6e42021128982c9b4bc1f698a326a7f8361d67a0 · Linshizhi / ffmpeg.wasm-core

27 Feb, 2019 1 commit
- h264/arm64: implement missing 4:2:2 chroma loop filter neon functions · 186bd30a
  Janne Grunau authored 5 years ago
  
  186bd30a
20 Feb, 2019 2 commits
- lavc/aarch64/h264dsp_init: Only use neon horizontal intra loopfilter for 4:2:0. · 7e4d3dbe
  Carl Eugen Hoyos authored 6 years ago
  
  7e4d3dbe
- aarch64/h264dsp: change loop filter stride argument to ptrdiff_t · aa844dc4
  James Almer authored 6 years ago
```
This was missed in d5d699abSigned-off-by: James Almer <jamrial@gmail.com>
```
  aa844dc4
19 Feb, 2019 19 commits

aarch64: vp8: Move the vp8dsp makefile entries to the right places · c8bc9d13

Martin Storsjö authored 6 years ago

Even if NEON would be disabled, the init functions should be built
as they are called as long as ARCH_AARCH64 is set.

These functions are part of a generic DSP subsytem, not tied directly
to one decoder. (They should be built if the vp7 decoder is enabled,
even if the vp8 decoder is disabled.)
Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit b4b27dce)

c8bc9d13

aarch64: vp8: Remove superfluous includes · fecf75a5

Martin Storsjö authored 6 years ago

This fixes building with MSVC, which lacks unistd.h.
Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit ad32f7b1)

fecf75a5

aarch64: vp8: Fix assembling with armasm64 · 7ddfa5e9
Martin Storsjö authored 6 years ago
```
Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit 2eeac799)
```
7ddfa5e9

aarch64: vp8: Fix assembling with clang · c950beb6

Martin Storsjö authored 6 years ago

This also partially fixes assembling with MS armasm64 (via
gas-preprocessor).

The movrel macro invocations need to pass the offset via a separate
parameter. Mach-o and COFF relocations don't allow a negative
offset to a symbol, which is handled properly if the offset is passed
via the parameter. If no offset parameter is given, the macro
evaluates to something like "adrp x17, subpel_filters-16+(0)", which
older clang versions also fail to parse (the older clang versions
only support one single offset term, although it can be a parenthesis.
Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit 26d7af4c)

c950beb6

aarch64: vp8: Optimize vp8_idct_add_neon for aarch64 · 7e42d5f0

Martin Storsjö authored 6 years ago

The previous version was a pretty exact translation of the arm
version. This version does do some unnecessary arithemetic (it does
more operations on vectors that are only half filled; it does 4
uaddw and 4 sqxtun instead of 2 of each), but it reduces the overhead
of packing data together (which could be done for free in the arm
version).

This gives a decent speedup on Cortex A53, a minor speedup on
A72 and a very minor slowdown on Cortex A73.

Before:        Cortex A53    A72    A73
vp8_idct_add_neon:   79.7   67.5   65.0
After:
vp8_idct_add_neon:   67.7   64.8   66.7
Signed-off-by: Martin Storsjö <martin@martin.st>

7e42d5f0

aarch64: vp8: Skip saturating in shrn in ff_vp8_idct_add_neon · 49f9c427

Martin Storsjö authored 6 years ago

The original arm version didn't do saturation here. This probably
doesn't make any difference for performance, but reduces the
differences.
Signed-off-by: Martin Storsjö <martin@martin.st>

49f9c427

aarch64: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2 · 37394ef0

Martin Storsjö authored 6 years ago

This makes it similar to put_epel16_v6, and gives a large speedup
on Cortex A53, a minor speedup on A72 and a very minor slowdown on
A73.

Before:                 Cortex A53     A72     A73
vp8_put_epel16_h6v6_neon:   2211.4  1586.5  1431.7
After:
vp8_put_epel16_h6v6_neon:   1736.9  1522.0  1448.1
Signed-off-by: Martin Storsjö <martin@martin.st>

37394ef0

aarch64: vp8: Port bilin functions from arm version · e39a9212

Martin Storsjö authored 6 years ago

                      Cortex A53     A72     A73
vp8_put_bilin4_h_c:        303.8   102.2   161.8
vp8_put_bilin4_h_neon:     100.0    40.9    41.2
vp8_put_bilin4_hv_c:       322.8   201.0   305.9
vp8_put_bilin4_hv_neon:    156.8    72.6    77.0
vp8_put_bilin4_v_c:        304.7   101.7   166.5
vp8_put_bilin4_v_neon:      82.7    41.2    33.0
vp8_put_bilin8_h_c:       1192.7   352.5   623.8
vp8_put_bilin8_h_neon:     213.5    70.2    87.8
vp8_put_bilin8_hv_c:      1098.6   769.2  1041.9
vp8_put_bilin8_hv_neon:    324.0   123.5   146.0
vp8_put_bilin8_v_c:       1193.9   350.4   617.7
vp8_put_bilin8_v_neon:     183.9    60.7    64.7
vp8_put_bilin16_h_c:      2353.1   671.2  1223.3
vp8_put_bilin16_h_neon:    261.9   140.7   145.0
vp8_put_bilin16_hv_c:     2453.2  1470.9  2355.2
vp8_put_bilin16_hv_neon:   383.9   196.0   217.0
vp8_put_bilin16_v_c:      2349.3   669.8  1251.2
vp8_put_bilin16_v_neon:    202.9   110.7    96.2
Signed-off-by: Martin Storsjö <martin@martin.st>

e39a9212

aarch64: vp8: Port epel4 functions from arm version · 58d15492

Martin Storsjö authored 6 years ago

                      Cortex A53    A72    A73
vp8_put_epel4_h4_c:        631.4  291.7  367.8
vp8_put_epel4_h4_neon:     241.0  131.0  155.7
vp8_put_epel4_h4v4_c:      967.5  529.3  667.7
vp8_put_epel4_h4v4_neon:   429.3  241.8  279.7
vp8_put_epel4_h4v6_c:     1374.7  657.5  864.5
vp8_put_epel4_h4v6_neon:   515.5  295.5  334.7
vp8_put_epel4_h6_c:        851.0  421.0  486.0
vp8_put_epel4_h6_neon:     321.5  195.0  217.7
vp8_put_epel4_h6v4_c:     1111.3  621.1  781.2
vp8_put_epel4_h6v4_neon:   539.2  328.0  365.3
vp8_put_epel4_h6v6_c:     1561.3  763.3  999.7
vp8_put_epel4_h6v6_neon:   645.5  401.0  434.7
vp8_put_epel4_v4_c:        663.8  298.3  357.0
vp8_put_epel4_v4_neon:     116.0   81.5   72.5
vp8_put_epel4_v6_c:        870.5  437.0  507.4
vp8_put_epel4_v6_neon:     147.7  108.8   92.0
Signed-off-by: Martin Storsjö <martin@martin.st>

58d15492

aarch64: vp8: Port missing epel8 functions from arm version · cc7ba00c

Martin Storsjö authored 6 years ago

                      Cortex A53     A72     A73
vp8_put_epel8_h4_c:       2594.8  1159.6  1374.8
vp8_put_epel8_h4_neon:     506.4   244.2   314.0
vp8_put_epel8_h6_c:       3445.8  1677.1  1811.3
vp8_put_epel8_h6_neon:     634.4   371.7   433.0
vp8_put_epel8_v4_c:       2614.0  1174.8  1378.0
vp8_put_epel8_v4_neon:     321.0   221.7   235.8
vp8_put_epel8_v6_c:       3635.5  1703.0  2079.2
vp8_put_epel8_v6_neon:     416.9   317.0   295.5
Signed-off-by: Martin Storsjö <martin@martin.st>

cc7ba00c

aarch64: vp8: Port vp8_luma_dc_wht and vp8_idct_dc_add4uv from arm version · 52c9b0a6

Martin Storsjö authored 6 years ago

                     Cortex A53    A72    A73
vp8_luma_dc_wht_c:        115.7   75.7   90.7
vp8_luma_dc_wht_neon:      60.7   41.2   45.7
vp8_idct_dc_add4uv_c:     376.1  262.9  282.5
vp8_idct_dc_add4uv_neon:   52.0   29.0   37.0
Signed-off-by: Martin Storsjö <martin@martin.st>

52c9b0a6

aarch64: vp8: Fix a typo in a comment · c513fcd7
Martin Storsjö authored 6 years ago
```
Signed-off-by: Martin Storsjö <martin@martin.st>
```
c513fcd7
aarch64: vp8: Reorder the function pointer inits to match the arm original · f1011ea2
Martin Storsjö authored 6 years ago
```
Signed-off-by: Martin Storsjö <martin@martin.st>
```
f1011ea2

aarch64: vp8: Move the vp8dsp makefile entries to the right places · b4b27dce

Martin Storsjö authored 6 years ago

Even if NEON would be disabled, the init functions should be built
as they are called as long as ARCH_AARCH64 is set.

These functions are part of a generic DSP subsytem, not tied directly
to one decoder. (They should be built if the vp7 decoder is enabled,
even if the vp8 decoder is disabled.)
Signed-off-by: Martin Storsjö <martin@martin.st>

b4b27dce

aarch64: vp8: Remove superfluous includes · ad32f7b1

Martin Storsjö authored 6 years ago

This fixes building with MSVC, which lacks unistd.h.
Signed-off-by: Martin Storsjö <martin@martin.st>

ad32f7b1

aarch64: vp8: Use the proper aarch64 form for conditional branches · 85bfaa49

Martin Storsjö authored 6 years ago

The previous form also does seem to assemble on current tools,
but I think it might fail on some older aarch64 tools.
Signed-off-by: Martin Storsjö <martin@martin.st>

85bfaa49

aarch64: vp8: Fix assembling with armasm64 · 2eeac799
Martin Storsjö authored 6 years ago
```
Signed-off-by: Martin Storsjö <martin@martin.st>
```
2eeac799

aarch64: vp8: Fix assembling with clang · 26d7af4c

Martin Storsjö authored 6 years ago

This also partially fixes assembling with MS armasm64 (via
gas-preprocessor).

The movrel macro invocations need to pass the offset via a separate
parameter. Mach-o and COFF relocations don't allow a negative
offset to a symbol, which is handled properly if the offset is passed
via the parameter. If no offset parameter is given, the macro
evaluates to something like "adrp x17, subpel_filters-16+(0)", which
older clang versions also fail to parse (the older clang versions
only support one single offset term, although it can be a parenthesis.
Signed-off-by: Martin Storsjö <martin@martin.st>

26d7af4c

libavcodec: vp8 neon optimizations for aarch64 · 0801853e

Magnus Röös authored 6 years ago

Partial port of the ARM Neon for aarch64.

Benchmarks from fate:

benchmarking with Linux Perf Monitoring API
nop: 58.6
checkasm: using random seed 1760970128
NEON:
 - vp8dsp.idct       [OK]
 - vp8dsp.mc         [OK]
 - vp8dsp.loopfilter [OK]
checkasm: all 21 tests passed
vp8_idct_add_c: 201.6
vp8_idct_add_neon: 83.1
vp8_idct_dc_add_c: 107.6
vp8_idct_dc_add_neon: 33.8
vp8_idct_dc_add4y_c: 426.4
vp8_idct_dc_add4y_neon: 59.4
vp8_loop_filter8uv_h_c: 688.1
vp8_loop_filter8uv_h_neon: 216.3
vp8_loop_filter8uv_inner_h_c: 649.3
vp8_loop_filter8uv_inner_h_neon: 195.3
vp8_loop_filter8uv_inner_v_c: 544.8
vp8_loop_filter8uv_inner_v_neon: 131.3
vp8_loop_filter8uv_v_c: 706.1
vp8_loop_filter8uv_v_neon: 141.1
vp8_loop_filter16y_h_c: 668.8
vp8_loop_filter16y_h_neon: 242.8
vp8_loop_filter16y_inner_h_c: 647.3
vp8_loop_filter16y_inner_h_neon: 224.6
vp8_loop_filter16y_inner_v_c: 647.8
vp8_loop_filter16y_inner_v_neon: 128.8
vp8_loop_filter16y_v_c: 721.8
vp8_loop_filter16y_v_neon: 154.3
vp8_loop_filter_simple_h_c: 387.8
vp8_loop_filter_simple_h_neon: 187.6
vp8_loop_filter_simple_v_c: 384.1
vp8_loop_filter_simple_v_neon: 78.6
vp8_put_epel8_h4v4_c: 3971.1
vp8_put_epel8_h4v4_neon: 855.1
vp8_put_epel8_h4v6_c: 5060.1
vp8_put_epel8_h4v6_neon: 989.6
vp8_put_epel8_h6v4_c: 4320.8
vp8_put_epel8_h6v4_neon: 1007.3
vp8_put_epel8_h6v6_c: 5449.3
vp8_put_epel8_h6v6_neon: 1158.1
vp8_put_epel16_h6_c: 6683.8
vp8_put_epel16_h6_neon: 831.8
vp8_put_epel16_h6v6_c: 11110.8
vp8_put_epel16_h6v6_neon: 2214.8
vp8_put_epel16_v6_c: 7024.8
vp8_put_epel16_v6_neon: 799.6
vp8_put_pixels8_c: 112.8
vp8_put_pixels8_neon: 78.1
vp8_put_pixels16_c: 131.3
vp8_put_pixels16_neon: 129.8

This contains a fix to include guards by Carl Eugen Hoyos.
Signed-off-by: Martin Storsjö <martin@martin.st>

0801853e

31 Jan, 2019 2 commits

lavc/aarch64/vp8dsp: Fix the include guard. · ed20fbcd
Carl Eugen Hoyos authored 6 years ago
```
Fixes fate-source.
```
ed20fbcd

libavcodec: vp8 neon optimizations for aarch64 · 833fed52

Magnus Röös authored 6 years ago

Partial port of the ARM Neon for aarch64.

Benchmarks from fate:

benchmarking with Linux Perf Monitoring API
nop: 58.6
checkasm: using random seed 1760970128
NEON:
 - vp8dsp.idct       [OK]
 - vp8dsp.mc         [OK]
 - vp8dsp.loopfilter [OK]
checkasm: all 21 tests passed
vp8_idct_add_c: 201.6
vp8_idct_add_neon: 83.1
vp8_idct_dc_add_c: 107.6
vp8_idct_dc_add_neon: 33.8
vp8_idct_dc_add4y_c: 426.4
vp8_idct_dc_add4y_neon: 59.4
vp8_loop_filter8uv_h_c: 688.1
vp8_loop_filter8uv_h_neon: 216.3
vp8_loop_filter8uv_inner_h_c: 649.3
vp8_loop_filter8uv_inner_h_neon: 195.3
vp8_loop_filter8uv_inner_v_c: 544.8
vp8_loop_filter8uv_inner_v_neon: 131.3
vp8_loop_filter8uv_v_c: 706.1
vp8_loop_filter8uv_v_neon: 141.1
vp8_loop_filter16y_h_c: 668.8
vp8_loop_filter16y_h_neon: 242.8
vp8_loop_filter16y_inner_h_c: 647.3
vp8_loop_filter16y_inner_h_neon: 224.6
vp8_loop_filter16y_inner_v_c: 647.8
vp8_loop_filter16y_inner_v_neon: 128.8
vp8_loop_filter16y_v_c: 721.8
vp8_loop_filter16y_v_neon: 154.3
vp8_loop_filter_simple_h_c: 387.8
vp8_loop_filter_simple_h_neon: 187.6
vp8_loop_filter_simple_v_c: 384.1
vp8_loop_filter_simple_v_neon: 78.6
vp8_put_epel8_h4v4_c: 3971.1
vp8_put_epel8_h4v4_neon: 855.1
vp8_put_epel8_h4v6_c: 5060.1
vp8_put_epel8_h4v6_neon: 989.6
vp8_put_epel8_h6v4_c: 4320.8
vp8_put_epel8_h6v4_neon: 1007.3
vp8_put_epel8_h6v6_c: 5449.3
vp8_put_epel8_h6v6_neon: 1158.1
vp8_put_epel16_h6_c: 6683.8
vp8_put_epel16_h6_neon: 831.8
vp8_put_epel16_h6v6_c: 11110.8
vp8_put_epel16_h6v6_neon: 2214.8
vp8_put_epel16_v6_c: 7024.8
vp8_put_epel16_v6_neon: 799.6
vp8_put_pixels8_c: 112.8
vp8_put_pixels8_neon: 78.1
vp8_put_pixels16_c: 131.3
vp8_put_pixels16_neon: 129.8
Signed-off-by: Magnus Röös <mla2.roos@gmail.com>

833fed52

26 Jan, 2019 3 commits

h264/aarch64: add intra loop filter neon asm · 28a8b541

Janne Grunau authored 6 years ago

Add my neon asm from x264 relicensed under the LGPL 2.1 or later. Ported
(x264 uses nv12 chroma) and optimized.

Cycle count for checkasm --bench on a Snapdragon 820e:
h264_h_loop_filter_luma_intra_8bpp_c: 60.0
h264_h_loop_filter_luma_intra_8bpp_neon: 54.2
h264_v_loop_filter_luma_intra_8bpp_c: 148.3
h264_v_loop_filter_luma_intra_8bpp_neon: 73.8
h264_h_loop_filter_chroma_intra_8bpp_c: 27.8
h264_h_loop_filter_chroma_intra_8bpp_neon: 21.4
h264_h_loop_filter_chroma_mbaff_intra_8bpp_c: 15.8
h264_h_loop_filter_chroma_mbaff_intra_8bpp_neon: 15.7
h264_v_loop_filter_chroma_intra_8bpp_c: 45.8
h264_v_loop_filter_chroma_intra_8bpp_neon: 17.3

28a8b541

h264/aarch64: optimize neon loop filter · 846c3d6a

Janne Grunau authored 6 years ago

Exit as soon as possible if no filtering will be done.

Improves the checkasm --bench cycle count on a Snapdragon 820e:
h264_h_loop_filter_luma_8bpp_c:      72.4 ->  72.5
h264_h_loop_filter_luma_8bpp_neon:   97.1 ->  56.3
h264_v_loop_filter_luma_8bpp_c:     174.0 -> 173.5
h264_v_loop_filter_luma_8bpp_neon:   62.9 ->  60.9
h264_h_loop_filter_chroma_8bpp_c:    30.2 ->  30.3
h264_h_loop_filter_chroma_8bpp_neon: 51.6 ->  25.7
h264_v_loop_filter_chroma_8bpp_c:    57.3 ->  57.3
h264_v_loop_filter_chroma_8bpp_neon: 28.0 ->  24.0

846c3d6a

h264/aarch64: sign extend int stride in loop filter asm · bb515e3a
Janne Grunau authored 6 years ago

bb515e3a

03 Jan, 2019 1 commit

libavcodec: Remove dynamic relocs from aarch64/h264idct_neon.S · 6fcf8131

Manoj Gupta authored 6 years ago

Some of the assembly functions e.g. ff_h264_idct_dc_add_neon
has code like:
        movrel          x14, X(ff_h264_idct_add_neon)

Linker cannot resolve them fully at link time and emits dynamic
relocations.
Use explicit labels instead so that no dynamic relocations are
needed at all.

This avoids lld complains about text relocations.

For background, see https://crbug.com/917919Signed-off-by: Manoj Gupta <manojgupta@chromium.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>

6fcf8131

13 Jul, 2018 1 commit

lavc/aarch64/h264dsp_init_aarch64: Fix weight function prototypes. · 0576ef46

Carl Eugen Hoyos authored 7 years ago

Fixes the following warnings:
libavcodec/aarch64/h264dsp_init_aarch64.c: In function ‘ff_h264dsp_init_aarch64’:
libavcodec/aarch64/h264dsp_init_aarch64.c:84:38: warning: assignment from incompatible pointer type [enabled by default]
         c->weight_h264_pixels_tab[0] = ff_weight_h264_pixels_16_neon;
                                      ^
libavcodec/aarch64/h264dsp_init_aarch64.c:85:38: warning: assignment from incompatible pointer type [enabled by default]
         c->weight_h264_pixels_tab[1] = ff_weight_h264_pixels_8_neon;
                                      ^
libavcodec/aarch64/h264dsp_init_aarch64.c:86:38: warning: assignment from incompatible pointer type [enabled by default]
         c->weight_h264_pixels_tab[2] = ff_weight_h264_pixels_4_neon;
                                      ^
libavcodec/aarch64/h264dsp_init_aarch64.c:88:40: warning: assignment from incompatible pointer type [enabled by default]
         c->biweight_h264_pixels_tab[0] = ff_biweight_h264_pixels_16_neon;
                                        ^
libavcodec/aarch64/h264dsp_init_aarch64.c:89:40: warning: assignment from incompatible pointer type [enabled by default]
         c->biweight_h264_pixels_tab[1] = ff_biweight_h264_pixels_8_neon;
                                        ^
libavcodec/aarch64/h264dsp_init_aarch64.c:90:40: warning: assignment from incompatible pointer type [enabled by default]
         c->biweight_h264_pixels_tab[2] = ff_biweight_h264_pixels_4_neon;
                                        ^

0576ef46

26 Jan, 2018 1 commit
- lavc/aarch64/sbrdsp_neon: fix build on old binutils · 77237504
  Rodger Combs authored 7 years ago
  
  77237504
18 Oct, 2017 1 commit

aarch64: Remove a dot from a label · 73251063

Martin Storsjö authored 7 years ago

This fixes building with armasm64 (when run through gas-preprocessor).
Signed-off-by: Martin Storsjö <martin@martin.st>

73251063

03 Jul, 2017 1 commit

lavc/aarch64: add sbrdsp neon implementation · 0a24d7ca

Matthieu Bouron authored 7 years ago

autocorrelate_c: 644.0
autocorrelate_neon: 420.0
hf_apply_noise_0_c: 1688.5
hf_apply_noise_0_neon: 1498.6
hf_apply_noise_1_c: 1691.2
hf_apply_noise_1_neon: 1500.6
hf_apply_noise_2_c: 1688.1
hf_apply_noise_2_neon: 1500.3
hf_apply_noise_3_c: 1696.6
hf_apply_noise_3_neon: 1502.2
hf_g_filt_c: 2117.8
hf_g_filt_neon: 1218.7
hf_gen_c: 4573.4
hf_gen_neon: 2461.0
neg_odd_64_c: 72.0
neg_odd_64_neon: 64.7
qmf_deint_bfly_c: 1107.6
qmf_deint_bfly_neon: 291.6
qmf_deint_neg_c: 210.4
qmf_deint_neg_neon: 107.4
qmf_post_shuffle_c: 163.0
qmf_post_shuffle_neon: 107.7
qmf_pre_shuffle_c: 120.5
qmf_pre_shuffle_neon: 110.7
sum64x5_c: 1361.6
sum64x5_neon: 435.4
sum_square_c: 1686.4
sum_square_neon: 787.2

0a24d7ca

28 Jun, 2017 2 commits

lavc/aacpsdsp: use ptrdiff_t for stride in hybrid_analysis · b12a3617
Clément Bœsch authored 7 years ago

b12a3617

lavc/aarch64: add a few SIMD functions for AAC PS · ff0ecef6

Clément Bœsch authored 7 years ago

☭ tests/checkasm/checkasm --bench --test=aacpsdsp
checkasm: using random seed 3318985180
MMX implied by specified flags
MMX implied by specified flags
NEON:
 - aacpsdsp.add_squares        [OK]
 - aacpsdsp.mul_pair_single    [OK]
 - aacpsdsp.hybrid_analysis    [OK]
 - aacpsdsp.stereo_interpolate [OK]
checkasm: all 5 tests passed
nop: 10.0
ps_add_squares_c: 63221.2
ps_add_squares_neon: 22311.7
ps_hybrid_analysis_c: 2466.6
ps_hybrid_analysis_neon: 1521.9
ps_mul_pair_single_c: 68592.0
ps_mul_pair_single_neon: 17426.6
ps_stereo_interpolate_c: 72344.3
ps_stereo_interpolate_neon: 72308.8
ps_stereo_interpolate_ipdopd_c: 117415.2
ps_stereo_interpolate_ipdopd_neon: 113386.3

ff0ecef6

21 Jun, 2017 2 commits

aarch64: vp9 16bpp: Fix assembling with Xcode 6.2 and older · 9e85c5d6

Memphiz authored 7 years ago

Properly use the b.eq form instead of the nonstandard form (which
both gas and newer clang accept though), and expand the register
lists that used a range (which the Xcode 6.2 clang, based on clang
3.5 svn, didn't support).
Signed-off-by: Martin Storsjö <martin@martin.st>

9e85c5d6

aarch64: vp9: Fix assembling with Xcode 6.2 and older · 998609dd

Memphiz authored 7 years ago

Properly use the b.eq/b.ge forms instead of the nonstandard forms
(which both gas and newer clang accept though), and expand the
register list that used a range (which the Xcode 6.2 clang, based
on clang 3.5 svn, didn't support).

This is cherrypicked from libav commit
a970f9de.
Signed-off-by: Martin Storsjö <martin@martin.st>

998609dd

20 Jun, 2017 1 commit

aarch64: vp9: Fix assembling with Xcode 6.2 and older · a970f9de

Memphiz authored 7 years ago

Properly use the b.eq/b.ge forms instead of the nonstandard forms
(which both gas and newer clang accept though), and expand the
register list that used a range (which the Xcode 6.2 clang, based
on clang 3.5 svn, didn't support).
Signed-off-by: Martin Storsjö <martin@martin.st>

a970f9de

14 Jun, 2017 1 commit
- lavc/aarch64/simple_idct: fix build with Xcode 7.2 · 20400835
  Matthieu Bouron authored 7 years ago
  
  20400835
13 Jun, 2017 1 commit
- lavc/aarch64/simple_idct: fix idct_col4_top coefficient · 8aa60606
  Matthieu Bouron authored 7 years ago
```
Fixes regression introduced by 5d0b8b1a.
```
  8aa60606
11 May, 2017 1 commit

lavc/aarch64/simple_idct: fix iOS build without gas-preprocessor · 5d0b8b1a

Matthieu Bouron authored 7 years ago

Separates macro arguments with commas and passes .4H/.8H as macro
arguments instead of 4H/8H (the later form being interpreted as an
hexadecimal value).

Fixes ticket #6324.
Suggested-by: Martin Storsjö <martin@martin.st>

5d0b8b1a