1. 08 Jan, 2014 5 commits
    • vp9: make mv bounds 32bit. · 024fac5c
      Ronald S. Bultje authored
      Fixes an assert in the file from trac ticket 3188.
    • vp9: reset contextual caches on frame size change with mt enabled. · 5b0fc078
      Ronald S. Bultje authored
      Fixes the crash/valgrind errors in trac ticket 3188 and the hang in ticket 3274.
    • vp9/x86: idct_32x32_add_ssse3 sub-8x8-idct. · 04a187fb
      Ronald S. Bultje authored
      Runtime of the full 32x32 idct goes from 2446 to 2441 cycles (intra) or
      from 1425 to 1306 cycles (inter). Overall runtime is not significantly
      affected.
    • vp9/x86: idct_32x32_add_ssse3 sub-16x16-idct. · 37b001d1
      Ronald S. Bultje authored
      Runtime of all IDCTs together goes from 3327 to 2473 cycles (intra, i.e.
      ~35% faster) or from 2312 to 1448 cycles (inter, i.e. ~60% faster). Total
      decode time of ped1080p.webm goes from 8.086sec to 7.974sec (1.4% faster).
    • vp9/x86: idct_32x32_add_ssse3. · e84d14df
      Ronald S. Bultje authored
      Sub-IDCTs will follow later. Decode time of ped1080p.webm goes from
      9.295s to 8.191s (13.5% faster). For the DC-only form, the IDCT itself
      goes from 4372 (intra) or 4337 (inter) cycles to 403 (intra) or 329
      (inter) cycles. For the no-DC form, it goes from 23755 (intra) or 23723
      (inter) cycles to 3497 (intra) or 3607 (inter) cycles. Averaged over all
      32x32s together, it goes from 23393 (intra) or 16612 (inter) cycles to
      3449 (intra) or 2392 (inter) cycles, i.e. ~7x faster (all tests done on
      ped1080p.webm).
2. 07 Jan, 2014 2 commits
3. 06 Jan, 2014 33 commits