1. 16 Mar, 2014 1 commit
  2. 15 Mar, 2014 2 commits
  3. 14 Mar, 2014 1 commit
  4. 25 Jan, 2014 1 commit
  5. 23 Jan, 2014 1 commit
  6. 03 Nov, 2013 1 commit
  7. 24 Aug, 2013 1 commit
  8. 20 Aug, 2013 1 commit
  9. 21 Mar, 2013 7 commits
  10. 08 Mar, 2013 1 commit
  11. 25 Feb, 2013 1 commit
  12. 19 Feb, 2013 1 commit
  13. 18 Feb, 2013 1 commit
  14. 15 Feb, 2013 1 commit
    • h264: deMpegEncContextize · 2c541554
      Anton Khirnov authored
      Most of the changes are just trivial replacements of
      fields from MpegEncContext with equivalent fields in H264Context.
      Everything in h264* other than h264.c consists of those trivial changes.
      
      The nontrivial parts are:
      1) extracting a simplified version of the frame management code from
         mpegvideo.c. We don't need last/next_picture anymore, since h264 uses
         its own more complex system already and those were set only to appease
         the mpegvideo parts.
      2) some tables that need to be allocated/freed in appropriate places.
      3) hwaccels -- mostly trivial replacements.
         for dxva, the draw_horiz_band() call is moved from
         ff_dxva2_common_end_frame() to per-codec end_frame() callbacks,
         because it's now different for h264 and MpegEncContext-based
         decoders.
      4) svq3 -- it does not use h264's complex reference system, so I just
         added some very simplistic frame management instead and dropped the
         use of ff_h264_frame_start(). Because of this I also had to move some
         initialization code to svq3.
      
      Additional fixes for chroma format and bit depth changes by
      Janne Grunau <janne-libav@jannau.net>
      Signed-off-by: Anton Khirnov <anton@khirnov.net>
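The "very simplistic frame management" mentioned in point 4 could look roughly like the sketch below. It is only an illustration under assumptions: the `Demo*` types and function names are hypothetical and are not taken from svq3.c. It keeps three picture slots and rotates the pointers after each reference frame.

```c
#include <stddef.h>

/* Hypothetical stand-in for a decoded picture; not a libavcodec type. */
typedef struct DemoPicture {
    int id;
} DemoPicture;

/* Hypothetical decoder context holding three picture slots. */
typedef struct DemoContext {
    DemoPicture *cur_pic;   /* frame currently being decoded            */
    DemoPicture *last_pic;  /* older reference                          */
    DemoPicture *next_pic;  /* most recently decoded reference          */
    DemoPicture  storage[3];
} DemoContext;

void demo_init(DemoContext *c)
{
    for (int i = 0; i < 3; i++)
        c->storage[i].id = i;
    c->cur_pic  = &c->storage[0];
    c->last_pic = &c->storage[1];
    c->next_pic = &c->storage[2];
}

/* After a reference frame has been decoded into cur_pic:
 * the old "next" becomes "last", the new frame becomes "next",
 * and the slot that held the old "last" is recycled as "cur". */
void demo_rotate_after_reference(DemoContext *c)
{
    DemoPicture *recycled = c->last_pic;
    c->last_pic = c->next_pic;
    c->next_pic = c->cur_pic;
    c->cur_pic  = recycled;
}
```

A caller would run `demo_init()` once and call `demo_rotate_after_reference()` after every frame that later frames may reference. Frames that are never used as references would skip the rotation.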
  15. 31 Jan, 2013 1 commit
  16. 23 Jan, 2013 1 commit
  17. 18 Nov, 2012 1 commit
  18. 05 Oct, 2012 2 commits
  19. 01 Oct, 2012 2 commits
  20. 26 Jul, 2012 1 commit
  21. 23 Jun, 2012 1 commit
  22. 28 Apr, 2012 4 commits
    • h264: new assembly version of get_cabac for x86_64 with PIC · 82c71913
      Roland Scheidegger authored
      This adds a hand-optimized assembly version for get_cabac much like the
      existing one, but it works if the table offsets are RIP-relative.
      Compared to the non-RIP-relative version this adds 2 lea instructions
      and it needs one extra register.
      There is a surprisingly large performance improvement over the c version (more
      so than the generated assembly seems to suggest) just in get_cabac, I measured
      roughly 40% faster for get_cabac on a K8. However, overall the difference is
      not that big, I measured roughly 5% on a test clip on a K8 and a Core2.
      Hopefully it still compiles on x86 32bit...
      Now that only one table is used, there's some chance even darwin as compiles
      this (apparently the label arithmetic used previously doesn't work if it
      involves symbols defined in a different file, thanks to Ronald S. Bultje for
      helping me with this).
      Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
    • h264: use one table instead of several for cabac functions · 7f668cd2
      Roland Scheidegger authored
      The reason is this is easier for PIC code (in particular on darwin...).
      Keep the old names as pointers (static in cabac_functions.h so gcc
      knows these are just immediate offsets) so the c code can nicely stay the same
      (alternatively could use offsets directly in the functions needing the
      tables). This should produce the same code as before with non-pic and better
      code (confirmed) with pic.
      
      The assembly uses the new table but still won't work for PIC case.
      Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
    • h264: new assembly version of get_cabac for x86_64 with PIC · 9b9df1cd
      Roland Scheidegger authored
      This adds a hand-optimized assembly version for get_cabac much like the
      existing one, but it works if the table offsets are RIP-relative.
      Compared to the non-RIP-relative version this adds 2 lea instructions
      and it needs one extra register. get_cabac() gets about 40% faster, for
      an overall speedup of about 5%.
      Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
    • h264: use one table instead of several for cabac functions · 14e9ffc1
      Roland Scheidegger authored
      The reason is this is easier for PIC code (in particular on darwin...).
      Keep the old names as pointers (static in cabac_functions.h so gcc
      knows these are just immediate offsets) so the c code can nicely stay the same
      (alternatively could use offsets directly in the functions needing the
      tables). This should produce the same code as before with non-pic and better
      code (confirmed) with pic.
      
      The assembly uses the new table but still won't work for PIC case.
      Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
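The "one table instead of several" idea from the commits above can be sketched as follows. All names, sizes and values here are hypothetical placeholders, not the real CABAC tables. The point is that only one symbol needs to be addressed (which is what helps PIC code), while the old per-table names remain available as constant pointers at fixed offsets.

```c
#include <stdint.h>

/* Hypothetical layout: offsets of each sub-table inside the merged table. */
enum {
    DEMO_NORM_SHIFT_OFFSET = 0,
    DEMO_NORM_SHIFT_SIZE   = 8,
    DEMO_LPS_RANGE_OFFSET  = DEMO_NORM_SHIFT_OFFSET + DEMO_NORM_SHIFT_SIZE,
    DEMO_LPS_RANGE_SIZE    = 4,
    DEMO_TABLES_SIZE       = DEMO_LPS_RANGE_OFFSET + DEMO_LPS_RANGE_SIZE
};

/* A single exported symbol holds every sub-table back to back.
 * The values are placeholders chosen only for this example. */
const uint8_t demo_cabac_tables[DEMO_TABLES_SIZE] = {
    /* "norm_shift" part */
    3, 2, 1, 1, 0, 0, 0, 0,
    /* "lps_range" part */
    10, 20, 30, 40
};

/* The old names survive as static constant pointers into the merged table,
 * so the compiler can treat them as the base symbol plus a fixed offset. */
static const uint8_t * const demo_norm_shift =
    demo_cabac_tables + DEMO_NORM_SHIFT_OFFSET;
static const uint8_t * const demo_lps_range =
    demo_cabac_tables + DEMO_LPS_RANGE_OFFSET;

/* Existing C code keeps indexing the old names unchanged. */
int demo_lookup_shift(int i) { return demo_norm_shift[i]; }
int demo_lookup_lps(int i)   { return demo_lps_range[i]; }
```

With this layout, assembly code needs the address of `demo_cabac_tables` only once and can reach every sub-table by adding a constant, instead of resolving a separate symbol per table.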
  23. 21 Apr, 2012 1 commit
  24. 20 Apr, 2012 1 commit
    • h264: assembly version of get_cabac for x86_64 with PIC (v4) · a812b599
      Roland Scheidegger authored
      This adds a hand-optimized assembly version for get_cabac much like the
      existing one, but it works if the table offsets are RIP-relative.
      Compared to the non-RIP-relative version this adds 2 lea instructions
      and it needs one extra register.
      There is a surprisingly large performance improvement over the c version (more
      so than the generated assembly seems to suggest) just in get_cabac, I measured
      roughly 40% faster for get_cabac on a K8. However, overall the difference is
      not that big, I measured roughly 5% on a test clip on a K8 and a Core2.
      Hopefully it still compiles on x86 32bit...
      v2: incorporated feedback from Loren Merritt to avoid rip-relative movs
      for every table, and got rid of unnecessary @GOTPCREL.
      v3: apply similar fixes to the decode_significance functions, and use
      same macro arguments for non-pic case.
      v4: prettify inline asm arguments, add a non-fast-cmov version (as I expect
      the c code to be faster otherwise since both cmov and sbb suck hard on a
      Prescott, even can't construct the mask with a 64bit shift as that's just as
      terrible - it's quite difficult to find usable instructions on that chip...).
      This is tested to work but not on a P4, in theory it _should_ be fast there.
      Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
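The v4 note weighs cmov, sbb and a shift-built mask as ways to avoid a branch. As a rough C illustration of the mask approach only (this is not the assembly from the commit, and the `demo_` names are hypothetical), a comparison can be turned into an all-ones or all-zeros mask and then used to select between two values:

```c
#include <stdint.h>

/* Returns -1 (all bits set) if a < b, otherwise 0, without branching.
 * The difference is computed in 64 bits so it cannot overflow; its sign
 * bit is then moved to bit 0 with an unsigned shift and negated. */
static inline int32_t demo_less_mask(int32_t a, int32_t b)
{
    int64_t diff = (int64_t)a - (int64_t)b;
    return -(int32_t)((uint64_t)diff >> 63);
}

/* Branchless select: yields if_less when a < b, otherwise the other value. */
int32_t demo_select(int32_t a, int32_t b, int32_t if_less, int32_t otherwise)
{
    int32_t mask = demo_less_mask(a, b);
    return (if_less & mask) | (otherwise & ~mask);
}
```

Whether a compiler turns this into cmov, sbb, or shifts depends on the target and flags, which is exactly the kind of per-CPU trade-off the commit message describes.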
  25. 05 Apr, 2012 1 commit
  26. 28 Mar, 2012 1 commit
  27. 29 Feb, 2012 1 commit
  28. 10 Feb, 2012 1 commit
    • h264: disallow constrained intra prediction modes for luma. · 45b7bd7c
      Ronald S. Bultje authored
      Conversion of the luma intra prediction mode to one of the constrained
      ("alzheimer") ones can happen by crafting special bitstreams, causing
      a crash because we'll call a NULL function pointer for 16x16 block intra
      prediction, since constrained intra prediction functions are only
      implemented for chroma (8x8 blocks).
      
      Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
      CC: libav-stable@libav.org
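The class of bug described above, an attacker-controlled mode indexing a function pointer table that has NULL entries, can be illustrated with a minimal sketch. The table, the mode counts and every `demo_` name are assumptions made for this example and do not reproduce the actual fix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical prediction function type and mode counts. */
typedef void (*demo_pred_fn)(uint8_t *dst, int stride);

enum {
    DEMO_N_MODES      = 7, /* total slots, including chroma-only modes */
    DEMO_N_LUMA_MODES = 4  /* modes that are valid for luma            */
};

/* Placeholder predictor: writes a mid-grey value to the first sample. */
static void demo_pred_fill(uint8_t *dst, int stride)
{
    (void)stride;
    dst[0] = 128;
}

/* Only the first four entries are implemented; the rest stay NULL,
 * mirroring a table where some modes exist only for chroma. */
static const demo_pred_fn demo_luma_pred[DEMO_N_MODES] = {
    demo_pred_fill, demo_pred_fill, demo_pred_fill, demo_pred_fill
};

/* Validate the mode taken from the bitstream before the indirect call.
 * Returns 0 on success and -1 if the mode is not allowed for luma. */
int demo_predict_luma(int mode, uint8_t *dst, int stride)
{
    if (mode < 0 || mode >= DEMO_N_LUMA_MODES || !demo_luma_pred[mode])
        return -1;
    demo_luma_pred[mode](dst, stride);
    return 0;
}
```

Rejecting the mode at parse time, before any prediction function is looked up, keeps a crafted bitstream from ever reaching a NULL table entry.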