1. 08 Nov, 2016 1 commit
  2. 18 Oct, 2016 1 commit
  3. 12 Oct, 2016 1 commit
  4. 08 Mar, 2016 1 commit
  5. 20 Jan, 2016 1 commit
  6. 18 Dec, 2015 1 commit
  7. 14 Dec, 2015 1 commit
  8. 17 Oct, 2015 1 commit
    • aacenc: add support for encoding files using Long Term Prediction · 27d23ae0
      Rostislav Pehlivanov authored
      Long Term Prediction allows for prediction of spectral coefficients
      via the previously decoded time-dependent samples. This feature
      works well with harmonic content 2 or more frames long, like speech,
      human or non-human, piano music or any constant tones at very low
      bitrates.
      
      It should be noted that the current coder is highly efficient and
      the rate control system is unable to encode files at extremely
      low bitrates (less than 14kbps seems to be impossible), so this
      extension isn't capable of optimum operation. A dramatic difference
      is observable with some types of audio and speech, but for the most
      part the audible differences are subtle. The spectrum looks better,
      however, so the encoder is able to harvest the additional bits that
      this feature provides, should the user choose to enable it. So
      it's best to enable this feature only if encoding at the absolute
      lowest bitrate that the encoder is capable of.
  9. 12 Oct, 2015 4 commits
    • aacenc: shorten name of ff_aac_adjust_common_prediction · 93e6b23c
      Rostislav Pehlivanov authored
      To keep it similar to the other functions which are all named *_pred.
    • aacenc: increase size of s->planar_samples[] from 6 to 8 · 65f5b96d
      Rostislav Pehlivanov authored
      Left out of last commit which added support for eight channel audio.
    • aacenc: add support for changing options based on a profile · 0f4334df
      Rostislav Pehlivanov authored
      This commit adds the ability for a profile to set the default
      options, as well as for the user to override such options
      by simply stating them in the command line while still keeping
      the same profile, as long as those options are still permitted by
      the profile.
      
      Example: setting the profile to aac_low (the default) will turn
      PNS and IS on. They can be disabled by -aac_pns 0 and -aac_is 0,
      respectively. Turning on -aac_pred 1 will cause the profile to be
      elevated to aac_main, as long as no options forbidding aac_main
      have been entered (like AAC-LTP, which will be pushed soon).
      
      A useful feature is that by setting the profile to mpeg2_aac_low,
      all MPEG4 features will be disabled and if the user tries to enable
      them then the program will exit with an error. This profile is
      signalled with the same bitstream as aac_low (MPEG4) but some devices
      and decoders will fail if any MPEG4 features have been enabled.
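The option resolution described in this commit message can be modelled with a small Python sketch. It is not FFmpeg's code: the tables, the `resolve_options` name and, in particular, the choice of which options `mpeg2_aac_low` forbids are assumptions made for illustration. Only the `aac_low` defaults and the elevation to `aac_main` are taken from the message.

```python
# Illustrative model only; these tables are assumptions, not FFmpeg structures.
PROFILE_DEFAULTS = {
    "aac_low":       {"aac_pns": 1, "aac_is": 1, "aac_pred": 0},
    "mpeg2_aac_low": {"aac_pns": 0, "aac_is": 1, "aac_pred": 0},
}

# Options a profile does not permit (assumed set for mpeg2_aac_low).
FORBIDDEN = {
    "aac_low": set(),
    "mpeg2_aac_low": {"aac_pns", "aac_pred"},
}


def resolve_options(profile, user_opts):
    """Return (final_profile, options) after applying user overrides."""
    opts = dict(PROFILE_DEFAULTS[profile])
    for name, value in user_opts.items():
        # Enabling a tool the profile forbids is an error, not a silent fix-up.
        if value and name in FORBIDDEN[profile]:
            raise ValueError(f"{name} is not permitted in profile {profile}")
        opts[name] = value
    # Turning prediction on elevates aac_low to aac_main.
    if profile == "aac_low" and opts["aac_pred"]:
        profile = "aac_main"
    return profile, opts
```

With these tables, `resolve_options("aac_low", {"aac_pred": 1})` reports the `aac_main` profile, while asking for `aac_pns` under `mpeg2_aac_low` raises an error instead of producing a stream that some decoders would reject.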
    • AAC encoder: memoize quantize_band_cost · b629c67d
      Claudio Freire authored
      The bulk of calls to quantize_band_cost are replaced
      by a call to a version that memoizes, greatly improving
      performance, since during coefficient search there is
      a great deal of repeat work.
      
      Memoization cannot always be applied, so do this in a
      different function, and leave the original as-is.
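A minimal Python sketch of the memoization idea follows. The wrapper, the cache key `(band, scale_idx, codebook)` and the toy cost function are all hypothetical; the real `quantize_band_cost` takes different arguments. The sketch only shows why repeated queries during a coefficient search become cheap when the original function is left untouched and a caching variant is added beside it.

```python
def memoize_band_cost(cost_fn):
    """Wrap cost_fn so repeated (band, scale_idx, codebook) queries hit a cache."""
    cache = {}

    def cached_cost(band, scale_idx, codebook):
        key = (band, scale_idx, codebook)
        if key not in cache:
            cache[key] = cost_fn(band, scale_idx, codebook)
        return cache[key]

    cached_cost.cache = cache  # exposed so a caller can clear it per frame
    return cached_cost


calls = []  # records every real evaluation of the toy cost function


def toy_band_cost(band, scale_idx, codebook):
    # Stand-in for an expensive rate-distortion cost computation.
    calls.append((band, scale_idx, codebook))
    return 0.5 * band + scale_idx + codebook


cached = memoize_band_cost(toy_band_cost)
first = cached(3, 120, 5)   # computed by toy_band_cost
second = cached(3, 120, 5)  # served from the cache
```

The cache is only valid while the inputs behind a key stay the same, which is presumably why the commit keeps the uncached function available for the cases where memoization cannot be applied.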
  10. 11 Oct, 2015 1 commit
    • AAC encoder: Extensive improvements · 01ecb717
      Claudio Freire authored
      This finalizes merging of the work in the patches in ticket #2686.
      
      Improvements to twoloop and RC logic are extensive.
      
      The non-exhaustive list of twoloop improvements includes:
       - Tweaks to distortion limits on the RD optimization phase of twoloop
       - Deeper search in twoloop
       - PNS information marking to let twoloop decide when to use it
         (turned out having the decision made separately wasn't working)
       - Tonal band detection and prioritization
       - Better band energy conservation rules
       - Strict hole avoidance
      
      For rate control:
       - Use psymodel's bit allocation to allow proper use of the bit
         reservoir. Don't work against the bit reservoir by moving lambda
         in the opposite direction when psymodel decides to allocate more/less
         bits to a frame.
       - Retry the encode if the effective rate lies outside a reasonable
         margin of psymodel's allocation or the selected ABR.
       - Log average lambda at the end. Useful info for everyone, but especially
         for tuning of the various encoder constants that relate to lambda
         feedback.
      
      Psy:
       - Do not apply lowpass with a FIR filter, instead just let the coder
         zero bands above the cutoff. The FIR filter induces group delay,
         and while zeroing bands causes ripple, it's lost in the quantization
         noise.
       - Experimental VBR bit allocation code
       - Tweak automatic lowpass filter threshold to maximize audio bandwidth
         at all bitrates while still providing acceptable, stable quality.
      
      I/S:
       - Phase decision fixes. Unrelated to #2686, but the bugs only surfaced
         when the merge was finalized. Measure I/S band energy accounting for
         phase, and prevent I/S and M/S from both being applied.
      
      PNS:
       - Avoid marking short bands with PNS when they're part of a window
         group in which there's a large variation of energy from one window
         to the next. PNS can't preserve those and the effect is extremely
         noticeable.
      
      M/S:
       - Implement BMLD protection similar to that specified in
         ISO-IEC/13818:7-2003, Appendix C Section 6.1. Since M/S decision
         doesn't conform to section 6.1, a different method had to be
         implemented, but should provide equivalent protection.
       - Move the decision logic closer to the method specified in
         ISO-IEC/13818:7-2003, Appendix C Section 6.1. Specifically,
         make sure M/S needs fewer bits than dual stereo.
       - Don't apply M/S in bands that are using I/S
      
      Now, this of course needed adjustments in the compare targets and
      fuzz factors of the AAC encoder's fate tests, but if you are wondering
      why the targets go up (more distortion), consider that the previous
      coder was using too many bits on LF content (far more than required by
      psy), and thus those signals will now be more distorted, not less.
      
      The extra distortion isn't audible though; I carried out extensive
      ABX testing to make sure.
      
      A very similar patch was also extensively tested by Kamendo2 in
      the context of #2686.
  11. 23 Sep, 2015 1 commit
    • AAC encoder: tweak rate-distortion logic · 7ec74ae4
      Claudio Freire authored
      This patch modifies the encode frame function to
      retry encoding the frame when the resulting bit count
      is too far off target, but only adjusting lambda
      in small, incremental steps. It also makes the logic
      more conservative - otherwise it will contend with
      bit reservoir-related variations in bit allocation,
      and result in artifacts when frames have to be truncated
      (usually at high bit rates transitioning from low
      complexity to high complexity).
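The retry logic can be sketched in Python as below. The function name, the tolerance, the step limit and the toy encoder (whose bit count grows linearly with lambda) are assumptions for illustration; the actual encoder's constants and the exact relation between lambda and bit count are not given in the message.

```python
def encode_with_retry(encode_frame, target_bits, lam,
                      tolerance=0.1, max_step=0.05, max_tries=8):
    """Re-encode until the bit count is near target, nudging lambda gently.

    encode_frame(lam) must return a positive bit count. Each retry scales
    lambda by at most +/- max_step, so one odd frame cannot swing it wildly.
    """
    bits = encode_frame(lam)
    for _ in range(max_tries):
        if abs(bits - target_bits) <= tolerance * target_bits:
            break  # close enough to the target: accept this encode
        ratio = target_bits / bits
        # Clamp the correction to a small incremental step.
        ratio = min(max(ratio, 1.0 - max_step), 1.0 + max_step)
        lam *= ratio
        bits = encode_frame(lam)
    return lam, bits


def toy_encoder(lam):
    # Toy model (an assumption): bit count proportional to lambda.
    return 100.0 * lam


final_lam, final_bits = encode_with_retry(toy_encoder, target_bits=800.0, lam=10.0)
```

Clamping the correction is what makes the loop conservative: it converges over a few retries rather than jumping straight to the ratio that a single frame suggests.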
  12. 06 Sep, 2015 1 commit
  13. 01 Sep, 2015 1 commit
    • aacenc_tns: rework coefficient quantization and filter application · f3f6c6b9
      Rostislav Pehlivanov authored
      This commit reworks the TNS implementation to a hybrid between what
      the specifications say, what the decoder does and what's the best
      thing to do.
      
      The filter application function was copied from the decoder and
      modified such that it applies the inverse AR filter to the
      coefficients. The LPC coefficients themselves are fed into the
      same quantization expression that the specifications say should
      be used however further processing is not done, instead they're
      converted to the form that the decoder expects them to be in
      and are sent off to the compute_lpc_coeffs function exactly the
      way the decoder does. This function does all conversions and will
      return the exact coefficients that the decoder will generate, which
      are then applied to the coefficients.
      Having the exact same coefficients on both the encoder and decoder
      is a must since otherwise the entire sfb's over which the filter
      is applied will be attenuated.
      
      Despite this major rework, TNS might not work fine on some audio
      types at very low bitrates (e.g. sub 90kbps) as it can attenuate
      some coefficients too much. Users are advised to experiment with
      TNS at higher bitrates if they wish to use this tool or simply
      wait for the implementation to be improved.
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
  14. 29 Aug, 2015 2 commits
    • aacenc_tns: rework the way coefficients are calculated · f20b6717
      Rostislav Pehlivanov authored
      This commit abandons the way the specifications state to
      quantize the coefficients, makes use of the new LPC float
      functions and is much better.
      
      The original way of converting non-normalized float samples
      to int32_t which our LPC system expects was wrong and it was
      wrong to assume the coefficients that are generated are also
      valid. It was essentially a full garbage-in, garbage-out
      system and it definitely shows when looking at spectrals
      and listening. The high frequencies were very overattenuated.
      The new LPC function performs the analysis directly.
      The new LPC function performs the analysis directly.
      
      The specifications state to quantize the coefficients into
      four bit index values using an asin() function, which of course
      had to have ugly ternary operators because the function turns
      negative if the coefficients are negative, which when encoding
      causes an invalid bitstream to get generated.
      
      This commit deviates from that by using the direct TNS tables, which
      are fairly small since you only have 4 bits at most for index
      values. The LPC values are directly quantized against the tables
      and are then used to perform filtering after the requantization,
      which simply fetches the array values.
      
      The end result is that TNS works much better now and doesn't
      attenuate anything but the actual signal, i.e. TNS removes
      quantization errors and does its job correctly now.
      
      It might be enabled by default soon since it doesn't hurt and
      helps reduce nastiness at low bitrates.
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
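Quantizing against a table instead of evaluating asin() can be sketched as follows. The table here is generated from the sine-based inverse quantization of the AAC TNS decoder as I understand it (an assumption; FFmpeg ships precomputed tables), and the function names are hypothetical. Quantization is then a nearest-entry search and requantization is a plain lookup.

```python
import math


def build_tns_table(bits=4):
    """Map each signed index to its dequantized coefficient (assumed formula)."""
    half = 1 << (bits - 1)
    iqfac_pos = (half - 0.5) / (math.pi / 2)  # scale for non-negative indices
    iqfac_neg = (half + 0.5) / (math.pi / 2)  # scale for negative indices
    return {
        idx: math.sin(idx / (iqfac_pos if idx >= 0 else iqfac_neg))
        for idx in range(-half, half)
    }


def quantize_coef(coef, table):
    """Return the index whose table value lies closest to coef."""
    return min(table, key=lambda idx: abs(table[idx] - coef))


table = build_tns_table(4)
idx = quantize_coef(0.42, table)
requantized = table[idx]  # the value the decoder side would reconstruct
```

Because the encoder filters with `table[idx]` rather than with the unquantized coefficient, both sides work from identical values, which is the property the commit message is after.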
    • aacenc_pred: rework the way prediction is done · 44ddee94
      Rostislav Pehlivanov authored
      This commit completely alters the algorithm of prediction.
      The original commit which introduced prediction was completely
      incorrect to even remotely care about what the actual coefficients
      contain or whether any options were enabled. Not my actual fault.
      
      This commit treats prediction the way the decoder does and expects
      to do: like lossy encryption. Everything related to prediction now
      happens at the very end but just before quantization and encoding
      of coefficients. On the decoder side, prediction happens before
      anything has had a chance to even access the coefficients.
      
      Also the original implementation had problems because it actually
      touched the band_type of special bands which already had their
      scalefactor indices marked and it's a wonder the assertion wasn't
      triggered when transmitting those.
      
      Overall, this now drastically increases audio quality and you should
      think about enabling it if you don't plan on playing anything encoded
      on really old low power ultra-embedded devices since they might not
      support decoding of prediction or AAC-Main. Though the specifications
      were written ages ago and as times change so do the FLOPS.
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
  15. 21 Aug, 2015 5 commits
    • aacenc: implement the complete AAC-Main profile · 76b81b10
      Rostislav Pehlivanov authored
      This commit finalizes AAC-Main profile encoding support
      by implementing all mandatory and optional tools available
      in the specifications and current decoders.
      
      The AAC-Main profile requires that prediction support be
      present (although decoders don't require it to be enabled)
      for an encoder to be deemed capable of AAC-Main encoding,
      as well as TNS, PNS and IS, all of which were implemented
      with previous commits or earlier this year.
      
      Users are encouraged to test the new functionality using either
      -profile:a aac_main or -aac_pred 1, the former of which will enable
      the prediction option by default and the latter will change the
      profile to AAC-Main. No other options will be changed by enabling
      either; it's currently up to the users to decide what's best.
      
      The current implementation works best using M/S and/or IS,
      so users are also welcome to enable both options and any
      other options (TNS, PNS) for maximum quality.
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
    • aacenc_tns: implement temporal noise shaping · a1c487e9
      Rostislav Pehlivanov authored
      This commit implements temporal noise shaping support in the
      encoder, along with an -aac_tns option to toggle it on or off
      (off by default for now). TNS will increase audio quality
      and reduce quantization noise by applying a multitap FIR filter
      across allowed coefficients and transmitting side information to
      the decoder so it can create an inverse filter.
      
      Users are encouraged to test the new functionality by enabling
      -aac_tns 1 during encoding.
      
      No major bugs are observable at this time so after a while if no
      new problems appear and if the current implementation is deemed
      of high enough quality and stability it will be enabled by default,
      possibly at the same time the encoder has its experimental flag
      removed and becomes the standard aac encoder in ffmpeg.
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
    • aacenc: do not reject AAC-Main profile · eab12d07
      Rostislav Pehlivanov authored
      This commit permits the use of the Main profile
      in encoding. The functionality of that profile will
      be added in the following commits. By itself, this
      commit does not alter anything.
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
    • aaccoder: move the quantization functions to a separate file · 43b378a0
      Rostislav Pehlivanov authored
      This commit moves the quantizer to a separate header file.
      This allows the quantizer to be used from separate files outside
      of aaccoder without having to add another function pointer and will
      result in a slight speedup as the compiler can do more optimizations.
      
      This is required for the following commits.
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
    • aacenc: create and initialize an LTP context · b47a1e5c
      Rostislav Pehlivanov authored
      This commit only creates and initializes an LTP
      context which is needed for upcoming commits (TNS).
      Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
  16. 01 Aug, 2015 1 commit
  17. 27 Jul, 2015 1 commit
    • AAC Encoder: clipping avoidance · 59216e05
      Claudio Freire authored
      Prevent clipping caused by quantization noise from producing
      audible artifacts, by detecting near-clipping signals and both
      attenuating them a little and encoding escape-encoded bands
      (usually the loudest) rounding towards zero instead of nearest,
      which tends to decrease overall energy and thus clipping.
      
      Currently fate tests measure numerical error, so this change makes
      tests using asynth (which are near clipping) report higher error,
      not less, because of window attenuation. Yet they sound better,
      not worse (subtly so here; on other samples the difference isn't
      subtle at all). Only measuring psychoacoustically weighted error
      would make for a representative test, so that will be left for a
      future patch.
      Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
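The effect of the rounding choice can be seen in a small Python sketch. It is a simplification: plain round-to-nearest and truncation stand in for the encoder's actual rounding, and the function names and sample values are made up. It shows that rounding towards zero never increases a band's energy, which is what reduces the risk of clipping on loud bands.

```python
import math


def quantize(values, toward_zero=False):
    """Quantize to integers, rounding magnitudes to nearest or towards zero."""
    out = []
    for v in values:
        mag = math.floor(abs(v)) if toward_zero else math.floor(abs(v) + 0.5)
        out.append(int(math.copysign(mag, v)))
    return out


def energy(values):
    return sum(v * v for v in values)


loud_band = [2.6, -3.7, 1.4]            # made-up scaled coefficients
nearest = quantize(loud_band)           # rounds magnitudes to nearest
towards_zero = quantize(loud_band, toward_zero=True)
```

The price is a larger numerical error on the truncated band, consistent with the higher error the fate tests report.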
  18. 21 Jul, 2015 1 commit
  19. 05 Jul, 2015 3 commits
    • aacenc: implement Intensity Stereo encoding support · e8576dc8
      Rostislav Pehlivanov authored
      This commit implements intensity stereo coding support
      in the native aac encoder. This is a way to increase the efficiency
      of the encoder by zeroing the right channel's spectral coefficients
      (in a channel pair) and rederiving them in the decoder using information
      from the scalefactor indices of special band types. This commit
      conforms to the official ISO 13818-7 specifications, although due to
      their ambiguity certain deviations have been taken to ensure maximum
      sound quality. This commit has been extensively tested and has been
      shown not to result in audible audio artifacts except in extreme cases.
      This commit also adds an option, aac_is, which has the value of
      0 by default. Intensity Stereo is part of the scalable aac profile
      and is thus non-default.
      
      The way IS coding works is that it rederives the right channel's
      spectral coefficients from the left channel via the scalefactor
      index values left in the right channel. Since an entire band's
      spectral coefficients do not need to be coded, the encoder's
      efficiency jumps up and it unzeroes some high frequency values
      which it previously did not have enough bits to encode. That way
      less information is lost than the information lost by rederiving
      the spectral coefficients with some error. This is why the
      filesize of files encoded with IS does not decrease significantly.
      Users who want IS coding to reduce filesize are expected
      to reduce their encoding bitrates appropriately.
      
      This is V2 of the commit. The old version did not mark ms_mask as
      0, even though M/S and IS coding are incompatible, which resulted in
      distortions with M/S coding enabled. This version also improves
      phase detection by measuring it for every spectral coefficient in
      the band and using a simple majority rule to determine whether the
      coefficients are in or out of phase. Also, the energy values per
      spectral coefficient were changed to reflect the
      official specifications.
      Reviewed-by: Claudio Freire <klaussfreire@gmail.com>
      Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
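The majority rule for phase detection can be sketched in Python like this. The function name, the use of the sign of the left/right product as the per-coefficient phase measure, and the handling of ties are assumptions; the commit message only states that phase is measured per coefficient and decided by simple majority.

```python
def band_phase(left, right):
    """Return +1 if most coefficient pairs are in phase, otherwise -1.

    A pair counts as in phase when both coefficients share a sign
    (their product is non-negative). Ties resolve to in phase.
    """
    pairs = list(zip(left, right))
    in_phase = sum(1 for l, r in pairs if l * r >= 0.0)
    out_of_phase = len(pairs) - in_phase
    return 1 if in_phase >= out_of_phase else -1


# Made-up band: two of the three coefficient pairs have opposite signs.
phase = band_phase([1.0, 2.0, 3.0], [-0.5, -1.5, 0.7])
```

Voting over every coefficient makes the decision robust to a few stray sign flips, which a single band-wide measurement can miss.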
    • aaccoder: add a new perceptual noise substitution implementation · 38fd4c2e
      Rostislav Pehlivanov authored
      This commit finalizes the PNS implementation previously added to the encoder
      by moving it to a separate function search_for_pns() and thus making it
      coder-generic. This new implementation makes use of the spread field of
      the psy bands and the lambda quality feedback parameter. The spread of the
      spectrum in a band prevents PNS from being used excessively and thus preserves
      more phase information in high frequencies. The lambda parameter allows
      the number of PNS-marked bands to vary with the amount of bits
      available, making better choices on which bands are to be marked
      as noise. Comparisons with the previous PNS implementation can be found
      here: https://trac.ffmpeg.org/attachment/wiki/Encode/AAC/
      
      This is V2 of the patch, the changes from the previous version being that this
      version uses the new band->spread metric from aacpsy and normalizes the
      energy using the group size. These changes were suggested by Claudio Freire
      on the mailing list. Another change is the use of lambda to alter the
      frequency threshold. This change makes the actual threshold frequencies
      vary within ±2 kHz of what's specified, depending on frame encoding performance.
      Reviewed-by: Claudio Freire <klaussfreire@gmail.com>
      Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
    • aacenc: use the new function for setting special band scalefactor indices · e06578e3
      Rostislav Pehlivanov authored
      This commit enables the function added with commit 7c10b87b and uses that
      new function for setting any special scalefactor indices. This commit does
      not change the behaviour of the encoder since no bands are being marked as
      either NOISE_BT (due to the previous PNS implementation being removed in the
      previous commit) or INTENSITY_BT2/INTENSITY_BT.
      Reviewed-by: Claudio Freire <klaussfreire@gmail.com>
      Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
  20. 15 Apr, 2015 1 commit
    • aaccoder: Implement Perceptual Noise Substitution for AAC · c5d4f87e
      Rostislav Pehlivanov authored
      This commit implements the perceptual noise substitution AAC extension. This is a proof of concept
      implementation, and as such, is not enabled by default. This is the fourth revision of this patch,
      made after some problems were pointed out. Any changes made since the previous revisions have been indicated.
      
      In order to extend the encoder to use an additional codebook, the array holding each codebook has been
      modified with two additional entries - 13 for the NOISE_BT codebook and 12 which has a placeholder function.
      The cost system was modified to skip the 12th entry using an array to map the input and outputs it has. It
      also does not accept using the 13th codebook for any band which is not marked as containing noise, thereby
      restricting its ability to arbitrarily choose it for bands. The use of arrays allows the system to be easily
      extended to allow for intensity stereo encoding, which uses additional codebooks.
      
      The 12th entry in the codebook function array points to a function which stops the execution of the program
      by calling an assert with an always 'false' argument. It was pointed out in an email discussion with
      Claudio Freire that having a 'NULL' entry can result in unexpected behaviour and could be used as
      a security hole. There is no danger of this function being called during encoding due to the codebook maps introduced.
      
      Another change from version 1 of the patch is the addition of an argument to the encoder, '-aac_pns' to
      enable and disable the PNS. This currently defaults to disable the PNS, as it is experimental.
      The switch will be removed in the future, when the algorithm to select noise bands has been improved.
      The current algorithm simply compares the energy to the threshold (multiplied by a constant) to determine
      noise, however the FFPsyBand structure contains other useful figures to determine which bands carry noise more accurately.
      
      Some of the sample files provided triggered an assertion when the parameter to tune the threshold was set to
      a value of '2.2'. Claudio Freire reported the problem's source could be in the range of the scalefactor
      indices for noise and advised to measure the minimal index and clip anything above the maximum allowed
      value. This has been implemented and all the files which used to trigger the assertion now encode without error.
      
      The third revision of the patch also removes unneeded variables and comparisons. All of them were
      redundant and would have been of little use when the PNS implementation is extended.
      
      The fourth revision moved the clipping of the noise scalefactors outside the second loop of the two-loop
      algorithm in order to prevent their redundant calculations. Also, freq_mult has been changed to a float
      variable due to the fact that rounding errors can prove to be a problem at low frequencies.
      Consideration was given to whether the entire expression could be evaluated inside the expression,
      but in the end it was decided that it would be for the best if just the type of the variable were
      to change. Claudio Freire reported the two problems. There is no change of functionality
      (except for low sampling frequencies) so the spectral demonstrations at the end of this commit's message were not updated.
      
      Finally, the way energy values are converted to scalefactor indices has changed since the first commit,
      as per the suggestion of Claudio Freire. This may still have some drawbacks, but unlike the first commit
      it works without having redundant offsets and outputs what the decoder expects to have, in terms of the
      ranges of the scalefactor indices.
      
      Some spectral comparisons: https://trac.ffmpeg.org/attachment/wiki/Encode/AAC/Original.png (original),
      https://trac.ffmpeg.org/attachment/wiki/Encode/AAC/PNS_NO.png (encoded without PNS),
      https://trac.ffmpeg.org/attachment/wiki/Encode/AAC/PNS1.2.png (encoded with PNS, const = 1.2),
      https://trac.ffmpeg.org/attachment/wiki/Encode/AAC/Difference1.png (spectral difference).
      The constant is the value which multiplies the threshold when it gets compared to the energy; larger
      values mean more noise will be substituted by PNS values. Example when const = 2.2:
      https://trac.ffmpeg.org/attachment/wiki/Encode/AAC/PNS_2.2.png
      Reviewed-by: Claudio Freire <klaussfreire@gmail.com>
      Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
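The noise band decision described in this message can be sketched in a few lines of Python. The function name and the sample numbers are hypothetical, and the direction of the comparison (a band counts as noise when its energy is below the scaled threshold) is inferred from the statement that a larger constant substitutes more noise.

```python
def mark_noise_bands(energies, thresholds, const=1.2):
    """Flag each band whose energy falls below threshold * const."""
    return [e < t * const for e, t in zip(energies, thresholds)]


# Made-up per-band energies and psychoacoustic thresholds.
energies = [1.0, 2.0, 5.0]
thresholds = [1.0, 1.0, 1.0]

conservative = mark_noise_bands(energies, thresholds, const=1.2)
aggressive = mark_noise_bands(energies, thresholds, const=2.2)
```

With these numbers the larger constant marks one more band as noise, matching the trend of the const = 1.2 and const = 2.2 comparisons linked above.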
  21. 29 Nov, 2014 1 commit
  22. 12 Sep, 2013 2 commits
  23. 20 Mar, 2013 1 commit
  24. 25 Feb, 2013 1 commit
  25. 22 Jan, 2013 1 commit
  26. 08 Jun, 2012 1 commit
  27. 20 Mar, 2012 1 commit
  28. 23 Jan, 2012 2 commits