This is achieved by using function pointers for AAC SBR functions.
This unfortunately necessitated to use void* in
ff_aac_sbr_apply(_fixed).
Fixes ticket #10999.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is more in line with how we initialize DSP functions
and avoids tables of function pointers as well as relocations
for these.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This allows to merge it with AACDecDSP.init and remove the latter
(it is called only once anyway); it also allows to make
the fixed/float AACDecDSP and AACDecProc implementations internal
to aacdec_fixed/float.c (which also fixes a violation of our
naming conventions). And it some linker errors when either decoder
is disabled.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_aacdec_common_init_once() already uses its own AVOnce.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_aac_sbr_apply() and ff_aac_sbr_apply_fixed() still used
pointers to INTFLOAT which is float or int depending upon
whether USE_FIXED is set or not; in particular, according
to these declarations both functions have the same type.
But that is wrong and given that aacdec.c sets USE_FIXED,
it sees the wrong type for ff_aac_sbr_apply().
This leads to a -Wlto-type-mismatch warning when using lto [1].
Fix this by avoiding INTFLOAT in aacsbr.h (which also means
that aac_defines.h need not be included there any more).
[1]: https://fate.ffmpeg.org/log.cgi?slot=x86_64-archlinux-gcc-lto&time=20240506022217&log=compile
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The fixed point decoder needs it since
905fdb0601.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For some error bitstreams, a CTU belongs to two slices/entry points.
If the decoder initializes and submmits the CTU task twice, it may crash the program
or cause it to enter an infinite loop.
Reported-by: Frank Plowman <post@frankplowman.com>
Fixes: signed integer overflow: 2097152000 + 107142979 cannot be represented in type 'int'
Fixes: 67919/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WAVARC_fuzzer-5955101769400320
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is similar to h264, but here we use manual_avg instead of vaaddu
because rv40's OP differs from h264. If we use vaaddu,
rv40 would need to repeatedly switch between vxrm=0 and vxrm=2,
and switching vxrm is very slow.
C908:
avg_chroma_mc4_c: 2330.0
avg_chroma_mc4_rvv_i32: 602.7
avg_chroma_mc8_c: 1211.0
avg_chroma_mc8_rvv_i32: 602.7
put_chroma_mc4_c: 1825.0
put_chroma_mc4_rvv_i32: 414.7
put_chroma_mc8_c: 932.0
put_chroma_mc8_rvv_i32: 414.7
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
As we do not need to widen accumulators to 64 bits, we effectively get
double capacity for unrolling compared to the integer function. This
explains the slightly better performance gains.
ac3_sum_square_bufferfly_float_c: 65.2
ac3_sum_square_bufferfly_float_rvv_f32: 12.2
Fixes: CID1544265 Logically dead code
Sponsored-by: Sovereign Tech Fund
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Regression since fd172185580c1ccdcfb90bbfdb59fa806fad3117;
triggered by vp4/KTkvw8dg1J8.avi in the FATE suite, but not
when running fate as this code is not used when the bitexact
flag is set.
Bisecting done by ami_stuff, patch from user Mika Fischer
in ticket #10027 (which this commit fixes).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Before 0f8763fbea, av1_frame_ref()
and update_reference_list() could fail and therefore needed to
be checked, which incidentally set ret. This is no longer happening,
leading to a potential use of an uninitialized value which is
also the subject of Coverity ticket #1596605.
Fix this by always setting ret before goto end; do not return
some random ancient value.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This version is seven years old, and present in Debian oldoldstable,
Ubuntu 20.04 and Leap 15.0.
Allows cleaning up the file substantially. In particular, this is
motivated by the desire to stop relying on init_static_data.
Fixes: CID1439569 Unchecked return value
Fixes: CID1439578 Unchecked return value
Sponsored-by: Sovereign Tech Fund
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: CID1420393 Unchecked return value
Sponsored-by: Sovereign Tech Fund
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
xHE-AAC relies on the same postfilter mechanism
that Opus uses to improve clarity (albeit with a steeper
deemphasis filter).
The code to apply it is identical, it's still just a
simple IIR low-pass filter. This commit makes it possible
to use alternative constants.
Adding 10-bit encoding support for HEVC if the input is 8-bit. In
case of 8-bit input content, NVENC performs an internal CUDA 8 to
10-bit conversion of the input prior to encoding. Currently, only
AV1 supports encoding 8-bit content as 10-bit.
Signed-off-by: Diego Felix de Souza <ddesouza@nvidia.com>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
It is possible that ff_progress_frame_await() is called but
ff_progress_frame_report() isn't called when a hardware acceleration
method is used, so a thread for vp9 decoding might get stuck.
Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The extradata is generated by encoding a dummy frame, then reset
the encoder state by mediacodec flush(). It only works for pixel
format other than AV_PIX_FMT_MEDIACODEC, since I'm not sure how
to create a dummy frame safely with AV_PIX_FMT_MEDIACODEC.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The code was written in 2012, but seems to have been broken
for just as long. Compilation is broken on every MIPS/MIPS64
system with an FPU (which the code depends on).
The function is called only internally in DSP, so we do not
need to expose it.
apply_ltp on MIPS uses this function, but due to the function
being just a glue function with no real optimizations,
duplicate it there.
Up until now, there was one AACDecContext for the fixed
and one for the floating point decoder. These differed
mostly in certain arrays which were int for the fixed-point
and float for the floating point decoder; there were also
differences in corresponding function pointers.
Yet in order to deduplicate the enormous amount of currently
duplicated code between the float and the fixed-point decoder,
one needs common contexts. Given that int and float have the
same size on all common systems, this commit replaces these
arrays by unions of int arrays and of float arrays. The names
of these arrays have been chosen to be compatible with
AAC_RENAME().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
sizeof(PredictorState) is different for the floating-point and
the fixed-point AAC decoders; this is an obstacle for deduplicating
code between these decoders. So don't include this array in
SingleChannelElement, instead add a union of pointers to the
fixed-point PredictorState and the floating-point PredictorState.
The actual arrays are part of the extended ChannelElement
to be allocated by ff_aac_sbr_ctx_alloc_init(); it also
sets the pointers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, AACDecContext included pointers to one of these
contexts depending upon USE_FIXED. Yet deduplicating
the common parts of the float and fixed-point AAC decoders
needs common structures, so we put both of these pointers
in a union.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The AAC fixed-point and floating-point decoders have
a lot of duplicated code; the main obstacle to
deduplicating it is that several structures with the
same name are actually different types, because
they contain INTFLOATs (int or float) and AAC_FLOATs
(SoftFloat or float). SoftFloat and float typically
have different sizes, so dealing with it is the more
complicated of the two.
AAC_FLOAT is mainly used in the sbr code and structures,
so one can still deduplicate the code by only exposing
the common part of ChannelElement (without SBR context)
to the common decoder part. One prerequisite of this
is to move allocating the whole ChannelElement to
code that will stay unduplicated. It is most natural
to move said allocation to ff_aac_sbr_ctx_init().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Each ChannelElement contains exactly one SpectralBandReplication
structure; the latter structure contains lots of buffers
whose size depend upon USE_FIXED (i.e. AAC_FLOAT arrays).
This complicates deduplicating the parts of the AAC decoder
that are duplicated between the fixed-point and the floating
point decoder.
In order to fix this, the SpectralBandReplication structure
will be moved from the part of ChannelElement visible to
the common code. Therefore the ff_aac_sbr_* functions
are ported to accept a ChannelElement*; they will then have
to translate that to the corresponding SpectralBandReplication*
themselves (which is possible, because there are floating-point
and fixed-point versions of these functions).
While just at it, also ensure that these functions are properly
namespaced.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
has_b_frames used in decoder for size of the frame
reordering buffer, and we don't used the max_b_frames
in decoder.
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
This function takes a decoded AVDOVIMetadata struct and turns it back
into a binary RPU. Verified using existing tools, and matches the
bitstream in official reference files.
I decided to just roll the EMDF and NAL encapsulation into this function
because the end user will need to do it otherwise anyways.
We need to set up the configuration struct appropriately based on the
codec type, colorspace metadata, and presence/absence of an EL (though,
we currently don't support an EL).
When present, we use the signalled RPU data header to help infer (and
validate) the right values.
Behavior can be controlled by a new DOVIContext.enable flag.
To allow compiling the decoding objects without the encoding objects and
vice versa. Common helpers that users of both APIs need are put into the
shared dovi_rpu.c.
To allow internally re-using it for both the encoder and decoder.
This is based on HEVC only, H.264/AV1 use their own (hopefully correctly
signalled) profiles (and in particular, the AV1 decoders implicitly
default the correct profile in the absence of a configuration record).
The code was written under the misguided assumption that these fields
would only be present when the value changes, however this does not
match the actual patent specification, which says that streams are
required to either always signal this metadata, or never signal it.
That said, the specification does not really clarify what the defaults
of these fields should be in the event that this metadata is missing, so
without any sample file or other reference I don't wish to hazard
a guess at how to interpret these fields.
Fix the current behavior by making sure we always throw this error, even
for files that have the vdr sequence info in one frame but are missing
it in the next frame.
And make it public.
For encoding, users may also be interested in the configured level and
compatibility ID. So generalize the dv_profile field and just expose the
whole configuration record.
This makes the already rather reductive ff_dovi_update_cfg() function
almost wholly redundant, since users can just directly assign
DOVIContext.cfg.
AV1 can put a frame into multiple reference slots;
up until now, this involved creating a new reference
to the underlying AVFrame; therefore av1_frame_ref()
could fail.
This commit changes this by using the ProgressFrame API
to share the underlying AVFrames.
(Hint: vaapi_av1_surface_id() checked whether the AV1Frames
contained in the AV1DecContext were NULL or not (of course
they were not); this has been changed to actually check for
whether said AV1Frame is blank or not.)
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It avoids having to sync ProgressFrame.f and the pointer
typically used to access the AVFrame.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This relies on the common initial seqence guarantee
(and on C11 support for unnamed members).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Use dpb_max_num_reorder_pics to control output instead of
dpb_max_dec_pic_buffering, when dpb_max_dec_pic_buffering
is much larger than dpb_max_num_reorder_pics, it may cause
dpb overflow error.
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Signed-off-by: elinyhuang <elinyhuang@tencent.com>
has_b_frames used in decoder for size of the frame reordering
buffer, setting this field from dpb_max_num_reorder_pics.
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Happens in the mov-elst-ends-betn-b-and-i and mov-ibi-elst-starts-b
FATE tests with frame-threading.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When we used the --disable-ssse3 --disable-optimizations options,
the compiler would not skip the MC_LINKS like the compilation that
enabled the optimization, so it would fail to find the function
prototypes. Hence, this commit uses the same way to add prototypes
for the functions as HEVC DSP.
And, when prototypes are added for the functions, we cannot add the static qualifier.
Therefore, the ff_vvc prefix is needed to avoid the naming conflict.
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
When we used the --disable-ssse3 --disable-optimizations options,
the compiler would not skip the MC_LINKS like the compilation that
enabled the optimization, so it would fail to find the function
prototypes. Hence, this commit uses the same way to add prototypes
for the functions as HEVC DSP.
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Avoids allocations and therefore error checks; also avoids
indirections and allows to remove the boilerplate code
for creating an object with a dedicated free function.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids boilerplate code when creating the context
and avoids allocations and therefore whole error paths
when creating references to it. Also avoids an indirection
and improves type-safety.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
After having created the AVBuffer that is put into frame->buf[0],
ownership of several objects (namely an AVDRMFrameDescriptor,
an MppFrame and some AVBufferRefs framecontextref and decoder_ref)
has passed to the AVBuffer and therefore to the frame.
Yet it has nevertheless been freed manually on error
afterwards, which would lead to a double-free as soon
as the AVFrame is unreferenced.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and therefore error checks and cleanup code;
also avoids indirections.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids implicit av_frame_ref() and therefore allocations
and error checks. It also avoids explicitly allocating
the AVFrames (done implicitly when getting the buffer).
It also fixes a data race: The AVFrame's sample_aspect_ratio
is currently updated after ff_thread_finish_setup()
and this write is unsynchronized with the read in av_frame_ref().
Removing the implicit av_frame_ref() fixed this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids implicit av_frame_ref() and therefore allocations
and error checks. It also avoids explicitly allocating
the AVFrames (done implicitly when getting the buffer).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids implicit av_frame_ref() and therefore allocations
and error checks. It also avoids explicitly allocating
the AVFrames (done implicitly when getting the buffer).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Before commit f025b8e110,
every frame-threaded decoder used ThreadFrames, even when
they did not have any inter-frame dependencies at all.
In order to distinguish those decoders that need the AVBuffer
for progress communication from those that do not (to avoid
the allocation for the latter), the former decoders were marked
with the FF_CODEC_CAP_ALLOCATE_PROGRESS internal codec cap.
Yet distinguishing these two can be done in a more natural way:
Don't use ThreadFrames when not needed and split ff_thread_get_buffer()
into a core function that calls the user's get_buffer2 callback
and a wrapper around it that also allocates the progress AVBuffer.
This has been done in 02220b88fc
and since that commit the ALLOCATE_PROGRESS cap was nearly redundant.
The only exception was WebP and VP8. WebP can contain VP8
and uses the VP8 decoder directly (i.e. they share the same
AVCodecContext). Both decoders are frame-threaded and VP8
has inter-frame dependencies (in general, not in valid WebP)
and therefore the ALLOCATE_PROGRESS cap. In order to avoid
allocating progress in case of a frame-threaded WebP decoder
the cap and the check for the cap has been kept in place.
Yet now the VP8 decoder has been switched to use ProgressFrames
and therefore there is just no reason any more for this check
and the cap. This commit therefore removes both.
Also change the value of FF_CODEC_CAP_USES_PROGRESSFRAMES
to leave no gaps.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also use the correct type limit SIZE_MAX; INT_MAX comes
from a time when this used av_buffer_allocz() which used
an int at the time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The current code resets it all the time unless we are decoding
a DSD frame with identical parameters to the last frame.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is more natural given that WavPack doesn't need the data of
the previous frame at all; it just needs the DSD context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is useful when the lifetime of the object to be shared
is the whole decoding process as it allows to avoid having
to sync them every time in update_thread_context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This part of the code is not slice-threaded and they are
semantically an initialization, so use atomic_init()
instead of the potentially expensive atomic_store()
(which uses sequentially consistent memory ordering).
Also remove the initial initialization directly after
allocating this array.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_thread_progress_replace() can handle a blank ProgressFrame
as src (in which case it simply unreferences dst), but not
a NULL one. So add a blank frame to be used as source for
this case, so that we can use the replace functions
to simplify vp9_frame_replace().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When outputting a show-existing frame, the VP9 decoder simply
created a reference to said frame and returned it immediately to
the caller, without waiting for it to have finished decoding.
In case of frame-threading it is possible for the frame to
only be decoded while it was waiting to be output.
This is normally benign.
But there is one case where it is not: If the user wants
video encoding parameters to be exported, said side data
will only be attached to the src AVFrame at the end of
decoding the frame that is actually being shown. Without
synchronisation adding said side data in the decoder thread
and the reads in av_frame_ref() in the output thread
constitute a data race. This happens e.g. when using the
venc_data_dump tool with vp90-2-10-show-existing-frame.webm
from the FATE-suite.
Fix this by actually waiting for the frame to be output.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This already fixes a race in the vp9-encparams test. In this test,
side data is added to the current frame after having been decoded
(and therefore after ff_thread_finish_setup() has been called).
Yet the update_thread_context callback called ff_thread_ref_frame()
and therefore av_frame_ref() with this frame as source frame and
the ensuing read was unsynchronised with adding the side data,
i.e. there was a data race.
By switching to the ProgressFrame API the implicit av_frame_ref()
is removed and the race fixed except if this frame is later reused by
a show-existing-frame which uses an explicit av_frame_ref().
The vp9-encparams test does not cover this, so this commit
already fixes all the races in this test.
This decoder kept multiple references to the same ThreadFrames
in the same context and therefore had lots of implicit av_frame_ref()
even when decoding single-threaded. This incurred lots of small
allocations: When decoding an ordinary 10s video in single-threaded
mode the number of allocations reported by Valgrind went down
from 57,814 to 20,908; for 10 threads it went down from 84,223 to
21,901.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids implicit av_frame_ref() and therefore allocations
and error checks. It also avoids explicitly allocating
the AVFrames (done implicitly when getting the buffer)
and it also allows to reuse the flushing code for freeing
the ProgressFrames.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids implicit av_frame_ref() and therefore allocations
and error checks.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Frame-threaded decoders with inter-frame dependencies
use the ThreadFrame API for syncing. It works as follows:
During init each thread allocates an AVFrame for every
ThreadFrame.
Thread A reads the header of its packet and allocates
a buffer for an AVFrame with ff_thread_get_ext_buffer()
(which also allocates a small structure that is shared
with other references to this frame) and sets its fields,
including side data. Then said thread calls ff_thread_finish_setup().
From that moment onward it is not allowed to change any
of the AVFrame fields at all any more, but it may change
fields which are an indirection away, like the content of
AVFrame.data or already existing side data.
After thread A has called ff_thread_finish_setup(),
another thread (the user one) calls the codec's update_thread_context
callback which in turn calls ff_thread_ref_frame() which
calls av_frame_ref() which reads every field of A's
AVFrame; hence the above restriction on modifications
of the AVFrame (as any modification of the AVFrame by A after
ff_thread_finish_setup() would be a data race). Of course,
this av_frame_ref() also incurs allocations and therefore
needs to be checked. ff_thread_ref_frame() also references
the small structure used for communicating progress.
This av_frame_ref() makes it awkward to propagate values that
only become known during decoding to later threads (in case of
frame reordering or other mechanisms of delayed output (like
show-existing-frames) it's not the decoding thread, but a later
thread that returns the AVFrame). E.g. for VP9 when exporting video
encoding parameters as side data the number of blocks only
becomes known during decoding, so one can't allocate the side data
before ff_thread_finish_setup(). It is currently being done afterwards
and this leads to a data race in the vp9-encparams test when using
frame-threading. Returning decode_error_flags is also complicated
by this.
To perform this exchange a buffer shared between the references
is needed (notice that simply giving the later threads a pointer
to the original AVFrame does not work, because said AVFrame will
be reused lateron when thread A decodes the next packet given to it).
One could extend the buffer already used for progress for this
or use a new one (requiring yet another allocation), yet both
of these approaches have the drawback of being unnatural, ugly
and requiring quite a lot of ad-hoc code. E.g. in case of the VP9
side data mentioned above one could not simply use the helper
that allocates and adds the side data to an AVFrame in one go.
The ProgressFrame API meanwhile offers a different solution to all
of this. It is based around the idea that the most natural
shared object for sharing information about an AVFrame between
decoding threads is the AVFrame itself. To actually implement this
the AVFrame needs to be reference counted. This is achieved by
putting a (ownership) pointer into a shared (and opaque) structure
that is managed by the RefStruct API and which also contains
the stuff necessary for progress reporting.
The users get a pointer to this AVFrame with the understanding
that the owner may set all the fields until it has indicated
that it has finished decoding this AVFrame; then the users are
allowed to read everything. Every decoder may of course employ
a different contract than the one outlined above.
Given that there is no underlying av_frame_ref(), creating
references to a ProgressFrame can't fail. Only
ff_thread_progress_get_buffer() can fail, but given that
it will replace calls to ff_thread_get_ext_buffer() it is
at places where errors are already expected and properly
taken care of.
The ProgressFrames are empty (i.e. the AVFrame pointer is NULL
and the AVFrames are not allocated during init at all)
while not being in use; ff_thread_progress_get_buffer() both
sets up the actual ProgressFrame and already calls
ff_thread_get_buffer(). So instead of checking for
ThreadFrame.f->data[0] or ThreadFrame.f->buf[0] being NULL
for "this reference frame is non-existing" one should check for
ProgressFrame.f.
This also implies that one can only set AVFrame properties
after having allocated the buffer. This restriction is not deep:
if it becomes onerous for any codec, ff_thread_progress_get_buffer()
can be broken up. The user would then have to get a buffer
himself.
In order to avoid unnecessary allocations, the shared structure
is pooled, so that both the structure as well as the AVFrame
itself are reused. This means that there won't be lots of
unnecessary allocations in case of non-frame-threaded decoding.
It might even turn out to have fewer than the current code
(the current code allocates AVFrames for every DPB slot, but
these are often excessively large and not completely used;
the new code allocates them on demand). Pooling relies on the
reset function of the RefStruct pool API, it would be impossible
to implement with the AVBufferPool API.
Finally, ProgressFrames have no notion of owner; they are built
on top of the ThreadProgress API which also lacks such a concept.
Instead every ThreadProgress and every ProgressFrame contains
its own mutex and condition variable, making it completely independent
of pthread_frame.c. Just like the ThreadFrame API it is simply
presumed that only the actual owner/producer of a frame reports
progress on said frame.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The API is similar to the ThreadFrame API, with the exception
that it no longer has an included AVFrame and that it has its
own mutexes and condition variables which makes it more independent
of pthread_frame.c. One can wait on anything via a ThreadProgress.
One just has to ensure that the lifetime of the object containing
the ThreadProgress is long enough. This will typically be solved
by putting a ThreadProgress in a refcounted structure that is
shared between threads.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: epirat07@gmail.com
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The native VVC decoder does not yet support quality/spatial/multiview
scalability. Bitstreams requiring this feature could cause crashes.
Patch fixes this by skipping NAL units which are not in the base layer,
warning the user while doing so.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Only the last 256 samples of each frame are used;
the encoder currently uses a buffer for 1536 + 256 samples
whose first 256 samples contain are the last 256 samples
from the last frame and the next 1536 are the samples
of the current frame.
Yet since 238b2d4155 all the
DSP functions only need 256 contiguous samples and this can
be achieved by only retaining the last 256 samples of each
frame. Doing so saves 6KiB per channel.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This codec supports FLAG_B_PICTURE_REFERENCES. We need to correctly fill
the reference_pic_flag with is_reference variable instead of 0 for B
frames.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
libva2 doesn't require a fixed surface-array any more, so we may use
dynamic frame pool for decoding when libva2 is available, which allows a
downstream element stores more frames from VAAPI decoders and fixes the
error below:
$ ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi \
-i input.mp4 -c:v hevc_vaapi -f null -
...
[h264 @ 0x557a075a1400] get_buffer() failed
[h264 @ 0x557a075a1400] thread_get_buffer() failed
[h264 @ 0x557a075a1400] decode_slice_header error
[h264 @ 0x557a075a1400] no frame!
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
This diff removes 4 unused ARMv7 NEON fixed-point DSP functions.
The function were originally moved here by 4958f35a2 (Dec 2013).
After 9e05421db (Jan 2021), as part of the refactor of the AC3
DSP to consistently use 32-bit sample format in the encoder, these
functions were removed from the DSP function table, but the ARMv7
implementations were kept.
Signed-off-by: Geoff Hill <geoff@geoffhill.org>
When starting on a SEI recovery point close enough to the end of
the stream that draining starts before the recovery point's frame
is output, there can be non-recovered frames in the delayed picture
buffer that would currently cause the decoder to fail to output a
frame. This commit skips such frames and outputs the first recovered
frame, if there exists one.
Signed-off-by: arch1t3cht <arch1t3cht@gmail.com>
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
When decoding starts at a SEI recovery point very shortly before the
end of the video stream, there can be frames which are decoded before
the recovery point's frame is output and which will only be output once
the draining has started. Previously, these frames would never be set
as recovered. This commit copies the logic from h264_select_output_frame
to send_next_delayed_frame to properly mark such frames as recovered.
Fixes ticket #10936.
Signed-off-by: arch1t3cht <arch1t3cht@gmail.com>
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
sps_log2_ctu_size_minus5 is between 0 and 2, with 3 reserved for future
use. The VVC decoder allows sps_log2_ctu_size_minus5 to be 3, and so
ctb_size_y should be at least 16 bits to prevent overflows. An
alternative patch would leave sps_log2_ctu_size_minus5 as 8 bits and
disallow sps_log2_ctu_size_minus5 = 3.
Signed-off-by: Frank Plowman <post@frankplowman.com>
The first release of the CTS for AV1 decoding had incorrect
offsets for the OrderHints values.
The CTS will be fixed, and eventually, the drivers will be
updated to the proper spec-conforming behaviour, but we still
need to add a workaround as this will take months.
Only NVIDIA use these values at all, so limit the workaround
to only NVIDIA. Also, other vendors don't tend to provide accurate
CTS information.
This is needed by Vulkan. Constructing this can't be delegated to CBS
because packets might contain multiple frames (when non-shown frames are
present) but we need separate snapshots immediately before each frame
for the decoder.
When Split frame encoding is enabled, each input frame is partitioned into
horizontal strips which are encoded independently and simultaneously by
separate NVENCs, usually resulting in increased encoding speed compared to
single NVENC encoding.
Signed-off-by: Diego Felix de Souza <ddesouza@nvidia.com>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
As we can read in ST 2086:
Values outside the specified ranges of luminance and chromaticity values
are not reserved by SMPTE, and can be used for purposes outside the
scope of this standard.
This is further acknowledged by ITU-T H.264 and ITU-T H.265. Which says
that values out of range are unknown or unspecified or specified by
other means not specified in this Specification.
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Niklas Haas <git@haasn.dev>
These memcpy operands only depend upon sizeof(SampleType)
(and this size is actually the same for both the fixed-point
and the floating-point encoders for most (all supported?)
systems).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These allocations only depend upon sizeof(SampleType)
(and this size is actually the same for both the fixed-point
and the floating-point encoders for most (all supported?)
systems).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is in preparation for sharing even more stuff
common to the fixed and floating-point encoders.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Implicitly disabled by 4679a474f0.
Given that no one has ever complained about this, this commit
removes the now dead code.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
encode_preinit_audio() already checks that the sample rate
is among AVCodec.supported_samplerates.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is unnecessary (the channel layout guessing code became
moot when the channel layouts were enforced generically)
and also dangerous, as a custom channel layout mapping
would leak in case one was used.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is perfectly legal for users to use a custom layout
that is equivalent to a supported native one.
In this case the union in AVChannelLayout is not an uint64_t mask,
but a pointer to a custom map.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Per the lavu/tx docs:
> * For forward transforms (R2C), stride must be the spacing between two
> * samples in bytes. For inverse transforms, the stride must be set
> * to the spacing between two complex values in bytes.
The code did the reverse.
The stride parameter is currently not respected for RDFT transforms,
but has to be correct, for a potential future change.
Removes the special -I flag specified in the avcodec/bsf/ subdirectory.
This makes code copy-pastable to other parts of the ffmpeg codebase, as
well as simplifying the build script.
It also reduces ambiguity, since there are many instances of same-named
header files existing in both libavformat/ and libavcodec/
subdirectories.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: James Almer <jamrial@gmail.com>
The 2 which has been changed to an 8 in the array length expression is
the maximum value of sps_bitdepth_minus8. This was missed when updating
to VVCv2, which increased this maximum from 2 to 8.
Signed-off-by: Frank Plowman <post@frankplowman.com>
The size variable here is taken as gospel for the bounds of the input
buffer in later logic. Clamp it to ensure that the returned region
does not extend past that allocated in the underlying GetBitContext,
even in the case entry point offsets are signalled in the bitstream.
Also assert this for good measure.
Signed-off-by: Frank Plowman <post@frankplowman.com>
num_units_in_tick and time_scale are both 32-bit unsigned integers.
Storing them as ints was causing overflows.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Invalid input files may contain film grain metadata which survives
ff_h274_film_grain_params_supported() but does not pass
av_film_grain_params_select(), leading to a SIGSEGV on hevc_frame_end().
Fix this by duplicating the av_film_grain_params_select() check at frame
init time.
An alternative solution here would be to defer the incompatibility check
to hevc_frame_end(), but this has the downside of allocating a film
grain buffer even when we already know we can't apply film grain.
Fixes: https://trac.ffmpeg.org/ticket/10951
Use context_initialized from the underlying MpegEncContext
instead. Also don't check before ff_mpv_common_end()
in mpeg_decode_end().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The ipu decoder never calls ff_mpv_common_init() or allocates
anything else that would need to be freed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
H.261 does not have keyframes (or indeed frame types) at all,
so this warning is not warranted.
Also remove an always-true check while at it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
(FF_PTR_ADD has to be kept although MPEG-1/2 only supports
YUV pixel formats because our decoder also supports decoding
to AV_PIX_FMT_GRAY8 depending upon CONFIG_GRAY.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, ff_mpv_frame_start() offsets the data of the current
picture and doubles the linesizes of all pictures if the current
picture is field-based so that data and linesize allow to address
the current field only.
This is done based upon the current picture_structure value.
Only two mpegvideo-based decoders ever set this field: mpeg1/2
and VC-1; but the latter only does it after ff_mpv_frame_start()
(when using hardware-acceleration and in order to signal it to
the DXVA2 hwaccel) in which case no offset is applied in
ff_mpv_frame_start(). So only one decoder actually wants this
offset*; therefore move the code performing it to mpeg12dec.c.
*: VC-1 doubles linesize when using field_mode (not only the picture's
linesize, but also uvlinesize and linesize), yet it does not offset
anything. This is further proof that this should not be performed
generically.
Also move MPEG-1/2 specific setting of the top-field-first flag.
(The change here implies that the AVFrame in current_picture
may have different top-field-first flags than the AVFrame
from current_picture_ptr, but this doesn't matter as only
the latter's are used.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
current_picture is not changed after frame_start() at all
and it therefore does not need to be updated (i.e. copied to the
slice thread contexts) a second time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These pointers already point to the same buffers, so using
a union is possible and avoids the overhead of syncing the
pointers (and saves some memory).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
A decoder's private data has already been zeroed (apart from options)
before init is called.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Key line from the spec is:
"All SPS NAL units with a particular value of sps_seq_parameter_set_id
in a CVS shall have the same content."
Prior to this patch, the VVC decoder's behaviour on encountering a
duplicated SPS ID (within the entire bitstream, not restricted to
a CVS) was simply to replace the entry in the SPS lookup table with the
new data. Illegal bitstreams with multiple SPSs in the same CVS sharing
an ID but differing elsewhere could cause all manner of issues.
The patch tracks which SPS IDs have been used in the given CVS using the
new sps_id_used field of VVCParamSets. If it encounters an SPS with an
ID already in use and whose content differs from the previous SPS, it
throws an AVERROR_INVALIDDATA.
Signed-off-by: Frank Plowman <post@frankplowman.com>
ff_ass_subtitle_header_* still used explicit CRLF linebreaks
eventhough they will get normalised to LF later since commit
7bf1b9b357. Just directly use LF.
Unlike what the old comment suggested, standard ASS has no character
escape mechanism, but a closing curly bracket doesn't even need one.
For manual authored sub files using a full-width variant of an
appropriate font and with scaling and spacing modifiers is a common
workaround.
This is not an option here, but we can still make things much less bad.
Now the desired opening bracket still shows up in libass, and
standard renders will merely display a backslash in its place
instead of stripping the following text like before.
Backslashes cannot be escaped by a backslash in any ASS renderer,
but unless followed by specific characters it is just printed out.
Insert a word-joiner character after a backslash to break up
active sequences without changing the visual output.