Bug in D3D11VA Deinterlacing #15197

softworkz · 2024-10-28T11:48:34Z

mpv Information

Has existed for 7 years.

Other Information

- Windows version: Any with DirectX 11.1 support
- GPU model, driver and version: All Intel & all AMD GPUs (less obvious with Nvidia)
- Source of mpv: https://github.com/mpv-player/mpv/blob/master/video/filter/vf_d3d11vpp.c
- Introduced in this commit: https://github.com/mpv-player/mpv/commit/49f73eaf7b6f58e82376fc764ab0743c039d5278 (which introduced the filter)

Reproduction Steps

Play back an interlaced video that contains sharp horizontal edges or text/graphics.
Play it with D3D11 hardware decoding and deinterlacing enabled.
The problem is not always immediately obvious, so look closely: use a fixed scaling of 1.0 or 2.0, or set the resolution of a 4K display to 1600x900 and watch it there.

Expected Behavior

This is how it looks with bwdif (software) deinterlacing:

flickering_bwdif.mp4

Actual Behavior

And here are the results with D3D11VPP deinterlacing:

Intel & AMD

flickering_intel.mp4

The example is with Intel GPU and it looks the same with AMD GPUs.

Nvidia

Nvidia looks more like bwdif at first sight, but that's just because it supports the "blend" deinterlacing mode. The result doesn't show the flickering/shaking, but the picture quality is degraded.

Here's a screenshot from Nvidia:

And this from Intel:

Sample Files

mpv_d3d11_deint_sample.zip


softworkz · 2024-10-28T12:25:20Z

I don't have a build and development workflow for MPV, which is why I'm filing an issue rather than a PR, but I think I've figured out why it doesn't work properly.

Issue 1

The first part that is wrong is this:

mpv/video/filter/vf_d3d11vpp.c

Lines 418 to 428 in a210639

    
    if (!mp_refqueue_should_deint(p->queue)) {
        d3d_frame_format = D3D11_VIDEO_FRAME_FORMAT_PROGRESSIVE;
    } else if (mp_refqueue_is_top_field(p->queue)) {
        d3d_frame_format = D3D11_VIDEO_FRAME_FORMAT_INTERLACED_TOP_FIELD_FIRST;
    } else {
        d3d_frame_format = D3D11_VIDEO_FRAME_FORMAT_INTERLACED_BOTTOM_FIELD_FIRST;
    }
    ID3D11VideoContext_VideoProcessorSetStreamFrameFormat(p->video_ctx,
                                                          p->video_proc,
                                                          0, d3d_frame_format);

wm4 committed this code with the following comment:

Another strange detail is how to select top/bottom fields and field
dominance. At least I'm getting quite similar results to vavpp on Linux,
so I'm content with it for now.

The answer to this is simple: you don't have to do that, because the frames you get from the DXVA decoders always contain both fields already.
In that case, ffmpeg emits the same decoded frame twice, declaring it once as field 0 and once as field 1 (for software output, it also adds the linesize to the data pointer, so that the frame starts on the second line instead of the first).
That's probably where the confusion came from that led to the wrong code quoted above. Directly above those lines is similar code, and that part is correct: it tells the video processor about the field order (whether the top or the bottom field comes first). That is something that almost never changes within a stream.
Yet the quoted code changes the field order on every field, which is obviously not correct.
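
The point above can be sketched as a small pure function. This is only an illustration under assumptions: the enum is a local stand-in for the three D3D11_VIDEO_FRAME_FORMAT values used in the quoted code (so it compiles without the Windows SDK), and `stream_is_top_field_first` is a hypothetical per-stream flag, not an existing mpv function.

```c
#include <stdbool.h>

/* Local stand-ins for the three D3D11_VIDEO_FRAME_FORMAT values. */
enum frame_format {
    FRAME_FORMAT_PROGRESSIVE,
    FRAME_FORMAT_INTERLACED_TFF,
    FRAME_FORMAT_INTERLACED_BFF,
};

/* Derive the frame format from stream-level properties only.
 * Which field is currently being output is deliberately not an input,
 * so the result is identical for both fields of a frame. */
static enum frame_format pick_frame_format(bool should_deint,
                                           bool stream_is_top_field_first)
{
    if (!should_deint)
        return FRAME_FORMAT_PROGRESSIVE;
    return stream_is_top_field_first ? FRAME_FORMAT_INTERLACED_TFF
                                     : FRAME_FORMAT_INTERLACED_BFF;
}
```

Because the current field is not a parameter, the value passed to VideoProcessorSetStreamFrameFormat could no longer flip between the first and second field of the same frame.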

softworkz · 2024-10-28T12:53:57Z

Issue 2

This is about the deinterlacing modes. For illustration, here are some screenshots from DXVAChecker (a nice tool):

Nvidia

Intel

AMD

Summary

Intel and Nvidia expose one "video processor", AMD exposes two (a separate one for deinterlacing), but each of them reports multiple deinterlacing methods, so there isn't one processor for method X and another one for method Y.
So how can you choose?

Original Commit

Not a new question at all - this is another part from the original commit of the D3D11VPP filter:

I'm not sure how to select the deinterlacing mode at all. You can
enumate the available video processors, but at least on Intel, all of
them either signal support for all deinterlacers, or none (the latter is
apparently used for IVTC). I haven't found anything that actually tells
the processor which algorithm to use.

I came to wonder about the same thing yesterday, and even 8 years later it was still tough to find out, because it's just not documented clearly.
The answer here is that the deinterlacing mode is determined implicitly based on other parameters you set and how you are using the individual APIs:

Blend
Blend means that two fields are merged into a single frame, so the frame rate is not doubled and no reference frames are needed: you provide a frame with both fields and get a single frame back.
- To choose Blend, set the output frame rate of the video processor to the same rate as the input frame rate.
Bob
Bob means frame rate doubling without any reference frames. It is also a kind of fallback (see below).
- To choose Bob, set the output frame rate to double the input frame rate and submit the (combined) input frame twice to the processor to get the two output frames.
Adaptive and Motion Compensation
These are the really interesting ones. I don't think you can make an explicit choice between them, but to get one of them, you do the same as for Bob and additionally provide the number of past and future reference frames that the processor indicates (see screenshots). If you don't provide those frames, it automatically falls back to Bob.
Inverse Telecine
Again, this is controlled by the input/output frame rate ratio and the way you feed frames to and receive frames from the processor.
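
The selection rules above can be condensed into a small decision function. This is a sketch of the behaviour as described in this comment, not of any D3D11 API: the enum, the struct and `implied_mode` are hypothetical names, and inverse telecine is left out because the exact rate ratios are not spelled out here.

```c
#include <stdbool.h>

enum deint_mode { MODE_BLEND, MODE_BOB, MODE_ADAPTIVE_OR_MC, MODE_UNKNOWN };

struct deint_setup {
    unsigned in_fps_num, in_fps_den;          /* input frame rate */
    unsigned out_fps_num, out_fps_den;        /* output frame rate */
    unsigned past_needed, future_needed;      /* reported by the processor */
    unsigned past_supplied, future_supplied;  /* reference frames passed in */
};

static enum deint_mode implied_mode(const struct deint_setup *s)
{
    /* Compare the two rates by cross-multiplication (no floating point). */
    unsigned long long in  = (unsigned long long)s->in_fps_num * s->out_fps_den;
    unsigned long long out = (unsigned long long)s->out_fps_num * s->in_fps_den;

    if (out == in)
        return MODE_BLEND;          /* same rate: fields are merged */
    if (out != 2 * in)
        return MODE_UNKNOWN;        /* e.g. IVTC ratios, not modelled here */

    /* Doubled rate: Bob unless all requested reference frames are there. */
    bool wants_refs = s->past_needed > 0 || s->future_needed > 0;
    bool refs_ok = s->past_supplied >= s->past_needed &&
                   s->future_supplied >= s->future_needed;
    return (wants_refs && refs_ok) ? MODE_ADAPTIVE_OR_MC : MODE_BOB;
}
```

For example, 30000/1001 in and 60000/1001 out with no reference frames supplied yields `MODE_BOB`, which matches the fallback described above.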

That's also the answer to this line in the code:

mpv/video/filter/vf_d3d11vpp.c

Line 274 in a210639

// TODO: so, how do we select which rate conversion mode the processor uses?

Conclusions

- The modes "blend" and "ivtc" can be removed right away from the code and docs, because they are not compatible with the implementation (which expects frame doubling).
- The input and output frame rates need to be set in the D3D11_VIDEO_PROCESSOR_CONTENT_DESC structure before creating the enumerator, and probably also on the processor after creation with VideoProcessorSetStreamOutputRate (https://learn.microsoft.com/en-us/windows/win32/api/d3d11/nf-d3d11-id3d11videocontext-videoprocessorsetstreamoutputrate).
- Reference frames should be provided to achieve the best possible deinterlacing results.
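
As a small illustration of the frame rate point, the output rate for frame doubling can be derived from the input rate as a rational number. The struct below is a local stand-in with the same two fields as DXGI_RATIONAL, and `doubled_rate` is a hypothetical helper; where the values get written (content description, stream output rate) is as described above and not shown here.

```c
/* Local stand-in mirroring DXGI_RATIONAL (Numerator / Denominator). */
struct rational {
    unsigned numerator;
    unsigned denominator;
};

/* Output rate for frame-doubling deinterlacing: twice the input rate.
 * An even denominator is halved instead of doubling the numerator,
 * which keeps the values small. */
static struct rational doubled_rate(struct rational in)
{
    struct rational out = in;
    if (out.denominator != 0 && out.denominator % 2 == 0)
        out.denominator /= 2;
    else
        out.numerator *= 2;
    return out;
}
```

With a 30000/1001 input this gives 60000/1001, and 25/1 becomes 50/1.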

softworkz · 2024-10-28T13:29:25Z

Issue 3

When we set input and output framerate on the processor, there's still one question remaining: When input and output rate are different, how to transform N input frames into M output frames?

The key to that is the frame numbers on the input and output side. These need to be set in a way that reflects the relation between input and output frames, and in the current code, that's not right either.

I had looked into various other implementations: all different, and most of them wrong (as I know now). I had my own theory but wasn't sure about it (yet I was right, hehe).
This morning, I came to wonder how GPU vendors are supposed to implement all this stuff on the driver side when the documentation is so vague about many crucial parts. So I looked at "the other side", the Windows driver development documentation, and guess what? All the things that are left out of the consuming API documentation are there...

Required Changes

mpv/video/filter/vf_d3d11vpp.c

Lines 457 to 463 in a210639

    
    D3D11_VIDEO_PROCESSOR_STREAM stream = {
        .Enable = TRUE,
        .pInputSurface = in_view,
    };
    int frame = mp_refqueue_is_second_field(p->queue);
    hr = ID3D11VideoContext_VideoProcessorBlt(p->video_ctx, p->video_proc,
                                              out_view, frame, 1, &stream);

In the Blt call above, a constant value of 0 needs to be supplied instead of frame.
The value of frame needs to be applied to the OutputIndex member of the D3D11_VIDEO_PROCESSOR_STREAM structure.
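
Expressed in code, the proposed change only moves where the field index goes. The sketch below uses local stand-in structs (the real D3D11_VIDEO_PROCESSOR_STREAM has more members than shown), and `prepare_blt` is a hypothetical helper, not mpv code. It models the suggestion only and has not been verified against a driver.

```c
#include <stdbool.h>

/* Stand-in for the relevant part of D3D11_VIDEO_PROCESSOR_STREAM. */
struct stream_params {
    bool enable;             /* corresponds to Enable */
    unsigned output_index;   /* corresponds to OutputIndex */
};

struct blt_params {
    unsigned output_frame;   /* OutputFrame argument of VideoProcessorBlt */
    struct stream_params stream;
};

static struct blt_params prepare_blt(bool is_second_field)
{
    struct blt_params p = {
        .output_frame = 0,   /* constant 0 instead of the field index */
        .stream = {
            .enable = true,
            /* the field index moves into the stream description */
            .output_index = is_second_field ? 1 : 0,
        },
    };
    return p;
}
```

In the real filter, `output_frame` would be passed where `frame` is passed today, and `output_index` would be assigned to `.OutputIndex` in the stream initializer.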

Then all should be good...

Docs: https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/d3dumddi/ns-d3dumddi-_dxvahdddi_stream_data

kasper93 · 2024-10-29T05:15:33Z

Thank you for the detailed analysis. I’m aware this filter is glued together, and when adding scaling, I was really hoping not to run into issues with the rest of this code.

That being said, it’s not as simple as you suggest; neither of your recommendations improves the situation, and in fact, they make it worse. But I only tried for about 10 minutes, so don't take my word too seriously.

I don't have a build and development workflow for MPV

I can look into this when I find the time. Frankly, though, you’re already well ahead with the research you’ve done. Building mpv with MSYS2 is actually quite straightforward. If you want to give it a try, see compile-windows.md. I can do it too, but it’s not a high priority for me, and I probably won't get to it soon.

softworkz · 2024-10-29T17:30:34Z

Thank you for the detailed analysis. I’m aware this filter is glued together, and when adding scaling, I was really hoping not to run into issues with the rest of this code.

I'm sure it's not related to the scaling you have added, it has always been like that (one of our beta users mentioned it's a long-standing issue in MPV).
In fact, the scaling works great. By setting video-unscaled and gpu-dumb-mode and adding some code that adapts the scale factor to the output window size, I was able to achieve playback with much lower energy consumption. The quality may be inferior, but when running on battery, it's probably an acceptable compromise for many.
So for me, the scaling has been a very welcome addition! ❤️

That being said, it’s not as simple as you suggest; neither of your recommendations improves the situation, and in fact, they make it worse. But I only tried for about 10 minutes, so don't take my word too seriously.

I kind of hoped it might be like a 10 minute thing for someone who has a working dev environment.
But as always, devil is in the details, and I'm also not sure whether the Intel and AMD deinterlacers might not depend on getting the reference frames supplied as they are indicating.
One other test I made though is that I ran the test file through ffmpeg with full-qsv hardware transcoding and the deinterlacing from vpp_qsv, so I can confirm that the Intel deinterlacing itself can actually produce proper results.

I can look into this when I find the time. Frankly, though, you’re already well ahead with the research you’ve done. Building mpv with MSYS2 is actually quite straightforward. If you want to give it a try, see compile-windows.md. I can do it too, but it’s not a high priority for me, and I probably won't get to it soon.

Originally I had expected to have to do this at some point, but mpv is very flexible and works very reliably, so my integration via libmpv is almost complete, and this is the first issue where dealing with the mpv source would be needed. Right now, I'm a bit short on time, so let's see who is first to find some time and passion for looking into this. 😆

Thanks

kasper93 · 2024-10-29T21:37:59Z

I kind of hoped it might be like a 10 minute thing for someone who has a working dev environment.
But as always, devil is in the details, and I'm also not sure whether the Intel and AMD deinterlacers might not depend on getting the reference frames supplied as they are indicating.

Yes, I suspect this has to be set properly. In fact, mpv already has a queue for frames, so we can do it without much trouble, apart from understanding what exactly the API expects from us.

I've tested in madVR, which has all this implemented properly (I think), and it produces a proper result, but there is also a little bit of flicker just after new elements show up, which makes me think that it indeed uses previous frames. That makes sense, but it needs some love put into mpv to set it all up correctly.

softworkz · 2024-10-29T21:57:03Z

I have reviewed all implementations on GitHub that use those APIs; if you like, I can send you the links. Those known to be good all provide reference frames. The two samples from Microsoft don't, but those are just samples that keep things simple.

The important thing to consider is that the D3D11 APIs treat a frame as two combined fields, while the refqueue has one frame per field (even though each pair of frames contains both fields). So if the current refqueue frame is frame 0 and the deinterlacer wants two future frames, we need to provide refqueue frames 2 and 4 (not 1 and 2).
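
The index mapping can be written down as a tiny helper. This is a sketch under the assumption stated above (two refqueue entries per combined frame); `future_ref_positions` is a hypothetical name, and it only computes positions, it does not touch the actual refqueue.

```c
#include <stddef.h>

/* Fill `out` with the refqueue positions of the next `count` future
 * reference frames, assuming the refqueue holds two entries (one per
 * field) for every combined frame that the video processor counts as one. */
static void future_ref_positions(int current, size_t count, int *out)
{
    for (size_t i = 0; i < count; i++)
        out[i] = current + 2 * (int)(i + 1);
}
```

For `current = 0` and `count = 2` this yields positions 2 and 4, as in the example above.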

softworkz added the os:win label Oct 28, 2024
