The format:stereo-in filter doesn't work when encoding videos #15225

Open
6 tasks done
Arcitec opened this issue Oct 30, 2024 · 5 comments

Comments

Arcitec commented Oct 30, 2024

mpv Information

mpv 0.39.0 Copyright © 2000-2024 mpv/MPlayer/mplayer2 projects
libplacebo version: v6.338.2
FFmpeg version: 6.1.2
FFmpeg library versions:
   libavcodec      60.31.102
   libavdevice     60.3.100
   libavfilter     9.12.100
   libavformat     60.16.100
   libavutil       58.29.100
   libswresample   4.12.100
   libswscale      7.5.100

Important Information

Fedora 40, built mpv manually from 0.39.0.

Reproduction Steps

First play the video normally. You will see correctly rendered subtitles (they will be duplicated at the top and bottom halves of the video):

mpv --no-config --vf="format:stereo-in=ab2l" <videofile>

Now use mpv as an encoder. The filter now fails and the subtitles only render once (at the bottom of the screen).

mkfifo /tmp/mpv-enc
mpv --no-config --o=/tmp/mpv-enc --of=nut --ovc=rawvideo --oac=pcm_s16le --vf="format:stereo-in=ab2l" <videofile>

To see the streaming video of the second example, open a second terminal and run mpv /tmp/mpv-enc after having started the "stream" above.

Expected Behavior

Since I am re-encoding 3D videos and burning-in the subtitles, I need the subtitles to appear in 3D format, which is the purpose of that filter. But it's not working in encoding mode. :(

Actual Behavior

The filter is ignored. I see from the log that the filter/its metadata is being inserted, but the correct subtitle rendering code isn't being activated in mpv's encoding mode.

Log File

output2.txt

Sample Files

No response

I carefully read all instructions and confirm that I did the following:

  • I tested with the latest mpv version to validate that the issue is not already fixed.
  • I provided all required information including system and mpv version.
  • I produced the log file with the exact same set of files, parameters, and conditions used in "Reproduction Steps", with the addition of --log-file=output.txt.
  • I produced the log file while the behaviors described in "Actual Behavior" were actively observed.
  • I attached the full, untruncated log file.
  • I attached the backtrace in the case of a crash.

Arcitec commented Oct 30, 2024

I really need mpv's fantastic subtitle renderer when doing these re-encodes. The way mpv lets me style the subtitles is fantastic. Just need to solve this 3D issue...

I had a theory that the built-in [encoding] profile disabling the OSC might be the reason why rendering fails, which is seen in the built-in config here:

mpv/etc/builtin.conf

Lines 32 to 41 in b402418

[encoding]
vo=lavc
ao=lavc
keep-open=no
force-window=no
gapless-audio=yes
resume-playback=no
load-scripts=no
osc=no
framedrop=no

But even after copying that profile to my own mpv.conf and setting osc=yes and load-scripts=yes, I do not see an OSC and the subtitles do not render in 3D, even though I confirmed that scripts are loading. So that's a dead end.

Well, the core issue is that the 3D rendering of subtitles fails in encoding mode, and it seems like that will require some actual code changes...

Arcitec commented Oct 30, 2024

I have been researching more and have the following information which may be helpful:

So, since I think mpv uses libass for rendering, the process is similar in mpv's code.

But since mpv already has an implementation of 3D SBS/OU subtitle rendering in "pseudo-gui/OSC" mode, I hope all the code already exists in mpv to make this 3D rendering change for the encoder filter pipeline too.

Arcitec commented Nov 1, 2024

I just began deeper investigations of the source code to see how to implement this.

The 3D subtitle rendering is handled as follows:

  • The format:stereo-in filter applies metadata to the video the same way as if the metadata were embedded in a Matroska container. This can be seen by searching for "ab2l" ("above below" 3D mode), which is one of the identifiers, in these files: csputils.c and csputils.h.
  • When the format:stereo-in filter is applied, the 3D mode metadata is stored in the video's p->stereo3d property via vf_format.c and command.c. I am 99% sure that this is the same property that's set when opening a Matroska file that has an embedded "StereoMode" metadata element. The filter is basically a way to force that metadata into files that don't have it.
  • The actual double subtitle rendering is handled by video/out/gpu/osd.c, via the get_3d_side_by_side function, which sets the screen div (split) mode. When the screen split is applied, it takes the current "GUI (and subtitles I assume) rendering framebuffer", duplicates it, and scales each copy, thereby rendering the GUI side by side. I might not have the exact specifics right, but this is basically what it does.
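
To make the "screen div" idea concrete, here is a minimal standalone sketch of how a stereo mode could map to a horizontal/vertical split count. It is not mpv's code: the enum `stereo_mode` and the function `stereo_to_div` are hypothetical names invented for illustration, and the three modes are a simplification of the real mode list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the stereo mode stored in the video parameters. */
enum stereo_mode {
    STEREO_MONO, /* plain 2D video */
    STEREO_SBS,  /* side by side */
    STEREO_AB,   /* above/below, e.g. "ab2l" */
};

/* Fill div[0] with the number of horizontal parts and div[1] with the
 * number of vertical parts that the overlay should be repeated in. */
void stereo_to_div(enum stereo_mode mode, int div[2])
{
    assert(div != NULL);
    div[0] = div[1] = 1;
    switch (mode) {
    case STEREO_SBS:  div[0] = 2; break; /* left and right halves */
    case STEREO_AB:   div[1] = 2; break; /* top and bottom halves */
    case STEREO_MONO: break;             /* no split for 2D video */
    }
}
```

With a split of {1, 2}, a renderer would draw the same overlay once per half, each scaled to half of the frame height.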

Now... here's what that means for encoding:

  • Because the current 3D subtitle support is implemented inside the OSD renderer and is written exclusively for the gpu output mode, it does not exist in encoding mode, which can ONLY use the lavc output mode, as seen in the [encoding] section of builtin.conf.
  • Therefore I am 99.999999999999% sure that video/out/vo_lavc.c is the correct location for implementing p->stereo3d video property "3D subtitle" rendering support.
  • And since we cannot leverage the "render the OSD twice" trick that GPU output uses, we may have to insert lavfi subtitle rendering filters twice instead, as I described in the previous comment. Doing them above/below is simple via libASS "Alignment" values (one in the middle, one at the bottom). Doing side-by-side "centered in each half-frame" may be hard, and I may have to opt to only implement "above/below" mode.

Does anyone have any insights or advice that could help figure out the rest of this journey? Perhaps even some ideas for using the OSD "double rendering" technique in encode mode? Maybe we can render the subtitles as a layer and duplicate it directly?

Ultimately, I'm trying to have fun with my brother and watch 3D movies with great subtitles and direct re-encoding from mpv to the streaming server (which is a project that I'll release as a GitHub repo when it's complete), which unfortunately isn't possible currently with mpv...

Arcitec commented Nov 1, 2024

I just had an idea, assuming we can build a filter graph. This rendering would be even better than mpv's official GUI/OSD-based rendering:

  • Render with libASS to a libav filter graph layer. (I'll have to check how you do it currently; ffmpeg itself has built-in support for libass rendering too.)
  • Copy the layer via the split filter.
  • Squash both layers to 50% width or height.
  • Overlay them on top of the video.

This would actually implement proper ANAMORPHIC subtitle stretching, which mpv doesn't currently do. (So when the 3D is unstretched, mpv's current GUI-based 3D Subtitles are being stretched and elongated.)

Although I'd have to be careful with the difference between anamorphic 3D: half-OU/half-SBS, and non-anamorphic 3D: full-OU/full-SBS. Squashing should only be done when the input is half-resolution (anamorphic). And mpv's stereo3d flags currently make no distinction between those.

So I'd need to add an extra format:stereo-in filter option which says anamorphic=true to decide whether to squash or not.
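
A minimal sketch of the squash-or-not decision, under my reading of the terms above: for anamorphic (half) input each eye is displayed stretched to the full frame size, so the subtitle layer is rendered at full frame size and squashed into each half; for full input it is rendered at the per-eye size and copied unscaled. All names here (`plan_subs`, `sub_plan`, `layout`, the `anamorphic` parameter) are hypothetical and not part of mpv or FFmpeg.

```c
#include <assert.h>
#include <stdbool.h>

enum layout { LAYOUT_OU, LAYOUT_SBS }; /* over/under or side by side */

struct rect { int x, y, w, h; };

struct sub_plan {
    int render_w, render_h; /* size at which the subtitle layer is rendered once */
    struct rect dst[2];     /* where each copy lands in the packed frame */
};

struct sub_plan plan_subs(int frame_w, int frame_h, enum layout layout,
                          bool anamorphic)
{
    assert(frame_w > 0 && frame_h > 0);
    assert(frame_w % 2 == 0 && frame_h % 2 == 0);

    struct sub_plan p;
    /* Size of one eye's region inside the packed frame. */
    int eye_w = layout == LAYOUT_SBS ? frame_w / 2 : frame_w;
    int eye_h = layout == LAYOUT_OU ? frame_h / 2 : frame_h;

    /* Anamorphic: render at full frame size, then squash into the eye region.
     * Non-anamorphic: render directly at eye size, so no scaling is needed. */
    p.render_w = anamorphic ? frame_w : eye_w;
    p.render_h = anamorphic ? frame_h : eye_h;

    p.dst[0] = (struct rect){0, 0, eye_w, eye_h};
    p.dst[1] = layout == LAYOUT_SBS ? (struct rect){eye_w, 0, eye_w, eye_h}
                                    : (struct rect){0, eye_h, eye_w, eye_h};
    return p;
}
```

For example, a 1920x1080 half-OU frame would render the layer once at 1920x1080 and place two 1920x540 copies at y=0 and y=540.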

Anyway... lavfi filtering like this, if possible, would be the best way to implement subtitles for both above-below and side-by-side modes. It means the subtitle renderer only has to do its job once, and then we do a cheap copy, optional squash, and overlay.

If we're able to make filter graphs for the subtitle renderer in the lavc output, I could prototype it via pure ffmpeg tomorrow and design a graph that does such rendering for all combos: OU, SBS, and anamorphic variants of each.

Arcitec commented Nov 1, 2024

Oh this reminds me... gpu (and gpu-next) work with libass directly and render to a separate layer that's unconstrained by the video resolution. This is for rendering sharp text on top of the GUI window regardless of video resolution. It's a good solution for GUIs but not for video encoding filter graphs.

So the "3D subtitles" code in the OSD is probably totally useless for video encoding.

We'll need a filter graph instead, if that's possible. Does anyone know where video/out/vo_lavc.c currently does the subtitle rendering? I haven't seen it yet and need to go to bed now.

Edit: Oh... vo_lavc.c has this code:

#include "sub/osd.h"

static void draw_frame(struct vo *vo, struct vo_frame *voframe)
{
    struct priv *vc = vo->priv;
    struct encoder_context *enc = vc->enc;
    struct encode_lavc_context *ectx = enc->encode_lavc_ctx;
    AVCodecContext *avc = enc->encoder;

    if (voframe->redraw || voframe->repeat || voframe->num_frames < 1)
        return;

    struct mp_image *mpi = voframe->frames[0];

    struct mp_osd_res dim = osd_res_from_image_params(vo->params);
    osd_draw_on_image(vo->osd, dim, mpi->pts, OSD_DRAW_SUB_ONLY, mpi);

(The first and last lines are most relevant here.)

I'll have to investigate tomorrow to see whether those calls can be adapted to draw twice. I guess it's drawing directly on top of the video frame via libass then, before the frame is sent to the lavc encoder! But I haven't checked in depth yet.

It might be possible to implement both the 3D subtitle rendering and anamorphic squashing (by rendering to a separate, squashed layer before compositing it), but I'll focus on just getting 3D rendering working in the encoder at first.


The hackiest part will be figuring out how to adapt the OSD_DRAW_SUB_ONLY mode of osd_draw_on_image to draw twice, but only in encode mode, since the OSD handles double-rendering for the normal GPU GUI... And that needs to stay that way, since the GUI intentionally renders sharp, crisp subtitles outside the video frame.

So I'll need to add some flag to the "osd subtitle renderer" to say "render the subtitles in 3D mode if p->stereo3d has a known 3D value". This flag would be on when running via encoder, and off when running via GUI. Unless you have a cleaner idea?


Edit: Yeah... I think it would be clean to do OSD_DRAW_SUB_ONLY | OSD_DRAW_SUB_STEREO3D where the latter is the flag that enables 3D rendering in the subtitle bitmap generator in osd.c. Then I'll "just" have to make the osd.c renderer capable of acting on that flag. If that flag is enabled, it will analyze the video's p->stereo3d property to ultimately decide how to render the subtitle (2D or 3D).
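
As a rough sketch of that flag idea: the renderer would only duplicate the subtitles when the caller opts in and the video is actually tagged as 3D. The flag values below are illustrative and not copied from mpv's sub/osd.h, DRAW_SUB_STEREO3D is the proposed (hypothetical) flag, and `sub_copies` is a made-up helper name.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; they are not taken from mpv's headers. */
enum draw_flags {
    DRAW_SUB_ONLY     = 1 << 0, /* draw subtitles but no other OSD parts */
    DRAW_SUB_STEREO3D = 1 << 1, /* hypothetical: duplicate subtitles for 3D */
};

/* Return how many times the subtitle bitmap should be placed in the frame. */
int sub_copies(int flags, bool video_is_stereo3d)
{
    assert(flags >= 0);
    /* Both conditions must hold: the caller (the encoder) opted in,
     * and the video carries a known 3D mode. */
    if ((flags & DRAW_SUB_STEREO3D) && video_is_stereo3d)
        return 2;
    return 1;
}
```

The encoder path would pass both flags, while the GUI path would keep passing only the subtitle flag and therefore keep its existing behavior.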

Hopefully there's a way to give the subtitle renderer the "usable region (x/y/width/height)" so I can tell it to only render in half of the frame, and then I can manually copy those pixels to the 2nd half. But since it seems to draw directly on the video frame, I would need to first draw to a temporary buffer and then composite the buffer twice onto the frame. Alternatively, I could call libass rendering twice, but that's wasteful. I am guessing that the current GUI 3D subtitle code doesn't render twice? Or does it? If it does, then I guess the performance of doing that is fine.

Most likely, my first implementation will call the subtitle rendering twice. Just to get things rolling. Then we can look at using a draw buffer for the copy instead, but I would need guidance for such an advanced task in this unfamiliar codebase.
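
For reference, the "draw buffer" variant could look roughly like the sketch below: the subtitles are rendered once into a temporary gray-plus-alpha overlay, which is then alpha-blended into both halves of the frame. This is a simplified standalone illustration on a single 8-bit plane, not mpv's drawing code; `blend_at` and `blend_above_below` are hypothetical names, and real frames would need per-plane handling and strides.

```c
#include <assert.h>
#include <stdint.h>

/* Alpha-blend a w x h overlay (8-bit gray plus 8-bit alpha, tightly packed)
 * onto an 8-bit plane of size frame_w x frame_h at offset (dx, dy). */
void blend_at(uint8_t *frame, int frame_w, int frame_h,
              const uint8_t *ov, const uint8_t *ov_alpha,
              int w, int h, int dx, int dy)
{
    assert(dx >= 0 && dy >= 0 && dx + w <= frame_w && dy + h <= frame_h);
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int a = ov_alpha[y * w + x];
            uint8_t *dst = &frame[(dy + y) * frame_w + dx + x];
            /* Rounded blend: result = ov * a/255 + dst * (1 - a/255). */
            *dst = (uint8_t)((ov[y * w + x] * a + *dst * (255 - a) + 127) / 255);
        }
    }
}

/* Above/below layout: composite the same half-height overlay into the
 * top half and the bottom half of the frame. */
void blend_above_below(uint8_t *frame, int frame_w, int frame_h,
                       const uint8_t *ov, const uint8_t *ov_alpha)
{
    assert(frame_h % 2 == 0);
    int half = frame_h / 2;
    blend_at(frame, frame_w, frame_h, ov, ov_alpha, frame_w, half, 0, 0);
    blend_at(frame, frame_w, frame_h, ov, ov_alpha, frame_w, half, 0, half);
}
```

The subtitle renderer runs only once here; the second copy costs one extra blend pass over half of the frame.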
