
Implement feature extraction module for verification #208

Open
oscar-davids opened this issue Oct 5, 2020 · 3 comments
@oscar-davids
Contributor

oscar-davids commented Oct 5, 2020

Abstract

In our decentralized video network, it is important to detect tampered videos in order to prevent the myriad of malicious attacks that seek to misinform or cheat the digital multimedia audience.
This is a proposal for implementing a verification module in the lpms engine.

Motivation

We have already developed a verifier that runs on the broadcaster; it has fairly good accuracy for detecting tampered videos.
But we still need to outsource verification to reduce the computational cost on the broadcaster. We could also introduce a TTP (Trusted Third Party), similar to Zcash, and use it to implement the verification workflow.
To implement this, the feature values to be used for verification must first be calculated during transcoding.

Proposed Solution

The feature extraction module should run simultaneously with transcoding, in real time. We also want the feature diff between the source and the rendition(s), as well as between the source and the extra video.
From this viewpoint, I am going to integrate parallel, independent decoding of the extra video with the ffmpeg API for calculating the feature matrix.
The diagram below shows the workflow of the feature extraction module in the lpms engine.
In this diagram, if the number of transcoding profiles is two, we would get three different feature matrices:
one is the feature diff between the source and the extra video; the others are the feature diffs between the source and the renditions.

[diagram: extract_workflow3]

  • Go Level

As we can see in the diagram above, the parameters for transcoding (+verification) will be two local video URLs.
Therefore, alongside the existing function, we should add one that takes a string array as its parameter.

func (t *FFMpegSegmentTranscoder) Transcode(fname string) ([][]byte, error) {

The return values are the rendition byte arrays and the final diff scores between the original and the extra video, in JSON form.
func (t *FFMpegSegmentTranscoder) Transcode2(fname []string) ([][]byte, []string, error) {
… … …
}
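Since the diff scores come back as JSON strings, the Go caller would need to parse them. Below is a minimal, self-contained sketch of that step; `parseDiffScores` and the flat per-feature JSON shape are assumptions for illustration, not a defined schema:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// diffScore is an assumed shape for one JSON diff-score entry; the real
// field set would match whatever the C level emits.
type diffScore map[string]float64

// parseDiffScores decodes the JSON strings returned by Transcode2 into
// per-feature score maps, one per comparison.
func parseDiffScores(raw []string) ([]diffScore, error) {
	out := make([]diffScore, len(raw))
	for i, s := range raw {
		if err := json.Unmarshal([]byte(s), &out[i]); err != nil {
			return nil, fmt.Errorf("diff score %d: %w", i, err)
		}
	}
	return out, nil
}

func main() {
	// e.g. one score for a rendition, one for the extra video
	scores, err := parseDiffScores([]string{`{"dct_l1":0.12}`, `{"dct_l1":0.05}`})
	if err != nil {
		panic(err)
	}
	fmt.Println(scores[0]["dct_l1"])
}
```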

  • C Level

As with the Go level, we have to add a function that takes two local video URLs (original & extra) as inputs:

int lpms_transcode(input_params *inp, output_params *params,
output_results *results, int nb_outputs, output_results *decoded_results)

int lpms_transcode2(input_params *inp, int nb_inputs, output_params *params,
output_results *results, int nb_outputs, output_results *decoded_results)

struct transcode_thread {
int initialized;
struct input_ctx ictx;
struct output_ctx outputs[MAX_OUTPUT_SIZE];
int nb_outputs;
... ... ...
AVFrame *list_frame_original;
AVFrame *list_frame_renditions;
};

And a list will be added to the "transcode_thread" or "input_ctx" structure to capture frames based on random frame indices. During transcoding, the frames of the original and the renditions are stored into this list at our source here and here.

Meanwhile, a thread is used to perform decoding and frame capture of the extra video in parallel.
Finally, we call the lvpdiff API of ffmpeg to calculate the frame difference scores and generate the return value.
I will refer to here and here in our code.
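The random frame capture step can be sketched in Go. `randomFrameIndices` is a hypothetical helper, not part of lpms, but it shows the key constraint: the indices must be distinct and sorted, and the same indices must be used for the original and for each rendition so that corresponding frames are compared:

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// randomFrameIndices picks k distinct frame positions out of nFrames,
// sorted ascending so a decoder can capture them in a single pass. The
// same seed (and thus the same indices) would be shared between the
// original and the renditions so corresponding frames are compared.
func randomFrameIndices(nFrames, k int, seed int64) []int {
	r := rand.New(rand.NewSource(seed))
	idx := r.Perm(nFrames)[:k] // first k of a random permutation: distinct
	sort.Ints(idx)
	return idx
}

func main() {
	// e.g. capture 4 of a segment's 120 frames for feature extraction
	fmt.Println(randomFrameIndices(120, 4, 42))
}
```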

Ideally, this function with two inputs will take 0.2–0.4 seconds more than the original transcoding function with one input.

Testing and Considerations

As with the original transcoding function that takes one file input, a test function covering two inputs should be added to "ffmpeg_test.go".
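A minimal sketch of the shape such a test might assert, independent of the real lpms bindings (the helper name and its expectations are assumptions drawn from this proposal): n transcoding profiles should yield n renditions and n+1 diff scores (one per rendition, plus one for the source vs. the extra video).

```go
package main

import "fmt"

// checkTwoInputResult is a hypothetical assertion helper for the new test:
// given nProfiles transcoding profiles, the two-input transcode should
// return nProfiles renditions and nProfiles+1 diff scores.
func checkTwoInputResult(renditions [][]byte, diffs []string, nProfiles int) error {
	if len(renditions) != nProfiles {
		return fmt.Errorf("want %d renditions, got %d", nProfiles, len(renditions))
	}
	if len(diffs) != nProfiles+1 {
		return fmt.Errorf("want %d diff scores, got %d", nProfiles+1, len(diffs))
	}
	return nil
}

func main() {
	// shapes a real test would receive from the two-input transcode:
	// two profiles -> two renditions, three diff scores
	fmt.Println(checkTwoInputResult(make([][]byte, 2), make([]string, 3), 2))
}
```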

References

https://www.notion.so/livepeer/Real-Time-Verification-Thoughts-0a0ad16546a54dc3b77589f01f2bc333
https://github.com/livepeer/verification-classifier/tree/master/scripts
https://stackoverflow.com/questions/37353250/may-i-use-opencv-in-a-customized-ffmpeg-filter
https://en.wikipedia.org/wiki/Zcash
https://en.wikipedia.org/wiki/Zero-knowledge_proof

@jailuthra
Contributor

jailuthra commented Oct 6, 2020

Looks good @oscar-davids! Just a couple suggestions around the transcode function's argument names:

Rather than having input filenames/params as an array, can we instead have a separate argument for the extra video?

 func (t *FFMpegSegmentTranscoder) Transcode2(fname string, extraname string) ([][]byte, []string, error) {

Similarly rather than multiple objects of input_params in an array, we can send a new argument extra_inp -

int lpms_transcode2(input_params *inp, input_params *extra_inp, output_params *params,
output_results *results, int nb_outputs, output_results *decoded_results) {
  //...
}

I'm leaning towards not mixing the extra segment with the inputs in the API - as in the future there might be a use case where we want to have multiple input files (one audio, one video, one subs or whatever) that are completely unrelated to the extra segment used for verification. And anyway, as the normal segments go through a different pipeline compared to the extra segment (which won't go through the usual filter+encode according to the diagram), it makes sense to distinguish them clearly for the LPMS user.

Also feel free to name functions TranscodeAndVerify and similar if you think that would be better, instead of using numbers like we've been using before.

Rest of the proposal like having diffscores as an array of n+1 size with n output renditions etc. all sounds good to me :)

@oscar-davids
Contributor Author

I'm leaning towards not mixing the extra segment with the inputs in the API

From the API point of view this makes sense. I agree to change the function and parameter name.

func (t *FFMpegSegmentTranscoder) TranscodeAndVerify(fname string, extraname string) ([][]byte, []string, error) {

int lpms_transcodeandverify(input_params *inp, input_params *extra_inp, output_params *params, output_results *results, int nb_outputs, output_results *decoded_results) { //... }

@yondonfu
Member

yondonfu commented Oct 6, 2020

I see that the spec currently references

func (t *FFMpegSegmentTranscoder) Transcode(fname string) ([][]byte, error) {

go-livepeer actually doesn't use FFmpegSegmentTranscoder right now and instead primarily uses the methods defined in ffmpeg.go all of which basically wrap Transcode(). The API for this method is a bit different than the transcode method for FFmpegSegmentTranscoder. At the moment it is:

Transcode(input *TranscodeOptionsIn, ps []TranscodeOptions) (*TranscodeResults, error)

So, I think we should consider how we can either leverage the structs used in this API to pass around the data that we need or what API changes are necessary if the existing structs are not sufficient for what we need to do.

The return values are renditions byte array and the final diff score between original and extra video in the json form.

Is there a particular reason that the diff scores need to be a JSON string? Instead of returning the diff scores as a JSON string, would it make sense to use structs to avoid additional parsing at the Go level? Something like:

type TranscodeResults struct {
    ...
    FeatureDiffs []FeatureDiff
}

type VerifyFeature int

const (
    VerifyFeature_DCT_L1 VerifyFeature = iota
    VerifyFeature_Gauss_MSE
    VerifyFeature_Gauss_L1
    VerifyFeature_Gauss_Threshold_L1
    VerifyFeature_Hist_Chi
)

type FeatureDiff []VerifyFeature

Not sure if the VerifyFeature enum is needed - might be useful if the broadcaster code needs to pass the feature diff to the model for inference?
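If the enum is kept, adding a Stringer makes the feature order explicit when logging diffs or handing them to the model for inference. A self-contained sketch mirroring the suggested constants (the string spellings are assumptions):

```go
package main

import "fmt"

// VerifyFeature mirrors the suggested enum of per-frame diff features.
type VerifyFeature int

const (
	VerifyFeature_DCT_L1 VerifyFeature = iota
	VerifyFeature_Gauss_MSE
	VerifyFeature_Gauss_L1
	VerifyFeature_Gauss_Threshold_L1
	VerifyFeature_Hist_Chi
)

// String returns an assumed human-readable name, indexed by ordinal,
// which also pins down the feature ordering in a serialized diff vector.
func (f VerifyFeature) String() string {
	return [...]string{
		"dct_l1", "gauss_mse", "gauss_l1", "gauss_threshold_l1", "hist_chi",
	}[f]
}

func main() {
	// fmt uses the Stringer, so enum values print as their feature names
	fmt.Println(VerifyFeature_Gauss_MSE)
}
```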

Rather than having input filenames/params as an array, can we instead have a separate argument for the extra video?

Could we specify the extra filename in TranscodeOptionsIn? i.e.

type TranscodeOptionsIn struct {
    Fname string
    ExtraFname string
    ...
}

Something that isn't mentioned in the spec right now is the use of the lvpdiff filter vs the cuda_lvpdiff filter. We'll want to have a way to use the cuda_lvpdiff filter by checking if we're using Nvidia. We can probably use a check like input.Accel == Nvidia in the Transcode() method.
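That filter selection could be as simple as the sketch below; `Acceleration`, `Nvidia`, and `diffFilterName` are stand-ins for the corresponding lpms types, shown only to illustrate the check:

```go
package main

import "fmt"

// Acceleration is an assumed stand-in for lpms's Accel type.
type Acceleration int

const (
	Software Acceleration = iota
	Nvidia
)

// diffFilterName sketches the proposed check: use the cuda_lvpdiff filter
// when transcoding on Nvidia hardware, and plain lvpdiff otherwise.
func diffFilterName(accel Acceleration) string {
	if accel == Nvidia {
		return "cuda_lvpdiff"
	}
	return "lvpdiff"
}

func main() {
	fmt.Println(diffFilterName(Nvidia))
	fmt.Println(diffFilterName(Software))
}
```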
