TGA file footer can be used to create adversarial examples #596

Open
lebr0nli opened this issue Jul 27, 2024 · 1 comment

@lebr0nli

I noticed that Magika's model seems to be overly sensitive to the TGA file footer, which makes it easy to craft adversarial examples.
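
For illustration, here is a minimal sketch of the idea: a TGA v2 file ends with a 26-byte footer (two 4-byte offsets followed by the signature TRUEVISION-XFILE, a dot, and a NUL byte), and simply appending that footer to an unrelated file already pushes the model towards tga. The file name below is a placeholder, and whether this alone is enough to reach score 1.0 depends on the model and the rest of the file's content:

$ cp /bin/true sample.bin
$ head -c 8 /dev/zero >> sample.bin            # extension + developer directory offsets, both zero
$ printf 'TRUEVISION-XFILE.\0' >> sample.bin   # TGA v2 footer signature
$ magika sample.bin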

I also made a CTF challenge about this; my write-up describes how I found this behavior and crafted the adversarial example.

Here's an adversarial example I created that causes an ELF file to be mistakenly identified as a TGA file:

poc.so

The above adversarial example can be built from the attached poc.s with nasm -f bin -o poc.so poc.s

Since GitHub doesn't allow these files to be uploaded directly, I had to zip them; you'll need to unzip them first.

If you run LD_PRELOAD=./poc.so /bin/cat on x86-64 Linux, you should see /bin/id being executed, which shows that this is definitely a valid ELF file, not a TGA file.

However, Magika identifies it as a TGA file with a score of 1.0:

$ nasm -f bin -o poc.so poc.s
$ LD_PRELOAD=./poc.so /bin/ls
uid=0(root) gid=0(root) groups=0(root)
$ magika --json poc.so
[
    {
        "path": "poc.so",
        "dl": {
            "ct_label": "tga",
            "score": 1.0,
            "group": "image",
            "mime_type": "image/x-tga",
            "magic": "Targa image data",
            "description": "Targa image data"
        },
        "output": {
            "ct_label": "tga",
            "score": 1.0,
            "group": "image",
            "mime_type": "image/x-tga",
            "magic": "Targa image data",
            "description": "Targa image data"
        }
    }
]
@lebr0nli added the misdetection and needs triage labels on Jul 27, 2024
@invernizzi
Member

Thank you for reporting the issue, Alan! (and nice write-up - we're stoked)

Yes, the current versions of Magika (v1 and v2) analyse only a portion of the file, so bypass attacks are possible whenever the attacker's file format allows those portions to be kept in the right places. This choice lets us maintain near-constant execution time irrespective of file size, which is a nice property to have for high-throughput deployments. However, it does have drawbacks, as we mention in the limitations section of the README.md and in the soon-to-be-released paper.
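
As a rough way to observe that near-constant-time property (timings are dominated by model loading and will vary by machine; the file names below are just placeholders):

$ head -c 4K /dev/urandom > small.bin
$ head -c 1G /dev/urandom > big.bin
$ time magika small.bin
$ time magika big.bin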

Fortunately, Magika v2 correctly detects the file as ELF with a score of 0.999+. This uses the draft_standard_v2 model - check out the Rust implementation, which already uses it. That said, we'll be looking into improving Magika's resilience against adversarial attacks, though making Magika completely adversarial-attack proof is likely an impossible task.
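
Roughly, from a checkout of this repository (treat the crate path below as a placeholder and adjust to the actual layout under rust/):

$ cd rust/cli
$ cargo run --release -- poc.so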

Since v2 correctly detects this sample, we'll close the bug. Before that happens, though - @reyammer, maybe we can add this to our test suite, just to keep this approach around. For that to happen, Alan has to sign the Google CLA (contributor agreement).

@invernizzi added the adversarial label and removed the misdetection and needs triage labels on Aug 8, 2024
@reyammer added the misdetection label on Aug 20, 2024