TGA file footer can be used to create adversarial examples #596

Open
lebr0nli opened this issue Jul 27, 2024 · 1 comment

@lebr0nli

I noticed that Magika's model seems to be overly sensitive to the TGA file footer, which makes it easy to craft adversarial examples.
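
For illustration, here is a minimal sketch of the idea: a TGA v2 file ends with a 26-byte footer (two 4-byte offsets followed by the signature TRUEVISION-XFILE, a dot, and a NUL byte), and simply appending that footer to an unrelated file already pushes the model towards tga. The file name below is a placeholder, and whether this alone is enough to reach score 1.0 depends on the model and the rest of the file's content:

$ cp /bin/true sample.bin
$ head -c 8 /dev/zero >> sample.bin            # extension + developer directory offsets, both zero
$ printf 'TRUEVISION-XFILE.\0' >> sample.bin   # TGA v2 footer signature
$ magika sample.bin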

I also made a CTF challenge about this; my write-up describes how I found this behavior and crafted the adversarial example.

Here's an adversarial example I created that causes an ELF file to be mistakenly identified as a TGA file:

poc.so

The above adversarial example can be built from the attached poc.s with nasm -f bin -o poc.so poc.s

Since GitHub doesn't allow these files to be uploaded directly, I had to zip them; you'll need to unzip them first.

If you run LD_PRELOAD=./poc.so /bin/cat on x86-64 Linux, you should see /bin/id being executed, which shows that this is definitely a valid ELF file, not a TGA file.

However, Magika identifies it as a TGA file with a score of 1.0:

$ nasm -f bin -o poc.so poc.s
$ LD_PRELOAD=./poc.so /bin/ls
uid=0(root) gid=0(root) groups=0(root)
$ magika --json poc.so
[
    {
        "path": "poc.so",
        "dl": {
            "ct_label": "tga",
            "score": 1.0,
            "group": "image",
            "mime_type": "image/x-tga",
            "magic": "Targa image data",
            "description": "Targa image data"
        },
        "output": {
            "ct_label": "tga",
            "score": 1.0,
            "group": "image",
            "mime_type": "image/x-tga",
            "magic": "Targa image data",
            "description": "Targa image data"
        }
    }
]
@lebr0nli added the misdetection and needs triage labels on Jul 27, 2024
@invernizzi
Member

Thank you for reporting the issue, Alan! (and nice write-up - we're stoked)

Yes, the current versions of Magika (v1 and v2) analyse only a portion of the file, so bypass attacks are possible whenever the attacker's file format allows those portions to be kept in the right places. This choice lets us maintain near-constant execution time irrespective of file size, which is a nice property to have for high-throughput deployments. However, it does have drawbacks, as we mention in the limitations section of the README.md and in the soon-to-be-released paper.
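
As a rough way to observe that near-constant-time property (timings are dominated by model loading and will vary by machine; the file names below are just placeholders):

$ head -c 4K /dev/urandom > small.bin
$ head -c 1G /dev/urandom > big.bin
$ time magika small.bin
$ time magika big.bin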

Fortunately, Magika v2 correctly detects the file as ELF with a score of 0.999+. This uses the draft_standard_v2 model - check out the Rust implementation, which already uses it. That said, we'll be looking into improving Magika's resilience against adversarial attacks, though making Magika completely adversarial-attack proof is likely an impossible task.
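
Roughly, from a checkout of this repository (treat the crate path below as a placeholder and adjust to the actual layout under rust/):

$ cd rust/cli
$ cargo run --release -- poc.so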

Since v2 correctly detects this sample, we'll close the bug. Before that happens, though - @reyammer, maybe we can add this to our test suite, just to keep this approach around. For that to happen, Alan has to sign the Google CLA (contributor agreement).

@invernizzi added the adversarial label and removed the misdetection and needs triage labels on Aug 8, 2024
@reyammer added the misdetection label on Aug 20, 2024