Make more information available in the reports #207

Open
StefanFl opened this issue Dec 24, 2021 · 13 comments
Labels
component:output-formats Supported output formats enhancement New feature or request upstream Items that require upstream work or coordination

Comments

@StefanFl

Is your feature request related to a problem? Please describe.

The Python Packaging Advisory Database contains more information than is currently available in pip-audit's reports, in particular the references and the aliases.

Describe the solution you'd like

References and aliases should be available in the JSON and other reports. I am currently writing a parser to import pip-audit reports in DefectDojo (https://github.com/DefectDojo/django-DefectDojo) and it would be great if users could see as much information as possible there to be able to assess the vulnerability.

A severity would be great as well, but that doesn't seem to be part of the information in the database.

Describe alternatives you've considered

The only alternative I see is to include a reference to the vulnerability in the Python Packaging Advisory Database in DefectDojo findings, but it would be more convenient for users to get the information directly.

@StefanFl StefanFl added the enhancement New feature or request label Dec 24, 2021
@woodruffw
Member

Thanks for the request! I'm aware of the aliases key in the JSON response provided by the PyPA advisory DB, but not the "references" you mentioned. Are you talking about the link key?

Either way, this is indeed something we can and should support in the JSON output format for pip-audit.

@woodruffw woodruffw added the component:output-formats Supported output formats label Dec 24, 2021
@StefanFl
Author

Thanks for your quick response! I found the references in the yaml files of the PyPA advisory DB, e.g. https://github.com/pypa/advisory-db/blob/8aa52f490ff7a87026814b5634808f5824d4018a/vulns/aiohttp-session/PYSEC-2018-35.yaml#L41. I don't know what they are called in their JSON response.

@woodruffw
Member

Gotcha, thanks for the link. It looks like that key isn't currently exposed via PyPA's JSON API.

For example, here's the vulnerability object produced by requesting aiohttp-session==2.6.0:

  "vulnerabilities": [
    {
      "aliases": [
        "CVE-2018-1000814"
      ],
      "details": "aio-libs aiohttp-session version 2.6.0 and earlier contains a Other/Unknown vulnerability in EncryptedCookieStorage and NaClCookieStorage that can result in Non-expiring sessions / Infinite lifespan. This attack appear to be exploitable via Recreation of a cookie post-expiry with the same value.",
      "fixed_in": [
        "2.7.0"
      ],
      "id": "PYSEC-2018-35",
      "link": "https://osv.dev/vulnerability/PYSEC-2018-35",
      "source": "osv"
    }
  ]

So this will need upstream changes to Warehouse first.
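For illustration only, here is a minimal Python sketch of reading the fields from a response shaped like the one above. The helper name `summarize_vulnerabilities` is hypothetical, and the sample payload is trimmed from the object shown (the `details` text is omitted). The `references` lookup is an assumption about what an upstream change might add; it is not present in the response today.

```python
def summarize_vulnerabilities(payload):
    """Collect id, aliases, link and fixed versions for each vulnerability."""
    summaries = []
    for vuln in payload.get("vulnerabilities", []):
        summaries.append(
            {
                "id": vuln["id"],
                "aliases": vuln.get("aliases", []),
                "link": vuln.get("link"),
                "fixed_in": vuln.get("fixed_in", []),
                # Assumption: this key does not exist in the response shown
                # above, so it falls back to an empty list.
                "references": vuln.get("references", []),
            }
        )
    return summaries


# Trimmed copy of the vulnerability object quoted above.
SAMPLE_PAYLOAD = {
    "vulnerabilities": [
        {
            "aliases": ["CVE-2018-1000814"],
            "fixed_in": ["2.7.0"],
            "id": "PYSEC-2018-35",
            "link": "https://osv.dev/vulnerability/PYSEC-2018-35",
            "source": "osv",
        }
    ]
}

summary = summarize_vulnerabilities(SAMPLE_PAYLOAD)
```

With this shape, aliases and the link are already recoverable; only the references would come back empty.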

@woodruffw woodruffw added the upstream Items that require upstream work or coordination label Dec 25, 2021
@StefanFl
Author

How do you query the API? Using

curl -X POST -d \
          '{"version": "2.6.0", "package": {"name": "aiohttp-session", "ecosystem": "PyPI"}}' \
          "https://api.osv.dev/v1/query"

I get this result, including the references:

{
    "vulns": [
        {
            "id": "PYSEC-2018-35",
            "details": "aio-libs aiohttp-session version 2.6.0 and earlier contains a Other/Unknown vulnerability in EncryptedCookieStorage and NaClCookieStorage that can result in Non-expiring sessions / Infinite lifespan. This attack appear to be exploitable via Recreation of a cookie post-expiry with the same value.",
            "aliases": [
                "CVE-2018-1000814",
                "GHSA-mr4x-c4v9-x729"
            ],
            "modified": "2021-07-02T02:41:32.834524Z",
            "published": "2018-12-20T15:29:00Z",
            "references": [
                {
                    "type": "WEB",
                    "url": "https://github.com/aio-libs/aiohttp-session/pull/331"
                },
                {
                    "type": "REPORT",
                    "url": "https://github.com/aio-libs/aiohttp-session/issues/325"
                },
                {
                    "type": "ADVISORY",
                    "url": "https://github.com/advisories/GHSA-mr4x-c4v9-x729"
                }
            ],
            "affected": [
                {
                    "package": {
                        "name": "aiohttp-session",
                        "ecosystem": "PyPI",
                        "purl": "pkg:pypi:aiohttp-session"
                    },
                    "ranges": [
                        {
                            "type": "ECOSYSTEM",
                            "events": [
                                {
                                    "introduced": "0"
                                },
                                {
                                    "fixed": "2.7.0"
                                }
                            ]
                        }
                    ],
                    "versions": [
                        "0.0.1",
                        "0.1.0",
                        "0.1.1",
                        "0.1.2",
                        "0.2.0",
                        "0.3.0",
                        "0.4.0",
                        "0.5.0",
                        "0.7.0",
                        "0.7.1",
                        "0.8.0",
                        "1.0.0",
                        "1.0.1",
                        "1.1.0",
                        "1.2.0",
                        "1.2.1",
                        "2.0.0",
                        "2.0.1",
                        "2.1.0",
                        "2.2.0",
                        "2.3.0",
                        "2.4.0",
                        "2.5.1",
                        "2.6.0"
                    ],
                    "database_specific": {
                        "source": "https://github.com/pypa/advisory-db/blob/main/vulns/aiohttp-session/PYSEC-2018-35.yaml"
                    }
                }
            ]
        }
    ]
}

It does contain information about the version where the vulnerability has been fixed as well, which would be very useful.
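The same query can be sketched in Python with only the standard library. This mirrors the curl call above; the function names (`build_osv_request`, `query_osv`, `reference_urls`) are hypothetical helpers, and the response layout is assumed to match the JSON shown above.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_request(name, version, ecosystem="PyPI"):
    """Build the POST request equivalent to the curl command above."""
    body = {"version": version, "package": {"name": name, "ecosystem": ecosystem}}
    return urllib.request.Request(
        OSV_QUERY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_osv(name, version):
    """Send the query and return the decoded JSON response (needs network)."""
    with urllib.request.urlopen(build_osv_request(name, version), timeout=30) as reply:
        return json.load(reply)


def reference_urls(response, ref_type=None):
    """Return reference URLs from a response, optionally filtered by type."""
    urls = []
    for vuln in response.get("vulns", []):
        for ref in vuln.get("references", []):
            if ref_type is None or ref.get("type") == ref_type:
                urls.append(ref["url"])
    return urls
```

For example, `reference_urls(query_osv("aiohttp-session", "2.6.0"), ref_type="ADVISORY")` would pick out only the advisory links from a result like the one above.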

@tetsuo-cpp
Contributor

Hey @StefanFl, I believe the example that @woodruffw posted was a response from the PyPI API here. pip-audit can query for vulnerabilities from either the PyPI or OSV APIs via the -s flag so in order to support this, both APIs will have to expose the references key (as you noticed, OSV already exposes this). So that's why this will require a patch to Warehouse.

It does contain information about the version where the vulnerability has been fixed as well, which would be very useful.

I believe pip-audit should already be showing fix versions for each vulnerability.

@StefanFl
Author

Of course the fix versions are already there, my fault.

Having the aliases and the link would already be a great start.

@bestis

bestis commented Jul 7, 2023

I find it puzzling that there's no severity information and one can't decide what severity issue is unacceptable.

As I understand it, this is because neither OSV nor PyPI provides severity information, which I find even more puzzling.

But these have links to advisories that do have severity information, for example:
GHSA-mr4x-c4v9-x729
GHSA-r9hx-vwmv-q579

Why not download the advisories to get the severity? And if not everything has it, then just assume the worst,
but it would be much nicer to know the severity and even be able to skip low-severity issues.

@woodruffw
Member

I find it puzzling that there's no severity information and one can't decide what severity issue is unacceptable.

First, as a gentle reminder: this project is not in control of any vulnerability reports. All pip-audit does is consume vulnerability APIs and match them against dependency lists; if you want severity metadata, then you should raise a feature request upstream with either OSV or PYSEC.

Matching against GHSA would work when the vulnerabilities in question are in the GHSA DB, but there's no formal guarantee of this: we support IDs from all kinds of vulnerability DBs, and in fact don't even prefer GHSA by default (we prefer PYSEC, since it's curated for the Python community's needs). Attempting to "merge" results from separate feeds also poses problems: these kinds of reports are updated surprisingly frequently, and merging means turning a simple data retrieval problem into a deconflicting/matching problem between two potentially contradictory reports.

TL;DR: If you want severity metadata, please work with our upstreams! It's something we won't be able to accomplish on our own.

And if not everything has it, then just assume the worst,
but it would be much nicer to know the severity and even be able to skip low-severity issues.

Independent of the above: I want to advise against taking this kind of approach (and state that we'll probably never assume the worst):

  1. Vulnerability scoring is hard, and context sensitive: reducing it to a single number from an API means that you lack the context for how exploitable it is within your codebase, which is the metric that determines actual priority. In other words: something that's scored as a 2 might actually be a 9 in your code, while something scored a 9 might be a 2 in my code.
  2. Tools like pip-audit need to be very careful to avoid producing security fatigue, and assuming the worst is fatigue-inducing: users quickly learn that "worst" really means "not so bad," and they begin to ignore things they shouldn't.

@bestis

bestis commented Jul 7, 2023

I'm comparing this with similar audit tools in other languages, which usually seem to have severity implemented, e.g. ci-audit in node. Why OSV and PYSEC feel that this is irrelevant data is the puzzling thing. And I was not blaming this project for it; I just find it puzzling that they don't have that information.

Scoring might be hard, but the ReDoS ones are not so hard, e.g. PYSEC-2022-42969, which you know very well.

Having to skip vulns like that is not nice either. Lately there have been so many of these ReDoS ones, which are usually pretty pointless, like that one. I think vulnerabilities like this blocking the pipelines cause fatigue.

@woodruffw
Member

Why OSV and PYSEC feel that this is irrelevant data is the puzzling thing. And I was not blaming this project for it; I just find it puzzling that they don't have that information.

Please raise it with them! If we can get this data in a consistent manner, we will consider exposing it.

Having to skip vulns like that is not nice either. Lately there have been so many of these ReDoS ones, which are usually pretty pointless, like that one. I think vulnerabilities like this blocking the pipelines cause fatigue.

I agree completely. That being said, I don't think that severities solve the problem here: GHSA-r9hx-vwmv-q579 for example has a score of 7.5, despite having basically no attack profile/value. The incentives here are what's broken: if pip-audit provides scores, then the people who spam feeds with ReDoS vulnerabilities will just raise their scores to get them in front of more eyes.

The correct (IMO) approach here is to have curated feeds, with vulnerabilities that get removed (or aggressively pre-filtered) by trusted maintainers. I believe PYSEC attempts to provide this, although it's also an open question as to how best to scale it.

@bestis

bestis commented Jul 7, 2023

Oh, I didn't notice it had such a high value. Probably because the attack vector is network. Which is then like, yeah, technically. Maybe then there should be the ability to skip vulns that mention certain keywords :trollface:

I'm guessing the severities would at least lessen the effect, and the less severe problems could be fixed once in a while rather than immediately, but as the ecosystem seems to be what it is, here we are.

Of course one could then take the output of pip-audit and crawl the vulns to find out the severity and then decide whether it is a blocker or not, but granted, it would be better if the data were just available from those. I need to think about whether I have the energy to find the right places to nag about it, but thank you for confirming that the problem is those curated lists pip-audit uses, and I understand if pip-audit doesn't want to start crawling those itself.

@woodruffw
Member

Oh, I didn't notice it had such a high value. Probably because the attack vector is network. Which is then like, yeah, technically. Maybe then there should be the ability to skip vulns that mention certain keywords :trollface:

Yeah, this gets to the core of it: these scoring schemes have dimensions like "network," when the network context here is "a package index that you probably already trust and can send you ZIP bombs anyways."

I think filtering by keyword is probably a good idea here, but IMO is best done by downstream users of pip-audit via the --format=json output: supporting it directly would mean needing to decide whether to support regexes, what subset to support, etc. Users will probably all want slightly different things, so punting to them makes sense to me 🙂
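As a rough sketch of what such downstream filtering could look like: the snippet below assumes the JSON report is an object with a `dependencies` list, where each entry carries `name`, `version`, and a `vulns` list whose items have `id` and `description`. That layout is an assumption and may differ between pip-audit versions. The function name `filter_vulns` and the sample report are made up for illustration and are not real advisories.

```python
def filter_vulns(report, skip_keywords):
    """Return (package, version, vuln id) for findings whose description
    mentions none of the given keywords (case-insensitive substring match)."""
    lowered = [keyword.lower() for keyword in skip_keywords]
    kept = []
    for dep in report.get("dependencies", []):
        for vuln in dep.get("vulns", []):
            text = (vuln.get("description") or "").lower()
            if any(keyword in text for keyword in lowered):
                continue  # finding matches a skip keyword, so drop it
            kept.append((dep["name"], dep["version"], vuln["id"]))
    return kept


# Made-up report in the assumed layout; ids and descriptions are placeholders.
SAMPLE_REPORT = {
    "dependencies": [
        {
            "name": "example-pkg",
            "version": "1.0",
            "vulns": [
                {"id": "EXAMPLE-0001", "description": "ReDoS in the URL parser."},
                {"id": "EXAMPLE-0002", "description": "Session cookies never expire."},
            ],
        },
        {"name": "clean-pkg", "version": "2.3", "vulns": []},
    ]
}

remaining = filter_vulns(SAMPLE_REPORT, ["redos"])
```

Plain substring matching is the simplest choice here; anyone who needs regexes or per-ID allowlists can swap in their own predicate, which is the flexibility argued for above.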

I need to think about whether I have the energy to find the right places to nag about it, but thank you for confirming that the problem is those curated lists pip-audit uses, and I understand if pip-audit doesn't want to start crawling those itself.

Just in case it helps: the right OSV issue tracker is probably this one: https://github.com/google/osv.dev/issues

@woodruffw
Member

#654 concerns vulnerability ratings/scores specifically, so I'll break that part of this discussion into that issue.

4 participants