Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Poetry #402

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ with support from Google. This is not an official Google or Trail of Bits produc

## Features

* Support for auditing local environments and requirements-style files
* Support for auditing local environments, requirements-style files, and poetry.lock files
* Support for multiple vulnerability services
([PyPI](https://warehouse.pypa.io/api-reference/json.html#known-vulnerabilities),
[OSV](https://osv.dev/docs/))
Expand Down
8 changes: 7 additions & 1 deletion pip_audit/_cli.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
PYPI_URL,
DependencySource,
PipSource,
PoetrySource,
PyProjectSource,
RequirementSource,
ResolveLibResolver,
Expand Down Expand Up @@ -329,9 +330,14 @@ def _parse_args(parser: argparse.ArgumentParser) -> argparse.Namespace: # pragm
def _dep_source_from_project_path(
project_path: Path, resolver: ResolveLibResolver, state: AuditState
) -> DependencySource: # pragma: no cover
# Check for a `pyproject.toml`
poetry_lock = project_path / "poetry.lock"
if poetry_lock.is_file():
logger.debug("using PoetrySource as dependency source")
return PoetrySource(path=poetry_lock)

pyproject_path = project_path / "pyproject.toml"
if pyproject_path.is_file():
logger.debug("using PyProjectSource as dependency source")
return PyProjectSource(pyproject_path, resolver, state)

# TODO: Checks for setup.py and other project files will go here.
Expand Down
2 changes: 2 additions & 0 deletions pip_audit/_dependency_source/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
DependencySourceError,
)
from .pip import PipSource, PipSourceError
from .poetry import PoetrySource
from .pyproject import PyProjectSource
from .requirement import RequirementSource
from .resolvelib import PYPI_URL, ResolveLibResolver
Expand All @@ -23,6 +24,7 @@
"DependencySourceError",
"PipSource",
"PipSourceError",
"PoetrySource",
"PyProjectSource",
"RequirementSource",
"ResolveLibResolver",
Expand Down
1 change: 0 additions & 1 deletion pip_audit/_dependency_source/pip.py
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,6 @@ def collect(self) -> Iterator[Dependency]:
"Package has invalid version and could not be audited: "
f"{dist.name} ({dist.version})"
)
logger.debug(skip_reason)
dep = SkippedDependency(name=dist.name, skip_reason=skip_reason)
yield dep
except Exception as e:
Expand Down
57 changes: 57 additions & 0 deletions pip_audit/_dependency_source/poetry.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
"""
Collect dependencies from `poetry.lock` files.
"""
from __future__ import annotations

import logging
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator

import toml
from packaging.version import InvalidVersion, Version

from pip_audit._dependency_source import DependencyFixError, DependencySource
from pip_audit._fix import ResolvedFixVersion
from pip_audit._service import Dependency, ResolvedDependency, SkippedDependency

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class PoetrySource(DependencySource):
"""
Dependency sourcing from `poetry.lock`.
"""

path: Path

def collect(self) -> Iterator[Dependency]:
"""
Collect all of the dependencies discovered by this `PoetrySource`.
"""
with self.path.open("r") as stream:
packages = toml.load(stream)
for package in packages["package"]:
name = package["name"]
try:
version = Version(package["version"])
except InvalidVersion:
skip_reason = (
"Package has invalid version and could not be audited: "
f"{name} ({package['version']})"
)
yield SkippedDependency(name=name, skip_reason=skip_reason)
else:
yield ResolvedDependency(name=name, version=version)

def fix(self, fix_version: ResolvedFixVersion) -> None:
"""
Fixes a dependency version for this `PoetrySource`.

Requires poetry to be installed in the same env.

Note that poetry ignores the version we want to update to,
and goes straight to the latest version allowed in metadata.
"""
raise DependencyFixError("fix is not supported for poetry yet") # pragma: no cover
74 changes: 74 additions & 0 deletions test/dependency_source/data/poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

34 changes: 34 additions & 0 deletions test/dependency_source/test_poetry.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
from __future__ import annotations

from pathlib import Path

from packaging.version import Version

from pip_audit._dependency_source import PoetrySource
from pip_audit._service import ResolvedDependency, SkippedDependency

TEST_DATA_PATH = Path(__file__).parent / "data"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another tiny nitpick: could you turn this into an asset or test_data fixture and put it in conftest.py?

Here's the pattern we use on a handful of other projects:

# conftest.py

_ASSETS = (Path(__file__).parent / "assets").resolve()
assert _ASSETS.is_dir()

@pytest.fixture
def asset():
    def _asset(name: str) -> Path:
        return _ASSETS / name

    return _asset

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could. But why? This one is short, type-safe, and not coupled on conftest.py. I love good autocomplete.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main reason is potential reuse: we have a lot of other tests (particularly the requirements file ones) that would benefit from being broken out into stand-alone assets, so having a centralized "asset" fixture would keep things unified.

Separately (and I admit I have no idea if this is "good" practice or not): my understanding is that pytest conceptually encourages each test module to contain nothing except test code, with conftest.py being the idiomatic place to put any scaffolding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have a lot of other tests (particularly the requirements file ones) that would benefit from being broken out into stand-alone assets

YAGNI ;)

Separately (and I admit I have no idea if this is "good" practice or not): my understanding is that pytest conceptually encourages each test module to contain nothing except test code

Never heard of that. Test files can contain fixtures, and from my experience, they should when the fixture is used only in that file. Having all fixtures in conftest.py brings unnecessary coupling, makes it harder to track which fixtures aren't needed anymore, and that file being a typical example of "God class" (in this case, God module) can grow to some crazy numbers (last month, I decoupled 10k lines conftest.py). And even if we assume "keep all fixtures in conftest.py", it's not a fixture. It's literally a constant.

Having a constant has direct benefits: it's shorter, type-safe, closer to the code (easier to unravel the test code), and makes the test file self-contained. If none of that persuades you, please, just say it, and I'll update the PR. I want the code I contribute to be nice and readable, but the final word is always on the direct maintainers.



def test_collect(tmp_path: Path) -> None:
lock_content = (TEST_DATA_PATH / "poetry.lock").read_text()
lock_path = tmp_path / "poetry.lock"
lock_path.write_text(lock_content)
sourcer = PoetrySource(path=lock_path)
actual = list(sourcer.collect())
expected = [
ResolvedDependency(name="jinja2", version=Version("2.11.3")),
ResolvedDependency(name="markupsafe", version=Version("2.1.1")),
]
assert actual == expected


def test_invalid_version(tmp_path: Path) -> None:
lock_content = (TEST_DATA_PATH / "poetry.lock").read_text()
lock_content = lock_content.replace("2.11.3", "oh-hi-mark")
lock_path = tmp_path / "poetry.lock"
lock_path.write_text(lock_content)
sourcer = PoetrySource(path=lock_path)
deps = list(sourcer.collect())
assert [dep.name for dep in deps] == ["jinja2", "markupsafe"]
assert isinstance(deps[0], SkippedDependency)