Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pytest test_datetime_serialization_speed occasionally fails #1145

Open
ethho opened this issue Dec 18, 2023 · 4 comments
Open

Pytest test_datetime_serialization_speed occasionally fails #1145

ethho opened this issue Dec 18, 2023 · 4 comments
Labels
dev-process Issues related to project management, tests, etc.

Comments

@ethho
Copy link
Contributor

ethho commented Dec 18, 2023

Bug Report

Description

The pytest tests/test_blob.py::test_datetime_serialization_speed fails non-deterministically.

Reproducibility

Include:

  • OS: Ubuntu 23.04
  • CPU: Intel i7-7600U (4) @ 3.900GHz
  • Mem: 16GB
  • Python: 3.8.17
  • MySQL Version: 8.0
  • MySQL Deployment Strategy (local-native | local-docker | remote): local-docker (docker://datajoint/mysql:8.0)
  • DataJoint Version: 0.14.1 commit 03db252

Steps to Reproduce

In a terminal window in the VSCode DevContainer on dev-tests branch (03db252), issue for i in {1..10}; do pytest tests/test_blob.py::test_datetime_serialization_speed; done to run the test 10 times. On my laptop, the test fails once out of 10 times. Since this issue seems hardware-dependent, it might be impossible to reproduce on some systems.

Error Stack

Console Logs
$ for i in {1..10}; do pytest tests/test_blob.py::test_datetime_seria
lization_speed; done
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 2.57s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 2.73s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 2.72s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 2.75s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py F                                                                                                                [100%]

================================================================ FAILURES =================================================================
____________________________________________________ test_datetime_serialization_speed ____________________________________________________

    def test_datetime_serialization_speed():
        # If this fails that means for some reason deserializing/serializing
        # np arrays of np.datetime64 types is now slower than regular arrays of datetime
    
        optimized_exe_time = timeit.timeit(
            setup="myarr=pack(np.array([np.datetime64('2022-10-13 03:03:13') for _ in range(0, 10000)]))",
            stmt="unpack(myarr)",
            number=10,
            globals=globals(),
        )
        print(f"np time {optimized_exe_time}")
        baseline_exe_time = timeit.timeit(
            setup="myarr2=pack(np.array([datetime(2022,10,13,3,3,13) for _ in range (0, 10000)]))",
            stmt="unpack(myarr2)",
            number=10,
            globals=globals(),
        )
        print(f"python time {baseline_exe_time}")
    
>       assert optimized_exe_time * 900 < baseline_exe_time
E       assert (0.00273081300110789 * 900) < 2.3093061309991754

tests/test_blob.py:233: AssertionError
---------------------------------------------------------- Captured stdout call -----------------------------------------------------------
np time 0.00273081300110789
python time 2.3093061309991754
========================================================= short test summary info =========================================================
FAILED tests/test_blob.py::test_datetime_serialization_speed - assert (0.00273081300110789 * 900) < 2.3093061309991754
============================================================ 1 failed in 2.75s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 2.91s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 2.70s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 3.70s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 3.91s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 3.09s ============================================================

Expected Behavior

The pytest tests/test_blob.py::test_datetime_serialization_speed passes every time without failing, regardless of hardware.

Screenshots

N/A

Additional Research and Context

This test fails often (~10% of invocations) when it is run on my laptop (2017 ThinkPad X1 Carbon) in a DevContainer. The test fails very infrequently (~1% of invocations) when run in CI; one example is on this CI run in #1133. I suspect that a slow or busy CPU causes packing of np.datetime64 arrays to happen unexpectedly slowly:

https://github.com/ethho/datajoint-python/blob/e755b9763d950137718fb8447967ceef5172f57d/tests/test_blob.py#L218-L223

    optimized_exe_time = timeit.timeit(
        setup="myarr=pack(np.array([np.datetime64('2022-10-13 03:03:13') for _ in range(0, 10000)]))",
        stmt="unpack(myarr)",
        number=10,
        globals=globals(),
    )

Additional Context

As of 12/18/2023, this test doesn't exist on branch master. This issue was discovered during development of #1142.

@ethho ethho added bug dev-process Issues related to project management, tests, etc. labels Dec 18, 2023
ethho added a commit to ethho/datajoint-python that referenced this issue Dec 19, 2023
@dimitri-yatsenko
Copy link
Member

I don't think the execution time is the right thing to test since it's not deterministic. The test should verify correctness and validity. Should we remove the time-based tests? What are the best practices?

@dimitri-yatsenko
Copy link
Member

This is not a bug since this is only an inadequate test.

@ethho
Copy link
Contributor Author

ethho commented Sep 19, 2024

Should we remove the time-based tests? What are the best practices?

I say we loosen the testing constraints (I think you already did this on master branch), or remove the test entirely. Tests will sometimes share physical hardware (e.g. in GitHub Actions CI runners), so it's not safe to assume that a test has access to >=1 CPU without process switching for the entire duration of the test.

@dimitri-yatsenko
Copy link
Member

I would remove this test altogether. The tests should be for correctness and not for timing since it cannot be guaranteed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dev-process Issues related to project management, tests, etc.
Projects
None yet
Development

No branches or pull requests

2 participants