Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: Schema warning with AnyType #2519

Open
1 task
ReubenFrankel opened this issue Jul 3, 2024 · 4 comments
Open
1 task

bug: Schema warning with AnyType #2519

ReubenFrankel opened this issue Jul 3, 2024 · 4 comments

Comments

@ReubenFrankel
Copy link
Contributor

ReubenFrankel commented Jul 3, 2024

Singer SDK Version

0.34.1

Is this a regression?

  • Yes

Python Version

3.7 (EOL)

Bug scope

Taps (catalog, state, etc.)

Operating System

Ubuntu 22.04

Description

Unsure if I am using this correctly, but my understanding is that AnyType should represent a value of any JSON type (i.e. string, integer, object, array or null). Maybe this has been fixed in a later version, but 0.34.1 is the latest compatible version available since we haven't yet dropped Python 3.7 support for the tap.

Code

th.Property("value", th.AnyType)
Could not append type because the JSON schema for the dictionary `{}` appears to be invalid
@ReubenFrankel ReubenFrankel added kind/Bug Something isn't working valuestream/SDK labels Jul 3, 2024
@ReubenFrankel
Copy link
Contributor Author

This seems to be the intended behaviour when using AnyType:

assert caplog.records[0].message == (
"Could not append type because the JSON schema for the dictionary `{}` "
"appears to be invalid."
)

Not sure why this is the case. The main issue here is the amount of times this warning message seems to get thrown out in the logs. For example:

$ meltano invoke tap-capsulecrm 2>&1 | grep -c 'Could not append type'

237

@edgarrmondragon
Copy link
Collaborator

This seems to be the intended behaviour when using AnyType:

assert caplog.records[0].message == (
"Could not append type because the JSON schema for the dictionary `{}` "
"appears to be invalid."
)

Not sure why this is the case. The main issue here is the amount of times this warning message seems to get thrown out in the logs. For example:

$ meltano invoke tap-capsulecrm 2>&1 | grep -c 'Could not append type'

237

Looking at the code, my first guess is that it's because the schema property is not cached, so the PropertiesList gets serialized a bunch of times: https://github.com/Matatika/tap-capsulecrm/blob/f7c360329868a70c445d73943887174f07231a3b/tap_capsulecrm/client.py#L73-L75

@ReubenFrankel
Copy link
Contributor Author

Thanks, that reduces the amount of warnings significantly (once for each stream with a schema referencing AnyType):

$ meltano invoke tap-capsulecrm 2>&1 | grep -c 'Could not append type'

5

I guess this is then more a matter of should the SDK throw that warning message for explicit usage of AnyType?

@edgarrmondragon
Copy link
Collaborator

Thanks, that reduces the amount of warnings significantly (once for each stream with a schema referencing AnyType):

$ meltano invoke tap-capsulecrm 2>&1 | grep -c 'Could not append type'

5

I guess this is then more a matter of should the SDK throw that warning message for explicit usage of AnyType?

Problem is AnyType is just a wrapper for a "<prop>": {} schema element, and that's the same warning you'd get. I increasingly dislike the append_type implementation. I'd rather it be part of the JSONTypeHelper base class. So this:

Property("name", StringType(nullable=True))  # {"properties": {"name": {"type": ["string", "null"]}}, "required": []}

instead of

Property("name", StringType, required=False)  # {"properties": {"name": {"type": ["string", "null"]}}, "required": []}

That'd allow us to refactor these lines

sdk/singer_sdk/typing.py

Lines 688 to 689 in dfdbdde

if self.nullable or self.optional:
type_dict = append_type(type_dict, "null")

and get rid of most of the instances of that warning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants