Recommendation for list APIs that are sparsely populated (e.g. Meltano API and PokeAPI) #952
I like option 1 better simply because developers may want access to all the niceties of the
Solely for the purpose of enriching a stream record with results from a different endpoint, I think this "silent" (or dummy, as I've been calling them so far 😅) streams approach might be good enough, and it is already encountered in the wild: https://meltano.slack.com/archives/C01PKLU5D1R/p1655987043129269
My example implementation (of the PokeAPI, no less!) uses that approach, but it requires declaring the dummy stream with placeholders that are ultimately discarded. It'd be good to offer developers a better way of doing this.
As we move towards a `1.0` release of the SDK, there's one API results pattern that, as of yet, I don't think we support. That is, we don't have a strong design recommendation for APIs whose `list` endpoint doesn't return the actual items but instead returns `url` or `ref` pointers to the actual detail data.
Related to:
Examples
Example 1: PokeAPI
The PokeAPI equivalent of a `list` API returns results that require additional calls to resolve.
Type `pokemon` into the text box and press `Submit` to see results.
Output:
The challenge here is that the results come back with just `name` and `url`, which is basically a `ref` to actually get the entity info.
If we follow the URL, we'll get the actual data: https://pokeapi.co/api/v2/pokemon/1/
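To make the pattern concrete, here is a rough sketch of the two-step lookup. The list payload below is a hand-written, trimmed sample in the shape of a PokeAPI list response, and the detail payloads are stand-ins (real ones carry many more fields). `fake_fetch` and `resolve_refs` are hypothetical names used only for illustration; no network call is made.

```python
from typing import Callable, Iterator

# Hand-written sample shaped like a PokeAPI list response (trimmed to two items).
LIST_PAGE = {
    "count": 2,
    "next": None,
    "previous": None,
    "results": [
        {"name": "bulbasaur", "url": "https://pokeapi.co/api/v2/pokemon/1/"},
        {"name": "ivysaur", "url": "https://pokeapi.co/api/v2/pokemon/2/"},
    ],
}

# Stand-in for the detail endpoint; real detail payloads are much larger.
FAKE_DETAILS = {
    "https://pokeapi.co/api/v2/pokemon/1/": {"id": 1, "name": "bulbasaur", "height": 7},
    "https://pokeapi.co/api/v2/pokemon/2/": {"id": 2, "name": "ivysaur", "height": 10},
}

calls = []  # records every URL requested, to show the extra round trips


def fake_fetch(url: str) -> dict:
    """Hypothetical fetcher that replaces an HTTP GET for this sketch."""
    calls.append(url)
    return FAKE_DETAILS[url]


def resolve_refs(list_page: dict, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Follow each `url` pointer in the list page and yield the detail record."""
    for ref in list_page["results"]:
        yield fetch(ref["url"])


details = list(resolve_refs(LIST_PAGE, fake_fetch))
print(details)
print(f"{len(calls)} detail requests for {len(LIST_PAGE['results'])} list items")
```

The point of the sketch is the cost profile: one list request plus one detail request per item, which is what both options below have to deal with.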
Example 2: Meltano Hub API
Our own Meltano API has a similar pattern, where the `index` doesn't contain the data on each entity.
Possible solutions
The two paths I'm aware of for this are:
Option 1 - silent and/or non-utilized parent stream
In this first approach, we would create a 'silent parent' stream such as `PokemonIndexStream` in the case of the PokeAPI and `MeltanoPluginsIndex` in the case of the Meltano API. This could then have a child stream of `PokemonStream` or `MeltanoPluginsStream` respectively.
The challenge with this approach is having to create dual Stream classes (not a big deal) but then also declaring the parent one to be 'silent'. The design just feels a bit awkward and 'extra', even if the underlying technical components are not that challenging.
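A minimal, SDK-free sketch of the silent-parent idea, to show how little logic is involved. It only loosely mirrors the SDK's parent-child mechanism: the classes are plain Python, the `silent` attribute and the `sync` helper are hypothetical, and `fetch` reads from a hand-written in-memory stand-in for the API rather than making HTTP calls.

```python
from typing import Iterator, List, Tuple

LIST_URL = "https://pokeapi.co/api/v2/pokemon"

# In-memory stand-in for the API (hand-written, trimmed sample data).
FAKE_API = {
    LIST_URL: {
        "results": [
            {"name": "bulbasaur", "url": "https://pokeapi.co/api/v2/pokemon/1/"},
            {"name": "ivysaur", "url": "https://pokeapi.co/api/v2/pokemon/2/"},
        ]
    },
    "https://pokeapi.co/api/v2/pokemon/1/": {"id": 1, "name": "bulbasaur"},
    "https://pokeapi.co/api/v2/pokemon/2/": {"id": 2, "name": "ivysaur"},
}


def fetch(url: str) -> dict:
    """Hypothetical fetcher that replaces an HTTP GET for this sketch."""
    return FAKE_API[url]


class PokemonIndexStream:
    """Parent stream: reads the list endpoint but emits nothing itself."""

    name = "pokemon_index"
    silent = True  # hypothetical flag, not an existing SDK attribute

    def get_records(self) -> Iterator[dict]:
        yield from fetch(LIST_URL)["results"]

    def get_child_context(self, record: dict) -> dict:
        # Pass only what the child needs to find its detail record.
        return {"url": record["url"]}


class PokemonStream:
    """Child stream: one detail request per parent record."""

    name = "pokemon"
    silent = False

    def get_records(self, context: dict) -> Iterator[dict]:
        yield fetch(context["url"])


def sync(parent: PokemonIndexStream, child: PokemonStream) -> List[Tuple[str, dict]]:
    """Walk the parent, emit its records only if it is not silent, then the child's."""
    emitted = []
    for parent_record in parent.get_records():
        if not parent.silent:
            emitted.append((parent.name, parent_record))
        context = parent.get_child_context(parent_record)
        for child_record in child.get_records(context):
            emitted.append((child.name, child_record))
    return emitted


emitted = sync(PokemonIndexStream(), PokemonStream())
print(emitted)
```

Flipping `silent` to `False` on the parent makes the same run emit `pokemon_index` records as well, which is the two-table outcome.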
This approach should work even if the parent stream is not silent, but it might be confusing to an end user to have an 'index' stream as well as the base entity stream with detailed data. For example, this would create two target tables: `plugin_index` and `plugins`, with the first one not offering much value over the second.
Option 2 - parse the inner URL and merge results with list view
In the second approach, we introduce a means of sending complementary requests, so that only a single stream is needed. During `post_process()` or another method, the developer makes a call to the embedded `url` and then grafts the results into the base entity description.
@edgarrmondragon - Do you have any thoughts on which of these are 'recommendable' to tap developers? Have you run into this and do you have any best practices?
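For comparison, a rough sketch of what Option 2 could look like, again as plain Python rather than real SDK code. The injected `fetch` callable, the `sync` helper and the in-memory `FAKE_API` are assumptions made for illustration; only the `post_process()` hook name comes from the description above.

```python
from typing import Callable, Iterator, List

LIST_URL = "https://pokeapi.co/api/v2/pokemon"

# In-memory stand-in for the API (hand-written, trimmed sample data).
FAKE_API = {
    LIST_URL: {
        "results": [
            {"name": "bulbasaur", "url": "https://pokeapi.co/api/v2/pokemon/1/"},
            {"name": "ivysaur", "url": "https://pokeapi.co/api/v2/pokemon/2/"},
        ]
    },
    "https://pokeapi.co/api/v2/pokemon/1/": {"id": 1, "name": "bulbasaur", "height": 7},
    "https://pokeapi.co/api/v2/pokemon/2/": {"id": 2, "name": "ivysaur", "height": 10},
}


class PokemonStream:
    """Single stream: list rows are enriched with their detail payload."""

    name = "pokemon"

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch  # hypothetical injected fetcher (an HTTP GET in practice)

    def get_records(self) -> Iterator[dict]:
        yield from self._fetch(LIST_URL)["results"]

    def post_process(self, row: dict) -> dict:
        # Complementary request: follow the embedded `url`, then graft the
        # detail fields onto the list row. Detail values win on key clashes.
        detail = self._fetch(row["url"])
        return {**row, **detail}

    def sync(self) -> List[dict]:
        return [self.post_process(row) for row in self.get_records()]


records = PokemonStream(FAKE_API.__getitem__).sync()
print(records)
```

The trade-off in this shape is that every row triggers a serial extra request inside the record loop, and any retry or error handling for those requests has to live in the stream itself.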