ChemicalEntity <derivesFrom> ChemicalEntity (PUBCHEM.COMPOUND:6140 - L-phenylalanine) #169

Open
sstemann opened this issue Jan 24, 2022 · 6 comments

Comments

@sstemann
Contributor

Query: derivesFrom.json
PK: 15f5935e-4479-405c-9e4b-f0cda7194df7
Control: Looking for which metabolite (ChemicalEntity) derives from L-phenylalanine. The control should be L-tyrosine (Tyrosine is also fine).

image

@finnagin
Contributor

finnagin commented Jan 25, 2022

Hey @MarkDWilliams, just as an FYI, I made a workflow-ized version of this query that you could try looking at. It takes the query from above and overlays Normalized Google Distance and Fisher exact test values on the edges. It then scores and ranks the results using those values.

Unfortunately, it cannot currently be run through the workflow runner since @kennethmorton is still hard at work getting its fanning and merging functionality implemented. However, you can still run it through ARAX for the time being.
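
For readers unfamiliar with the Normalized Google Distance overlay mentioned above, here is a minimal sketch of the standard NGD formula. It is not ARAX's code: the function name and the example counts are assumptions for illustration, and the counts ARAX uses (and how it handles edge cases) may differ.

```python
import math


def normalized_google_distance(fx: int, fy: int, fxy: int, n: int) -> float:
    """Standard NGD formula from occurrence counts.

    fx, fy : number of documents mentioning each term (hypothetical counts)
    fxy    : number of documents mentioning both terms
    n      : total number of documents in the corpus
    """
    if fxy == 0:
        # Terms never co-occur: treat the distance as infinite.
        return math.inf
    log_fx, log_fy = math.log(fx), math.log(fy)
    numerator = max(log_fx, log_fy) - math.log(fxy)
    denominator = math.log(n) - min(log_fx, log_fy)
    return numerator / denominator


# Made-up counts, only to show that more co-occurrence gives a smaller distance.
print(normalized_google_distance(1000, 2000, 900, 10**6))
print(normalized_google_distance(1000, 2000, 500, 10**6))
```

Lower values mean the two terms co-occur more often relative to how often each appears alone, which is why a value like this can be used to rank edges.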

@MarkDWilliams
Collaborator

Expected result

| Subject | Predicate | Object |
| --- | --- | --- |
| PUBCHEM.COMPOUND:6057 | biolink:derives_from | PUBCHEM.COMPOUND:6140 |
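
For reference, the expected triple above corresponds to a one-hop query graph roughly like the one below. This is only a sketch: the contents of derivesFrom.json are not shown in this thread, so the node and edge keys (`n0`, `n1`, `e0`) and the helper name are assumptions. The layout follows the usual TRAPI query-graph shape, with the unknown metabolite as the subject and L-phenylalanine as the object.

```python
import json


def build_derives_from_query(object_curie: str) -> dict:
    """Build a one-hop query: (unknown ChemicalEntity) derives_from (object_curie).

    Hypothetical helper; key names n0/n1/e0 are illustrative only.
    """
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    # Unpinned node: the metabolite we are looking for.
                    "n0": {"categories": ["biolink:ChemicalEntity"]},
                    # Pinned node: L-phenylalanine.
                    "n1": {
                        "ids": [object_curie],
                        "categories": ["biolink:ChemicalEntity"],
                    },
                },
                "edges": {
                    "e0": {
                        "subject": "n0",
                        "object": "n1",
                        "predicates": ["biolink:derives_from"],
                    }
                },
            }
        }
    }


query = build_derives_from_query("PUBCHEM.COMPOUND:6140")
print(json.dumps(query, indent=2))
```

A passing result would bind `n0` to PUBCHEM.COMPOUND:6057 (L-tyrosine).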

@webyrd

webyrd commented Jan 25, 2022 via email

@colleenXu
Contributor

colleenXu commented Jan 26, 2022

To update: we have the data from SEMMEDDB. However, there are issues retrieving it.

There seems to be a semantic + ID-mapping mismatch between:

  • this query: Phenylalanine + Tyrosine as chemicals
  • the data in semmeddb: Phenylalanine UMLS:C0031453 is aapp, which the biolink-model maps to Polypeptide. Tyrosine UMLS:C0041485 is gngm, which biolink-model maps to GenomicEntity.
  • the semantic types reported by the SRI ID resolver, which reports both of the UMLS IDs (phenylalanine, tyrosine) as Protein
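
The three-way mismatch listed above can be laid out as a small lookup. This is an illustrative sketch, not code from any Translator service: the dictionary layout and function name are assumptions, and it compares categories by exact string match only, whereas a real matcher would also accept descendant categories from the Biolink hierarchy.

```python
# Category requested by the query in this issue.
QUERY_CATEGORY = "biolink:ChemicalEntity"

# What each source says about the two UMLS IDs, as reported in this thread.
REPORTED = {
    "UMLS:C0031453": {  # Phenylalanine
        "semmeddb_semantic_type": "aapp",
        "biolink_mapping": "biolink:Polypeptide",
        "id_resolver": "biolink:Protein",
    },
    "UMLS:C0041485": {  # Tyrosine
        "semmeddb_semantic_type": "gngm",
        "biolink_mapping": "biolink:GenomicEntity",
        "id_resolver": "biolink:Protein",
    },
}


def sources_matching(curie: str, category: str) -> list[str]:
    """Return the sources whose reported category equals `category` exactly."""
    record = REPORTED[curie]
    return [
        source
        for source in ("biolink_mapping", "id_resolver")
        if record[source] == category
    ]


for curie in REPORTED:
    # Prints an empty list for both IDs: no source labels them ChemicalEntity.
    print(curie, sources_matching(curie, QUERY_CATEGORY))
```

Neither source labels either ID as a ChemicalEntity, so a query restricted to that category has nothing to match against the SEMMEDDB records.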

@cbizon
Collaborator

cbizon commented Jan 26, 2022

Hi @colleenXu, can you make an issue for the semantic types here? https://github.com/TranslatorSRI/NodeNormalization/issues

FWIW, even if the NN matched the biolink model mappings (which we will revisit), there would still be a mismatch with the query. There's a mixin, I think, that would allow chemicals + proteins.

@vdancik
Contributor

vdancik commented Jan 26, 2022

@cbizon, @sierra-moxon, is there any appetite to bring Polypeptide back into the chemical hierarchy?


9 participants