One thing that's really helped with identifying the correct clique is to (hugely) prefer exact matches on the preferred name over non-exact matches. One problem with conflating identifiers before putting them into NameRes is that, unless we've chosen exactly the right name for the overall clique, it becomes harder to find that clique, and other, non-exact matches can override it.
One way of fixing that would be to revert to our previous plan for handling conflations, which is to load the conflation files into memory and to apply conflation on-the-fly as follows:
- You can search NameRes without conflation turned on to get individual clique entries -- this is often what you want anyway, although you then may need to manually conflate it afterwards.
- You can search NameRes with conflation turned on -- we run the search for the best matches and, if a match has a conflated identifier, we expand it on-the-fly to include all the other cliques in that conflation. We also add some metadata to indicate what is being conflated.
- If you want to look up a CURIE, we return the unconflated entry, but can (optionally?) also include the list of other CURIEs that we would conflate it with under a particular conflation.
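
To make the proposal concrete, here is a rough sketch of the on-the-fly expansion step. It is not existing NameRes code. It assumes each line of a conflation file is a JSON list of CURIEs, and that search results are dicts with a `curie` key; the function names, the `conflated_curies` field, and the `EXAMPLE:` identifiers are hypothetical.

```python
import json
from typing import Dict, Iterable, List


def load_conflations(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map every CURIE to the full conflation group it belongs to.

    Assumption: one JSON list of CURIEs per line, e.g. ["A:1", "B:2"].
    """
    index: Dict[str, List[str]] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group = json.loads(line)
        for curie in group:
            index[curie] = group
    return index


def expand_results(results: List[dict], index: Dict[str, List[str]]) -> List[dict]:
    """Expand ranked, unconflated search results with conflation metadata.

    The matched clique keeps its own rank and label; the other CURIEs in
    its conflation are attached under a hypothetical "conflated_curies" key.
    Lower-ranked results that belong to an already-expanded conflation are
    dropped so each conflation appears once.
    """
    expanded = []
    seen = set()
    for result in results:
        curie = result["curie"]
        if curie in seen:
            continue
        group = index.get(curie, [curie])
        seen.update(group)
        entry = dict(result)  # copy so the original result is not mutated
        entry["conflated_curies"] = [c for c in group if c != curie]
        expanded.append(entry)
    return expanded


def lookup_curie(curie: str, index: Dict[str, List[str]]) -> dict:
    """Return the unconflated CURIE plus the CURIEs it would be conflated with."""
    group = index.get(curie, [curie])
    return {"curie": curie, "conflated_curies": [c for c in group if c != curie]}
```

Because the search itself runs against unconflated cliques, an exact preferred-name match keeps its high rank, and conflation only affects what is returned alongside it. Dropping later results from the same conflation is one possible design choice; returning them with a pointer to the earlier hit would work too.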
This would at least fix tylenol, which has an exact match in UMLS:C0699142, but which would then be combined into acetaminophen when conflation is applied. It might help with others, too.