Regarding the 3 desired properties for the basic-encoding algo (info preserving, idempotent, universal), we currently have a trade-off: the current algorithm satisfies the first 2, but is not entirely universal. More precisely, it fails to encode "hybrid" graphs, such as the one obtained by merging a Full graph (containing triple terms) with an already Basic-encoded graph.
An alternative would be to accept hybrid graphs and "do our best" with them, i.e.:
- leave any bnode that encodes a triple term untouched,
- encode any remaining triple terms, reusing a bnode that encodes it if one is already present (to avoid creating duplicates).
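
To make the two steps above concrete, here is a minimal sketch in Python, not the actual algorithm from the spec. It assumes a toy model: a graph is a set of `(s, p, o)` tuples, terms are strings, a triple term is a nested tuple that only appears in object position, and the predicates `enc:subject`, `enc:predicate` and `enc:object` are hypothetical stand-ins for whatever vocabulary the basic encoding really uses. The names `existing_encodings` and `encode_hybrid` are also made up for illustration.

```python
from itertools import count

# Hypothetical predicate names for this sketch; not the real encoding vocabulary.
ENC_S, ENC_P, ENC_O = "enc:subject", "enc:predicate", "enc:object"


def atoms(term):
    """Yield every string term found in a (possibly nested) tuple."""
    if isinstance(term, tuple):
        for t in term:
            yield from atoms(t)
    else:
        yield term


def existing_encodings(graph):
    """Map each already-encoded triple term (s, p, o) to the bnode encoding it."""
    parts = {}
    for s, p, o in graph:
        if p in (ENC_S, ENC_P, ENC_O):
            parts.setdefault(s, {})[p] = o
    return {
        (d[ENC_S], d[ENC_P], d[ENC_O]): b
        for b, d in parts.items()
        if len(d) == 3
    }


def encode_hybrid(graph):
    """Encode remaining triple terms; triples of existing encodings are copied as-is."""
    index = existing_encodings(graph)
    used = set(atoms(tuple(graph)))          # labels that fresh bnodes must avoid
    fresh = (f"_:tt{i}" for i in count())
    out = set()

    def encode_term(term):
        if not isinstance(term, tuple):
            return term
        s, p, o = term
        key = (s, p, encode_term(o))         # encode nested triple terms first
        if key not in index:                 # no bnode encodes it yet: mint one
            b = next(label for label in fresh if label not in used)
            used.add(b)
            index[key] = b
            out.update({(b, ENC_S, key[0]), (b, ENC_P, key[1]), (b, ENC_O, key[2])})
        return index[key]                    # otherwise reuse the existing bnode

    for s, p, o in graph:
        out.add((s, p, encode_term(o)))
    return out


# Usage: a hybrid graph made of an already-encoded G and a full H.
tt = ("ex:s", "ex:p", "ex:o")
G = {("ex:a", "ex:says", tt)}
H = {("ex:b", "ex:says", tt)}
hybrid = encode_hybrid(G) | H
result = encode_hybrid(hybrid)
```

In this example `result` contains a single bnode for the shared triple term: the one created when encoding `G` is reused for the triple term coming from `H`. In general, results of separate runs should only be expected to match up to bnode renaming.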
This algorithm would have a different trade-off than the current one:
- slightly more complex (to reuse existing encoded triple terms)
- it would be universal (it would not fail on any graph)
- it would not be completely information preserving.
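Explanation of the last point above: consider two RDF-full graphs G and H, both containing triple terms. If we merge G and H and encode the result, we get J. If we instead encode G into G', merge G' and H (which gives a hybrid graph), and encode the result, we also get J. Two different inputs therefore encode to the same output, so decoding J cannot tell whether the original was the full graph or the hybrid one.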
This caveat in information preservation is not, in my opinion, a problem, because "hybrid" graphs are not something we expect "in the wild" anyway, only as the result of internal operations in systems that need basic encoding. Currently, we simply refuse "hybrid" graphs and advise people not to produce them. With this new proposal, people would not have to care about that anymore: they would be able to merge a mix of full and basic-encoded graphs, and encode or decode the result, depending on the flavour they want.
So to summarize, I find this alternative more flexible and usable.