alternative for basic-encoding: sacrifice "information preservation" rather than "universality" #1

@pchampin

Regarding the 3 desired properties for the basic-encoding algo (info preserving, idempotent, universal), we currently have a trade-off: the current algorithm satisfies the first 2, but is not entirely universal. More precisely, it fails to encode "hybrid" graphs, such as the one obtained by merging a Full graph (containing triple terms) with an already Basic-encoded graph.

An alternative would be to accept hybrid graphs and "do our best" with them, i.e.:

  • leave any bnode that encodes a triple term untouched,
  • encode any remaining triple terms, reusing a bnode encoding it if already present (to avoid creating duplicates).

This algorithm would have a different trade-off than the current one:

  • slightly more complex (to reuse existing encoded triple terms)
  • it would be universal (it would not fail on any graph)
  • it would not be completely information preserving.

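Explanation of the last point above: consider two RDF-full graphs G and H, both containing triple terms. If we merge G and H and encode the result, we get J. If we instead encode G into G', merge G' and H, and encode the result, we also get J. Two different inputs (the full graph "G merged with H" and the hybrid graph "G' merged with H") thus map to the same output, so J alone does not tell us which of the two we started from.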
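To make the two steps of the proposed alternative and the G/H example concrete, here is a minimal Python sketch. It is not the actual basic-encoding algorithm: graphs are modelled as plain sets of tuples, triple terms as nested tuples, and the three `enc:` property names are placeholders I made up for illustration, not the vocabulary used by the real encoding.

```python
from itertools import count

# Hypothetical property names linking an encoding bnode to the three
# components of the triple term it stands for (placeholders only).
TT_S, TT_P, TT_O = "enc:subject", "enc:predicate", "enc:object"


class BNode:
    """Blank node; equality and hashing are identity-based."""
    _ids = count()

    def __init__(self):
        self.label = f"b{next(BNode._ids)}"

    def __repr__(self):
        return f"_:{self.label}"


def encode_hybrid(graph):
    """Encode a (possibly hybrid) graph given as a set of (s, p, o) tuples.

    A term that is itself a 3-tuple is treated as a triple term.
    """
    # Step 1: find bnodes that already encode a triple term; leave them as-is.
    parts = {}
    for s, p, o in graph:
        if isinstance(s, BNode) and p in (TT_S, TT_P, TT_O):
            parts.setdefault(s, {})[p] = o
    existing = {
        (d[TT_S], d[TT_P], d[TT_O]): b
        for b, d in parts.items()
        if len(d) == 3
    }

    out = set()

    def enc(term):
        if not isinstance(term, tuple):
            return term
        key = tuple(enc(x) for x in term)  # encode nested triple terms first
        b = existing.get(key)
        if b is None:
            # Step 2: no bnode encodes this triple term yet, so create one.
            b = existing[key] = BNode()
            s, p, o = key
            out.update({(b, TT_S, s), (b, TT_P, p), (b, TT_O, o)})
        return b

    for s, p, o in graph:
        out.add((enc(s), p, enc(o)))
    return out


def shape(graph):
    """Crude comparison helper: erase bnode identity.

    Only adequate here because each graph below contains a single bnode.
    """
    return {tuple("_:" if isinstance(t, BNode) else t for t in tr) for tr in graph}


T = ("ex:s", "ex:p", "ex:o")          # a triple term
G = {("ex:a", "ex:says", T)}          # full graph
H = {("ex:b", "ex:says", T)}          # full graph

J_direct = encode_hybrid(G | H)       # merge, then encode
G_enc = encode_hybrid(G)              # G'
J_via_hybrid = encode_hybrid(G_enc | H)  # encode the hybrid graph G' + H
```

In this sketch both routes yield the same five triples up to the name of the single bnode, the hybrid input is accepted rather than rejected, and re-encoding `J_via_hybrid` returns it unchanged. It also shows the cost: nothing in the result distinguishes the full input from the hybrid one.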

This caveat in information preservation is not, in my opinion, a problem, because "hybrid" graphs are not something we expect "in the wild" anyway, only as the result of internal operations in systems that need basic-encoding. Currently, we simply refuse "hybrid" graphs and advise people not to produce them. With this new proposal, people would not have to care about that anymore: they would be able to merge a mix of full and basic-encoded graphs, and encode or decode the result, depending on the flavour they want.

So to summarize, I find this alternative more flexible and usable.
