RFC: Handling of Alternative Multibases #5349

@kevina

As we switch to CIDv1 with base32 encoding, the question is how we should handle displaying CIDs in alternative multibases.

This issue is only about the output. For input, we already handle CIDv1 in any base supported by go-multibase.

Here are the alternatives, as I see them, and what I estimate each will take to implement.

1. Don't Preserve the multibase, Use Settings or Defaults to determine multibase

Accept CidV1 in any multibase, but ignore the base that is used. Output the CIDs based on either a global --cid-base flag or a config value.

This means that the string a user passes in may not be the same string that is returned. For example:

$ ipfs pin add zb2rhfwrxLo7wm3ZT2ishCdimPKWdRCe3KuxHFQYMR8Kyj96x
pinned bafkreiekhi5qqy64ovh7jckh5ewnyt6lcvfzqfubv3l66cazo5dknu6dre recursively

This also means that #5234 (ipfs resolve should respect CID base) will not be implemented.

The most straightforward way to implement this is to replace every call of cid.String() with something like cid.Encode(base) (see ipfs/go-cid#60) and make sure the base is available at the point where this method needs to be called. The WithValue functionality on contexts can be helpful here.

As most CIDs are converted to strings on the server side, rather than on the client, a quick hack such as setting a global variable is not going to work.

Another way to implement this, at least when working with the API, is to modify our API to make CIDs a special type and then set the multibase as part of the writer that formats the output for the user. This means extra CPU cycles, as the CID is converted from a binary representation to a string twice. If we go this route, I recommend the API be hardcoded to use base32, as it is a more efficient encoding than our current default of base58btc.
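A rough sketch of the client-side half of that idea, again using only the standard library: the API always sends base32, and the output writer re-encodes for the user. APICid and its Format method are hypothetical names, and only base16 and base32 targets are handled here.

```go
package main

import (
	"encoding/base32"
	"encoding/hex"
	"fmt"
)

// lowerBase32 is lowercase RFC 4648 base32 without padding (multibase 'b').
var lowerBase32 = base32.NewEncoding("abcdefghijklmnopqrstuvwxyz234567").
	WithPadding(base32.NoPadding)

// APICid is a hypothetical wire type: the API always sends CIDs as base32.
type APICid string

// Format re-encodes the CID in the base the user asked for. This is the
// second binary-to-string conversion mentioned above.
func (c APICid) Format(base byte) (string, error) {
	if len(c) < 2 || c[0] != 'b' {
		return "", fmt.Errorf("API CID %q is not base32", string(c))
	}
	if base == 'b' {
		return string(c), nil // already in the requested base
	}
	raw, err := lowerBase32.DecodeString(string(c[1:]))
	if err != nil {
		return "", err
	}
	switch base {
	case 'f':
		return "f" + hex.EncodeToString(raw), nil
	default:
		return "", fmt.Errorf("unsupported multibase prefix %q", base)
	}
}

func main() {
	// The base32 CID from the ipfs pin add example in this issue.
	c := APICid("bafkreiekhi5qqy64ovh7jckh5ewnyt6lcvfzqfubv3l66cazo5dknu6dre")
	s, err := c.Format('f')
	if err != nil {
		panic(err)
	}
	fmt.Println("pinned", s, "recursively")
}
```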

2. Preserve the multibase. Children of DAGs inherit multi-base of parent when known.

When a CID is decoded, preserve the multibase used with the CID so that when it is converted back to a string the same multibase will be used. In addition, associate the CIDs of any children in the DAG with the same multibase prefix. When a CID does not have a multibase associated, use the setting from the command line or config option.

ipfs pin add will then display the same base. In addition commands like ipfs ls will use the same base to display directory entries as the hash used for the parent.

$ ipfs ls f017012205be9e545b52def2d18a922146c6cdec6bcbf4007b6ee7bec3341673bbf1d8c06
f01551220f6ba0f2491f2371dee0ea66e8812331b78f162697d2f41b5d55b667e5ae6bc67 3606  diff.go
f01551220e43f16413e6e1aa73eee0194d40363b7082d06faae8d6e94f18bd5f8cc08dfee 18299 object.go
f015512201985aa3462ff163c46de0f5e6634bde41f2bd537780ccbe5401aea1cccc49577 9057  patch.go

Implementing this will involve associating a multibase with a CID when it is known (see ipfs/go-cid#66). It will also involve setting the multibase of child CIDs in the DAG code. New CIDs will also need to get the base set somewhere.

After thinking about this, I believe it will involve fewer code changes than #1, as the base does not need to be set everywhere, although the changes may not be as straightforward.

I think this is the most user friendly approach.

3. Attempt to Preserve the multibase.

This solution does not store the multibase with the CID but makes an attempt to output the same multibase on an ad-hoc basis. For the current implementation of ipfs resolve it turned out to be easy. ipfs pin add is more difficult. It was in fact after looking at ipfs pin add that I decided that #2 will be required to do it properly, especially when multiple CIDs are given on the command line.

Implementing this involves doing #1 and, in addition, detecting the multibase of any CIDs used and setting and using the correct base for the output. Things can get tricky when CIDs with different multibases are used at once. This will involve more code than either other solution.
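The mixed-base case can be sketched as follows. CommonBase is a hypothetical helper; it assumes every argument is a CIDv1 string that starts with a multibase prefix, so CIDv0 strings (which carry no prefix) would need separate handling.

```go
package main

import "fmt"

// CommonBase (hypothetical) returns the multibase prefix shared by all CID
// strings in args. When args is empty or the prefixes disagree, it returns
// fallback and false, which is where the output stops being predictable.
func CommonBase(args []string, fallback byte) (base byte, ok bool) {
	if len(args) == 0 || args[0] == "" {
		return fallback, false
	}
	base = args[0][0]
	for _, a := range args[1:] {
		if a == "" || a[0] != base {
			return fallback, false
		}
	}
	return base, true
}

func main() {
	// CID strings taken from the examples in this issue.
	mixed := []string{
		"zb2rhfwrxLo7wm3ZT2ishCdimPKWdRCe3KuxHFQYMR8Kyj96x",
		"f017012205be9e545b52def2d18a922146c6cdec6bcbf4007b6ee7bec3341673bbf1d8c06",
	}
	base, ok := CommonBase(mixed, 'b')
	fmt.Printf("base=%c agreed=%v\n", base, ok)
}
```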

#5289 started the effort towards this solution (see the code for ipfs resolve and ipfs ls).

I do not like this solution as it will only work some of the time and prefer we go with #1 or #2 for more predictable output.
