4 changes: 2 additions & 2 deletions src/http-gateways/path-gateway.md
@@ -35,7 +35,7 @@ editors:
name: Protocol Labs
url: https://protocol.ai/
xref:
- url
- rfc3986
- trustless-gateway
- subdomain-gateway
- dnslink-gateway
@@ -511,7 +511,7 @@ When deserialized responses are enabled,
and no explicit response format is provided with the request, and the
requested data itself has no built-in content type metadata, implementations
SHOULD perform content type sniffing based on file name
(from :ref[url] path, or optional [`filename`](#filename-request-query-parameter) parameter)
(from URI path, or optional [`filename`](#filename-request-query-parameter) parameter)
and magic bytes to improve the utility of produced responses.

For example:
290 changes: 290 additions & 0 deletions src/ipips/ipip-0518.md
@@ -0,0 +1,290 @@
---
title: "IPIP-0518: HTTP(S) URLs in Routing V1 API"
date: 2025-10-13
ipip: proposal
editors:
  - name: Marcin Rataj
    github: lidel
    url: https://lidel.org/
    affiliation:
      name: Shipyard
      url: https://ipshipyard.com
relatedIssues:
- https://github.com/ipfs/specs/issues/192
- https://github.com/ipfs/specs/issues/496
- https://github.com/multiformats/multiaddr/issues/63
- https://github.com/multiformats/multiaddr/issues/87
- https://github.com/ipshipyard/roadmaps/issues/15
- https://github.com/ipfs/specs/pull/518
order: 518
tags: ['ipips']
xref:
- rfc3986
---

## Summary

Allow HTTP(S) URLs alongside multiaddrs in the `Addrs` field of the Peer schema in the Delegated Routing V1 HTTP API to enable easier integration with HTTP-based infrastructure.

## Motivation

The current Delegated Routing V1 HTTP API requires all peer addresses to be encoded as [multiaddrs](https://github.com/multiformats/multiaddr). While multiaddrs provide a flexible and protocol-agnostic way to represent network addresses, many IPFS services are primarily accessible via HTTP(S) endpoints, including:

- IPFS Gateways (both path and subdomain gateways)
- Delegated routing endpoints themselves
- HTTP-based content providers and pinning services

Converting HTTP(S) URLs to multiaddrs requires additional complexity:
- HTTP URLs must be encoded as `/dns4/example.com/tcp/80/http` or `/dns4/example.com/tcp/443/https`
- This conversion is not intuitive for developers familiar with web standards
- It does not capture HTTP semantics where the same website can be exposed on both TCP (HTTP/1.1, HTTP/2) and UDP (HTTP/3)
- A single `https://example.com` URL automatically supports multiple transport protocols, but multiaddr representation requires separate entries for each transport
- Parsing multiaddrs back to URLs requires additional libraries and logic
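
To make the conversion cost concrete, here is a minimal Python sketch of a naive URL-to-multiaddr converter. The function name `url_to_multiaddr` and the choice of `/dns4` and `/tcp` are assumptions made for illustration only; existing libraries differ in exactly these choices.

```python
from urllib.parse import urlsplit

# Assumption: default ports are filled in by the converter.
DEFAULT_PORTS = {"http": 80, "https": 443}


def url_to_multiaddr(url):
    """Naively encode an HTTP(S) URL as a /dns4/.../tcp/... multiaddr string.

    Hypothetical helper: it hard-codes IPv4 DNS and TCP, and silently
    drops the path, query, and fragment, so the result cannot be
    converted back to the original URL.
    """
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return f"/dns4/{parts.hostname}/tcp/{port}/{parts.scheme}"


print(url_to_multiaddr("https://example.com"))
print(url_to_multiaddr("https://example.com/some/path?x=1"))
```

Both calls produce the same multiaddr string, which illustrates why the round trip is lossy and why a single URL cannot express "TCP or UDP" in one entry.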

By allowing native HTTP(S) URLs in the `Addrs` field, we can:
- Simplify integration with existing web infrastructure
- Reduce conversion overhead for HTTP-based services
- Improve developer experience by using familiar URL formats
- Improve interoperability with the wider HTTP and URI ecosystem
- Leave room for future experimentation with non-HTTP URI schemes, without requiring permission from gatekeepers
- Maintain backward compatibility with existing multiaddr-based implementations

## Detailed design

### Changes to the Peer Schema

The `Addrs` field in the [Peer Schema](https://specs.ipfs.tech/routing/http-routing-v1/#peer-schema) will accept both multiaddr strings and HTTP(S) URL strings:

```json
{
  "Schema": "peer",
  "ID": "bafz...",
  "Addrs": [
    "/ip4/192.168.1.1/tcp/4001",
    "/dns4/libp2p-peer.example.com/tcp/4001/ws",
    "https://trustless-gateway.example.org",
    "https://custom-port.example.net:8443"
  ],
  "Protocols": ["transport-bitswap", ...]
}
```

### Parsing Logic

Implementations MUST use the following logic to distinguish between multiaddrs and URLs:

1. If a string in `Addrs` starts with `/` (forward slash), parse it as a multiaddr
2. Otherwise, attempt to parse it as a URI according to :cite[rfc3986]
3. If neither parsing succeeds, or if the address type is not supported by the implementation, the address MUST be ignored (skipped)
4. Processing MUST continue with the remaining addresses in the array
5. Implementations SHOULD log warnings for addresses they cannot parse or do not support

This approach ensures forward compatibility: new address types can be introduced without breaking existing clients, as unsupported addresses are simply skipped.
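
The steps above can be sketched in Python as follows. This is a non-normative illustration: the names `classify_addr` and `parse_addrs` are hypothetical, and the multiaddr branch only checks the leading `/` and a non-empty first component, where a real client would call a multiaddr library.

```python
import logging
from urllib.parse import urlsplit

log = logging.getLogger(__name__)

# Assumption: this client only supports the schemes listed in this IPIP.
SUPPORTED_SCHEMES = {"http", "https"}


def classify_addr(addr):
    """Return "multiaddr", "uri", or None when the entry must be skipped."""
    if addr.startswith("/"):
        # Step 1: leading slash means multiaddr. Placeholder validation only.
        components = addr.split("/")[1:]
        return "multiaddr" if components and components[0] else None
    try:
        # Step 2: otherwise try to parse as a URI.
        scheme = urlsplit(addr).scheme
    except ValueError:
        return None
    return "uri" if scheme in SUPPORTED_SCHEMES else None


def parse_addrs(addrs):
    """Keep supported addresses; skip and log everything else (steps 3-5)."""
    accepted = []
    for addr in addrs:
        kind = classify_addr(addr)
        if kind is None:
            log.warning("skipping unsupported address: %r", addr)
            continue  # processing continues with the remaining entries
        accepted.append((kind, addr))
    return accepted
```

A client built this way keeps working when a server starts returning address types it does not know yet, since unknown entries are dropped one by one instead of failing the whole record.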

### Supported URL Schemes

Initially, only the following URL schemes SHOULD be supported:
- `http://` - HTTP endpoints
- `https://` - HTTPS endpoints

Future specifications MAY add support for additional schemes.

### URL Requirements

URLs in the `Addrs` field:
- MUST be absolute URLs (not relative)
- MUST include the scheme (`http://` or `https://`)
- SHOULD NOT include paths, query parameters, or fragments. Clients MUST nevertheless handle them defensively when present: act on them, ignore them, or skip such addresses
- SHOULD point to endpoints that support IPFS protocols listed in the `Protocols` field
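
A minimal Python sketch of how a client might apply these requirements is shown below. The function `normalize_url_addr` and its `extras` policy parameter are hypothetical names; the three policies mirror the "act on, ignore, or skip" options above.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url_addr(addr, extras="strip"):
    """Validate one URL entry from `Addrs`; return a usable URL or None.

    extras (hypothetical policy switch):
      "keep"  - act on path/query/fragment, return the URL unchanged
      "strip" - ignore them, return only scheme://authority
      "skip"  - drop any URL that carries them
    """
    try:
        parts = urlsplit(addr)
    except ValueError:
        return None
    # MUST be absolute and MUST carry an http(s) scheme.
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None
    has_extras = parts.path not in ("", "/") or bool(parts.query) or bool(parts.fragment)
    if has_extras and extras == "skip":
        return None
    if has_extras and extras == "strip":
        return urlunsplit((parts.scheme, parts.netloc, "", "", ""))
    return addr
```

Which policy is appropriate depends on the client; the sketch defaults to stripping because the requirement says such components should not be present in the first place.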

### Examples

#### HTTPS-only Content Provider

```json
{
  "Schema": "peer",
  "ID": "12D3KooWExample...",
  "Addrs": [
    "https://trustless-gateway.example.com"
  ],
  "Protocols": ["transport-ipfs-gateway-http"]
}
```

#### Hybrid Peer with Multiple Transports

```json
{
  "Schema": "peer",
  "ID": "12D3KooWExample...",
  "Addrs": [
    "/ip4/192.168.1.1/tcp/4001",
    "/ip4/192.168.1.1/udp/4001/quic-v1",
    "https://my-node.example.org:8080"
  ],
  "Protocols": ["transport-bitswap", "transport-ipfs-gateway-http"]
}
```

## Design rationale

### Why not create a new field?

Adding URLs to the existing `Addrs` field rather than creating a new field (e.g., `URLs`) has several advantages:
- Maintains backward compatibility - existing clients continue to work
- Avoids duplication when the same endpoint can be expressed as both multiaddr and URL
- Simplifies the schema without adding complexity
- Follows the principle that addresses are addresses, regardless of encoding

### Clear disambiguation

The parsing rule (strings starting with `/` are multiaddrs, others are URIs) provides clear, unambiguous disambiguation:
- Multiaddrs ALWAYS start with `/` by specification
- Absolute URLs NEVER start with `/`: per :cite[rfc3986], they start with a scheme (such as `http://`), and a scheme always begins with a letter
- This makes parsing deterministic and fast

### Incremental adoption

This change allows for incremental adoption:
- Clients that don't understand URLs can simply skip them
- Servers can start including URLs immediately for URL-aware clients
- No flag day or coordinated upgrade required

## User benefit

This change benefits multiple user groups:

### For developers

- Simplified integration with existing HTTP infrastructure
- No need for multiaddr encoding/decoding libraries for HTTP endpoints
- Clearer, more readable configurations and debugging
- Lower barrier to adoption: developers can implement HTTP-based routing and retrieval without having to re-implement libp2p concepts like Multiaddr, which makes it far easier to create light IPFS clients

### For service providers

- Easier to advertise HTTP-based services
- Can provide URLs that include paths and query parameters if needed
- Reduced complexity in route announcements

### For end users

- Potentially faster connection establishment to HTTP services
- Better compatibility with web-based IPFS implementations
- Lower barrier for creating new clients gives end users more choice and less vendor lock-in
- Provides a viable escape path in case any of the open source projects gets captured by forces that do not put end users' interests first

## Compatibility

This IPIP is fully backward and forward compatible:

### For existing clients

- Clients that only understand multiaddrs MUST skip URL entries they don't recognize (this is already implemented and proven to work when new protocols like `/quic`, `/quic-v1`, `/webtransport`, and `/webrtc-direct` were rolled out)
- Clients MUST continue processing remaining addresses even when encountering unsupported entries
- No changes required to existing parsing logic for multiaddr strings
- The `Addrs` field remains an array of strings

### Forward compatibility

- The requirement to skip unsupported addresses ensures that new address types can be added in the future
- Clients MUST NOT fail when encountering unknown address formats
- This allows the ecosystem to evolve without breaking existing implementations and without the need for permission or central coordination

### For existing servers

- Servers can continue sending only multiaddrs
- No changes required if URLs are not used

### Migration path

1. Servers can start including both multiaddrs and URLs for the same endpoints
2. Clients can be updated to parse URLs at their own pace
3. Eventually, servers may choose to only send URLs for HTTP(S) endpoints

## Security

### URL validation

Implementations SHOULD validate URLs to prevent security issues:
- Verify the URL scheme is allowed (`http://` or `https://`)
- Consider rate limiting for URL-based connections if non-success (!=200) responses are received
- Validate URL length limits (DNS names are limited to 253 characters; practical URL length is typically 2048-8192 characters depending on implementation)
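
The scheme and length checks can be sketched in Python as follows. The constant `MAX_URL_LENGTH` is an assumption picked from the lower end of the range mentioned above, and `is_acceptable_url` is a hypothetical helper name; rate limiting is out of scope for this sketch.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}
MAX_DNS_NAME_LENGTH = 253
MAX_URL_LENGTH = 2048  # assumption: conservative end of the 2048-8192 range


def is_acceptable_url(addr):
    """Return True when the URL passes the scheme and length checks."""
    if len(addr) > MAX_URL_LENGTH:
        return False
    try:
        parts = urlsplit(addr)
        host = parts.hostname
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES or not host:
        return False
    return len(host) <= MAX_DNS_NAME_LENGTH
```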

### HTTPS preference

Implementations SHOULD ignore `http://` URLs and only act on `https://` URLs for security and performance (HTTP/2 multiplexing) reasons.

The `http://` scheme SHOULD be allowed only for testing and private LAN deployments, and only when an explicit opt-in flag is set by the end user.
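
One way to express this preference in a client is an explicit opt-in flag, sketched below in Python. The function name `usable_http_urls` and the flag `allow_insecure_http` are hypothetical; multiaddr entries are assumed to be handled elsewhere.

```python
from urllib.parse import urlsplit


def usable_http_urls(addrs, allow_insecure_http=False):
    """Return the URL entries a client should act on.

    `https://` URLs are always kept; `http://` URLs are kept only when
    the end user explicitly opted in (testing or private LAN use).
    """
    usable = []
    for addr in addrs:
        if addr.startswith("/"):
            continue  # multiaddr, not covered by this sketch
        try:
            scheme = urlsplit(addr).scheme
        except ValueError:
            continue
        if scheme == "https" or (scheme == "http" and allow_insecure_http):
            usable.append(addr)
    return usable
```

Keeping the flag off by default means a misconfigured server advertising `http://` endpoints cannot downgrade clients that never asked for it.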

### DNS considerations

URLs rely on DNS resolution, which has different security properties than IP-based multiaddrs. The same rules that apply to `/dns`, `/dns4`, and `/dns6` multiaddrs apply here:
- DNS responses can be spoofed if DNSSEC is not used
- Clients SHOULD use secure DNS transports where available
- Certificate validation MUST be performed for HTTPS URLs

## Alternatives

### Separate URL field

Adding a separate `URLs` field was considered but rejected because:
- It would complicate the schema
- It could lead to confusion about which field to use
- It wouldn't be backward compatible

### URL-to-multiaddr conversion requirement

Requiring all HTTP endpoints to be encoded as multiaddrs was the status quo but has proven cumbersome in practice. Multiple implementations in the npm and Go ecosystems alone differed slightly in how they handled the scheme, default port, optional path, fragment, and HTTP basic-auth. This led to hard-to-debug errors, because multiaddr-URL conversion is ultimately lossy and a 1:1 round-trip is not possible (see [multiaddr#63](https://github.com/multiformats/multiaddr/issues/63)).

### Custom multiaddr protocols with keyword arguments

Adding keyword arguments to multiaddr protocols was proposed in [multiaddr#87](https://github.com/multiformats/multiaddr/issues/87) to allow expressing `https://` URLs as multiaddrs without losing any information on conversion. This approach was not adopted because it would add even more complexity that multiaddr implementers would have to deal with.

This solution was not feasible - adding native URI support is better as it removes walls and obstacles, rather than making existing ones taller.

## Test fixtures

Implementations can test compatibility using these example responses:

### Mixed addresses response

```json
{
  "Providers": [
    {
      "Schema": "peer",
      "ID": "12D3KooWTest1...",
      "Addrs": [
        "/ip4/127.0.0.1/tcp/4001",
        "http://localhost:8080",
        "/dns4/example.com/tcp/443/https",
        "https://example.net"
      ],
      "Protocols": ["transport-bitswap", "transport-ipfs-gateway-http"]
    }
  ]
}
```

### URL-only response

```json
{
  "Providers": [
    {
      "Schema": "peer",
      "ID": "12D3KooWTest2...",
      "Addrs": [
        "https://trustless-gateway.example.org"
      ],
      "Protocols": ["transport-ipfs-gateway-http"]
    }
  ]
}
```
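
As a rough illustration of how such fixtures might be used, the Python sketch below loads the mixed addresses response and splits `Addrs` into multiaddrs and URLs. The helper `split_addrs` is hypothetical, and the scheme check is a simple lowercase prefix match rather than a full URI parser.

```python
import json

# The "Mixed addresses response" fixture, copied verbatim.
MIXED_FIXTURE = """
{
  "Providers": [
    {
      "Schema": "peer",
      "ID": "12D3KooWTest1...",
      "Addrs": [
        "/ip4/127.0.0.1/tcp/4001",
        "http://localhost:8080",
        "/dns4/example.com/tcp/443/https",
        "https://example.net"
      ],
      "Protocols": ["transport-bitswap", "transport-ipfs-gateway-http"]
    }
  ]
}
"""


def split_addrs(response_text):
    """Split every provider's Addrs into (multiaddrs, urls), skipping the rest."""
    multiaddrs, urls = [], []
    for provider in json.loads(response_text).get("Providers", []):
        for addr in provider.get("Addrs", []):
            if addr.startswith("/"):
                multiaddrs.append(addr)
            elif addr.startswith(("http://", "https://")):
                urls.append(addr)
            # anything else is an unsupported address type and is skipped
    return multiaddrs, urls


multiaddrs, urls = split_addrs(MIXED_FIXTURE)
print(multiaddrs)
print(urls)
```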

## Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
27 changes: 19 additions & 8 deletions src/routing/http-routing-v1.md
@@ -36,7 +36,9 @@ editors:
url: https://ipshipyard.com
xref:
- ipip-0337
- ipip-0518
- ipns-record
- rfc3986
order: 0
tags: ['routing']
---
@@ -52,7 +54,11 @@ As such, human-readable encodings of types are preferred. This specification may
## Common Data Types

- CIDs are always string-encoded using a [multibase]-encoded [CIDv1].
- Multiaddrs are string-encoded according to the [human-readable multiaddr specification][multiaddr].
- Addresses in the `Addrs` field can be:
  - Multiaddrs: string-encoded according to the [human-readable multiaddr specification][multiaddr], always starting with `/`
  - HTTP(S) URLs: absolute URLs with `http://` or `https://` schemes, parsed as URIs according to :cite[rfc3986]
  - Parsing logic: if a string starts with `/`, parse as multiaddr; otherwise, parse as URI
  - Unsupported addresses: implementations MUST skip addresses they cannot parse or do not support, and MUST continue processing remaining addresses (see [IPIP-0518](https://specs.ipfs.tech/ipips/ipip-0518/))
- Peer IDs are string-encoded according to the [PeerID string representation specification][peer-id-representation]: either a Multihash in Base58btc, or a CIDv1 with libp2p-key (`0x72`) codec in Base36 or Base32.
- Multibase bytes are string-encoded according to [the Multibase spec][multibase], and SHOULD use base64.
- Timestamps are Unix millisecond epoch timestamps.
@@ -77,16 +83,18 @@ This API uses a standard version prefix in the path, such as `/v1/...`. If a bac

Optional `?filter-addrs` to apply Network Address Filtering from [IPIP-484](https://specs.ipfs.tech/ipips/ipip-0484/).

- `?filter-addrs=<comma-separated-list>` optional parameter that indicates which network transports to return by filtering the multiaddrs in the `Addrs` field of the [Peer schema](#peer-schema).
- `?filter-addrs=<comma-separated-list>` optional parameter that indicates which network transports to return by filtering the addresses in the `Addrs` field of the [Peer schema](#peer-schema).
- The value of the `filter-addrs` parameter is a comma-separated (`,` or `%2C`) list of network transport protocol _name strings_ as defined in the [multiaddr protocol registry](https://github.com/multiformats/multiaddr/blob/master/protocols.csv), e.g. `?filter-addrs=tls,webrtc-direct,webtransport`.
- `unknown` can be be passed to include providers whose multiaddrs are unknown, e.g. `?filter-addrs=unknown`. This allows for not removing providers whose multiaddrs are unknown at the time of filtering (e.g. keeping DHT results that require additional peer lookup).
- Multiaddrs are filtered by checking if the protocol name appears in any of the multiaddrs (logical OR).
- Negative filtering is done by prefixing the protocol name with `!`, e.g. to skip IPv6 and QUIC addrs: `?filter-addrs=!ip6,!quic-v1`. Note that negative filtering is done by checking if the protocol name does not appear in any of the multiaddrs (logical AND).
- `unknown` can be passed to include providers whose addresses are unknown, e.g. `?filter-addrs=unknown`. This allows for not removing providers whose addresses are unknown at the time of filtering (e.g. keeping DHT results that require additional peer lookup).
- Addresses are filtered by checking if the protocol name appears in any of the multiaddrs, or if the URI scheme matches for HTTP(S) URLs (logical OR in both cases).
- Example: `http` can be passed to include providers whose addresses are HTTP-compatible. This will include `http://` and `https://` URIs, as well as `/http`, `/https`, and `/tls/http` Multiaddrs.
- For the purpose of filtering, implementations SHOULD include `/tls/http` Multiaddrs when `https` is passed as a filter to ensure composed multiaddrs are included in results.
- Negative filtering is done by prefixing the protocol name with `!`, e.g. to skip IPv6 and QUIC addrs: `?filter-addrs=!ip6,!quic-v1`. Note that negative filtering is done by checking if the protocol name does not appear in any of the addresses (logical AND).
- If no parameter is passed, the default behavior is to return the original list of addresses unchanged.
- If only negative filters are provided, addresses not passing any of the negative filters are included.
- If positive filters are provided, only addresses passing at least one positive filter (and no negative filters) are included.
- If both positive and negative filters are provided, the address must pass all negative filters and at least one positive filter to be included.
- If there are no multiaddrs that match the passed transports, the provider is omitted from the response.
- If there are no addresses that match the passed transports, the provider is omitted from the response.
- Filtering is case-insensitive.
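
The per-address part of these rules can be sketched in Python as below. This is a non-normative illustration under stated assumptions: `addr_protocols` and `filter_addrs` are hypothetical names, every multiaddr path segment is treated as a possible protocol name instead of consulting the protocol registry, the parameter is assumed to be already percent-decoded, and handling of `unknown` and of omitting providers with no remaining addresses is left to the caller.

```python
from urllib.parse import urlsplit


def addr_protocols(addr):
    """Return the lower-cased names an address can be matched against."""
    if addr.startswith("/"):
        # Simplification: every segment is a candidate protocol name.
        parts = [segment.lower() for segment in addr.split("/") if segment]
        names = set(parts)
        # Composed /tls/http is treated as https for filtering purposes.
        if any(a == "tls" and b == "http" for a, b in zip(parts, parts[1:])):
            names.add("https")
        if "https" in names:
            names.add("http")  # the `http` filter also covers https
        return names
    try:
        scheme = urlsplit(addr).scheme
    except ValueError:
        return set()
    if scheme == "https":
        return {"http", "https"}
    return {scheme} if scheme else set()


def filter_addrs(addrs, filter_param):
    """Apply a decoded `filter-addrs` value to one provider's addresses."""
    if not filter_param:
        return list(addrs)  # no parameter: return the list unchanged
    tokens = [t.strip().lower() for t in filter_param.split(",") if t.strip()]
    negative = {t[1:] for t in tokens if t.startswith("!")}
    positive = {t for t in tokens if not t.startswith("!")}
    kept = []
    for addr in addrs:
        names = addr_protocols(addr)
        if names & negative:
            continue  # must pass all negative filters
        if positive and not (names & positive):
            continue  # must pass at least one positive filter
        kept.append(addr)
    return kept
```

For example, under these assumptions `filter_addrs(addrs, "https,!tls")` keeps `https://` URLs and `/https` multiaddrs but drops `/tls/http` multiaddrs, because the negative filter is applied first.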

##### `filter-protocols` (providers request query parameter)
@@ -315,14 +323,17 @@ The `peer` schema represents an arbitrary peer.
{
"Schema": "peer",
"ID": "bafz...",
"Addrs": ["/ip4/..."],
"Addrs": ["/ip4/...", "https://trustless-gateway.example.com"],
"Protocols": ["transport-bitswap", ...]
...
}
```

- `ID`: the [Peer ID][peer-id] as Multihash in Base58btc or CIDv1 with libp2p-key codec.
- `Addrs`: an optional list of known [multiaddrs][multiaddr] for this peer.
- `Addrs`: an optional list of known addresses for this peer, which can include both:
  - [Multiaddrs][multiaddr]: strings starting with `/`, e.g., `/ip4/192.168.1.1/tcp/4001`
  - HTTP(S) URLs: absolute URLs with `http://` or `https://` schemes, e.g., `https://trustless-gateway.example.com`
  - Implementations MUST skip addresses they cannot parse or do not support and continue with remaining addresses
  - If missing or empty, it means the router server is missing that information, and the client should use `ID` to look up updated peer information.
- `Protocols`: an optional list of protocols known to be supported by this peer.
  - If missing or empty, it means the router server is missing that information, and the client should use `ID` and `Addrs` to connect to the peer and use the [libp2p identify protocol](https://github.com/libp2p/specs/tree/master/identify) to learn about supported ones.