Skip to content

Conversation

@erickcestari
Copy link
Contributor

@erickcestari erickcestari commented May 30, 2025

This PR updates the "Malformed: invalid currency UTF-8" test vector to use 3 bytes of invalid UTF-8 instead of 2 bytes, since some applications like rust-lightning only validate encoding/size with proper length encoding (03 length with 3-byte value).

This commit replaces it with a test vector that:

  • Uses length encoding (03 length with 3-byte of invalid UTF-8)

@t-bast
Copy link
Collaborator

t-bast commented Jun 2, 2025

@thomash-acinq can you review this and check what eclair does?

@thomash-acinq
Copy link

The current test vector uses the following TLV stream:

06 02 8041  // Type: 6 (offer_currency), Length: 2, Value: two bytes that form an invalid utf8 string
0a 05 414c494345  // Type: 10 (offer_description), Length: 5, Value: "ALICE"
16 21 020202020202020202020202020202020202020202020202020202020202020202  // Type: 22 (offer_issuer_id), Length: 33, Value : some public key

It does what it claims to do (test that the offer_currency is validated as a utf8 string) and I don't see any need to change it.

Your proposed change is

06 03 4d2aa4  // Type: 6 (offer_currency), Length: 3, Value: three bytes that form an invalid utf8 string
08 02 2710  // Type: 8 (offer_amount)
0a 0c 5465737420766563746f7273 // Type: 10 (offer_description)
16 21 02eec7245d6b7d2ccb30380bfbe2a3648cd7a942653f5aa340edcea1f283686619  // Type: 22 (offer_issuer_id)

which does not seem to be an improvement.

@erickcestari
Copy link
Contributor Author

The current test vector uses the following TLV stream:

06 02 8041  // Type: 6 (offer_currency), Length: 2, Value: two bytes that form an invalid utf8 string
0a 05 414c494345  // Type: 10 (offer_description), Length: 5, Value: "ALICE"
16 21 020202020202020202020202020202020202020202020202020202020202020202  // Type: 22 (offer_issuer_id), Length: 33, Value : some public key

It does what it claims to do (test that the offer_currency is validated as a utf8 string) and I don't see any need to change it.

Your proposed change is

06 03 4d2aa4  // Type: 6 (offer_currency), Length: 3, Value: three bytes that form an invalid utf8 string
08 02 2710  // Type: 8 (offer_amount)
0a 0c 5465737420766563746f7273 // Type: 10 (offer_description)
16 21 02eec7245d6b7d2ccb30380bfbe2a3648cd7a942653f5aa340edcea1f283686619  // Type: 22 (offer_issuer_id)

which does not seem to be an improvement.

Some applications (rust-lightning) validate currency codes by first checking for exactly 3 bytes before parsing the UTF-8 content. Since this test vector isn't 3 bytes long, those applications would reject it based on length rather than detecting the invalid UTF-8 encoding. Using a 3-byte invalid UTF-8 sequence would better test the UTF-8 validation logic specifically. I'll also remove the other fields to keep this test vector focused solely on offer_currency.

@thomash-acinq
Copy link

OK, I understand the problem now. But if we really want to check that this field is a valid ISO 4217 code we should also add test vectors with symbols (US$), lowercase letters (usd) or even codes that look good but are not part of the standard (ABC).
Eclair is not affected as it doesn't support currency conversion so it will reject any offer that sets offer_currency.

@erickcestari erickcestari force-pushed the fix-malformed-currency-utf8-test-vector branch 3 times, most recently from e09ce18 to 5d3a0d5 Compare June 4, 2025 12:25
@erickcestari
Copy link
Contributor Author

OK, I understand the problem now. But if we really want to check that this field is a valid ISO 4217 code we should also add test vectors with symbols (US$), lowercase letters (usd) or even codes that look good but are not part of the standard (ABC). Eclair is not affected as it doesn't support currency conversion so it will reject any offer that sets offer_currency.

I have change the offer to lno1qcplllhapqpq86q2q4qkc6trv5tzzq6muh550qsfva9fdes0ruph7ctk2s8aqq06r4jxj3msc448wzwy9s

that have this data part:
0603fffefd080203e80a05416c6963651621035be5e9478209674a96e60f1f037f6176540fd001fa1d64694770c56a7709c42c

06 03 fffefd  // Type: 6 (offer_currency), Length: 3, Value: three bytes that form an invalid utf8 string
08 02 03e8 // Type: 8 (offer_amount), Length: 2, Value: 1000
0a 05 416c696365 // Type: 10 (offer_description), Length: 5, Value: "Alice"
16 21 035be5e9478209674a96e60f1f037f6176540fd001fa1d64694770c56a7709c42c  // Type: 22 (offer_issuer_id), Length: 33, Value : some public key

@t-bast
Copy link
Collaborator

t-bast commented Jun 11, 2025

@thomash-acinq @rustyrussell can you test this updated test vector?

@t-bast
Copy link
Collaborator

t-bast commented Jul 25, 2025

@thomash-acinq @rustyrussell ping?

@thomash-acinq
Copy link

Eclair now supports the currency field and properly validates it.
The new test case is correct (as was the old one), however I think it should be replaced by something that looks a lot more like a real currency code, like "usd" or "ABC". There is only a small number of acceptable values and checking that it's valid UTF8 feels like checking that a number is nonnegative when the only valid values are 0 and 1.

Replace 2-byte invalid UTF-8 test with 3-byte version, as some
applications like rust-lightning only validate encoding/size with
proper length encoding (03 length with 3-byte value).
@erickcestari erickcestari force-pushed the fix-malformed-currency-utf8-test-vector branch from 5d3a0d5 to 117095c Compare July 28, 2025 10:28
@erickcestari erickcestari changed the title Fix currency UTF-8 test vector to actually test invalid UTF-8 sequences Update "invalid currency UTF-8" test vector Jul 28, 2025
@erickcestari
Copy link
Contributor Author

Eclair now supports the currency field and properly validates it. The new test case is correct (as was the old one), however I think it should be replaced by something that looks a lot more like a real currency code, like "usd" or "ABC". There is only a small number of acceptable values and checking that it's valid UTF8 feels like checking that a number is nonnegative when the only valid values are 0 and 1.

I got a bit confused about what the spec wants to say about the ISO 4217 requirement. Does "as an ISO 4217 three-letter code" mean it should validate against the actual ISO 4217 currency list, or just follow the 3-letter ASCII uppercase format?

If it's full validation against real currencies, then testing with "ABC" makes sense since it would be rejected. But if it's just format validation, then "ABC" would actually be valid and the UTF-8 test is more appropriate for catching actual format violations.

The spec seems ambiguous on this point. Clarifying this would help ensure consistent behavior across implementations.

@thomash-acinq
Copy link

For me "an ISO 4217 three-letter code" clearly means that it must be in the actual list, so "ABC" must fail, I don't find it ambiguous. That's how eclair does it.

@TheBlueMatt
Copy link
Collaborator

That list changes regularly (presumably), though. While I think it's reasonable to validate against, I don't think we'll bother trying to keep a list up to date. We just reject anything that isn't 3-char ASCII uppercase letters.

@rustyrussell
Copy link
Collaborator

We simply check that it's valid UTF-8 (it's part of our parsing code):

$ ./devtools/bolt12-cli decode lno1qcplllhapqpq86q2q4qkc6trv5tzzq6muh550qsfva9fdes0ruph7ctk2s8aqq06r4jxj3msc448wzwy9s
bolt12-cli: Bad offer: invalid offer data

If it's an unknown currency we will decode it and allow the user to fetch it (there are no requirements on the contents of this field unless "if it uses offer_amount to provide the user with a cost estimate:", which is the front end's responsibility):

lightning-cli decode lno1qcp5zsjrpqpzwyq2p32x2um5ypmx2cm5dae8x93pqthvwfzadd7jejes8q9lhc4rvjxd022zv5l44g6qah82ru5rdpnpj
{
   "type": "bolt12 offer",
   "offer_id": "f6b441732d02887b91b1f029ab308101e03f9e0c10ae3357f01045fb4970895d",
   "offer_currency": "ABC",
   "offer_amount": 10000,
   "warning_unknown_offer_currency": "unknown currency code",
   "offer_description": "Test vectors",
   "offer_issuer_id": "02eec7245d6b7d2ccb30380bfbe2a3648cd7a942653f5aa340edcea1f283686619",
   "valid": true
}

Note: we don't allow creation of such things (I had to use our external tool to make this): we can't parse "100ABC" because we don't know how many decimals. e.g. 100USD becomes 10,000 "offer_amount";

$ lightning-cli offer 1000ABC test-offer-bad-currency
{
   "code": -32602,
   "message": "Unknown currency suffix ABC"
}

Ack 5d3a0d5

@thomash-acinq
Copy link

Yes it's inconvenient to have to keep the list up to date, however it's necessary to be able to understand the offer. The list is important not just to check that the currency is valid (we could just show the currency code to the end user who we can assume is already familiar with such codes) but to display the amount. If I create an offer that costs 50$ and you don't validate against the list, you will display 5000 USD instead of the correct 50.00 USD.

@TheBlueMatt
Copy link
Collaborator

Right, the actual user-facing logic really needs to know what the currency is (also to do conversion), we just punt that upstream rather than handling it ourselves (and it sounds like CLN does too).

Copy link
Collaborator

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 117095c

t-bast added a commit to ACINQ/eclair that referenced this pull request Aug 12, 2025
thomash-acinq pushed a commit to ACINQ/eclair that referenced this pull request Aug 18, 2025
@t-bast t-bast merged commit 39713c5 into lightning:master Aug 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants