Support for atoms containing any Unicode code point was added
in Erlang/OTP 20 (PR-1078).
After that change, an atom can contain up to 255 Unicode code
points. However, atoms used in Erlang source code are still limited to
255 bytes because the atom table in the BEAM file has only a single
byte for holding the length in bytes of the atom text. For instance,
the `🟦` character has a four-byte encoding (`<<240,159,159,166>>`),
meaning that Erlang source code containing a literal atom consisting
of 64 or more such characters (at least 256 bytes) cannot be compiled.
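
The byte arithmetic above can be checked with a small helper. This is
only an illustrative sketch: the module name `atom_bytes` and both
function names are hypothetical and not part of this commit. It
measures the UTF-8 size of an atom's text and compares it with the
255-byte limit imposed by the old single length byte.

```erlang
-module(atom_bytes).
-export([utf8_size/1, fits_old_format/1]).

%% Hypothetical helper: number of bytes in the UTF-8 encoding of the
%% atom's text.
utf8_size(Atom) when is_atom(Atom) ->
    byte_size(atom_to_binary(Atom, utf8)).

%% Hypothetical helper: true if the byte length fits in the single
%% length byte used by the old atom table format.
fits_old_format(Atom) ->
    utf8_size(Atom) =< 255.
```

For example, `atom_bytes:utf8_size(list_to_atom(lists:duplicate(64, 16#1F7E6)))`
returns 256 for an atom made of 64 `🟦` characters, which is one byte
more than the old format can represent, while 63 such characters
(252 bytes) still fit.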
This commit changes the atom table in BEAM files to use a
variable-length encoding for the length of each atom. For atoms up to
15 bytes, the length is encoded in one byte. The header for the atom
table is also changed to indicate that the new encoding of lengths is
used.
Attempting to load a BEAM file compiled with Erlang/OTP 28 in
Erlang/OTP 27 or earlier will result in the following error message:

    1> l(t).
    =ERROR REPORT==== 8-Oct-2024::08:49:01.750424 ===
    beam/beam_load.c(150): Error loading module t:
      corrupt atom table
    {error,badfile}

`beam_lib` is updated to handle the new format. External tools that
use `beam_lib:chunks(Beam, [atoms])` to read the atom table will
continue to work. External tools that do their own parsing of the atom
table will need to be updated.
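
As a rough sketch of the format-independent approach, an external
tool could read the atom table through `beam_lib` along these lines.
The module name `atom_dump` and the function `atoms/1` are
hypothetical; only `beam_lib:chunks/2` comes from OTP.

```erlang
-module(atom_dump).
-export([atoms/1]).

%% Hypothetical helper: return the atoms of a BEAM file (given as a
%% file name or a binary) ordered by their index in the atom table.
atoms(Beam) ->
    case beam_lib:chunks(Beam, [atoms]) of
        {ok, {_Module, [{atoms, IndexedAtoms}]}} ->
            %% Each entry is an {Index, Atom} tuple; sort by index and
            %% keep only the atoms.
            {ok, [Atom || {_Index, Atom} <- lists:keysort(1, IndexedAtoms)]};
        {error, beam_lib, Reason} ->
            {error, Reason}
    end.
```

Calling `atom_dump:atoms(code:which(lists))` should return `{ok, Atoms}`
for the `lists` module. Because the decoding of the chunk is left to
`beam_lib`, a tool written this way does not need to know which length
encoding the file uses.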