This module adds a Unicode behavior to ZMK. Some highlights:
- Add any code point to the keymap using &uc without prior definition
- Convenience macros for even easier input of commonly used code points
- Configurable input systems that can be switched while the keyboard is in use
To load the module, add the following entries to remotes and projects in config/west.yml.
manifest:
  remotes:
    - name: zmkfirmware
      url-base: https://github.com/zmkfirmware
    - name: urob
      url-base: https://github.com/urob
  projects:
    - name: zmk
      remote: zmkfirmware
      revision: v0.3
      import: app/west.yml
    - name: zmk-unicode
      remote: urob
      revision: v0.3 # set to same as ZMK version above
  self:
    path: config
For Unicode input to work one must prepare both the keyboard and the operating system. Please read through the entire section to understand what is required.
To initialize the Unicode module, include the unicode.dtsi header near the top of the keymap:
#include <behaviors/unicode.dtsi>
Manual input. Unicode code points are added to the keymap using &uc CP1 CP2, where CP1 and CP2 are hexadecimal code points. The former is produced if the key is pressed by itself, the latter if the key is pressed while Shift is active. To yield a code point CP independently of Shift, one can use the shortcut &uc CP 0 (or &uc CP CP).
For instance, &uc 0xE4 0xC4 yields ä (U+00E4) when pressed by itself and Ä (U+00C4) when Shift is active. By contrast, &uc 0xE4 0 always yields ä.
Code points must be in the range of 0x00 to 0x10FFFF. Leading zeros can be omitted without loss; i.e., &uc 0xE4 0 is equivalent to &uc 0x00E4 0.
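The Shift rule and the range limit above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the module's implementation; the function name select_code_point is hypothetical.

```python
MAX_CODE_POINT = 0x10FFFF  # upper bound stated in the text above


def select_code_point(cp1: int, cp2: int, shift_active: bool) -> int:
    """Return the code point a `&uc CP1 CP2` binding would yield (illustrative)."""
    for cp in (cp1, cp2):
        if not 0 <= cp <= MAX_CODE_POINT:
            raise ValueError(f"code point out of range: {cp:#x}")
    # A second argument of 0 is the `&uc CP 0` shortcut: Shift is ignored.
    if shift_active and cp2 != 0:
        return cp2
    return cp1


print(f"U+{select_code_point(0xE4, 0xC4, shift_active=False):04X}")  # U+00E4
print(f"U+{select_code_point(0xE4, 0xC4, shift_active=True):04X}")   # U+00C4
print(f"U+{select_code_point(0x00E4, 0, shift_active=True):04X}")    # U+00E4
```

Because the firmware itself does not validate code points, a check like this can be useful when generating keymaps with a script.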
Padding. Regardless of whether leading zeros are omitted in the keymap, they are omitted by default when the code point is sent to the OS. This tends to be more reliable for the input systems that I have personally tested. Should this cause issues, one can force padding to a minimum length using the minimum-length property. E.g.,
&uc {
    minimum-length = <4>;
};
Please let me know if certain input systems require strictly positive padding.
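To make the effect of minimum-length concrete, here is a small Python sketch of how a code point could be turned into the digit string that is typed. It is an assumption-based illustration (the helper hex_digits is hypothetical and the firmware is not written in Python); it only mirrors the padding rule described above.

```python
def hex_digits(code_point: int, minimum_length: int = 0) -> str:
    """Hex digits for a code point, left-padded with zeros to minimum_length."""
    digits = format(code_point, "X")      # no leading zeros by default
    return digits.zfill(minimum_length)   # pads only if the string is shorter


print(hex_digits(0xE4))         # E4
print(hex_digits(0xE4, 4))      # 00E4
print(hex_digits(0x1F596, 4))   # 1F596
```

Note that padding never truncates: a code point with more digits than minimum-length is sent in full.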
Convenience macros. This module includes a collection of convenience macros to simplify the inclusion of common code points. For instance, instead of using &uc 0xE4 0xC4 to get ä/Ä one can equivalently use &uc UC_DE_AE. All currently available macros can be seen here.
There are six configurable input systems (see below for descriptions and further customization options). Each input system is associated with a Default-mode label used to configure the initial system when the keyboard starts up, and a Set-mode label used to switch input systems while the keyboard is in use.
| Input system | Default-mode | Set-mode |
|---|---|---|
| macOS | UC_MODE_MACOS | UC_SET_MACOS |
| Linux | UC_MODE_LINUX | UC_SET_LINUX |
| Linux (alt) | UC_MODE_LINUX_ALT | UC_SET_LINUX_ALT |
| Windows (WinCompose) | UC_MODE_WIN_COMPOSE | UC_SET_WIN_COMPOSE |
| Windows (HexNumpad) | UC_MODE_WIN_ALT | UC_SET_WIN_ALT |
| Emacs | UC_MODE_EMACS | UC_SET_EMACS |
Initial system. When the keyboard starts up, it selects the system defined by the default-mode property of the uc behavior. To set it, add the following to your keymap outside its root node.
&uc {
    default-mode = <UC_MODE_LINUX>; // Replace with desired input system.
};
Switching while in use. To switch the input system while the keyboard is in use, add &uc UC_SET_MODE bindings to your keymap, where UC_SET_MODE stands for one of the Set-mode labels listed above. For instance, &uc UC_SET_LINUX will switch the input system to Linux.
Most input systems require some preparation of the OS. Continue reading to learn about the differences between these systems, additional configuration options, and how to prepare your OS.
1. macOS (UC_MODE_MACOS)
macOS has built-in support for Unicode input, supporting all possible code points.
To enable, go to System Preferences → Keyboard → Input Sources, then add Unicode Hex Input to the list (under Other), and activate it from the input dropdown in the menu bar. Note that this may disable some Option-based shortcuts such as Option+Left and Option+Right.
The UC_MODE_MACOS input system has one configurable property, macos-key, which defaults to LALT. The system will:
- press and hold macos-key (LALT by default)
- input the code point sequence
- release macos-key
To overwrite macos-key, add the following outside of the root node of your keymap:
&uc {
    macos-key = <LALT>; // replace with desired key
};
2. Linux IBus (UC_MODE_LINUX)
For desktop environments with IBus, Unicode input is enabled by default, supports all possible code
points, and works almost anywhere. Without IBus, it works under GTK apps, but rarely anywhere else.
(Though, according to this Stack Exchange answer, it is possible to install IBus under other DEs.)
If the system is not working in certain applications, it is worth trying out UC_MODE_LINUX_ALT.
The UC_MODE_LINUX input system has one configurable property, linux-key, which defaults to LC(LS(U)). The system will:
- tap and release linux-key (LC(LS(U)) by default)
- input the code point sequence
- tap and release SPACE
To overwrite linux-key, add the following outside of the root node of your keymap:
&uc {
    linux-key = <LC(LS(U))>; // replace with desired key
};
3. Linux Alt (UC_MODE_LINUX_ALT)
This is a variant of UC_MODE_LINUX, which keeps holding LCTRL + LSHFT for the entire input.
The UC_MODE_LINUX_ALT input system has one configurable property, linux-alt-key, which defaults to LC(LSHFT). The system will:
- press and hold linux-alt-key (LC(LSHFT) by default)
- tap and release U
- input the code point sequence
- tap and release SPACE
- release linux-alt-key
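The difference between UC_MODE_LINUX and UC_MODE_LINUX_ALT is easiest to see side by side. The following Python sketch lists the key events for each mode as described in the steps above, using the default keys. The function names and the event strings are hypothetical and chosen for illustration; this is not how the firmware represents key events.

```python
def digit_taps(code_point: int) -> list[str]:
    """One tap per hex digit of the code point (no padding)."""
    return [f"tap {d}" for d in format(code_point, "X")]


def linux_sequence(code_point: int) -> list[str]:
    # UC_MODE_LINUX: the whole chord LC(LS(U)) is tapped once up front.
    return ["tap LC(LS(U))"] + digit_taps(code_point) + ["tap SPACE"]


def linux_alt_sequence(code_point: int) -> list[str]:
    # UC_MODE_LINUX_ALT: Ctrl+Shift stay held until the sequence is complete.
    return (
        ["press LC(LSHFT)", "tap U"]
        + digit_taps(code_point)
        + ["tap SPACE", "release LC(LSHFT)"]
    )


print(linux_sequence(0xE4))
print(linux_alt_sequence(0xE4))
```

For ä (U+00E4), the first call yields four events and the second yields six, with the digits E and 4 typed while the modifiers are still held.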
To overwrite linux-alt-key, add the following outside of the root node of your keymap:
&uc {
    linux-alt-key = <LC(LSHFT)>; // replace with desired key
};
4. WinCompose (UC_MODE_WIN_COMPOSE)
This input system requires a third-party tool called WinCompose. It supports all possible code points, and is the recommended input mode for Windows.
To enable, install the latest release from GitHub. Once installed, it will automatically run on startup. This works reliably under all versions of Windows supported by WinCompose.
The UC_MODE_WIN_COMPOSE input system has one configurable property, win-compose-key, which defaults to RALT. The system will:
- tap and release win-compose-key (RALT by default)
- tap and release U
- input the code point sequence
- tap and release RET
To overwrite win-compose-key, add the following outside of the root node of your keymap:
&uc {
    win-compose-key = <RALT>; // replace with desired key
};
5. Windows HexNumpad (UC_MODE_WIN_ALT)
This is Windows' built-in hex numpad Unicode input mode. It only supports code points up to U+FFFF, and is not recommended due to reliability and compatibility issues.
To enable, run the following as an administrator, then reboot:
reg add "HKCU\Control Panel\Input Method" -v EnableHexNumpad -t REG_SZ -d 1
The system will:
- press and hold LALT
- tap and release KP_PLUS
- input the code point sequence
- release LALT
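Since this mode stops at U+FFFF, it can help to check a keymap's code points before relying on it. The Python sketch below encodes that limit and the steps listed above; the names supported_by_win_alt and win_alt_sequence are hypothetical, and the event strings are illustrative only.

```python
WIN_ALT_MAX = 0xFFFF  # limit stated above for the HexNumpad input mode


def supported_by_win_alt(code_point: int) -> bool:
    """True if the code point is within the range HexNumpad can produce."""
    return 0 <= code_point <= WIN_ALT_MAX


def win_alt_sequence(code_point: int) -> list[str]:
    if not supported_by_win_alt(code_point):
        raise ValueError(f"U+{code_point:04X} exceeds U+FFFF")
    digits = [f"tap {d}" for d in format(code_point, "X")]
    return ["press LALT", "tap KP_PLUS"] + digits + ["release LALT"]


print(supported_by_win_alt(0xE4))      # True
print(supported_by_win_alt(0x1F596))   # False
print(win_alt_sequence(0xE4))
```

Code points such as U+1F596 fall outside this range, which is one reason WinCompose is the recommended mode on Windows.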
6. Emacs (UC_MODE_EMACS)
Emacs supports code point input with the insert-char command.
The system will:
- tap and release LC(X)
- tap and release N8
- tap and release RET
- input the code point sequence
- tap and release RET
Two Kconfig settings can be used to fine-tune the timing of the input system.
- ZMK_UNICODE_TAP_MS sets the time to wait (in milliseconds) between the press and release of each key in the Unicode sequence (defaults to 5).
- ZMK_UNICODE_WAIT_MS sets the time to wait (in milliseconds) between tapping keys in the Unicode sequence (defaults to 0).
If you experience "mangled" sequences appearing on the screen, increasing these values may help. This is known to help on slow terminal connections and weak Bluetooth signals.
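When tuning these values, it helps to have a feel for how much delay they add per character. The Python sketch below uses a deliberately simplified model, which is an assumption made for illustration: every tap costs tap_ms, consecutive taps are separated by wait_ms, and transport latency and firmware overhead are ignored. The function name estimate_sequence_ms is hypothetical.

```python
def estimate_sequence_ms(num_taps: int, tap_ms: int = 5, wait_ms: int = 0) -> int:
    """Rough lower bound on the time needed to send num_taps key taps."""
    if num_taps <= 0:
        return 0
    # One hold period per tap, plus one pause between each pair of taps.
    return num_taps * tap_ms + (num_taps - 1) * wait_ms


# U+1F596 in UC_MODE_LINUX: linux-key + 5 hex digits + SPACE = 7 taps.
print(estimate_sequence_ms(7))                        # 35
print(estimate_sequence_ms(7, tap_ms=10, wait_ms=5))  # 100
```

Under this model, even fairly conservative settings keep a single code point well below a fraction of a second, so raising the values is a cheap first step when sequences arrive mangled.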
- There is currently no error checking for code points. Please make sure that you specify a valid code point string.
- Code points are sent using the default US keyboard layout. Alternative layouts are currently not supported.
#include <behaviors.dtsi>
#include <dt-bindings/zmk/keys.h>
#include <behaviors/unicode.dtsi> // Source header for this module
// Optional: Overwrite default behavior properties
&uc {
    default-mode = <UC_MODE_LINUX>; // Default to Linux input system
    minimum-length = <0>;           // Set to desired minimum input length
    linux-key = <LC(LS(U))>;        // Overwrite Linux compose key
    win-compose-key = <RALT>;       // Overwrite WinCompose compose key
};
/ {
    keymap {
        compatible = "zmk,keymap";
        default_layer {
            bindings = <
                /* Add some code points */
                &uc UC_DE_AE   /* ä/Ä - same as &uc 0xE4 0xC4 */
                &uc UC_DE_OE   /* ö/Ö - same as &uc 0xF6 0xD6 */
                &uc UC_DE_UE   /* ü/Ü - same as &uc 0xFC 0xDC */
                &uc 0x1F596 0  /* 🖖 */
                /* Add bindings to switch between input modes */
                &uc UC_SET_MACOS
                &uc UC_SET_LINUX
            >;
        };
    };
};
Contributions in any form are very welcome!
New convenience macros should be prefixed by UC_<NS>, where <NS> is a unique namespace reflecting the use. In particular, all language headers should use the corresponding two-letter ISO language code as namespace. Other headers are encouraged to use a namespace of at least three letters to avoid conflicts.