Keymap and IOCTLs

By Anonymous

In issue #86 of the Linux Gazette there's a nice example of how to remap keys from a C application:

http://linuxgazette.net/issue86/misc/bint/kbe.c.txt

The author, the late Stephen Bint, was working on an editor for the text console and a library for such an editor. He didn't like to have the screen "dirtied" when pressing <PgUp> or <PgDn> while also pressing the <Shift> key. Indeed, that combination is reserved by default to scroll backward or forward through console output: the kernel will intervene and do things before the editor comes to know what keys you have pressed.

There is a way around it: redefine the keys temporarily, restoring them when exiting your application. Of course, if you can understand the C source for loadkeys and dumpkeys, you can skip both Stephen's examples and what follows. However, since I had problems getting even a vague idea of what loadkeys and dumpkeys do, I decided to write down a few details, to complement the old LG article referenced above.

All of this is relevant only for the GNU/Linux text console. You can forget about anything this "simple" if you work only under X11.

1. Introduction

A PC keyboard sends signals to the Linux kernel - more precisely, to the kernel keyboard driver - telling it that a certain key has been pressed or released. Normally, applications let the kernel translate those signals into strings. Traditionally, the translation used the current (screen) character set and an 8-bit encoding, where one character equals one byte. Nowadays, Unicode is spreading, and you are likely to have Unicode for the screen rather than a character set as in the old days. The keyboard will then produce Unicode strings, using the UTF-8 encoding, where one character may need more than one byte. For the purpose of this article, however, there is no difference between the two modes; we refer to them both as translation mode.

Direct handling of keyboard signals (raw and semi-raw mode) by the application is rare.
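
Which of these modes a console is currently in can be queried with the KDGKBMODE ioctl. The following is only a sketch: it assumes the constants K_RAW, K_MEDIUMRAW, K_XLATE and K_UNICODE from <linux/kd.h>, and the helper names kbd_mode_name and console_kbd_mode are made up for this illustration.

```c
#include <sys/ioctl.h>
#include <linux/kd.h>

/* Hypothetical helper: map a keyboard mode to a readable name. */
const char *kbd_mode_name(int mode)
{
    switch (mode) {
    case K_RAW:       return "raw";
    case K_MEDIUMRAW: return "semi-raw";
    case K_XLATE:     return "translation (8-bit)";
    case K_UNICODE:   return "translation (UTF-8)";
    default:          return "unknown";
    }
}

/* Hypothetical helper: ask the console behind fd for its keyboard mode.
   Returns the mode, or -1 (with errno set) if fd is not a console. */
int console_kbd_mode(int fd)
{
    int mode = 0;

    if (ioctl(fd, KDGKBMODE, &mode) < 0)
        return -1;
    return mode;
}
```

On a text console, kbd_mode_name(console_kbd_mode(fileno(stdin))) should name one of the two translation modes; under X11 or over ssh the ioctl simply fails.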

The keymap tells you about the translation the kernel is doing for you. However, if you want to know what's going on, don't look into the local keymap: It is not complete, since it includes other files; it is formatted in an inconvenient way; and your distro may have modified it at boot time anyway. If you want to know what is really in force, and be able to read it, issue

    dumpkeys -1 > my_keymap.txt

at the shell prompt and examine (the nicely formatted) my_keymap.txt. Consider the following:

Whatever the keys do, they do it on the basis of an 'action code' assigned to them. For keymap programming, you must know these action codes. How do you get them? Issue

    dumpkeys --long-info > actioncodes.txt

at the prompt. You get a list of action codes (left column) and the key labels for them (right column). Some labels have synonyms: for instance, the action code for <PgUp> is 0x0118, but you will not read 'PgUp' next to it; you will see the label 'Prior'.
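
An action code is a 16-bit number that packs a key type into the high byte and a value into the low byte; <linux/keyboard.h> expresses this split with the macros KTYP and KVAL. The sketch below does the same by hand, with made-up function names, so that a code read off actioncodes.txt can be taken apart:

```c
#include <stdio.h>

/* The high byte of an action code is its type, the low byte its value. */
unsigned action_type(unsigned short code)  { return code >> 8; }
unsigned action_value(unsigned short code) { return code & 0xff; }

/* Print one action code split into its two parts. */
void print_action(unsigned short code)
{
    printf("0x%04x: type %u, value %u\n",
           (unsigned)code, action_type(code), action_value(code));
}
```

For 'Prior' (0x0118) this gives type 1, the type the kernel headers call KT_FN, and value 24.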

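Keys like <Left> may not insert anything, but still they have their effects:

  (i) the kernel acts on the key itself, bypassing the application;
  (ii) the application is notified of the key and decides what to do.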

An example of the first kind is <Shift><PgUp> where the console output is scrolled bypassing the current application.

However, more often than not we are concerned about (ii). We press <Left>, the cursor moves one space to the left, the application is in control. That implies the application has been notified that <Left> was pressed. How was it notified? The keyboard driver knows the action code for the key <Left>. Although it does not arrange directly for the cursor movement, the action code arranges for a string to be sent to the application, so the application can do the right thing. The string for <Left> is normally "\033[D", where the octal \033 represents the escape character. It is the application that decides to move the cursor left upon receiving "\033[D".
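
On the application side, recognizing <Left> then comes down to comparing the bytes just read against that string. A minimal sketch, assuming the bytes of one key press have already been read into a buffer (for instance with read() on a terminal in non-canonical mode); the function name is hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the n bytes in buf are exactly the string sent by <Left>. */
int is_left_key(const char *buf, size_t n)
{
    static const char left[] = "\033[D";   /* ESC [ D */

    return n == sizeof left - 1 && memcmp(buf, left, n) == 0;
}
```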

In translation mode, the application receives strings of one character, normally for insertion of that character, or multi-character strings for functional keys triggering some action. They are called functional keys because they are not just the F1 to F12 keys across the upper row of the keyboard; the <Left> key is also a functional key. Unfortunately, the keymap utilities say 'function keys' - i.e., <Left> is a function key to them. To put it mildly, this is just a bit confusing.

So, besides those few exceptions like <Shift><PgUp>, the application is running the show. It receives strings of one or more characters. The one-character strings are very often for insertion, but not always so. You surely have been exposed to old-style user interaction: press 'a' for All or 'c' for Cancel.

OK, but where do those strings like "\033[D" for <Left> come from?

For that, look into the local keymap - e.g., the US default keymap. Almost at the top of the keymap, you will see a line that says

    strings as usual

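and that's it. That line defines the strings for F1-F20 and for Home, End, Insert, PgUp, PgDn, and so on. (The four cursor keys, <Left> included, are a special case: the keyboard driver generates their strings itself, in the same DEC style.) The strings originate from DEC hardware, and are still around long after DEC was absorbed by Compaq in 1998.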

Now, why is the keymap defining F1-F20, if the keyboard has only F1-F12? Because Unix keyboards were not PC keyboards. The default Linux keymaps (any language) set

    shift keycode 59 = F13      (physical F1 key)
    shift keycode 60 = F14      (physical F2 key)
    ...
    shift keycode 66 = F20      (physical F8 key)

So F13-F20 are not useless; actually they are not enough. Indeed, you will notice that

    control keycode 59 = F25    (physical F1 key)
    control keycode 60 = F26    (physical F2 key)
    ...
    control keycode 88 = F36    (physical F12 key)

are defined in the US keymap, although no strings are assigned to F21 and up. These entries mean, e.g., that <Shift><F1> will give F13 and <Ctrl><F1> will give F25. However, since F25 has an empty string, an empty string is forwarded to the application, and the application does exactly nothing.
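
Whether a given function-key action code really has an empty string can be checked with the KDGKBSENT ioctl, the reading counterpart of the KDSKBSENT call used later in this article. This is a sketch only: the helper name is invented, and it assumes the K_F13, K_F25, KTYP, KVAL and KT_FN definitions found in <linux/keyboard.h>.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kd.h>
#include <linux/keyboard.h>

/* Hypothetical helper: copy the string bound to a function-key action
   code (such as K_F13 or K_F25) into out.
   Returns 0 on success, -1 on failure. */
int fn_key_string(int fd, unsigned short action, char *out, size_t outlen)
{
    struct kbsentry ks;

    if (KTYP(action) != KT_FN || outlen == 0)
        return -1;                      /* not a function-key action code */
    memset(&ks, 0, sizeof ks);
    ks.kb_func = KVAL(action);          /* index into the string table */
    if (ioctl(fd, KDGKBSENT, &ks) < 0)
        return -1;                      /* errno says why */
    strncpy(out, (const char *)ks.kb_string, outlen - 1);
    out[outlen - 1] = '\0';
    return 0;
}
```

With the default keymap loaded, fn_key_string(fileno(stdin), K_F25, buf, sizeof buf) would be expected to succeed and leave buf empty.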

Why is it so? Why are those keys set to send an empty string? Well, Linux 1.0 had a default keymap where <Ctrl><F1> was the same as F1 - and so on. Sometime down the road, there was a change, for reasons which to me do not seem worth an investigative effort. Just note that the Russian Linux keymap has not changed on this point, and that FreeBSD has low-level operations on <Ctrl><F1> to <Ctrl><F10>.

Summing up, the keys not used by the kernel for its own purposes send a standard string to the application, possibly an empty string. Other keys are just void - i.e., undefined.


2. Changing the keymap

Before we get into some C code, let's mention the obvious: You can modify your keymap in an editor and then activate it. You are supposed to be already in the know, in this respect, otherwise you would not be able to understand what this article is about.

For instance, take those shifted <PgUp> and <PgDn> for scrolling the console. You edit the keymap to

    shift keycode 104       = Prior
    control alt keycode 104 = Scroll_Backward

    shift keycode 109       = Next
    control alt keycode 109 = Scroll_Forward

save it, and load it with loadkeys. Console scrolling will then be done with <Alt><Ctrl><PgUp> and <Alt><Ctrl><PgDn>, which is not in anybody's way.
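
For comparison, the same four bindings could be set from C with the KDSKBENT ioctl. The sketch below is not Stephen's code; the helper names are hypothetical. It assumes the K_PGUP, K_PGDN, K_SCROLLBACK and K_SCROLLFORW action codes and the KG_SHIFT, KG_CTRL and KG_ALT bit numbers from <linux/keyboard.h>, and that the program runs on the console it wants to change.

```c
#include <sys/ioctl.h>
#include <linux/kd.h>
#include <linux/keyboard.h>

/* Keymap table index for a modifier combination:
   shift counts 1, control counts 4, alt counts 8. */
unsigned char table_index(int shift, int control, int alt)
{
    return (unsigned char)((shift   ? 1 << KG_SHIFT : 0) |
                           (control ? 1 << KG_CTRL  : 0) |
                           (alt     ? 1 << KG_ALT   : 0));
}

/* Bind one (modifier table, keycode) pair to an action code. */
int bind_key(int fd, unsigned char table, unsigned char keycode,
             unsigned short action)
{
    struct kbentry ke = { table, keycode, action };

    return ioctl(fd, KDSKBENT, &ke);
}

/* Hypothetical helper mirroring the four keymap lines above. */
int move_console_scrolling(int fd)
{
    unsigned char sh = table_index(1, 0, 0);   /* shift       */
    unsigned char ca = table_index(0, 1, 1);   /* control alt */

    if (bind_key(fd, sh, 104, K_PGUP) < 0)       return -1;
    if (bind_key(fd, ca, 104, K_SCROLLBACK) < 0) return -1;
    if (bind_key(fd, sh, 109, K_PGDN) < 0)       return -1;
    if (bind_key(fd, ca, 109, K_SCROLLFORW) < 0) return -1;
    return 0;
}
```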

This is, of course, a lot easier to do and to understand than Stephen's code; however, Stephen gave a nice example of how to use ioctls, and here is where we resume his efforts. We'll add a little example, and show how to use a couple of those ioctls.

We want to assign a string to <Ctrl><Enter> so as to be able to tell <Enter> from <Ctrl><Enter> - in the default keymap, these are the same. Here are all the details required:

    physical key        <Ctrl><Enter>
    keycode             28
    action code label   F50
    action code         0x013b
    associated string   "\033[M~4"
  1. Where does the keycode come from? From the keymap.
  2. Where does F50 come from? It's arbitrary. The keymap understands F1-F255, but only F1-F20 are fully defined with strings. F25-F36 are there but not fully defined. F50 should not encroach on anything. Needless to say, it can be changed.
  3. Where does the action code come from? From the file 'actioncodes.txt' that we generated a couple of paragraphs upstream.
  4. Where does the associated string come from? It's arbitrary; you are welcome to replace it with whatever you like.
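  5. In a first step, we insert a couple of headers and definitions in our code (if not already there):
        #include <errno.h>
        #include <stdio.h>
        #include <string.h>
        #include <sys/ioctl.h>
        #include <linux/kd.h>
        #include <linux/keyboard.h>
        struct kbentry ke;
        struct kbsentry ks;
    

    In a second step, we instruct the keyboard driver to associate our string with F50. Two details matter here: kb_func is the low byte of the action code (0x3b for F50), not the number in the label, and kb_string is a character array, so the string has to be copied into it:

        ks.kb_func = KVAL(K_F50);   // 0x3b, from action code 0x013b
        strcpy((char *)ks.kb_string, "\033[M~4");
        ioctl(fileno(stdin), KDSKBSENT, &ks);
        //  if assignment fails, ioctl returns -1 and the
        //  error code is set in the global variable errno
    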

    This and the kbentry/kbsentry structures are scantily documented in the manpage console_ioctl, written in 1995 and never updated. Same for KDSKBSENT, which can be memorized as 'Keyboard Driver Set KeyBoard String ENTry'.

  6. Finally, we bind F50 to <Ctrl><Enter>:
        ke.kb_table = 4;
        ke.kb_index = 28;
        ke.kb_value = 0x013b;
    

    The kb_table is determined by the modifier used. We are pressing two keys, <Ctrl> and <Enter>, so we are using modifier <Ctrl> which has number 4 in the keymap.

    The kb_index is the keycode, the kb_value is the action code, both already explicitly stated above.

    We use yet another ioctl call:

        ioctl(fileno(stdin), KDSKBENT, &ke);
        // if assignment fails, ioctl returns -1 and the
        // error code is set in the global variable errno
    

    and we are done - just pay attention when typing KDSKBSENT vs. KDSKBENT. F50 is now associated with the string of our choice. This string is sent to the application when <Ctrl><Enter> is pressed.
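
Put together, the two steps might look like the following sketch, wrapped in a function whose name is made up here. Two details are assumptions of the sketch, based on the definitions in <linux/kd.h> and <linux/keyboard.h>: kb_string is treated as a character array and filled with strcpy, and kb_func is taken to be the low byte of the action code, KVAL(K_F50), that is 0x3b.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kd.h>
#include <linux/keyboard.h>

/* Make <Ctrl><Enter> send "\033[M~4" on the console behind fd.
   Returns 0 on success, -1 (with errno set) on failure. */
int bind_ctrl_enter(int fd)
{
    struct kbsentry ks;
    struct kbentry ke;

    /* Step 1: give F50 its string. */
    memset(&ks, 0, sizeof ks);
    ks.kb_func = KVAL(K_F50);
    strcpy((char *)ks.kb_string, "\033[M~4");
    if (ioctl(fd, KDSKBSENT, &ks) < 0)
        return -1;

    /* Step 2: bind F50 to keycode 28 in the <Ctrl> table. */
    ke.kb_table = 4;
    ke.kb_index = 28;
    ke.kb_value = K_F50;                /* 0x013b */
    return ioctl(fd, KDSKBENT, &ke) < 0 ? -1 : 0;
}
```

An editor would call bind_ctrl_enter(fileno(stdin)) once at startup and fall back to treating <Ctrl><Enter> like <Enter> if the call fails, as it will outside a text console.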

3. Moral

Essentially, what we have done is to allow for

    <Enter>
    <Ctrl><m>
    <Ctrl><Enter>

to be all distinct. Think about it: The first two send the same string, namely "^M", character 13. They can be told apart by checking the modifier, since the first one is produced with 1 key, while the second one needs 2 keys.

The third entry, <Ctrl><Enter>, cannot, by default, be distinguished from <Ctrl><m>, since both produce "^M", and both signal that <Ctrl> was pressed. (You would have to move to raw or semi-raw mode to find out - not recommended.) With our little modification, we now have all our ducks in a row. Also, the trick can be applied elsewhere, for instance, to <Tab>, <Ctrl><i>, <Ctrl><Tab>.
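
Checking the modifier, as mentioned above, can be done on the Linux console with the TIOCLINUX ioctl and its subcode 6, which returns the kernel's current shift state. The sketch below uses invented helper names and assumes the KG_SHIFT and KG_CTRL bit numbers from <linux/keyboard.h>. Note that the state is sampled when the call is made, which may be after the key has been released.

```c
#include <sys/ioctl.h>
#include <linux/keyboard.h>

/* Ask the console for the modifier keys held down right now.
   Returns the state bits (0-255), or -1 if fd is not a console. */
int read_shift_state(int fd)
{
    unsigned char arg = 6;      /* TIOCLINUX subcode 6: get shift state */

    if (ioctl(fd, TIOCLINUX, &arg) < 0)
        return -1;
    return arg;
}

/* Test one modifier bit, e.g. KG_CTRL or KG_SHIFT, in a state value. */
int modifier_down(int state, int kg_bit)
{
    return state >= 0 && (state & (1 << kg_bit)) != 0;
}
```

After reading "^M", an application could call modifier_down(read_shift_state(fileno(stdin)), KG_CTRL) to guess whether <Ctrl> was involved.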

Note that applications other than the one you are working on may rely on the keymap defaults, and get confused by your ioctl changes. A well-behaved application should thus reverse the changes it made. However, if the application crashes for whatever reason, the reversing code will not be executed, and the keyboard will not be reset.
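
One way to reverse a change is to read the old binding with KDGKBENT before overwriting it, and to write it back on exit. A minimal sketch with hypothetical helper names; it does not treat specially the values the kernel reports for unbound keys or missing tables:

```c
#include <sys/ioctl.h>
#include <linux/kd.h>

/* Read the current binding of (table, keycode) into *saved. */
int save_binding(int fd, unsigned char table, unsigned char keycode,
                 struct kbentry *saved)
{
    saved->kb_table = table;
    saved->kb_index = keycode;
    saved->kb_value = 0;
    return ioctl(fd, KDGKBENT, saved);
}

/* Write a previously saved binding back. */
int restore_binding(int fd, const struct kbentry *saved)
{
    struct kbentry ke = *saved;

    return ioctl(fd, KDSKBENT, &ke);
}
```

For the <Ctrl><Enter> example, the application would call save_binding(fd, 4, 28, &old) before changing anything and restore_binding(fd, &old) from its exit path, for example via atexit(). This still does not help if the process is killed outright.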

In other words: when you exit this application, just reload the default keymap. For that, the command will be

    loadkeys us

or 'uk' or 'fr' or 'ge', or whatever you are using.



A. N. Onymous has been writing for LG since the early days - generally by sneaking in at night and leaving a variety of articles on the Editor's desk. A man (woman?) of mystery, claiming no credit and hiding in darkness... probably something to do with large amounts of treasure in an ancient Mayan temple and a beautiful dark-eyed woman with a snake tattoo winding down from her left hip. Or maybe he just treasures his privacy. In any case, we're grateful for his contribution.
-- Editor, Linux Gazette

Copyright © 2007, Anonymous. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Published in Issue 136 of Linux Gazette, March 2007
