ATMega32u4: Scanning a full-size keyboard matrix with two 74hc595 in series

The problem

TL;DR: Not enough IO pins on the MCU to address a full-layout keyboard.

As an avid user of mechanical keyboards, I often find myself wanting to build yet another one! In my projects bin I found a wooden laser-cut plate destined for a keyboard for my girlfriend that she was to write her master's thesis on. She got her master's degree but no keyboard, sad-face smiley.

The project started way back as an idea of making a mechanical keyboard with a Lenovo Yoga laptop layout + a numpad. This seemed like a simple enough build so I got a laser-cut wooden plate fit for 101 Cherry MX compatible switches. Fast forward waaay too long, I got hold of stabilizers and a nice subtle set of PBT keycaps. I wired it up with some nice thick CAT6 scraps I had lying around and noticed that I would probably not have enough pins for the matrix on an ATMega32u4. Some time passed and I suddenly remembered I had a bunch of 74hc595 shift registers in the parts bin.

I wired up the shift registers in series as explained on every tutorial on the internet and suddenly I could address 16 columns using just three IO pins. Add six rows and a last column and the entire key matrix uses just 10 IO pins.

74hc595 quick rundown

The 74hc595 is an old, old serial-in, parallel-out shift register with active-high outputs. This sounds like gibberish but simply means that the microcontroller sends a byte serially to the shift register, which then outputs the individual bits on 8 separate pins. Active high means that a 1-bit is represented by the voltage for logic high, 5 V in our case. One of the pins on the IC lets you daisy-chain them, meaning you can chain n of them together and just send n bytes to the first one. It will pass the overflow bits on to the next one, and so on. You can now address any number of pins using just three IO pins.

The shift register is FIFO, meaning that the first byte you send gets forwarded first and ends up in the register furthest down the chain. So to enable pin 1 on the second register you send two bytes, 0x01 and then 0x00. As for responsiveness and latency it’s quite an awesome IC for our needs. It allows clocking in data at a whopping 20 MHz, which is orders of magnitude faster than a keyboard needs.
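The byte order is easy to get backwards, so here is a small host-side sketch of it. This is not firmware code: chain_t, chain_clock_bit and chain_send_byte are names I made up for the illustration, and the model assumes each byte is clocked in most significant bit first.

```c
#include <stdint.h>

/* Hypothetical model of two daisy-chained 8-bit shift registers. */
typedef struct {
    uint8_t first;   /* register wired to the MCU data pin */
    uint8_t second;  /* register fed from the first one's overflow pin */
} chain_t;

/* Clock one bit into the chain: the bit leaving `first` enters `second`. */
static void chain_clock_bit(chain_t *c, uint8_t bit) {
    uint8_t overflow = (uint8_t)((c->first >> 7) & 1u);
    c->second = (uint8_t)((c->second << 1) | overflow);
    c->first  = (uint8_t)((c->first << 1) | (bit & 1u));
}

/* Clock a whole byte in, most significant bit first. */
static void chain_send_byte(chain_t *c, uint8_t value) {
    for (int i = 7; i >= 0; i--) {
        chain_clock_bit(c, (uint8_t)((value >> i) & 1u));
    }
}
```

Sending 0x01 and then 0x00 through this model leaves 0x01 in the second register and 0x00 in the first, which is the behaviour described above.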

To send a bit you set the data pin to the bit’s value and then pulse the clock pin; the register reads the data pin on the rising clock edge. To illustrate, let’s look at how to send a 1 every other bit by sending the byte 0b01010101:

Bit number            0     1     2     3     4     5     6     7
                      __    __    __    __    __    __    __    __    __    
CLOCK              __|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__

                __                                                         __   
LATCH        __|  |_______________________________________________________|  |__

Bits                  0     1     0     1     0     1     0     1
                            __          __          __          __   
DATA               ________|  |________|  |________|  |________|  |_______

To prepare the data to be sent we pull the latch pin low and begin sending our data. For every bit we pulse the clock pin. At the end we pull the latch high again, which copies the shifted-in bits to the output pins.
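To see why the latch matters, here is a rough software model of a single register. It is only a sketch under my own assumptions: hc595_t, hc595_set_pins and hc595_shift_byte are invented names, and the model only covers the clock and latch inputs, both acting on their rising edge.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of one 74hc595. */
typedef struct {
    uint8_t shift;    /* internal shift stage, not visible on the pins */
    uint8_t outputs;  /* what the eight output pins currently show */
    bool    clock;    /* previous clock level, for edge detection */
    bool    latch;    /* previous latch level, for edge detection */
} hc595_t;

/* Apply new pin levels; both inputs act on their rising edge. */
static void hc595_set_pins(hc595_t *r, bool data, bool clock, bool latch) {
    if (latch && !r->latch) {
        r->outputs = r->shift;  /* copy the shift stage to the pins */
    }
    if (clock && !r->clock) {
        r->shift = (uint8_t)((r->shift << 1) | (data ? 1u : 0u));
    }
    r->clock = clock;
    r->latch = latch;
}

/* Send one byte, most significant bit first, with the latch held low. */
static void hc595_shift_byte(hc595_t *r, uint8_t value) {
    for (int i = 7; i >= 0; i--) {
        bool bit = ((value >> i) & 1u) != 0;
        hc595_set_pins(r, bit, false, false);  /* set data while clock is low */
        hc595_set_pins(r, bit, true, false);   /* rising edge shifts the bit in */
    }
    hc595_set_pins(r, false, false, false);    /* leave the clock low */
}
```

After hc595_shift_byte the outputs field is still unchanged; it only picks up the new byte once the latch argument goes from false to true.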


Below is a very rudimentary look at how to wire the registers together. They share the latch and clock pin. The upper one gets the data from the MCU while the bottom one gets the data from the overflow pin on the upper register. The column pins are wired to the columns of the keyboard.

          |               |
 Col 9 +--+    74hc595    +--+
          |               |
Col 10 +--+               +--+ Col 8
          |               |
Col 11 +--+               +-----------------------------+
          |               |                             |
Col 12 +--+               |                             |
          |               |                             |
Col 13 +--+               +--------------+------------+ |
          |               |              |            | |
Col 14 +--+               +----------------+--------+ | |
          |               |              | |        | | |
Col 15 +--+               +--+           | |        | | |              +---------------+
          |               |              | |        | | | SHR_DATA     |               |
   Gnd +--+               +----------+   | |        | | +--------------+               |
          |               |          |   | |        | |                |     ATMega    |
          +---------------+          |   | |        | |   SHR_LATCH    |     32u4      |
                                     |   | |        | +----------------+               |
          +---------------+          |   | |        |                  |               |
          |               |          |   | |        |     SHR_CLOCK    |               |
 Col 1 +--+    74hc595    +--+       |   | |        +------------------+               |
          |               |          |   | |                           |               |
 Col 2 +--+               +--+ Col 0 |   | |                           +---------------+
          |               |          |   | |
 Col 3 +--+               +----------+   | |
          |               |              | |
 Col 4 +--+               |              | |
          |               |              | |
 Col 5 +--+               +--------------+ |
          |               |                |
 Col 6 +--+               +----------------+
          |               |
 Col 7 +--+               +--+
          |               |
   Gnd +--+               +--+
          |               |

The software

For this keyboard I went with customizing the open-source keyboard firmware called QMK. In its default state it relies on the AVR chips’ internal pullup resistors and uses the row pins as current sinks to pull the pin values down to 0. When you press any key, current flows from the column pin into the row pin, which makes the column pin go from 1 to 0. The firmware can then map that to a keycode based on which row and column it is.

Since our shift register is active high, we can’t use the internal pullup resistors on the AVR. And since the AVR does not have pulldown resistors, we have to add a resistor to ground for each of our rows (I used a 47k, but you can use just about any resistor above ~5k). The firmware makes it easy to create a new layout/keyboard and also makes it quite easy to add custom matrix scanning logic. I took the code for writing to the shift register and repurposed it for scanning using a shift register and active high instead of low. After a couple of hours of fiddling with it I got a working keyboard.
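The practical difference between the two schemes is just whether a pressed key reads as 0 or as 1. A minimal sketch of that, with made-up helper names (pressed_active_low, pressed_active_high) and an assumed six input bits:

```c
#include <stdint.h>

#define INPUT_MASK 0x3Fu  /* six inputs, one bit each (assumption for this sketch) */

/* Active low (internal pullups): idle reads 1, a pressed key reads 0. */
static uint8_t pressed_active_low(uint8_t raw) {
    return (uint8_t)(~raw & INPUT_MASK);
}

/* Active high (external pulldowns): idle reads 0, a pressed key reads 1. */
static uint8_t pressed_active_high(uint8_t raw) {
    return (uint8_t)(raw & INPUT_MASK);
}
```

With pullups, an idle reading of 0x3F means nothing is pressed; with pulldowns, the idle reading is 0x00 and the raw bits can be used as they are.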

The code is up at my GitHub ripdajacker/qmk_firmware (yogaext branch). Let’s take a quick look at it.


The config.h file consists of a bunch of constants for the particular keyboard configuration. Here are some relevant parts:

/* row pins */
#define ROW_A  D1
#define ROW_B  D0
#define ROW_C  D4
#define ROW_D  C6
#define ROW_E  D7
#define ROW_F  E6

/* columns 0 - 15, driven through the shift registers */
#define SHR_LATCH B2
#define SHR_CLOCK B3
#define SHR_DATA  B1
#define SHR_COLS { 0x0001, 0x0002, 0x0004, 0x0008, 0x0010, 0x0020, 0x0040, 0x0080, 0x0100, 0x0200, 0x0400, 0x0800, 0x1000, 0x2000, 0x4000, 0x8000 }

Rows A to F reference pins on the microcontroller. The defines prefixed with SHR_ are related to interfacing with the shift registers. The define called SHR_COLS contains 16 two-byte values, one per column, each with a single bit set. These are the values I send to the shift registers when scanning a given column. The same values could be computed with a left shift, but spelling them out makes the file easier to read in a couple of months/years.
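For the curious, this is roughly what the left-shift variant would look like next to the table. The names shr_cols_table, shr_col_mask and shr_cols_match are hypothetical and only exist for this comparison.

```c
#include <stdint.h>

#define SHR_COLS { 0x0001, 0x0002, 0x0004, 0x0008, 0x0010, 0x0020, 0x0040, 0x0080, \
                   0x0100, 0x0200, 0x0400, 0x0800, 0x1000, 0x2000, 0x4000, 0x8000 }

/* Hypothetical lookup table built from the macro. */
static const uint16_t shr_cols_table[16] = SHR_COLS;

/* The same value computed on the fly: bit `col` set, everything else clear. */
static uint16_t shr_col_mask(uint8_t col) {
    return (uint16_t)(1u << col);
}

/* Returns 1 when the table and the shift agree for every column. */
static int shr_cols_match(void) {
    for (uint8_t col = 0; col < 16; col++) {
        if (shr_cols_table[col] != shr_col_mask(col)) {
            return 0;
        }
    }
    return 1;
}
```

Both approaches produce the same 16 values, so the choice is purely about readability.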


We now move on to matrix.c, which has all the meat of the code. Let’s start with the matrix_init() function, which is run once when the keyboard is powered up:

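/*
 * The row pins defined in config.h are inputs for rows A-F.
 * The rows are pulled low with a pull-down resistor.
 * The columns are scanned using two 74hc595 in series on pins defined in config.h.
 */
void matrix_init(void) {
    // Shift register pins are outputs; the row pins are inputs (the AVR reset default).
    setPinOutput(SHR_LATCH);
    setPinOutput(SHR_CLOCK);
    setPinOutput(SHR_DATA);

    for (uint8_t i = 0; i < MATRIX_ROWS; i++) {
        matrix[i]            = 0;
        matrix_debouncing[i] = 0;
    }

    matrix_init_quantum();
}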

The above code initializes the relevant pins to be inputs and outputs and initializes our matrix to 0. It then calls the built-in method matrix_init_quantum().

To scan the matrix we need to send bytes serially to our shift register, as mentioned in the previous section. First let’s look at our clock pulse method:

static inline void shift_pulse(void) {
    writePinHigh(SHR_CLOCK);
    writePinLow(SHR_CLOCK);
}

This is very simple. It pulses the clock pin, driving it high and then low again, every time it is called. Now let’s send a single byte:

static void shift_out_single(uint8_t value) {
    for (uint8_t i = 0; i < 8; i++) {
        if (value & 0b10000000) {
            writePinHigh(SHR_DATA);
        } else {
            writePinLow(SHR_DATA);
        }

        shift_pulse();
        value = value << 1;
    }
}

We iterate over the bits in the byte, starting with the most significant one, and pull the data pin high if the current bit is 1 and low if it is 0. After each bit we pulse the clock pin and left-shift the value to move on to the next bit.
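If the bit order is hard to picture, this little host-side helper records what the data pin would do for each of the eight clock pulses. It is a sketch for illustration only, and data_levels_msb_first is a name I invented for it.

```c
#include <stdint.h>

/* Fill `levels` with the data pin level for each of the eight clock pulses. */
static void data_levels_msb_first(uint8_t value, uint8_t levels[8]) {
    for (uint8_t i = 0; i < 8; i++) {
        levels[i] = (value & 0x80u) ? 1 : 0;  /* test the top bit */
        value = (uint8_t)(value << 1);        /* move the next bit to the top */
    }
}
```

For 0x55 this gives the alternating low/high pattern from the timing diagram earlier in the post, starting with a low.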

Combining the above we can now shift two bytes at a time:

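static void shift_out(uint16_t value) {
    uint8_t first_byte  = (value >> 8) & 0xFF;
    uint8_t second_byte = (uint8_t)(value & 0xFF);

    writePinLow(SHR_LATCH);
    shift_out_single(first_byte);
    shift_out_single(second_byte);
    writePinHigh(SHR_LATCH);
}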

We first set the latch pin low, split the 16-bit value into two 8-bit values and send them to the first register. The first byte we send is then shifted out to the second register (remember, they are daisy-chained). We then pull our latch pin high. And using this method we can finally scan our columns:

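static void select_col(uint8_t col) {
    // SHIFT out columns 0 to 15
    if (col < 16) {
        // shift out the SHR_COLS value for this column
    } else {
        // column 16 has its own pin, so drive that one instead
    }
}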

If the column is one of the first 16 (between 0 and 15) we use our shift registers. Since the keyboard has 17 columns we use a separate pin for the 17th column. To read the rows we simply check if a row pin is 0 or 1 using the read_rows method:

static uint8_t read_rows(void) {
    return (readPin(ROW_F) << 5)
         | (readPin(ROW_E) << 4)
         | (readPin(ROW_D) << 3)
         | (readPin(ROW_C) << 2)
         | (readPin(ROW_B) << 1)
         | (readPin(ROW_A));
}

The above is simply reading all the rows and combining the results into 1 byte.
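To try the packing without any hardware, here is a host-side sketch of the same idea. pack_rows is a hypothetical stand-in for read_rows that takes the six pin levels as an array instead of reading real pins.

```c
#include <stdbool.h>
#include <stdint.h>

/* levels[0] is row A, levels[5] is row F. */
static uint8_t pack_rows(const bool levels[6]) {
    uint8_t rows = 0;
    for (uint8_t row = 0; row < 6; row++) {
        if (levels[row]) {
            rows |= (uint8_t)(1u << row);  /* row A lands in bit 0, row F in bit 5 */
        }
    }
    return rows;
}
```

With rows A and C high and the rest low, the result is 0b000101.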

Scanning the matrix

We can now combine all of the above methods to write the needed matrix_scan method:

uint8_t matrix_scan(void) {
    for (uint8_t col = 0; col < MATRIX_COLS; col++) {
        select_col(col);
        wait_us(30);  // let the signal settle (30 us is assumed here)

        uint8_t rows = read_rows();
        for (uint8_t row = 0; row < MATRIX_ROWS; row++) {
            bool prev_bit = matrix_debouncing[row] & ((matrix_row_t)1 << col);
            bool curr_bit = rows & (1 << row);
            if (prev_bit != curr_bit) {
                matrix_debouncing[row] ^= ((matrix_row_t)1 << col);
                debouncing = DEBOUNCE;
            }
        }
    }

    if (debouncing) {
        if (--debouncing) {
            wait_ms(1);
        } else {
            for (uint8_t i = 0; i < MATRIX_ROWS; i++) {
                matrix[i] = matrix_debouncing[i];
            }
        }
    }

    matrix_scan_quantum();
    return 1;
}

Let’s break the above down. We start with a loop over all the columns, and for every column we call select_col, which puts 5 V on that column. Then there’s a small delay to let the signal settle. We then read the rows. If any row/column combination has changed, we save the new state in the matrix_debouncing array.
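The column loop can be tried on a desktop machine with a simulated set of switches. This is a sketch under my own assumptions, not the firmware: sim_read_rows and sim_scan are invented names, the pressed array stands in for the physical keys, and matrix_row_t is typedef'd to 32 bits here because 17 columns do not fit in 16.

```c
#include <stdbool.h>
#include <stdint.h>

#define MATRIX_ROWS 6
#define MATRIX_COLS 17

typedef uint32_t matrix_row_t;  /* assumption: wide enough for 17 columns */

/* Simulated read: which rows see 5 V while `col` is driven high. */
static uint8_t sim_read_rows(bool pressed[MATRIX_ROWS][MATRIX_COLS], uint8_t col) {
    uint8_t rows = 0;
    for (uint8_t row = 0; row < MATRIX_ROWS; row++) {
        if (pressed[row][col]) {
            rows |= (uint8_t)(1u << row);
        }
    }
    return rows;
}

/* One pass over all columns, filling one bitmap per row. */
static void sim_scan(bool pressed[MATRIX_ROWS][MATRIX_COLS], matrix_row_t out[MATRIX_ROWS]) {
    for (uint8_t row = 0; row < MATRIX_ROWS; row++) {
        out[row] = 0;
    }
    for (uint8_t col = 0; col < MATRIX_COLS; col++) {
        uint8_t rows = sim_read_rows(pressed, col);
        for (uint8_t row = 0; row < MATRIX_ROWS; row++) {
            if (rows & (1u << row)) {
                out[row] |= (matrix_row_t)1 << col;  /* the cast keeps bit 16 from being lost */
            }
        }
    }
}
```

A key in the 17th column ends up in bit 16 of its row, which is why the (matrix_row_t) cast in the real scan code matters on an 8-bit AVR where a plain int is only 16 bits wide.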

If any change is detected, the debouncing variable is set to DEBOUNCE, the default debounce value of 5. While debouncing is nonzero, each scan counts it down by one and waits 1 ms. Once it reaches 0, we overwrite the values in matrix[] with our new values from matrix_debouncing[].
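The countdown is easier to follow in isolation. Below is a stripped-down sketch of the same idea for a single row of keys; debounce_t and debounce_step are names invented for the illustration, and the 1 ms wait is left out so it runs anywhere.

```c
#include <stdint.h>

#define DEBOUNCE 5

typedef struct {
    uint32_t pending;  /* last raw reading, plays the role of matrix_debouncing[] */
    uint32_t stable;   /* debounced state, plays the role of matrix[] */
    uint8_t  counter;  /* plays the role of the debouncing variable */
} debounce_t;

/* Feed one raw reading per scan. */
static void debounce_step(debounce_t *d, uint32_t raw) {
    if (raw != d->pending) {
        d->pending = raw;
        d->counter = DEBOUNCE;  /* any change restarts the countdown */
    }
    if (d->counter) {
        if (--d->counter == 0) {
            d->stable = d->pending;  /* quiet for long enough: accept it */
        }
    }
}
```

A reading has to stay unchanged for five scans in a row before it is accepted; a single bounce in the middle restarts the count.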

After each scan we call the built-in method matrix_scan_quantum, which does the keycode conversion and such.


This was a fun project to do and write about. Shift registers are a very useful way to extend the IO capabilities of a cheap microcontroller, and I will definitely use them in future projects. For the next keyboard build I may opt for an IC that is active low, so that I can take advantage of the pullup resistors built into the AVR chips.

Thanks for reading if you made it this far. Happy hacking.


© 2023 Jesenko Mehmedbasic