In the previous article, we explored how base64 encoding works and implemented it in Rust. Now, let’s look at how to decode a base64-encoded string. The implementations discussed here are for educational purposes; for production code, use a well-established library.

Theory

Here are the steps to decode a base64-encoded string:

  • Split the string into groups of 4 characters.
  • Find the index of each character in the base64 character map.
  • Convert each index into a 6-bit binary number.
  • Concatenate the four 6-bit values of a group, resulting in 24 bits.
  • Split the 24 bits into three groups of 8 bits. Each group is one decoded byte, which for text corresponds to a character in the ASCII table.

These steps will enable us to successfully decode the base64-encoded string.
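As a rough illustration of the combine-and-split steps, here is a minimal sketch that packs four 6-bit indices into one 24-bit number and then cuts it into three bytes. The helper name `decode_group` is hypothetical and is not part of the implementation shown later in this article.

```rust
/// Packs four 6-bit indices into one 24-bit value, then splits it into three bytes.
/// `decode_group` is a hypothetical helper used only for this illustration.
fn decode_group(indices: [u8; 4]) -> [u8; 3] {
    // Concatenate the four 6-bit values into a single 24-bit number.
    let mut combined: u32 = 0;
    for &index in indices.iter() {
        combined = (combined << 6) | u32::from(index & 0x3F);
    }

    // Split the 24 bits into three groups of 8 bits.
    [
        (combined >> 16) as u8,
        (combined >> 8) as u8,
        combined as u8,
    ]
}
```

Calling `decode_group([20, 39, 21, 51])` returns the bytes 82, 117 and 115, which are the ASCII codes of "Rus".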

Let’s apply these steps to decode the string UnVzdA== which we encoded in the previous article.

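  1. Split the string into groups of 4 characters and find the index of each character in the base64 table. For the first group, UnVz, the indices are U = 20 (010100), n = 39 (100111), V = 21 (010101) and z = 51 (110011).

  2. Combine the binary values and then split them into three groups of 8 bits: 010100 100111 010101 110011 becomes 01010010 01110101 01110011.

  3. Convert every 8 bits into the corresponding character from the ASCII table: 01010010 = 82 = 'R', 01110101 = 117 = 'u', 01110011 = 115 = 's'.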

We repeat the same steps for the rest of the string. Note that our example ends with two padding characters, which carry no data and are ignored.
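To make the padding rule concrete, here is a small sketch for a single padded group such as dA==. It assumes the standard base64 alphabet, and both `ALPHABET` and `decode_padded_group` are hypothetical names introduced only for this example. Instead of producing extra bytes and removing them later, it drops one output byte per trailing `=`.

```rust
// Assumption: the standard base64 alphabet.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Hypothetical helper: decodes one 4-character group and drops the bytes
/// that exist only because of `=` padding.
fn decode_padded_group(group: &[u8; 4]) -> Vec<u8> {
    // Each trailing `=` means one fewer byte in the decoded output.
    let padding = group.iter().rev().take_while(|&&c| c == b'=').count();

    // Padding (and, in this sketch, any unknown byte) contributes zero bits.
    let bits = group.iter().fold(0u32, |acc, &c| {
        let index = ALPHABET.iter().position(|&x| x == c).unwrap_or(0) as u32;
        (acc << 6) | index
    });

    let bytes = [(bits >> 16) as u8, (bits >> 8) as u8, bits as u8];
    bytes[..3usize.saturating_sub(padding)].to_vec()
}
```

For the second group of our example, `decode_padded_group(b"dA==")` yields the single byte 116, the letter 't', which completes the word "Rust".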

Implementation

Here is the implementation in Rust:

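// BASE_CHARS is the base64 character map defined in the previous article.
pub fn decode(input: &[u8]) -> String {
    let mut output: Vec<u8> = Vec::new();

    // Process the input four characters at a time. This assumes the input
    // length is a multiple of 4; a shorter final chunk would make the
    // indexing below panic.
    for chunk in input.chunks(4) {
        // Map each character to its 6-bit index in the base64 table.
        let a = decode_char(chunk[0]);
        let b = decode_char(chunk[1]);
        let c = decode_char(chunk[2]);
        let d = decode_char(chunk[3]);

        // Recombine the four 6-bit values into three 8-bit bytes.
        let dec1 = ((a << 2) | (b & 0x30) >> 4) as u8;
        let dec2 = (((b & 0x0F) << 4) | (c & 0x3C) >> 2) as u8;
        let dec3 = (((c & 0x03) << 6) | (d)) as u8;

        output.push(dec1);
        output.push(dec2);
        output.push(dec3);
    }

    // Padding decodes to zero bytes, so the NULs are stripped before returning.
    // Note that unwrap() panics if the decoded bytes are not valid UTF-8.
    String::from_utf8(output).unwrap().replace("\0", "")
}

// Returns the index of `input` in the base64 table, or 0 for padding and
// for any other byte that is not in the table.
fn decode_char(input: u8) -> u8 {
    BASE_CHARS.iter().position(|&c| c == input).unwrap_or(0) as u8
}
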
  • for chunk in input.chunks(4)

To begin the decoding process, we start by dividing the input into groups of 4 bytes.

  • Within the decode_char function, we look up each character's index in the base64 character map and return it. Padding and any other character that is not in the map are not skipped: they are mapped to index 0, so they contribute only zero bits, and the NUL bytes this produces are stripped from the output at the end of decode. One consequence is that an invalid character is silently treated like 'A'.

  • let dec1 = ((a << 2) | (b & 0x30) >> 4) as u8;

    1. Starting with a = 'U' (index 20), we shift the binary representation of a two times to the left:

      (a << 2) => (00010100 << 2) => (01010000)

    2. For b = 'n' (index 39), we first apply a bitwise AND operation with 0x30 (which is binary 110000) to remove the last four bits of b:

      (b & 0x30) => (100111 & 110000) => (100000)

    3. Then, we shift the result four times to the right:

      (b & 0x30) >> 4 => (100000 >> 4) => (000010)

    4. Next, we combine the results from step 1 and step 3 using the bitwise OR operation, which gives 82, the ASCII code of 'R':

      (01010000 | 000010) => (01010010)

  • let dec2 = (((b & 0x0F) << 4) | (c & 0x3C) >> 2) as u8;

    1. Starting with b = 'n', we first apply a bitwise AND operation with 0x0F (which is binary 001111) to remove the first two bits of b:

      (b & 0x0F) => (100111 & 001111) => (000111)

    2. Then, we shift the result four times to the left:

      ((b & 0x0F) << 4) => (000111 << 4) => (01110000)

    3. For c = 'V' (index 21), we first apply a bitwise AND operation with 0x3C (which is binary 111100) to retrieve the first four bits of c:

      (c & 0x3C) => (010101 & 111100) => (010100)

    4. Then, we shift the result two times to the right:

      ((c & 0x3C) >> 2) => (010100 >> 2) => (00000101)

    5. Next, we combine the results from step 2 and step 4 using the bitwise OR operation, which gives 117, the ASCII code of 'u':

      (((b & 0x0F) << 4) | (c & 0x3C) >> 2) => (01110000 | 00000101) => (01110101)

  • let dec3 = (((c & 0x03) << 6) | (d)) as u8;

    1. Starting with c = 'V', we first apply a bitwise AND operation with 0x03 (which is binary 000011) to retrieve the last two bits of c:

      (c & 0x03) => (010101 & 000011) => (000001)

    2. Then, we shift the result six times to the left:

      ((c & 0x03) << 6) => (000001 << 6) => (01000000)

    3. For d = 'z' (index 51, binary 110011), we simply use the value of d as it is.

    4. Next, we combine the result from step 2 and the value of d using the bitwise OR operation, which gives 115, the ASCII code of 's':

      (((c & 0x03) << 6) | d) => (01000000 | 110011) => (01110011)
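The walkthrough above can be checked with a few lines of code. The sketch below applies the same mask-and-shift expressions to the indices of 'U', 'n', 'V' and 'z' (20, 39, 21 and 51). The names `combine` and `to_binary` are hypothetical and exist only for this illustration.

```rust
/// Rebuilds three bytes from four 6-bit indices using the same mask-and-shift
/// expressions as `dec1`, `dec2` and `dec3`.
/// `combine` and `to_binary` are hypothetical names used only for this sketch.
fn combine(a: u8, b: u8, c: u8, d: u8) -> [u8; 3] {
    // All six bits of `a`, followed by the top two bits of `b`.
    let dec1 = (a << 2) | ((b & 0x30) >> 4);
    // The low four bits of `b`, followed by the top four bits of `c`.
    let dec2 = ((b & 0x0F) << 4) | ((c & 0x3C) >> 2);
    // The low two bits of `c`, followed by all six bits of `d`.
    let dec3 = ((c & 0x03) << 6) | d;
    [dec1, dec2, dec3]
}

/// Formats a byte as eight binary digits, matching the notation used above.
fn to_binary(byte: u8) -> String {
    format!("{:08b}", byte)
}
```

With these helpers, `combine(20, 39, 21, 51)` returns the bytes 82, 117 and 115, and `to_binary` renders them as 01010010, 01110101 and 01110011, the same values we derived by hand.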

Testing

Now let’s write some tests:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_decode() {
        let decoded = decode(b"WW91IGFyZSBhbGxvd2VkIHRvIGJlIGJvdGggYSBtYXN0ZXJwaWVjZSBhbmQgYSB3b3JrIGluIHByb2dyZXNzLCBzaW11bHRhbmVvdXNseS4=");
        assert_eq!("You are allowed to be both a masterpiece and a work in progress, simultaneously.", decoded);
    }
    
}

And the result:

running 1 test
test base64::tests::test_decode ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
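A few more cases are worth testing, especially around padding. The sketch below repeats the decode and decode_char functions so that it compiles on its own, and it assumes that BASE_CHARS from the previous article is the standard base64 alphabet. The test module name and the individual test names are hypothetical.

```rust
// Assumption: BASE_CHARS is the standard base64 alphabet.
const BASE_CHARS: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Copy of the article's functions, repeated to keep the example self-contained.
pub fn decode(input: &[u8]) -> String {
    let mut output: Vec<u8> = Vec::new();
    for chunk in input.chunks(4) {
        let a = decode_char(chunk[0]);
        let b = decode_char(chunk[1]);
        let c = decode_char(chunk[2]);
        let d = decode_char(chunk[3]);

        output.push((a << 2) | ((b & 0x30) >> 4));
        output.push(((b & 0x0F) << 4) | ((c & 0x3C) >> 2));
        output.push(((c & 0x03) << 6) | d);
    }
    String::from_utf8(output).unwrap().replace("\0", "")
}

fn decode_char(input: u8) -> u8 {
    BASE_CHARS.iter().position(|&c| c == input).unwrap_or(0) as u8
}

#[cfg(test)]
mod edge_case_tests {
    use super::*;

    #[test]
    fn decodes_two_padding_characters() {
        assert_eq!("Rust", decode(b"UnVzdA=="));
    }

    #[test]
    fn decodes_one_padding_character() {
        assert_eq!("st", decode(b"c3Q="));
    }

    #[test]
    fn decodes_empty_input() {
        assert_eq!("", decode(b""));
    }

    #[test]
    fn strips_genuine_nul_bytes() {
        // "AAAA" encodes three zero bytes, which the NUL stripping removes.
        assert_eq!("", decode(b"AAAA"));
    }
}
```

The last test documents a limitation rather than a feature: because padding is handled by stripping NUL characters, data that legitimately contains zero bytes cannot be recovered with this implementation.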

Summary

In these two blog posts, we learned how Base64 encoding and decoding work and implemented both in Rust. I think it’s a great way to learn a new programming language because we can challenge ourselves by combining it with other things we’re not familiar with. It makes learning more fun and productive.