Cryptopals Set 1 writeup

Posted on July 11, 2019 by rctcwyvrn

Code can be found here

Hi, a simple writeup for the cryptopals set 1 challenges for the crypto noob from a crypto noob. There are plenty of other tutorials, so look for a better one if this one makes no sense

This is gonna mostly be a tutorial for how to do this byte stuff in python, which is really unintuitive to me anyway

NOTE: Some of the most trouble I had in these challenges was getting the data to the right types, and it involved lots and lots of stackoverflow and following other guides, remember there’s no shame in doing that and don’t feel bad when you see your 10th TypeError in a row

Challenge 1: Convert hex to base64

For this challenge you just need to know how to do this stuff in python, I used the codecs library

Decode: Some encoded format like hex or base64 or ascii –> bytearray Encode: bytearray –> Some encoded format like hex or base64 or ascii

So following the hint you convert like this: hex -> bytes -> base64

Here’s some examples for how it works

def hex_to_bytes(hex_in):
	return codecs.decode(hex_in, 'hex')

def base64_to_bytes(hex_in):
	return codecs.decode(hex_in, 'utf-8')

def bytes_to_hex(byte_in):
	return codecs.encode(byte_in,'hex').decode()

Challenge 2: Fixed XOR

For this one you want to use python’s ^ operator, which acts on two bytes and returns the logical XOR So the steps are

  1. Convert both hex strings to bytes
  2. Create a new bytearray for the output
  3. Loop on the bytearrays for the two input strings
  4. Append the result of ^ to the output
  5. Encode the output bytes back to hex (im too lazy to check if i actually have to do this)

Challenge 3 Single-byte XOR cipher

I see why these are in order now… Theoretically it’s not hard, the problem for me was getting the stupid python syntax correct…

Here’s the framework

  1. Convert to bytes as usual
  2. Loop from 0 to 255 to loop over all the possible single chars
  3. Do a single-byte xor on each of those, here’s code from the tutorial I found
def single_char_xor(in_raw, char_val):
	output_bytes = b''
	for byte in in_raw:
		output_bytes+=bytes([byte ^ char_val])
	return output_bytes

Source: https://laconicwolf.com/2018/05/29/cryptopals-challenge-3-single-byte-xor-cipher-in-python/

For all the other python things, follow along with laconicwolf and google. I’ll lay out the rest of the framework, I would recomend just trying it from here and referring back here when you get stuck

  1. Calculate a “english_score”, using something like this https://en.wikipedia.org/wiki/Letter_frequency to determine if something is a phrase or not
  2. Create a dictionary of score/bytearray pairs and sort them to find which bytearray has the best score

Since the best score = most like an english phrase, the key that makes the best english phrase is (probably) the best key. So thats it!

Challenge 4 Detecting single-byte XOR cipher

It’s challenge 3 but literally just more

  1. file = open(“data.txt”)
  2. Loop through the file line by line by using python magic, for line in file: detect_single_char_xor(line), where that function is your code from Challenge 3
  3. Do the same sorting proccess as challenge 3 to again which determine which bytearray has the best score

Now the party is really going!

Aside 1: Converting plaintext strings and chars to bytes

  1. Declare an empty list, I called mine temp
  2. Append [ord(char)] for each char in the plaintext to temp
  3. my_bytes = bytes(temp)

ord converts a char to it’s byte value, so we just make a bytearray of the bytes and we have the string in it’s bytes for us to mess around with!

Aside 2: Having an empty bytearray to start appending bytes to

  1. Literally just output_bytes = b’’

What the hell python, how is this legal. You can redo the code from aside 1 with this new information btw

Challenge 5: Repeating-key XOR

Mostly a combination of what we’ve seen already, I would reccomend making sure you can do this on your own before reading any guides, since it should be mostly copy paste from challenges 3 and 4

  1. Take the key and plaintext
  2. Convert the plaintext into bytes
  3. Loop over the bytes and append on bytes([ord(key[count]) ^ byte]) where count is incremented and modded over the length of the keystring
  4. Return and you’re done!

Challenge 6: Break repeating-key XOR

The big bad!

Part 1: Hamming distance function

List of mistakes I made along the way

  1. You want to compare bits, not bytes, so convert the byte (which is really just an int) into a string of bits (Stackoverflow it, no shame in doing so)
  2. The bits may not have the same length, so you need to add the distance between their lengths to the dist
  3. Make sure you are indexing the string in the right direction
  4. Make sure not to index off the end of the bit string

Part 2: Rest of the fucking owl

Honestly I don’t know how my code managed to be bug free, but it somehow was…

Here’s the functions I used:

  1. hdist(bytes1,bytes2), hamming distance function
  2. take_block(in_bytes, a, b), returns the bytes from a to b
  3. blockify(in_bytes, block_size), converts the bytes into a list of block_size sized bytes
  4. transpose(blocks), takes the list from blockify and transposes it as detailed in the challenge (step 6)
  5. break_repeating_key_xor(enc_bytes, guess_len), the big boi

hdist was explained in part 1 and the other functions are fairly self explanatory except for 5.

Here’s what break_repeating_key_xor() did:

  1. Loop over keysizes from 2 to guess_len
  2. Break the entire… As I was writing this I realized that I just rewrote the code for blockify(), basically line for line…
  3. (revised) Call blockify to create the list of blocks
  4. Use some nice python magic to make a list of all the dists for all the combinations of two blocks
  5. Sum it up and normalize it by the length of the list and the key_size
  6. Add it into the list of potential key_sizes
  7. (out of the key_size loop now) Sort the list
  8. Blockify by the optimal key_size
  9. Transpose them
  10. Call break_single_byte_xor() from challenge 3 to get a single-byte key
  11. Put em all together, use chr() to convert them back to ascii and you get your final key!

Key = {Terminator X : Bring the noise} My code is available, but I would really not recommend comparing your answer to them as I am fairly inexperienced in writing good python code, I write just barely good enough python code. There’s defintely one or two off by one bugs in my code too.

Challenge 7 AES in ECB mode

I’m stupid and didn’t read the instructions, do this in code because you’ll need it alot later. I used pycrpyto

Challenge 8 Detecting AES in ECB mode

The main part of the challenge is figuring out how to actually detect ECB encryption, and the hint isn’t super helpful.

The idea is that if there is a duplicate 16 byte plaintext in the original message, then it will also be duplicated in the ECB. But why we can assume that there is duplicated plaintext is beyond me… Here’s what I followed: https://crypto.stackexchange.com/questions/20941/why-shouldnt-i-use-ecb-encryption and https://obrien.io/writeups/crypto/2018/02/01/cryptopals-set-1-writeup/ to check my answers

Anyway you want to do the type wrangling you’re probably used to now

  1. Open the file
  2. lines = f.readlines()
  3. for line in lines
  4. unhexlify(line.strip()), the strip() is important! Don’t be dumb like me and forget it
  5. Append those onto a new list enc[]
  6. Loop through enc and call is_ecb() on them until it finds something

is_ecb() is easy once you understand how to actually detect ecb

  1. Find the # of bytes in in_bytes
  2. Find the # of bytes in in_bytes without duplicates
  3. If they’re the same length then it’s not ECB, but if the second is smaller then it’s probably ECB encoded

The answer doesn’t seem to be something that’s “obviously correct” like in the earlier challenges, but I’m reasonable sure my code is correct.

And that concludes Set 1! Pretty fun but also defintely frustrating at times when you get nothing but TypeErrors for 20 minutes straight trying to convert the input to what you want. Set 2 coming soon tmtm