Cryptopals Set 1 writeup
Posted on July 11, 2019 by rctcwyvrn
Code can be found here
Hi, this is a simple writeup of the cryptopals set 1 challenges, for the crypto noob from a crypto noob. There are plenty of other tutorials, so look for a better one if this one makes no sense.
This is mostly gonna be a tutorial on how to do this byte stuff in Python, which is really unintuitive, to me anyway.
NOTE: Most of the trouble I had in these challenges was getting the data into the right types. It involved lots and lots of Stack Overflow and following other guides. Remember there’s no shame in doing that, and don’t feel bad when you see your 10th TypeError in a row.
Challenge 1: Convert hex to base64
For this challenge you just need to know how to do this stuff in Python. I used the codecs library.
Decode: some encoded format like hex or base64 or ascii –> bytearray
Encode: bytearray –> some encoded format like hex or base64 or ascii
So, following the hint, you convert like this: hex -> bytes -> base64
Here’s some examples of how it works
import codecs

def hex_to_bytes(hex_in):
    return codecs.decode(hex_in, 'hex')

def base64_to_bytes(b64_in):
    # the base64 codec only accepts bytes, so a str input is encoded first
    return codecs.decode(b64_in.encode(), 'base64')

def bytes_to_hex(byte_in):
    return codecs.encode(byte_in, 'hex').decode()
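To finish the challenge you also need the bytes -> base64 direction. Here is a minimal sketch of the whole chain, assuming the same codecs approach. The names `bytes_to_base64` and `hex_to_base64` are made up for this example, and the newline stripping is needed because the `base64` codec wraps its output into lines.

```python
import codecs

def hex_to_bytes(hex_in):
    # hex string -> raw bytes
    return codecs.decode(hex_in, 'hex')

def bytes_to_base64(byte_in):
    # The base64 codec returns bytes and inserts newlines (one every 76
    # characters plus a trailing one), so decode to str and drop them.
    return codecs.encode(byte_in, 'base64').decode().replace('\n', '')

def hex_to_base64(hex_in):
    return bytes_to_base64(hex_to_bytes(hex_in))

print(hex_to_base64('4869'))  # '4869' is the hex for b'Hi'
```

The standard library's `base64.b64encode` does the same job without the newlines, if you would rather skip the cleanup.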
Challenge 2: Fixed XOR
For this one you want to use Python’s ^ operator, which takes two ints (and a single byte is just an int) and returns their bitwise XOR.
So the steps are
- Convert both hex strings to bytes
- Create a new bytearray for the output
- Loop on the bytearrays for the two input strings
- Append the result of ^ to the output
- Encode the output bytes back to hex (I’m too lazy to check if I actually have to do this)
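The steps above can be sketched roughly like this. The function name `fixed_xor` and the length check are assumptions made for the example; the two hex strings are the ones given on the challenge page.

```python
import codecs

def fixed_xor(hex_a, hex_b):
    a = codecs.decode(hex_a, 'hex')
    b = codecs.decode(hex_b, 'hex')
    if len(a) != len(b):
        raise ValueError('inputs must be the same length')
    out = bytearray()
    # Iterating over bytes yields ints, so ^ works on each pair directly.
    for x, y in zip(a, b):
        out.append(x ^ y)
    return codecs.encode(bytes(out), 'hex').decode()

print(fixed_xor('1c0111001f010100061a024b53535009181c',
                '686974207468652062756c6c277320657965'))
```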
Challenge 3: Single-byte XOR cipher
I see why these are in order now… Theoretically it’s not hard, the problem for me was getting the stupid python syntax correct…
Here’s the framework
- Convert to bytes as usual
- Loop from 0 to 255 to loop over all the possible single chars
- Do a single-byte xor on each of those, here’s code from the tutorial I found
def single_char_xor(in_raw, char_val):
    output_bytes = b''
    for byte in in_raw:
        output_bytes += bytes([byte ^ char_val])
    return output_bytes
Source: https://laconicwolf.com/2018/05/29/cryptopals-challenge-3-single-byte-xor-cipher-in-python/
For all the other Python things, follow along with laconicwolf and Google. I’ll lay out the rest of the framework. I would recommend just trying it from here and referring back here when you get stuck
- Calculate an “english_score”, using something like this https://en.wikipedia.org/wiki/Letter_frequency to determine if something is a phrase or not
- Create a dictionary of score/bytearray pairs and sort them to find which bytearray has the best score
Since the best score = most like an English phrase, the key that makes the best English phrase is (probably) the right key. So that’s it!
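As a rough sketch of that framework: the table below holds approximate letter frequencies in percent, rounded from the Wikipedia page, and the weight for the space character is an assumption. `english_score` and `break_single_byte_xor` are illustrative names, and `max()` stands in for the sort-a-dictionary step. The test phrase is made up for the demo, it is not the challenge input.

```python
# Approximate English letter frequencies in percent. The space weight is an
# assumption; the Wikipedia table only covers letters.
ENGLISH_FREQ = {
    'a': 8.2, 'b': 1.5, 'c': 2.8, 'd': 4.3, 'e': 12.7, 'f': 2.2, 'g': 2.0,
    'h': 6.1, 'i': 7.0, 'j': 0.15, 'k': 0.77, 'l': 4.0, 'm': 2.4, 'n': 6.7,
    'o': 7.5, 'p': 1.9, 'q': 0.095, 'r': 6.0, 's': 6.3, 't': 9.1, 'u': 2.8,
    'v': 0.98, 'w': 2.4, 'x': 0.15, 'y': 2.0, 'z': 0.074, ' ': 13.0,
}

def single_char_xor(in_raw, char_val):
    return bytes(byte ^ char_val for byte in in_raw)

def english_score(candidate):
    # Sum the weight of every byte; anything that is not a letter or a
    # space scores 0.
    return sum(ENGLISH_FREQ.get(chr(byte).lower(), 0) for byte in candidate)

def break_single_byte_xor(in_raw):
    # Try all 256 keys and keep the one whose output looks most English.
    best_key = max(range(256),
                   key=lambda k: english_score(single_char_xor(in_raw, k)))
    return best_key, single_char_xor(in_raw, best_key)

secret = single_char_xor(b"Cooking MC's like a pound of bacon", ord('X'))
key, plain = break_single_byte_xor(secret)
print(chr(key), plain)
```

A scoring function this simple can be fooled by short or repetitive inputs, so treat the winner as a strong guess and eyeball the output.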
Challenge 4: Detecting single-byte XOR cipher
It’s challenge 3 but literally just more
- file = open('data.txt')
- Loop through the file line by line using Python magic: for line in file: detect_single_char_xor(line), where that function is your code from Challenge 3
- Do the same sorting process as challenge 3 to determine which bytearray has the best score
Now the party is really going!
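A compact sketch of the challenge 4 loop, under a few assumptions: the scoring here is a cruder stand-in that just counts common characters, `detect_single_char_xor` takes the whole iterable of lines instead of one line, and the demo lines are made up so the example runs without the challenge file.

```python
import codecs

COMMON = b'etaoin shrdlu'  # roughly the most common English characters

def english_score(candidate):
    # Crude stand-in for a real frequency table: count common characters.
    return sum(1 for byte in candidate.lower() if byte in COMMON)

def break_single_byte_xor(in_raw):
    # Returns (score, key, plaintext) for the best of the 256 possible keys.
    candidates = []
    for key in range(256):
        plain = bytes(byte ^ key for byte in in_raw)
        candidates.append((english_score(plain), key, plain))
    return max(candidates)

def detect_single_char_xor(lines):
    # lines: any iterable of hex strings, e.g. an open file object.
    results = []
    for line in lines:
        raw = codecs.decode(line.strip(), 'hex')  # strip() drops the newline
        results.append(break_single_byte_xor(raw))
    return max(results)

# Made-up stand-in for the file: one junk line, one XOR'd English line.
message = b'this is a secret message in english'
demo_lines = [
    bytes(range(100, 134)).hex() + '\n',
    bytes(byte ^ 53 for byte in message).hex() + '\n',
]
score, key, plain = detect_single_char_xor(demo_lines)
print(key, plain)
```

With the real file you would pass the open file object instead of `demo_lines`.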
Aside 1: Converting plaintext strings and chars to bytes
- Declare an empty list, I called mine temp
- Append ord(char) for each char in the plaintext to temp
- my_bytes = bytes(temp)
ord converts a char to its byte value, so we just make a bytearray out of those values and we have the string as bytes for us to mess around with!
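Here is that recipe as a small sketch, with a made-up input string. For plain ASCII text the built-in `str.encode` gives the same result, which the last lines check.

```python
text = 'ICE ICE baby'

# The manual route from the steps above.
temp = []
for char in text:
    temp.append(ord(char))
my_bytes = bytes(temp)

# str.encode() produces the same bytes for plain ASCII text.
same = (my_bytes == text.encode('ascii'))
print(my_bytes, same)
```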
Aside 2: Having an empty bytearray to start appending bytes to
- Literally just output_bytes = b''
What the hell python, how is this legal. You can redo the code from aside 1 with this new information btw
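Redoing aside 1 this way might look like the sketch below (the input string is made up). One thing worth knowing: `bytes` is immutable, so every `+=` builds a new object, while `bytearray` is the mutable cousin that supports `append`.

```python
text = 'hello'

# Start from an empty bytes object and grow it one byte at a time.
# bytes is immutable, so each += creates a new object under the hood.
output_bytes = b''
for char in text:
    output_bytes += bytes([ord(char)])

# bytearray is the mutable alternative and supports append().
output_array = bytearray()
for char in text:
    output_array.append(ord(char))

print(output_bytes, bytes(output_array))
```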
Challenge 5: Repeating-key XOR
Mostly a combination of what we’ve seen already. I would recommend making sure you can do this on your own before reading any guides, since it should be mostly copy paste from challenges 3 and 4
- Take the key and plaintext
- Convert the plaintext into bytes
- Loop over the bytes and append on bytes([ord(key[count]) ^ byte]) where count is incremented and modded over the length of the keystring
- Return and you’re done!
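A minimal sketch of those steps, assuming both the key and the plaintext come in as str. The function name `repeating_key_xor` is made up for the example; the key `ICE` and the word `Burning` come from the challenge text.

```python
def repeating_key_xor(plaintext, key):
    # plaintext and key are both str here, matching the steps above.
    plain_bytes = plaintext.encode()
    output_bytes = b''
    count = 0
    for byte in plain_bytes:
        output_bytes += bytes([ord(key[count]) ^ byte])
        count = (count + 1) % len(key)  # wrap around to the start of the key
    return output_bytes

print(repeating_key_xor('Burning', 'ICE').hex())
```

The challenge wants the result hex encoded, hence the `.hex()` on the way out.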
Challenge 6: Break repeating-key XOR
The big bad!
Part 1: Hamming distance function
List of mistakes I made along the way
- You want to compare bits, not bytes, so convert the byte (which is really just an int) into a string of bits (Stackoverflow it, no shame in doing so)
- The bit strings may not have the same length, since leading zeros usually get dropped in the conversion, so pad the shorter one with zeros on the left before comparing
- Make sure you are indexing the string in the right direction
- Make sure not to index off the end of the bit string
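One way to sidestep most of those mistakes is to skip the bit strings and XOR the bytes instead, since `a ^ b` has a 1 bit exactly where `a` and `b` differ. This is a different technique from the string comparison described above, sketched here with the test strings from the challenge page. The length check is an assumption added for safety.

```python
def hdist(bytes1, bytes2):
    if len(bytes1) != len(bytes2):
        raise ValueError('inputs must be the same length')
    dist = 0
    for a, b in zip(bytes1, bytes2):
        # a ^ b has a 1 bit exactly where a and b differ
        dist += bin(a ^ b).count('1')
    return dist

print(hdist(b'this is a test', b'wokka wokka!!!'))  # the challenge says 37
```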
Part 2: Rest of the fucking owl
Honestly I don’t know how my code managed to be bug free, but it somehow was…
Here’s the functions I used:
- hdist(bytes1,bytes2), hamming distance function
- take_block(in_bytes, a, b), returns the bytes from a to b
- blockify(in_bytes, block_size), converts the bytes into a list of block_size sized bytes
- transpose(blocks), takes the list from blockify and transposes it as detailed in the challenge (step 6)
- break_repeating_key_xor(enc_bytes, guess_len), the big boi
hdist was explained in part 1 and the other functions are fairly self-explanatory, except for the last one, break_repeating_key_xor.
Here’s what break_repeating_key_xor() did:
- Loop over keysizes from 2 to guess_len
- Break the entire… As I was writing this I realized that I just rewrote the code for blockify(), basically line for line…
- (revised) Call blockify to create the list of blocks
- Use some nice python magic to make a list of all the dists for all the combinations of two blocks
- Sum it up and normalize it by the length of the list and the key_size
- Add it into the list of potential key_sizes
- (out of the key_size loop now) Sort the list and take the key_size with the smallest normalized distance
- Blockify by the optimal key_size
- Transpose them
- Call break_single_byte_xor() from challenge 3 on each transposed block to get a single-byte key for it
- Put em all together, use chr() to convert them back to ascii and you get your final key!
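Put together, the whole thing might look something like the sketch below. It reuses the function names from the list above, but the bodies are written for this example and are not the code from the repo. The frequency table is approximate, the space weight is an assumption, and partial blocks are skipped when measuring distances.

```python
from itertools import combinations

# Approximate English letter frequencies in percent; the space weight is an
# assumption.
ENGLISH_FREQ = {
    'a': 8.2, 'b': 1.5, 'c': 2.8, 'd': 4.3, 'e': 12.7, 'f': 2.2, 'g': 2.0,
    'h': 6.1, 'i': 7.0, 'j': 0.15, 'k': 0.77, 'l': 4.0, 'm': 2.4, 'n': 6.7,
    'o': 7.5, 'p': 1.9, 'q': 0.095, 'r': 6.0, 's': 6.3, 't': 9.1, 'u': 2.8,
    'v': 0.98, 'w': 2.4, 'x': 0.15, 'y': 2.0, 'z': 0.074, ' ': 13.0,
}

def hdist(bytes1, bytes2):
    # Number of differing bits between two equal-length byte strings.
    return sum(bin(a ^ b).count('1') for a, b in zip(bytes1, bytes2))

def blockify(in_bytes, block_size):
    return [in_bytes[i:i + block_size]
            for i in range(0, len(in_bytes), block_size)]

def transpose(blocks):
    # Column i collects byte i of every block; a short last block is fine.
    size = len(blocks[0])
    return [bytes(block[i] for block in blocks if i < len(block))
            for i in range(size)]

def break_single_byte_xor(in_raw):
    def score(key):
        return sum(ENGLISH_FREQ.get(chr(b ^ key).lower(), 0) for b in in_raw)
    return max(range(256), key=score)

def break_repeating_key_xor(enc_bytes, guess_len):
    scored = []
    for key_size in range(2, guess_len + 1):
        # Only full-size blocks are compared, so every distance covers
        # exactly key_size bytes.
        blocks = [b for b in blockify(enc_bytes, key_size)
                  if len(b) == key_size]
        dists = [hdist(a, b) for a, b in combinations(blocks, 2)]
        if not dists:
            continue
        # Average distance per pair, then per byte.
        scored.append((sum(dists) / len(dists) / key_size, key_size))
    best_size = min(scored)[1]
    columns = transpose(blockify(enc_bytes, best_size))
    return ''.join(chr(break_single_byte_xor(col)) for col in columns)
```

You would call it as `break_repeating_key_xor(enc_bytes, 40)` after base64-decoding the challenge file. Comparing every pair of blocks is slow on long inputs, and a multiple of the true key size can score just as well as the real one, so a repeated key in the output is a hint to try the shorter size.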
Key = {Terminator X : Bring the noise}
My code is available, but I would really not recommend comparing your answer to it, as I am fairly inexperienced in writing good Python code; I write just barely good enough Python code. There’s definitely one or two off-by-one bugs in my code too.
Challenge 7: AES in ECB mode
I’m stupid and didn’t read the instructions at first. Do this in code, because you’ll need it a lot later. I used pycrypto
Challenge 8: Detecting AES in ECB mode
The main part of the challenge is figuring out how to actually detect ECB encryption, and the hint isn’t super helpful.
The idea is that ECB encrypts every 16 byte block on its own with the same key, so if there is a duplicate 16 byte plaintext block in the original message, the matching ciphertext blocks will be duplicates too. But why we can assume that there is duplicated plaintext is beyond me… Here’s what I followed: https://crypto.stackexchange.com/questions/20941/why-shouldnt-i-use-ecb-encryption and https://obrien.io/writeups/crypto/2018/02/01/cryptopals-set-1-writeup/ to check my answers
Anyway you want to do the type wrangling you’re probably used to now
- Open the file
- lines = f.readlines()
- for line in lines
- unhexlify(line.strip()), the strip() is important! Don’t be dumb like me and forget it
- Append those onto a new list enc[]
- Loop through enc and call is_ecb() on them until it finds something
is_ecb() is easy once you understand how to actually detect ecb
- Split in_bytes into 16 byte blocks and count them
- Count the blocks again with the duplicates removed
- If the two counts are the same then it’s not ECB, but if the second is smaller then it’s probably ECB encrypted
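A sketch of the duplicate-block check and the loop around it. `find_ecb_lines` is a made-up helper name, and the demo lines are hand-built byte strings rather than real AES output, just so the example runs without the challenge file.

```python
from binascii import unhexlify

def is_ecb(in_bytes, block_size=16):
    blocks = [in_bytes[i:i + block_size]
              for i in range(0, len(in_bytes), block_size)]
    # A set drops duplicates, so a smaller set means some block repeated.
    return len(set(blocks)) < len(blocks)

def find_ecb_lines(lines):
    # lines: iterable of hex strings, e.g. f.readlines() on the challenge file
    enc = [unhexlify(line.strip()) for line in lines]
    return [i for i, ct in enumerate(enc) if is_ecb(ct)]

# Hand-built stand-ins: the second line repeats a 16 byte block.
demo = [
    bytes(range(48)).hex() + '\n',
    (b'YELLOW SUBMARINE' * 2 + bytes(range(16))).hex() + '\n',
]
print(find_ecb_lines(demo))
```

The function returns line indexes rather than stopping at the first hit, so you can see whether more than one line looks suspicious.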
The answer doesn’t seem to be something that’s “obviously correct” like in the earlier challenges, but I’m reasonably sure my code is correct.
And that concludes Set 1! Pretty fun, but also definitely frustrating at times when you get nothing but TypeErrors for 20 minutes straight trying to convert the input to what you want. Set 2 coming soon tmtm