Introduction
Crython was a challenge at BearcatCTF 2025 that involved deobfuscation of a large Python file using an Abstract Syntax Tree.
Opening the File
As soon as you open the file, you get hit with a massive jumble of operations:
1 | f = int.from_bytes(input('Enter your flag: ').encode(),'big') |
We see a large number at the end of the if statement, being XORed with the final result. If the value is falsy (0 in this case, since it’s an integer), then the program prints the “Nice flag!” message. Otherwise, we get “WRONG!”.
This means that whatever this massive number is, is the correct result for this program (since if you XOR a number by itself, you get zero.)
But, this number on its own didn’t convert to anything meaningful, so we figured there was something more.
Approaching Deobfuscation
Upon a cursory glance, we notice something: it’s just a massive string of the same operations, over and over again.
A pattern you can quickly notice is: lots of bit operations (inverts, AND/OR/XOR, bit shifts, etc.)
But there’s also a few arithmetic operations, as you can see in this code snippet: (1^1**1*1)
- that is 1 XOR 1 to the power of 1 times 1. (Or just 1 XOR 1, or 0).
Either way, there’s a lot to be evaluated here. If we sat here trying to deobfuscate this manually, we’d be here forever.
Introducing the AST
This is where something called an abstract syntax tree, or AST, comes in.
An abstract syntax tree is basically an interpreted program, parsed into nodes that interconnect with each other.
Let’s give, say, the simple operation a = 1
. This might be interpreted as an assignment node, with the left branch (the object being assigned to) being the variable a
and the right branch (the value) being the number 1.
Now, how can an AST help us with this challenge? We can walk the AST of the script (meaning, we visit every node on the tree) and recursively evaluate all these jumbled nodes, replacing them with the numbers they evaluate to.
In fact, Python even has a built in ast module for parsing Python code, meaning we don’t have to create our own AST.
For example, let’s take (1^1**1*1)
. It looks something like this (rendered with the VisAST module):
This might look a little crazy at first, but to explain:
- BinOp is simply just the name in Python for any mathematical or bitwise operator that takes two parameters (left and right, as we might refer to them later). So things like XOR, multiplication, and exponents, which are all used in this code.
- Following this tree, we can see the statement does a bitwise XOR between a constant 1 and another BinOp. That BinOp is a multiplication between another BinOp, which simply summarizes to “1 to the power of 1”, and its own constant 1.
If you follow the code, comparing it to the tree, you can quite easily see where things in the code appear in the AST, at least in a small tree like this one.
Looking at the ast
Module
During my research, I saw that the ast
module has a NodeTransformer class, which allows us to walk the AST and modify the nodes.
Let’s start with a simple sample, attempting to consolidate 1+2+3+4+5+6
:
1 | import ast |
Woah, this looks interesting. But, let’s go over it, line by line.
First, we import ast
and parse the AST for 1+2+3+4+5+6
. Okay, seems pretty straightforward.
We then have a ReplaceConstantOps
class - this is where the magic happens. We have a function called visit_BinOp
in this class - this is a function built into the class that allows us to modify behavior that happens when we visit a BinOp node (which, if you recall, is just any arithmetic or bitwise operator with two parameters.)
We grab the operator, but then before doing anything with the left and right nodes of the BinOp, we visit them and replace them with the output of the transformer.
This is done so that we can recursively evaluate nested BinOps. Without this, we wouldn’t be able to deobfuscate the code.
Then, we do a quick sanity check and ensure we’re now dealing with two Constants. Then, we check if the node is an addition node, and return a Constant node that is simply the two nodes added together.
When running this code, you get 21
as the output, which is correct!
This is an example of a basic constant folding algorithm, which basically just means replacing expressions that will always evaluate to a constant value, with that constant value.
Expanding for Crython
At this point, it’s just a matter of implementing every node type and operation that the Crython challenge uses.
We can see from a quick glance that we’re dealing with:
- BinOp, which we’ve already went over
- UnaryOp, which is what those
~
and-
operations before numbers and expressions are. This is much like a BinOp, except it only uses one parameter instead of two - Call, for those sum() calls that we see throughout the code.
I started by implementing a majority of the BinOp operations:
1 | if type(o) == ast.BitAnd: |
Then, I implemented UnaryOp, which was much the same as the BinOp code, just modified to only have one parameter (the node.operand
):
1 | def visit_UnaryOp(self, node): |
Another somewhat complicated part of the code is the call folding, so I’ll walk through that real quick.
1 | def visit_Call(self, node): |
First, you can see that we grab the function name out of node.func.id
, and prepare a tuple of arguments to sum().
We check if we’re dealing with a sum()
call, and if we are, we iterate over the arguments. If we’re dealing with a Tuple object as an argument, we iterate over the elements of the Tuple, visiting each node in the tuple and replacing the existing node. We do this so we can evaluate any expressions in the tuple to Constants that we can easily add together. We also add the evaluated node to our own tuple.
Once we have all the members of the tuple evaluated, we call sum() on the tuple and return a Constant value that is the sum() call fully evaluated.
Note: We did eventually find out that all the sum
calls are simply sum(1, 2, 3, 1)
, meaning a simple return ast.Constant(7)
might’ve worked here too. But, for completeness, we decided to implement a full sum() evaluator.
With this being done, we run the program, and cross our fingers…
1 | f = int.from_bytes(input('Enter your flag: ').encode(), 'big') |
That looks more like it! A lot shorter, but still kinda looks like a jumble. At least it’s readable now!
Now that we have readable code, we can kinda decipher what’s going on here. The first bit in our input corresponds to the 162nd bit in the large integer, the second to the 291st, and so on.
This means we simply just have to swap around the bits, and we get the flag.
I decided enough was enough with ASTs at this point, so I used a good old fashioned regular expression and evaluated each operation:
1 | bit_regex = r"\(f & ([0-9]+)\) ([<>]+) ([0-9]+)" |
Again, this is a little confusing, so let’s break it down.
That regex basically matches any (f & X) [BITSHIFT] Y
where BITSHIFT is either a left shift (<<
) or right shift (>>
). It also has groups to grab both numbers and the bitshift so we can evaluate it later on.
Then, we iterate over every appearance of that regex pattern. We join the groups with a space and then evaluate it as code (the join comes out as X [BITSHIFT] Y
due to how we set up our groups, so for (f & 1) << 162
this would be 1 << 162
.) Then we AND it with the jumbled bits value.
If the AND comes out as true, we set that bit in our final integer. If not, we know that bit is not set in the final integer, so we don’t set it.
Then, we use to_bytes
to get the string.
And our output:
1 | C:\CTF\bearcat25>py astcrython.py |
That’s the flag!
Conclusion
I hope this article was valuable in explaining how ASTs work, and how they can be used to deobfuscate Python code. This was a learning experience for me during the CTF, and I hope it can be to you too!