Skip to content

Transform regular Python code into a human-averse, yet still-functional equivalent.

License

Notifications You must be signed in to change notification settings

brandonasuncion/Python-Code-Obfuscator

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Python Code Obfuscator

I was browsing /r/dailyprogrammer on Reddit one day, and attempted one of the daily challenges. After doing the challenge, I read through the comments and found a very interesting submission.

Seeing that baffled me at first sight, but after reading /u/ntxhhf's breakdown of his code, I was inspired to make my own code obfuscator for Python using the ideas in his post.

What exactly does this script do?
It takes your regular-looking Python code, and obfuscates it! It takes any specified Python script, and will attempt to create an equivalent script that has the same exact functionality as the original, but is incredibly difficult for humans to read regularly.

So... how is this useful?
According to Wikipedia...

Programmers may deliberately obfuscate code to conceal its purpose (security through obscurity) or its logic or implicit values embedded in it, primarily, in order to prevent tampering, deter reverse engineering, or even as a puzzle or recreational challenge for someone reading the source code.

Examples

Masking Numerical Values

Input: Using Netwon's Method to find the square root of 17

n = 17; x = 1
for i in range(100): x = x - ((x**2 - n) / (2*x))
print(x)

Output:

__=((()==[])+(()==[]));___=(__**__);____=((___<<___));_____=((____<<(__**__)));______=((_____<<(__**__)));
_________=((___<<_____));__________=((((___<<_____))<<(__**__)));_=((__**__)+(______<<(__**__)));_______=(__**__)
for ________ in range((_____+(_________<<(__**__))+(__________<<(__**__)))):
    _______=_______-((_______**((___<<___))-_)/(((___<<___))*_______))
print(_______)

Obfuscating Strings

There are two ways the parser can encrypt strings. The first way is with hex strings, and the other using the number encoding method above.

Example Input: print("Hello World!")

Output

_=((()==[])+(()==[]));__=(_**_);___=((__<<__));____=((___<<(_**_)));_____=((__<<____));______=((_____<<(_**_)));
_______=str(''.join(chr(__RSV) for __RSV in [((____<<(_**_))+(______<<(_**_))),((_**_)+____+______+(((_____<<(_**_)))<<(_**_))),
(____+(((___<<(_**_)))<<(_**_))+______+(((_____<<(_**_)))<<(_**_))),(____+(((___<<(_**_)))<<(_**_))+______+(((_____<<(_**_)))<<(_**_))),
((_**_)+___+____+(((___<<(_**_)))<<(_**_))+______+(((_____<<(_**_)))<<(_**_))),((_____<<(_**_))),((_**_)+___+____+_____+(((_____<<(_**_)))<<(_**_))),
((_**_)+___+____+(((___<<(_**_)))<<(_**_))+______+(((_____<<(_**_)))<<(_**_))),(___+_____+______+(((_____<<(_**_)))<<(_**_))),
(____+(((___<<(_**_)))<<(_**_))+______+(((_____<<(_**_)))<<(_**_))),(____+______+(((_____<<(_**_)))<<(_**_))),((_**_)+______)]))
print(_______)

Hiding Calls to Python's Built-In Functions

In Python, we can call a built-in function indirectly: getattr(__import__('builtins'), 'abs')(5)
So to call a function, we just use the string-encoding method detailed above. It's definitely not space-efficient, but it works!

Input: print(chr(65))

Output:

_=((()==[])+(()==[]));__=(_**_);___=((__<<__));____=((___<<(_**_)));
_____=((____<<(_**_)));______=((_____<<(_**_)));_______=((((_____<<(_**_)))<<(_**_)));
________=str(''.join(chr(__RSV) for __RSV in [((__<<__)+(______<<(_**_))+(_______<<(_**_))),
((_**_)+____+______+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_))),
((_**_)+_____+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_))),(____+_____+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_))),
(____+______+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_))),((_**_)+_____+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_))),
(___+____+_____+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_))),((_**_)+___+______+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_)))]));
_________=str(''.join(chr(__RSV) for __RSV in [(______+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_))),
(___+______+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_))),((_**_)+_____+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_))),
(___+____+_____+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_))),(____+______+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_)))]));
__________=str(''.join(chr(__RSV) for __RSV in [((_**_)+___+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_))),
(_____+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_))),(___+______+_______+(((((_____<<(_**_)))<<(_**_)))<<(_**_)))]))
getattr(__import__(________), _________)(getattr(__import__(________), __________)(((_**_)+(((((_____<<(_**_)))<<(_**_)))<<(_**_)))))

Usage

usage: obfuscator.py [-h] [--debug] inputfile outputfile

positional arguments:
  inputfile   Name of the input file
  outputfile  Name of the output file

optional arguments:
  -h, --help  show this help message and exit
  --debug     Show debug info

Mini-FAQ

  • Should I use this for distributing my source code?
    As of the time I'm writing this, I highly recommend against that idea. There are some instances of code in that the parser cannot handle (multi-line strings, for instance). Also, the output really won't do much to prevent reverse-engineering.
  • The output is too big! How do I reduce the output size?
    As of right now, the biggest impact is the inefficiency of encoding strings. For the smallest output, make sure to set the following constants in the script:
    USE_HEXSTRINGS = True
    OBFUSCATE_BUILTINS = False
    REMOVE_COMMENTS = True
    
  • Is there a license?
    Apache License 2.0.

Credits

  • Brandon Asuncion - Code
    Questions/Comments? Feel free to contact me at: [email protected]
  • /u/ntxhhf - For the idea, and his breakdown of using lists/sets to create boolean values and integers

About

Transform regular Python code into a human-averse, yet still-functional equivalent.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Sponsor this project

Packages

No packages published

Languages