Python has been a popular language among penetration testers for some time now and is used extensively here at RSM. Python version 3 has been out since December 2008, and yet many scripts currently being produced by the security community exclusively target version 2.7. Given that Python 2.7 is in maintenance mode only at this point, it’s important for people to have the tools and knowledge necessary to write code today that won’t need major work in the future to run in Python 3.x. To that end, this blog aims to provide some useful tips and tricks that address issues common for pentesters who want to write code that is compatible with both Python 2 and Python 3.
#5 Using Dictionaries
In Python 3 the dictionary methods items, keys and values return view objects (dict_items, dict_keys and dict_values) instead of lists as they did in Python 2. This can cause errors when the result is used as a mutable type (list).
my_teams_scores = {'alice': 0, 'bob': 0}
my_team = my_teams_scores.keys()
# my_team is a list in Python 2 and dict_keys in Python 3 (append won't work)
my_team.append('spencer')
Instead, the dict_keys object can be converted to a list so that the example above works in both versions.
my_teams_scores = {'alice': 0, 'bob': 0}
my_team = list(my_teams_scores.keys())
# my_team is a list in both versions, the list call only copies the list in Python 2
my_team.append('spencer')
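The same list() conversion also helps when a dictionary is modified while looping over it. The following is a minimal sketch with made-up scores (the values and the zero-score rule are illustrative assumptions, not from the original example). In Python 2 keys() already returns a copy, but in Python 3 looping over the view while deleting entries would typically fail with "RuntimeError: dictionary changed size during iteration".

```python
# hypothetical scores, used only for illustration
my_teams_scores = {'alice': 0, 'bob': 3, 'spencer': 0}

# list() takes a snapshot of the keys, so entries can be deleted safely
# inside the loop in both Python 2 and Python 3
for name in list(my_teams_scores.keys()):
    if my_teams_scores[name] == 0:
        del my_teams_scores[name]

# sorted() accepts the items view directly and returns a real list in both versions
remaining = sorted(my_teams_scores.items())
print(remaining)
```

Read-only operations such as iteration, len() and membership tests work on the view objects directly, so the conversion is only needed when list behavior is actually required.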
#4 Using CamelCase Modules
Python 3 renamed a few modules that used the CamelCase naming convention to the more consistent all-lowercase naming convention (for example, ConfigParser became configparser). Some of these modules did not have dramatic API changes and can be used effectively under either name. The renamed modules include ConfigParser, Queue, SimpleHTTPServer and BaseHTTPServer. The two HTTPServer modules are slightly different, as their main classes are now found in the http.server module.
import sys

if sys.version_info.major < 3:
    # import the Python 2 ConfigParser module and rename it
    import ConfigParser as configparser
else:
    import configparser

parser = configparser.ConfigParser()
parser.read('some_config.ini')
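An alternative to checking the version number is to try the Python 3 name first and fall back to the Python 2 name on ImportError. The sketch below applies that idea to Queue and to the HTTP server classes mentioned above; the queued value 'example-host' is a placeholder chosen for illustration.

```python
try:
    # Python 3 names
    import queue
    from http.server import BaseHTTPRequestHandler, HTTPServer
except ImportError:
    # Python 2 names, imported under the Python 3 spelling
    import Queue as queue
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

# the rest of the script can use the Python 3 spelling everywhere
work_queue = queue.Queue()
work_queue.put('example-host')  # placeholder value
target = work_queue.get()
print(target)
```

Either approach works. The try/except form has the small advantage of not hard-coding a version number, at the cost of hiding unrelated import errors raised inside the try block.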
#3 Printing
One of the easiest ways to tell whether a script is meant to run on Python 3 is to see if “print” is used as a statement or as a function. In Python 3 the print statement was changed to a function, meaning that it needs parentheses around what is being printed.
print 'this will only work in python 2!'
print('this will work in either python 2 or python 3!')
Printing data on the same line:
# this will only work in Python 2.x and not Python 3.x
print 'waiting for something...',
print 'done!'
from __future__ import print_function

# this will work in Python 2.7 and Python 3.x
print('waiting for something...', end='')
print(' done!')
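One detail to watch when printing on the same line: without a trailing newline, the first message may sit in the output buffer until the second one is printed. A small helper along the following lines can flush explicitly. The name status and its stream parameter are hypothetical, introduced only for this sketch.

```python
from __future__ import print_function
import sys

def status(message, stream=None):
    """Write message without a trailing newline and flush it right away."""
    if stream is None:
        stream = sys.stdout
    print(message, end='', file=stream)
    # the flush keyword of print() only exists in Python 3.3+,
    # so flush the stream directly to stay compatible with Python 2.7
    stream.flush()

status('waiting for something...')
# ... long-running work would happen here ...
status(' done!\n')
```

Passing stream=sys.stderr sends the same messages to standard error, which keeps progress output separate from results that are being piped to another tool.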
#2 Making Web Requests
When a script needs to make web requests, the urllib2 module is most often used; however, this module is unavailable in Python 3. It was one of many modules that were renamed and/or moved into a different package in Python 3. Python 3 instead offers the urllib.request module, and while the standard library modules can accomplish basic web request tasks most of the time, doing so requires some import-fu to rename the packages (see “Using CamelCase Modules” above). Instead of employing this more complex solution, an easier way to make web requests in both Python 2 and Python 3 is to simply use the third-party requests package. The requests package is available through pip, supports a wide variety of Python versions and, most importantly, maintains the same API across them, meaning that the same code can use it in any of the supported versions.
# this will only work in Python 2
import urllib2
url_h = urllib2.urlopen('https://warroom.rsmus.com')

# this will work in Python 2.6 - Python 3.4 (as of Requests version 2.8.1)
import requests
resp = requests.get('https://warroom.rsmus.com')
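For scripts that cannot take on a third-party dependency, the import-fu mentioned above might look roughly like the following sketch. The fetch helper is a hypothetical name introduced here; it only covers a basic GET request and is not a substitute for what requests provides.

```python
try:
    # Python 3 location
    from urllib.request import urlopen
except ImportError:
    # Python 2 location
    from urllib2 import urlopen

def fetch(url):
    """Return the raw response body (bytes in Python 3, str in Python 2)."""
    response = urlopen(url)
    try:
        return response.read()
    finally:
        # close explicitly; Python 2 responses are not context managers
        response.close()
```

A call such as fetch('https://warroom.rsmus.com') would then work under either version, with the caveat that the body still needs to be decoded before it is treated as text in Python 3.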
#1 Using the new Bytes Type
Arguably the biggest learning curve for experienced Python 2 users is in the introduction of the new bytes type which resembles strings prefixed with a “b” such as:
b'these are bytes'
This type is used extensively by any code that works with raw sockets or base64 encoding. The important thing to know when using bytes is when to convert the data to the more familiar string type. When a function returns its value as bytes and the user expects it to contain text data, it can be converted to a string using the .decode('utf-8') idiom. In Python 2 this converts the str object to a unicode object, whereas in Python 3 it converts the bytes object to a str object. This takes advantage of the fact that Python 2 supports more operations between str and unicode objects than Python 3 supports between bytes and str objects, most notably concatenation.
import sys

# this will work only in Python 2.x; in Python 3 recv returns bytes,
# which never compare equal to a str
# sock is an established socket connection
data = sock.recv(4)
if data == 'exit':
    sys.exit(0)
In this example “data” is assumed to be a string, as would be used in some kind of custom shell.
import sys

# this will decode the data as utf-8 and work in Python 2.7 & Python 3.x as expected
# sock is an established socket connection
data = sock.recv(4)
data = data.decode('utf-8')
if data == 'exit':
    sys.exit(0)
It’s also useful to note that in Python 2.7 string literals can be prefixed with a lowercase b and are still of the str type. This allows quick concatenation or comparison operations to be compatible with both versions. For example, the cross-version compatible solution above could also be written as:
import sys

# sock is an established socket connection
data = sock.recv(4)
# instead of decoding the data the bytes type is compared to another bytes type
if data == b'exit':
    sys.exit(0)
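The same two techniques apply to base64 encoding, which returns bytes in Python 3. The sketch below is illustrative only: the to_text helper is a hypothetical name, and the credentials are placeholders rather than anything from a real system.

```python
import base64

def to_text(data, encoding='utf-8'):
    """Return data as text, decoding it first if it is a bytes object."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data

# b64encode needs bytes as input in Python 3, so a b'' literal is used;
# the result is bytes in Python 3 and str in Python 2
encoded = base64.b64encode(b'admin:password')  # placeholder credentials

# decoding first allows concatenation with a text string in both versions
header = 'Authorization: Basic ' + to_text(encoded)
print(header)
```

Without the decode step, the concatenation would raise a TypeError in Python 3 because str and bytes cannot be joined directly.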