linkedin Skip to Main Content
Categories

How to Test Python REST APIs

Development

There are many types of tests, and they all seem great in theory, but how do we test a REST API? In order to do that, let’s first break down a REST API into parts, so we can focus on testing one part at a time.

A REST API is an interface that accepts connections via the internet, executes some business logic, and then returns a result. Fundamentally, that means that an API has input, business logic, and output. The business logic should be tested using traditional unit test techniques. After that, what’s left is testing the input and output.

APIs over HTTP (an internet protocol), called HTTP APIs’ have input and output which both have metadata (data about the requests) called headers that are used to give certain information, like how to interpret the data sent (is it encrypted? Is it text? Is it JSON? etcetera). 

HTTP API output also has a special piece of metadata called the status code, which is an integer, meant to be machine readable, not human readable, which is used to give an overall status of the request.

The next step that you see in most (but not all!) APIs, is some way of controlling what the user can and cannot do. You could call it user validation or permissions, but it’s usually just called authorization and authentication.

In order to break down how to actually test these pieces, let’s start off with an example Python Flask API that we’ll be testing against:

from flask import Flask, jsonify, request, views app = Flask(__name__) class CustomFlaskAPI(views.MethodView): def get(self): headers = request.headers if headers.get("should_error", False): return jsonify({"error": "ERROR"}), 400 return jsonify({"a": "b"}), 200 def post(self): content_type = request.headers.get("Content-Type") if content_type != "application/json": return jsonify({ "error": "Unknown Content Type: {}".format(content_type) }), 415 return jsonify({}), 200 # Add Endpoints: app.add_url_rule('/api/', view_func=CustomFlaskAPI.as_view('api')) if __name__ == "__main__": app.run()
Code language: Python (python)

HTTP status codes are important

In an HTTP API, you’re always going to have one, and they can have a lot of value, as they offer a programmatic quick summary of the status of a request. However, remember that the developers behind each API determine what HTTP codes they send back, so sometimes a misleading or incorrect HTTP status code can be sent back. That’s good to keep in mind just because a common mistake is assuming that the OK status means everything went well.

An example of this, is that in many asynchronous processes over APIs, the OK status received from an API may just mean it received the request successfully, not that the API actually did anything with that data. The process could fail later down the line, and the HTTP status code won’t magically change to reflect that.

The first thing to know about HTTP status codes is that they are intended to be read by machines, so they’re really just numbers like 200, even though those numbers map to human readable values, like OK.

The really important thing to know about HTTP codes is that the range of 200299 means everything went well, 300399 means some kind of redirect occurred, 400499 means some kind of understandable error occurred, and 500+ means something, most likely catastrophic, happened. Personally, I love to reference HTTP Cats for what each code means, but there is also an official spec you can read.

So, how do we test for these status codes? 

Let’s first think about what kinds of tests you can build for a program. There are:

  • “Happy” paths, where everything goes well
  • “Sad” paths, where errors occur
  • “Edge” cases or “Corner” cases, which are scenarios that don’t occur very often

Let’s apply these different paths to our API status code testing. Let’s test that:

  • We get a 200 “OK” when everything goes likes we want it to
  • We get 400s when we have various errors that we expect

It’s worth mentioning that 500 errors are inherently errors that aren’t expected or handled, so if you can come up with a test case for a 500+ error then the error itself should probably be a 400-499 error instead.

import unittest from unittest import TestCase import requests # pip install requests class IntegrationTests(TestCase): def test_status_codes_200(self): result = requests.get("http://localhost:5000/api/") assert result.status_code == 200 def test_status_codes_400(self): result = requests.get( "http://localhost:5000/api/", headers={"should_error": "error"}, ) assert result.status_code == 400 def test_status_codes_404(self): result = requests.get("http://localhost:5000/does_not_exist/") assert result.status_code == 404 if __name__ == "__main__": unittest.main()
Code language: Python (python)

Headers are the metadata of requests

The next piece of an API request to focus on is the headers of your HTTP calls. Headers contain all types of metadata about the request itself. For example, a common header is the content-type which tells the requester what format the data is in (for example application/json means it’s in JSON). There are some standard headers, but custom ones can also be added so all of the above should be tested.

import unittest from unittest import TestCase import requests class IntegrationTests(TestCase): def test_header_invalid(self): result = requests.post( "http://localhost:5000/api/", headers = {'Content-Type': 'random/content'}, json = { "username": "fake", "password": "fakse", }, ) assert result.status_code == 415 assert result.headers.get("Content-Type") == "application/json" def test_header_correct(self): result = requests.post( "http://localhost:5000/api/", headers = {'Content-Type': 'application/json'}, json = { "username": "fake", "password": "fakse", }, ) assert result.status_code != 415 assert result.headers.get("Content-Type") == "application/json" if __name__ == "__main__": unittest.main()
Code language: Python (python)

ℹ️ A very important note about headers is that many of them are very easy to change, so if you have your API be dependant on them for things like permissions you could be leaving your API open to man in the middle attacks.

Authentication versus Authorization

Now that we have tests for our REST API’s status codes and headers, we want to test that a user can only do what we expect them to be able to do.

In the industry you’ll sometimes hear people say things about “auth”-ing or “auth”ed, but what does that mean? It is jargon for saying both authenticated and authorized. But what is the difference?

This boils down to two questions: Are you who you say you are, and are you allowed to perform the action you’re trying to do. In other words – are you really my housemate, and even if you are, are you really allowed in my room? These two questions are the core of authentication and authorization. In the vernacular they’re often used interchangeably, but they actually have separate meanings.

Authentication

Authentication is checking if you are who you say you are. In other words, when you log in to a website and you give it your username and password you are authenticating – saying you are a real user. Authentication is who you are.

import unittest from unittest import TestCase import requests import sqlalchemy from sqlalchemy.orm import sessionmaker from flask import Flask, jsonify, request, views app = Flask(__name__) # Connect to the DB and reflect metadata. engine = sqlalchemy.create_engine("postgresql://coderpad:@/coderpad?host=/tmp/postgresql/socket") connection = engine.connect() Session = sessionmaker(bind=engine) session = Session() metadata = sqlalchemy.MetaData() metadata.reflect(bind=engine) class CustomFlaskAPI(views.MethodView): def post(self): content_type = request.headers.get("Content-Type") if content_type != "application/json": return jsonify({ "error": "Unknown Content Type: {}".format(content_type) }), 415 username = request.json.get('username') password = request.json.get('password') # NEVER PASS THIS IN PLAIN TEXT if username is None or password is None: return jsonify({ "error": "Missing username or password", }), 400 users_table = metadata.tables["users"] result = connection.execute( users_table.select().where( (users_table.c.username == username) & (users_table.c.password == password) # Never actually store in plain text ) ).fetchall() if (len(result) <= 0): return jsonify({ "error": "Invalid username or password", }), 400 return jsonify({}), 200 # Add Endpoints: app.add_url_rule('/api/', view_func=CustomFlaskAPI.as_view('api')) class IntegrationTests(TestCase): def test_login(self): result = requests.post( "http://localhost:5000/api/", headers = {'Content-Type': 'application/json'}, json = { "username": "real_user", "password": "real_encrypted_password", }, ) assert result.status_code == 200 def test_invalid_login(self): result = requests.post( "http://localhost:5000/api/", headers = {'Content-Type': 'application/json'}, json = { "username": "fake_user", "password": "fake_encrypted_password", }, ) assert result.status_code == 400 if __name__ == "__main__": app.run() unittest.main()
Code language: Python (python)

Authorization

Authorization is when you then try to view your friend’s private profile. Because you are a real user that is marked as a friend you are allowed to see the profile. But a random other valid user may not be allowed to see it. Authorization is what you can do.

Another example of authorization would be if you’re an admin or not – if you’re an admin you may be authorized to do things like delete accounts, whereas a regular user is not allowed to do such things. A website has to constantly check if you’re authorized to do certain actions, while it may only check now and then that you are authenticated – that you are who you claim to be.

import unittest from unittest import TestCase import requests import sqlalchemy from sqlalchemy.orm import sessionmaker from flask import Flask, jsonify, request, views app = Flask(__name__) # Connect to the DB and reflect metadata. engine = sqlalchemy.create_engine("postgresql://coderpad:@/coderpad?host=/tmp/postgresql/socket") connection = engine.connect() Session = sessionmaker(bind=engine) session = Session() metadata = sqlalchemy.MetaData() metadata.reflect(bind=engine) class CustomFlaskAPI(views.MethodView): def patch(self): content_type = request.headers.get("Content-Type") if content_type != "application/json": return jsonify({ "error": "Unknown Content Type: {}".format(content_type) }), 415 # Token to show user has already Authenticated login_token = request.json.get('login_token') if login_token is None: return jsonify({ "error": "Unauthenticated Request", }), 400 token_table = metadata.tables["tokens"] result = connection.execute( token_table.select().where(token_table.c.token == login_token) ).fetchall() if len(result) <= 0: return jsonify({ "error": "Unauthenticated Request", }), 400 username = result[0] # Check if user is Authorized to make this change permissions_table = metadata.tables["permissions"] result = connection.execute( permissions_table.select().where( (permissions_table.c.username == username) & (permissions_table.c.patch_permission == True) ) ).fetchall() if len(result) <= 0: return jsonify({ "error": "Unauthorized Request", }), 400 return jsonify({}), 200 # Add Endpoints: app.add_url_rule('/api/', view_func=CustomFlaskAPI.as_view('api')) class IntegrationTests(TestCase): def test_login(self): result = requests.post( "http://localhost:5000/api/", headers = {'Content-Type': 'application/json'}, json = { "username": "real_user", "password": "real_encrypted_password", }, ) assert result.status_code == 200 def test_invalid_login(self): result = requests.post( "http://localhost:5000/api/", headers = {'Content-Type': 'application/json'}, json = { "username": "fake_user", "password": "fake_encrypted_password", }, ) assert result.status_code == 400 if __name__ == "__main__": app.run() unittest.main()
Code language: Python (python)
Permission levels

Testing what a user is authorized to do also includes testing the different levels of users, like how there are admins that can see different parts of websites than regular users. Or, like how a friend may be authorized to see your profile while a stranger may not be.

If you’re a valid, logged in user is either true or false (authenticated), but you may be allowed to view this profile but not that account. What exactly a user is authorized to do depends on the permissions unique to one user.

The primary way that permissions are doled out to each user account is using tokens. For example, an admin user may actually just be a user with all the tokens:

view_your_account, view_account_global, delete_account, add_account, post_comment, delete_your_comment, delete_comment_global

Whereas a non-admin user may only have:

view_your_account, view_friend_account, post_comment, delete_your_comment

Each user would have a set of tokens representing if a user should be allowed to do whatever they’re trying to do. For example, a non-admin user can only delete comments they’ve posted, while an admin account may be able to delete any comment.

These tokens are often then combined with other information. For example, when a non-admin user is attempting to view an account, if the account is a friend then it can be viewed, otherwise the account cannot be viewed.

In other words, the flow of logic may look like this:

def allowed_to_view(user, account_to_view): if (user.has_permission(“view_account_global”)): return True elif (user.has_permission(“view_your_account”) and user.account == account_to_view): return True elif (user.has_permission(“view_friend_account”) and account_to_view.user in user.friends): return True else: return False
Code language: Python (python)

In reality you’d want to condense the if statements, but I left it split out for clarity.

Always have input validation

After you’ve authenticated that the user is real and authorized that they’re allowed to do whatever they’re trying to do, you then need to check that you have all the information you need in order to perform that action. In other words, you need to check that the user has given you all the information you need. For example, if you are taking in a credit card number, but the user only gives you one digit then that’s invalid input, and your API should reject it.

Invalid input of all types is always something to test for. This is partially because this is a common way to attack applications. For example, a very common type of attack is an SQL Injection attack: this type of attack relies on an API putting user input directly into a database, without making sure it’s safe. The user embeds some kind of SQL into the user input, giving them the ability to do something they otherwise wouldn’t be allowed to do, and the API executes the command because the input wasn’t validated.

A common SQL Injection attack is, in the username field, entering ;DROP TABLE users. This, if not properly validated or cleaned, could lead to the deletion of user data. Ergo input validation is very important!

Common input validation test cases

Beyond testing that there is no sneaky SQL hidden in your user input, you also want to make sure to test the data types of given input and the bounds of that data (for example, the number of characters in the given string).

If you’re expecting an integer, you want to make sure the user gives you an integer. After all, if you don’t validate that the input is an integer, your code could do some funky things, including erroring out, if it acts on the assumption that the data is of the expected type. Running len won’t work on an integer (len(3)), leading to an exception, but running len on a string will (len(“string”)) work just fine.

There are a couple of standard tests to always remember when testing input that is an integer

  1. Positive numbers
  2. Decimal numbers (for example, 3.14)
  3. Negative numbers
  4. Data of different, unexpected forms, like strings, potentially like numbers in strings (so “3.14”)

Point number 4 – testing the data given the wrong data type is something you’ll always want to do.

Aside from all of that, user input is usually specified and validated in the code using a schema (description of what is expected), so your tests can re-use that same schema by running all your bogus input data through the schema and confirming it fails the way you expect it to.

import unittest from unittest import TestCase from flask import Flask, jsonify, request, views from marshmallow import Schema, fields, validate from flask_api_4 import CustomFlaskAPI from parameterized import parameterized app = Flask(__name__) class UserPostInput(Schema): name = fields.Str(required=True) children = fields.Int( required=True, strict=True, # Must be a whole number validate=validate.Range(min=1), # Must be positive ) class CustomFlaskAPI(views.MethodView): def post(self): errors = UserPostInput().validate(request.json) if len(errors) != 0: return jsonify({ "errors": errors, }), 400 return jsonify({}), 200 # Add Endpoints: app.add_url_rule('/api/', view_func=CustomFlaskAPI.as_view('api')) class UnitTests(TestCase): def test_valid_input(self): with app.test_request_context( "/api/", method="POST", json={ "name": "real_name", "children": 1, }, ): result, status_code = CustomFlaskAPI().post() assert status_code == 200 assert result.json == {} @parameterized.expand([ [1, False, 1, True], ["name", True, "not_a_number", False], ["name", True, -1, False], ["name", True, 3.14, False], ]) def test_invalid_input( self, name, valid_name, children, valid_children, ): with app.test_request_context( "/api/", method="POST", json={ "name": name, "children": children, }, ): result, status_code = CustomFlaskAPI().post() assert status_code == 400 errors = result.json.get("errors", {}) assert valid_name == ("name" not in errors) assert valid_children == ("children" not in errors) if __name__ == "__main__": app.run() unittest.main()
Code language: Python (python)

You should also be sure to test receiving more or fewer fields than you expect, as the user may forget something or add something you don’t need. Some APIs choose to ignore extra fields, and some choose to throw an error when receiving extra fields. You should write the test based on the expected behavior from your API.

Output validation enables testing

When your API is returning output to the user it’s less intuitive to do output validation; after all, your API made the output, so it’s not going to be wrong, right? Well, there are lots of reasons to do output validation. One of them harkens back to contract testing: during development, if you test what your API is returning, then you’re never going to be surprised at what the customer receives. And thus your customer is never going to be surprised, so you won’t break their code that calls your API. Happy customers mean good business. In other words, your tests can let you know if you ever accidentally change the output keys or data types coming from your API.

A more general reason is that having output validation is more testable; you have a schema to use as the correct output in your tests.

import unittest from unittest import TestCase from flask_api_5 import CustomFlaskAPI, PostOutput from flask import Flask, jsonify, request, views from marshmallow import Schema, fields, validate app = Flask(__name__) class PostOutput(Schema): answers = fields.Int( required=True, strict=True, # Must be a whole number validate=validate.Range(min=1), # Must be positive ) class CustomFlaskAPI(views.MethodView): def post(self): result = { "answers": request.json.get("children", 0) + 1, } errors = PostOutput().validate(result) if len(errors) != 0: # Errors instead of returning malformed data return jsonify({ "errors": errors, }), 400 return jsonify(result), 200 # Add Endpoints: app.add_url_rule('/api/', view_func=CustomFlaskAPI.as_view('api')) class UnitTests(TestCase): def test_valid_output(self): with flask_app.test_request_context( "/api/", method="POST", json={ "name": "real_name", "children": 1, }, ): result, status_code = CustomFlaskAPI().post() assert status_code == 200 # Can check result validity by hand assert result.json == { "answers": 2, } # OR can re-use schema here assert PostOutput().validate(result.json) == {} def test_invalid_output(self, children): with flask_app.test_request_context( "/api/", method="POST", json={ "name": "name", "children": -1, }, ): result, status_code = CustomFlaskAPI().post() assert status_code == 400 # Can check result validity by hand errors = result.json.get("errors", {}) assert "answers" in errors # OR can re-use schema here errors = PostOutput().validate(result.json) # Can re-use schema here assert "answers" in errors if __name__ == "__main__": app.run() unittest.main()
Code language: Python (python)

API testing has many facets

REST APIs have many moving pieces, but, just like with any other software development problem, they just need to be broken down into pieces to be effectively tested. REST API calls over HTTP are made up of status codes, headers, user input (further separated into request parameters and request body), and output (what the API returns).

You should also always consider if your API should have some kind of security – and if it does you’ll need both authentication and authorization.

💡 Don’t forget to utilize both unit tests and integration tests to make sure you thoroughly test all parts of your API.

And, after you’ve gone through this exercise and identified all your dependencies, it might be a good time to hop over to CoderPad’s blog on open source dependency security to make sure you aren’t introducing security risks into your API.

Jennifer is a full stack developer with a passion for all areas of software development. He loves being a polyglot of programming languages and teaching others what he’s learned.