GDScript Priority Queue Benchmarks

Last week I was working on implementing graph search algorithms, specifically Dijkstra’s Algorithm and A*, for my game in Godot. These algorithms require a priority queue to be able to sort and pop out the node with the lowest cost from the set of nodes in the frontier. The recommendation for best performance is to use a binary heap, so I wrote an implementation of that data structure in GDScript after studying the Wikipedia article about how it works.

extends Reference
class_name PriorityQueue
"""
Priority Queue. Min heap priority queue that can take a Vector2 and its
corresponding cost and then always return the Vector2 in it with
the lowest cost value.
Based on: https://en.wikipedia.org/wiki/Binary_heap
"""
var _data: Array = []

func insert(element: Vector2, cost: float) -> void:
	# Add the element to the bottom level of the heap at the leftmost open 
        # space
	self._data.push_back(Vector3(element.x, element.y, cost))
	var new_element_index: int = self._data.size() - 1
	self._up_heap(new_element_index)

func extract():
	if self.empty():
		return null
	var result: Vector3 = self._data.pop_front()
	# If the tree is not empty, replace the root of the heap with the last
	# element on the last level.
	if not self.empty():
		self._data.push_front(self._data.pop_back())
		self._down_heap(0)
	return Vector2(result.x, result.y)

func empty() -> bool:
	return self._data.empty()

func _get_parent(index: int) -> int:
	# warning-ignore:integer_division
	return (index - 1) / 2

func _left_child(index: int) -> int:
	return (2 * index) + 1

func _right_child(index: int) -> int:
	return (2 * index) +  2

func _swap(a_idx: int, b_idx: int) -> void:
	var a = self._data[a_idx]
	var b = self._data[b_idx]
	self._data[a_idx] = b
	self._data[b_idx] = a

func _up_heap(index: int) -> void:
	# Compare the added element with its parent; if they are in the correct 
        # order, stop.
	var parent_idx = self._get_parent(index)
	if self._data[index].z >= self._data[parent_idx].z:
		return
	self._swap(index, parent_idx)
	self._up_heap(parent_idx)

func _down_heap(index: int) -> void:
	var left_idx: int = self._left_child(index)
	var right_idx: int = self._right_child(index)
	var smallest: int = index
	var size: int = self._data.size()

	if right_idx < size and self._data[right_idx].z < self._data[smallest].z:
		smallest = right_idx

	if left_idx < size and self._data[left_idx].z < self._data[smallest].z:
		smallest = left_idx

	if smallest != index:
		self._swap(index, smallest)
		self._down_heap(smallest)

My implementation takes a Vector2 and a cost. Internally I convert both to a Vector3, with the z value representing the cost. This makes the elements easier to move around.

To use this Priority Queue, just instantiate a new queue using the new() method from the Reference object, then insert() Vector2s with costs. After that, calling the extract() method gives you the Vector2 in it with the lowest cost.

func _ready() -> void:
	var binary_heap: Reference = PriorityQueue.new()
	binary_heap.insert(Vector2(42,42), 12)
	binary_heap.insert(Vector2(3,3), 5)
	binary_heap.insert(Vector2(1,1), 8)
	print("The vector with the lowest cost is: ", binary_heap.extract())

--- Debugging process started ---
Godot Engine v3.4.5.stable.official.f9ac000d5 - https://godotengine.org
OpenGL ES 3.0 Renderer: Mesa Intel(R) Graphics (ADL GT2)
OpenGL ES Batching: ON
 
The vector with the lowest cost is: (3, 3)

However, I wondered how much more performance this data structure from 1964 was giving me versus a naive implementation that I, a non-computer-scientist regular-joe-programmer, could come up with on a Sunday morning. So I wrote a naive implementation that just takes in Vector2s and then sorts the data on extract().

extends Reference
class_name NaivePriorityQueue
"""
Naive Priority Queue. Naive implementations of a priority queue that can 
take a Vector2 and its corresponding cost and then always return the
Vector2 in it with the lowest cost value.
"""
var _data: Array = []
var _unsorted: bool = true

func insert(element: Vector2, cost: float) -> void:
	# Add the element to the end of the array
	self._data.append(Vector3(element.x, element.y, cost))
	# Then mark the data as unsorted
	self._unsorted = true

func extract():
	if self.empty():
		return null
	if self._unsorted:
		self._sort()
	var result: Vector3 = self._data.pop_front()
	return Vector2(result.x, result.y)

func empty() -> bool:
	return self._data.empty()

func _sort() -> void:
	self._data.sort_custom(self, "_vector3_sorter")
	self._unsorted = false

func _vector3_sorter(a: Vector3, b: Vector3) -> bool:
	if a.z < b.z:
		return true
	return false

Then I wrote a test that roughly simulates how a priority queue would be used by Dijkstra’s Algorithm or A* as a frontier. Specifically, extract a node from the queue after exploring the previous node’s neighbors in four directions.

func test(vectors: Array, queue: Reference) -> void:
	# Insert the vectors into the queue, but every 4 inserts,
	# extract the vector with the lowest cost. This roughly 
	# simulates how Dijkstra's Algorithm or A* would use a
	# queue as a frontier.
	var counter: int = 0
	for v in vectors:
		if counter == 4:
			queue.extract()
			counter = 0
		queue.insert(v, v.x)
		counter += 1

Next, I wrote a simple benchmark which uses GDScript’s built in OS.get_ticks_msec() to get the amount of time passed in milliseconds since the engine started. First I got the current tick, instantiated the queue to be tested, ran the test, and got the tick again. The difference between the first tick and the tick taken after the test completed was the amount of time in milliseconds it took the queue to process the test. Dividing that value by 1000 gave me the time in seconds.

Below is the full benchmarking and test code:

extends Node2D

func _ready() -> void:
	run_benchmark()

func run_benchmark() -> void:
	var rng = RandomNumberGenerator.new()
	rng.randomize()
	# Generate random vectors
	var n: int = 10000
	var random_vectors: Array = []
	for i in n:
		# Generate a float randomly
		var random_f: float = float(rng.randi_range(0, n))
		# To keep things simple, just make the x,y and cost all the same
		var element: Vector2 = Vector2(random_f, random_f)
		random_vectors.append(element)

	# Test Min Heap
	var time_start: int = OS.get_ticks_msec()
	var binary_heap: Reference = PriorityQueue.new()
	test(random_vectors, binary_heap)
	var bh_elapsed_time: float = (OS.get_ticks_msec() - time_start)/1000.0
	print("Min Heap Queue took %s seconds" % bh_elapsed_time)

	# Test Naive Implementation
	time_start = OS.get_ticks_msec()
	var naive_priority_queue: Reference = NaivePriorityQueue.new()
	test(random_vectors, naive_priority_queue)
	var naive_elapsed_time: float = (OS.get_ticks_msec() - time_start)/1000.0
	print("Naive Queue took %s seconds" % naive_elapsed_time)

	var percent_diff: float = (((naive_elapsed_time/bh_elapsed_time)-1)*100)
	print("Percentage difference: %s%%" % percent_diff)

func test(vectors: Array, queue: Reference) -> void:
	# Insert the vectors into the queue, but every 4 inserts,
	# extract the vector with the lowest cost. This roughly 
	# simulates how Dijkstra's Algorithm or A* would use a
	# queue as a frontier.
	var counter: int = 0
	for v in vectors:
		if counter == 4:
			queue.extract()
			counter = 0
		queue.insert(v, v.x)
		counter += 1

To run the benchmark, I attached this script to my main scene and pressed F5 to start it.

--- Debugging process started ---
Godot Engine v3.4.5.stable.official.f9ac000d5 - https://godotengine.org
OpenGL ES 3.0 Renderer: Mesa Intel(R) Graphics (ADL GT2)
OpenGL ES Batching: ON
 
Binary Heap Queue took 0.108 seconds
Naive Queue took 58.736 seconds
Percentage difference: 54285.185185%

With 10,000 randomly generated vectors and costs, the difference between the binary heap implementation and the naive implementation was huge. It only took the binary heap implementation 0.11 seconds to finish the test while my naive implementation took 58.74 seconds, or 54,000% more time. I ran the benchmark a few more times, and the results were similar to the values above. It always took the naive implementation around 58 seconds to complete the test while the binary heap did it in a little over a tenth of a second. Amazing.

I’ve always been awed by computer science algorithms, data structures, and the genius computer scientists who discovered them. These algorithms and data structures give us a window into the awe-inspiring underlying logic of the universe. It was fun to learn about binary heaps and experiment with them last week because I got another chance to look through that window, and marvel.