Captain Comic randomizer seed algorithm

David Fifield <david@bamsoftware.com>

Last modified:

Video game randomizers typically work by seeding a pseudorandom number generator, then calling into the PRNG whenever the randomization algorithm needs a random number. The same seed always results in the same randomized game, as long as the randomization algorithm makes the same calls to the PRNG in the same order.

The seed in the Captain Comic randomizer works differently. Rather than being the input for a PRNG, the seed directly encodes the item assignment. For every seed, there is one item assignment, and for every item assignment, there is one seed.

There's nothing wrong with the PRNG approach. The seed algorithm used in the Captain Comic randomizer is only possible because there's a fairly small space of randomized games—only about 47 bits. In more complicated games, a seed that directly encodes all the randomized parameters would be unreasonably long. The technique used in the Captain Comic randomizer has a couple of nice properties, though. First, it guarantees that the probability distribution over randomized item assignments is uniform: when you find an item in a certain location, you don't gain any information from that discovery beyond what can be inferred from the logic graph. In a PRNG-based algorithm, not all item assignments are equally likely, so you could (at least in theory) narrow down the set of possible seeds every time you see an item by eliminating the seeds that don't put that item at that location, and after observing a few items have a conditional probability for the remaining items that is far from uniform. Second, it means there is a seed for any item assignment you could want. For example there is a seed—225A396DE0DC8—that yields the vanilla item assignment; I would guess that most randomizers don't have that property.

Episode 10 of the live coding videos is devoted to elaborating the seed algorithm. Episode 32 is about making the seed representation less transparent.

Short description of the algorithm

The implementation of the algorithm is the function multiset_unrank in src/seed.rs of the source code.

The subroutine combination_unrank:

Background

There are 24 stages in The Adventures of Captain Comic, with one item slot per stage.

There are 24 items in the game, not all distinct. There are 8 unique items (Corkscrew, Door Key, Boots, Lantern, Teleport Wand, Gems, Crown, Gold), 5 Blastola Colas, and 11 Shields.

corkscrew key boots lantern wand gems crown gold cola cola cola cola cola shield shield shield shield shield shield shield shield shield shield shield

The items with their multiplicities form a multiset. The output of the randomizer is a permutation of this multiset; i.e., an assignment of one item to each of the 24 item slots. We are looking for a way to assign a number to each distinct permutation. That number is called a rank and the process of recovering a permutation from a rank is unranking. A seed encodes a rank which determines the item assignment.

Warm-up: factorial number system

There's a straightforward way to rank and unrank permutations, the factorial number system.

Let's try unranking an item assignment from the rank r = 123456789012345678901234. We start with 24 empty item slots, numbered 0 to 23.

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

To place the Corkscrew, we take r mod 24, which will be a number between 0 and 23. In this case, 123456789012345678901234 mod 24 = 10, so we put the Corkscrew in item slot 10. Then we renumber the 23 remaining slots from 0 to 22, and divide out the Corkscrew's contribution to the rank by dividing r by 24 and rounding down. The new r = 123456789012345678901234 / 24 = 5144032875514403287551.

0 1 2 3 4 5 6 7 8 9 corkscrew 10 11 12 13 14 15 16 17 18 19 20 21 22

The next item to place is the Door Key, and there are 23 slots it might go in. Now we take r mod 23 and find 5144032875514403287551 mod 23 = 16, so the Door Key goes in the item slot that is now numbered 16. We divide r by 23 to get the new r = 5144032875514403287551 / 23 = 223653603283234925545. We renumber the 22 remaining slots from 0 to 21.

0 1 2 3 4 5 6 7 8 9 corkscrew 10 11 12 13 14 15 key 16 17 18 19 20 21

We repeat this process for every item. The assigned slot refers to the renumbered slots at each step; e.g. when the assigned slot is 0, it means to choose the leftmost slot that has not yet been assigned.

item r # slots assigned slot
corkscrew 123456789012345678901234 24 10
key 5144032875514403287551 23 16
boots 223653603283234925545 22 19
lantern 10166072876510678433 21 12
wand 484098708405270401 20 1
gems 24204935420263520 19 13
crown 1273943969487553 18 13
gold 70774664971530 17 10
cola 4163215586560 16 0
cola 260200974160 15 10
cola 17346731610 14 12
cola 1239052257 13 1
cola 95311712 12 8
shield 7942642 11 4
shield 722058 10 8
shield 72205 9 7
shield 8022 8 6
shield 1002 7 1
shield 143 6 5
shield 23 5 3
shield 4 4 0
shield 1 3 1
shield 0 2 0
shield 0 1 0

Here is the item assignment that results from the initial rank r = 123456789012345678901234:

cola wand shield cola shield shield shield shield shield shield corkscrew shield gold lantern cola cola gems key crown shield shield boots cola shield

The problem with using the factorial number system for unranking a seed is that it treats all the items as distinct—and they are not. There are multiple ranks that result in the same permutation, because the algorithm thinks each Blastola Cola is different from the other Blastola Colas, and each Shield is different from the other Shields. Because of this multiple counting, seeds are longer than they need to be: the number of seeds is 24! = 620448401733239439360000 ≈ 79 bits. Next, we will see that by not multiply counting indistinguishable permutations, we can reduce the size of the seed space to 24! / 5! / 11! = 129529505064960 ≈ 47 bits.

Permutations of a multiset

Only a small change is needed to make the above algorithm work for a multiset. Instead of assigning one item at a time, we will assign one group of distinct items at a time. We'll start by assigning the 8 unique items—up to this point the algorithm is unchanged. When we come to the point that we need to assign the 5 Blastola Colas into 16 remaining slots, we use what's left of r to select one of the 165 = 4368 possible combinations of 16 items taken 5 at a time. We convert an integer into a combination using the combinatorial number system, which provides an unrank function for combinations. I won't go into detail on the combinatorial number system, but you can see the combination_unrank subroutine in the algorithm description.

We'll work through an example of unranking r = 123456789012345. We start with 24 empty item slots, numbered 0 to 23.

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

We handle each unique item as a "group" of 1 item. There are 241 = 24 ways to choose 1 slot from 24 slots. To place the Corkscrew, we take r mod 24 = 9. We place the Corkscrew in item slot 9, renumber the remaining slots from 0 to 22, and divide out the Corkscrew's contribution to the rank by dividing r by 24 and rounding down. The new r = 123456789012345 / 24 = 5144032875514.

0 1 2 3 4 5 6 7 8 corkscrew 9 10 11 12 13 14 15 16 17 18 19 20 21 22

The process continues like this for all 8 unique items. Up to this point, the algorithm has been the same as the one using the factorial number system.

0 1 2 gold 3 key 4 5 6 corkscrew lantern wand 7 boots 8 9 crown 10 11 12 13 14 15 gems

Now r = 4163, and we have to assign the 5 Blastola Colas into 16 remaining slots. The number of ways to do the assignment is 165 = 4368. We compute r mod 4368 = 4163, and therefore call combination_unrank(4163, 5) to get [15, 14, 10, 9, 3], and assign Blastola Colas to those five slots. Then we renumber the remaining slots and set r = 4163 / 4368 = 0 (rounded down).

0 1 2 gold cola key 3 4 5 corkscrew lantern wand 6 boots 7 cola crown cola 8 9 10 cola cola gems

Now we have to assign the final item group of 11 Shields to 11 remaining slots. 1111 = 1 tells us there is only one way to do that. combination_unrank(0, 11) gives us that single combination, [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]. We assign the Shields to those slots and we are done.

shield shield shield gold cola key shield shield shield corkscrew lantern wand shield boots shield cola crown cola shield shield shield cola cola gems

Here is the trace of unranking 123456789012345.

item group r # combinations assigned slots
corkscrew 123456789012345 24 [9]
key 5144032875514 23 [5]
boots 223653603283 22 [11]
lantern 10166072876 21 [8]
wand 484098708 20 [8]
gems 24204935 19 [18]
crown 1273943 18 [11]
gold 70774 17 [3]
cola 4163 4368 [15, 14, 10, 9, 3]
11×shield 0 1 [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

The number of distinct item assignments, and therefore the number of ranks needed to represent them all, is 24! / 5! / 11! = 129529505064960. Another way to understand this number is as 241 × 231 × 221 × 211 × 201 × 191 × 181 × 171 × 165 × 1111.

Seed encoding

A rank is an integer; for input to the program we have to encode it as a string. I defined two seed encodings, which are distinguished by their first character.

1XXXXXXXXXXXX

Seeds that start with 1 directly encode a rank as 12 hexadecimal digits.

For example, the seed 1070032C6453A contains hexadecimal 070032C6453A which represents the rank 7697433249082.

2XXXXXXXXXXXX

Seeds that start with 2 are obfuscated. Like the 1 format, there are 12 hexadecimal digits representing an integer n. To recover the rank, you take r = (43n mod 129529505064961) − 1.

For example, the seed 225A396DE0DC8 contains hexadecimal 25A396DE0DC8 which represents n = 41384541031880. Then (43n mod 129529505064961) − 1 = 7697433249082 recovers the rank.

Originally, I only had the 1-format seeds. But encoding the rank so directly makes it possible to mentally calculate some information about the item assignment. To find the location of the Corkscrew, you just have to take the rank mod 24, which is just about possible to do mentally when the rank is represented as a hexadecimal numeral. The number of distinct item assignments is x = 129529505064960, and x + 1 just happens to be a prime number, which means the multiplicative group of integers mod x + 1 is a cyclic group. 43 is a generator of this group, so 43n for 0 ≤ n < x gives a permutation of the integers between 1 and x + 1 inclusive. From there we subtract 1 to put it back into the range 0 to x − 1.

The randomizer now generates 2-format seeds by default, but it still accepts 1-format seeds. So that's your reward for reading this page, you know that the randomizer accepts an additional seed format not documented on the main page. Episode 32 of the live coding videos covers the implementation of 2-format seeds.

If you want the 2-format seed for a specific item assignment, you have to do a discrete log, but a discrete log on 48 bits is no big deal. I got the rank of the vanilla item assignment, 7697433249082, by running the algorithm in reverse. Then I converted it to a 2-format seed as follows:

sage: x = 129529505064960
sage: K = GF(x+1, 'b')
sage: K
Finite Field of size 129529505064961
sage: b = K.multiplicative_generator()
sage: b
43
sage: b.multiplicative_order()
129529505064960
sage: r = 7697433249082
sage: "%012X" % r
'070032C6453A'
sage: "%012X" % discrete_log(r+1, b, x)
'25A396DE0DC8'

References

Appendix: A lexicographic algorithm

While developing the randomizer, I found that the rank and unrank methods of Permutations_mset in SageMath 9.0 used an inefficient algorithm (inherited from EnumeratedSets.rank and EnumeratedSets.unrank). unrank(r), for example, works by enumerating every permutation starting from 0 and taking the rth one. I wanted to improve them; however Sage's functions are specified to number permutations in lexicographic order, and the algorithm I developed for the randomizer produces an order that is not lexicographic, so it's not a drop-in replacement. After some more research I came up with more efficient algorithms suitable for Sage. The changes are in Sage ticket #29942 and became part of Sage 9.2.