Tuply singleton

I noticed this while at a company team meetup. Whenever we were discussing which place we should eat food at, most of my coworkers were paying attention to two numbers: rating and number of votes.

For example, Restaurant #1 can have a 5.0-star rating with 1 vote, and Restaurant #2 can have a 3.9-star rating with 17 votes.

This metric makes a tuple, where the first element is the rating and the second element is the vote count. It is represented by, e.g. (3.9, 17) in Haskell (and is of type (Double, Int)), or [3.9, 17] (or { fst: 3.9, snd: 17 }) in JavaScript.

So we need two elements (or numbers) to capture this metric. What is a good way to combine them in a single element, such that both can be extracted at any time?

In Haskell terms, fst and snd are of type (a, b) -> a and (a, b) -> b respectively. Since we only care about numbers in this case, we’ll make our lives easier, and essentially we’re looking for fst :: Double -> Double and snd :: Double -> Double.

What is a good way to encode this? Let’s see. Consider we have a prime number p together with some numbers x_1 \in \mathbb{R}, x_2 \in \mathbb{N} – our metric is now (x_1, x_2).

We know for sure that x_1 \cdot p^{x_2} makes a single number – the argument is as follows: number powered to a number makes a number, and a number multiplied by a number makes a number.

So using this information we can make a single number out of two numbers – that is easy. The trickier bit is extracting two numbers (correctly) out of a single number. Will the way we used work? Let’s check. Remember our number was x_1 \cdot p^{x_2}, which is really just x_1 \cdot \underbrace{p \cdot p \cdot \ldots \cdot p}_{x_2 \ \text{times}}.

To decode we just have to use division. The question is, what do we divide, and can we be sure that it will always produce the correct result? Well, first of all, we have to be careful about which prime number we pick.

We know that 0 \leq x_1 \leq 5, so if we pick 5 as a prime number, and we’re dealing with 5-star ratings then the encoding will break since we will have a number like 5 \cdot 5 \cdot ... \cdot 5 so we can’t really extract any meaningful data out of it⭐️. So we need to make sure that x_1 < p. Let's consider p = 7: x_1 \cdot \underbrace{7 \cdot 7 \cdot \ldots \cdot 7}_{x_2 \ \text{times}}. This is better.

To extract x_1 we have to keep dividing our number with 7 until it is less than or equal to 5 and count the number of times we did that. For example if we had x_1 \cdot 7^5, we divide that number by 7 just 5 times and then we reach \leq 5, since 0 \leq x_1 \leq 5. This gives our first element.

Now that we have the first element, the second element will be easy. To extract x_2 out of x_1 \cdot 7^{x_2}, we divide the number by the first element (which we already know how to get), and then use logarithm with base 7 to get the x_2 out of 7^{x_2}.

Finally, why is it important that p is a prime number? Well, if it wasn't prime we couldn't really extract it. Assume p is a composite number in x_1 \cdot p^{x_2}. This means that in some cases p may have some factors equal to x_1 – e.g. p = 8 = 4 \cdot 4 and x_1 = 4). Earlier we already⭐️ showed why this breaks the encoding.

With all this information, let's write the code in Haskell:

 
mkPair :: Double -> Double -> Double
mkPair a b = if a > 5 || a < 0 then error "Out of bounds" else a * (7 ** b)

fst' :: Double -> Double
fst' x = if x > 5 then fst' (x / 7) else x

snd' :: Double -> Double
snd' x = logBase 7 (x / fst' x)

Some tests:

> let p = mkPair 5.0 1 in printf "Pair: %f, First: %f, Second: %f\n" p (fst' p) (snd' p)
Pair: 35.0, First: 5.0, Second: 1.0
> let p = mkPair 3.9 17 in printf "Pair: %f, First: %f, Second: %f\n" p (fst' p) (snd' p)
Pair: 907259004550107.3, First: 3.9, Second: 17.0

Of course, it may break for some numbers:

> let p = mkPair 4.9 10 in printf "Pair: %f, First: %f, Second: %f\n" p (fst' p) (snd' p)
Pair: 1384128720.1000001, First: 4.8999999999999995, Second: 10.000000000000002

But that's only due to computers' limitation in terms of floating-point numbers, not our mathematical logic.

This concept is very similar to Gödel numbering and FRACTRAN.

7 thoughts on “Tuply singleton

  1. Are you sure, that you need a prime number in this special case? I know it’s often important in encodings. But in this case I think it doesn’t matter as long as p is bigger than 5. Can you give an example of an ambiguous encoding with p=8?

    Liked by 1 person

Leave a comment