In this blog post, we’ll tackle the following puzzle:
Given an array of integers, count the number of continuous subsequences, such that elements of every subsequence are arranged in strictly increasing order.
The optimal solution to this puzzle is to use the dynamic programming (DP) technique. But, in order to apply this technique, we first need to express the solution through a recurrent formula. So, I will start first by expressing it in Haskell, and then translate the implementation to PHP.
Examples
Before we try to express the problem in terms of a recurrent formula, it’d be good to look at a few examples.
Example 1: Consider the array [0, 1, 2, 3, 15]. We can construct the following continuous subsequences such that all elements are in strictly increasing order:
[0], [1], [2], [3], [15] – This list is easy, just the singleton of each element. But, in addition, we have: [0, 1], [0, 1, 2], [0, 1, 2, 3], [0, 1, 2, 3, 15], where the starting element is 0. And, also: [1, 2], [1, 2, 3], [1, 2, 3, 15], … and then [2, 3], [2, 3, 15], [3, 15]… which adds to exactly 15 different subsequences.
Example 2: Considering the array [10, 10, 10] we end up with [[10], [10], [10]] – a rather trivial example.
Mathematical expression (Haskell)
This is probably the hardest bit of every DP challenge – coming up with the correct recurrent formula. I don’t know of an easy way to find a formula fast, other than that it is usually a trial & error process – mostly playing around with the examples might give you a hint as to what the formula should look like.
After a while, I figured that the function works pretty well, where
is an array and
is a list of previous elements (if any). Thus,
will represent the current calculation, given as state an array of numbers and another array of previous elements.
Implementation #1
Instead of counting, we’ll be building the actual lists – this might help with debugging to see if we actually got the right subsequences constructed. After you figured the original idea out, transforming from building a list to counting numbers is easy, because both operations form a Monoid: .
Let’s start with the most trivial cases – whenever either the list of numbers or the list of previous elements is empty, just return the current known elements:
-- in the case of empty array, just return the
-- previous elements
f [] prev = [prev]
-- a singleton array is same as returning [[x]]
f [x] [] = f [] [x]
This makes sense, because the subsequences of any singleton list is just that singleton itself (recall the [10, 10, 10] example).
Next, in case there are no previous elements to work with, we just start building the tree by both including and excluding the current element.
-- calculate the subsequences of both x included and
-- excluded
f (x:xs) [] = f xs [x] ++ f xs []
Why are we doing this? Consider f [1, 2] []. We need to process both f [2] [1] and f [2] []. In the first case, the 1 needs to be included in the list of previous elements for it to be considered at least as a singleton. In the second case, we test the assumption that it is smaller than the processing element (2 in this case) in attempt to include the subsequence [1, 2].
This reasoning brings us to the next definition:
-- if the last element in the previous list is smaller
-- than the current one, include the previous list
-- and in addition the previous list with the current
-- element combined
f (x:xs) prev | last prev < x =
f [] prev ++ f xs (prev ++ [x])
In the case the last element in the previous list is smaller than the current processing element, we continue constructing the (continuous) subsequence (e.g. [1, 2, ...]), but also including the current state of previous list, because it’s a valid subsequence (e.g. [1, 2]).
And the final pattern match, in case the last element in the previous list is not smaller than the current processing element:
-- otherwise just return the previous list
f (x:xs) prev = f [] prev
Trying it out:
> f [0, 1, 2, 3, 15] []
[[0],[0,1],[0,1,2],[0,1,2,3],[0,1,2,3,15],[1],[1,2],[1,2,3],[1,2,3,15],[2],[2,3],[2,3,15],[3],[3,15],[15]]
> length $ f [0, 1, 2, 3, 15] []
15
Looks correct!
Implementation #2
Given what we now have, we can easily move to numbers by replacing [prev] with 1 (the base case) and ++ with + (numbers instead of lists):
g [] prev = 1
g [x] [] = g [] [x]
g (x:xs) [] = g xs [x] + g xs []
g (x:xs) prev | last prev < x = g [] prev + g xs (prev ++ [x])
g (x:xs) prev = g [] prev
In this case, we are just counting and not showing any subsequences, so it kinda makes to make prev be the last number, instead of a list of numbers, but that’s fine.
Trying it out:
> g [0, 1, 2, 3, 15] []
15
Now that we have an idea of how this problem really works, we can move to our favourite non-FP language: PHP.
Implementation in PHP
We’ll do this in an iterative fashion; we’ll start with a direct translation from Haskell to PHP:
function countDecreasingSubarrays( $arr, $prev = [] ) {
// In the case of empty array, just return the previous element (1)
if ( empty( $arr ) || ( 1 == count( $arr ) && empty( $prev ) ) ) {
return 1;
}
// Process the subsequences with both current element included and excluded
if ( empty( $prev ) ) {
return countDecreasingSubarrays( array_slice( $arr, 1 ), [] )
+ countDecreasingSubarrays( array_slice( $arr, 1 ), [ $arr[0] ] );
}
// Prev is non empty and smaller, so we have a count and other stuff to process with it included
if ( end( $prev ) < $arr[0] ) {
return 1 + countDecreasingSubarrays( array_slice( $arr, 1 ), array_merge( $prev, [ $arr[0] ] ) );
}
return 1;
}
While this function works (you can try it out), it’s not the most optimal one. We keep slicing an array (which is expensive), so we can improve it by modifying the state space from $prev to $i, $prev where $i will represent the current processing index:
function countDecreasingSubarrays_2( $arr, $i = 0, $prev = null ) {
// In the case of empty array, just return the previous element (1)
if ( ! isset( $arr[ $i ] ) || ( $i + 1 == count( $arr ) && is_null( $prev ) ) ) {
return 1;
}
// Process the subsequences with both current element included and excluded
if ( is_null( $prev ) ) {
return countDecreasingSubarrays_2( $arr, $i + 1, null )
+ countDecreasingSubarrays_2( $arr, $i + 1, $arr[ $i ] );
}
// Prev is non empty and smaller, so we have a count and other stuff to process with it included
if ( $prev < $arr[ $i ] ) {
return 1 + countDecreasingSubarrays_2( $arr, $i + 1, $arr[ $i ] );
}
return 1;
}
Finally, we can implement memoization for this function, but I’ll leave it at that. Trying it out:
php > $x = [0, 1, 2, 3, 15];
php > var_dump( countDecreasingSubarrays_3( $x ) ); // 15
int(15)
php > $x = [10, 10, 10];
php > var_dump( countDecreasingSubarrays_3( $x ) ); // 3
int(3)
php > $x = [9, 8, 7, 6, 5];
php > var_dump( countDecreasingSubarrays_3( $x ) ); // 5
int(5)
Conclusion
As stated and as we saw, once we got the right recurrent formula and we were able to express it clearly using the language of mathematics (Haskell), it was pretty easy to convert it to a imperative programming language.
Does it work the other way around (converting from imperative to mathematical)? Probably, but with more effort, at least for me, personally.
But, why bother translating to PHP, or any imperative language for that matter? One thing I found about Haskell is that, while it’s great at expressing mathematical stuff, it’s not so easy (at least for me) to optimize functions. It probably requires more in-depth knowledge around how things work internally, but you could say the same for PHP or any imperative language, I guess. It’s probably that I just have more experience with dealing with algorithms and optimizations in these languages.
In any case, you should now go and learn some Haskell! 🙂