%% -*- LaTeX -*- \documentclass{tmr} %% This file is simultaneously %% (a) a valid Haskell source file, %% (b) a valid LaTeX source file, and %% (c) valid input to the lhs2TeX program. %% Is that neat, or what? %include lhs2TeX.fmt %options ghci % possible packages to fix URL formatting: urlbst, breakurl? \title{Generating Multiset Partitions} \author{Brent Yorgey\email{byorgey@@gmail.com}} \newcommand{\N}{\mathbb{N}} \newcommand{\term}[1]{\emph{#1}} \newcommand{\bell}[1]{\varpi_{#1}} \newcommand{\vleq}{\trianglelefteq} \newcommand{\nvleq}{\ntrianglelefteq} \newcommand{\vgeq}{\trianglerighteq} \begin{document} \begin{introduction} \emph{Multisets} are a generalization of sets that allow elements to occur more than once. In this article, I present a general algorithmic framework in Haskell for generating the unique \emph{partitions} of multisets using time linear in the number of partitions. The framework is based on the key observation that multisets can be represented using vectors in $\N^n$. It is general enough that it also lends itself to efficient computation of a number of related combinatorial functions, including vector partitions, integer partitions, and power sets. This article is literate Haskell; the reader is encouraged to load it into their favorite interpreter and experiment while reading along. Every example shown in the text was actually computed by the listed code, through the magic of lhs2TeX~\cite{lhs2TeX}. % Also, wow, does the % italic numeral `2' have a fantastic curlicue, or what? 2 2 2. %if style == newcode \begin{code} module Partitions where import Data.List import Control.Monad import Control.Arrow import Test.QuickCheck \end{code} %endif \end{introduction} \section{Introduction} A few months ago, working on a Project Euler~\cite{euler} problem (\#159, to be precise), I wanted to write a function to generate all factorizations of a given integer $n \geq 2$. A factorization of $n$ is simply a nonempty list of integers, all greater than $1$, whose product is $n$. For example, $30$ has five distinct factorizations: $30 = 15 \times 2 = 10 \times 3 = 6 \times 5 = 5 \times 3 \times 2$. (Note that we include the trivial factorization $30 = 30$ and consider factorizations equal up to permutation of the factors, e.g.\ $15 \times 2$ and $2 \times 15$ are the same factorization.) How can we compute all factorizations of a given integer efficiently? Consider representing integers by multisets of their prime factors. For example, $30 = \{2,3,5\}$, and $12 = \{2,2,3\}$. We must use multisets of prime factors rather than sets, since we want to be able to distinguish between, say, $6 = \{2,3\}$ and $12 = \{2,2,3\}$. Given an integer $n$ and its multiset of prime factors, a factorization of $n$ corresponds to a particular grouping of the prime factors into subsets. For example, for $n=30$, the factorization $10 \times 3$ corresponds to the grouping $\{\{2,5\}, \{3\}\}$. In general, such a grouping into subsets is known as a \term{partition}. Formally, a partition of a multiset $M$, or \term{multipartition}, is a collection of sub-multisets (usually just referred to as subsets) $T_i \subseteq M$ for which \begin{equation*} \biguplus T_i = M, \end{equation*} where $\uplus$ denotes multiset union; for example, $\{1,2\} \uplus \{1,3\} = \{1,1,2,3\}$. As an example, we can list all the multipartitions of $\{2,2,3\}$: \[ \{\{\{2,2,3\}\}, \{\{2,2\}, \{3\}\}, \{\{2\}, \{2,3\}\}, \{\{2\}, \{2\}, \{3\}\}\}. \] These correspond to the factorizations $12 = 4 \times 3 = 2 \times 6 = 2 \times 2 \times 3$, and in general it's not hard to see that factorizations of an integer exactly correspond to partitions of its multiset of prime factors. So, I set out to write some code for generating multiset partitions. This turned out to be rather tricky, and it took me a few days to come up with a good solution. The solution I eventually found, however, ended up being quite general, and serves as a nice case study in the expressiveness of Haskell. I found I was largely able to think ``through'' the language directly about the problem, without having to do too much translation between my head and the code. (Or perhaps this only means that I think in Haskell, but that wouldn't be such a bad thing either.) Algorithms for generating multiset partitions have been published previously, by, for example, Knuth~\cite{TAOCP} (who also notes the correspondence between multisets and vectors in $\N^n$). However, Knuth's implementation uses an imperative paradigm. To my knowledge there have been no published treatments of the multipartition problem in a functional context, although I would gladly welcome correction on this point. \section{A na\"ive approach: set partitions} Since multisets are a generalization of sets, let's start by generating set partitions and see how far it takes us. Given a set $S$, we can generate its partitions as follows: first, choose an arbitrary element $s \in S$; then generate the power set (set of all subsets) of the remaining elements of $S$; each one of these subsets can be combined with $s$ and a partition of the remaining elements to form a partition of $S$. That is, for each $T \subseteq (S \setminus \{s\})$, recursively find the partitions of $S \setminus (\{s\} \cup T)$, and add $\{s\} \cup T$ to each to form a partition of $S$. For example, let $S = \{2,3,5\}$ and choose $s = 2$, so the first subset of every generated partition will contain $2$. Next we generate the power set of $S \setminus \{2\} = \{3, 5\}$, namely, $\{\{3,5\}, \{3\}, \{5\}, \varnothing\}$. Each one of these can be combined with $s$. Combining the first with $s$ yields our first partition, $\{\{2\} \uplus \{3,5\}\} = \{\{2,3,5\}\}$, since there is nothing left to recursively partition. Combining $\{3\}$ with $s$ yields a first subset of $\{2,3\}$, and recursively partitioning the remaining $\{5\}$ yields only itself, giving us $\{\{2,3\}, \{5\}\}$. Combining $\{5\}$ with $s$ similarly gives us $\{\{2,5\}, \{3\}\}$. Finally, we combine $\varnothing$ with $s$, giving $\{2\}$ as the first subset; recursively partitioning the remaining $\{3,5\}$ gives both $\{\{3,5\}\}$ and $\{\{3\}, \{5\}\}$, and prepending $\{2\}$ gives us the final two partitions, $\{\{2\}, \{3,5\}\}$ and $\{\{2\},\{3\},\{5\}\}$. Some thought (and/or trying more examples) should convince the reader that this succeeds in generating each partition of $S$ exactly once. This algorithm is implemented in Listing~\ref{lst:set-partitions}, using lists as a simple representation of sets. (Incidentally, most of the code in this article, including that in Listing~\ref{lst:set-partitions}, was tested with QuickCheck~\cite{QC}; the QuickCheck properties are not shown but can readily be found in the source.) \begin{listing}[htp] \begin{code} -- pSet s generates the power set of s, pairing each subset -- with its complement. -- e.g. pSet [1,2] = [([1,2],[]),([1],[2]),([2],[1]),([],[1,2])]. pSet :: [a] -> [([a],[a])] pSet [] = [([],[])] pSet (x:xs) = mapx first ++ mapx second where mapx which = map (which (x:)) $ pSet xs first f (x,y) = (f x, y) second f (x,y) = (x, f y) -- setPartitions S generates a list of partitions of S. setPartitions :: [a] -> [[[a]]] setPartitions [] = [[]] setPartitions (s:s') = do (sub,compl) <- pSet s' let firstSubset = s:sub map (firstSubset:) $ setPartitions compl \end{code} \caption{Computing set partitions. \label{lst:set-partitions}} \end{listing} % $ % (fix syntax highlighting in Emacs, which is very confused by the % mixture of LaTeX and Haskell =) %if style == newcode \begin{code} -- compute Bell numbers using the Pierce triangle. -- See Knuth, TAOCP Vol.4, 7.2.1.5. nextPierceRow :: [Integer] -> [Integer] nextPierceRow r = next where next = last r : zipWith (+) r next pierceTriangle = iterate nextPierceRow [1] bellNumbers = map head pierceTriangle bell = (bellNumbers !!) -- test to make sure the right number of partitions are generated. prop_bell :: [Int] -> Bool prop_bell s = genericLength (setPartitions s) == bell (length s) -- many of the properties exhibit combinatorial blow-up after only a -- few tests, so implement a special testing function testN which -- allows us to limit the number of tests performed. Use -- (testN n) just like test. only :: Int -> Config only n = defaultConfig { configMaxTest = n } testN :: (Testable a) => Int -> a -> IO () testN = check . only \end{code} %endif % $ For example, evaluating |setPartitions [2,3,5]| yields \begin{quote} \eval{setPartitions [2,3,5]}, \end{quote} %% ??? better way to do this (w/o horizontal offset)? just as we computed before. Now, given a suitable implementation of |factor|, which returns a list of the prime factors of its argument, generating factorizations is straightforward: \begin{code} factorizations2 :: (Integral a) => a -> [[a]] factorizations2 = map (map product) . setPartitions . factor \end{code} Let's try |factorizations2 30|: \begin{quote} \eval{factorizations2 30}. \end{quote} Lovely! However, there's a problem: as you may have guessed, |setPartitions| doesn't do so well when given a multiset instead of a set. For example, evaluating |setPartitions [2,2,3]| yields \begin{quote} \eval{setPartitions [2,2,3]}. \end{quote} The two copies of |2| are treated as if they are distinct, and we end up with both |[[2,3],[2]]| and |[[2],[2,3]]|, which are equivalent when considered as multiset partitions. This means that an expression such as |factorizations2 12| will give duplicate results as well: \begin{quote} \eval{factorizations2 12}. \end{quote} This certainly won't do. We need to get rid of the duplications, but how? The easiest and most obvious way is to create a new function, |msetPartitions|, which generates set partitions and then culls duplicates: \begin{code} msetPartitions :: (Ord a) => [a] -> [[[a]]] msetPartitions = nub . map (sort . map sort) . setPartitions \end{code} Normalizing the partitions with |map (sort . map sort)| before applying |nub| is necessary, since otherwise |nub| will not consider partitions such as |[[2,3],[2]]| and |[[2],[2,3]]| equal. Evaluating |msetPartitions [2,2,3]| yields \begin{quote} \eval{msetPartitions [2,2,3]} \end{quote} as expected. With a definition of |factorizations3| using |msetPartitions| in place of |setPartitions|, we can now correctly compute the factorizations of $12$: %if style == newcode \begin{code} factorizations3 :: (Integral a) => a -> [[a]] factorizations3 = map (map product) . msetPartitions . factor \end{code} %endif \begin{quote} \eval{factorizations3 12}. \end{quote} Are we done? Some degree of inefficiency is clearly involved here, since throwing away duplicate partitions means wasted work to generate them. The question is, how much inefficiency? The code certainly has conciseness and elegance going for it, so we might be willing to overlook a little inefficiency if it means we can have a concise implementation that still works for all practical purposes. Unfortunately (you probably saw this one coming), |msetPartitions| turns out to be staggeringly inefficient. The worst case occurs when we have the most possible overlap between elements, namely, a multiset with $n$ copies of the same element. First, since |setPartitions| never examines the elements of its argument, it clearly generates the same number of partitions for such a list as it does for any other $n$-element list, namely, the $n$th Bell number $\bell{n}$. Bell numbers \cite{mw_bell, concrete} count the number of distinct set partitions of an $n$-element set, and satisfy the recurrence \[ \bell{0} = 1; \quad \bell{n+1} = \sum_{k=0}^n \binom{n}{k} \bell{k}. \] (Extra credit: how does this recurrence relate to |setPartitions|?) Thus, the first few Bell numbers for $n \geq 0$ are $1,1,2,5,15,52,203,877,4140,21147\dots$ \cite{oeis_bell}. On the other hand, the number of distinct multipartitions of such a worst-case multiset is the $n$th partition number $P_n$, which counts the number of ways of writing $n$ as a sum of positive integers. Partition numbers \cite{mw_partition, TAOCP} have the generating function \[ (1 + z + z^2 + \dotsb)(1 + z^2 + z^4 + \dotsb)(1 + z^3 + z^6 + \dotsb) \dots = \prod_{n \geq 1} \left( \sum_{k \geq 0} z^{nk} \right) = \prod_{n \geq 1} \frac{1}{1 - z^n}, \] and also satisfy a recurrence discovered by Euler, \begin{multline*} P_0 = 1; \quad P_n = P_{n-1} + P_{n-2} - P_{n-5} - P_{n-7} + P_{n-12} + P_{n-15} - \dotsb \\ = \sum_{\substack{-\infty < k < \infty\\k \neq 0}} (-1)^{k+1} P_{n - (3k^2 + k)/2}, \end{multline*} if we stipulate that $P_n = 0$ when $n < 0$. The first few partition numbers for $n \geq 0$ are $1,1,2,3,5,7,11,15,22,30\dots$ \cite{oeis_partitions}. As you can see, Bell numbers grow far more quickly than partition numbers. (This can be shown analytically; $\bell{n}$ grows as $(n/\log n)^n$, whereas $P_n$ only grows as $A^{\sqrt{n}}/n$ for a certain constant $A$ \cite{TAOCP}.) For example, |msetPartitions| correctly computes the $P_{10} = 42$ unique multipartitions of |replicate 10 1|, but it takes a few seconds to do so, since it must cull duplicates from an initial list of $\bell{10} = 115975$. And it isn't long before things become completely hopeless: computing the modest $P_{30} = 5604$ unique multipartitions of |(replicate 30 1)| in this way would require culling duplicates from among the whopping $\bell{30} = 846749014511809332450147$ (that's $8.5 \times 10^{24}$) generated by |setPartitions|! So, for all its conciseness, |msetPartitions| won't work after all. We need a way to directly generate all the unique partitions of a multiset, using only time linear in the number of partitions generated. \section{Multiset partitions are vector partitions} In order to directly generate partitions of a multiset $M$, we need to impose some sort of ordering on the subsets of $M$, and to be able to generate them in order. Using such an ordering, we could guarantee that the subsets in a partition always occur in order, thus ruling out the possibility of partitions occurring multiple times with their subsets arranged differently. Indeed, this is how |setPartitions| works. Considering the elements of a set $S$ (represented by a Haskell list) to be ``ordered'' by the order of their occurrence in the list, the partitions generated by |setPartitions| always contain strictly increasing subsets in increasing order (where subsets are ordered by their first elements). We could also order multiset subsets lexicographically, but it is not clear how we could efficiently generate them in order. Considering some elements equal to one another throws a big wrench into things, since we can no longer simply recurse over a multiset one element at a time, as |setPartitions| does. Suppose we have a multiset $M$, which has $n$ unique elements. The key insight is that we can represent $M$ by a pair $(v,e)$, where $v$ is a vector in $\N^m$ (that is, an $m$-tuple of non-negative integers), $e$ is an ordered list of length $m$, and $m \geq n$. In particular, $e$ is a list of distinct elements which are a superset of the unique elements of $M$, and $v$ records the number of occurrences of each element (possibly zero for elements not occurring in $M$). For example, the multiset containing $\{6,5,7,4,4,5,4\}$ can be represented by the vector $(3,2,0,1,1)$ together with the list of elements $[4,5,29,6,7]$. Using this representation, multiset union corresponds to componentwise vector addition, which reduces the multiset partition problem to that of finding all sets of vectors in $\N^n$ which sum to a given vector. In other words, multiset partitions are vector partitions in disguise! As an example, if we represent the multiset $M = \{2,2,3\}$ by the vector $(2,1)$ and element list $[2,3]$, then the partitions of $M$ correspond to partitions of $v$, as shown below: \begin{gather*} \{\{\{2,2,3\}\}, \{\{2,2\}, \{3\}\}, \{\{2,3\}, \{2\}\}, \{\{2\}, \{2\}, \{3\}\} \} \\ \cong \\ \{\{(2,1)\}, \{(2,0), (0,1)\}, \{(1,1),(1,0)\}, \{(1,0),(1,0),(0,1)\}\}. \end{gather*} This is good, because it is easy to impose an ordering on vectors, and, as we will see, they can also be efficiently generated in order. \section{Multisets} We will represent vectors in Haskell as |[Int]|. This is a simplification, but works fine for our purposes. Multisets are represented as discussed previously: \begin{code} type Vec = [Int] data MultiSet a = MS [a] Vec deriving Show \end{code} Of course, this Haskell definition of |MultiSet| is too lax; in particular, it doesn't preclude the element list and count vector having different lengths. We could correct this by using a list of pairs instead of a pair of lists; however, most of the time we'll be manipulating the element list and count vector separately, and the headache of constantly zipping and unzipping the two lists simply isn't worth it. Converting between multisets and regular lists is straightforward (Listing~\ref{lst:convert-list-ms}). Although an |Eq| instance for the element type is all that is needed for the conversion, an |Ord| instance allows it to be done more efficiently, so we include both as options. \begin{listing} \begin{code} fromList :: (Ord a) => [a] -> MultiSet a -- O(n lg n) fromList = uncurry MS . unzip . map (head &&& length) . group . sort fromListUnord :: (Eq a) => [a] -> MultiSet a -- O(n^2) fromListUnord xs = MS nx counts where nx = nub xs counts = map (\elt -> (length . filter (==elt) $ xs)) nx toList :: MultiSet a -> [a] toList (MS es cs) = concat $ zipWith replicate cs es \end{code} \caption{Converting between lists and multisets. \label{lst:convert-list-ms}} \end{listing} %if style == newcode \begin{code} prop_toFromList :: (Ord a) => [a] -> Bool prop_toFromList xs = (toList . fromList $ xs) == sort xs prop_toFromListUnordIdem :: (Eq a) => [a] -> Bool prop_toFromListUnordIdem xs = (tfu . tfu $ xs) == (tfu xs) where tfu = toList . fromListUnord \end{code} %endif For example, here's |fromList [2,3,3,2,3,5]|: \begin{quote} \eval{fromList [2,3,3,2,3,5]}. \end{quote} Now that we can easily convert between lists and our multiset representation, we can get on with the real work: generating vector partitions. \section{Vectors} First, some notation: if $u,v \in \N^n$, then $u \vleq v$ indicates that $u$ is componentwise less than or equal to $v$, and $u \leq v$ indicates that $u$ is lexicographically less than or equal to $v$. In other words, $u \vleq v$ means that every element of $u$ is no greater than the corresponding element of $v$, whereas $u \leq v$ implies only that $u$ is no greater than $v$ in the first element where they differ. Therefore, $u \vleq v$ implies $u \leq v$, but the converse does not hold. For example, $(1,2,7) \leq (1,3,5)$ (since $2 \leq 3$), but $(1,2,7) \nvleq (1,3,5)$ (since $7 \nleq 5$). A few simple utility functions for manipulating |Vec| values are shown in Listing~\ref{lst:vec-impl}. Note that the |Ord| instance for lists is lexicographical. Thus, for |u,v :: Vec|, the Haskell expression |u <= v| corresponds exactly to the notation $u \leq v$ introduced earlier. The relation $u \vleq v$ is implemented by the custom |(<||=)| operator. \begin{listing}[htp] \begin{code} -- recall that type Vec = [Int]. -- componentwise comparison of vectors. (<|=) :: Vec -> Vec -> Bool xs <|= ys = and $ zipWith (<=) xs ys -- (vUnit v) produces a unit vector of the same length as v. vUnit :: Vec -> Vec vUnit [] = [] vUnit [_] = [1] vUnit (_:xs) = 0 : vUnit xs -- (vZero v) produces a zero vector of the same length as v. vZero :: Vec -> Vec vZero = map (const 0) -- test for the zero vector. vIsZero :: Vec -> Bool vIsZero = all (==0) -- do vector arithmetic componentwise. (.+), (.-) :: Vec -> Vec -> Vec (.+) = zipWith (+) (.-) = zipWith (-) \end{code} \caption{Implementation of |Vec|. \label{lst:vec-impl}} \end{listing} % $ \section{Generating vector partitions} Given a vector $v$, the general method we will use to recursively generate all its partitions is as follows: first, we include the singleton set $\{v\}$ as a special case; then, for each smaller vector $v'$ in some appropriate subset of $\{ u : u \vleq v\}$, recursively generate all partitions $P'$ of $v - v'$, and combine to form the partitions $\{v'\} \cup P'$. Since the vectors in each $P'$ sum to $v - v'$, adding $v'$ will create a partition of $v$. All we have left to determine is what the ``appropriate subset'' of $\{ u : u \vleq v \}$ should be. First, each partition should contain vectors in lexicographically non-decreasing order; as discussed before, this will guarantee that we don't get duplicate partitions. In order to enforce this restriction as we recurse, we must keep track of a current lower limit $v_L$, which is the largest (hence, most recent) vector in the current partially built partition. We will only choose vectors $v'$ for which $v' \geq v_L$. Second, we need not bother with vectors $v'$ for which $v - v' < v'$, since in that case it would be impossible to complete a non-decreasing partition starting with $v'$. In other words, we only need to consider vectors $v'$ which are less than or equal to ``half'' of $v$, defined as the vector in the exact middle of the lexicographically ordered list of vectors $u \vleq v$ (rounding down if the list has an even number of elements). In summary, we want to choose vectors $v'$ for which $v' \vleq v$ and $\frac{1}{2}v \geq v' \geq v_L$. To realize this, we first need a function to compute ``half'' of a vector $v$. The implementation of |vHalf| is straightforward, recursively splitting along each dimension until finding one that splits evenly, then copying the remainder of the vector. We also need a function to generate a lexicographically sorted list of vectors which fall within a certain range. The function |withinFromTo| takes three vectors |m|, |s|, and |e|, and generates the list of vectors $v$ for which $m \vgeq v$ and $s \geq v \geq e$; that is, vectors falling \emph{within} the $n$-dimensional box framed by the origin and $m$, starting \emph{from} $s$, and running down \emph{to} $e$. Listing~\ref{lst:vector-partitions} puts all of these ideas together. \begin{listing} \begin{code} vPartitions :: Vec -> [[Vec]] vPartitions v = vPart v (vUnit v) where vPart v _ | vIsZero v = [[]] vPart v vL = [v] : [ v' : p' | v' <- withinFromTo v (vHalf v) vL, p' <- vPart (v .- v') v' ] vHalf :: Vec -> Vec vHalf [] = [] vHalf (x:xs) | (even x) = (x `div` 2) : vHalf xs | otherwise = (x `div` 2) : xs downFrom :: Int -> [Int] downFrom n = [n,(n-1)..0] -- (within m) generates a decreasing list of vectors v <|= m. within :: Vec -> [Vec] within = sequence . map downFrom -- This is a lie. withinFromTo' :: Vec -> Vec -> Vec -> [Vec] withinFromTo' m s e = takeWhile (>= e) . dropWhile (> s) . within $ m \end{code} \caption{Computing vector partitions. \label{lst:vector-partitions}} \end{listing} % $ %if style == newcode \begin{code} vSum :: [Vec] -> Vec vSum = foldl' (.+) (repeat 0) prop_vPartitions :: Vec -> Property prop_vPartitions v = (not . null $ v) && (not . vIsZero $ v) ==> (all ((==v) . vSum) . vPartitions $ v) \end{code} %endif % $ Let's try an example, and compute the partitions of the vector $(2,1)$: \begin{quote} \eval{vPartitions [2,1]}. \end{quote} Looking good! We're almost done, but there is still one loose end to tie up. The single quote at the end of |withinFromTo'| (Listing~\ref{lst:vector-partitions}) is not a typo; in fact, |withinFromTo'| is an impostor! It only illustrates the intended behavior of its quote-less cousin used in the definition of |vPartitions|. Although |withinFromTo'| elegantly matches its intuitive definition described earlier, it is inefficient. Just like |msetPartitions|, it generates some vectors only to discard them. Since our goal is to write ``productive'' code that doesn't do any unnecessary work, we'll actually use a more efficient version at the cost of some readability. \section{Efficiently enumerating vectors} The real |withinFromTo| is shown in Listing~\ref{lst:generating-vecs}. The implementation of this function is tricky to get right; careful attention must be paid to appropriate base cases, especially the case when $s \nvleq m$. In such a case we must first ``clip'' $s$ against $m$ before proceeding. Otherwise we risk generating vectors $v$ for which $v \leq s$ but $v \nvleq m$. In the general case, we recurse over each dimension and ensure that the generated vectors stay lexicographically between $s$ and $e$. \begin{listing}[htp] \begin{code} clip :: Vec -> Vec -> Vec clip = zipWith min withinFromTo :: Vec -> Vec -> Vec -> [Vec] withinFromTo m s e | not (s <|= m) = withinFromTo m (clip m s) e withinFromTo m s e | e > s = [] withinFromTo m s e = wFT m s e True True where wFT [] _ _ _ _ = [[]] wFT (m:ms) (s:ss) (e:es) useS useE = let start = if useS then s else m end = if useE then e else 0 in [x:xs | x <- [start,(start-1)..end], let useS' = useS && x==s, let useE' = useE && x==e, xs <- wFT ms ss es useS' useE' ] \end{code} \caption{A better implementation of |withinFromTo|. \label{lst:generating-vecs}} \end{listing} %if style == newcode \begin{code} prop_wNaive :: Vec -> Bool prop_wNaive m = (within m) == (withinFromTo m m (vZero m)) prop_wftNaive :: Vec -> Vec -> Vec -> Property prop_wftNaive m s e = (lengthsEqual [m,s,e]) ==> (withinFromTo m s e) == (withinFromTo' m s e) -- ??? create a custom generator for multiple lists? lengthsEqual :: [Vec] -> Bool lengthsEqual = (==1) . length . nub . map length prop_wftNonincr :: Vec -> Vec -> Vec -> Property prop_wftNonincr m s e = (lengthsEqual [m,s,e]) ==> and $ zipWith (>=) vs (tail vs) where vs = withinFromTo m s e prop_wftWithin :: Vec -> Vec -> Vec -> Property prop_wftWithin m s e = (lengthsEqual [m,s,e]) ==> all (<|= m) (withinFromTo m s e) prop_wftLEq :: Vec -> Vec -> Vec -> Property prop_wftLEq m s e = (lengthsEqual [m,s,e]) ==> all (<= s) (withinFromTo m s e) prop_wftGEq :: Vec -> Vec -> Vec -> Property prop_wftGEq m s e = (lengthsEqual [m,s,e]) ==> all (>= e) (withinFromTo m s e) \end{code} %endif % $ \section{A walk in the park} Now that we've descended into the depths, tamed multisets, slain the |Vec| beast, and bludgeoned |withinFromTo| into submission, the climb back out will be nothing but a scenic stroll! We have only to wave one of our magic vector functions, and Things will Happen. Our vector functions are so magical, in fact, that we can use them to compute not just multiset partitions, but a number of other interesting functions as well. \begin{listing}[htp] \begin{code} -- Integer partitions. intPartitions :: Int -> [[Int]] intPartitions = map (map head) . vPartitions . return -- Power sets. mPowSet :: MultiSet a -> [MultiSet a] mPowSet (MS elts v) = map (MS elts) $ within v powSet :: (Ord a) => [a] -> [[a]] powSet = map toList . mPowSet . fromList powSetUnord :: (Eq a) => [a] -> [[a]] powSetUnord = map toList . mPowSet . fromListUnord -- Partitions. mPartitions :: MultiSet a -> [[MultiSet a]] mPartitions (MS elts v) = map (map (MS elts)) $ vPartitions v partitions :: (Ord a) => [a] -> [[[a]]] partitions = map (map toList) . mPartitions . fromList partitionsUnord :: (Eq a) => [a] -> [[[a]]] partitionsUnord = map (map toList) . mPartitions . fromListUnord \end{code} \caption{Casually waving some magic vector functions around. \label{lst:applications}} \end{listing} % $ %if style == newcode \begin{code} prop_partitions :: (Ord a) => [a] -> Bool prop_partitions s = normalize (partitions s) == normalize (msetPartitions s) normalize :: (Ord a) => [[[a]]] -> [[[a]]] normalize = sort . map (sort . map sort) -- compute partition numbers using Euler's recurrence. genPentagonals :: [Integer] genPentagonals = map (\n -> n * (3*n - 1) `div` 2) ([1..] >>= \n -> [n,-n]) select :: [Integer] -> [Integer] -> [Integer] select ns xs = select' 1 ns xs where select' _ [] _ = [] select' _ _ [] = [] select' i nns@(n:ns) xxs@(x:xs) | i < n = select' (i+1) nns xs | i ==n = x : select' (i+1) ns xs nextPartitions :: [Integer] -> [Integer] nextPartitions ps = p' : ps where p' = sum $ zipWith (*) (concat $ repeat [1,1,-1,-1]) (select genPentagonals ps) partitionNums = map head $ iterate nextPartitions [1] -- make sure the right number of integer partitions are generated. prop_partitionNums :: Int -> Property prop_partitionNums n = (n > 0) ==> (genericLength $ intPartitions n) == (partitionNums !! n) \end{code} %endif % $ For one, consider the problem of splitting an integer $n$ into a multiset of smaller integers whose sum is $n$. Such integer partitions can be seen as degenerate vector partitions. In particular, the partitions of the integer $n$ correspond exactly to partitions of the 1-dimensional vector $(n)$. This is implemented by the function |intPartitions|, shown in Listing~\ref{lst:applications}. For example, |intPartitions 5| yields \begin{quote} \eval{intPartitions 5}. \end{quote} It seems to work! As a check, we can also (rather inefficiently!) compute the partition numbers~\cite{oeis_partitions} with |map (length . intPartitions) [1..]|: \begin{quote} \eval{map (length . intPartitions) $ [1..19]}... \end{quote} % $ In particular, |length $ intPartitions 30| yields \eval{length $ intPartitions 30} almost instantly. This is clearly an improvement over |length $ msetPartitions (replicate 30 1)|, which theoretically produces the same value but probably wouldn't finish before the heat death of the universe. % $ Computing power sets is easy, too (Listing~\ref{lst:applications}), since there is a natural bijection between unique subsets of a multiset $(e,v)$ and the set of vectors $u \vleq v$. For example, here's |powSet [2,2,3,3]|: \begin{quote} \eval{powSet [2,2,3,3]}. \end{quote} Last (but not least!), finding partitions of a multiset (also in Listing~\ref{lst:applications}) is now a simple matter of finding partitions of the representative count vector. Here's |partitions [2,2,3]|: \begin{quote} \eval{partitions [2,2,3]} \end{quote} No repeated partitions here! \section{Factorizations, reloaded} We can finally use the |partitions| function for its original intended purpose, to efficiently generate integer factorizations. In fact, we can use |powSet| to efficiently generate divisors, as well. Listing~\ref{lst:factorizations} shows how. (This certainly is not the fastest or most robust factoring code possible, but it isn't really the point. Any implementation of |factor| could be combined with |partitions| and |powSet| in this way.) \begin{listing}[htp] \begin{code} primes :: (Integral a) => [a] primes = 2 : 3 : [ p | p <- [5,7..], isPrime p ] isPrime :: (Integral a) => a -> Bool isPrime n = all ((/= 0) . mod n) $ upToSqrt n primes upToSqrt :: (Integral a) => a -> [a] -> [a] upToSqrt n = takeWhile (<= (round . sqrt . fromIntegral $ n)) factor :: (Integral a) => a -> [a] factor 1 = [] factor n = factor' n [] primes where factor' n fs ps@(p:pt) | (fromIntegral p > (sqrt . fromIntegral $ n)) = n:fs | (n `mod` p == 0) = factor' (n `div` p) (p:fs) ps | otherwise = factor' n fs pt divisors :: (Integral a) => a -> [a] divisors = map product . powSet . factor factorizations :: (Integral a) => a -> [[a]] factorizations = map (map product) . partitions . factor \end{code} \caption{Factorizations as multiset partitions. \label{lst:factorizations}} \end{listing} % $ For example, we can compute the factorizations of $30$: \begin{quote} \eval{factorizations 30}. \end{quote} But that's easy, since $30$ has no repeated prime factors. Let's try something like $24$: \begin{quote} \eval{factorizations 24}. \end{quote} Or how about |length $ factorizations 1073741824|? \begin{quote} \eval{length $ factorizations 1073741824} \end{quote} Funny, I think I remember seeing that number somewhere recently\dots As a final comment, the reader may note that we are actually taking advantage of the fact that multisets and vectors also correspond to monomials. For example, the multiset $\{1,1,2,3,3,3,3\}$ corresponds to the monomial $x_1^2 x_2 x_3^4$. Positive integers can be viewed as monomials over the primes, $2^{\alpha_2} 3^{\alpha_3} 5^{\alpha_5} \dots$, which leads directly to the code in Listing~\ref{lst:factorizations}. Writing a function to compute general monomial factorizations is left as an exercise for the reader. \section{Acknowledgments} Thanks to David Amos and Wouter Swierstra for their helpful comments on a first draft of this article. \section{About the author} Brent Yorgey has a BA in Computer Science from Williams College in Massachusetts, USA, and hopes to begin studying for a PhD in the fall of 2008. He spends entirely too much time chatting in \#haskell (as byorgey) when he should be doing other things (such as editing this article). \bibliography{Partitions} \end{document}