Coding For Food: Transportation Algorithm in F#, Part 2

After having had a look at the algorithm in the last post, here is some F# code to implement it. Please not that all the source files are on CodePlex, and it may be of help to download and run that project in Visual Studio and with some strategically placed breakpoints.

File SampleData1.fs has some (surprise!) sample data. The tableau that is represented by this data is the one from the first post.

let srcCapacity = [ 12
                    1
                    5 ]

let dstCapacity = [10;8]


let costMatrix = Microsoft.FSharp.Math.Matrix.Generic.ofList
                    [   [2;4]
                        [8;12]
                        [12;6]
                        ]

Finding the Initial Solution

Sources and destinations (or rather their capacities) are represented by lists, and the cost matrix is an integer Matrix type as defined in the F# PowerPack. The very first step in the algorithm is to find a feasible solution (InitialFeasibleSolution.fs). We expect to feed in our two lists (sources and destinations) and to get another list that contains a first solution for the problem. This is of the form

[(row_number, col_number, apportioned);(row_number, col_number, apportioned);…]

For each cell in the solution, we get its co-ordinates and the capacity that has been assigned to the cell. The code looks a bit wild, but is quite simple in principle – compare the heads of the two lists, find the smaller value, remove the head from the list that has the smaller value, subtract this value from the head of the other list, and then do the same with the two resulting lists. Meanwhile, concatenate the new list with the result and return it. Oh, and carry the column and row indices with you, because we need that info. Quite simple:

let find_feasible_nwc src dst =
    let rec find_feasible src_l src_idx dst_l dst_idx =
        match src_l, dst_l with
        | h_src::t_src, h_dst::t_dst when h_src < h_dst ->
            (src_idx, dst_idx, h_src)::find_feasible 
                t_src (src_idx + 1) ((h_dst - h_src)::t_dst) dst_idx
        | h_src::t_src, h_dst::t_dst ->
            (src_idx, dst_idx, h_dst)::find_feasible 
                ((h_src - h_dst)::t_src) src_idx t_dst (dst_idx + 1)
        | _ -> []
    find_feasible src 0 dst 0

(Anyone who can come up with a way to make this tail recursive gets a special mention). There are other methods to find an initial solution, but the nice thing about the North-West Corner is that it does not require any knowledge at all about the cost matrix, so doesn’t complicate the code unnecessarily.

Dual Values

Once we have a solution, we need the dual values. The problem with this is that we cannot assume anything about the order in which we see the cells of the solution. In a small example they will look nice and tidy, but for a big tableau that has already gone through a few steps of the algorithm we must assume that they can come in any order, and not necessarily in the one that we need to be able to calculate the duals one by one. In principle, we need a topological sort first that leaves the solution cells in the order that is demanded for the calculations. But there is no way to do a topological sort nicely, without recurring to mutable state, so it’s a no-win situation. What we do here is to go through the list of cells in the solution and see how many duals we can calculate in one go. If there are any left to calculate, we do it again with the solution cells that have not contributed to the calculations yet, and so forth. In fact, this is a topological sort (compare this code to Brian’s little example that’s linked earlier, and you’ll notice it basically does the same thing), except we do it in-place, so to say. The code is in Dual.fs:

let calc_dual_values (cM: int Microsoft.FSharp.Math.Matrix) basis_solution =
    let rows, cols = cM.Dimensions
    (*  Create arrays to hold the dual values for
        sources (rows) and destinations (columns).
        To be able to express unknown quantities,
        we use arrays of option type.   *)
    let u = Array.create rows None
    let v = Array.create cols None
    (*  Here we go through all the cells of the
        current solution to see for which we
        can calculate the missing dual value.
        We can only calculate v if we have the
        corresponding u, or viceversa. 
        Any cells for which we don't have
        u or v are returned in a new list.  *)
    let rec calc sol =
        match sol with
        | [] -> [] // Not necessary, but avoids compiler warnings
        | (row, col, _) as h::t ->
            // Get the first cell
            match u.[row], v.[col] with
            // No u or v, return head and try tail
            | None, None -> h::calc t
            // u and v known. Can't really happen,
            // but again avoids compiler warnings
            | Some(ui), Some(vi) -> calc t
            // u known: calculate v, drop head, process tail
            | Some(ui), None ->
                v.[col] <- Some(cM.[row, col] - ui)
                calc t
            // v known: calculate u, drop head, process tail
            | None, Some(vi) ->
                u.[row] <- Some(cM.[row, col] - vi)
                calc t
    (*  This implements a loop that is executed until
        all u and v are known   *)
    let rec ccalc sol =
        match calc sol with // do calculations
        | [] -> ()  // all processed -> done
        | l -> ccalc l // some cells left -> try those again
    (*  Initialise an arbitrary element of one array with 0 *)
    u.[0] <- Some(0)
    (*  Start calculations *)
    ccalc basis_solution
    (*  Return two arrays with dual values.
        The arrays contain option type values; we could
        strip them off before returning them, but here we
        just let the next step of the algorithm take this
        into account and return the arrays as they are. *)
    u, v

Find Entering Cell

We now have two arrays that hold the duals for sources and destinations, respectively. In order to find the entering cell (or confirm that there isn’t any), we need to go through all non-basis (non-solution) cells, i.e. we would need to find those cells in the matrix that don’t belong to the solution. However, in order to keep the code simple, we do this for all the cells. The difference between the cost coefficient for basis cells and the duals will always be 0 (we only just calculated one from the other!), so there is no danger that we accidently choose a basis cell.

The code in NewBasisCell.fs uses the Matrix.foldi method that’s in the PowerPack and folds over all the elements of a matrix. We initiate it with a state value that has negative row and column indices, so that we know in the end if nothing has been found, and a value of 0 to compare all the dual/cost differences against. Only values <0 will be found, and of those we get the minimum, which is what we want. If we can’t have that, there’s no entering cell and the algorithm is done.

let find (cM: int Microsoft.FSharp.Math.Matrix) (u:int option array, v:int option array) =
    let (row, col, _) = 
        Microsoft.FSharp.Math.Matrix.Generic.foldi
            (fun row col ((state_row, state_col, state_value) as state) elem ->
                let value = elem - u.[row].Value - v.[col].Value
                if value < state_value then
                    (row, col, value)
                else
                    state) 
            (-1, -1, 0) cM
    if row >= 0 && col >= 0 then Some(row, col, 0) else None

If there is an entering cell, it is returned as a tuple with the row and column indices, and the capacity, so it has the same format as the elements of the current solution list, and can thus be conveniently inserted into that list. The initial capacity is set to 0 as required.

Find Path for Entering Cell

This code was already explained in the previous post about the Stepping Stone algorithm, please see there. The function in SteppingStone.fs is a bit different from what we had earlier, because we need to carry through the capacity value for the cells, so that we can use them in the next step. This wasn’t a concern with the original Stepping Stone example, but the code change is very minor. This is what the function declaration looks like with the implementation details left out:

let find_path ((targetRow, targetCol, _) as target) lstSolution =
    (* see source file *)

It takes the tuple that represents the entering cell as the first argument, and the current solution as the second, and returns a path from the entering cell through the solution. The entering cell is the first element of the returned list.

Making the New Solution

In order to go to the next solution, we need to find the minimum in the path for the entering cell (as explained in the last post), do all the additions and subtractions on the path, decide which cell to remove and remove it, and then append the cells of the old solution that were not on the path. This requires a few helper functions in NewSolution.fs.

When looking for the minimum, we only take into account every other cell on the path, starting with the second (the first one in this implementation is the entering cell). This can be done with some clever pattern matching:

let rec private find_min path = 
    match path with
    | [_;(_, _, v)] -> v
    | _::(_, _, v)::t -> min v (find_min t)
    | _ -> failwith "Unexpected data for find_min."

The function relies on there being an even number of elements in the list, and this will always be the case. If it is not, we are in trouble, and tell the user that we are. Can’t happen, though. Really.

Next, we need a function that adds/subtracts the minimum that we found from all elements on the path. Again, this only works for even numbers of elements, but we’re safe.

let new_path path =
    let m = find_min path
    let rec flatten p =
        match p with
        | (r1, c1, v1)::(r2, c2, v2)::t -> 
            (r1, c1, v1 + m)::(r2, c2, v2 - m)::flatten t
        | _ -> []
    flatten path

We go through the path now and see which element has a capacity value of 0. In principle, there can be more than one, if the problem is degenerate. This would need special treatment, because otherwise we could end up with an endless loop in the algorithm. Keeping in mind that there is still some work to do on the fringes, which can be left for later, we just remove the first zero element that we find. The function remove_first is shown elsewhere, but here we have a little helper function that makes sure we don’t remove the very first element of the list (the entering cell) – which, in fact, can’t really happen anyway, can it?

let rec private remove_first_zero_t =
    function    | (h::t) -> h::remove_first (fun (_, _, v) -> v=0) t 
                | _ -> []

With all this in place, we can now write the function that takes the old/current solution, and the path with the entering cell, and makes a new solution out of these.

let transform_basis_solution solution new_path =
    let change_set = 
        new_path |> Seq.map (fun (r, c, _) -> (r, c)) |> Set.ofSeq
    let nonzero_new_path =
        new_path |> remove_first_zero_t
    solution 
    |> List.filter (fun (r, c, _) -> not (change_set.Contains(r, c)))
    |> List.append nonzero_new_path

We have all the bits in place now to run the algorithm, all that’s missing is some control structure that initiates it, finds out when it’s finished, and gives back the result.

To be continued

Transportation Algorithm in F#, Part 3 puts those pieces together to form a working program.

Transportation Algorithm in F#, Part 1 explains the algorithm.

Coding For Food

Tuesday, 18 January 2011

Transportation Algorithm in F#, Part 2

Finding the Initial Solution

Dual Values

Find Entering Cell

Find Path for Entering Cell

Making the New Solution

To be continued

Previous

No comments:

Post a Comment