optimize - Maple Help

Writing Efficient Maple Code

Description

 This document provides tips to help you to write more efficient Maple code. Some of this material is general and applies to programming in most languages, but other points are Maple-specific, or apply only to programming languages that are similar to Maple.
 The first two (and most important) rules of computer program optimization describe when not to optimize.
 1. Do not begin optimizing your code until after you have most of your program designed and working well.
 2. Do not begin optimizing your code until you have thoroughly profiled it.
 Maple now has quite sophisticated profiling facilities that gather fine-grained, execution-time statistics for your application. For more information, see the help pages debugopts, kernelopts, profile, exprofile, and excallgraph. Note the profile option to kernelopts and the traceproc option to debugopts.
 Profiling efforts are not useful early in the development process, except, perhaps, to catch gross errors. Only after your application is nearly functionally complete should you begin profiling your code. Only after rigorous profiling of your code is any attempt at optimization likely to be successful.
 Unlike conventional languages, such as C, Ada, and FORTRAN, Maple is designed for "evolutionary programming". Conventional languages expect that you have well-defined specifications before you begin to write code, and expect the specification to remain unchanged once development begins. Most Maple programmers initially have only a vague idea of the final application functionality. They do not arrive at a final design specification until after they have investigated a number of examples, and tried a few prototypes. This evolutionary approach to development is characteristic of the type of mathematical exploration for which Maple was designed. Early in the development effort, you can focus on interactive testing and modeling of your problem. Only after you have discovered how best to model a problem and present its solution should you profile the code and focus on optimizing it to achieve optimal performance.
 The remainder of this page focuses on concrete tips to help you achieve better performance from your Maple code. Where possible, an explanation as well as a method is given so that you can extrapolate from the information provided here.
 • Very few Maple applications are I/O bound and so, on average, 80 percent of the execution time of a Maple application is spent on less than 5 percent of the code. Careful profiling is the only way to determine where this critical 5 percent is located. Use profiling information to determine the sections of your code that take the most time and consume the most storage, then optimize those parts.
 • Ensure that you are using the correct algorithm; choosing the right algorithm is the most effective optimization. You can, for example, write a "quicksort" algorithm to get O(n*ln(n)) performance in a sorting application. But, if your data is often sorted on input, you can achieve O(n) performance (asymptotically) by checking for a sorted sequence first.
 • Choose your data structures carefully. Learn about the performance characteristics of the operations supported by various data structures. Do not store expressions in a list if you must assign to its entries, because list assignment results in copying the entire list each time. Use an Array instead.
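 As a small illustration (the names A and K here are for demonstration only), updating an entry of an Array is an in-place operation, whereas "updating" a list entry forces the entire list to be copied:
 > A := Array( [ 1, 2, 3, 4, 5 ] ):
 > A[ 3 ] := 0:   # in-place; no copy of A is made
 > K := [ 1, 2, 3, 4, 5 ]:
 > K[ 3 ] := 0:   # permitted only for short lists, and copies the whole list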
 • Use the correct command for the job. For example, do not use sum or product when add or mul is adequate. Computing integrals numerically should be done by evaluating an expression of the form evalf( Int( expr ) ), which uses the inert integral operator Int, rather than by using evalf( int( expr ) ), which first attempts to compute the symbolic integral of expr. Computational linear algebra can be most efficiently done by using the LinearAlgebra package.
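 For example (using a simple integrand chosen for illustration), the first form below goes directly to numerical quadrature, while the second first spends time searching for an antiderivative:
 > evalf( Int( exp( -x^2 ), x = 0 .. 1 ) );   # purely numerical quadrature
 > evalf( int( exp( -x^2 ), x = 0 .. 1 ) );   # attempts symbolic integration first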
 • Avoid explicit do loops. Most uses of loops in Maple can be replaced by calls to very fast (built-in) procedures that perform an iteration. If you need to construct an expression sequence (including sets, lists, sums, and products), avoid iteratively building the sequence by appending to it. Consider using any of the seq, map, map2, foldl, zip, select, and remove commands instead. There are cases, however, when a loop can be faster than a corresponding built-in command (see the factorial example below).
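 For instance, building an expression sequence by repeated appending copies the growing sequence on each iteration, while a single call to seq builds it in one pass:
 > s := NULL:
 > for i from 1 to 1000 do s := s, i^2 end do:   # quadratic: s is copied each time
 > s := seq( i^2, i = 1 .. 1000 ):               # linear: built in one pass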
 As an example, consider the task of writing a procedure that takes a product expression and returns the product of all factors that are calls to the procedure Limit or limit. A very inefficient method to accomplish this is as follows.
 > GetLimits := proc( e )
       local    term, p;
       if type( e, '`*`' ) then
           p := 1;
           for term in e do
               if type( term, 'specfunc( anything, {Limit,limit} )' ) then
                   p := p * term
               end if
           end do;
           p
       else
           1
       end if
   end proc:
 A much more efficient method uses the select command.
 > GetLimits := proc( e )
       if type( e, '`*`' ) then
           select( type, e, 'specfunc( anything, {Limit,limit} )' )
       else
           1
       end if
   end proc:
 This second approach avoids a quadratic space algorithm for forming the product to be returned and reduces the number of local variables by two.
 Suppose that you now want to apply this operation to every expression stored in a list L. One way to do that is to iterate over the members of the list L by using a loop. But where are the results going to be stored? You require another list L2 in which to store the results. So, start with this idea:
 > for i from 1 to nops( L ) do L2[ i ] := GetLimits( L[ i ] ) end do:
 Almost as soon as it is written down, several problems become evident. First, lists in Maple are immutable, so the assignments in the loop body force a copy of the list L2 to be created at each iteration. Furthermore, we need to create the list L2 initially (the first copy), but what should be stored in it? These problems are avoided by using an Array instead of a list.
 > L2 := Array( 1 .. nops( L ) ):
 > for i from 1 to nops( L ) do L2[ i ] := GetLimits( L[ i ] ) end do:
 This is efficient and, if there are other reasons why an array is preferred to a list, then this solution is quite workable. However, lists consume less storage than do arrays, and access time is a little faster for lists. If L2 must be a list, or if it does not matter whether it is a list or an array, then the best solution is to use map.
 > L2 := map( GetLimits, L );
 If this whole operation is performed only once in your application, you could eliminate the overhead of calling the procedure GetLimits. In this case, you can manually "inline" its code by using:
 > L2 := map2( select, type, L, 'specfunc( anything, {Limit,limit} )' );
 The astute reader will notice the bug in this code. The operation defined by GetLimits is defined only for expressions of type `*`. One way to repair this is as follows.
 > L2 := map2( select, type, select( type, L, '`*`' ),
       'specfunc( anything, {Limit,limit} )' );
 • On a related point, consider the common task of extracting the "constant" part of a product. You might, for example, want to isolate the constant part of the product 2*Pi*sin(x/2 + y/2)*tan(1), which is 2*Pi*tan(1). This can be done by using select, as shown in the preceding example. But, suppose that you also want to store the "non-constant" factor. In this case, you can also use the procedure remove, which acts like select, but in the opposite sense.
 > c, f := select( type, e, 'constant' ), remove( type, e, 'constant' );
 This method is quite efficient as it stands, but it can be improved. Notice that the separate calls to select and remove must each traverse the input expression e. This requires two linear passes over e. They can be replaced by a single pass that uses the procedure selectremove to perform exactly this task. The expression above can be rewritten as:
 > c, f := selectremove( type, e, 'constant' );
 • Avoid useless recomputation. If you must use an explicit loop, study the loop body carefully to determine whether any of the computations can be "factored out" of the loop (that is, computed once, before entering the loop). For example, a loop that has the form:
 > for i from 1 to n do
       a[ i ] := f( Pi ) / g( i )
   end do;
 in which f and g are procedures can be more efficiently written by "factoring out" the computation of f( Pi ). Then the code looks like:
 > s := f( Pi );
 > for i from 1 to n do
       a[ i ] := s / g( i )
   end do;
 Of course, if this loop is merely initializing an array a, you should use the following instead.
 > s := f( Pi );
 > a := Array( 1 .. n, [ seq( s / g( i ), i = 1 .. n ) ] );
 • Try to avoid excessively recursive procedures. Many recursive commands can be transformed into an iterative style by using loops. (Then, the loops can often be replaced with faster, built-in commands.)
 Consider the standard factorial procedure FACT, defined recursively as:
 > FACT := proc( n::nonnegint )
       if n < 2 then
           1
       else
           n * FACT( n - 1 )
       end if
   end proc:
 The first step in optimizing this procedure is to transform it into a tail-recursive form.
 > FACT := proc( n::nonnegint )
       local    afact;
       afact := proc( m, a )
           if m < 2 then
               a
           else
               afact( m - 1, m * a )
           end if
       end proc;
       afact( n, 1 )
   end proc:
 This method uses a tail-recursive subprocedure afact, which takes two arguments: the intermediate result is kept in an accumulator that is passed to the recursive invocation as the second parameter a. Since it is tail-recursive (and not merely recursive), afact describes an iteration and can be rewritten as a loop.
 > FACT := proc( n::nonnegint )
       local    afact, m;
       afact := 1;
       for m from 1 to n do
           afact := afact * m
       end do;
       afact
   end proc:
 In this case, the replacement of the loop with a call to mul does not necessarily lead to a performance improvement. It improves performance for moderately large arguments, but for extremely large numbers, garbage collection begins to affect performance. Performance degrades because the products formed at each loop iteration are purely numeric, and so the amount of "garbage" created at each loop iteration is "constant". (This is not quite true; the size of the integers allocated does indeed grow, but an integer has no internal pointers for the garbage collector to follow, so reclamation of garbage integers uses a bounded amount of stack space during garbage collection.) Whether a version of FACT based on mul is faster than the last version above is platform-dependent.
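 For comparison, the mul-based version mentioned above can be written as a one-line procedure (mul over the empty range 1 .. 0 returns 1, so the cases n = 0 and n = 1 are handled correctly):
 > FACT := proc( n::nonnegint ) mul( m, m = 1 .. n ) end proc: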
 • Set the Digits environment variable to a value appropriate for your computations. Maple provides unlimited precision floating-point arithmetic, but many applications of floating-point computations do not require high-precision computation, and the cost of computation rises with the value of Digits. Be aware that many commands include special code for floating-point computations at settings of Digits that are below the hardware precision. If you want to plot cos( x ) over the interval x = -Pi..Pi, do not use Digits:=100.
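 For example, the default setting is usually more than adequate, and each arithmetic operation becomes more expensive as Digits grows:
 > Digits := 10:  evalf( Pi );    # 3.141592654
 > Digits := 100: evalf( Pi );    # 100 digits; every subsequent operation costs more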
 • Avoid expensive recursive operations. Some commands, even built-in procedures, are very expensive because they must traverse an entire expression to compute the result. Two specific examples are has and hastype. A common efficiency error is to use has( expr, x ) when member( x, expr ) suffices.
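 To illustrate the difference, member inspects only the top-level operands of its second argument, whereas has recursively traverses the entire expression:
 > member( x, [ a, b, x ] );      # true: x is found among the top-level entries
 > has( a + sin( b * x ), x );    # true, but found only by a full traversal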
 • Do not create garbage. Maple provides automatic storage management. This means that you can allocate as much storage as you want, and Maple will take care of reclaiming any storage that is no longer in use. However, this automatic recycling of storage takes a certain amount of time, and creates unnecessary "garbage" (allocated storage that is no longer needed) that increases the amount of work that must be done by the garbage collector. (The "garbage collector" is the part of Maple that takes care of recycling unused storage.)
 Tune the garbage collector for the machine on which you want to run your application. The most effective optimizations are those that tune an application for optimal use of the computer's memory hierarchy. Experiment with various garbage collection settings to optimize the performance of long-running computations.
 • A general rule of thumb (one that does not apply universally) is that you achieve greater efficiency by employing a functional style that avoids side-effects instead of an imperative style. Traditional, imperative languages such as C and FORTRAN encourage you to create objects too early, forcing you to update them later as program execution progresses. Modern functional languages such as Haskell or ML create objects at the right time, so that side-effects can be avoided. Maple supports both the functional and imperative styles of programming. Generally, however, a functional approach leads to more efficient Maple code. Code that is free of side-effects is much easier to optimize. (One of the first operations performed by most optimizing compilers is to transform a program into a form free of side-effects.) Procedures that have no side-effects are eligible for the remember or cache options, described below.
 Consider the loop example above, in which the computation of the f( Pi ) term was moved outside of the loop body. This optimization is valid only if the computation of f( Pi ) is free of side-effects. If, for instance, f is defined as
 > f := proc( s )
       print( HELLO );
       sin( s )
   end proc:
 then the optimization described above would not be valid, because it changes the behavior of the program. Other side-effects include assignments to global variables, to lexically scoped local variables from an enclosing scope, or to environment variables; I/O related effects; and "call-by-name" commands that assign a value to one of their arguments before returning to the calling code.
 • In some cases, memoize computed values. Maple makes this very easy and convenient by means of the option remember. If you are concerned about the storage used by accumulating values for a frequently called procedure, then also include option system, or use option cache instead. Note that this technique is only for use in computations free of side-effects.
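 For example (a standard illustration, not taken from this page), the naive doubly recursive Fibonacci procedure runs in linear rather than exponential time when given option remember, because each value is computed once and then looked up:
 > fib := proc( n::nonnegint )
       option remember;
       if n < 2 then n else fib( n - 1 ) + fib( n - 2 ) end if
   end proc:
 > fib( 100 );   # fast: at most 101 distinct calls are ever evaluated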
 • In some situations, in which you must be very careful with storage costs, a simple memoization technique called "caching" provides a low storage overhead solution that can dramatically improve average case performance. Suppose that you have an expensive procedure f.
 > f := proc()
       ... # some expensive computation
   end proc:
 Introduce a variable, f_cache, that is a global or local variable of a module or procedure in which f is defined. This holds the most recently computed value of f. Also introduce a variable, f_cache_args, to hold the expression sequence of arguments to f that produced the value f_cache. (They must be visible in the body of f.) Now modify the definition of procedure f to resemble the following (assuming that these are lexically scoped local variables):
 > f := proc()
       if assigned( f_cache ) and [ f_cache_args ] = [ args ] then
           return f_cache
       end if;
       ... # original computation; before returning, assign args to
           # f_cache_args and the computed result to f_cache
   end proc:
 Caching is an example of a specialized optimization that can be highly effective if applied intelligently.
 • Make liberal use of the inline procedure option (option inline) to create simple, inlined procedures. This encourages more abstract thinking about a problem at no extra procedure-call cost at execution time.
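 For example, a trivial helper can be declared inline (the body must consist of a single expression for option inline to apply; the name sq is for illustration only):
 > sq := proc( x ) option inline; x * x end proc:
 Calls to sq that appear inside other procedures are then expanded in place when those procedures are simplified, so no procedure call is made at execution time.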
 • In rare cases, it may actually be wise to implement time-critical portions of your application in a low-level language, such as C. You can also use the Compile command to compile Maple to native code and automatically link it into your session. Alternatively, you can access existing high-performance, third-party libraries. In this case, Maple provides a foreign function interface that enables you to dynamically link against external libraries of compiled object code. This approach is used in the LinearAlgebra package, which uses state-of-the-art numeric libraries from NAG to implement some hardware-precision, floating-point linear algebra computations.