You are currently on IBM Systems Media’s archival website. Click here to view our new website.

IBM i > DEVELOPER > RPG

RPG Fundamentals: Arrays

RPG Fundamentals: Arrays

This month we're beginning a series on modern RPG fundamentals. So many changes have taken place in the language over the years that, increasingly, we find that students in our classes are unaware of many of the basic tools available to them. So, begging your forgiveness in advance if this is all "old hat" to you, we're going to start off with array processing. In this piece, we're looking at loading and searching arrays. Note that our intent is to focus on RPG's built-in facilities, and not the many more varied and flexible array functions provided by add-on libraries such as Arraylist, which was recently added to the OSSILe Open Source collection. Hopefully by the time you finish this series you'll be eager to explore the capabilities of libraries such as this.

Over the years, as we have added languages other than RPG to our personal toolboxes, one thing has become abundantly clear. As RPGers we do not make as much use of arrays as we should. Often we find people using a work file, for example, when an array would have been a better performing choice. There are probably many historical reasons for this. For one, until V6 came along, the maximum size of an array was limited to 32K elements. For another, array operations such as SORTA assumed that all elements in the array were in use. This meant that you had to preset the array content to high values in order to ensure that you didn't end up with a bunch of blanks/zeros at the beginning of the array. Last but not least, performance of array look ups was poor. But we're getting ahead of ourselves—more on this later.

Basic Array Searching

In the code example below, (F) shows the initial definition of the sample array we'll be using. For these tests we used a size of 5,000 elements. (G) shows the invocation of the FillArray() subprocedure. We decided to load the array with 5,000 entries and to use a maximum value of 10,000 for the valueArray values. This will produce data that in our simple tests could theoretically find a match roughly 50% of the time. See the sidebar "Generating test values" for more details on the method used.

At (I) we use %lookup to search the array for the value in searchFor, test the result of the search and increment the appropriate counter.

(F) dcl-s  valueArray  Packed(7) Dim(5000);

     // Fill array with random values
(G) FillArray( 5000 : 10000 ) ;  

(H) For searchFor = 1 to maxValue;
(I)   location = %LookUp( searchFor: valueArray );
      If location = 0;
        countMissing += 1;
      Else;
        countFound += 1;
      EndIf;
    EndFor;

This approach to searching an array is the most common that we have encountered over the years—but it is relatively inefficient. This is because coding it this way requires %Lookup to perform a linear search. In other words, it will compare each entry in the array in turn against the search value. The result is that, on average, for an array of n elements the number of comparisons will be n / 2. In our case, that means that an average of 2,500 comparisons need to be made to determine if the search argument is present in the array or not.

If this method is inefficient why is it still so commonly used? There are probably a number of reasons, but the most likely is that this method of processing was the only one available when using the old LOOKUP op-code. And while RPGers have embraced %Lookup as a replacement for the op-code, many are unaware that %Lookup offers a high-speed search option. So how do we utilize that?

High-Speed Searching.

There are two basic changes required to our program in order to take advantage of the high-speed search. First we need to add the ASCEND (or DESCEND) keyword to the array definition. You can see that at (J) below.

Second we must ensure that the array is actually in the sequence that we have told the compiler. In order to do this we also needed to add a SORTA (K) operation prior to beginning the search loop.

These are the changed/added lines:

(J)    dcl-s  valueArray  Packed(7) Dim(10000)  Ascend;

(K)    SortA valueArray;  // Sort on value

How much faster is this approach? In a word: LOTS. In our tests of 10,000 searches on our 5,000 element array, the fast search outperformed the basic search by a factor of over 100 to 1. That's a big difference. You can see for yourself from the two program displays shown below. You can guess which one is which!

DSPLY  Found 3285 - Missing 6715 - Time = 1.797 sec

DSPLY  Found 3148 - Missing 6852 - Time = .017 sec 

In case you are wondering, we performed our timing by using %Timestamp() to capture the time immediately before and after the FOR loop and then using %Diff to calculate the number of seconds between the two.

The larger the array, the more dramatic the speed difference. For example, doubling the size of the array to 10,000 elements will result in roughly doubling the elapsed time for the traditional linear method. The time for the fast search though will not show any significant change. How can this be?

The answer is actually quite simple: The fast search enabled by the Ascend/Descend keyword is implemented by using a "binary chop" technique. In this approach the first comparison is against the middle element in the array. If that is not a match then the array is "chopped" into two parts, and a test is then made to determine if the required value is (potentially) in the lower or upper part of the array. Whichever part is selected is then subjected to the same process. I.e., the midpoint of the selected portion of the array becomes the next comparison position. Hopefully you can see from this example why doubling the size of the array has little impact on the time taken. The very first "chop" will reduce the scope of the array back to its original size—so the number of additional operations required is minuscule compared with the impact on a linear search.

Jon Paris is a technical editor with IBM Systems Magazine and co-owner of Partner400.

Susan Gantner is a technical editor with IBM Systems Magazine and co-owner of Partner400.



Like what you just read? To receive technical tips and articles directly in your inbox twice per month, sign up for the EXTRA e-newsletter here.



Advertisement

Advertisement

2019 Solutions Edition

A Comprehensive Online Buyer's Guide to Solutions, Services and Education.

New and Improved XML-INTO

Namespace support makes the opcode a viable option

Authenticating on the Web

The finer points of OpenRPGUI, Part 1

The Microphone is Open

Add your voice: Should IBM i include open-source RPG tools?

IBM Systems Magazine Subscribe Box Read Now Link Subscribe Now Link iPad App Google Play Store
IBMi News Sign Up Today! Past News Letters