Two Tools for Testing

Testing. Possibly the most hated word in the RPG programmer’s dictionary, although “End User” or “Documentation” might run very close seconds. After all, it compiled, so what could possibly be wrong with it?

Part of the problem is the requirement to generate test data. If you’re like us, then you know that your test data is probably not all that it could be. This is particularly the case when it comes to new applications. Keying in hundreds of records is just not our idea of fun and so we have a tendency to try and get away with the minimum amount needed to test our new applications. But that is so yesterday! Today we have alternatives—tools that can generate test data for us!

We briefly introduced you to one such tool back in September 2013. That tool was faker, a free open-source PHP utility.

Introducing generatedata

Since that time, however, we’ve also discovered generatedata, a website that lets you generate test data and download it in CSV, HTML, JSON, XML, and SQL formats, among others. Take a look at Figure 1 and you’ll see some of the options available.

You begin by naming the data set you wish to create and then optionally select the country for which the data is to be generated. In this case, we selected the United States. You then proceed to name each column in turn and select a data type. These range from things like auto-increment numbers such as we used for the customerId, to street addresses, city names, phone numbers and more. Interestingly, when we selected a phone/fax number we had the additional option to select a format for the number. If you look at the figure, you’ll see we selected Canada (2). This is not because we’re based in Canada but because USA was not an option—go figure.

Once you’ve described all of the columns you wish to create, you can then select the format of the data export. Depending on the option you select, you may have additional parameters to set. In our example, that was the delimiter to use for the CSV file. Once generated, we would simply use the CSV file as input to CPYFRMIMPF to populate the table.

Another of the output options is SQL, but sadly DB2 is not currently available. However, you could always use a format such as Oracle and then make any adjustments needed manually. In our experience, the major incompatibilities we’ve encountered lie in the generated CREATE TABLE statements. Since we will be defining our own tables in DDL, this is mostly a non-issue. We can simply delete the table definition statements and use the generated INSERT statements as-is. This seemed to work just fine as long as we stayed away from the MySQL and SQL Server options.
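The clean-up described above can be scripted rather than done by hand. What follows is only a minimal sketch: the function name keepInsertStatements and the sample SQL are hypothetical, not output copied from generatedata. It assumes every statement ends with a semicolon at the end of a line.

```php
<?php
// Hypothetical helper: keep only the INSERT statements from a generated SQL script.
// Assumption: each statement ends with a semicolon at the end of a line.
function keepInsertStatements(string $sql): string
{
    // Split on a semicolon that is followed by a line break or the end of the input.
    $statements = preg_split('/;[ \t]*(?:\R|$)/', $sql, -1, PREG_SPLIT_NO_EMPTY);
    $kept = [];
    foreach ($statements as $statement) {
        $statement = trim($statement);
        // Keep the statement only if it begins with INSERT (case-insensitive).
        if (stripos($statement, 'INSERT ') === 0) {
            $kept[] = $statement . ';';
        }
    }
    return implode(PHP_EOL, $kept);
}

// Made-up sample in the general shape of a generated export.
$sample = <<<'SQL'
CREATE TABLE customers (
  customerId NUMBER,
  company VARCHAR2(255)
);
INSERT INTO customers (customerId, company) VALUES (7, 'Acme; Sons');
INSERT INTO customers (customerId, company) VALUES (14, 'Widgets Inc');
SQL;

echo keepInsertStatements($sample), PHP_EOL;
```

A semicolon inside a value is left alone as long as it is not the last character on its line. Data containing a semicolon followed by a line break would need a real SQL parser.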

The last thing that you select is whether the output is to be generated as a Web page or as a downloaded file. If you choose to generate a Web page, then you’ll need to copy the generated data and paste it where you need it.

Extending the Limits

While you can simply use the generatedata website to create your test data, you’ll quickly discover that there are limits imposed on your usage. For example, the tool provides a feature that allows you to save your settings. This means that you don’t have to start from scratch if later you want to modify the table and then regenerate test data for the new version. But if you try to do this on the website, you’ll fail—unless you have “contributed” (i.e., paid money) and been given a User ID. Similarly, you cannot change the number of test records to be generated; you’re stuck with 100. There may be other limitations as well but these are the two that we encountered during our tests.

However, if you don’t want to spend money, there is a simple alternative: just download the code and run it on your own system! It was only after we downloaded the package that we discovered that generatedata is also written in PHP. We ran it on our Macs, but you can also run it on your IBM i, as long as you have Zend DBi (MySQL) installed.

When you subsequently invoke the package for the first time, it will go through a simple installation process and from that point on all of the blocked features, such as saving your configurations and changing the number of records generated, are enabled.

But what if the available options for data generation don’t meet your needs? Perhaps you need to generate part numbers that have (say) two alpha characters followed by three numerics followed by another alpha? In that case, you have two basic options:

  1. Generate test data that ignores the column in question (or perhaps just generates the numeric portion) and then write a small piece of code to “fix up” that column later.
  2. Modify the package to add your own custom data types. The documentation describes how to do this and the supplied code is well documented for the most part and can be used as a base.
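As an illustration of the first option, here is a minimal sketch of a "fix up" step in plain PHP. The function name makePartNumber, the column names and the sample rows are all hypothetical. It assumes uppercase letters are wanted and that the rows have already been loaded into an array.

```php
<?php
// Hypothetical helper: build a part number shaped like two letters,
// three digits and one more letter.
function makePartNumber(): string
{
    $letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $randomLetter = function () use ($letters): string {
        return $letters[random_int(0, strlen($letters) - 1)];
    };

    return $randomLetter() . $randomLetter()
        . sprintf('%03d', random_int(0, 999))   // zero-padded to three digits
        . $randomLetter();
}

// Placeholder rows standing in for generated data whose part-number
// column was left empty.
$rows = [
    ['partNumber' => '', 'description' => 'Left-handed widget'],
    ['partNumber' => '', 'description' => 'Right-handed widget'],
];

// "Fix up" the column in every row.
foreach ($rows as &$row) {
    $row['partNumber'] = makePartNumber();
}
unset($row); // break the reference left behind by the foreach

print_r($rows);
```

Nothing here guarantees uniqueness. If the column is a key, the script would also need to check each new value against those already issued.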

faker Revisited

The biggest difference between faker and generatedata is that the latter includes a full-blown graphical interface. faker, on the other hand, is a set of API calls that you incorporate in your own logic. This makes it far more flexible, because the data can be generated and used immediately in whatever form you choose. In the following faker code sample, we’ve attempted to come as close as is practical to the output we produced with generatedata. In practice, when using faker, we would normally take the generated data and write it directly to the required table rather than go through the CSV phase. But in the interest of providing a direct comparison, we’ve chosen to produce a CSV file in this case.

The first point to note is the require statement shown at (( A )). This incorporates the faker facilities into the script. At (( B )) we define constants for the comma, quote and end-of-line strings to be used in the CSV output. The CSV file itself is opened at (( C )). The “w” mode opens it for output, creating it if required and emptying it if it already exists.


<?php
require __DIR__ .'/../src/autoload.php';  // (( A ))

define ( 'COMMA', ',' );    // (( B ))
define ( 'QUOTE', '"' );
define ( 'CRLF', "\r\n" );
$handle = fopen("/Users/jonparis/Documents/Articles IBM Magazine/Extra 2015/JonsTestData.csv", "w"); // (( C ))

$faker = Faker\Factory::create(); // (( D ))

$customerId = 0;

for ($i=0; $i < 100; $i++) {

  // Customer numbers go up in steps of 7
  $customerId += 7;
  $csvData = $customerId . COMMA .
             QUOTE . $faker->company . QUOTE . COMMA .       // (( E ))
             QUOTE . $faker->streetAddress . QUOTE . COMMA .
             QUOTE . $faker->city . QUOTE . COMMA .
             QUOTE . $faker->postcode . QUOTE . COMMA .
             $faker->phoneNumber . COMMA .
             $faker->phoneNumber . CRLF;

  $len = fwrite( $handle, $csvData );  // (( F ))
}

fclose( $handle );  // (( G ))


The first use of the faker facilities occurs at (( D )), where we ask the faker factory to create a generator and assign it to the variable $faker. Once the generator has been created, we can call its methods to generate the required test data. You can see an example of this at (( E )). The faker documentation identifies all of the available methods, many of which, as you can see, have similar names to the options we selected in generatedata. Once all of the individual data items have been generated and concatenated together (courtesy of the . operator in PHP), the resulting record is written to the file (( F )). Last, but not least, when all 100 records have been generated, the file is closed (( G )) and the program terminates.
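The script builds each record by hand, which works as long as no generated value contains a quote character. As an alternative sketch, not the approach used in the article, PHP's built-in fputcsv function can take care of quoting. To keep the example self-contained, the rows below are made-up placeholders rather than faker output, and the data is written to php://memory instead of a file.

```php
<?php
// Placeholder rows standing in for values that faker would generate.
$rows = [
    [7, 'Acme "Best" Widgets, Inc.', '12 Main St', 'Springfield', '12345'],
    [14, 'Plain Company', '34 Oak Ave', 'Rivertown', '67890'],
];

// php://memory keeps the sketch self-contained; a real script would open a file path.
$handle = fopen('php://memory', 'w+');

foreach ($rows as $row) {
    // Delimiter, enclosure and escape character are passed explicitly
    // rather than relying on the defaults.
    fputcsv($handle, $row, ',', '"', '\\');
}

// Read back what was written so it can be displayed.
rewind($handle);
$csv = stream_get_contents($handle);
fclose($handle);

echo $csv;
```

Note that fputcsv encloses a field only when it needs to and ends each record with a line feed by default, so the file will not be byte-for-byte identical to one built with the QUOTE and CRLF constants.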

As you can see, even though faker is a completely object-oriented library, we didn’t have to write in an OO fashion to use it. We just had to know how to call the appropriate methods. As we’ve often said before, this is one of the things that we think makes PHP a good choice for RPGers because you can use OO code without having to launch fully into the OO world.

Which To Use—generatedata or faker?

So which of these tools should you choose? This is not an easy question to answer, but here are a few of our thoughts on the topic.

generatedata has a great GUI that’s easy to use and, within limits, can be used without installing any software. As long as the provided data types meet your needs, you don’t need to write even a single line of code. However, should you need data types that it doesn’t include, you must either extend the tool or write a program to fill in the gaps. The same is true of faker, of course, but two things work in its favor.

First, from our limited explorations, faker seems easier to add new data types to than generatedata. This is mostly because with generatedata you not only have to write the data-generator component, but you also have to integrate the new type(s) into the GUI components. With faker, all you have to do is add the generator.

Second, since with faker you’re always writing a PHP script anyway, you could simply add your own logic to generate the new type without actually extending the tool at all. Just code any additional logic needed in your PHP script. This might include, for example, issuing SQL queries against existing tables to obtain real customer numbers or product codes, etc. The only time you’d really need to extend faker itself is if your new data type will be needed frequently by others in your organization.
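To make the idea of mixing in real key values concrete, here is a minimal sketch. The array of customer numbers is a made-up stand-in for the result of an SQL query, and the helper name pickExisting and the order layout are hypothetical.

```php
<?php
// Stand-in for the result of a query against an existing customer table;
// these values are made up for the sketch.
$existingCustomerIds = [1007, 1014, 1021, 1028];

// Hypothetical helper: pick one of the existing values at random.
function pickExisting(array $values)
{
    return $values[array_rand($values)];
}

// Build a few test order rows that always reference an existing customer.
$orders = [];
for ($orderId = 1; $orderId <= 5; $orderId++) {
    $orders[] = [
        'orderId'    => $orderId,
        'customerId' => pickExisting($existingCustomerIds),
    ];
}

print_r($orders);
```

The remaining columns of each row could then be filled in by faker calls in the same loop.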

So there you have it—two tools to make the job of generating test data a lot easier and one less excuse for inadequate testing.

Jon Paris is a technical editor with IBM Systems Magazine and co-owner of Partner400.

Susan Gantner is a technical editor with IBM Systems Magazine and co-owner of Partner400.
