String Arrays in Atari Basic

David E. Carew

Atari Basic differs from most other micro-Basic dialects in its handling of strings. Atari Basic allows strings of any length (limited only by the hardware resource of memory). At the same time an expression like A$(X,Y) in Atari Basic is a substring reference, standing for that piece of A$ beginning at the Xth character position of A$ and running through the Yth position. In many other Basics A$(X,Y) is a string array reference, implying the existence of an array of many strings and referring to the particular string at row X, column Y in the A$ array of many strings.
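
As a small illustration of the substring form (a stand-alone fragment, with a variable name and contents chosen only for this example), the following prints the seventh through eleventh characters of A$, which is the word BASIC. A Basic with string functions would typically write something like MID$(A$,7,5) for the same thing.

10 DIM A$(20)
20 A$="ATARI BASIC"
30 PRINT A$(7,11)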

It is inevitable that those used to reading and programming in other Basics will perceive this difference as a shortcoming of Atari Basic. In fact it is not necessarily a shortcoming at all, but rather a reasonable design decision in implementing a Basic. If substring operations are more common than string arrays, and this is a reasonable assumption for microcomputer applications, then one can eliminate slow and clumsy special function calls such as MID$( ), LEFT$( ), and RIGHT$( ) in favor of compact, direct substring references like A$( ). Properly done, this results in a Basic which executes the more common operations faster. For the occasional application where a string array is needed, you can build your own in Atari Basic by setting up a single "large" string and then defining a calculation which converts a row-and-column reference into the substring reference for the corresponding "piece" of that string. If you stop and think about it, there are no "rows and columns" in a computer's memory. Those Basics which provide arrays do so by simulating rows and columns out of a straight list of memory addresses. We can easily duplicate this behavior by simulating "rows and columns" out of a straight list of character positions in a single, large string. This article shows exactly how this can be done.

Suppose we wish to have a string array 4 rows by 3 columns, with each string in the array having a maximum length of 20 characters. We start by setting these quantities up in variables:

100 ROWMX=4:COLMX=3:LNGMX=20

Given these quantities, we know how long to make our "array" string:

150 TTSIZ=ROWMX*COLMX*LNGMX
200 DIM ARR$(TTSIZ)

We could perform the reference conversion calculations each time a reference is made in the program, but since each repeat of a particular reference would imply a repeat of exactly the same calculation, it is more efficient as well as more convenient to perform the conversion calculations once and store the results in such a way that they are easily accessed as needed. One table (numeric array) for the beginning substring positions and one for ending substring positions allows for convenient addressing and this is illustrated below:

206 REM BG IS BEGIN SUBSTR TABLE, EN IS END SUBSTR
210 DIM BG(ROWMX,COLMX)
220 DIM EN(ROWMX,COLMX)
230 REM INITIALIZE "STR$ ARRAY" CONTROL TABLES
240 FOR RW=1 TO ROWMX:FOR CL=1 TO COLMX
250 BG(RW,CL)=COLMX*LNGMX*(RW-1)+(LNGMX*(CL-1)+1)
260 EN(RW,CL)=BG(RW,CL)-1+LNGMX
270 NEXT CL:NEXT RW
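
A quick way to check the tables is to print them. The lines below are only a sketch, and the line numbers are arbitrary. With the 4 by 3 by 20 example, cell 1,1 should run from 1 to 20, cell 1,2 from 21 to 40, cell 2,1 from 61 to 80, and the last cell, 4,3, from 221 to 240, which is exactly the dimensioned length of ARR$.

280 REM PRINT EACH CELL'S BEGIN AND END POSITIONS
290 FOR RW=1 TO ROWMX:FOR CL=1 TO COLMX
300 PRINT RW;",";CL;": ";BG(RW,CL);" TO ";EN(RW,CL)
310 NEXT CL:NEXT RW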

The only step remaining would be to initialize ARR$ to all blanks (or some other appropriate filler).
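
One simple way to do this, offered as a sketch rather than the fastest method, is to start with an empty string and append one blank at a time until the full length is reached. The loop variable I and the line number are arbitrary choices.

320 ARR$="":FOR I=1 TO ROWMX*COLMX*LNGMX:ARR$(LEN(ARR$)+1)=" ":NEXT I

After this line LEN(ARR$) equals the full dimensioned size, so every cell's substring reference falls within the current length of the string.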

With these extra arrangements made at the start, every occurrence of another Basic's ARR$(X,Y) expression can be replaced with an Atari Basic equivalent:

ARR$(BG(X,Y),EN(X,Y))

This solves the address conversion part of the problem. A detail or two may remain. In most string-array Basic dialects, ARR$(3,4) may have a length of zero, or any other length up to some maximum. In Atari Basic, using string-array simulation, ARR$(BG(3,4),EN(3,4)) has a length of exactly LNGMX, no more and no less. The consequences of this detail depend on the application. For instance, a string-array Basic may test for an empty array cell using a LEN function, like this:

6000 IF LEN(A$(3,4))=0 THEN...

The equivalent array-simulation code might involve a string of length LNGMX initialized to all blanks. An empty cell is then not one whose LEN equals zero, but rather one which is equal to the "always empty" string, e.g.:

6000 IF A$(BG(X,Y),EN(X,Y))=NUL$ THEN...
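
The "always empty" string has to be set up once before any such test is made. A minimal sketch follows, assuming the name NUL$ used above and arbitrary line numbers; it builds a string of exactly LNGMX blanks, the same filler used to initialize the array string.

330 DIM NUL$(LNGMX)
340 NUL$="":FOR I=1 TO LNGMX:NUL$(LEN(NUL$)+1)=" ":NEXT I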

Also, placing a string shorter than LNGMX into a simulated array may require taking its length into account.

7000 ARR$(BG(X,Y),BG(X,Y)-1+LEN(NEW$))=NEW$

The above code places a short (i.e., LEN(NEW$)<=LNGMX) NEW$ into the X,Y cell of ARR$, beginning at the first character position of the cell and taking as many positions in the cell as required by the length of NEW$. This statement is obviously longer, less intuitively clear, and certainly somewhat slower in execution than the non-Atari Basic equivalent:

7000 ARR$(X,Y)=NEW$

However, the simulation still provides a single statement, directly substitutable for the non-Atari equivalent, if for example you are converting a listing from some other Basic. I have found that the other details I have encountered are similarly susceptible to fairly happy solutions.
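
One such detail arises when a cell is reused: storing a shorter string over a longer one leaves the tail of the old contents in place. A possible remedy, sketched here under the assumptions that NUL$ holds LNGMX blanks and that NEW$ is no longer than LNGMX, is to blank the cell first and then store the new value. The guard on the second line skips the store when NEW$ is empty, since the substring's end position would otherwise fall before its beginning.

7000 ARR$(BG(X,Y),EN(X,Y))=NUL$
7010 IF LEN(NEW$)>0 THEN ARR$(BG(X,Y),BG(X,Y)-1+LEN(NEW$))=NEW$

This costs a second statement, so it is worth using only where cells are actually overwritten with shorter strings.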

The next time you have an application which cries out for string arrays (or a possible conversion of a listing which already uses string arrays) you might consider the approach suggested here. Once you have mastered string array simulations for the relatively rare situations where you actually need them, then Atari Basic's compensating payoff of quicker, cleaner substring manipulation seems all the sweeter.

David E. Carew, Interactive Management Systems Corp., 3700 Galley Rd., Colorado Springs, CO 80909.
