File Storage Techniques

From: Michael Current (aa700@cleveland.Freenet.Edu)
Date: 02/12/92-12:05:37 AM Z

Reprinted from the A.C.E.C. BBS (614)-471-8559

      File Storage Techniques

Often when you are writing complex
programs--particularly large data
analysis programs--you will need to
save information in disk files for
later recall.  Those programmers
who have dealt with this aspect of
programming will realize that this
is a completely new area of
programming.

First, it should be emphasized that
when you are considering how to
store your data you MUST plan it
out--preferably on paper.  This
is especially true when you are
storing many different values in
one record.  Without written
records of your storage format it
is just a matter of time before you
get confused.

When writing down the formats of
files on paper I find it useful to
use a sort of shorthand.  I use the
symbol # to represent a 1-byte
entry in a file.  I use the ?
symbol to represent an entry of
varying size.  It is easy to use
this shorthand to represent other
situations as well.  For example,
if you had an entry that was always
6 bytes long you could represent
it with ######.  So your written
records might look like this:

?: User Name
#: User Access Level
##: Number of Downloads
?: Last Date Called

I find this shorthand method VERY
helpful.  I would think it would
help other programmers, too.  If
not, I encourage you to think of
your own shorthand method that
makes sense to you.

Now we come to the topic of HOW to
store the data.  There are really
only two ways to do this: store it
as straight text, or convert it
into single-byte values and store
those.  In my shorthand, a ?
represents pure text.  To read this
type of data from the file you
would use the INPUT statement.  This
method is quite inefficient and is
only practical when storing text
or non-integer numbers (numbers
with fractions, etc.).
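
As a rough sketch of the text
method, the short program below
writes a name to disk with PRINT
and reads it back with INPUT.  It
assumes Atari BASIC; the file name
D:USER.DAT, channel 1 and the
variable N$ are made up for this
example.

10 DIM N$(20)
20 N$="CRAIG"
30 REM WRITE NAME AS A LINE OF TEXT
40 OPEN #1,8,0,"D:USER.DAT"
50 PRINT #1;N$
60 CLOSE #1
70 REM READ IT BACK
80 OPEN #1,4,0,"D:USER.DAT"
90 INPUT #1,N$
100 CLOSE #1
110 PRINT N$

Keep in mind that PRINT ends each
entry with an end-of-line
character, so a text entry takes
one byte more than its length.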

The second method, storing
single-byte values, generally takes
more program code but saves space
in the file.  Example: If you were
to store the value 1 million
(1000000) as text it would require
7 bytes in the file (one for each
digit).  However, if you converted
it into a binary format it would
require only 3 bytes.  Byte 1 would
contain the value 15.  This would
tell the program to multiply 15 by
65536 and arrive at 983,040.  The
second byte would contain the value
66.  This would tell the program to
multiply 66 by 256.  This would
yield the value 16,896.  Finally,
the third byte would represent the
number of ones left over, 64. So
you would add these three numbers
together and get the value of 1
million (983,040+16,896+64=1000000).
So you have saved 4 bytes by using
this storage method.

Note that the above storage format
can store numbers in the range of
0-16,777,215 (three bytes of 255
each).  If you need to store
negative numbers this method can
also be used.  If you use the 8th
bit of the first byte as a sign
flag, you can store values with a
range of -8,388,607 to +8,388,607.
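
As a worked example of the sign
flag: with the 8th bit set aside,
the first byte can hold at most
127, so the largest value is
127*65536+255*256+255=8,388,607.
To store -1,000,000 you would use
the same three bytes as for
+1,000,000 but add 128 to the
first one, giving 143, 66 and 64.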

How do you convert values into 3
byte values?  Those with a
reasonable amount of programming
experience will know how.  But for
those of you who have not yet
ventured into this area of
programming, here are two programs.
The first one will convert the
number NUMBER into a three-byte
sequence: B1, B2 and B3.  The second
program does the opposite: it
converts B1, B2 and B3 back into
NUMBER.  Both programs use the
signed format described above--the
one that allows the use of negative
numbers.

Note: Make sure the value of NUMBER
does not exceed the limits of
-8,388,607 to +8,388,607.

95 REM SPLIT ABS VALUE, FLAG SIGN
100 N=ABS(NUMBER)
105 B1=INT(N/65536)
110 B2=INT((N-B1*65536)/256)
115 B3=N-B1*65536-B2*256
120 IF NUMBER<0 THEN B1=B1+128


95 REM X=1 MEANS SIGN FLAG WAS SET
100 X=0
105 IF B1>127 THEN B1=B1-128:X=1
110 NUMBER=B1*65536+B2*256+B3
115 IF X=1 THEN NUMBER=-NUMBER
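
The same idea works for the ##
entry in the sample record layout
shown earlier: split the number
into a high and a low byte, write
each with PUT and read it back with
GET.  The sketch below stores one
whole record that way.  It assumes
Atari BASIC; the file name
D:USER.DAT, channel 1 and the
variable names are made up for
illustration.

10 DIM N$(20),D$(8)
20 N$="CRAIG":D$="02/12/92"
30 AL=3:DL=1000
40 OPEN #1,8,0,"D:USER.DAT"
50 PRINT #1;N$:REM ? USER NAME
60 PUT #1,AL:REM # ACCESS LEVEL
70 H=INT(DL/256):L=DL-H*256
80 PUT #1,H:PUT #1,L:REM ## DOWNLOADS
90 PRINT #1;D$:REM ? LAST DATE
100 CLOSE #1
110 OPEN #1,4,0,"D:USER.DAT"
120 INPUT #1,N$
130 GET #1,AL
140 GET #1,H:GET #1,L
150 DL=H*256+L
160 INPUT #1,D$
170 CLOSE #1
180 PRINT N$,AL,DL,D$

Because the entries are read back
in the same order they were
written, the program always knows
whether to use INPUT or GET next.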

Even these very simple methods can
save a considerable amount of disk
space.  The same technique can be
used to save memory: If you must
have large amounts of numeric data
in memory, using this technique and
string variables can double the
amount of numbers which can be kept
in memory.  This method would
require only 3 bytes for each value
in contrast to the 6 bytes that
Atari BASIC uses for each entry in
a numeric matrix.
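
A minimal sketch of the memory
trick, assuming Atari BASIC: each
value occupies three characters of
one long string, so entry I sits at
positions I*3-2 through I*3.  The
string name A$ and its size (room
for 100 entries) are assumptions
for this example.

10 DIM A$(300)
20 I=1:B1=15:B2=66:B3=64
30 P=I*3-2
40 REM STORE THE THREE BYTES
50 A$(P,P)=CHR$(B1)
60 A$(P+1,P+1)=CHR$(B2)
70 A$(P+2,P+2)=CHR$(B3)
80 REM FETCH THEM AGAIN
90 B1=ASC(A$(P,P))
100 B2=ASC(A$(P+1,P+1))
110 B3=ASC(A$(P+2,P+2))
120 PRINT B1*65536+B2*256+B3

The last line recombines the three
bytes into the original value.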

Once you master the ideas behind
file storage, you will be able to
design your own methods tailored
to the needs of each situation.  No
single method will work ideally for
all situations.  Hopefully the
routines presented in this file
will get you started in the right
direction.

               -- Craig Steiner

 Michael Current, Cleveland Free-Net 8-bit Atari SIGOp   -->>  go atari8  <<--
   The Cleveland Free-Net Atari SIG is the Central Atari Information Network
      Internet: / UUCP: ...!umn-cs!ccnfld!currentm
     BITNET:{interbit} / Cleveland Free-Net: aa700
