Chapter One
Introducing Assembly Language

Start programming immediately in machine language! Turn on your Atari computer and type in this program. Then run it, type a few words, and you'll see something very interesting on your computer screen.

BONUS PROGRAM NO. 1
"D:HEADSUP.BAS"

10 REM ** "D:HEADSUP.BAS" **
20 REM ** A MACHINE LANGUAGE PROGRAM **
30 REM ** THAT YOU CAN RUN **
40 REM ** STANDING ON YOUR HEAD **
50 REM
60 GRAPHICS 0 : PRINT
100 POKE 755, 4
110 OPEN #1,4,0,"K:"
120 GET #1,K
130 PRINT CHR$(K);
140 GOTO 120

This is, of course, a BASIC program. Line 60 clears your computer screen with a GRAPHICS 0 command. Line 110 opens the Atari keyboard as an input device. Then, in lines 120 through 140, there is a loop that prints typed-in characters on your screen. But the most important line in this program, the line that makes it do what it's supposed to do, is line 100. The active ingredient of line 100, the instruction POKE 755,4, is actually a machine language instruction. In fact, all POKE commands in BASIC are machine language instructions. When you use a POKE command in BASIC, what you're actually doing is storing a number in a specific memory location in your computer. And when you store a number in a specific memory location in your computer, what you're doing is using machine language.
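
To make the idea of "storing a number in a specific memory location" concrete, here is a minimal sketch written in Python rather than Atari BASIC. It models memory as 65,536 numbered cells, the size of the 6502's address space. The names memory, poke, and peek are illustrative assumptions that only mimic what the BASIC commands do; running the sketch has no effect on a real Atari's display.

```python
# Model of memory: 65,536 one-byte cells, addressed 0 through 65535.
memory = bytearray(65536)

def poke(address, value):
    """Store a number (0-255) at a memory address, like BASIC's POKE."""
    if not 0 <= value <= 255:
        raise ValueError("a memory cell holds one number from 0 to 255")
    memory[address] = value

def peek(address):
    """Read back the number stored at an address, like BASIC's PEEK."""
    return memory[address]

poke(755, 4)       # the same store that line 100 of HEADSUP.BAS performs
print(peek(755))   # prints 4
```

On the real machine, the number stored at location 755 is one that the display hardware consults, which is why a single POKE can change what you see on the screen.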

Under the Hood of Your Atari

Every computer has three main parts: a Central Processing Unit (CPU), memory (usually divided into two blocks called Random Access Memory [RAM] and Read Only Memory [ROM]), and Input/Output (I/O) devices.

[Figure: Main Parts of the Atari Home Computer (labeled: Central Processing Unit)]

Your Atari's main input device is its keyboard. Its main output device is its video monitor. Other I/O devices that an Atari computer can be connected to (or interfaced with) include telephone modems, graphics tablets, cassette data recorders, and disk drives. In a microcomputer, all of the functions of a central processing unit are contained in a MicroProcessor Unit (or MPU). Your Atari computer's MPU, as well as its CPU (Central Processing Unit), is a circuit using Large Scale Integration (LSI) called a 6502 microprocessor.

The 6502 Family

The 6502 microprocessor, your computer's command center, was developed by MOS Technology, Inc. Several companies are now licensed to manufacture 6502 chips, and a number of computer manufacturers use the 6502 processor in their machines. The 6502 chip and several updated models, such as the 6502A and the 6510, are used not only in Atari computers, but also in personal computers manufactured by Apple, Commodore, and Ohio Scientific. That means, of course, that 6502 assembly language can also be used to program many different personal computers - including the Apple II, Apple II+, Apple //e and Apple ///; all Ohio Scientific computers; the Commodore PET computer, and the Commodore 64. And that's not all; the principles used in Atari assembly language programming are universal; they're the same principles that assembly language programmers use, no matter what kind of computers they're writing programs for. Once you learn 6502 assembly language, it will be easy to learn to program other kinds of chips, such as the Z-80 chip used in Radio Shack and CP/M based computers, and even the powerful newer chips that are used in 16-bit microcomputers such as the IBM-PC.

The Fountains of ROM

Your computer has two kinds of memory: Random Access Memory (RAM) and Read Only Memory (ROM). ROM is your Atari's long-term memory. It was installed in your computer at the factory, and it's as permanent as your keyboard. Your computer's ROM is permanently etched into a certain group of chips, so it never gets erased, even when the power is turned off. For most home computer owners, that's a good thing. Without its ROM, your Atari wouldn't be an Atari. In fact, it wouldn't be much more than an expensive, high tech doorstop. The biggest block of memory in ROM is the block that holds your computer's Operating System, or OS. Your Atari's operating system is what enables it to do all of those wonderful things that Ataris are supposed to do, such as accepting inputs from the keyboard, displaying characters on the screen, and so on. ROM is also what enables your computer to communicate with peripherals such as disk drives, cassette recorders, and telephone modems. If you own one of Atari's XL series of computers, your unit's ROM package also contains a number of added features, such as a built-in self-diagnostic system, a built-in foreign language character set, and built-in BASIC.

RAM is Fleeting

ROM, as you can imagine, was not built in a day. Your Atari's ROM package is the result of a lot of work by a lot of assembly language programmers. RAM, on the other hand, can be written by anybody - even you. RAM is your computer's main memory. It has a lot more memory cells than ROM does, but RAM, unlike ROM, is fleeting. The trouble with RAM is that it's erasable, or, as a computer engineer might put it, volatile. When you turn your computer on, the block of memory inside it that's reserved for RAM is as empty as a blank sheet of paper. And when you turn your computer off, anything you may have in RAM disappears. That's why most computer programs have to be loaded into RAM from mass storage devices such as cassette data recorders and disk drives. After you've written a program, you have to store it somewhere so it won't be erased when the power goes off and erases your RAM.

Your computer's RAM, or main memory, can be visualized as a huge grid made up of thousands of compartments, or cells, something like tiers upon tiers of post office boxes along a wall. Each cell in this vast memory matrix is called a memory location, and each memory location, like each box in a post office, has an individual and unique memory address. The analogy between computers and post office boxes doesn't end there. A computer program, like an expert postal worker putting mail in post office boxes, can get to any location in its memory about as quickly as it can get to any other. In other words, it can access any location in its memory at random. And that's why user-addressable memory in a computer is known as random access memory.

Its "Letters" are Numbers

Our post office analogy isn't absolutely perfect, however. A post office box can be stuffed full of letters, but each memory location in a computer's memory can hold only one number. And that number can represent only one of three things:

The stored number itself;
A code representing a typed character; or
A machine language instruction.

What Next?

When a computer goes to a memory location and finds a number, it must be told what to do with the number it finds. If the number equates to just a number, then the computer must be told why the number is there. If the number is code representing a typed character, then the computer must be told how the character is to be used. And if the number is to be interpreted as a machine language instruction, the computer must be told that, too.
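
The three possible readings of one stored number can be sketched in a few lines of Python. The example takes the byte 0110 0000, which the 6502 treats as RTS when it is executed as an instruction. Two things here are assumptions made for illustration: the opcode table is deliberately partial, and Python's chr() uses ASCII/Unicode, so the character it shows may differ from the one in the Atari's own character set.

```python
value = 0b01100000  # one stored byte: 96 in decimal, $60 in hexadecimal

# 1. Read as a plain number.
as_number = value

# 2. Read as a character code. Python's chr() uses ASCII/Unicode here;
#    the Atari's own character set may assign a different glyph.
as_character = chr(value)

# 3. Read as a 6502 instruction. Only a tiny, partial opcode table is shown.
OPCODES = {0x60: "RTS", 0x20: "JSR", 0x18: "CLC"}
as_instruction = OPCODES.get(value, "(not in this table)")

print(as_number, repr(as_character), as_instruction)
```

Nothing in the byte itself says which reading is the right one. That depends entirely on what the program does with it.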

Its Instructions are Programs

The instructions that computers are given so that they can find and interpret the numbers stored in their memories are called computer programs. People who write programs are, of course, called programmers. The languages that programs are written in are called programming languages. Of all the programming languages, assembly language is the most comprehensive.

Running a Machine Language Program

When your computer runs a program, the first thing it has to be told is where the program has been stored in its memory. Once it has that information, it can go to the memory address where the program begins and take a look at what's there. If the computer finds an instruction that it's programmed to understand, then it will carry out that instruction. The computer will then move on to the next address in its memory. After it follows the instruction it finds there, it will move on to the next address, and so on. The computer will repeat this process of carrying out an instruction and moving on to the next one until it reaches the end of whatever program has been stored in its memory. Then, unless it encounters an instruction to return to an address within the program or to jump to a new address, it will simply sit there, patiently waiting to receive another instruction.
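
The cycle of fetching an instruction, carrying it out, and moving on can be sketched as a short loop. This Python model is a toy built on stated assumptions, not a 6502 emulator: it understands only six opcodes, does not model the carry flag or decimal mode, and treats RTS as "stop". The function name run and the load address $0600 are chosen purely for illustration. The nine bytes it loads are the machine language form of the short add-2-and-2 routine (CLC, CLD, LDA #02, ADC #02, STA $CB, RTS) that appears later in this chapter.

```python
def run(memory, start):
    """Fetch and carry out instructions one after another, beginning at start."""
    pc = start          # program counter: address of the next instruction
    a = 0               # accumulator: the 6502's main working register
    while True:
        opcode = memory[pc]              # fetch the number at this address
        if opcode in (0x18, 0xD8):       # CLC, CLD: clear a flag (flags not modelled)
            pc += 1
        elif opcode == 0xA9:             # LDA #n: load the next byte into A
            a = memory[pc + 1]
            pc += 2
        elif opcode == 0x69:             # ADC #n: add the next byte to A
            a = (a + memory[pc + 1]) & 0xFF
            pc += 2
        elif opcode == 0x85:             # STA zp: store A at a page-zero address
            memory[memory[pc + 1]] = a
            pc += 2
        elif opcode == 0x60:             # RTS: treated here as end of program
            return pc
        else:
            raise ValueError(f"unknown opcode ${opcode:02X} at ${pc:04X}")

memory = bytearray(65536)
program = bytes([0x18, 0xD8, 0xA9, 0x02, 0x69, 0x02, 0x85, 0xCB, 0x60])
memory[0x0600:0x0600 + len(program)] = program   # load the program into RAM
run(memory, 0x0600)
print(memory[0xCB])   # prints 4
```

Notice that the loop has to be told where to start, exactly as the text says, and that some instructions occupy two addresses (the instruction itself plus the number it works on), so "the next address" means the address just past the instruction that was carried out.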

Computer Languages

As you know, programs can be written in dozens of computer languages such as BASIC, COBOL, Pascal, LOGO, and so on. Languages like these are called high level languages, not because they're particularly esoteric or profound, but because they're written at too high a level for a computer to understand. A computer can actually understand only one language, machine language, which is written entirely in numbers. So before a computer can run a program written in a high level language, the program must somehow be translated into machine language.

Programs written in high level languages are usually translated into machine language using software packages called interpreters and compilers. An interpreter is a piece of software that can convert a program into machine language as it is being written. Your Atari BASIC interpreter is a high level language interpreter. Interpreters can also be used to convert a few other high level languages, such as LOGO and Pilot, into machine language. A compiler is a software package designed to convert high level languages into machine language after they are written. COBOL, Pascal and most other high level languages are usually translated into machine language with the help of compilers.

Machine Language Assemblers

Interpreters and compilers are not used in writing assembly language programs. Assembly language programs are almost always written with the aid of software packages called assemblers. A number of assemblers for Atari computers are available, including Atari's very advanced Macro Assembler and Text Editor package. An assembler doesn't work like an interpreter, or like a compiler. That's because assembly language is not a high level language. One could say, in fact, that assembly language is not really a programming language at all. Actually, assembly language is nothing more than a notation system used for writing machine language programs using alphabetical symbols that human programmers can understand.

What we're trying to get across here is the fact that assembly language is totally different from every other programming language. When a high level language is translated into machine language by an interpreter or compiler, one instruction in the original programming language can easily equate to dozens - sometimes even hundreds - of machine language instructions. When you write a program in assembly language, however, every assembly language instruction that you use equates to just one machine language instruction with exactly the same meaning. In other words, there is an exact one-to-one relationship between assembly language instructions and machine language instructions. Because of this one-to-one correspondence, machine language assemblers have a much easier job than interpreters and compilers have.

Since assembly language programs (often called source code) can be converted directly into machine language programs (often known as object code), an assembler can just zip right along, turning source code listings into object code without having to struggle through any of the tortuous translation contortions that interpreters have to face each time they carry out their appointed rounds. Assemblers also have one other advantage over compilers. The programs that they produce tend to be more straightforward and less repetitious. Assembled programs are more memory efficient and run faster than interpreted and compiled programs.

The Programmer's Plight

Unfortunately, a price has to be paid for all of this efficiency and speed; and the individual who pays that price is, sadly enough, the assembly language programmer. Ironically, even though assembly language programs run much faster than programs written in high level languages, they require many more instructions and take much longer to write. One widely quoted estimate is that it takes an expert programmer about ten times as long to write an assembly language program as it would take him (or her) to write the same program in a high level language such as BASIC, COBOL, or Pascal. On the other hand, assembly language programs run 10 to 1000 times faster than BASIC programs, and can do things that BASIC programs can't do at any speed. So if you want to become an expert programmer, you really have no choice but to learn assembly language.

How Machine Language Works

Machine language, like every other computer language, is made up of instructions. As we have pointed out, however, every instruction used in machine language is a number. The numbers that computers understand are not the kind that we're accustomed to using. Computers think in binary numbers - numbers that are nothing but strings of ones and zeros. Here, for example, is part of an actual computer program written in binary numbers (the kind of numbers that a computer understands):

0001 1000
1101 1000
1010 1001
0000 0010
0110 1001
0000 0010
1000 0101
1100 1011
0110 0000

It doesn't take much imagination to see that you'd be in for quite a struggle if you had to write long programs, which typically contain thousands of instructions, in binary-style machine language. With an assembler, however, the job of writing a machine language program is considerably easier. Here, for example, is the above program as it would appear if you wrote it in assembly language:

CLC
CLD
LDA #02
ADC #02
STA $CB
RTS

You may not understand all of that yet, but you'll have to admit that it at least looks more comprehensible. What this program does, by the way, is add 2 and 2. Then it stores the result of its calculation in a certain memory location in your computer - specifically, memory address 203. Later on we'll come back to this program and take a closer look at it. Then you'll get a chance to see exactly how it works. First, though, we're going to go into a little more detail about assemblers and assembly language.

Assembly Language and BASIC Compared

Assembly language is written using three-letter instructions called mnemonics. Some mnemonics are quite similar to BASIC instructions. One assembly language instruction that's much like a BASIC instruction is RTS, the last instruction in the sample routine we just looked at. RTS (written 0110 0000 in machine language) means "ReTurn from Subroutine." It's used much like the RETURN instruction in BASIC. There's also an assembly language mnemonic that's similar to BASIC's GOSUB instruction. It's written JSR, and means "Jump to SuBroutine." Its equivalent in binary coded machine language is 0010 0000.

Not all assembly language instructions bear such a close resemblance to BASIC instructions, however. An assembly language instruction never tells a computer to do something as complex as draw a line or print a letter on a screen, for example. Instead, most assembly language mnemonics instruct computers to carry out very elementary tasks such as adding two numbers, comparing two pieces of data, or (as we have seen) jumping to a subroutine. That's why it often takes vast numbers of assembly language instructions to equal just one or two words in a high level language.

Source Code and Object Code

When you write an assembly language program, the listing that you produce is called source code, since it's the source from which a machine language program will be produced. Once you've written an assembly language program in source code, you can run it through an assembler. The assembler will then convert it into object code, which is just another name for a machine language program produced by an assembler.
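
The step from source code to object code can be sketched as a simple table lookup. The Python below is a bare-bones illustration, not a real assembler: its opcode table covers only the six instructions in this chapter's sample routine, it assumes any operand without a # sign is a page-zero address, and the name assemble_line is made up for this example.

```python
# Partial table: (mnemonic, addressing mode) -> 6502 opcode byte.
OPCODES = {
    ("CLC", "implied"): 0x18,
    ("CLD", "implied"): 0xD8,
    ("LDA", "immediate"): 0xA9,
    ("ADC", "immediate"): 0x69,
    ("STA", "zeropage"): 0x85,
    ("RTS", "implied"): 0x60,
}

def assemble_line(line):
    """Translate one line of source code into its machine language bytes."""
    parts = line.split()
    mnemonic = parts[0].upper()
    if len(parts) == 1:                       # no operand, e.g. CLC
        return bytes([OPCODES[(mnemonic, "implied")]])
    operand = parts[1]
    if operand.startswith("#"):               # immediate value, e.g. #02
        text = operand[1:]
        mode = "immediate"
    else:                                     # address, e.g. $CB
        text = operand
        mode = "zeropage"
    # A leading $ marks hexadecimal; otherwise the number is decimal.
    number = int(text[1:], 16) if text.startswith("$") else int(text)
    return bytes([OPCODES[(mnemonic, mode)], number])

source = ["CLC", "CLD", "LDA #02", "ADC #02", "STA $CB", "RTS"]
for line in source:
    print(f"{line:<8} -> {assemble_line(line).hex(' ').upper()}")
```

Each line of source code produces exactly one machine language instruction (one opcode byte, plus an operand byte when the instruction needs one), which is the one-to-one relationship that sets an assembler apart from an interpreter or a compiler.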

The Speed and Efficiency of Machine Language

Since assembly language instructions are so specific (you might even say primitive) it obviously takes lots of them to make up a complete program; many, many more instructions than it would take to write the same program in a high level language. Ironically, machine language programs still take up less memory space than programs written in high-level languages do. That's because when a program written in a high level language is interpreted or compiled into machine language, big blocks of machine code must be repeated every time they are used. But in a well-written assembly language program, a routine that's used over and over can be written just once, and then addressed as many times as needed with JSR, RTS, and similar commands. Many other kinds of techniques can also be used to conserve memory in assembly language programs.
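
The saving from writing a routine once and calling it with JSR can be worked out with a little arithmetic. On the 6502, a JSR instruction occupies three bytes (the instruction plus a two-byte address) and an RTS occupies one. The 20-byte routine in this Python sketch is a hypothetical figure chosen for illustration, and the function names are invented for the example.

```python
JSR_SIZE = 3   # JSR opcode plus a two-byte address
RTS_SIZE = 1   # RTS is a single byte

def inline_size(routine_bytes, uses):
    """Bytes needed when the routine's code is repeated at every use."""
    return routine_bytes * uses

def subroutine_size(routine_bytes, uses):
    """Bytes needed when the routine is stored once and called with JSR."""
    return routine_bytes + RTS_SIZE + JSR_SIZE * uses

# Hypothetical 20-byte routine used 1, 2, and 10 times.
for uses in (1, 2, 10):
    print(uses, inline_size(20, uses), subroutine_size(20, uses))
```

With a single use, the subroutine version is slightly larger (24 bytes against 20), but by the tenth use it needs 51 bytes where the repeated version needs 200.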
