May 2025 (First Draft) - October 2025 (Final)

Last month, Bill Gates celebrated the 50th anniversary of Microsoft by releasing its earliest known source code: the Altair BASIC interpreter 3.0. The significance of Microsoft's first endeavor has been discussed ad nauseum by industry and insiders for decades - to which, I have nothing to add.
As for the source code, I hope to help future code readers enjoy exploring minds from an earlier era. The first hurdle? Converting the 314 pages of fanfold impact printouts into something easier to analyze. Here you go:
Altair BASIC 3.0 source on GitHub:
- WYSIWYG from the paper feed (repo / raw)
- Assembly only (with and without line numbers)
- ...and more view slices discussed in the repo
Armed with accessible source, we're ready to start the journey to appreciate what Gates called "the coolest code I've ever written".
My Goal: To break knowledge barriers preventing modern programmmers from jumping in to the code by collecting foundational knowledge in one place (right here!)
Solid Foundations - Prerequisite knowledge for Altair BASIC
Computing in 1975 was more simple than today. It was possible for a young person to grasp the entire computation chain: from hardware and machine instructions through the single running program to the display / teleprinter. Much like three talented 20-somethings that shoehorned this BASIC interpreter in to 4K of memory. Simple ... but obscure by today's standards.
We'll walk through the tools of the era in rough order of importance. These tools are essential to understand exactly what we're seeing in the source code without constant cross-referencing to ask "What is this?". As we move down the list, the resources will address the broad implications of "Why" rather than "What or How".
The Intel 8080 CPU
Key Resource:The 8080 Assembly Language Programming Manual (1975)
All assembly for Altair BASIC targets the Intel 8080. It's on the programmer to bootstrap this chip to support their application. If you're coming from an x86 background (like me), then you have to retrain your thinking since this processcor predates the 8086 by four years. The architecture at the time was known as MCS-80, and had several quirks were eventually ironed out. The most important points for our purposes are:
The Registers

General purpose registers are 8-bits wide. The register names do not follow later conventions leading to software compatibility complications with later x86 CPUs. Fortunately, the convention isn't too far from what the 8080 provides.
Register Pairs
The 8080 implements 'register pairs' for performing 16-bit computations using two 8-bit registers. This idea falls away with the more generalized x86 architecture after 1978. The key point for the Altair BASIC source is to know the pairs, how to reference them, and the 8 specific instructions involving pairs. In general, instructions reference the given register unless it is a special pair instruction.
For example, the INX B instruction will increment the combined B,C value by one. In most cases this means 'INX B' will actually only increment the value in C unless it is all 1s, which carries in to B. The special instructions to watch for are:
| Instruction | Valid Regs | Purpose |
|---|---|---|
| PUSH | B,D,H,PSW | Adds the two-byte contents of the register pair to the stack |
| POP | B,D,H,PSW | Fills the register pair with two bytes from the stack |
| DAD | B,D,H,SP | 16-bit add of the register pair and H,L contents. DAD H is equivalent to left shift of H,L |
| INX | B,D,H,SP | 16-bit increment of the target register pair |
| DCX | B,D,H,SP | 16-bit decrement of the target register pair |
| XCHG | None | Swaps H,L with D,E |
| XTHL | None | Swaps H,L with *SP+1,*SP respectively |
| SPHL | None | Moves the contents of H,L in to SP directly (no memory changes) |
Interrupt Vectors
The first 64 words in memory (0o0000-0o0100) are dedicated to handling interrupts. External devices invokve interrupts by supplying an RST instruction with a handler address. The processor expects the application programmer to lay out eight routines of 8 bytes each from at the start of memory. If the application does not use hardware interrupts then should be disabled on initialization with a 'DI' instruction. Refer to Chapter 5, Page 60 of the 8080 guide for the details.
Altair BASIC disables all hardware interrupts and instead uses the handlers to dispatch common functions. The CPU still responds to RST invoked from within the application, but will not listen for hardware RSTs. Discussion about the Altair BASIC handlers continues below
The MACRO-10 Assembler
Key Resource: The MACRO Assembler Reference Manual (1973 and 1978*)
Altair BASIC was written on Harvard's PDP-10 and tested on an 8080 emulator written by Paul Allen. Consequently, Gates and Allen used the MACRO-10 assembler and its extended features. You'll need to recognize these features in order to quickly read through the code.
*The 1978 reference didn't exist when Altair BASIC was developed, but I find the quality easier to read than the earlier manual.
Pseudo-Ops
Instructions provided by MACRO-10 range from crafting specific bytes and sequences to merely visual styling of the source listing when printed. For Altair BASIC, know these operations:
| Pseudo-Op | Count | Purpose |
|---|---|---|
| BLOCK | 49 | Reserves memory space. No code generated. Next instruction address skips by BLOCK size |
| DEFINE | 29 | Creates a symbol as a macro of other symbols. Similar to #define in C |
| END | 2 | Mandatory last statement in a MACRO program |
| EXTERNAL | 16 | Reserves symbols for resolution at load time. Cannot be defined in current source |
| IFE | 205 | Conditional branching assembly - see below for further discussion |
| IFN | 351 | Conditional branching assembly - see below for further discussion |
| INTERNAL | 27 | Tags symbol as global to all other loaded programs. Must be defined in this source |
| LIST | 2 | Resumes listing the program following an XLIST statement |
| PAGE | 48 | Ends the logical page of source. Only relevant for source display |
| PRINTX | 21 | Outputs text during assembly |
| RADIX | 3 | Sets the radix (base) in the program and is applied to all numbers |
| RELOC | 8 | Directly sets the address of the next (and subsequent) instructions |
| SALL | 2 | Suppresses all macros and expansions. Helpful when listing sources |
| SEARCH | 2 | Declares a source for MACRO-10 to search for unresolved symbols |
| SUBTTL | 50 | Sets the subtitle see at the top of source listing pages |
| TITLE | 2 | Sets the title of the program |
| XLIST | 2 | Suspends output of the program listing during Pass 2 |
| XWD | 53 | Generates code specified. Useful for crafting exact instructions. See below |
Conditional Assembly
IFE and IFN evaluate expressions follow the branch that evaluates to 0 or not 0 respectively. The source makes extensive use of IFE and IFN pseudo-ops to generate only the code necessary for the build target. The target is chosen with pre-configured with constant values in a 'Common File' prior to assembly. The assembler determines which code paths are dead and only generates output on the desired path based on evaluation of the given expression. The source includes all possible code paths, but only one specific path is built based on the configuration.
Here is a sneak peak of the 'Common File' from the first page of source showing IFE in action. We will discuss how to read this properly in coming sections. Bolded comments are mine:
BASIC MCS 8080 GATES/ALLEN/DAVIDOFF MACRO 47(113) 03:12 10-SEP-75 PAGE 1
C 6-SEP-64 03:11 COMMON FILE
1 00100 SEARCH MCS808
2 00200 SUBTTL COMMON FILE
3 00300 SALL
4 000002 00400 LENGTH==2 /* LENGTH of 2 implies 12K (Extended) BASIC build target */
5 000001 00500 REALIO==1 /* REALIO of 1 means real machine, not simulator */
6 000000 00600 CASSW==0 /* The 000000 to the left shows generated constant */
7 000000 00700 PURE==0
8 000000 00800 LPTSW==0
9 000000 00900 DSKFUN==0
10 000000 01000 CONSSW==0
11
12 000016 01200 CLMWID==^D14
13 000000 01300 RAMBOT=^O20000
14 000001 01400 CONTRW==1
15 01500 IFE REALIO,< /* IFE expects a 0 but gets a 1 -- branch not taken */
16 /* Nothing generated 01600 LPTSW==0
17 * 01700 CASSW==0
18 * 01800 CONSSW==0
19 * 01900 DSKFUN==0
20 * 02000 CONTRW==0>
21 *
22 * 02200 IFE LENGTH,< /* IFE expects a 0 but gets a 2 -- branch not taken */
23 * 02300 EXTFNC==0 /* These are the settings for 4K BASIC */
24 * 02400 MULDIM==0
25 * 02500 STRING==0
26 * 02600 CASSW==0
27 * 02700 LPTSW==0
28 * 02800 DSKFUN==0
29 * 02900 CONSSW==0
30 * 03000 CONTRW==0>
31 *
32 * 03200 IFE LENGTH-1,< /* IFE expects a 0 but gets a 1 -- branch not taken */
33 * 03300 EXTFUN==1 /* These are the settings for 8K BASIC */
34 * 03400 MULDIM==1
35 */ 03500 STRING==1>
36
37 03700 IFE LENGTH-2,< /* IFE gets a 0 -- branch taken and constants output */
38 000001 03800 EXTFUN==1 /* These are the settings for Extended BASIC */
39 000001 03900 MULDIM==1
40 000001 04000 STRING==1>
Other Small Details
- OCTAL format is used for nearly all addresses throughout the source. Couting is 1-7, 10-17, 77 + 1 = 100, etc.
- Symbols have a maximum of 6 characters. This includes jump lables and variables
- Some symbols and functions are not resolved with the source, such as ADR and DC. These are likely in MCS808 shown on line 1
- Many pseudo-ops are dedicated to clean code display and not code generation, like TITLE, PAGE, SALL, etc
- The XWD pseudo-op is used in unexpected ways to shorten code generation. More on that later.
The basics of BASIC
Key Resource: MITS BASIC Manual (1975)
The design and operation of BASIC interpreters were well-known in 1975. Dartmouth BASIC had been widely shared for over 10 years and universities included interpreter implementation in computer science curricula. This allowed Gates and Allen to focus on the immediate challenge of implementing BASIC on the resource-constrained Altair.
For us, it's helpful to understand what a generic BASIC intepreter does beforehand so we can focus on the Altair implementation and appreciate the optimizations made it feasible on limited hardware. The BASIC manual linked above tells us how the software should behave and provides many hints about the implementation.
The user view of the runtime experience:

Even in 1975, interactions and outputs feel similar to a modern Python interpreter. Users input single immediate (direct) command, such as PRINT 2+2, and expect an immediate response. Alternatively, a user may enter a long program of (indirect) commands, and the computer will proceed accordingly with potentially interactive outputs. An interpreter facilitates this experience by reading input, checking for validity, evaluating requests, providing output, and preparing for the next commands.
Altair BASIC compresses this process as much as possible in order to operate with minimum memory space. There isn't enough resources to support generalized semantic analysis as the output will happen in a single pass. Under the hood, the solution looks like this:

The names in [square brackets] are the symbol names of the supporting data structures used in Altair BASIC. We will delve in to the data structures during the loading discussion and step through this loop during the execution discussion.
The Altair 8800
Key Resource: Altair 8800 Operator's Manual
Optional Resource: Altair 8800 Theory of Operations and Schematics
Ironically, the Altair itself is last on the list of things to know for the BASIC source code. We can talk all day about the box itself -- it's the entire reason we're here! But for our purposes, BASIC talks to the 8080 while it lives in the memory supplied by the Altair. The challenges of manipulating the front panel and configuring peripherals (teletypes, tapereaders, etc) are assumed to be overcome by the time BASIC loads. Considering that the mythology says Gates and Allen didn't even have a physical Altair to build and test BASIC on, we can also get by without digging in too far. Feel free to dig in to the manuals, but the most important points for us are:
- Three memory configurations: 4K, 8K, and Extended (12K or more)
- Altair uses a two-stage loading process to start BASIC (Details on page 46 of the BASIC manual)
- First stage loader begins to read from an external source (tape, cassette, etc)
- Second stage loader verifies that BASIC was loaded correctly and begins execution
- This was the loader that legends say Allen wrote while the plane landed for the product demo
- Altair begins executing at memory location 0, which is where we begin the journey into the BASIC source
The Source Code
Here are facts about the source released by Bill Gates in April 2025:
- Source version states that it is Altair BASIC 3.0
- 314 pages of fanfold paper scanned in to 157 pages of PDF (2 physical pages per PDF page)
- One physical page is 11" x 14" 7/8
- 66 lines per page (6 LPI) with 132 characters per line. From the top:
- 2 unused lines
- 2 header lines
- 1 blank line
- 53 source content lines
- 8 blank footer lines
- Source is two files combined (F3 and F4) which includes the common (config) file for each prepended
- F3 printout is dated 10 September 1975
- F4 printout is dated 27 August 1975
- 12,628 lines of source, including comment lines and whitespace
- F3 contains 7801 lines of source including its common file.
- F4 contains 4827 lines of source including its common file.
- F3 includes the core BASIC intepreter
- F4 contains the math package and the system initalization code
- System initialization code is the last section loaded, runs once, and the space is recovered as program memory area
- Symbol tables from the assemblers first-pass(?) are appended to the source listing after both F3 and F4
Layout of the source release
The source printouts contain several sections of note, which are duplicated between source files F3 and F4.
![]() | F3/F4 Headers - Identifies the subsequent file along with process, print job, and system information | |
| Common File - An 'included' configuration file with static variables for assembler usage. Note that F3 and F4 are using different Common File configurations (12K and 8K respectively). | ||
| F3/F4 Source - The program! F3 is BASIC, F4 is Math package and system initialization | ||
| F3/F4 Trailer - Verification of print job process complete | ||
| Duplicated Pages - Extra pages of F3 snuck in to F4. Physically the same page, but printed and scanned separately | ||
F3/F4 Symbol Table - The list of symbols known to the respective source file. Includes a value,reference and a possible flag such as ext or spd. See the 1978 MACRO book page 6-4 for flag details. | ||
| F3/F4 Cross-Reference Symbol Table - The CR Symbol Table shows a list of line numbers for each symbols. Line number is the assembler assigned number on the far left of a page, not programmer provided number. See section 2 below. |
Source subsections
The F3 and F4 source files group their routines in to subsections, each titled conveniently at the top of each page. The sections are:
| F3 Sections (BASIC Interpreter) | F4 Sections (Math Package) |
|---|---|
| Common File | Common File |
| Version 3.0 | Math Package Configuration |
| Some Explanation | FP Addition and Subtraction |
| RST Routines | Natural Log Function |
| Dispatch Tables,Reserved Words, Error Text | FP Multiplication and Division |
| Low Segment - RAM | Sign, SGN, Float, Neg, ABS |
| Text Contants | FP Movement Routines |
| General Storage Management | Compare Two Numbers |
| Error Handler, Ready, Compacting, New, Clear, Main | Integer, Single, and Double Conversion |
| The LIST Command | Greatest Integer Function |
| FOR Statement | Integer Arithmetic Routines |
| New Statement Fetcher | Double Precision Arithmetic |
| Restore, Stop, End, LINGET, CHRCON | FP Input Routines |
| Run, Goto, Gosub, Return | FP Output Routines |
| Let | Exponentiation and Square Root |
| On Goto | Exponential Function |
| If ... Then | Polynomial Eval and Random Numbers |
| Sine, Cosine, and Tangent | |
| Input and Read | Arctangent |
| Next | System Initialization |
| Formula Evaluation | |
| Dimension and Variable Searching | |
| Multiple Dimension | |
| Function and Integer to Float Routines | |
| Simple User-Defined-Functions | |
| String Functions | |
| Fancy List, Delete, Edit, LLIST | |
| Disk Code | |
| CLOAD, CSAVE, Console | |
| Peek and Poke |
Anatomy of a source page
The source listing from the MACRO assembler includes both the source and generated machine code for the specific platform configured in the common file. I will predominantly focus on the source code on the right hand side of the page. People interested in direct platform code might be interested in following the left-hand side, which is helpful as a check for assembler operation.
Here is an annotated view of a typical page of the source:

Section 2 line numbers are relevant for the source listing only. Sections 3,4 and 5 are the generated output of the assembler. Sections 6 and 7 are the input assembly code
| Section | Purpose |
|---|---|
| 1 | The two-line header at the top of every source page. Important info includes the page, file, and section subtitle |
| 2 | The listing line number. This is only relevant for the printout and isn't a part of the source |
| 3 | The address of the subsequent two words of generated bytes. Ususally sequential unless explicitly skipped (i.e BLOCK statement) |
| 4 | Upper 18-bit word of output. Not directly applicable for the 8080 which has at most 16-bits addressable. Throughout the source code, only 3 values are observed. 001000 whenever the lower byte is a instruction (op-code). 000000 when the lower byte is an operand (data). And 777777 for negative immediate values. |
| 5 | Lower 18-bit word of output. Contains the relevant generated code for each opcode or operand. |
| 6 | Source line number. These were explicitly written by Gates et al during development. |
| 7 | The hand crafted MACRO-10 and MCS-80 assembly code and comments. |
Key points:
- Only addresses appearing in Section 3 are live code paths that appear in the output build. The address is the location in the output. This is controlled by the configuration files
- When sections 3 and 4 are blank yet section 5 has a value: This indicates a constant value referenced by the assembly during a pass
- Values in section 5 with a trailing apostraphe indicate a relocatable memory reference, otherwise treat it as a value (opcode or operand)
Code quirks and idioms
Combine fifty year-old code with resource-constrained hacks and you'll have a code base littered with headscratchers. I'll dissect some ideas to watch for as we walk through the source.
XWD pseudo-op used for flow control
The XWD pseudo-op provided by the MACRO-10 assembler is typically used to craft bit-level specific output without interference from the assembler. Quite literally "Insert only this value at the current address". Microsoft used XWD to change control flow by inserting instructions that force subsequent lines to be read as operands (data) rather than executing them. A few examples:
1874 002072' 001000 000036 27940 SNERR: MVI E,ERRSN ;"SYNTAX ERROR" 1875 002073' 000000 000002 1876 002074' 001000 000001 27960 XWD ^O1000,1 ;"LXI B," OVER THE NEXT 2 1877 002075' 001000 000036 27980 DV0ERR: MVI E,ERRDV0 ;DIVISION BY ZERO 1878 002076' 000000 000013 1879 28000 IFN LENGTH,< 1880 002077' 001000 000001 28020 XWD ^O1000,1 ;SKIP NEXT TWO 1881 002100' 001000 000036 28040 NFERR: MVI E,ERRNF> ;"NEXT WITHOUT FOR" ERROR 1882 002101' 000000 000001 1883 002102' 001000 000315 28060 ERROR: CALL STKINI ;RESET THE STACK AND FLAGS
This code segment has multiple entry points on lines 1874, 1877, and 1881 that set different string references but all 'fall through' to line 1883 without side effects. The trick is that the XWD instructions place an 'LXI B' op-code only without adding the 16-bit immediate operand expected causing the MVI instruction and its operand to be loaded into register pair BC. If we entered this code block at the top from like 1874, the "syntax error" string reference is loaded in to E, then the XWD instruction is executed as LXI B, gobbling up the subsequent MVI and it's operand as data rather than executing them. This repeats on line 1880 and thus we fall through to line 1883.
So why not just use a real LXI B, or even a jump to ERROR? It's more memory expensive -- the assembler will use 3 bytes to fill out LXI and its operands or jump reference. Using XWD only sacrifices 1 byte and smashes BC while weaving 3 code paths that slide to the same exit point without consequential side effects.
Another XWD example:
2372 002646' 001000 000361 34640 FOUND: POP PSW ;TAKE OFF GARBAGE ORIG POINTER 2373 002647' 001000 000170 34660 MOV A,8 ;GET RESERVED WORD # 2374 002650' 001000 000366 34680 ORI 128 ;SET MSB TO FLAG AS RESERVED WORD 2375 002651' 000000 000200 2376 002652' 001000 000362 34700 XWD ^O1000,^O362 ;"JP" AROUND THE POP H AND MOV A,M 2377 002653' 001000 000341 34720 TABEND: POP H ;GET BACK ORIG POINTER 2378 002654' 001000 000176 34740 MOV A,M ;GET BACK ORIG CHAR 2379 002655' 001000 000321 34760 POP D ;GET STUFF POINTER BACK
This produces the same effect of skipping 2 instructions without smashing BC in the process. But why use JP (Jump If Positive)? The previous instruction of ORI 0b10000000, in addition to it's BASIC use for signalling a reserved also, also guarantees that the SIGN flag will be reset (negative), thus the fake JP branch will never be taken and we will fall through, saving us 2 bytes while preserving key registers.
Therein lies the art of weaving together multiple code paths and choosing the most appropriate operation to subtitute based on how many instructions you need to skip while avoiding destructive side effects like smashing important registers. Here is a summary of XWD usage throughout the Altair BASIC source:
| XWD Instruction | Mimicks | Desired effect | Side effects |
|---|---|---|---|
| XWD ^O1000,1 | LXI B | Skips 2 instructions | Smashes register pair BC |
| XWD ^O1000,^O16 | MVI C | Skips 1 instruction | Smashes register C |
| XWD ^O1000,^O21 | LXI D | Skips 2 instructions | Smashes register pair DE |
| XWD ^O1000,^O76 | MVI A | Skips 1 instruction | Smashes accumulator |
| XWD ^O1000,^O362 | JP | Skips 2 instructions | No side effects |
| XWD ^O1000,^O366 | ORI | Ensure register A isn't zero, Z flag is reset, skips 1 instruction | Smashes A, Z flag reset |
CPI instruction as a fast 'type' mechanism
The source often uses the CPI instruction to make a fast determination about an input value. The most common situation is comparing a value in the accumulator for an exact value. If the values match, the ZERO flag will be 1 and the subsequent branch will usually follow a JZ, JNZ, CZ, or CNZ instruction.
In the case of user input, we see CPI often compared to "A". If the user has input a digit (0-9) in to the accumulator, the "A" charcater will assert the Carry flag, while an input letter will not. This way we can quickly determine if the user input a number or letter as expected and branch using JC, JNC, CC, or CNC.
Note that this behavior is not consistent when characters arrive in the accumulator with the RST routine CHRGET, because it directly manipulates flags during the call.
Unresolved symbols
Not every symbol seen in the source has a definition that we can reference. I believe most of these are included in the universal file suggested by SEARCH MCS808 line that unfortunately I can't find references to. It's essentially an #include or 'import' statement. I've made some educated guesses about the meaning of these symbols, but I'm open to better interpretations if anyone has better information. Until then, consider these:
$CODE- Always resolves to 0 (zero) in this assembler pass. Likely the offset to the first address of code (code segment in x86 lingo) to generalize the load address.ADR(x)- Reserves the current two-bytes as an address to the given routine. Wrapped in a localADRPdefinition to include a precedence value for the operator table.DC- Define Constant. Used to allocate static string data to the current space in memory. Similar to the DB directive in x86. Altair BASIC wraps this withDCI,DCE, andDCLhelpers early in the assembly.VMOVE- Used to copy descriptors in the 12K version. This external routine prefers 3-byte descriptors vs the 4-byte internal design used by theMOVEroutine in 4K and 8K.
Number base ambiguity
By default, all addresses are read in octal format (base 8). All immediate values in operands are decimal (base 10). The following control delimiters tell the assembler how to explicitly handle the subsequent number:
^O- Octal number. So^O76is equal to decimal 62 and binary 00111110^D- Decimal number. So^D76is equal to octal 114 and binary 01001100^B- Binary number. Included here for completion, but this isn't actually used in the code
You might wonder about hexadecimal? It turns out that MACRO in this era didn't have a built-in hex notation. Word sizes in the mid 1970s hadn't settled to be a multiple of 8/16, and the PDP-10 itself used 36-bit words (18-bit half-words), to which octal representation is better aligned. The PDP-8 was 12-bit words but the later PDP systems did ultimately land on 16-bit words.
The RET trampoline
The source code makes some use of return instructions not paired with a previous CALL instruction. Some address is pushed on to the stack, possibly much earlier, before the eventual RET invocation. If you see an unexplained return in the command sequence, but a jump brought you to this code block, the look to the nearest PUSH for further explanation.
In fact, be prepared for all forms of changing flow control such as calls, jumps, returns, and direct loading of the program counter with PCHL. It all happens in Altair BASIC.
Loading Altair BASIC
The memory footprint of a program was very easy to understand in the simple days of the 1970s. There is no OS, thus no virtual memory or paging system, and certainly nothing like ASLR. Altair BASIC owned the box and could directly address any physical space in memory. When the program tape read the final byte in to memory, execution is ready to begin from memory location 0.
Before running the program, we'll review how it is laid out in memory, what the key sections are, why the program is organized in this way, and what structures support execution.
The view at load time
Altair BASIC lives in memory in the same order of memory addresses from section 3 of the source pages (example shown earlier). Here is the memory map of the program:

Starting from the interrupt handlers, the code drops down to fixed tables used for dispatch, then to static variable spaces. The 2nd half of the code is the runtime-execution, the largest sections being the BASIC statements and math package.
Altair BASIC follows the classic pattern of keeping single-use and optional-use code at the end so that this space can be recovered for dynamic storage. System initialization code runs once at the end, and the optional math functions follow before, allowing these final two blocks of ~730 bytes to be reclaimed at run time.
Interrupt vectors
We'll dig in to some detail of the handlers now, including the purpose and specific code lines, but save the mechanical details and the extended handlers for the execution discussion
As mentioned earlier, the 8080 CPU architecture allows the first 64 bytes of memory to be used for interrupts: eight handlers with eight bytes each. There are caveats with the Altair BASIC implementation.
- The first instruction at memory location 0 is 'DI', which disables all hardware interrupts system-wide. Interrupts are never enabled again elsewhere in code. No external interrupts will ever invoke the handlers while Altair BASIC is running.
- Software interrupts from within AltairBASIC will still trigger jumps to the interrupt handlers via the RST instruction and macro'd friendly name
- AltairBASIC does not use the first or last handler and instead uses that space for initialization and user-defined functions
- Be aware of PC and stack composition in the handlers - not all use RET directly and some jump arbitrarily
- The discrepancy between the 64 actual bytes for handlers and the 67 bytes noted in the load time layout is due to the final unused routine overrunning its space without consequence
The interrupt handlers are dedicated to common low-level routines called often throughout the code. The RST instructions are MACRO'd to a friendly name. The eight handlers and their purposes are:
| Handler # | Offsets | Macro Name | Purpose |
|---|---|---|---|
| 0 | 0-7 | None | Not an interrupt handler. Jumps to init at the end of code, which self-modifies jump to "OK" |
| 1 | 10-17 | SYNCHK | Checks the character currently pointed at for a specific value, leading to possible syntax error |
| 2 | 20-27 | CHRGET | Increments to the next character and fetches it in to accumulator |
| 3 | 30-37 | OUTCHR | Outputs the value in the accumulator to the terminal |
| 4 | 40-47 | COMPAR | Compares the values of register pairs DE and HL with result in FLAGS. Carry = DE > HL, Zero means equal. |
| 5 | 50-57 | FSIGN | Checks the sign of the floating accumulator by setting the actual accumulator to the same sign |
| 6 | 60-67 | PUSHM | Pushes memory location at HL to the stack. Dereferneces value to BC |
| 7 | 70-103* | None | Not a handler, but can be set by the user for their own purpose. *This handler runs longer than designed |
Now for the lines of code that made up each handler location. I've removed original comments, whitespace, and machine code for clarity. I've added my own comments in bold that I believe may be relevant from the modern viewpoint, but I highly recommend reading the original comments in the source.
Interrupt 0 - Not used (memory location 0-7 octal)13040 START: DI /* Disables external interrupts */ 13060 JMP INIT /* Jumps to system init */ 13080 13160 IFN LENGTH-2,< /* Excess space used for static 13180 ADR(DEINT) pointers to routines */ 13240 ADR(GIVABF)> 13260 13300 IFE LENGTH-2,< 13320 ADR(FECINT) 13340 ADR(MAKINT)>
The first line of this handler is the first line for the whole program: Disable external interrupts, which are never reenabled. The next instruction jumps out of the handler to the system initialzation routine, which modifies this jump instruction to as a jump to 'Ready', which is effectively the top of the main loop. This way if the system ever somehow jumps back to insturction 0, it can pick up normal execution.
Interrupt 1 - SYNCHK (memory location 10-17 octal)
13600 MOV A,M /* Get next char from M in to A */ 13620 XTHL /* Swap HL and SP, SP+1 13640 M now looks at char to verify 13660 CMP M /* If they match, Z=1 */ 13680 13700 INX H /* Manual advance to next instruction */ 13720 XTHL /* Restore to stack for eventual RET */ 13740 13760 JNZ SNERR /* No match - error, else fall down to CHRGET */
SYNCHK verifies that the next input character matches the character given to the handler.
A successful SYNCHK will always execute the next handler (CHRGET) because execution falls through to pull the character in to A.
Interrupt 2 - CHRGET (memory location 20-27 octal)
14060 IFE LENGTH,<CHRGTR:> /* In 4K, we skip a later inline optimization */ 14080 INX H /* Advance pointer to next character */ 14100 MOV A,M /* Pull it in to A */ 14120 CPI ":" /* A with ":". 0-9 force carry bit*/ 14200 RNC /* If a letter, return to get more letters */ 14220 JMP CHRCON /* Continue CHRGET later in source */
Loads the next input character in to A before passing execution to an extended routine, which skips spaces and handles numbers.
It's worth discussing the CHRCON extended handler in this space because it has a subtle flag manipulation:
/* Only digits and control characters make it this far - letters returned already above */ 43280 CHRCON: CPI " " /* Check for space, zero if true */ 43300 JZ CHRGTR /* It's a zero - get another character */ 43320 CPI "0" /* Digits will not have carry asserted at this point */ 43360 CMC /* Invert Carry - Digits will now have carry asserted */ 43380 INR A /* Add one to reset flags */ 43400 DCR A /* Subtract back to original so Zero means A=0 */ 43420 RET
Interrupt 3 - OUTCHR (memory location 30-37 octal)
14360 OUTDO: PUSH PSW /* Save FLAGS */ 14380 IFN CONTRW,< /* If suppression feature is on... 14400 LDA CNTWFL Get the suppression status */ 14420 ORA A> /* Get status of flag in FLAGS*/ 14440 IFE LENGTH!CONTRW!LPTSW,< /* No suppression, so... 14460 LDA TTYPOS> Get the output line position */ 14480 JMP OUTCON /* Jump to extended handler */
Prints the character in A to the output terminal, preserving flags, which are restored in the extended handler discussed later in Execution.
Interrupt 4 - COMPAR (memory location 40-47 octal)
14680 MOV A,H /* Get high 8-bits */ 14700 SUB D /* Subtract other high 8 bits */ 14720 RNZ /* Return if not equal - FLAGS set*/ 14740 MOV A,L /* H=D so repeat for low 8-bits */ 14760 SUB E 14780 RET /* Return with FLAGS set */
Compares values in [DE] to those in [HL] using simple subtraction. Caller must set up register pairs prior to invoking COMPAR. Caller checks flags for intended purpose after return
Carry is set if [HL] < [DE]. Zero is set if [HL] = [DE]
Interrupt 5 - FSIGN (memory location 50-57 octal)
15100 SIGN: LDA FAC /* Get the exponent in to A */ 15120 ORA A /* Get the flags */ 15140 JNZ SIGNC /* Jump to extended handler to test sign */ 15160 RET /* Must be zero - just return */
Tests the sign of the floating point accumulator by grabbing the exponent value and setting the corresponding FLAGS. Recall that using an OR operation on the same input won't change the value, but will set flags for sign, zero, parity.
Interrupt 6 - PUSHM (memory location 60-67 octal)
15420 XHTL /* Stack to HL -> M points at argument to PUSHM */ 15440 SHLD PUSHMA+1 /* Moves it to the end of the the next handler */ 15460 POP H /* Restores the pointer temporarily moved to the stack */ 15480 IFN LENGTH,< /* In the 8k+ BASIC... 15500 JMP $CODE+59 Jumps to MOV C,M in the next handler */
Pushes the value pointed to by M / (HL) on to the stack while retaining its value in BC and advancing the HL pointer. Note that this handler actually falls in to the next handler regardless of implementation. If larger than 4k, we jump around the true handler 7 start so that a user-defined ISR can be defined.
Interrupt 7 - Not used (memory location 70-77 octal, but extends to 0o102)
15560 RET /* Immediately exit this handler */ 15600 NOP /* No-op, but reserved byte for UDF address */ 15620 NOP> /* No-op, but reserved byte for UDF address */ 15640 MOV C,M /* Get low 8-bits value to push in to C */ 15660 INX H /* Move to more significant byte */ 15680 MOV B,M /* Get high 8-bits value to push in to B */ 15700 INX H /* Move to before both bytes */ 15720 PUSH B /* Put it on stack */ 15760 PUSHMA: JMP PUSHMA /* RET set up back in handler 6 */
The continuation of PUSHM, with space retained for a jump to a user-defined handler* (currently RET, NOP, NOP) in the first 3 bytes.
*The user-define handler doesn't appear to be set or self-modified within BASIC, but requires the user to POKE these addresses according to Altair BASIC manual on page 67
Data structures
Altair BASIC relies on several data structures to efficiently control program flow with minimum computation and overhead.
Most of the tables and data structures are generated at build time and loaded as-is. We'll dig in to those as well as a few uninitialized abstractions such as the Floating Point Accumulated (FAC) defined in the analgous .BSS section.
Function Dispatch TableFunctions in BASIC take one or two arguments and return a value. The function dispatch table is defined by the FUNDSP symbol and it consists of 24 sequential 16-bit function pointers. These pointers are defined in the source using the ADR() macro. The 24 BASIC functions are listed here in row-major order:
| SGN | INT | ABS | USR |
| FRE | INP | POS | SQR |
| RND | LOG | EXP | COS |
| SIN | TAN | ATN | PEEK |
| LEN | STR$ | VAL | ASC |
| CHR$ | LEFT$ | RIGHT$ | MID$ |
Operator Precedence Table
This very small table denoted OPTABof 7 bytes in 12k BASIC only contains the values 121, 121, 123, 123, 127, 80, and 70 representing the operators for +, -, *, /, ^, AND, and OR. In earlier versions, the pointers to there routines were also stored here but that is no longer true. The table is still used in Formula Evaluation which we will discuss later. The table is constructed with the following code, half of it 'dead' from the top macro:
16540 DEFINE ADRP(X),<ADR(X)> 16560 IFE LENGTH-2,< 16580 DEFINE ADRP(X)<>> /* ADRP generates no output */ 16600 OPTAB: 121 /* Byte output */ 16620 16640 16660 ADRP(FADDT) 16680 121 /* Byte output */ 16700 ADRP(FSUBT) 16720 123 /* Byte output */ 16740 ADRP(FMULTT) 16760 123 /* Byte output */ 16780 ADRP(FDIVT) 16800 IFN EXTFNC,<127 /* Byte output */ 16820 ADRP(FPWRT)> 16840 IFN LENGTH,< 16860 80 /* Byte output */ 16880 ADRP(AND) 16900 70 /* Byte output */ 16920 ADRP(OR)>Reserved Word List
This section should contain all of the keywords and their matching crunch values - however the code listing appears to be incomplete due to this sequence of commands that disable source listing output:
17060 Q=128-1 17080 DEFINE DCI(A),<Q=Q+1 17090 XLIST /* Suppresses listing output */ 17100 DC(A) 17110 LIST> /* Enables listing output */
The second clue that most of this table is suppressed in the listing is the large gaps in BASIC line numbering that begins just afterward. Normally the line number increment by 20 and sometimes 10, however in the reserved word table we see:
17140 ENDTK==Q 17180 FORTK==Q 17240 DATATK==Q 17360 GOTOTK==Q 17420 IFTK==Q 17480 GOSUTK==Q 17540 REMTK==Q 17660 IFE LENGTH-1,< 17700 ELSETK==Q 17780 IFN DSKFUN,<DCI"DSKO$"> 17800 IFN LPTSW,<DCI"LPRINT"> 17820 IFN LENGTH,< 17880 PRINTK==Q 17940 IFE REALIO,< 17960 DCI"DDT">Assuming regular line number intervals of 20, each of those symbol assignments to the value Q is missing one or more lines in between. That missing line likely increments the Q value using the
DCI macro declared earlier while suppressing the output. Logically, all of those Q assignments would not be equal, but we don't see the increment occurring due to the embedded XLIST in DCI.
The final clue that the listing isn't telling us the entire story is by looking at the output addresses immediately before and after this table:
973 000171' 000000 000106 16900 70
974 16920 ADRP(OR)>
/* ... first half of reserved word table ... */
1006 18160 ; END OF COMMAND LIST
1007 000371' 000000 000124 18180 "T"
1008 000372' 000000 000101 18200 "A"
Source lines 973 outputs a byte at octal memory address 171 but the very next byte after the table writes a contant at octal address 371, imply that we've skipped 128 bytes without the assembler reporting any output between these two points.
Combining the above evidence with the statement dispatch table operator ordering, I will speculate that the suppressed lines of code look something like this (my guesses in bold):
17120 DCI"END" 17140 ENDTK==Q 17160 DCI"FOR" 17180 FORTK==Q 17200 DCI"NEXT" 17220 DCI"DATA" 17240 DATATK==Q 17260 DCI"INPUT" 17280 DCI"DIM" 17300 DCI"READ" 17320 DCI"LET" 17340 DCI"GOTO" 17360 GOTOTK==Q 17380 DCI"RUN" 17400 DCI"IF" 17420 IFTK==Q 17440 DCI"RESTORE" 17460 DCI"GOSUB" 17480 GOSUTK==Q 17500 DCI"RETURN" 17520 DCI"REM" 17540 REMTK==Q 17560 DCI"STOP" 17580 DCI"PRINT" 17600 DCI"LIST" 17620 DCI"CLEAR" 17640 DCI"NEW" 17660 IFE LENGTH-1,< 17680 DCI"ELSE" 17700 ELSETK==Q /* ... table continues ... */
The last byte of the table is zero (NUL)
Statement Dispatch TableAs with the Function Dispatch table, the Statement Dispatch table (STMDSP), is simply an ordered sequence of pointers to the code execution routines.
The table for the 12k build is 100 bytes long and contains the expected 50 statement dispatch pointers out of a possible 56 allowable across all build configurations. The table is organized as follows:
| Index | Statement | Routine | In this build? |
|---|---|---|---|
| 0 | END | END | Yes | 1 | FOR | FOR | Yes | 2 | NEXT | NEXT | Yes | 3 | DATA | DATA | Yes | 4 | INPUT | INPUT | Yes | 5 | DIM | DIM | Yes | 6 | READ | READ | Yes | 7 | LET | LET | Yes | 8 | GOTO | GOTO | Yes | 9 | RUN | RUN | Yes | 10 | IF | IF | Yes | 11 | RESTORE | RESTORE | Yes | 12 | GOSUB | GOSUB | Yes | 13 | RETURN | RETURN | Yes | 14 | REM | REM | Yes | 15 | STOP | STOP | Yes | 16 | ELSE | ELSE | Yes | 17 | TRON | TON | Yes | 18 | TROFF | TOFF | Yes | 19 | EDIT | EDIT | Yes | 20 | OUT | FNOUT | Yes | 21 | ON | ONGOTO | Yes | 22 | NULL | NULL | Yes | 23 | WAIT | FNWAIT | Yes | - | DSKO$* | DSKO$ | No | - | LPRINT | No | 24 | POKE | POKE | Yes | 25 | Yes | 26 | DEF | DEF | Yes | 27 | CONT | CONT | Yes | - | ??? | OUT** | No | 28 | LIST | LIST | Yes | - | LIST | LLIST | No | 29 | DELETE | DELETE | Yes | 30 | CLEAR | CLEAR | Yes | - | CLOAD | CLOAD | No | - | CSAVE | CSAVE | No | - | CONSOLE | CONSOLE | No | 31 | NEW | SCRATH | Yes | 32 | CDBL | FRCDBL | Yes | 33 | CINT | FRCINT | Yes | 34 | CSNG | FRCSNG | Yes | 35 | + | DADD | Yes | 36 | - | DSUB | Yes | 37 | * | DMULT | Yes | 38 | / | DDIV | Yes | 39 | >,<,= | DCOMP | Yes | 40 | + | FADD | Yes | 41 | - | FSUB | Yes | 42 | * | FMULT | Yes | 43 | / | FDIV | Yes | 44 | >,<,= | FCOMP | Yes | 45 | + | IADD | Yes | 46 | - | ISUB | Yes | 47 | * | IMULT | Yes | 48 | / | IDIV | Yes | 49 | >,<,= | ICOMP | Yes |
*Unable to find specific documentation about the disk code from 1975.
**This OUT routine doesn't appear elsewhere in the code, fortunately it is not included in this build

Potentially up to 185 bytes are dedicated to uninitialized variable space with 132 of those bytes backing symbols for interpreter run-time activties. Since this space cannot grow, the function of these symbols is generally limited to pointers, flags, and fixed buffers. I'll identify these symbols, their purpose, and the ideas behind them. Knowing these will speed up code reading significantly since they are often used within critical mechanisms of the interpreter. The remaining 53 bytes support the floating point accumulated discussed in the next section. Let's dig in to these key symbols by type.
| Pointer | Fixed | Points to... |
|---|---|---|
| MEMSIZ | Yes | The highest location in memory. Computed by user input during initialization |
| TEMPPT | No | The first free temporary descriptor (one of TEMPST) |
| FRETOP | No | The highest unused location in 'string space' (think heap). As space is used, FRETOP decreases until it crosses STKTOP. |
| OLDTXT | No | The start of a text statement in progress |
| STKTOP | Usually* | The first stack entry, by default 50 bytes below MEMSIZ |
| TXTTAB | Yes | The start of user text (program) data |
| VARTAB | No | The start of simple variable space |
| ARYTAB | No | The start of array variable space |
| STREND | No | The end of variable storage |
| DATPTR | Yes | The first location after system code - the null 0 at [TXTTAB]-1. Used to 'Read' through user program DATA statement lists. |
Most of these pointers are used to manage the run-time (high) memory allocations. The 'Fixed' column indicates if these pointers will change during normal program execution. Remember that in assembly, there really isn't a concept of data structure or data type. Typically, we use fixed memory addresses to point at the base of a memory region for a data structure and dynamic addresses to point at the 'work' inside of these locations. I'll discuss this idea in more detail during execution when we have to set up the run-time memory using these pointers.
*Generally fixed, but user can change the stack allocation with the CLEAR command
| Flag | Size (bytes) | Purpose |
|---|---|---|
| BUFMIN | 1 | A single character: Comma (ASCII value 44). Denotes the start of the BUF buffer. |
| PRTFLG | 1* | Flags output to go to the printer |
| CNTWFL | 1 | Flag to suppress output via CTRL-O |
| DIMFLG | 1 | Determines if a pointer target is being processed with DIM. |
| VALTYP | 1 | Input type indicator (0 -> Number, 1-> String) |
| OPRTYP | 1** | Stores the operator number. Shares the same memory byte as DORES |
| DORES | 1** | Affects if crunching is performed on input. Shares the same memory byte as OPRTYP |
| SUBFLG | 1 | Flag used to allow subscripted variable names (in parens). Allows scanning to skip parenthese |
| FLGINP | 1 | Determines if the input is INPUT or DATA (0 -> Input) |
| TRCFLG | 1 | Debug tracing in progress (0 -> off or 1 -> on) |
Most flags use a single byte to make a distinction between zero and non-zero value. In researching this code, I wonder why they weren't combined in to a single byte and bitmasked as needed, but I can only assume it was for performance reasons.
*Not used in this published build of Altair BASIC and so the space is actually 0
**OPRTYP and DORES point to the same byte
| Buffer | Size (bytes) | Purpose |
|---|---|---|
| BUF | 72 | The input buffer. All user input lives here before processing |
| LPTPOS | 1* | Position of printer head |
| TEMPST | 15 | Temp storage for multiple descriptors |
| DSCTMP | 3 | Single string descriptors are built in this temporary space |
| TEMP | 2 | Space used to store various things like NEWSTT addresses, LET values, etc |
| TEMP2 | 2 | Temporary space used by formula evaluator. |
| TEMP3 | 2 | Temp space used in garbage collection, formula eval, FOUT, etc. |
| DATLIN | 2 | The data line number, used for error output |
| CURLIN | 2 | The current line number |
| OLDLIN | 2 | The previous line number |
Buffers are where the 'work' goes during processing. I placed the TEMP* symbols here because, unlike flags, their values are arbitrary and they are used to hold immediate values rather than branching system behavior. Several two-byte symbols hold line numbers, thus capping program lengths at 65536 lines - well within 1975 memory limitations.
It's worth noting that both BUF and TEMPST live below DSCTMP, with the latter being directly below it. This relative position determines whether or not strings are copied to string space or not. Strings provided in some constants and user program code are above DSCTMP and do not need to be moved. This quirk is due to the instruction sequence found during string processing. More on that later.
*Not used/allocated by default
Floating Point Accumulator (FAC)The final 53 bytes of .BSS defines the floating point accumulator. The 8080 has no specialized registers to handle floating point representation, thus the obvious solution is to use static memory locations to simulate the behavior of a register by build the instructions to operate on these locations. For now, I'll focus on the storage structure and save algorithms and operations for the execution discussion. The minimum you need to know is that floating-point numbers are stored as a pair comprised of mantissa value and an exponent value which includes a sign bit.
The FAC is defined in the uninitialized variable space (impure space, as the code comments call it). The allocated space depends on the version of BASIC, which this build will focus on the 12K extended. For 12k, we essentially have two floating point storage units (FAC and ARG), used to hold and operate on two values. We also have an output buffer. Here is the space allocated and the symbols pointing to the space:

Executing Altair BASIC
We will walk through system intialization up until the moment the user is presented with 'OK' and may begin creating programs. Then we will walk through the 'main' loop of the interpter and dive in to important code branches like formula evaluation, string operations, and garbage collection.
But first, to grasp what needs to happen during initialization, we should thorougly understand what the run-time memory space should look like and how it will operate. Knowing where we are (after loading) and where we need to be will allow us to connect the dots along the path.
Run-time memory layout
After the BASIC interpreter loads, all memory above the code locations should be available for program use - this includes recovering the space used by the initialization code after it one and only pass. This is why it's located at the end of the source code so that it can easily coalesce with the unused high region. Recall that the load time memory layout for the 12K BASIC has init code at offset 14627 octal. From this location to the end of memory (octal 30000) is the space to stake out using the pointers noted from our BSS discussion.
Intuitively, the high memory area must include all of the 'user produced' content within BASIC to include program code, variables, and string constants. BASIC also needs scratch space to execute the programs (a stack). All of these components must reside in high memory and be spaced appropriately to give users sufficient capability. Consider this memory layout depicting 12K BASIC in mid-operation with data structures in use:
![]() | Run time memory is divided in to 5 sections, which each section delimited by the pointers identified in our .BSS disuccsion. "String Space" is at the very top of memory and has a small allocation of 50 bytes fixed, which grows downward. This space is used for the string values read in to the input buffer. Note these are the actual string values, not the descriptors, which are stored in .BSS and point to this space.Below that is the stack, which grows arbitrarily downward by convention of the 8080 PUSH and POP instructions. From the bottom, we have the user program data, to include variables, which all grow collectively upward. Pointer math We can use simple pointer comparisons to determine the state of the program and to trigger important events or errors.
|
This model of operation has both elegance and challenge. Freeing up dead space could be as simple as just moving the pointer back up. However, reality is that sections may need to be compacted to fill in 'holes' in a segment. When the user adds a new line to the program, variable space must shift upward, requiring block transfers. These situations will be discussed later.
Knowing the desired organization during execution, we can walk backwards to the steps to perform during system initialization. In short: it's simple pointer manipulation.

Recall from the .BSS discussion that some pointers are fixed. These pointers function as the 'base' of the data segments, while the other pointers are offset from the base, denoting the endpointers of the structure (thus size). The variable pointers are interesting hybrids that function both as the base of their structure AND the endpoint of the previous. The bottom line during initialization is that the data structures are empty, thus the offset pointers should equal their assocated base pointers.
TXTTABis the base of program text and indirectly to variable space.VARTAB,ARYTAB, andSTRENDshould equalTXTTABMEMSIZis fixed to the top of memory andFRETOPis set equal to it. After initialization, input strings will be stored here and pushFRETOPdownSTKTOPanchors the stack and lives at a fixed distance belowMEMSIZ. The stack pointer[SP]will begin equal toSTKTOP
System initialization
Altair BASIC is loaded and we're ready to execute starting from the instruction at address 0. But first, lets consider specific details that guide what needs to be accomplished during initialization. The most important idea is that the Altair 8800 does not have an operating system. There is no intermediate library, API, or even syscalls that will offload the details. BASIC itself must supply the I/O routines and runtime memory management. All work must be done using CPU instructions and assembler macros. Let's translate this in to specific actions:

- Locate I/O devices - The Altair supports several forms of I/O including teletypes, printers, tape readers, and cassettes. The code to interface with these devices is included in the BASIC routines. We must assign the proper handlers based on deteced input devices. We can poll the front panel switches and make handler assignments based on what we find. Once I/O is set up, we can prompt the user for more system details.
- Set up the stack (temporary and permanent) - A stack is necessary to use the
CALLandRETinstructions provided by the 8080. The CPU implementation of SP requires the stack grow downward, suggesting that it should be located as high in memory as practical. To make use ofCALLduring initialization, we should set up a stack early and then transition to a permanent location after we learn more about the memory space. The final lines of initialization from 74920-75000 define the temp stack symbolTSTACKwith a substantial block reservation. - Build the dynamic memory space - Armed with functioning I/O and a temporary stack, we can poll the user to input information about the system. We can then implement the memory model described earlier by setting the static pointers used during execution.
- Patch up extended instructions - The user has options for which extended math functions to implement. We need to capture this configuration and patch up the references so that the function work (or fail) as intended. Note that optional functions live just before initialization so that we can remove them from execution and reclaim the space.
- Transition to BASIC execution - Lastly, we implement a permanent stack, fix up a few code insturctions that current point to initialization, and then jump to
READY, the BASIC entry point.
We pick up our code walkthrough at instruction 0.
The beginning ... jump to the end
We're already reviewed the first instruction when we looked at the first interrupt vector. But as a refresher:
13040 START: DI /* Disables external interrupts */ 13060 JMP INIT /* Jumps to system init */
We disable interrupts, preventing external devices from calling BASIC functions. We jump from the bottom of memory to near the top at the beginning of the Init. Near the end of the Init, we will patch this jump to ready/main because after initialization, init will no longer exist as a valid destination.
Setting up I/O
69860 FUNIO==<^D256*^O312>+^O40+$CODE /* Value is ^O145040 */ ... 69940 INIT: 69960 IFN REALIO,< /* If we're using a real machine (we are)*/ 69980 IN 1 /* Read and ignore garbage input */ 70000 IN ^D255 /* Read front panel switches */ 70020 ANI ^O100 /* Check for switch 14 */ 70040 JZ NOTSIO /* Skipping down if no serial port */ 70060 LXI H,FUNIO /* H = ^O145, L = ^O40 */ 70080 SHLD CNLCA2##+2 /* CNLCA2+3 = ^0145 (JNZ -> JZ) , CNLCA2+2 = ^O40 (5th bit asserted) */ 70100 MVI H,^O310 /* Loads opcode for RZ, L remains ^O40 */ 70120 SHLD CNLCA3##+2 /* RNZ -> RZ, ANI IDONE -> ANI ^O40*/ 70140 IFN LENGTH,< /* If we're running 8K+ BASIC...*/ 70160 MVI H,^O304 /* Loads opcode for CNZ */ 70180 SHLD CNLCA4##+2> /* CZ -> CNZ, ANI IDONE -> ANI ^O40*/ 70200 FUNIO==$CODE+<^D256*^O312>+2 /* Value is ^O145002 */ 70220 LXI H,FUNIO /* H = ^O145, L = 2 */ 70240 SHLD CNLCA1##+2 /* JNZ -> JZ, ANI ODONE(128) -> ANI 2 */ 70260 NOTSIO: IN ^D255 /* Read front panel switches again*/ 70280 ANI ^O40 /* Check for switch 13 */ 70300 JZ NOTPIO /* No parallel port? We're done here */ 70320 FUNIO==$CODE+<^D256*^O312>+2 /* Value is ^O145002 */ 70340 LXI H,FUNIO /* H = ^O145, L = 2 */ 70360 SHLD CNLCA2+2 /* JNZ -> JZ, ANI ODONE(128) -> ANI 2 */ 70380 MVI H,^O310 /* Loads opcode for RZ, L remains 2 */ 70400 SHLD CNLCA3+2 /* RNZ -> RZ, ANI IDONE -> ANI 2*/ 70420 IFN LENGTH,< /* If we're running 8K+ BASIC...*/ 70440 MVI H,^O304 /* Loads opcode for CNZ, L remains 2 */ 70460 SHLD CNLCA4+2> /* CZ -> CNZ, ANI 2 */ 70480 FUNIO==$CODE+<^D256*^O312>+1 /* Value is ^O145001 */ 70500 LXI H,FUNIO /* H = ^O145, L = 1 */ 70520 SHLD CNLCA1+2 /* JNZ -> JZ, ANI ODONE(128) -> ANI 1 */ 70540 NOTPIO:>
The first thing we care about in Init is getting the user I/O working because 1) We need to ask the user setup questions, 2) Programming BASIC necessarily needs I/O. What's actually happening is that we're determine what kind of I/O cards we're using and patching the previously loaded default I/O routines with small changes based on the cards, ports, and detection bits. Determining how and why these work requires very close attention to the Altair manuals regarding their cards and port selection. Another thing to note is that SHLD orders the bytes low-high, meaning the value in H goes to the memory address+1.
For the SHLD lines pointing at symbols CNLCA*, these are located in BASIC statement code and define I/O channel operations. This self-modifying code ensure that these routines match the protocol and address of the detected hardware.
Setting up the stack (and a few extras)
70560 LXI H,$CODE+^D65535 /* HL = -1 (direct mode) */ 70580 SHLD CURLIN## /* Set the current line static var to -1 */ 70600 LXI H,TSTACK /* Temp stack pointer in to HL */ 70620 SPHL /* Actual stack pointer set to temp stack address */ 70640 SHLD STKTOP## /* Set static var tracking the first stack entry location */ 70660 IFN CONTRW,< /* If CTRL-O (output suppression) is supported */ 70680 XRA A /* Clear Accumulator (XOR self) */ 70700 STA CNTWFL##> /* Output suppression flag is 0 - force output */ 70720 CALL CRDO /* Output a carriage return */ 70740 IFN STRING,< /* If strings are enabled */ 70760 LXI H,TEMPST## /* Get the first temp string descriptor location */ 70780 SHLD TEMPPT##> /* Set the next string descriptor pointer to this first location */
Setting up a few of the internal tracking variables and setting up the stack. Note that TSTACK maps to the very end of loaded code space, which is not the end of actual physical memory. We need to use some token stack space for CALLs during init. We'll revisit setting up the full stack after we learn about the total system memory.
Discovering memory space
This section has a significant amount of work, so I will break up the sections with header comments
/*** Get memory info from the user ***/
70800 IFN REALIO,<
70820 LXI H,MEMORY /* Point HL to string 'MEMORY SIZE' */
70840 CALL STROUT /* Print the string */
70860 CALL QINLIN /* Print '? ' and wait for user input */
70880 CHRGET /* Get user input (This is an RST handler) */
70900 CPI "A" /* Check input type (See above discussion) */
70920 JZ INITAT /* Output authors and restart if not number???
* These last two lines seem like a test or bug
* Published versions didn't have this behavior */
70940 ORA A /* Set flags based on A (zero or not) */
70960 JNZ USEDE9 /* User said something, skip below next routine */
/*** User didn't enter memory value - discover space automatically ***/
70980 LXI H,LASTWR /* Get last program code line in to HL */
71000 LOOPMM: INX H /* Increment HL to the next memory location */
71020 MVI A,311 /* Store ^O311 in to A */
71040 MOV M,A /* Store ^O311 from in in to memory at HL */
71060 CMP M /* Check that memory for value in A (Zero = match) */
71080 JNZ USEDEF /* Doesn't match -- beyond usable memory, jump down */
71100 DCR A /* Matched, reduce A to ^O310 */
71120 MOV M,A /* Store new value in same location */
71140 CMP M /* Check match again */
71160 JZ LOOPMM /* Matched, Mem ok - Jump up and try next location */
71180 JMP USEDEF /* No match - jump down to finalize memory size */
/*** Process user-provided memory space ***/
71200 USEDE9: LXI H,BUF /* Point buffer location in to HL */
71220 CALL LINGET /* Put the memory value in to DE */
71240 ORA A /* Check for a zero (terminated line) */
71260 JNZ SNERR /* No zero-term? Syntax error */
71280 XCHG /* High memory addr moves from DE to HL */
71300 DCX H /* Move back to last usable byte */
/*** Wrap up memory search ***/
71320 USEDEF: DCX H> /* Walk memory back to last usable byte */
71340 IFE REALIO,< /* If we're emulating...
71360 LXI H,$CODE+16190> * Pick a specific memory size */
71380 PUSH H /* Save top memory location on stack for later */
The flow here is to ask the user directly for memory amount and save it in HL. If the user doesn't enter, BASIC will calculate the amount by reading and writing to increasingly higher memory addresses until it fails to read. If the user provided the memory size, we should do a sanity check. In both cases, we walk back at least a byte and anchor that point by pushing it on the stack. Later this will be moved to the MEMSIZ static pointer.
Note that the line numbers on 70900 and 70920 don't seem to make sense. They would seem to force an init loop with valid input. My assumption is that there were added in for testing and the listing was printed this way. These instruction do not appear in known disassemblies I've reseached.
Discovering terminal width
/*** Get the terminal width from user ***/ 71400 TTYW: LXI H,TTYWID /* Point HL to the string "TERMINAL WIDTH" */ 71420 CALL STDOUT /* Print the string */ 71440 CALL QINLIN /* Print '?' and wait for user input */ 71460 CHRGET /* Get user input */ 71480 ORA A /* Verify state of input (zero or not) */ 71500 JZ DFLENT /* No input, assume defaults from Common File */ 71520 LXI H,BUF /* Entry occurred, point to buffer */ 71540 CALL LINGET /* Get the line value in to DE */ 71560 MOV A,D /* Move high order byte to A */ 71580 ORA A /* Check if it's non-zero */ 71600 JNZ TTYW /* Non-zero? Waaay too big, redo this */ 71620 MOV A,E /* Get the lower order byte */ 71640 CPI 16 /* Compare it to the value 16 */ 71660 JC TTYW /* Less than 16? Too small. redo this */ 71680 STA LINPT1## /* Self-modifies "LINLEN" argument in file F3, line 2601 */ 71700 IFN LENGTH,< /* If we're 8K BASIC or higher... */ 71720 STA LINPT2## /* Self-modifies "LINLEN" argument in file F3, line 3091 */ 71740 IFN STRING,< /* If strings are enabled... */ 71760 STA LINPT3##> /* Self-modifies "LINLEN" argument in file F3, line 3825 */ 71780 MORCPS: SUI 14 /* Subtract 14 from A... */ 71800 JNC MORCPS /* ...Repeatedly until below zero. Output range: (-1 to -14)*/ 71820 ADI 28 /* Add 28 to A. Output range: (14 to 27)*/ 71940 CMA /* Complement A. Output range: (-15 to -28) */ 71960 INR A /* Add 1, leading to negated A. (-14 to -27) */ 71980 ADD E /* Add back original - Result is 2nd order lower multiple of 14 */ 71900 STA LINPT4## /* Self-modifies "NUTPOS" argument in file F3, line 3914 */
This is fairly straightforward - Ask user for terminal width size and propogate it. Default to 72 if nothing is provided. If something is provided, the LINLENs need to be patched up in the associated statements that use console output, such as PRINT.
Fixing up static constants (.BSS)
71920 DFLENT: 71940 IFN STRING,< /* If strings are enabled... */ 71960 LXI D,$CODE+^D65536-^D50+1 /* Loads -49 in to DE */ 71980 POP H /* Grabs the previously saved top of memory */ 72000 SHLD MEMSIZ## /* Make it official by storing in MEMSIZ */ 72020 SHLD FRETOP## /* Initialize FRETOP here also (empty string space)*/ 72040 DAD D /* Subtract 49 from top for the end of string space */ 72060 JNC OMERR /* If this is negative -- error out for lack of memory */ 72080 DCX H /* Reduce 1 to 50 bytes from the top - now at future stack top */ 72100 PUSH H> /* Save the future stack top location to the temp stack */
If strings are enabled, we have to set up the string space - this means setting up the relevant pointers: MEMSIZ and FRETOP. While you might think that we'd need MEMSIZ even without string functions, it turns out that this symbol is only used here during init and within the string functions.
Set up extended functions
Note that I'm cutting a few of the 4K-only lines here because the same task happens in the 8K+
/*** Ask user for desired functions ***/ 72800 IFN EXTFNC,< 72820 ASKAGN: LXI H,FNS /* Points to WANT SIN-COS-TAN-ATN string */ 72840 CALL STROUT /* Prints the string */ 72860 CALL QINLIN /* Awaits user input */ 72880 CHRGET /* Gets user input */ 72900 CPI "Y" /* Compares input to 'Y' */ 72920 LXI D,INITSA /* Store address to Init (as end of code ref) */ 72940 JZ HAVFNS /* User wants functions, skip down */ 72960 CPI "A" /* User didn't say 'Y', check 'A' (delete ATN only)*/ 72980 JZ OKCHAR /* Jump down if we're only deleting ATN */ 73000 CPI "N" /* Not Y or A, so check N */ 73020 JNZ ASKAGN /* No answer recognized - ask again */ 73040 OKCHAR: LXI H,ILLFUN /* Point to 'Illegal Function Call' string */ 73060 LXI D,ATN /* Point to the ATN function */ 73080 SHLD ATNFIX## /* Load 'Illegal Function Call' to ATN */ 73100 CPI "A" /* Check again if 'A' brought us here */ 73120 JZ HAVFNS /* If so, we're done, otherwise continue */ 73140 SHLD COSFIX## /* Load 'Illegal Function Call' to COS */ 73160 SHLD TANFIX## /* Load 'Illegal Function Call' to TAN */ 73180 SHLD SINFIX## /* Load 'Illegal Function Call' to SIN */ 73200 LXI D,COS> /* Store address of COS (for end of code ref) */
This is a simple segment of user Q&A and patching of discarded function pointers. Two subtle lines to note from this code block are 72920, 73060, 73200. The first stores the address of INITSA in DE. The second overwrites DE with the address of ATN if we decide to remove that function. And the final overwrites DE with the address of COS if we remove all math functions.
DE denotes the address after the last usable code. This will be used to start the dynamic memory area beginning with a zero delimiter followed by user program code in TXTTAB
Set up dynamic memory area
73220 HAVFNS:
73240 XCHG /* Pull DE in to HL (the end of code / bottom of memory) */
73260 MVI M,0 /* Zero-term this to mark the end of BASIC */
73280 INX H /* Increment HL to next byte */
73300 SHLD TXTTAB## /* Store this location as the base of TXTTAB */
73320 XTHL /* HL <-> SP. Get real stack top pushed earlier on line 72100*/
* This also puts TXTTAB on temp stack
73340 LXI D,TSTACK /* Get top of temp stack (should be end of INIT) */
73360 COMPAR /* Verify that these values make sense */
* Temp stack should be below the real stack */
73380 JC OMERR /* If somehow HL (real stack) is below DE (temp stack), error out */
73400 POP D /* No error - bring TXTTAB in to DE */
73420 SPHL /* HL ->; SP. Make it the official stack now */
73440 SHLD STKTOP /* Store stack top in static memory (.BSS) */
73460 XCHG /* HL <-> DE. TXTTAB in HL and STKTOP in DE */
73480 CALL REASON /* Verify good spacing between TXTTAB and STKTOP */
73500 MOV A,E /* Get the bottom 8 bits of STKTOP in accumulator */
73520 SUB L /* Subtract away the bottom 8 bits of TXTTAB */
73540 MOV L,A /* This leaves the amount of lower 8 bits of empty space in L */
73560 MOV A,D /* Repeat for upper 8 bits */
73580 SBB H /* High-order subtraction */
73600 MOV H,A /* HL now contains the free/unused space between stack and heap */
73620 LXI B,$CODE+65520 /* Get -16 in to BC. (space for stack to grow) */
74640 DAD B /* Subtract the 16 from the space in HL. */
74660 CALL CRDO /* Print a line feed */
73680 CALL LINPRT /* Print the numberic amount of free bytes computed in HL */
73700 LXI H,WORDS /* Point at string 'BYTES FREE' */
73720 CALL STROUT /* Print the string */
73740 LXI H,STROUT /* Get memory location STROUT in to HL */
73760 SHLD REPINI##+1 /* Store this in routine in READY (replacing INIT) */
73780 CALL SCRTCH /* Effectively 'NEW', setting up TXTTAB, VARTAB, ARYTAB */
The last code block set up the top of dynamic memory and this block sets up the bottom, beginning with the null-term and the base of TXTTAB where user code will live. We then set up the permanent stack and verify sufficient spacing between it and the user code area.
Wrapping up initialization
73800 IFN LPTSW,< /* If there is a printer output */
73820 MVI A,4 /* Prep accumulator with number 4 */
73840 OUT 2 /* Clear the buffer - send 4 to printer */
73860 XRA A /* Clear A */
73880 STA PRTFLG## /* Reset printer flag */
73900 STA LPTPOS##> /* Reset printer line position */
73920 IFE CONSSW,<LXI H,READY> /* No console? Load READY routine in HL */
73940 IFN CONSSW,< /* If console (nope ... dead code) */
73960 LXI H,CONSDO##> /* Load CONSDO in to HL */
73980 SHLD $CODE+2 /* Set the 2nd byte of memory to HL value.
* This overwrites instruction 0's JUMP from INIT to READY */
74000 PCHL /* Set program counter to READY. Time to run BASIC! */
The main loop
After initialization, Altair BASIC spends most of its time repeating the same 100 instructions in a tight loop. Most of the 'work' is boilerplate reading, verifing, and moving user input. It branches off for specific input handlers, updating memory regions, and executing direct requests. As suggested in the high-level discussion earlier, there are paths through MAIN which depends if the user is making Direct or Indirect statments. Here is our map:

We'll begin by walking through the Indirect loop, bypassing the crunch routine (for now), and circling back to MAIN. Since no program execution occurs during indirect input, we do not have to redo the READY routine.
/*** READY is a brief preporatory routine for MAIN ***/ 28700 READY: IFN LPTSW,< /* If there is a printer...*/ 28720 CALL FINLPT> /* Print any remaining input */ 28740 IFN CONTRW,< /* If output is suppressed... */ 28760 XRA A /* Get a zero */ 28780 STA CNTWFL> /* Force output by clearing suppress flag */ 28800 LXI H,$CODE+65535 /* Get a negative -1 */ 28820 SHLD CURLIN /* Make -1 the current line number, forcing direct-mode */ 28840 LXI H,REDDY /* Get routine to print "OK" */ 28860 REPINI: CALL INIT /* Print 'OK' ... This INIT was rewritten as... */ 28880 /* CALL STROUT at the end of INIT (Line 73760 above) */
In READY, we clear all input, set up for direct mode, and print 'OK' for the user. The inital startup enters through here as well as any long deviations from the main loop to ensure that we're set up for continued operation. Small branches that return to MAIN will enter below.
/*** The intro to big MAIN - reading user input ***/
28920 MAIN: CALL INLIN /* Get input line. MAIN loops within INLIN until input */
28940 CHRGET /* Input happened - get the first character */
28960 INR A /* Increment then decrement A as a ... */
28980 DCR A /* clever hack to check for Zero w/o changing Carry */
29000 JZ MAIN /* Zero? Line was blank, go back for more input */
29020 PUSH PSW /* Not zero - save the flags, including preserved carry */
29040 CALL LINGET /* Get the line number */
29060 PUSH D /* Put it on the stack above the flags */
29080 CALL CRUNCH /* Reduce the line to its basic components */
/* DETAILS LATER ON CRUNCH ROUTINE */
/* CRUNCH returns A=0, BC = character count */
29100 MOV B,A /* Make B=0, ensuring BC is a positive count */
29120 POP D /* Get the line number back (smashed by CRUNCH) */
29140 POP PSW /* Get flags/carry back (smashed by CRUNCH) */
29160 JNC GONE /* No carry = direct statement - jump to direct handler */
/* Carry was preserved from CHRGET on line 28940 */
We've retrieved some input and we've decided if this is a direct statement to be executed immediately (jump to GONE), or an indirect statement and should add it to the user program (fall through below). The CRUNCH call in this block hasn't been discussed, but its purpose is to reduce the line to single-byte tokens that can offset in to the statement dispatch table.
/*** Indirect statement handling - Find and delete old line ***/
29180 PUSH D /* Indirect, save line number on the stack */
29200 PUSH B /* Also save the character count used in NODEL routine */
29220 CHRGET /* Remember the state/type of the line */
29240 PUSH PSW /* Save its flags (type) on the stack */
29260 CALL FNDLIN /* Check user program for existing line and get a pointer */
29280 PUSH B /* Push the pointer to the line on the stack */
29300 IFE LENGTH-2,< /* In 4K basic... */
29320 CC DEL> /* Just delete the line without checks */
29340 IFN LENGTH-2,< /* In 8K+ BASIC... */
29360 JNC NODEL /* No carry - no match. Nothing to delete, done with block */
29380 XCHG /* Time to delete. Move line pointer from HL to DE */
/* DE points to the NEXT line (The one AFTER the search) */
/* Must loop from found line to end of TXTTAB to compact it */
29420 LHLD VARTAB /* Get Variable Table in HL. TXTTAB ends at VARTAB */
29440 MLOOP: LDAX D /* Load NEXT line byte in to A */
29460 STAX B /* Stores the NEXT line byte in THIS line */
/* We must now patch every line afterward */
29480 INX B /* Move to next destination byte in TXTTAB */
29500 INX D /* Move to the next source byte in TXTTAB */
29520 COMPAR /* Check if we're still in within TXTTAB */
29540 JNC MLOOP /* No Carry means we are still in TXTTAB (more work) */
29560 MOV H,B /* Get the high byte of the end of TXTTAB in H */
29580 MOV L,C /* Get the low byte of the end of TXTTAB in L */
29600 INX H /* Increment HL to find the next VARTAB */
29620 SHLD VARTAB> /* Store the new, lower positioned VARTAB */
We know the input is an indirect statement and so the TXTTAB (User program table) must be manipulated. This launches in to low-level mechanical operations that require us to follow closely with the table organization. Visualizing the TXTTAB manipulation is easier if you understand how a single line is composed. Consider this:

Using the line number, we call FNDLIN to get pointers to both the target line and the next line. If there is no target, we can skip to the next block. Otherwise, we have to delete that line by rolling all the subsequent lines up one byte at a time until we reach the end of the TXTTAB (which is VARTAB). The tight loop labelled MLOOP does the work of moving bytes, moving pointers, and checking bounds. We end by setting the new start of VARTAB. Note that actual VARTAB data is untouched and there will be some dead space at the beginning, which is fixed up in the next block.
/*** Indirect statement handling - Patching up VARTAB and inserting new line ***/ 29640 NODEL: POP D /* Retrieve the line insert position pushed on line 29280 */ 29660 POP PSW /* Retrieve line type pushed on line 29240 */ 29700 JZ FINI /* No work to do if this line is empty */ 29720 LHLD VARTAB /* Get the end of TXTTAB in HL */ 29740 XTHL /* Swap HL (VARTAB) with top of stack. */ 29760 /* HL now contains input length pushed back on line 29200 */ 29780 POP B /* Get VARTAB pointer just swapped to stack */ 29800 DAD B /* Add line length to VARTAB, accomodating new line */ 29820 PUSH H /* This will be the new VARTAB, but save on stack first */ 29840 CALL BLTU /* Block shift all data between insert point and new VARTAB */ 29860 POP H /* Regain new VARTAB pointer smashed by BLTU */ 29880 SHLD VARTAB /* Make this the new VARTAB */ 29900 XCHG /* DE is now new VARTAB, HL is line insert point */ 29920 MOV M,H /* Store this pointer in its own memory location */ 29940 INX H /* This avoids zero pointers halting CHEAD fixup */ 30000 INX H /* 2nd increment now points us to line number storage */ 30020 POP D /* Get the line number from stack (pushed at 29180) */ 30040 MOV M,E /* Save low order byte of line number in pointed at HL */ 30060 INX H /* Increment to high order byte */ 30080 MOV M,D /* Save high order byte of line number in TXTTAB */ 30100 INX H /* Done with pointers and line numbers. Increment to line data */ 30140 LXI D,BUF /* Point to the base of the line buffer in .BSS */ 30160 MLOOPR: LDAX D /* Load a byte from the line buffer in to A */ 30200 MOV M,A /* Store A in to TXTTAB location pointed by HL */ 30220 INX H /* Point to next character destination */ 30240 INX D /* Point to next buffer source position */ 30260 ORA A /* Test for zero in A (implies we're done) */ 30280 JNZ MLOOPR /* Repeat the last 5 lines until finished */
Previously, we either deleted a line and rolled up the table, or we found no line to delete. Now we're ready to insert the new line. The way to do that is to use the insert point discovered earlier and block copy (BLTU call) all bytes afterward even further forward by the size of the new line. We then fill in the empty space with the line components. The MLOOPR loop copies the arbitrarily sized line content bytes, which includes the final NUL on the line.
The big problem we're left with is broken pointers. After shifting many lines around at the byte level, we've never actually updated the pointers to ensure that they are correct.
/*** Fixing up pointers to all recently moved TXTTAB lines ***/ 30300 FINI: CALL RUNC /* Clears up dynamic memory and stack */ 30340 INX H /* HL now points to TXTTAB base (RUNC side-effect) */ 30480 CHEAD: MOV D,H /* Setting DE to HL (high byte) */ 30500 MOV E,L /* Setting DE to HL (low byte) */ 30520 MOV A,M /* Get line pointer for text (high byte) */ 30540 INX H /* Point to low byte */ 30560 ORA M /* Verify if bytes hold any non-zero value */ 30580 JZ MAIN /* If zero, we are completely done - back to MAIN */ 30600 INX H /* Point to first byte of line num */ 30620 INX H /* Point to second byte of line num */ 30640 INX H /* Point to first byte of line data */ 30660 XRA A /* Clear accumulator */ 30680 CZLOOP: CMP M /* Check data byte for NUL (end of line) */ 30700 INX H /* Move to next byte - If was NUL, now points to next line ptr */ 30720 JNZ CZLOOP /* If not zero, repeat until the end */ 30740 XCHG /* Swap HL/DE. DE is now start of next line, HL is last line ptr */ 30760 MOV M,E /* Set first byte of TXTTAB pointer to next entry base */ 30780 INX H /* Increment to second pointer byte */ 30800 MOV M,D /* Set the 2nd pointer byte */ 30820 XCHG /* HL picks back up at next line. */ 30840 JMP CHEAD /* Repeat until double-zero triggers jump to main (line 30580) */
Keep an eye on the HL and DE pointers, which alternate positions with the XCHG command as they walk through the lines anchoring the pointer and referencing the next pointer. CZLOOP walks through the line content to end pointing at the next line to backfill the previous line's pointer. CHEAD walks through every line and takes us back to MAIN.
You may note that dynamic memory is reset each and every time a new line is added. This is a quick way to ensure valid pointers and that HL is pointed to the base of user code. There is no user programming running during the input phase, so nothing is lost.
Crunch time
I handwaved line 29080 above (CALL CRUNCH), but this is an critical routine that saves space by doing an in-place transformation of user input in the buffer. Not only does it reduce keywords to a single byte, but that byte also serves as the offset in to the statement dispatch table for quick execution. So again, our two goals are 1) Compression and 2) Fast dispatch.
Goal 1 (compression) is accomplished by encoding all keywords by their Q value we determined earlier. 7th bit asserted (Decimal 128+ and Octal 200+).
Goal 2 (dispatch) is done by ordering the dispatch table the same as the encoded value. That we can calculate an offset and jump using this single byte.
The CRUNCH routine solves this problem in about 100 lines - the same length as MAIN. Naturally this makes for some dense code that requires us to have a tight grasp on the inputs, outputs, and goals. Let me review them now:
Input: The input buffer BUF contains a line as typed by the user. HL points to the first character after the line number.
Output: The input buffer contains user input with keywords replaced by single bytes. Line number is removed. Accumulator is 0, C holds the length of the input (not linked to B).
So we're looking at an input-parsing exercise in assembly. These can be tedious to read, especially going in blind. To ease the burden, let's talk about what must happen in order to get from input to output:
- Read in characters one at a time and decide what to do? Skip it? Change it? Pass it through?
- Keywords must be changed (crunched)
- We must count each character that is output
- Be aware of delimiters like colon, question mark, quotes, etc
- User input strings wrapped in quotes should be passed through unchanged
DATAstatements should not be altered, but they may look like any data typeREMstatements can be ignored- Crunched keywords match to the reserved word table
- We must track the matching offset into the reserved keyword table
- Lines must be properly terminated
Given the above requirements, we can speculate on the minimum set of tools we'll need to do this work:
- A source pointer to read the buffer - this will be HL register pair
- A destination pointer to mark the next output location - the DE register pair
- A flag to suspend crunching based on detected delimiters - this is the
DORESflag discussed in .BSS - A counter to capture the offset as we try to match reserved word - This is done by register B
Let's dive in to the crunch code:
33700 CRUNCH: IFN STRING,< /* If strings are enabled */ 33720 XRA A /* Clear accumulator */ 33740 STA DORES> /* Flag 0 - enable crunching */ 33760 MVI C,5 /* Set character count to 5 - minimum size of TXTTAB line */ 33780 LXI D,BUF /* DE output beings at buffer base */
This short setup routine lays the ground word discussed before. We clear the accumulator and set the crunch flag to proceed with the work. We begin the line length counter at 5 since that is the minimum size of a TXTTAB line descriptor (2 byte pointer, 2 byte line number, and 1 byte line length). We end by pointing DE at the base of the buffer where we will write the crunched line
33800 KLOOP: MOV A,M /* HL is further along in the buffer, getting next character in A */ 33820 CPI " " /* Check for a space */ 33840 JZ STUFFH /* Push space to output, increment I/O ptrs and count */ 33860 MOV B,A /* Not space - move characer to B */ 33900 CPI 34 /* Check if character is a quote */ 33920 JZ STRNG /* If quote, jump to user string handler */ 33940 ORA A /* Check for NUL (end of line) */ 33960 JZ CRDONE /* End? We're done crunching */ 33980 IFN STRING,< /* If strings are enabled... */ 34000 LDA DORES /* Gets the crunching status flag (for DATA statements) */ 34020 ORA A /* Sets flags to check value */ 34040 MOV B,A /* Init B to zero if we're about to crunch */ 34060 MOV A,M /* Retrieve buffer character again */ 34080 JNZ STUFFH> /* No crunch? Just pass character to output */ 34120 IFN LENGTH,< /* If 8k+ BASIC... */ 34140 CPI "?" /* Check character for question mark */ 34160 MVI A,PRINTK /* Replace character with print token */ 34180 JZ STUFFH /* Send print token to output */ 34200 MOV A,M /* Grab the character a third time */ 34220 CPI "0" /* Check if control characters */ 34240 JC MUSTCR /* This is a crunchable reserved word */ 34260 CPI 60 /* Check for Colon (above numerics) */ 34280 JC STUFFH /* Must be a number, just pass it through */
This is the top-level loop for the crunch routine. We enter at the top and we will leave with 3 possible outcomes: 1) We're completely finished crunching (CRDONE), 2) We pass the byte to the output without crunching (STUFFH), or 3) we will definitely crunch this byte (MUSTCR). You may notice a 4th exit called STRNG, but this leads to the STUFFH byte passthrough by skipping repeated type checks during the quoted string.
We make decisions based on the value of the incoming character ordered by the ASCII table, the logic looks like this.
- Space? Pass it through unchanged
- Is there a quote? Jump to special string handler that passes everything through until matching quote
- Is there a
NUL? We're completely done the line and crunching - If crunching is disabled (
DATAstatments), then pass it through - If it's a question mark, this is an alias for
PRINT, substitute a print token and pass it through - If the character is less than numerics (control characters). We must crunch.
- If the character is less than a '<' it is a numberic or colon. Pass it through
- Anything else must be a letter, which will fall through below to crunching
34300 MUSTCR: > /* This is the crunch! */ 34320 PUSH D /* Save pointer to output start */ 34340 IFE STRING,< /* If strings are disabled... */ 34460 MVI B,0> /* Clear B */ 34480 LXI D,RESLST-1 /* Point DE to reserved word list-1 */ 34400 PUSH H /* Save current input character pointer */ 34420 XWD ^O1000,^O76 /* Skip next instruction on first entry */ 34440 NXTRES: CHRGET /* Get new character (entry here from 34620)*/ 34460 INX D /* Walk to next resereved word */ 34480 RESER: LDAX D /* Get the reserved word in to A */ 34500 ANI 127 /* Mask away all but the lower 7 bytes (no sign) */ 34520 JZ TABEND /* At the end of the list? Done - jump out */ 34540 CMP M /* Compare current charcater with */ 34560 JNZ NTHIS /* Not a match for this reserved word */ 34580 LDAX D /* Match so re-get the reserved word byte */ 34600 ORA A /* Set the flags for value */ 34620 JP NXTRES /* Positive - no match. Neg/Sign - match */
First, we need space to store another pointer, so we push the DE to the stack since we won't be outputting anything directly in this routine. We then load the pointer to the reserved keyword list to DE. We then save the character input pointer to the stack which could be smashed.
Skipping some preable instructions, we load the first reserved word and mask away the 8th bit which we know to be 1 for all reserved words. Zero? Then the table is finished, otherwise we compare each letter. If there is no match, we jump back up and try the next reserved word until we find a match and fall down to the next block.
34640 FOUND: POP PSW /* Trash input pointer saved on stack */ 34660 MOV A,B /* B holds reserved word counter, move to A */ 34680 ORI 128 /* And set the 8th/sign bit to mark reserved word */ 34700 XWD ^O1000,^O362 /* Skip next two instructions, previous ORI guarantees no jump */ 34720 TABEND: POP H /* Recover the input character pointer */ 34740 MOV A,M /* Get the input character again */ 34760 POP D /* Recover the output pointer */ 34780 IFE LENGTH-2,< /* If we're in 12k+, we need to handle ELSE tokens */ 34800 XCHG /* HL is now the output location */ 34820 CPI ELSETK /* Check if this is an ELSE token */ 34860 MVI M,":" /* Terminate the ELSE with a colon */ 34880 CZ INXHRT## /* Advance pointer so : isn't smashed */ 34920 XCHG> /* HL is back to input and DE is output */
At the top, we've found the reserved word, with it's offset stored in B. The offset IS the crunched value after we assert the 8th bit to mark it as a reserved word. If we entered this code block at TABEND, then we did not find the word and so we need to get the pointers back. In either case we fall down in to the the STUFFH pass-through to write either the newly crunched byte or the original byte.
An edge case at the bottom of this routine appends a colon to an ELSE for the 12k+ BASIC.
34940 STUFFH: INX H /* Just passing the byte - increment to next byte */ 34960 STAX D /* Store byte held in A to the destination */ 34980 INX D /* Increment to next destination */ 35000 INR C /* Add 1 to the line length */ 35020 IFN STRING,< /* If strings are disabled... */ 35040 SUI ":" /* Check for colon by subtraction */ 35060 JZ COLIS /* Match - allow crunching again */ 35080 CPI DATATK-":" /* Check for data token (modified by prev SUI) */ 35100 JNZ NODATT /* No data token, skip to REM check */ 35120 COLIS: STA DORES /* Enable crunching */ 35140 NODATT: SUI REMTK-":"> /* Check (and store) for REM token */ 35160 IFE STRING,<SUI REMRK> /* If strings are enabled, see if REM */ 35180 JNZ KLOOP /* No rem? back to the very top - next char */ 35200 MOV B,A /* Get more of the REM statement */
This block starts with the accumulator prepped with the desired output (either a crunched value or a pass-through value from the top-level loop). Feed A to the output, incrementing the line length, source and destination pointers accordingly.
We also check for the edge cases of DATA and REM tokens before jumping all the way back to the top to proceed with more crunching
35220 STR1: MOV A,M /* Get the input character again */ 35240 ORA A /* Is it a NUL? */ 35260 JZ CRDONE /* Then we're done, otherwise */ 35280 CMP B /* Check for matching quote */ 35300 JZ STUFFH /* Store it */ 35320 STRNG: INX H /* Increment to next string byte */ 35340 STAX D /* Store it in output immediately */ 35360 INR C /* Increment the output count */ 35380 INX D /* Increment the desintation pointer */ 35400 JMP STR1 /* Go to the next character */
We usually enter this block from the STRNG label except for the REM token edge case. So let's start in the middle which is where we pick up having already seen a quote in the input buffer. We essentially skip to the next character, store it, and repeat until we encounter the end of string NUL or another quote (already saved in B). This block and the next two end with an unconditional exits, making these isolated routines.
35440 NTHIS: POP H /* Not a reserved word match, get input pointer back */ 35460 PUSH H /* Save it again */ 35480 INR B /* Increment counter to next reserved word */ 35500 XCHG /* Reserved word list in HL, Input pointer in DE */ 35520 NTHIS1: ORA M /* Compare reserved word to A */ 35540 INX H /* Move to next byte */ 35560 JP NTHIS1 /* Repeat until match */ 35580 XCHG /* Match - Return reserved word list to DE. Input to HL */ 35620 JMP RESER /* Finished scan, move to next reserved word */
This is the reserved word iteration routine that scans until we find a match and jump back to the top.
35640 CRDONE: LXI H,BUFMIN /* Crunching complete, move HL pointer back to start of line */ 35660 STAX D /* NUL terminate the line once (end-of-line) */ 35680 INX D /* Move to next byte */ 35700 STAX D /* NUL terminate the line twice (zero link value) */ 35720 INX D /* Move to next byte */ 35740 STAX D /* NUL terminate a third time for end of program */ 35760 RET /* Crunch complete! Return to MAIN */
Crunching is complete so null terminate the line as needed. RET back to the main loop to line 29080 after CALL CRUNCH
Formula Evaluation
Formula evaluation is the heart of most (all?) code generation programs. Altair BASIC gives us a glimpse of what evaluation looked like in the mid-1970s. Perhaps surprisingly, the modern convention of recursive calls to eval() until reaching a base case remains firmly established even 50 years ago.
Unfortunately, this is such a behemoth of a routine that I won't go through it line by line. The routine is over 1000 lines long not including the significant reliance on the math package. Instead, I will break it down to the subroutine level with a few key line operations thrown where they add the most value. I encourage the very curious to dig through the entire formula eval routine found in file F3 between listing lines 4470 and 5495. On to the discussion.
The symbol FRMEVL is called within many statement handling routines during direct command interpretation. Here are the ten specific cases when Altair BASIC evaluates formulas:
- Integer coersion with the
INTstatement - Within the
LETstatement for evaluating the value of the variable - Twice in an
IFstatement for both the predicate and the consequent - When
PRINTis called for the subsquent expression - Type checking in
FORstatements - Within
EVALas part of a recursive evaluation of a complex expression - Evaluating the arguments of a user-defined function
- Iterating over strings in the 12k+ BASIC version
- Writing to disk with
DSKO$ - Writing to memory with
POKE
When FRMEVL is called from any of the above paths, these conditions follow:
State at call: HL points to the first character of the formula to evaluate.
State at final return: HL points at the formula terminator. The Floating Point Accumulator (FAC) holds the result.
Helpers: The operator table (OPTAB) provides precedence values and dispatch addresses. Intermediate results for evaluation reside on the stack. Temporary variables in .BSS hold transient pointers within the expression (TEMP2, TEMP3).
FRMEVL evaluates complex expressions through one or more calls to EVAL for each lexeme. EVAL determines the type, dispatches to specific handlers, and stores work in temporary variables, the stack, or and the FAC. EVAL may invoke FRMEVL recursively if it encounters a potential complex expression within parenthesis. Consider this:

FRMEVL begins by saving the current code pointer in HL then priming the stack with a precedence of 0. Then it verifies sufficient stack space exists for execution. The call to EVAL, determines the type of the next operation, stores the precedence on the stack along with potential values in the FAC. Using the prior precedence with the newly discovered precedence, FRMEVL determines whether to execute the last operation, or store it again along with the current temporary results, repeating the process again. Eventually we will run out of tokens to arrive at a final answer or we'll run out of stack space.
To get a better picture of the previous two sentences, let's take a look at the labels and routines built within FRMEVL. From a flow control perspective, each of the jump labels will form the start of a basic block, however only a small subset of the labels can be considered full routines in a logical sense. The high-degree of optimization and code reuse without explicit function calls results in multiple entry points and multiple exit points across many major routine in FRMEVL. Here is with the full list with all labels and bolding to set apart the major routines (RETALOP, APPLOP, and EVAL):
| Routine | Entry From | Entry Method | Purpose |
|---|---|---|---|
| FRMCHK | FRMEVL, FOR | Fall-through (FT), Call | FRMEVL less backup of the input location pointer |
| LPOPER | FRMCHK, DOPREC, DOMIN, NOTER | FT, Jump, Call | Save precedence and verify stack |
| RETAOP | LPOPR, DOPREC, DODSP | FT, RET (trampoline), PCHL | Entry to FRMEVL without the setup preamble. Directly handle the next expression |
| TSTOP | RETAOP, CAT | FT, RET | Routine to test precedence of the next operation |
| NOTSTV | TSTOP | FT | Get next character and prep relation bits |
| LOPREL | NOTSTV | FT | Loops search all relational operators to match and set mask bits |
| ENDREL | LOPREL | Jump | Tail of FRMEVL to determine what to do with the current and last operators discovered |
| DOPREC | ENDREL, FINREL | FT, Jump | Stores precedence and loops back to LPOPER |
| NUMREL | FINREL | Jump | Pushes the FAC on to the stack |
| VPUSHD | NUMREL | FT | Moves the contents of E and A to BC and pushes to the stack |
| FINTMP | VPUSHD, ANDORD, FINREL | FT, Jump | Pushes next return address and restarts FRMEVL at LPOPER |
| EXPSTK | DOPREC | Jump | Converts FAC value to single precision in prep for exponentiation |
| ANDORD | DOPREC | Jump | Converts FAC value to integer in prep for AND and OR operations |
| FINREL | ENDREL | Jump | Puts special relational operators on stack for return and execution |
| APPLOP | ENDREL | RET trampoline | Applies operations previously pushed on to the stack |
| STKDBL | APPLOP | Jump | Converts FAC to double precision to hold stack operands |
| SNGDBL | STKDBL, FACDBL | FT, Jump | Moves 4 bytes from stack to FAC |
| SETDBL | SNGDBL, FACDBL | FT, Jump | Preps dispatch for double conversion |
| DODSP | SETDBL, SNGDO | FT, Jump | Final prep for double dispatch and actual PCHL jump |
| FACDBL | APPLOP | Jump | Preps the FAC convertion to double from int or single |
| STKSNG | APPLOP | Jump | Preps the FAC convertion to single precision |
| SNGDO | STKSNG, FACSNG | FT, Jump | Sets up the dispatch to single conversion |
| FACSNG | APPLOP | Jump | Preps the FAC convertion to single precision when stack is an integer |
| EVAL | LPOPER, CAT | Call | Determines the operation type and populates FAC |
| PARCHK | OKNORM, | Call, FT | Checks for parenthesis for complex expressions |
| DOMIN | EVAL | FT, Jump | Store registers and negate value in the FAC |
| LABBCK | DOMIN, NOTFRF | FT, RET | Restore text pointer and return to main FRMEVL routine |
| ISVAR | EVAL | Jump | Handles variables with the FAC |
| ISFUN | EVAL | Jump | Handles functions to be evaluated |
| OKNORM | ISFUN | Jump | Parses single argument functions and RETs to them for execution |
| NOTFRF | OKNORM | FT, Jump | Skips FAC conversion process |
| FINGO | NOTFRF | FT, Jump | Dispatches to function routine |
| GETYPE | PRINT, FINREL, others | Call | Gets the type and sets condition codes |
| OR | N/A | N/A | Logical op for 8K only, possibly dead code replaced by DANDOR |
| AND | N/A | N/A | Logical op for 8K only, possibly dead code replaced by DANDOR |
| ANDCON | N/A | N/A | Logical op for 8K only, possibly dead code replaced by DANDOR |
| ORFIN | DANDOR, ANDCON (dead?) | Jump | . |
| DOREL | PTDORL, FINREL | RET | Sets up a relational operation to be performed |
| STRCMP | FINTMP, RETAOP | RET | Compares two strings |
| CSLOOP | STRCMP | FT, Jump | Inner loop for comparing strings |
| DOCMP | RETAOP | RET | Comparison between two values and resulting bit flags |
| NOTER | EVAL | Jump | Perform NOT operation |
| DANDOR | RETAOP | RET | Performs AND or OR depending on precedent |
| ORFIN | DANDOR | Jump | Finalizes registers and PCHLs back to last location |
That is the long list of all routines that make up the Formula Evaluation in order of definition - excluding calls to the math package. The overall feature I'll point out is that the top-level functions are general and broad dealing mostly with determining where to flow next. The routines at the bottom are specific worker functions that do the final processing to produce the output or desired side-effects.
Let's take a line-by-line look at EVAL, which focuses specifically on typing the current input and dispatching to the low-level worker routine.
06820 EVAL:
06840 IFN LENGTH-2,< /* If 4 or 8k... */
06860 IFN STRING,< /* ..and strings active...(so 8k only!) */
06880 XRA A /* Clear A */
06900 STA VALTYP>> /* Get the type in A */
06820 CHRGET /* Read the next character to evaluate */
06840 JC FIN /* Number? Handle floating input */
06960 CALL ISLET /* Is this is a letter? */
06980 JNC ISVAR /* It IS a letter - assume variable */
07000 CPI PLUSTK /* Not letter, check for plus sign */
07020 JZ EVAL /* Plus sign, skip by reentering EVAL */
07040 CPI "." /* Check for decimal point */
07080 JZ FIN /* Decimal point? Handle floating input */
07100 CPI MINUTK /* Minus sign? */
07120 JZ DOMIN /* Negate number value */
07140 IFN STRING,< /* If strings are active */
07160 CPI 34 /* Check for quote (string delimiter) */
07180 JZ STRLTI> /* It is a quote - handle a string */
07240 IFN LENGTH,< /* If we're in 8k+... */
07260 CPI NOTTK /* Check for a NOT token */
07280 JZ NOTER> /* Handle the NOT */
07300 IFN FUNCTS,< /* If functions are active... */
07320 CPI FNTK /* Check for user function */
07340 JZ FNDOER> /* Execute user function */
07360 SUI ONEFUN /* Compare with function name list */
07380 JNC ISFUN /* Great - must be a function */
07460 PARCHK: SYNCHK "(" /* Check for open paren */
07500 CALL FRMEVL /* Assume contents are complex function - recurse! */
07520 SYNCHK ")" /* Check for close paren */
07540 RET /* Done! */
Within EVAL, we're at the lowest tier that reads a token and determines the specific operator applications. The job is to determine which operation to apply. The ruleset is:
- Unquoted letter? Must be a variable - get the pointer and value
- Number or decimal point? Call floating point input routine
- Unary operation (NOT and negate)? Push precedence on the stack and re-enter Formula eval
- Quote mark? Handle as string literal
- Function token? Handle as a function
- Skip certain characters like '+'
- Parenthesis? Recurse in to
FRMEVL
FAC operations
We've discussed the layout of the Floating Point Accumulator during loading and described it as a set of static variables in a specific arrangement. We will now review the routines that orchestrate operations among these memory locations. We'll start with the calling conventions established for FAC operations:
- The result of FAC operations will be in the FAC
- In single-argument operations, the argument is in the FAC
- In two-argument operations:
- The first argument is in B,C,D,E
- The second argument is in the FAC
- Many operations have two entry points ("S" and "T") each using alternative convensions:
- "S" entries begin with the first argument referenced by HL and the
MOVRMoperation brings it in to the registers. - "T" entries begin with the first argument on the stack beginning with low order bits and with the sign pushed last.
- The stack is set up with two
PUSHMoperations. ThePOPRgets the arguments from the stack in to the registers. - All entries to "T" should be with a jump to avoid call pushing the return address push over the stack argument.
For quick reference, I'll re-post the FAC structure here:

Now we'll look at the list of FAC operations before walking through one of them. I encourage walkthrough through all of these functions line by line using the symbols indentified above as a reference. Note that many routines have multiple entry points with differing state assumptions.
| Routine Entry(s) | Purpose |
|---|---|
| . | . |
| FADDH, FADDS, FADDT, FADD | Floating point addition |
| FSUBS, FSUBT, FSUB | Floating point subtraction |
| NORMAL | Normalization by left shifting until MSB is 1 |
| ZERO | Zeros the FAC |
| ROUND | Rounds values in C,D,E,B and transfers to FAC |
| FMULTS, FMULTI, FMULT | Floating point multiplication |
| FDIVT, FDIV | Floating point division |
| FLOAT, FLOATR | Coerce value to float |
| ABS | Absolute value |
| NEG, VNEG | Negate value in FAC |
| SGN | Basic SGN function - Get the sign of a value |
| PUSHF | Push FAC to stack |
| MOVFM | Move value pointed at by HL in to FAC |
| MOVFR | Move register set B,C,D,E to FAC |
| MOVRF | Move FAC to B,C,D,E |
| MOVRM | Move (B,C,D,E) to HL (not really a FAC OP) |
| MOVMF | Move FAC to HL |
| VPUSHF | From FAC to Stack |
| VMOVFA, VMOVFM | From HL to FAC |
| VMOVAF, VMOVMF | From FAC to HL |
| FRCINT | Convert FAC to Integer |
| CONIS | Convert Single to Integer |
| FRCSNG | Convert FAC to Single |
| CONSD | Convert Double to Single |
| FRCDBL | Convert FAC to Double |
| CONDS | Convert Single to Double |
| QINT | (Quick) Greatest Integer Function |
| INTFNC | Greatest Integer Function |
| DINT | Double greatest integer function |
| DSUB | Double subtraction |
| DADD | Double addition |
| DNORML | Double normalize FAC |
| DNEGR | Douable negate FAC |
| DMULT | Double multiuplication |
| FIN | Floating point input to FAC |
| FOUT | Floating point FAC to output |
Pay close attention to the naming convention and the associated interface with the FAC. For instance:
- MOVxy routines imply data movement to x from y where (M = memory, F = FAC, R = Registers).
- V* routines consider the value type as determined by VALTYP
- D* routines are all double precision operations
- FRC* and CON* convert value types
Note that I did not include complex math functions like EXP, POLYX, SQR in this list because they focus on executing well-known math operations rather than interfacing with the FAC.
Now we'll take a look at two FAC-related routines. The first demonstrates simple movement and how many of these functions simply call each other for space efficiency:
/* MOVFM gets data in to the FAC from a memory location (HL) 17580 MOVFM: CALL MOVRM /* Move number from memory to B,C,D,E */ 17700 MOVFR: XCHG /* Swap DE to HL */ 17720 SHLD FACLO /* Push HL to the low order FAC bytes */ 17740 MOV H,B /* Move the highest byte in to H */ 17760 MOV L,C /* Move the 2nd highest byte in to L */ 17780 SHLD FAC-1 /* Put the high bytes where they beyond */ 18000 XCHG /* Restore original DE */ 18020 RET /* Done! */
Seemingly simple, but the trick with this MOVFM routine is that we're not going directly from memory to the FAC, but copying it to registers first using an existing routine - saving memory by reusing a code path. MOVFM piggybacks on MOVFR by falling through by using a quick call to MOVRM. Fall-through takes care of the rest.
Now for a core FAC operation - reading in floating numbers with FIN. Although this code page is over 400 lines, much of it is duplicated for both the 12K and the smaller versions. I will cut out the 'dead' code for anythingb but the 12k version. Recall from earlier that the F3 file was built for 12K+, but the F4 file where FIN lives targets 8k.
- Setup: HL points to first input character and already in A
- Result: The number is in the FAC
- C tracks if we've read in the decimal (0 = yes, 377o = no)
- B counts digits after decimal
- E is the exponent
- All registers smashed

Consider this diagram while walking through FIN. Comments are my own however this section of source is very well commented already and my changes are minor.
/* Check the leading sign of the value - entry from VAL may include the sign */ 39040 FIN: 39060 IFN STRING,< 39100 CPI "-" /* Check for leading minus */ 39120 PUSH PSW /* Save the FLAGs */ 39140 JZ FIN1 /* A minus - skip to next code block */ 39160 CPI "+" /* Check for leading plus */ 39180 JZ FIN1> /* A plus - skip to next code block */ 39200 DCX H /* Not a leading sign, move back for re-read */
Simple code block - store info on stack. It there was sign info skip the line that walks back the text pointer (need to reread for upcoming processing)
/* Register set up routine */ 39220 FIN1: 40700 IFE LENGTH-2,< /* 12K+ routine */ 40720 XCHG /* Moving the text pointer to DE */ 40740 LXI B,377+$CODE /* Setting up decimal point flag to 377o (not seen yet) */ 40780 MOV H,C /* Clear H */ 40800 MOV L,C /* Clear L */ 40820 CALL CONISS /* Clear FAC and set expected value to integer */ 40840 XCHG /* Get the text pointer back */
Setting up the base register values for upcoming usage
/* Checking the TYPE of the next character */ 40880 FINC: CHRGET /* Get next character */ 40900 JC FINDIG /* Carry implies digit - jump to digit handler */ 40920 CPI "." /* Check for decimal point */ 40940 JZ FINDP /* Match - go to decial point handler */ 40960 CPI "E" /* Check for letter E (single precision marker) */ 40980 JZ FINEX /* Jump to single precision handler */ 40000 CPI "D" /* Check for letter D (double precision handler) */ 41020 JNZ FINE /* NOT D? That's everything, we're done. Jump to cleanup */ 41040 ORA A /* Must be double - set FLAGS (zero flag) */
Type checking routine with the caveat that doubles will fall down to single precision handler with zero flag reset
/* Handle single or double precision numbers */ 41060 FINEX: CALL FINFRC /* Conversion routine. Zero flag set = single */ 41080 PUSH H /* Save current input position */ 41100 LXI H,FINEC /* Load address of exponent character fetch in to HL */ 41120 XTHL /* HL <-> stack - Now reading current input. Next routine on top of stack */ /* Check exponent sign and skip if necessary */ 41160 CHRGET /* Load next character in to A */ 41180 DCR D /* Set exponent sign to negative */ 41200 CPI MINUTK /* Check for negative exponent token in A */ 41220 RZ /* Negative exponent confirmed - RET to next character fetch */ 41240 CPI "-" /* Check for negative char in A (from VAL entry)*/ 41260 RZ /* Negative exponent confirmed - RET to next character fetch */ 41280 INR D /* Still here? Must be positive, put exponent sign back to zero */ 41300 CPI PLUSTK /* Leading plus token? */ 41320 RZ /* RET to next character */ 41340 CPI "+" /* Leading plus from VAL entry? */ 41360 RZ /* Ditto RET */ 41380 DCX H /* No sign? Probably digit - move input pointer back */ 41420 POP PSW /* Pop and discard FINEC RET trampoline - not needed now */
These routines convert the FAC to the necessary and test the next character for a sign.
/* Fetching the next character of exponent */ 41460 FINEC: CHRGET /* Getting the next characters in to A */ 41480 JC FINEDG /* Carry = Digit. Put digit in exponent */ 41500 INR D /* Not digit - put sign to positive or zero (if neg) */ 41520 JNZ FINE /* Positive - we're done */ 41540 XRA A /* Must be negative so... */ 41560 SUB E /* Subtract exponent value */ 41580 MOV E,A /* Resave it to E as negative */
Retrieves the next character of the exponent and adjusts the target register as necessary
/* Done with exponent search - clean up and continue */ 41620 FINE: LDA VALTYP /* Get type in A */ 41640 CPI 2 /* Compare with integer (type 2) */ 41660 JNZ FINEF /* Not integer? Jump to finish float */ /* Negative integer finisher */ 41700 POP PSW /* Get the sign value */ 41720 XCHG /* Store input pointer. HL -> DE */ 41740 CZ INEG /* Negative? Negate value */ 41760 XCHG /* Get input pointer back */ 41780 RET /* Finished */ /* Floating point finisher - multiply for full number*/ 41820 FINEF: PUSH H /* Store input pointer on stack */ 41840 MOV A,E /* Get final exponent in to A... */ 41860 SUB B /* Less the digits beyond decimal point */ /* A now holds the number of times to multiple or divide by 10 */ 41900 FINEF2: CP FINMUL /* A still positive? Time to multiply */ 41920 CM FINDIV /* A negative? Time to divide */ 41940 JNZ FINEF2 /* Still here? Try this all again */ /* All shifting done - lets verify signs */ 41980 POP D /* Restore input point to DE (stored from line 41820 above) */ 42000 POP PSW /* Get the sign (stored from line 39120) */ 42020 CZ NEG /* Zero? Must be negative. Negate FAC */ 42040 XCHG /* Get input position from DE to HL */ /* Catch edge case of negating -32768 (which has no positive complement in S16) */ 42060 LDA VALTYP /* Get the current value type in the FAC */ 42080 CPI 4 /* Check for single precision (-32768 INT looks like SNG) */ 42100 RNZ /* Not SNG, so not our edge case - done! */ 42120 PUSH H /* It is single - save the input pointer we're about to smash */ 42140 LXI H,POPHRT /* Find POPHRT routine for CONIS routine prep */ 42160 PUSH H /* Store it on the stack (CONIS2 could RET to it) */ 42180 CALL CONIS2 /* Routine to check for -32768 */ 42200 RET /* Still here? No match - so just RET to POPHRT */
This series of routines wrap up input by verifying types, matching signs, and exeuting multiplication or division necessary to normalize the value in the FAC.
This ends the top-level FIN process. However, I will continue with a few of the standalone routines referenced above.
/* Find the decimal point and set flags */ 42300 FINDP: INR C /* Set flag for decimal point seen (377o + 1 = 0 [flag seen]) */ 42320 JNZ FINEF /* Not zero? We're here twice - end it */ 42340 CALL FINFRC /* This is the first decimal point - convert FAC to single/double */ 42360 JMP FINC /* Done - on to the next digit */ 42380 /* Converts FAC to single or double. ZF=1 ? single : double */ 42460 FINFRC: PUSH H /* Store input pointer */ 42480 PUSH D /* Store exponent tracker (DE) */ 42500 PUSH B /* Store decimal point info (BC) */ 42520 PUSH PSW /* Save FAC status info */ 42540 CZ FRCSNG /* Zero? (ZF=1) - convert to single */ 42560 POP PSW /* Restore FAC status info */ 42580 CNZ FRCDBL /* Not Zero? (ZF=0) - convert to double */ 42600 POP B /* Restore decimal point info (BC) */ 42620 POP D /* Restore exponent tracker (DE) */ 42640 POP H /* Restore input pointer */ 42660 RET> /* Finished */ /* Multiply FAC by ten one time */ 42740 FINMUL: RZ /* No exponent? No work to do */ 42760 FINMLT: PUSH PSW /* Save exponent (FOUT enters here) */ 42820 IFE LENGTH-2,< /* 12k version only */ 42840 LDA VALTYP /* Get the type */ 42960 CPI 4 /* Check if we have a single */ 42980 PUSH PSW /* Save type info from smash */ 42900 CZ MUL10 /* Single - multiply by 10 */ 42920 POP PSW /* Get type back */ 42940 CNZ DMUL10> /* Double - multiple by 1000 */ 43960 POP PSW /* Get exponent info */ 43980 DCRART: DCR A /* Reduce by 1 */ 43000 RET /* Done with single x10 operation */
This completes roughly half of the FAC code and where I'll stop to save space. The chosen code blocks should provide good background for those wishing to dig further.
String operations
I briefly discussed "string space" as the highest region of memory that grows downward to a default of 50 bytes. The MEMSIZ pointer is the highest valid memory address, which is also the base of string space. The FRETOP pointer starts at MEMSIZ and moves downward to point to the next free byte in string space. Keep this idea in mind when we discuss garbage collection in the next section - for now, we'll think about how strings get in to string space.
Strings can enter the system through three pathways:
- Input as a direct statement - String briefly lives in the buffer. A descriptor is created and the string is copied to string space
- Included within a program - String descriptor references program directly and no string space is used
- String generated at runtime - String result will be created in string space
With these interactions in mind, let's break the routines down in to two flavors: "Command" routines that directly implement BASIC commands (STR$, LEN, etc), and "Support" routines that interface between string space, descriptors, and other runtime components. We can think of the "Command" routines as standalone operations. Let's take a closer look at the "Support" routine interactions:

This diagram shows all the "support" functions except STROUT, which doesn't interact with other memory components. All others parse and move data among memory and registers. The top-level string operations include:
| Symbol | Type | Purpose |
|---|---|---|
| STR$ | Command | BASIC command to convert input numeric to a string type |
| STRCPY | Support | Copies a string in to string space and create a descriptor |
| STRLIT | Support | Creates a string descriptor in DSCTMP |
| PUTNEW | Support | Moves a new descriptor to one of the temporary positions |
| STROUT | Support | Print the string pointed to in HL |
| GETSPA | Support | Allocates string space for a new string |
| CAT | Support | Concatenates two strings with the + operator |
| FRETMP | Support | Frees a temporary string descriptor |
| LEN | Command | BASIC command to return the character length of an input string |
| ASC | Command | BASIC command to return the ASCII value of the input character |
| CHR$ | Command | BASIC command to return the character representing the input value |
| LEFT$ | Command | BASIC command to return the left-most characters specified |
| RIGHT$ | Command | BASIC command to return the right-most characters specified |
| MID$ | Command | BASIC command to return characters within a string |
| VAL | Command | BASIC command to convert input string to a number, if possible |
Let's dig in to two of the support functions and two of the command functions.
The STRCPY function begins with HL pointing to a string descriptor create from the input buffer. The result is a pointer to DSCTMP containing the new decsriptor. The only time STRCPY is called is within STRLIT. Note that the first byte of a the 3-byte descriptor is the length of the string.
29200 STRCPY: MOV A,M /* Move first byte of descriptor (length) in to A */ 29220 INX H /* Move pointer up (to low byte of string address */ 29240 IFN LENGTH-2,< /* But in 8K... */ 29260 INX H> /* Skip one more byte since 8K descriptors are 4-bytes */ 29280 PUSH H /* Push address on to stack */ 29300 CALL GETSPA /* Allocate string space */ 29320 POP H /* Get the address to the new space to HL*/ 29340 PUSHM /* Save address on stack and advance HL pointer */ 29360 POP B /* Recover address in to BC (smashing useless value) */ 29380 CALL STRAD2 /* Prepare DSCTMP */ 29400 PUSH H /* Save pointer to DSCTMP */ 29420 MOV L,A /* Get length in to low address (offset from H) */ 29440 CALL MOVSTR /* Move the string */ 29460 POP D /* Restore DSCTMP pointer */ 29480 RET /* Done! */
The quirk here is that descriptors may be 4 bytes if we not in the 8K version, so we must skip an extra byte after the length.
You'll notice that copying strings relies on the GETSPA routine to allocate string space. We'll dig in to that function now, which will tie in to our next section about garbage collection. Some things to know about the :
- Register A holds the number of bytes requested
- Register pair DE holds the pointer to the free string space
31440 GETSPA: ORA A /* Set flags for provided value */ 31460 XWD ^O1000,^O016 /* Skips the next POP instruction */ 31480 TRYGI2: POP PSW /* Alternate entry where A is on the stack */ 31500 PUSH PSW /* Save space requested on the stack */ 31520 LHLD STKTOP /* Find the absolute end of string space */ 31540 XCHG /* Store it in D/E for now */ 31560 LHLD FRETOP /* Get the current actual end of free space */ 31580 CMA /* Negate A (the number of bytes required) */ 31600 MOV C,A /* Move the negative bytes required to C */ 31620 MVI B,255 /* Extend complement to B (BC = negative size)*/ 31640 DAD B /* Add (the negative) size to FRETOP in HL */ 31660 INX H /* Add back one byte */ 31680 COMPAR /* Compare proposed string end with top of stack */ 31700 JC GARBAG /* Carry? Oops, no space. Attempt garbage collection */ 31720 SHLD FRETOP /* Space is available - Bump FRETOP down to new end */ 31740 INX H /* Move up one more byte above FRETOP */ 31760 XCHG /* FRETOP + 1 -> DE, STKTOP -> HL */ 31780 PPSWRT: POP PSW /* Restore the space request (char count) */ 31800 RET /* Done! */
This is a straight-forward rourtine that checks the space request, compares the potential new string with the stack top to detect collision, then garbage collects if necessary and returns the new pointer. Note the mechanical method of creating the negative memory offset with the complement (CMA) and sign extending the larger register of the pair with a 255.
Garbage collection
If you're coming from a modern language with garbage collection, the Altair BASIC implementation will seem simple and narrow. There are no nascent forms of mechanics like Reference Counting or Mark-and-Sweep. Instead, Altair BASIC garbage collection focuses solely on the 50 bytes of string space and attempts to find space when the stack may be smashed. Let's revisit our memory model alongside the GETSPA routine just discussed above:

As shown in the previous section, GETSPA may call garbage collection if the string space size requested would cause FRETOP to collide with STKTOP. We could simply throw an out of memory error and disappoint the user; however, Altair BASIC first attempts to recover string space by analyzing existing string descriptors to find and remove unnecessary allocations. Only if space recovery fails, should we throw an error.
Now for a closer look at the longest path through this process:

We begin in GARBAGby verifying that we haven't already run garbage collection, if so then we've failed to collect and error out immediately since the result is idempotent. If we go ahead with collection, then we first completely reset string space in preparation for building it back from scratch. We're not scanning the space for unnecessary strings in, but rather we're using existing descriptors to rebuild from scratch.
We then loop through all of the variable locations and push string candidates on to the stack. Ultimately, we unwind the stack by putting the living strings back in to string space.
Let's walk through the garbage collection code:
/* Initiate garbage collection if this is the first attempt */ 31840 GARBAG: POP PSW /* Get flags to test if this is a re-entry */ 32860 MVI E,ERRSO /* Prepare error handler in E */ 32880 JZ ERROR /* Zero means we've already failed to collect - error */ 32900 CMP A /* Not zero? Make it zero now */ 32920 PUSH PSW /* Put updated flags back on the stack */ 32940 LXI B,TRYGI2 /* Get the return point after garbage collection */ 32960 PUSH B /* ...and put it on the stack */ 32980 GARBA2: LHLD MEMSIZ /* Get the top of memory in to HL */ 32000 IFE REALIO,< /* If this is a real terminal */ 32020 MVI A,7 /* Get the BEL character */ 32040 OUTCHR> /* Ring it to announce garbage collection */
We only want to enter garbage collection once, so we first check if this is a reentry. If not, then we flag it as entered immediately. We find the top of memory and signal the user about garbage collection with the terminal bell
/* Get the next temp and jump to stack push if valid */ 32060 FNDVAR: SHLD FRETOP /* Nuke string space - MEMSIZ -> FRETOP */ 32080 LXI H,$CODE /* Get a zero */ 32100 PUSH H /* Push on stack to ensure garbage entry (ZF = 0) */ 32120 LHLD STKTOP /* Get start of stack to mark off non-string space */ 32160 PUSH H /* Put on stack (now 0 - STKTOP) */ 32180 LXI H,TEMPST /* Get first temp string */ 32200 TVAR: XCHG /* Move it to DE */ 32220 LHLD TEMPPT /* Get next temp string */ 32240 XCHG /* Swap with DE (now back in HL) */ 32260 COMPAR /* Compare TEMPPT and TEMPST */ 32280 LXI B,TVAR /* Get temp var garbage collector */ 32300 JNZ DVAR2 /* Pointers different? We have temp vars to collect */
On first pass, we reset string space (or shift if entering from elsewhere). From here we prepare to read through string temporaries first. Note that this basic block will jump down to the stack storing section, skipping the next two sections.
/* Scan simple variable space (VARTAB) */ 32340 SVARS: LHLD VARTAB /* Get the base of simple variable table */ 32360 SVAR: XCHG /* Put it in DE */ 32380 LHLD ARYTAB /* Get the end of the simple variable table */ 32400 XCHG /* Swap with base of table */ 32420 COMPAR /* Check if we're at the end */ 32440 JZ ARYVAR /* Yes? Move on to Array variables */ 32460 MOV A,M /* First memory location holds the 2nd char of name */ 32480 INX H /* Move up to the first char of the variable name */ 32500 INX H /* Move up to the byte length */ 32520 IFE LENGTH-2,< /* In the 12k+ version... */ 32540 INX H /* Move to the variable type (3 = string) */ 32560 CPI 3 /* Test it for a string */ 32580 JNZ SKPVAR /* Not a string? skip it */ 32600 CALL DVARS /* It's a string - collect it */ 32620 XRA A /* Sets Zero Flag to 1 to stop skips */ 32640 SKPVAR: MOV E,A /* Move A to E - Will be zero if fall-in */ 32660 MVI D,0 /* Clears D ensuring that DE is the skip amount */ 32680 DAD D> /* Move HL forward by DE bytes (the skip!) */ 32760 JMP SVAR /* Jump back up to skip more if necessary */
This section checks if simple variables are 'alive'. If so, skip the next basic block to the stack storing routine.
/* Scan array variable space (ARYVAR) */ 32800 ARYVA2: POP B /* Remove unneeded pointer for skipped var */ 32820 ARYVAR: XCHG /* Put ARYVAR base from HL in to DE */ 32840 LHLD STREND /* Get end of ARYVAR */ 32860 XCHG /* Swap base (to HL) with end (to DE) */ 32880 COMPAR /* See if they are the same (means no work to do) */ 32900 JZ GRBPAS /* Collecting done - jump to unwind phase */ 32920 IFE LENGTH-2,< /* If this is the 12K+ Altair BASIC */ 32940 MOV A,M /* Get the variable type in to A */ 32960 INX H> /* Move to the next byte */ 32980 CALL MOVRM /* Get the length in to BC */ 33040 PUSH H /* Put pointer to values on the stack */ 33060 DAD B /* Move forward to the next variable */ 33080 IFE LENGTH-2,< /* If this is the 12K+ Altair BASIC */ 33100 CPI 3 /* Check type for string */ 33120 JNZ ARYVA2> /* Not string? Nothing to clean. Next! */ 33200 SHLD TEMP3 /* Store the end of the array because we have to ... */ 33220 POP H /* Jump back to the beginning of this one to collect */ 33240 MOV C,M /* Get the number of dimensions (length) */ 33260 MVI B,0 /* Won't need high byte */ 33280 DAD B /* Move the length beyond the base */ 33300 DAD B /* Again - for the second byte */ 33320 INX H /* One beyond the end */ 33340 ARYSTR: XCHG /* Current HL position to DE */ 33360 LHLD TEMP3 /* Bring the end of the array back to HL */ 33380 XCHG /* Swap DE and HL */ 33400 COMPAR /* Compare both of these values */ 33420 JZ ARYVAR /* Same? Move to next array */ 33440 LXI B,ARYSTR /* Not same - get routine jump point and fall down to push*/
This section checks the array variables. All strings are pushed to the stack and afterward, we jump to the final unwinding routine.
/* Push the current variable on to the stack for later */ 33460 DVAR2: PUSH B /* Save return point on to the stack */ 33480 IFE LENGTH-2,< /* If this is the 12K+ Altair BASIC */ 33500 DVAR: /* Alternate label */ 33520 DVARS: XRA A /* Clear accumuatlor, ZF = 1 */ 33540 ORA H /* Check if we're pointed at a null */ 33560 INX H /* Move to next byte (low address) */ 33580 MOV E,M /* Move to low address to E */ 33600 INX H /* Bump to high address */ 33620 MOV D,M /* Move to D, DE now points to string value*/ 33640 INX H> /* Move to next descriptor */ 33840 RZ /* Was this null? Return */ 33860 MOV B,H /* Not null? Move to BC (high byte) */ 33880 MOV C,L /* Not null? Move to BC (low byte) */ 33900 LHLD FRETOP /* Get the pointer to the end of string space */ 33920 COMPAR /* Check string location relative to end of string space */ 33940 MOV H,B /* Recover HL (high byte) */ 33960 MOV L,C /* Recover HL (low byte) */ 33980 RC /* Carry? String isn't in string space, RET (ignore) */ 34000 POP H /* Get return address from push on 33460 */ 34020 XTHL /* Return address to stack and highest address to HL */ 34040 COMPAR /* Compare both (and note carry flag for later) */ 34060 XTHL /* Swap address back again (Ret address in HL) */ 34080 PUSH H /* And back on the stack */ 34100 MOV H,B /* Get back our present scanning location */ 34120 MOV L,C /* from lines 33860-33880 */ 34140 RNC /* No carry? Found a candidate to move */ 34160 POP B /* Get return address from 3 lines back */ 34180 POP PSW /* Pop off the highest variable seen */ 34200 POP PSW /* Pop off the pointer */ 34220 PUSH H /* Save the base of the string on the stack */ 34240 PUSH D /* Save the high end of the string on the stack */ 34260 PUSH B /* Save the return address on to the stack */ 34280 RET /* Return! */
This routine stores living string pointers on the stack, which will unwind in to string space in the next section
/* Unwind our variables on the stack in to string space */ 34360 GRBPAS: POP D /* Get the end of the string */ 34380 POP H /* Get the beginning of the string */ 34400 MOV A,L /* Get the low byte in to A */ 34420 ORA H /* Check for zero (null) pointer */ 34440 RZ /* Zero means end of work. Return */ 34480 DCX H /* Move back from the string to the descriptor */ 34500 MOV B,M /* Move high byte of descriptor in to B */ 34520 DCX H /* Move back to the first byte of the descriptor */ 34540 MOV C,M /* Low byte to C - BC = string pointer */ 34550 PUSH H /* Save pointer to update later */ 34560 DCX H /* Move back further to string length */ 34620 MOV L,M /* Get length in to H */ 34640 MVI H,0 /* Length has no high byte. HL = Length */ 34660 DAD B /* Add length to address - HL = beyond end of string */ 34680 MOV D,B /* High string addess -> D */ 34700 MOV E,C /* Low -> E. DE = Original Pointer */ 34720 DCX H /* Base + Length - 1. Last byte of string */ 34740 MOV B,H /* End of string in to BC */ 34760 MOV C,L /* DE and BC are the ends of the string */ 34780 LHLD FRETOP /* Get the current end of string space in HL */ 34800 CALL BLTUC /* Block transfer string to new location */ 34820 POP H /* Get point location saved on 34550 */ 34840 MOV M,C /* Store new low byte address result from block transfer */ 34860 INX H /* Move to high address */ 34880 MOV M,B /* Save new high address */ 34900 MOV L,C /* Get low byte in to HL */ 35920 MOV H,B /* Get high byte in HL. HL = new address */ 35940 DCX H /* Address - 1 to save to FRETOP at below jump */ 35960 JMP FNDVAR /* Move to next variable */
This section unwinds all our work staging string boundaries on the stack and block transfers the data to string space. Pointers are fixed up as after move.
Other sections worth further study
I reviewed every line of code during this write up and there were several sections I wanted to explore further but I decided to put off for another time (maybe). Here is a list of things worth further study:
- The individual BASIC command implementations
- The math package
- Error Handling
- Technical details of I/O between BASIC and the Altair hardware
Where's the brilliance?
Books have been written about the business case of being first on the block to bring programming in to the home. However, I will keep to the technical aspects of the Altair BASIC source code.
Allen, Davidoff, and Gates made a big program fit in a small space through a combination of these strategies:
- Routine re-use with multiple entry-points - Core routines are specified once but employ multiple header labels and preambles depending on origin path. This resembles tail merging and jump threadng used in later optimizing compilers
- Optimized fall-through - Avoid branching checks by ensure that the most commonly run code paths are adjacent.
- Functional identities - Build one 'base' routine and reuse it for similar cases. This is used extensively in the math package, such as programming only SINE then specifying COS, TAN in terms of SIN.
- Platform-specific manipulation - Taking advantage of the Altair and 8080 architecture to save space, such as using interrupt handlers for common functions
Routine re-use with multiple entry-points
This is essentially function-reuse in assembly with the complication of having multiple label names as entry points depending on the state of the registers at jump/call time. Simple compilers would just emit a separate routine for each case. The handcrafted version here essentially "factors out" the common tail reused by attaching an all-purpose preamble that 'catches' different states which fall through to the actual work. Consider the first few lines of STRLIT, which turns a string in to a literal:
/* STRLIT builds a descriptor for a string pointer to by HL+1 */ 29720 STRLIT: DCX H /* Decrement HL so that it string is HL+1 */ 29740 STRLTI: MVI B,34 /* Move quote charater in to B */ 29760 STRLT3: MOV D,B /* Before moving it to D */ 29780 STRLT2: PUSH H /* Save character start pointer on stack */ 29800 MVI C,255 /* Start character counter at max length */ 29820 STRGET: INX H /* Move to the next character (first if from above) */ 29840 MOV A,M /* Character to A */ 29860 INR C /* Count the character */ 29880 ORA A /* Check if null (end of line) */ 29900 JZ STRFIN /* Null? We're done */ 29920 CMP D /* Check for quote */ 29940 JZ STRFIN /* Quote? We're also done */ 29960 CMP 0 /* Check for other quote */ 29980 JNZ STRGET /* No? Still in string, go up and get more */ ... 9 more lines in this routine ...
The first four lines of the routine have a separate label name to capture and adjust the register states prior to doing the real work. Those 4 labels are referenced 7 times throughout the program saving 19 lines of code per entry over a simple compiled program. This pattern is repeated many times throughout the code in the following routines: DVAR, FRETMP, GARBAG, INLIN, INITDX, LEFT$:, LINPRT, READY, and many others
Optimized Fall-through
Unoptimized compiled assembly often builds self-contained routine blocks that call and return from each other. However, if we look closely at routines and discover that one function almost always follows another, we can skip branching instructions and simply fall into the subsequent routine, saving a branching instruction.
Often this technique is paired with one or more of the other techniques so we'll move on to a specific example in the next section that demonstrates both
Here are cases within the code where work is performed by other functions:
- ABS of the FAC falls in to NEG for negative cases
- GOSUB falls in to GOTO
- Floating addition falls in to normalize for negative cases
- Floating normalize falls in to zeroing FAC if the answer is zero
- Rounding falls in to overflow if it overflows
- Floating multiplication underflows fall in to FAC zero
Functional Identities
In some cases an existing function can be used as the workhorse for other functions with some slight modifications. Consider how Altair BASIC implements Cosine:
65900 SUBTTL SINE, COSINE AND TANGENT FUNCTIONS 65980 COS: LXI H,PI2 /* Get PI/2 constant */ 66000 CALL FADDS /* Add it to the input value */ 66020 /* Falls directly into sine */ 66220 SIN: CALL PUSHF /* Sine entry point ... Over 60 lines follow in the SIN routine ...
The cosine function offsets the input value to make it equivalent to the sine function. Afterward, we must call sine, therefore the cosine function always falls directly in to sine.
Another easy example is that subtraction is simply negative addition. So if we negate the second oprand, then we only need to add. Consider the Double Precision subtraction routine:
31840 DSUB: CALL NEG /* Negate second argument */ 31860 /* Fall into DADD */ 31960 DADD: LXI H,ARG /* Entry point for DADD */ ... over 50 lines of DADD follow which falls in to DNORM ...
Other operations where the bulk of the work is performed by a base operation:
- TAN is implemented by both COS and SIN
- Square roots immediately fall in to exponentiation
Platform-specific Manipulation
We discussed earlier that Altair BASIC disables interrupts and repurposes them for common operations. This allows invocation with a CPU instruction (RST) macro'd to a friendly name. It follows that interrupt handlers can use optimized fall-through, for instance:
Interrupt 1 (SYNCHK) automatically falls through to the next handler (CHRGET) since these two operations are inherently related, allowing us to avoid manual invoke of the second routine. In total, this fall-through is employed 27 times in the code, savings at least that many branch instructions in memory (across all versions, 4/8/12K)
Another important factor is the code organization. By placing optional code and initialization code at the end of the program, this space can be recovered at run time and reallocated to user programs. While this is absolutely critical for Altair BASIC to run on the platform, these techniques were standard practice in the memory-cosntrained era of the 1970s and 80s.
Fun stuff
While reading through the source, a few interesting and fun things jumped out as reminders of the era that are worth a mention. In no particular order:
Fortran style comparisons
Altair BASIC comments and notes exclusively use original Fortran comparison dot notations common through the 1980s (Fortran 77). These are the period delimited operaors like .GT., .LT., .EQ., etc. A few exmaples:
11340 IF THE STRING IS NOT NULL AND ITS POINTER IS 11360 .GT.MINPTR AND .LT.FRETOP, .... 27660 COMPAR ;SEE IF THIS IS .GT. ENTERING [H,L]
Fortran 90 introduced the modern operators (<=, ==, etc), but the dot operators are still supported even today.
Easter egg comments
Programming comments are always a great place to hide stuff. Altair BASIC is no exception:
36180 INLINC: CALL INCHR ;GET A CHARACTER 36184 CPI 7 ;IS IT BOB ALBRECHT RINGING THE BELL 36186 JZ GOODCH> ;FOR SCHOOL KIDS? 36200 CPI 13 ;IS IT A CARRIAGE RETURN?
It's interested to see the Bob Albrecht reference before he split off on the Tiny BASIC project - effectively undercutting the commercial value of Altair BASIC. The first meetings of the Homebrew Computer Club happened while the first lines of Altair BASIC were being written. He was already well known in the 'hacker' community for his newsletters and as an early Altair adopter, but I'm not aware that any of the Micro Soft trio having directly dealt with him when this version was written. If you're interested in this era, I recommend researching the influence of the HCC, it's members, and the tensions between software philosophies that arose in the mid-1970s.
Here's a comment from (probably) Davidoff to Gates in the math package:
24800 DCXBRT: DCX B> ;THIS IS FOR BILL. C WILL NEVER BE ZERO 24820 ; (THE MSB WILL ALWAYS BE ONE) SO "DCX B" 24840 ; AND "DCR A" ARE FUNCTIONALLY EQUIVALENT 24860 RET ;ALL DONE
I assume Gates held a high standard for comments since Davidoff went out of his way to notate almost every line in the math package.
Incomplete? Bugs?
Altair BASIC 3.0 was a work in progress and I doubt that these printouts were release ready. Since files F3 and F4 were printed weeks apart with different configs (12K vs 8K), there are inconsistencies with symbols and code paths. Some obvious syntax errors exist and would prevent a build. There are also a few semantic errors that would lead to unintended side affects. Consider these:
53240 FOUTDN: MVI M,0 ;PUT A ZERO AT THE END OF THE NUMBER 53260 XCHG ;SAVE THE POINTER TO THE END OF THE NUMBER 53280 ; IN (DE) FOR FFXFLV 53300 LXI H,FBUFFR+1 ;GET A POINTER TO THE BEGINNING 53320 RET ;ALL DONE ... ~300 lines later ... 59820 FOUTNV: PUSH D ;SAVE (DE) 59840 LDA VALTYP ;GET WHAT KIND OF VALUE WE HAVE 59860 CPI 4 59880 JNZ FOUTND ;WE HAVE A DBL
The problem above is that neither the label FOUTDN in the top block nor FOUTND in the bottom block appear anywhere else in the code. This wasn't found during load time because the code path is dead. Recall that the math package in the code listing is built at 8K, but these extended routines are 12K+ only. This was likely just a typo and FOUTDN (Floating output DONE) is the intended label so like 59980 should change from FOUTND to FOUTDN.
Multiple lines of the code listing have corrections penciled in showing that this is was probably a working draft for off-prem code review. You can't take that PDP home with you!
50060 RZ 50080 CMP C /* Changed from CMP B */ 50100 RZ
This block is from the 4K version of DATA and REM commands. This comparison was changed from B to C with pencil but header notes indicate that it was originally B and they decided to use register C at a late time.
01100 FBUFFR: BLOCK ^D13 ;BUFFER FOR FOUT 01120 IFE LENGTH-2,<BLOCK ^D<35-13>> /* Originally ^D<31-13> */
From the math package, they penciled in this extended buffer resize. Likely this was a design change */
45980 PUSH B ;PUT DUMMY FIELD LENGTHS ON STACK 46000 JMP STROUI## /* Originally JMP FOUT? */
Also from the math package. A pencil change from one routine to another.
PDP-10 code
We know that development of Altair BASIC began on a PDP-10 so it should be no surprise that this early version would include a trace amount of platform code:
44360 IFE REALIO,< /* This code is only live if REALIO is off - not a real Altair */ 44380 DDT: POP B /* Clean the stack */ 44400 HRRZ 14,JOBDDT## /* Load the debugger address to register 14 (stack pointer by convention)*/ 44420 JRST 0(14)> /* Jump to it */
This code block is not applicable to interpreter operation
FAQ
Will this code build and execute as-is?
Short answer: No.
The printouts provided by Gates for the 50th anniversary release show two source files (F3 and F4), from two different configurations, 12K and 8K respectively. If you attempt to run these together, both will operate with wrong assumptions about memory layout, specifically the size of the Floating Point Accumulator. The extended portions of the math package also won't define all of the math symbols required by the interpreter section.
Then there's the problem of missing dependencies, notably the "MCS808" header.
What would it take to build this code?
At the very least, you'll need to:
- Get Macro Assembler (or build your own) to handle the Macro-specific directives
- Locate the missing universal file "MCS808" referenced in the source (or make your own)
- You just need to resolve the missing symbols. Stub them or recreate them. Check out known dissasemblies or back the missing data out of an Altair BASIC tape
- Change the F4 common file to 12K version. (Line 00400 Length from 1 to 2) to ensure the math package matches the interpreter.
- Easier said than done, since the 12K floating point output routine seems incomplete
- Alternatively, you could downgrade F3 or both to a lower version
- Double check errors in my source transcription with the printout. While I believe the transcription is over 99% correct -- that's also known as "incomplete"
- Find an 8080 emulator to test the build - or a real Altair if you want the full experience!
Some dead code sections, especially in the math package, have unresolved symbols. It's not clear if these sections were permanently cut out or the devs were in the middle of significant changes. The Floating Point Output routines for 12K are especially problematic - but again, this printout shows only the 8K as 'live' so no errors are thrown.
There's more quirks that you'll need to battle through, but those steps address the high-level approach
How did you extract the code from the printouts?
75% script automation and 25% "the hard way". No generative AI or OCR.
Script automation generated the line numbers, headers, footers, and basically any element that was predictably repeatable.
I transcribed most of the remaining by hand over roughly 40 hours. This was combined with my code reading and some quick analysis with notes which provided the outline for this discussion.
Is there any way I can contribute?
I didn't really plan for this to be a 'living' project, but there is certainly room to clean up my transcription in the GitHub repo. The intent is a perfect line-by-line copy of the code listing printouts Bill Gates provided.
As I've said already, I'm confident the source transcript is 99% correct, but I wouldn't turn down a pull request with more corrections to the main listing (ALTAIRBASIC30.LST). I prioritize the corrections like this:
- Build-blocking corrections - assembly instructions and operands
- BASIC line numbers, because order matters
- Comment sections, because this is really about understanding
- Symbol table corrections
- Machine code corrections
- Page headers or anything else
Finally, if anyone has a better perspective about the discussion above, I'll be glad to update and add credits.
More questions added as they roll in

