B Programming Language: History, Features, and Modern Reconstruction


YouTube video ID: cYS57xJuRP8

Source: YouTube video by Computerphile


B emerged as a simplified version of BCPL, often described as “BCPL filtered through Ken Thompson’s brain.” The language was created to run on the PDP‑7, a small machine with severe memory constraints. There was never an “A” language; the name B most likely derives from its predecessor, BCPL.

Technical Characteristics

B operates with a single data type: the machine word. It has no explicit types such as int or char and no separate pointer type; an address is simply another word, and any word can be dereferenced. Because there is no native byte type, extracting a character from a string requires a dedicated char function.
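The role of a char function can be illustrated in C. This is a minimal sketch, not B's actual library code: it assumes a 32-bit word and that characters are packed low byte first, and the names b_char and b_store_char are hypothetical helpers chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t word;                      /* assumption: a 32-bit machine word */
enum { BYTES_PER_WORD = sizeof(word) };

/* Return the i-th character of a string packed into words (low byte first). */
word b_char(const word *s, size_t i) {
    word w = s[i / BYTES_PER_WORD];
    unsigned shift = 8u * (unsigned)(i % BYTES_PER_WORD);
    return (w >> shift) & 0xFFu;
}

/* Hypothetical counterpart: store character c at position i of a packed string. */
void b_store_char(word *s, size_t i, word c) {
    size_t k = i / BYTES_PER_WORD;
    unsigned shift = 8u * (unsigned)(i % BYTES_PER_WORD);
    s[k] = (s[k] & ~((word)0xFFu << shift)) | ((c & 0xFFu) << shift);
}
```

Because every value is a whole word, the character comes back as a word too; the shift and mask do the work that a byte type would otherwise do implicitly.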

The compiler generates threaded code, a technique also used by Forth implementations, which keeps the compiler’s footprint tiny. The auto keyword, now familiar from C and C++, originated in B, where it declares variables on the stack. Early versions of B had no for loop, so programmers relied on while loops for iteration.

Reconstruction Process

The original B compiler disappeared, prompting a reverse‑engineering effort to make B code runnable again. By analyzing surviving object code, standard library functions such as printf, and existing documentation, a new compiler was built from scratch. The reconstructed compiler consists of roughly 1,000 lines of B code and is capable of compiling itself.

Because the original intermediate code format is no longer known, the reconstruction uses a newly invented intermediate format between the compiler and the B assembler. The assembler then produces output compatible with the Unix assembler.

Mechanisms & Explanations

Compilation Process – B source is fed to the B compiler, which emits intermediate code. The B assembler then translates that code into output the Unix assembler can process.

Threaded Code – The compiler emits threaded code for a stack machine. This design choice stems from the original compiler’s need to fit into a very limited memory space, precluding complex optimizations.
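To make the idea concrete, here is a small C sketch of threaded code driving a stack machine. It is an illustration under stated assumptions, not the format the original or reconstructed B compiler uses: the cell layout, the routine names (op_const, op_add, op_mul, op_halt), and the 64-word stack are all invented for this example.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t word;                  /* one machine word: data or an address */

typedef struct vm vm;
typedef void (*op_fn)(vm *);

/* A cell of threaded code is either a routine address or an inline operand. */
typedef union { op_fn fn; word w; } cell;

struct vm {
    const cell *ip;                     /* next cell to execute */
    word stack[64];
    int sp;                             /* number of words on the stack */
    int running;
};

static void push(vm *m, word v) { assert(m->sp < 64); m->stack[m->sp++] = v; }
static word pop(vm *m) { assert(m->sp > 0); return m->stack[--m->sp]; }

/* Each routine does one small job; the "program" is just a list of them. */
void op_const(vm *m) { push(m, (m->ip++)->w); }   /* operand sits in the next cell */
void op_add(vm *m) { word b = pop(m); word a = pop(m); push(m, a + b); }
void op_mul(vm *m) { word b = pop(m); word a = pop(m); push(m, a * b); }
void op_halt(vm *m) { m->running = 0; }

/* The whole "inner interpreter": fetch the next routine address and call it. */
word run(const cell *code) {
    vm m = { .ip = code, .sp = 0, .running = 1 };
    while (m.running) {
        op_fn fn = (m.ip++)->fn;
        fn(&m);
    }
    return pop(&m);
}
```

A program for (2 + 3) * 4 would be the cell sequence op_const, 2, op_const, 3, op_add, op_const, 4, op_mul, op_halt, and run returns 20 for it. The compiler only has to emit such address lists, which is far less work than generating and optimizing native instructions.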

Portability – Relying solely on machine words and a threaded code model makes B highly portable. Modern implementations have run on PDP‑11, PDP‑10, RISC‑V, MIPS, and x86 architectures.

Portability in Practice

The language’s minimalism (a single data type, no pointer type, and threaded code) allows the same B source to be compiled across diverse hardware. This portability aligns with the original design goal of running on constrained machines while still being adaptable to contemporary platforms.
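One way to picture why a word-only model ports easily is the C sketch below. It is an assumption-laden illustration rather than code from any B implementation: word is taken to be the target's pointer-sized integer, memory is a hypothetical flat array of words, and load, store, and sum_vector are invented names. There is no pointer type here; a word is simply used as an address when needed.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t word;              /* assumption: word = pointer-sized integer */

enum { MEM_WORDS = 256 };
static word mem[MEM_WORDS];         /* hypothetical flat, word-addressed memory */

/* Treat a word as an address and fetch the word it designates. */
word load(word addr) {
    assert(addr >= 0 && addr < MEM_WORDS);
    return mem[addr];
}

/* Treat a word as an address and overwrite the word it designates. */
void store(word addr, word value) {
    assert(addr >= 0 && addr < MEM_WORDS);
    mem[addr] = value;
}

/* Sum n words starting at address v: indexing is plain word arithmetic. */
word sum_vector(word v, word n) {
    word total = 0;
    word i = 0;
    while (i < n) {                 /* a while loop, as early B would require */
        total = total + load(v + i);
        i = i + 1;
    }
    return total;
}
```

Only the typedef changes from one machine to the next; the rest of the source says nothing about word width, which is the property the paragraph above attributes to B.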

Notable Quotations

  • “B was actually uh probably called B because it was a simplified version of BCPL.”
  • “B was like BCPL filtered through Ken Thompson’s brain.”
  • “One interesting feature of B is that it only has a single data type just the machine word.”
  • “The compiler is very short uh because it had to fit into a small memory uh amount of memory.”
  • “You can write a compiler in like about thousand lines of B code that compiles itself.”

Takeaways

  • B originated as a simplified version of BCPL designed to run on the memory‑constrained PDP‑7, and its name reflects the direct lineage from BCPL.
  • The language uses only the machine word as its data type, has no separate pointer type, and employs threaded code to keep the compiler footprint minimal.
  • The original B compiler was lost, so a reverse‑engineered version of about 1,000 lines of B code was created by analyzing surviving binaries and library functions.
  • The reconstructed compiler generates an intermediate code format that the B assembler translates into Unix‑compatible object files.
  • Because B relies solely on machine words and threaded code, it runs on a wide range of architectures, from PDP‑11 to modern RISC‑V, MIPS, and x86 systems.

Frequently Asked Questions

What single data type does B use for all values?

B uses the machine word as its only data type, meaning every variable occupies a full word of the target architecture. This design eliminates explicit types like int or char, and an address is held in an ordinary word rather than in a distinct pointer type.

How was the lost B compiler rebuilt?

The compiler was rebuilt by reverse‑engineering surviving object code, standard library routines such as printf, and existing documentation. The result is a self‑hosting compiler of roughly 1,000 B lines that produces an intermediate code format for a B assembler.
