[joho] JOHO - October 18, 2008

  • From: "David Weinberger" <self@xxxxxxxxxxx>
  • To: joho@xxxxxxxxxxxxx
  • Date: Sun, 19 Oct 2008 15:21:26 -0400

[image: joho logo]

October 18, 2008

This issue is archived
If this message has formatting problems, please try the archive link above.


*Exiting information: As we exit the Information Age, we can begin to see
how our idea of information has shaped our view of who we are.

*The view from 1978: What a 1978 anthology predicts about the future of the
computer tells us a lot about the remarkable turn matters have taken.

*A software idea: Text editor for audio: Anyone care to write software that
would make it much easier to edit spoken audio?

Name that software!

Half a Loaf of Half Baked Thought!

The lead article this time is particularly half-baked. But at least
half-baked stuff is chewy. Unfortunately, it's also sort of wet. And it can
expand in your digestive tract, causing cramps and, if untreated, internal
bleeding.

You're welcome!

[image: dividing line]
Exiting information


I'm giving a talk at the Ars Electronica conference in Austria at the
beginning of September, which is forcing me to try to find a path through my
past year's reading about the history and nature of information. I'm
facing two pretty big challenges: First, I don't know what I think. Second,
the Ars Electronica crowd knows this stuff better than I do. Other than
that, I'm in excellent shape :(

I've started outlining my talk, but it refuses to take shape. This is making
me nervous. So, here's some of what I have in my (s)crappy little outline.
It's really just notes toward an outline toward a talk toward a coherent
thought.

My amateurism in info theory shows in this piece. Please set me straight.

NOTE NOTE: I just heard from the conference organizers that the day of my
talk has a different theme. So, the following will have little or no
relation to what I actually say. Oh well. I still want to figure out what I
think about all this.

NOTE NOTE NOTE: As I postpone publishing this issue of JOHO longer and
longer, the conference approacheth and the parameters change. I'm going to
talk about the nature of truth in the broadcast and Internet ages. I think.
Thus, the following is now officially severed from my Ars Electronica talk.

NOTE NOTE NOTE NOTE: I've dithered so long on this essay that Ars
Electronica has come and gone. (You can see what I actually said at Ars
Electronica here <http://blog.whoiswho.de/stories/30719/>. It bears almost
no resemblance to this article.) Entire election campaigns have crested and
fallen. And still I can't get it right.


When computers were first introduced, we resented them as soulless machines
that enforced efficiency over humanity. Yet, now discipline after discipline
has reconceived itself as being fundamentally about information. DNA, we
think, is information, in the form of a code. Businesses measure success by
the informational models they build. Economies run on models, until the
bottom of the cliff smashes some sense into them.  The brain processes
information. We contrast atoms and bits, as if bits were as fundamental and
ubiquitous as atoms. Quanta, the stuff of physics, are understood by some —
see Charles Seife's excellent "Decoding the Universe
<http://www.decodingtheuniverse.com/>" — as nothing but information
processed by the computer formerly known as the universe.

From cradle to grave, from quirk to quark, we have thoroughly
informationalized ourselves. Now, as we are exiting the Age of Information —
oh yes we are — is a good time to ask what information has done to our world
and our way of thinking about it.
Information's big bang

Information begins in 1948 by being defined in opposition to noise. AT&T
wanted to know how many conversations it could send simultaneously through a
wire, which for a telephone company is as crucial as a supermarket knowing
how many customers a checkout clerk and bagger can process in an hour. For a
supermarket, the enemy is human reaction time. For telephone conversations,
it's noise: the cumulative crackle of the lines and the way signals can
interfere with one another. In 1948, Claude Elwood Shannon published a
paper that for the first time enabled us to quantify the amount of information
required to encode a message in a transmission and thus quantify the damage
noise does.

Information for Shannon is a measure of the probability of receiving a
particular message, where the message is  considered as one out of the set
of possible messages. When there are few possible messages, the actual
message conveys less information than when there are lots of ways it might
go. Receiving information decreases uncertainty (having guessed correctly a
letter in a game of hangman, you are less uncertain about the solution), and
its quantity is, in this sense, a measure of surprise. When there is no
uncertainty, there is no information to be gained, and informational entropy
is at its lowest.
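
Shannon's measure can be sketched in a few lines of code. This is a
hypothetical illustration (the function names are mine, not Shannon's): the
information of a message is -log2 of its probability, and entropy is the
average information across the set of possible messages.

```python
import math

def surprisal(p):
    """Information, in bits, gained by receiving a message of probability p."""
    return -math.log2(p)

def entropy(probs):
    """Average information over the possible messages: H = -sum(p * log2 p)."""
    return sum(p * surprisal(p) for p in probs if p > 0)

# Two equally likely messages (a fair coin): maximum uncertainty, 1 bit.
fair = entropy([0.5, 0.5])       # 1.0

# One certain message: no uncertainty, so no information to be gained.
certain = entropy([1.0])         # 0.0

# A lopsided source is less surprising on average than a fair one.
loaded = entropy([0.9, 0.1])     # about 0.47 bits
```

The hangman example works the same way: each correctly guessed letter
shrinks the set of possible solutions, so the next message carries less
information.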

Shannon's technical use of the term "information" was new. For example, in
1939, Shannon wrote to Vannevar Bush: "I have been working on an analysis of
some of the fundamental properties of general systems for the transmission
of intelligence..." "Intelligence" became "information." (See Erico Marui
Guizzo's unpublished dissertation, "The Essential Message: Claude Shannon
and the Making of Information Theory," for a lucid, understandable account
of Shannon's context and contribution.)

Shannon gave us a diagram that we today take for granted:

message > encode > channel > decode > message
                      ^
                    noise

 The top line shows the transmission of information. On the bottom, noise
enters the channel and interferes with the signal.  Shannon worked out the
mathematics of the interplay of these two lines, putting the movement of
bits — information — on a sound mathematical, probabilistic footing.
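
The diagram can be played out in a toy simulation (entirely made up here,
not Shannon's own formalism): a message is encoded into bits, the channel
flips each bit with some probability, and the receiver decodes whatever
arrives.

```python
import random

def encode(message):
    """message > encode: turn text into a list of bits."""
    return [int(b) for ch in message for b in format(ord(ch), "08b")]

def channel(bits, noise=0.0, rng=None):
    """The channel: noise flips each bit with the given probability."""
    rng = rng or random.Random(0)
    return [bit ^ (rng.random() < noise) for bit in bits]

def decode(bits):
    """decode > message: reassemble the bits into text."""
    return "".join(chr(int("".join(map(str, bits[i:i + 8])), 2))
                   for i in range(0, len(bits), 8))

sent = "hello"
clean = decode(channel(encode(sent), noise=0.0))  # noiseless: message survives
noisy = decode(channel(encode(sent), noise=0.2))  # noisy: message is garbled
```

The mathematics Shannon worked out is precisely about how much redundancy
you must add to `encode` so that `decode` can survive a given level of
noise.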

The term was immediately taken up by computer science and other disciplines
(notably at the remarkable Macy Conferences on cybernetics), often in ways
inconsistent with Shannon and with one another. Once the Information
Age takes off, not only don't we generally mean what Shannon meant by
"information," we often don't mean anything very precise.
History of info

The canonical history of information runs roughly like this: In 1801,
Jacquard <http://en.wikipedia.org/wiki/Jacquard_loom> invented a loom that
was "programmed" by punch cards. Twenty years later, Charles Babbage
<http://en.wikipedia.org/wiki/Charles_Babbage> designed mechanical
computers (the Difference Engine and especially the Analytical Engine) that
remarkably mirrored the architecture of modern computers. But, because of
the limitations of manufacturing at the time, he wasn't able to actually
build one. (Stir in Ada Lovelace
<http://en.wikipedia.org/wiki/Ada_Lovelace>, daughter of Lord Byron, who
envisioned the power of Babbage's computer more clearly than Babbage did.)
Then, Herman Hollerith <http://en.wikipedia.org/wiki/Herman_Hollerith> took
up the loom card idea and applied it to tabulating machines, which proved
themselves decisively in the 1890 U.S. Census. Hollerith's machines
eventually became IBM's. In the 1930s, Alan Turing
<http://en.wikipedia.org/wiki/Alan_Turing> invented the Turing Machine, a
theoretical construct that provided the basis for modern computers as
universal machines. In 1948, Claude Shannon wrote
the paper defining information theory.

That's the canonical history, but it's wrong. The true history of
information is discontinuous. Shannon's definition marks a break. Perhaps in
my talk I will take punch cards as my illustrative object, arguing that loom
cards and Babbage's cards were essentially not information, that Hollerith
gets a little closer to the modern definition, but only with Turing and then
Shannon does the modern meaning emerge.

So, what was the pre-Info Age meaning of the term "information"? You can see
two senses in Charles Babbage's 1864 memoir, Passages from the Life of a
Philosopher, where he uses the term 28 times, never in Shannon's sense. First,
"information" means simply something that you're about to learn, either
because you didn't know it or because there was a change in the world (=
news). E.g., "I just got some information." Second, Babbage's memoir also
talks about information as the sort of thing that's in a table. To this day,
those are roughly what we mean when we use the term in non-technical,
ordinary talk.

The history of tables is pretty durn fascinating and I'd love to find a
reason to talk about it. Before modern computers, the application of
mathematics required tables of pre-computed numbers. But, tables were
necessarily error prone; a book of tables could have thousands of mistakes,
and the errata had errata. Jonathan Swift denied that applied math would
ever work because of this problem. And that's the very problem Babbage set
out to solve with his  "computers," which were really nothing but
steam-driven (well, hand-cranked) table generators. Babbage was inspired by
Adam Smith's ideas about the division of labor, breaking the manufacturing
of, say, pins into separate steps. Babbage did the same for the
manufacturing of mathematical tables. He was more concerned with friction
and gear slippage than with information in the modern sense. (And, by the
way, the low-paid people, frequently women, who filled in each and every box
in a table were called "computers.")

Both of these prior meanings of "information" are fundamentally different
from the modern sense if only because they refer to special cases: something
you're about to learn or the content of tables. In the Information Age,
information is the most unspecial stuff around. Everything is information or
can be information. Information becomes co-extensive with the stuff of the
universe itself...quite literally for those who think the universe
ultimately is a computer.
Origins of modern info in the crucible of WW II

How did we get from Babbage's to Shannon's view of information? Again, tons
of fantastic material has been written on this. But for now I want to pin it
on World War II. In particular, four war-based circumstances gave rise to
the modern idea. (A great book about how the military influenced the
development of information science: The Closed World, by Paul Edwards.)

First, World War II was even more far flung and complex than railroads and
other corporate enterprises. To command and control our World Wide War, we
needed to extend the office tools — typewriters, forms, calculators, filing
systems, etc. — we'd developed over the past century. The new tools and
procedures created yet more simple, predictable,
bureaucratically-processable information.

Second, some of the creators of info theory during the War created
computers so artillery could shoot at where airplanes will be, not at where
they are. Yes, it's tables again. (This goes way back. Galileo created and
sold artillery tables.)

Third, many of these people worked on cryptography, i.e., the encoding and
decoding of messages via a predictable, controlled system. Cryptography got
taken not as an exceptional case but as the ordinary case of communication:
You translate meanings into arbitrary symbols that the other person
decodes. Information as Shannon sketched it mirrors the behavior of a spy
behind enemy lines. Once again in the history of ideas, we've taken an
exceptional case (spies trying not to be too easily understood) as our model
for the ordinary cases.

Fourth, troops on the battlefield couldn't hear commands because
the machinery of war was so loud. So, a major project was undertaken, with a
center set up at Harvard, to enable battlefield communications to survive
the noise. One result: A thousand-word controlled vocabulary designed to be
distinguishable over booming guns and grinding gears. (The Able, Baker,
Charlie alphabet was born in this project. Was the idea of issuing
"commands" to a computer also born there?) That's a major reason that
Shannon's info theory talked about signal vs. noise.

After the war, the military continued to be the major funder of computer
research. It's no accident that we ended up with nuclear strategy based on
war simulations, a Cold War based on controlling what information the other
side had about your capabilities, and McNamara's Vietnam war in
which progress was measured by a spreadsheet of deaths (= body counts).
Businesses adopted the control-through-models approach, greatly facilitated
by the rise of the personal computer and the invention of the spreadsheet
(thank you, Dan Bricklin <http://www.bricklin.com/log/> and Bob Frankston,
two of the least militaristic people around), which let organizations
simulate business choices the way the Rand Corporation simulated limited
nuclear war.

Ah, abstracting war! The reasons to do so are so compelling: Formalizing
this most embodied of social actions let us get a sense of mastery over it,
and formalizing it let us persevere with body counts that poetry would have
made impossible.

But not only war. Shannon's theory defines information at its most abstract:
not the dots and dashes of the telegraph or the modulations of radio waves
or the pixels on your LCD, but information abstracted from any instance of
its communication. Because of this utter abstractedness, information theory
can be applied to everything, from traffic to clouds. But, as with any
paradigm (and, dammit, I am using the word in its correct, Kuhnian sense),
it hides as well as reveals.
The final abstraction

It all goes back to bits.

A bit is the fundamental unit of information because a bit is simply a
measurable difference. Is this a fruit or a berry? That's a bit. Is it
edible or not? That's a bit. Is it smooth or lumpy? That's a bit. Is it
genetically engineered or not? That's a bit. Does it rhyme with "jaw merry"
or not? That's a bit. Did Shakespeare ever eat one of them or not? That's a
bit. A bit is a difference worked out to its most general, simple, and
abstracted nature: Yes or no, on or off, one or zero, black or white, is or
isn't. In fact, a bit doesn't even tell you which polarity is under
consideration. A bit is something that has been brought down to a state of
two possible alternatives, regardless of what the something is and
regardless of what the two alternatives are.

Everything can become a bit because everything can be seen as being in
either one of two possible states. Of course, you have to stipulate the
level of precision you're talking about: "The earth is approximately a
sphere" is a bit even though we live on a lumpy, misshapen planet, because
the "approximately" specifies a degree of precision. This is not an
objection to bits, for every measurement comes with a degree of precision,
whether enunciated or not. Even the encoding of bits in physical media comes
with its own scale of precision: A computer transistor is counted as a "one"
if its voltage is high enough. And let's not even get started on hanging
chads.

But, bits qua bits are a special unit of measurement, for they don't measure
any particular quality. "This is green or not" is a bit just as much
as "George W. Bush was our greatest president or not." Inches measure
length, and lumens measure brightness, but bits measure any difference. Bits
thus are the ultimate reductions, dropping every particular quality, and
instead measuring pure is-or-isn't-ness.
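
The berry questions above can be made concrete in a small, invented
illustration: wildly different questions about one object all reduce to the
same contentless unit.

```python
# One strawberry, many incommensurable qualities.
strawberry = {"kind": "berry", "edible": True, "texture": "lumpy", "gmo": False}

# Four questions that measure four entirely different properties...
questions = [
    ("Is it a berry?", strawberry["kind"] == "berry"),
    ("Is it edible?",  strawberry["edible"]),
    ("Is it smooth?",  strawberry["texture"] == "smooth"),
    ("Is it GMO?",     strawberry["gmo"]),
]

# ...and four answers that, stripped of what they were about, are just bits.
bits = [int(answer) for _, answer in questions]    # [1, 1, 0, 0]
```

Nothing in `[1, 1, 0, 0]` records whether a given bit measured edibility or
genetics; that is exactly the reduction to pure is-or-isn't-ness.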

Why measure difference? Initially, this view enabled Shannon to derive a
general theory for measuring the capacity of communication channels, leading
to better management of the transmission of bits. And, of course, the
awesome power of digital computers to process and transform strings of bits
gave us a very practical reason for translating as much of our world into
bits as possible. The emptiness of bits — their abstractedness — enables us
to model everything, and their simplicity enables us to build machines that
manipulate those models. Bits are the ultimate enablers.

Because bits measure any type of difference, there is no place they can't
apply. They cover as much territory as atoms. In fact, more than atoms. So,
we can imagine a bit-based brain that models the atom-based brain, and we
can even imagine that the bit-based brain might be conscious.

But there is an important difference between atoms and bits. Atoms exist
independent of us. With bits, it's not so clear. For one thing, to get a
bit, you have to decide on the level of precision. [Here I nod to Bruno
Latour, who would dispute my implied metaphysics.] And you have to be
willing to do something even more unnatural. Because bits measure no
property, or, more exactly, can measure any property, they stand in symbolic
relationship to that which they measure. We can process bits without knowing
what they stand for, but we only gather bits — bits only emerge — because
they do stand for something we care about. As Gregory Bateson famously
said, a bit is a "difference that makes a difference." This bit stands for
the hue of a nanosquare of Natalie Portman's face and that bit stands for
part of the precise location of my avatar in World of Warcraft. We care
about bits  because some are telephone calls, some are weather data, and
some are the windmill in an online game of miniature golf. That's why we
turned that segment of our world into bits in the first place. Bits only
emerge because they represent particular attributes that we care about.

Bits have this symbolic quality because, while the universe is made of
differences, those differences are not abstract. They are differences in a
taste, or a smell, or an extent, or a color, or some property that only
registers on a billion dollar piece of equipment. The world's differences
are exactly not abstract: Green, not red. Five kilograms, not ten. There are
no differences that are only differences. There are only differences of
something in some way or another. There are no bits, except as abstractions.

Bits never fully escape our involvement in them, and they never become fully
abstract. Take a digital camera.  We turn a visual scene into bits in our
camera because we care about the visual differences at that moment, for some
human motive. We bit-ify the scene by attending to one set of differences —
visible differences — because of some personal motivation. The bits that we
capture depend entirely on what level of precision we care about, which we
can adjust on a camera by setting the resolution. To do the bit-ifying
abstraction, we need analog equipment that stores the bits in a particular
and very real medium. Bits are a construction, an abstraction, a tool, in a
way that, say, atoms are not. They exist because they stand for something
that is not made of bits.

This is by no means a criticism of Shannon's theory. That theory has
such enormous power because it abstracts information in a way that the world
itself never allows. Shannon's diagram depicts information in a Platonic,
mathematical purity: message > encode > channel > decode > message. Its
abstractness is its strength. Yet, the messy, insistent world remains
present in Shannon's diagram. It's there as noise, intruding, interrupting,
distracting, contradicting itself, creaking with history and the
inevitability of failure.

Noise is the sound of the world refusing abstraction, insisting on
differences that are never the same as every other difference.

If we are indeed exiting the age of information, perhaps we are entering —
have entered — the age of noise.

# # #

As far as I know, Nicholas Negroponte started the atoms vs. bits meme, in Being
Digital in 1995. But he didn't mean it then the way I'm suggesting we
currently sometimes take it. He was describing two economies, not making a
metaphysical claim.

A bit can take multiple values, despite the fact that the name comes from
"*binary* digit." See J.E. Robertson's "The Correspondence between Methods
of Digital Division and Multiplier Recoding," in IEEE Transactions on
Computers, Vol. C-19, No. 8, August 1970, which I have not read.

My current epigram: The opposite of noise is attention.
[image: dividing line]
The view from 1978

Almost thirty years ago, some professors at MIT published a book of essays
looking back at twenty years of computing history, and looking forward
twenty. Called The Computer Age: A Twenty-Year View and edited by Michael
Dertouzos and Joel Moses, its essays were written in the
late 1970s, back when if you knew how to use a computer you could probably
also name every "away team" in the first season of Star Trek.

Joel Moses, in the first essay, got a surprising amount right. We would be
paying our bills electronically, we would subscribe to mailing lists, we
might even have what he called the "surprising concept of the office in the
home." Others underestimated. Victor Vyssotsky thought that "Electronic
correspondence will increase...slowed somewhat by the fondness for engraved
letterheads and hand-inscribed signatures..." It turns out that except for
wedding invitations and thank yous from monarchs, we actually weren't as
attached to engraved paper as he thought. Robert Noyce of Intel perhaps was
a tad self-interested when he wrote "The potential for developing
inexpensive processing power is truly awesome," but he was also right on the
money.

Networking shows up, but the contributors assumed it was going to be
something quite different. J.C. Licklider foresaw what he calls the
Multinet, which is a network of networks, just as the Internet is. Marvin
Denicoff was excited about the ARPA network becoming more widely available,
which is what did indeed become the Internet. But none of them foresaw  just
how simple the Internet was going to be. Licklider's Multinet was going to
knit together various specialized networks, some that would handle speech
and some that would handle video. B.O. Evans thought a national-scale
network would be complex to build because it would have to provide all sorts
of services, including "systems recovery, network management, traffic
routing, multiple-data-base access, data-base sharing, and security and
privacy." All these prognosticators assumed, to one degree or another, that
the network of networks would be carefully planned and controlled, with the
right services built into it, just as if you were going to build a telephone
network, you'd make sure it had a way to meter people's usage and have a 411
number for directory assistance.

But the Internet took a different turn. In 1983, three engineers not
included in this anthology wrote a technical paper
<http://www.reed.com/Papers/EndtoEnd.html> on what they called the
"end-to-end" principle. It said that the best way to
build this new Internet thing was to keep it as simple as possible. Don't
build in any services beyond what you need to get data from point A to B.
Let the innovation and services — from security to searching to programs to
name your dog — let them all be created by the users of the Net.

And because the prognosticators didn't foresee that the Net would be so
simple and stupid, they couldn't foresee how everyone with a keyboard would
pitch in. Sidney Fernbach assumed a government would have to build a
National Data Library. Daniel Bell assumed that because information is a
collective good, no individual would have an incentive to build it out. They
couldn't foresee the power of the market to build a Google, and they would
have been flabbergasted by the way we all pitched in to build Wikipedia, or
LibraryThing <http://www.librarything.com/>, or even to find the continuity
errors <http://www.imdb.com/title/tt0034583/goofs> in beloved films. The
amount we've built together for free surpasses not only assumptions about
technological predictions but many assumptions about human motivation and
human nature.

The fact that this book by remarkably insightful men — yes, all men — failed
to predict the most important change computers would bring is a sign not of
their failure but of the unpredictable transformative human power the
Internet has unleashed.
 [image: dividing line]
A software idea: Text editor for audio

Here's an idea for some software I'd like built. If you'd like to build it,
or if you know of a way to get this done — as open source software, natch —
could you please send me an email to let me know? As always, I'm at
self@xxxxxxxxxxx.

I occasionally do podcast interviews, but I do them rarely because, even
with today's modern audio editing software, it's still a pain in the tuchus.
Let's say you want to remove an "um" or cut
out a question and response. You have to open the file in your audio editing
software, find the beginning of the text you want to cut (which requires
listening with the reflexes of a cat to set the marker), find the end of
where you want to cut (again with the cat-like reflexes), and do the cut. If
you want to move the question and response to a different spot, you have to
listen to it again to find the spot you want and then do the insert. This is
all quite counter-intuitive and clumsy, although audio folk think it's easy
and natural. They're wrong.

So, how about this instead:

Run the audio through this new piece of software. It uses some existing
speech-to-text processor to create a reasonably good transcript of the
words. Then it lets you use its primitive word processing functionality to
cut words, move them, or copy and paste them. You can't select anything
smaller than a word, but you can select as many of them as you want. Whole
paragraphs or more! When you're done arranging the words, you press the
"Done" button and it assembles a copy of the original audio file with the
cuts, moves, and copies done for you automatically. It can do this because
when it created the transcript, it kept track of the time-codes. When you're
done editing, it uses the time-codes as a guide for automatically assembling
the finished audio.

So, to edit an audio file of the spoken word, you could simply edit a text
transcript of it.
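
The mechanism could be sketched like this. Everything here is hypothetical:
the canned transcript stands in for a real speech-to-text engine, and a real
tool would splice actual audio with an audio library. The point is just that
each transcribed word keeps its time-codes, so editing the word list yields
a cut list for reassembling the audio.

```python
def transcribe(audio_file):
    """Stand-in for a speech-to-text engine: (word, start, end) triples.

    A real implementation would run a recognizer over the audio file and
    keep the time-codes it reports for every word."""
    return [("um", 0.0, 0.4), ("hello", 0.4, 0.9),
            ("um", 0.9, 1.3), ("world", 1.3, 1.9)]

def cut_list(words):
    """Turn an edited word list into the audio segments to splice, in order."""
    return [(start, end) for _word, start, end in words]

words = transcribe("interview.wav")

# Edit the *text*, exactly as in a word processor: delete every "um".
edited = [w for w in words if w[0] != "um"]

# The surviving time-codes say which pieces of the original audio to keep.
segments = cut_list(edited)      # [(0.4, 0.9), (1.3, 1.9)]
```

Moving a question and response would be an ordinary cut-and-paste in the
text; the assembler would simply emit the segments in their new order.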

Wanna help?
[image: dividing line]
Bogus Contest: Name that software!

What would be a good name for the software described in the previous
article?

To get you started, here are some that are not good names for the software:

A2T2A (Audio to Text to Audio)




Speech Processor


You certainly can do better than this...

 Editorial Lint

JOHO is a free, independent newsletter written and produced by David
Weinberger. If you write him with corrections or criticisms, it will
probably turn out to have been your fault.

To unsubscribe, send an email to
"unsubscribe" in the subject line. If you have more than one email
address, you must send the unsubscribe request from the email address you
want unsubscribed. In case of difficulty, let me know: self@xxxxxxxxxxx

There's more information about subscribing, changing your address, etc., at
www.hyperorg.com/forms/adminhome.html. In case of confusion, you can always
send mail to him at self@xxxxxxxxxxx. There is no need for harshness or
recriminations. Sometimes things just don't work out between people.

Dr. Weinberger is represented by a fiercely aggressive legal team who
responds to any provocation with massive litigatory procedures. This notice
constitutes fair warning.

The Journal of the Hyperlinked Organization is a publication of Evident
Marketing, Inc. <http://www.hyperorg.com/evident/evihome.html> "The
Hyperlinked Organization" is trademarked by Open Text Corporation.
For information about trademarks owned by Evident Marketing, Inc., please
see our Preemptive Trademarks™™ page at

[image: Creative Commons License]
This work is licensed under a Creative Commons
Attribution-Noncommercial-Share Alike 3.0 United States License.

Other related posts:

  • » [joho] JOHO - October 18, 2008