[Ilugc] Opening Google..

From: linux@xxxxxxxxxxxxxxxx (Suresh Ramasubramanian)
Date: Sun, 25 Apr 2004 13:06:54 +0530

Suraj wrote:

Pray how you figured google is NON-EVIL? (and why CAPS ON?)

Try this article by Brad Templeton -
http://www.templetons.com/brad/gmail.html

srs

--
linux@xxxxxxxxxxxxxxxx (Suresh Ramasubramanian)
jaharkes@ravel:/usr/src$ mv linux Gnu/Linux
mv: cannot move `linux' to `Gnu/Linux': No such file or directory
jaharkes @ cs.cmu.edu in reply to RMS on linux.kernel

The GMail Saga

Much has been written about the new Google GMail trial, which is an e-mail
service that offers a gigabyte of archiving, Google search of your mail
archives and a nice interface. It's free, because while you read your mail,
Google ads will be displayed based on keywords found in the mail you are
reading (just as the Google Adsense program shows ads like the ones you see
on the right based on words found in this document.)

GMail created a surprising storm for a product that hasn't yet been released.
A coalition of privacy groups asked Google to hold back on releasing it. A
California state senator proposed a law to ban the advertising function.
Editorials and blog entries left and right have condemned it and praised it.

I come to this problem from two sides. One, I'm a fan of Google, and have
been friends with Google's management since they started the company. I've
also consulted for Google on other matters and make surprising revenue from
their Adsense program on my web site.

I'm also a privacy advocate and Chairman of the Electronic Frontier
Foundation, well regarded as one of the top civil rights advocates in
cyberspace. The EFF has issued some statements of privacy concern over GMail,
though we declined joining the coalition against it. (I'm writing this as my
own essay, though with some advice from the EFF team.) I've also had a chance
to talk at length with Google President Larry Page about some of the issues.

Here's a summary of some of my conclusions:

   1. While there has been over-reaction to GMail, there are some real issues
here to be worried about.
   2. Other webmail providers are doing, or will be doing the same things,
meaning these issues apply to all of them, including MSN, Yahoo and others.
   3. One key risk is that because GMail gets your consent to be more than an
e-mail delivery service -- offering searching, storage and shopping -- your
mail there may not get the legal protection the ECPA gives you on E-mail.
   4. The storage of e-mail on 3rd party servers for more than 180 days
almost certainly causes the loss of those privileges.
   5. This in turn creates a danger that we may redefine whether e-mail has
the "reasonable expectation of privacy" needed for 4th amendment protection.
   6. Correlation of search and mail has real risks.
   7. Google and others should architect to encrypt your mail.
   8. Even the irrational fears over the spooky aspect of advertising being
associated with e-mail creates problems that must be addressed.

Page and the others in Google were taken aback by the negative reaction to
their pre-launch. GMail is a very nice product, with great promise. They've
done a lot of work on building a superior web e-mail interface. Their
surprise at the tempest is not inappropriate. Many of the issues raised here
are subtle and known only to full-time privacy lawyers. In addition, many of
the debated features in GMail are already present to some degree in competing
products like MSN Hotmail, Yahoo Mail and AOL. And indeed, some of the
reaction has been silly and a bit paranoid, and some of that has gotten a lot
of the attention.

But there are also some deep issues here, worth discussing with not just
Google but all the other webmail providers.

The thing many people reacted to the most was the idea of displaying ads
keyed to words in your mail. We think of our mail as fairly private, and we
do a lot of private stuff in e-mail. Google's ad-linking is all done by
computer; they promise not to have human beings look at the mail (almost all
the time). If only a computer "knows" your deepest secrets, is there a
concern? I certainly let my own personal computer contain all sorts of
private information, ranging from my finances to private e-mails with my
family.

Even so, people have a reaction to a 3rd party computer doing scans like
this. If you were offered a service that saved you money by having your paper
mail opened by robots for scanning, which then inserted new junk mail in your
box based on what it found, you might get a bit creeped out. Go further and
consider a service that gave you free phone calls if it could have
speech-recognizing computers listen in and barge in with product offers
related to your conversation? It's easy to imagine an unpleasant situation
where you get invited to a gay wedding in Vancouver, and find with it in your
mailbox brochures for gifts, Vancouver hotels and a free copy of Out
magazine. People have extended that fear into the e-mail realm.

Of course, webmail is an optional service. Those afraid of these scenarios can
simply not use it, and that's an entirely reasonable answer. But it ignores
an aspect of the privacy fears we have. In the modern era where computers
threaten privacy, we are as afraid of outside computers knowing all about our
lives as we are of outside people. Even though we might trust the people
running the computers today, or the human health insurance clerk who learns
we have cancer, we are uncomfortable with both of those things. We are not
quite as uncomfortable about the computer knowing, but it's hard to ignore
the fear that, in spite of the best of intentions, information on external
computers (and even our own computers) sometimes makes it out into the world.

People's fears here are sometimes irrational, but still real. And irrational
fears often affect privacy and freedom, even when they are irrational.
Consider the amusing Wall Street Journal story some time ago headlined,
"Help, my Tivo thinks I'm gay!" (The Tivo suggests shows to you based on what
you have watched before. Tivo says that this does not leave the box
associated with your identity, though they do collect data on all Tivo use,
stripping off the identity along the way.) People became annoyed at or scared
by their video recorders.

Google is not ignorant of this issue, of course. They plan to work hard to
not do something so bold as sexual ads on your salacious e-mails with your
husband. They know that even irrational freaking-out generates a real
consumer issue.

And it's not to be ignored that well-targeted advertising is itself a useful
thing. If you take as a given that you're going to accept ads to subsidize an
activity, few wish to have their time wasted by ads that are irrelevant. In
addition, well-aimed advertising costs advertisers more because it is more
effective -- which means you can get the ad subsidy you are seeking with
fewer ads. Clearly many users find themselves buying products from these ads,
products they might not have known about without them, so good ads can be a
service unto themselves.

I'll talk about the result of some of the irrational fears in a moment, but
first let's look at some of the real fears.

Our lives move online and outside

GMail is a big step in a concerning direction. If it's a success, millions of
people will move a great deal of the record of their lives not just online,
but online and stored with a 3rd party. This isn't the first time this will
happen, but for millions it will be the biggest such jump. Particularly if
one's searches and social network can be correlated with this information. My
e-mail contains the story of my life, and what's not in there is often
recorded in my searches.

As we move these things online and outside, we build some of the apparatus
for a surveillance society. As usual, we don't plan to do so, and the people
building it would oppose it being used that way. But they build it
nonetheless. We make it so that having that surveillance becomes a "logical"
step -- changing a law or a policy, or in some cases just pushing a button,
rather than a physical step -- going into somebody's house to grab their
papers.

When our papers are at home, mass surveillance of them simply doesn't scale.
It's too expensive. Online, it scales well. Our networked computers at home
are not very secure, of course -- probably less secure than Google's against
unauthorized intruders. But they must be broken into a million times --
though possibly by one giant virus -- and those outside machines must be
compromised (by the law or by system crackers) only once.

Webmail is, as noted, only a part of this trend. No surprise, since there are
many technical advantages to centralization, especially when it comes to
software maintenance. Installing, running and upgrading software on your home
computer is hard. A remote computer is professionally maintained, and almost
surely a lot faster too. It can be reached from anywhere. In many cases,
where you want to roam all around and access your data, or have servers
receive data for you while your home computer is off, it's the only way to do
things well. I won't pretend that such centralization is not quite useful, or
imagine it is something we can stop from happening.

Many have written, correctly, that we have already seen this trend in
Hotmail, AOL, Yahoo, Turbotax Online, Instant Messengers, Salesforce.com and
all other "Application Service Providers" (ASPs.) That the trend has already
started doesn't make it less disturbing.

GMail however, with the offer of a free gigabyte, makes a big jump in this
trend. Free Hotmail users tended to keep only a bit of their mail online,
deleting the rest. Hotmail forced them to. GMail (and soon other webmail
offerings) will openly encourage the opposite.

Google is a good company with honest people, headquartered in a fairly free
country with protection of rights that is among the best in the world. But
that's not always so. I wonder, for example, about what it means to the
people of China that we have built our instant messaging infrastructure and
some of our E-mail infrastructure entirely with centralized servers and no
encryption? Their government can and does routinely listen to internet
traffic. When we build our systems, how often do we think of what it will
mean when they become popular in China, or Saudi Arabia? Or what it means
when they are sold to other companies whose policies are not so benevolent?

There are ways to buck this trend. Below, I'll be recommending that Google
and its competitors work to encrypt your data when they store it for you.

ECPA

The Electronic Communications Privacy Act started out in the 80s as a
blessing. It declared that e-mail was a private means of communication, and
that we might hope for the same level of privacy in it as we have in phone
calls and letters. Among other things, it means that police need a wiretap
warrant to read your e-mails, and that your e-mail company's employees can't
disclose your e-mails to others.

But the world has changed and the ECPA has not changed with it. E-mail in
transit is protected, but those in law enforcement advocate that once mail is
processed and stored, it is no longer the same private letter, but simply a
database service.

GMail's big selling point is that they don't simply deliver your mail. They
store it for you, and they index it so you can search it. (Of particular
importance, you authorize them to scan your mail for these purposes, and that
authorization is the act that risks stripping you of your rights.) All the
other webmail companies who don't already do this will, I'm sure, quickly
have offerings to compete with GMail. As noted, not only do they search it,
but they scan on viewing to provide ads that match the content, making it not
just a database but a shopping service. Should you click on those ads, the
merchants will see your IP address, and know somebody from that address (or
with their cookie) was reading a page, search result or e-mail related to
their ads.

Unfortunately, a database and shopping service doesn't look as much like an
e-mail delivery service as it should according to the legal definitions in
the ECPA. Thus, while Google promises not to peek into your e-mails or hand
them to others, the danger is that this is now solely their contractual
promise. With an e-mail delivery service covered by the ECPA, the law, not
just Google's terms of service, governs when your mail can be handed over.

Google also has some work to do on their privacy policy, because it does
allow them to look at or release your e-mail in circumstances where they
would be forbidden to do so if it were ECPA protected, such as a law
enforcement "request." But even if it bound them as tightly as the law does,
it still contains the ubiquitous clause letting them change the policy at
will, without your consent -- so it's really not a strong restriction at all.
The policy should state at a minimum that changes to it which would reduce
the privacy of your mail can only take place with your explicit and informed
consent at the time of the change.

A lot of privacy policies should say that, of course. They don't, because
doing this in general is a logistic nightmare. However, with work, one can
fine-tune such policies to dictate what can be changed and what can't, to do
at least as well as what the law requires of e-mail delivery companies, if
not better. This is not easy, but it's worth doing.

When you use network services, you need a contract, and that contract will be
written primarily to protect the interests of the service provider. When you
run software on your PC, you may agree to terms when you install software,
but as far as your own data is concerned there is usually no question of
contract at all. The data is on your PC, and there don't have to be rules
governing what others can do with it.

Without the ECPA protection, your e-mail (now just a database) can be seized
against Google's will with an ordinary subpoena (vastly less involved than a
wiretap warrant) or in the discovery phase of a lawsuit. With warrants, and
in some rare cases even without them, your mail can be grabbed without you
being informed that this has been done. Worse, Google has retained the right
to hand it over in the case of a "request" from law enforcement, rather than
a court order.

Of course, these legal techniques can be applied to the data you store
yourself. However, short of a secret warrant to break into your house and
seize your computer, it can't be done without your knowledge and involvement.
You have the chance to fight any attempt to grab your mail. You have the
power to get a lawyer and appeal any order in front of a judge. You give up
some of that power when you put your e-mail database in somebody else's
hands. Google is a good company that wants to please its users; they don't
want to be subject to all these subpoenas. But they won't fight as hard as
you would.

It's also important to note that most of these webmail providers are global
companies, with servers around the world. The ECPA is a U.S. law, only
protecting mail in the USA. Some other nations have protections, but certainly
not all do. Unless care is taken, your mail could end up stored in another
country without the legal protections you were hoping for.

Now, after scaring you like this, let me add that this is an entirely new
situation, and the courts have not yet ruled whether the ECPA covers your
mail in a searching/advertising database service. It's possible they might,
but from the text of the law, quite possible they would not. Past history
suggests it is highly likely the Department of Justice will take the position
that the protection is lost.

The DoJ in fact believes that the moment after you open your mail it is just
an archive and loses some protections, though at least the 9th circuit court
has disagreed with that -- up to the 180 day rule at least.

To learn more about the ECPA and e-mail you will find a good analysis in this
paper. It's way more complex than I describe here.

The 180 day rule

In the hoped-for event that your webmail archives are protected by the ECPA
as what it calls an ECS, they lose some of that protection after 180 days.
This is not news, but a product like GMail, which encourages long-term
archiving of e-mail with the web mail provider, brings the question to the
forefront. After 180 days your e-mail archives can now be fetched without a
warrant, through a special ECPA court order or a subpoena. (In most cases,
but not all, you will get notice of such seizures.)

Expectation of Privacy

In the USA, privacy from government intrusion is defended by the Fourth
amendment, which requires a warrant be issued by a judge for many kinds of
searches and seizures. Over the years, court precedents have required that
warrant when you have a "reasonable expectation of privacy" over what's being
searched. The definition of when and where you have this expectation of
privacy has been in flux. Court decisions have sometimes sustained the
expectation of privacy, but often eroded it.

The highest standard of privacy has been "the privacy of your own home." For
example, the courts have ruled that you do not have as much expectation of privacy
in a car as you do in a home. Not as much on your open land as inside your
house. Not as much in an RV as in a house. Your day-guests don't have the
same privacy in your house as you do.

When technology changes privacy, there is always a battle over this
expectation. The police want to reduce it so they can do more without
warrants. The public wants to keep their privacy. Police made a number of
arrests over the years by using infrared scanners to look for
signals coming from inside a home. The court eventually told them no: That
even though the IR was emitted from the house for view by "anybody" with the
right equipment, what went on inside was still private. Often we lose
privacy, as a man who lived in his motorhome discovered when the police were
able to search it like a car because he was parked in a parking lot rather
than an RV park.

Others have learned that what they say in front of baby monitors is not
private, even though it is said inside their homes, because it is transmitted
"in the clear" outside their homes. (Cell phone calls, also often sent that
way, have statutory protection of your privacy.)

E-mail privacy is in crisis because most e-mail is sent without encryption.
That means that almost all e-mails are like postcards. Anybody with access to
the wires (or airwaves) they travel over can read them. Because of this,
arguments are being made that you should have no more expectation of privacy
in an e-mail than you have in a postcard, or worse, a postcard you hand to a
3rd party to carry. This question will remain in the balance for some time.

GMail and its competitors may tip this balance, particularly if the e-mail
managed by the webmail companies loses its ECPA protection status, the risk
of which is described above. I send my e-mails from my own private machine,
and that part is private. But I send them anywhere, including to webmail
users where they are stored in a 3rd party database for searching. Of course,
that database has password-controlled access and I would like to hope it's
private, but there are those who will argue it's a step below the privacy
expectation one has for one's own home computer. With a home computer,
somebody has to break into your home or the computer to read the mail. On a
3rd party server, they can do that, but they might also see it due to a
change of policy.

Sadly, we can also expect arguments that e-mails read from a database over an
open wireless network might be akin to the cordless phone or baby monitor
situation.

One can hope such arguments will not win. But what is true is that the more
e-mail that is sent unencrypted, the more it is stored in private but
external databases, the greater the arguments will be that you no longer
really can expect that your mail is secure and private from being seen by
parties other than yourself and the recipients. If the courts ever become
convinced this is true, e-mail could lose the protections it currently has,
designed to parallel the protections of paper mail and phone calls.

My hope would be that Google's design should not reduce that expectation of
privacy. But all e-mail vendors should work hard to ensure they don't even
raise the risk, let alone pitch us over the edge. What a terrible thing if we
were to lose the cherished expectation of privacy we want in our letters in
the name of convenience. The ECPA is just an ordinary law protecting e-mail
privacy, not as strong as a constitutional protection.

The expectation of privacy issue is important because it has a bearing not
just on users of GMail but on those who send mail to a GMail user. Since many
people will alias other addresses into GMail, you may not be aware you are
mailing GMail.

Search correlation

You've probably noticed that a lot of your web searches contain private
information. I often type into Google things like the names of prescription
drugs I have been given, to find out more about them. I use search engines to
research the stocks I buy and look up my friends, or see what people are
saying about my family. Very private stuff.

Largely we treat our searching as anonymous, though in fact it's not nearly
so anonymous as we might like. All the major search engines use a cookie that
allows them to correlate together all the searches we do. If like me, you have
broadband, your computer's numeric internet address is also fixed -- either
permanently like mine, or effectively permanently as it is for all those who
don't turn their computer off or who have a network gateway box.
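
To make the cookie point concrete, here is a minimal sketch of how a log keyed
by a cookie value turns separate searches into one profile. The log records,
cookie values and the function name group_by_cookie are invented for
illustration; nothing here describes how any particular search engine actually
stores its logs.

```python
from collections import defaultdict

# Hypothetical search-log records: (cookie_id, query). All values are invented.
SEARCH_LOG = [
    ("cookie-a1", "side effects of a prescription drug"),
    ("cookie-b2", "vancouver hotels"),
    ("cookie-a1", "stock quote for some company"),
    ("cookie-a1", "name of a friend"),
]

def group_by_cookie(log):
    """Collect every query seen under the same cookie into one list."""
    profiles = defaultdict(list)
    for cookie_id, query in log:
        profiles[cookie_id].append(query)
    return dict(profiles)

profiles = group_by_cookie(SEARCH_LOG)
```

A fixed IP address could be used as the grouping key in exactly the same way,
which is why deleting cookies alone does not break the link.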

All of the companies providing e-mail and search together create a troubling
risk that the private matters in our e-mail can be combined with the things
we search for. It's no surprise that this potential is there. Search
companies are all eager to find ways to improve the relevancy of their search
results in order to please their users. It's what we want them to do.
Learning things about you is one obvious technique to do that.

Again this is part of a trend toward creating a giant dossier of all your
private information in a central place. Because users will demand more
accurate search, this trend won't be stopped. However, companies can be
encouraged to anonymize data they collect, making it hard or impossible to
link back to the real person.
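
One possible way to make logged data harder to link back to a person is to
replace the cookie or IP address with a keyed hash, and to throw the key away
at the end of each logging period. The sketch below assumes that approach; the
function names and the idea of a per-day key are my own illustration, not
anything a provider has announced.

```python
import hashlib
import hmac
import secrets

def new_period_key():
    """Random key for one logging period; discarding it later breaks the link."""
    return secrets.token_bytes(32)

def pseudonym(period_key, identifier):
    """Replace a cookie ID or IP address with a keyed hash (HMAC-SHA256)."""
    return hmac.new(period_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

# Two hypothetical logging periods, each with its own throwaway key.
key_monday = new_period_key()
key_tuesday = new_period_key()

same_day_a = pseudonym(key_monday, "cookie-a1")
same_day_b = pseudonym(key_monday, "cookie-a1")
next_day = pseudonym(key_tuesday, "cookie-a1")
```

Within one period the same user still maps to the same pseudonym, so relevance
can be tuned, but once the key is gone the pseudonyms from different periods
cannot be joined. This does not by itself make the data anonymous: the queries
themselves may still identify the person who typed them.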

Google has said they are not correlating search and e-mail (or their social
network prototype called Orkut.) That's good, but for business and user
interface reasons they are not likely to say this will always be true. The
browser cookie system allows the e-mail and search systems to share
information about your identity unless you go to extreme lengths to delete
your cookies or use cookie-washing software. Even then, unless you are a
dial-up user, your IP address is almost as good as the cookie, perhaps
better. Of course, once you have a common login for e-mail and enhanced
search services your activities will be fully linked, as they can be on Yahoo
today. You can only wash this through the extreme step of using a web
anonymizer which bounces your requests among cooperating servers that hide
your address.

Global Search

The war on terrorism has rewritten the rules on what's possible in civil
rights. Consider the idea that the government might come with a warrant or
new law to an e-mail provider and say, "Search all your customers' e-mail to
see if anybody was talking about planes and the World Trade Center before 8am
on Sept 11."

Under traditional 4th amendment rules, such a broad fishing expedition would
be unthinkable. But some people are finding it more thinkable. Some might
even think it's a good idea, and proclaim they don't mind if this were done
to their own mail.

Without large webmail archives, the idea wasn't even possible. Now it could
be.

What can Google Do

Google is proud of its reputation as a good corporate citizen that tries to
keep its users' interests at heart. It has done this even against the wishes
of its customers (the advertisers) by putting restrictions on ads that other
sites wouldn't place. This has actually been a good business decision, but
Google espouses a philosophy of not doing any "evil" while building their
business. It's a good philosophy.

Encryption

The most obvious step Google could take would be to encrypt a user's e-mail,
searching index and other associated data, so it can only be accessed using
the user's password, and of course that password should not be stored when an
e-mail session is over.

If need be, the mail can be held temporarily unencrypted before delivery to
the user (because then it has ECPA protection) and thus indexed and tied to
ads. Then it can be encrypted. Both the index and the mail contents must be
encrypted so that they can't be read without your password. Police could get
a wiretap warrant to watch you type in your password, so this is not as
private as doing fully encrypted mail to your home PC, but as noted the
warrant process defends your rights much more than the subpoena or
contractual system does.
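
As a rough sketch of what a password-protected search index could look like,
the example below derives a key from the user's password and stores only keyed
hashes of the words in each message, so the index cannot be searched without
the password. This is one known technique (sometimes called a blind index) and
is my assumption, not a description of anything Google has built; the function
names, the sample password and the iteration count are all illustrative.

```python
import hashlib
import hmac
import secrets

def derive_key(password, salt, iterations=200_000):
    """Derive a 32-byte key from the user's password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def index_token(key, word):
    """Opaque token stored in the index in place of the plaintext word."""
    return hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256).hexdigest()

def build_index(key, messages):
    """Map token -> set of message ids; no plaintext words are kept."""
    index = {}
    for msg_id, text in messages.items():
        for word in text.split():
            index.setdefault(index_token(key, word), set()).add(msg_id)
    return index

def search(index, key, word):
    """Look up a word; only a holder of the right key produces matching tokens."""
    return index.get(index_token(key, word), set())

salt = secrets.token_bytes(16)            # stored next to the index; not secret
key = derive_key("correct horse", salt)   # kept in memory only during a session
index = build_index(key, {1: "wedding in Vancouver", 2: "stock tips"})

hits = search(index, key, "vancouver")
wrong_key = derive_key("bad guess", salt)
misses = search(index, wrong_key, "vancouver")
```

The message bodies themselves would need a real authenticated cipher from a
vetted cryptographic library, which the Python standard library does not
provide, so that part is left out. A scheme like this also still reveals how
often each token occurs, so it narrows the exposure rather than removing it.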

An overseas escrow system would allow recovery of your password if you lose
it, and in the inevitable event of your death.

GMail could also encourage the use of encryption in sending mail, both by
doing SMTP-over-TLS (a standard technique for encrypting mail as it moves
from server to server) wherever possible for your incoming and outgoing mail,
and also supporting standard e-mail encryption formats like S/MIME and PGP,
as well as new opportunistic encryption systems. This would be a giant step
forward.
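
For readers who have not seen SMTP-over-TLS in practice, here is a minimal
sketch using Python's standard smtplib. It upgrades the connection with
STARTTLS when the server advertises it and otherwise falls back to plaintext,
which is the "wherever possible" behaviour described above. The helper names
and the host passed to deliver are placeholders of mine; a real mail system
would have its own policy for what to do when encryption is unavailable.

```python
import smtplib
import ssl
from email.message import EmailMessage

def send_with_starttls(smtp, message):
    """Send `message` over `smtp`, upgrading to TLS first when the server offers it.

    Returns True if the session was encrypted, False if it fell back to plaintext.
    """
    smtp.ehlo()
    encrypted = False
    if smtp.has_extn("starttls"):
        smtp.starttls(context=ssl.create_default_context())
        smtp.ehlo()  # re-identify over the encrypted channel
        encrypted = True
    smtp.send_message(message)
    return encrypted

def build_message(sender, recipient, subject, body):
    """Assemble a simple text message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

def deliver(host, message, port=25):
    """Connect to `host` (a placeholder you supply) and send the message."""
    with smtplib.SMTP(host, port) as smtp:
        return send_with_starttls(smtp, message)
```

Note that this only protects the mail while it travels between two servers.
Each server still sees the plaintext, which is why end-to-end formats such as
S/MIME and PGP are listed separately.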

Resist correlation with search

Correlation of mail with search should be resisted as long as possible. If
business needs -- which means pleasing the customer -- demand it, ways should
be developed so that it's
hard to tie the correlated data back to a particular user. If need be, it
might even be better to have an explicit login (with password) so that the
correlation data is also encrypted and only available when the user is logged
on. Though frankly, I use Google search hundreds of times a day -- I'm always
logged on.

Think about whether you can make such correlation opt-in, requiring the
user's agreement. You might be able to create more accurate search doing
this, but if customers want that, is it too hard to add the step of getting
them to knowingly agree to it?

Lobby to strengthen the ECPA

Google, Microsoft, AOL, Yahoo and others should join with us in pointing out
that the ECPA is now outdated, and not providing enough legal protection for
all the new technological innovations in e-mail delivery. Fight in court
cases where the DoJ and others try to expand their powers to grab user e-mail.

All these companies could also push for similar and stronger protections in
the other nations where they do business.

Continue to refine policies and educate the public

Much of what Google is doing with GMail is innovative and worthwhile. It
would be ridiculous to see it banned, as Senator Figueroa would suggest. It's
not a lot different in kind from all the other well established webmail
services -- but this doesn't mean those services didn't also have issues.
There is not at this time, for example, a reason to be afraid of GMail and
not afraid of premium Yahoo mail. Within months, the competitors will also
release products with large archives and other features to keep up with
Google.

So Google needs to refine their policies and contracts. Any policy that can
be changed to protect user privacy that does no business harm to the
company's plans should be changed. Even ones which do impede possible
business plans should be evaluated and some of them done.

What users can do

The hard truth is if you are concerned about your privacy, 3rd-party hosted
web-based e-mail may not be right for you. There are tools that will let you
search the e-mail on your PC.

You can also be more careful about how cookies are used to track you on the
web. Most browsers can help you control this. I recommend the absolutely free
Mozilla browser which lets you easily turn on and off what cookies go to what
sites. Google is a good site that lets you use it without cookies; not all
sites are so kind.

The problem with privacy is that nobody cares about their privacy until after
it's been violated. Only when Bill Gates discovered that his e-mail could be
searched for a message about cutting off "Netscape's Air Supply" did he
realize the danger in logging it. Many people only realize the danger in
their records when they enter a lawsuit, or a custody battle. They all
thought it would never happen to them.

If you are concerned there are a variety of resources, including tools for
anonymous web surfing, fancy cookie-management and more.

Most importantly, tell companies that you care about your privacy. Because
others, not yet finding it violated, are not giving this message even though
they do care if they think about it.

On the appearance of surveillance

To close, let me add one other rule about privacy. It is not only important
to have your privacy. It is important that you believe you have your privacy.
If you even suspect that you are being watched, it changes your behaviour and
you become less free as an individual.

Are you the same person at your mother's house for holiday dinner as you were
your first year on your own at university? How much you think you are being
watched affects your freedom. How much a society thinks it is being watched
affects the freedom of the society.

The fear that computerized scanning of our e-mails (to display ads or filter
out spam) will result in actual harm is largely baseless. But even irrational
fears affect our freedom, and this should be considered in software design.
