
Other applications prefer having everyone’s data simply because it makes them more
effective. Sure, Google could do well if it only had data on half of its users, or
saved only half of the search queries on all of its users, but it would be a less
profitable business. Still other applications actually need all of the data. If you’re
a cell phone company trying to deliver mobile phone calls, you need to know where
each user is located—otherwise the system won’t work.

There are also differences in how long a company needs to store data. Waze and your
cell phone company only need location data in real time. Advertisers need some historical
data, but newer data is more valuable. On the other hand, some data is invaluable
for research. Twitter, for example, is giving its data to the Library of Congress.

We need laws that force companies to collect the minimum data they need and keep it
for the minimum time they need it, and to store it more securely than they currently
do. As one might expect, the German language has a single word for this kind of practice: Datensparsamkeit.

GIVE PEOPLE RIGHTS TO THEIR DATA

The US is the only Western country without basic data protection laws. We do have
protections for certain types of information, but those protections cover only isolated areas. In general,
our rights to our data are spotty. Google “remembers” things about me that I have
long forgotten. That’s because Google has my lifelong search history, but I don’t
have access to it to refresh my memory. Medtronic maintains that data from its cardiac
defibrillators is proprietary to the company, and won’t let patients in whom they’re
implanted have access to it. In the EU, people have a right to know what data companies
have about them. This was why the Austrian Max Schrems was able to force Facebook
to give him all the personal information the company had about him. Those of us in the US don’t
enjoy that right.

Figuring out how these rights should work is not easy. For example, here is a list of different types of data you produce on a social networking site.

•  Service data: the data you give to a social networking site in order to use it.
Depending on the site, such data might include your legal name, your age, and your
credit card number.

•  Disclosed data: what you post on your own pages, such as blog entries, photographs,
messages, and comments.

•  Entrusted data: what you post on other people’s pages. It’s basically the same
stuff as disclosed data, but the difference is that you don’t have control over the
data once you post it—another user does.

•  Incidental data: what other people post about you. Maybe it’s a paragraph about
you that someone else writes, or a picture of you that someone else takes and posts.
Not only do you not have any control over it, you didn’t even create it.

•  Behavioral data: data the site collects about your habits by monitoring what you
do and whom you do it with.

•  Derived data: data about you that is inferred from all the other data.
For example, if 80% of your friends self-identify as gay, you’re probably gay, too.
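
To make the derived-data category concrete, here is a minimal sketch, in Python, of the kind of inference it describes. The function name, the data layout, and the 80% threshold are all illustrative assumptions of mine, not any real platform’s method.

    # Illustrative only: infer an attribute a user never disclosed from
    # attributes their friends did disclose. Names and threshold are invented.
    def infer_attribute(friends, attribute, threshold=0.8):
        """Guess that a user shares `attribute` when at least `threshold`
        of their friends have disclosed it themselves."""
        if not friends:
            return False
        matching = sum(1 for f in friends if attribute in f["disclosed"])
        return matching / len(friends) >= threshold

    friends = [{"disclosed": {"gay"}}] * 8 + [{"disclosed": set()}] * 2
    print(infer_attribute(friends, "gay"))  # True: 8 of 10 friends disclose it

The point of the sketch is that the derived datum never passes through the user’s hands at all; it exists only in the site’s analysis.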

What rights should you have regarding each of those types of data? Today, it’s all
over the map. Some types are always private, some can be made private, and some are
always public. Some can be edited or deleted—I know one site that allows entrusted
data to be edited or deleted within a 24-hour period—and some cannot. Some can be
viewed and some cannot. In the US there are no rules; those that hold the data get
to decide—and of course they have complete access.

Different platforms give you different abilities to restrict who may see your communications.
Until 2011, you could either make your Facebook posts readable by your friends only
or by everyone; at that point, Facebook allowed you to have custom friends groups,
and you could make posts readable by some of your friends but not by all of them.
Tweets are either direct messages or public to the world. Instagram posts can be either
public, restricted to specific followers, or secret. Pinterest pages have public or
secret options.
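
As a sketch of what these per-post audience settings amount to, here is a simplified access check of my own devising, not any platform’s actual code; the rule names mirror the options described above.

    def can_view(post, viewer, author_friends):
        # Hypothetical audience rules modeled on the options described above.
        audience = post["audience"]           # "public", "friends", or "custom"
        if audience == "public":
            return True
        if audience == "friends":
            return viewer in author_friends
        if audience == "custom":              # a hand-picked subset of friends
            return viewer in post["allowed"]
        return False                          # default deny ("secret")

    friends = {"bob", "carol"}
    post = {"audience": "custom", "allowed": {"carol"}}
    print(can_view(post, "bob", friends))     # False: the group excludes bob
    print(can_view(post, "carol", friends))   # True

Note that every rule here governs only other users; nothing in the model restricts what the platform itself sees, which is exactly the gap the rest of this chapter is about.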

Standardizing this is important. In 2012, the White House released a “Consumer Privacy
Bill of Rights.” In 2014, a presidential review group on big data and privacy recommended
that this bill of rights be the basis for legislation. I agree.

It’s easy to go too far with this concept. Computer scientist and technology critic
Jaron Lanier proposes a scheme by which anyone who uses our data, whether it be a
search engine using it to serve us ads or a mapping application using it to determine
real-time road congestion, automatically pays us a royalty. Of course, it would be
a micropayment, probably even a nanopayment, but over time it might add up to a few
dollars. Making this work would be extraordinarily complex, and in the end would require
constant surveillance even as it tried to turn that surveillance into a revenue stream
for everyone. The more fundamental problem is the conception of privacy as something
that should be subjected to commerce in this way. Privacy needs to be a fundamental
right, not a property right.
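
To see why Lanier’s royalties would be nanopayments that only “might add up to a few dollars,” consider some rough, purely illustrative arithmetic; both numbers below are assumptions of mine, not Lanier’s.

    # Purely illustrative: assume each use of your data pays a
    # hundred-millionth of a dollar, used a million times a day.
    royalty_per_use = 0.00000001   # $0.00000001 per use: a "nanopayment"
    uses_per_day = 1_000_000       # ad auctions, map queries, analytics, ...
    per_year = royalty_per_use * uses_per_day * 365
    print(f"${per_year:.2f} per year")  # $3.65: "a few dollars"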

We should have a right to delete. We should be able to tell any company we’re entrusting
our data to, “I’m leaving. Delete all the data you have on me.” We should be able
to go to any data broker and say, “I’m not your product. I never gave you permission
to gather information about me and sell it to others. I want my data out of your database.”
This is what the EU is currently grappling with: the right to be forgotten. In 2014,
the European Court of Justice ruled that in some cases search engines need to remove
information about individuals from their results. This caused a torrent of people
demanding that Google remove search results that reflected poorly on them: politicians,
doctors, pedophiles. We can argue about the particulars of the case, and whether the
court got the balance right,
but this is an important right for citizens to have with respect to their data that
corporations are profiting from.

US Consumer Privacy Bill of Rights (2012)

INDIVIDUAL CONTROL:
Consumers have a right to exercise control over what personal data companies collect
from them and how they use it.

TRANSPARENCY:
Consumers have a right to easily understandable and accessible information about
privacy and security practices.

RESPECT FOR CONTEXT:
Consumers have a right to expect that companies will collect, use, and disclose personal
data in ways that are consistent with the context in which consumers provide the data.

SECURITY:
Consumers have a right to secure and responsible handling of personal data.

ACCESS AND ACCURACY:
Consumers have a right to access and correct personal data in usable formats, in
a manner that is appropriate to the sensitivity of the data and the risk of adverse
consequences to consumers if the data is inaccurate.

FOCUSED COLLECTION:
Consumers have a right to reasonable limits on the personal data that companies collect
and retain.

ACCOUNTABILITY:
Consumers have a right to have personal data handled by companies with appropriate
measures in place to assure they adhere to the Consumer Privacy Bill of Rights.

MAKE DATA COLLECTION AND PRIVACY SALIENT

We reveal data about ourselves all the time, to family, friends, acquaintances, lovers,
even strangers. We share with our doctors, our investment counselors, our psychologists.
We share a lot of data. But we think of that sharing transactionally: I’m sharing data with you, because I need you to know things/trust you with my secrets/am reciprocating because you’ve just told me something personal.

As a species, we have evolved all sorts of psychological systems to navigate these
complex privacy decisions. And these systems are extraordinarily complex, highly attuned,
and delicately social. You can walk into a party and immediately know how to behave.
Whom you talk to, what you tell to whom, who’s around you, who’s listening: most of
us can navigate that beautifully. The problem is that technology inhibits that social
ability. Move that same party onto Facebook, and suddenly our intuition starts failing.
We forget who’s reading our posts. We accidentally send something private to a public
forum. We don’t understand how our data is monitored in the background. We don’t realize
what the technologies we’re using can and cannot do.

In large part that’s because the degree of privacy in online environments isn’t salient.
Intuition fails when thoughts of privacy fade into the background. Once we can’t directly
perceive people, we don’t do so well. We don’t think, “There’s a for-profit corporation
recording everything I say and trying to turn that into advertising.” We don’t think,
“The US and maybe other governments are recording everything I say and trying to find
terrorists, or criminals, or drug dealers, or whoever is the bad guy this month.”
That’s not what’s obvious. What’s obvious is, “I’m at this virtual party, with my
friends and acquaintances, and we’re talking about personal stuff.”

And so we can’t use people’s continual exposure of their private data on these sites
as evidence of their consent to be monitored. What they’re consenting to is the real-world
analogue they have in their heads, and they don’t fully understand the ramifications
of moving that system into cyberspace.

Companies like Facebook prefer it this way. They go out of their way to make sure
you’re not thinking about privacy when you’re on their site, and they use cognitive
tricks like showing you pictures of your friends to increase your trust. Governments
go even further, making much of their surveillance secret so people don’t even know
it’s happening. This explains the disconnect between people’s claims that privacy
is important and their actions demonstrating that it isn’t: the systems we use are
deliberately designed so that privacy issues don’t arise.

We need to give people the option of true online privacy, and the ability to understand
and choose that option. Companies will be less inclined to do creepy things with our
data if they have to justify themselves to their customers and users. And users will
be less likely to be seduced by “free” if they know the true costs. This is going
to require “truth in product” laws that will regulate corporations, and similar laws
to regulate government.

For starters, websites should be required to disclose what third parties are tracking
their visitors, and smartphone apps should disclose what information they are recording
about their users. There are too many places where surveillance is hidden; we need
to make it salient as well.
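
As a rough sketch of the kind of disclosure this would enable, a page’s embedded third parties can already be partially enumerated from its own source. The following is a simplified illustration using only Python’s standard library; it inspects script tags only, so it is a crude lower bound, not a complete tracker audit, and the URL is a placeholder.

    import re
    import urllib.request
    from urllib.parse import urlparse

    def third_party_script_hosts(url):
        """List hosts of <script src=...> tags that differ from the page's
        own host: a crude lower bound on who else learns about the visit."""
        html = urllib.request.urlopen(url).read().decode("utf-8",
                                                         errors="replace")
        first_party = urlparse(url).hostname
        hosts = set()
        for src in re.findall(r'<script[^>]*\ssrc=["\']([^"\']+)', html,
                              re.IGNORECASE):
            host = urlparse(src).hostname
            if host and host != first_party:
                hosts.add(host)
        return sorted(hosts)

    print(third_party_script_hosts("https://www.example.com"))

A mandated disclosure would, of course, have to come from the site itself and cover cookies, pixels, and server-side sharing that no page scan can see.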

Again, this is hard. Notice, choice, and consent are the proper way to manage this,
but we know that lengthy privacy policies written in legalese—those notice-and-consent
user agreements you click “I agree” to without ever reading—don’t work. They’re deliberately
long and detailed, and therefore boring and confusing; and they don’t result in any
meaningful consent on the part of the user. We can be pretty sure that a pop-up window
every time you post something to Facebook saying, “What you’ve written will be saved
by Facebook and used for marketing, and will be given to the government on demand,”
won’t work, either. We need some middle way. My guess is that it will involve standardized
policies and some sort of third-party certification.
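
One way to imagine such a standardized policy is a short machine-readable declaration that a browser or third-party certifier could render and compare the same way for every site. The sketch below is entirely hypothetical; every field name and value is my invention, not an existing standard.

    # A hypothetical standardized privacy declaration: a fixed set of fields
    # rendered identically everywhere, in place of bespoke legalese.
    policy = {
        "data_collected": ["posts", "location", "contacts"],
        "retention_days": 90,
        "sold_to_third_parties": False,
        "shared_with_government": "only with a warrant",
        "certified_by": "ExampleCert",  # hypothetical third-party certifier
    }

    def summarize(p):
        selling = ("sells it to third parties" if p["sold_to_third_parties"]
                   else "does not sell it")
        return (f"Collects {', '.join(p['data_collected'])}; "
                f"keeps it {p['retention_days']} days; {selling}.")

    print(summarize(policy))

Because the fields are fixed, two sites’ policies become directly comparable, which is what legalese documents are designed to prevent.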

ESTABLISH INFORMATION FIDUCIARIES

In several areas of our lives we routinely give professionals access to very personal
information about ourselves. To ensure that they only use
that information in our interests, we have established the notion of fiduciary responsibility.
Doctors, lawyers, and accountants are all bound by rules that require them to put
the interests of their clients above their own. These rules govern when and how they
can use the information and power we give them, and they are generally not allowed
to use it for unrelated purposes. The police have rules about when they can demand
personal information from fiduciaries. The fiduciary relationship creates a duty of
care that trumps other obligations.

We need information fiduciaries. The idea is that they would become a class of organization
that holds personal data, subject to special legal restrictions and protections. Companies
could decide whether or not to become part of this class. That is comparable
to investment advisors, who have fiduciary duties, and brokers, who do not. In order
to motivate companies to become fiduciaries, governments could offer certain tax breaks
or legal immunities for those willing to accept the added responsibility. Perhaps
some types of business would be automatically classified as fiduciaries simply because
of the large amount of personal information they naturally collect: ISPs, cell phone
companies, e-mail providers, search engines, social networking platforms.

Fiduciary regulation would give people confidence that their information wasn’t being
handed to the government, sold to third parties, or otherwise used against them. It
would provide special protections for information entrusted to fiduciaries. And it
would require certain duties of care on the part of providers: a particular level
of security, regular audits, and so on. It would enable trust.

Along similar lines, Internet security expert Dan Geer proposed that Internet service
providers choose whether they were content companies or communications companies.
As content companies, they could use and profit from the data but would also be liable
for the data. As communications companies, they would not be liable for the data but
could not look at it.

In the Middle Ages, the Catholic Church imposed a strict obligation of confidentiality
regarding all sins disclosed in confession, recognizing that no one would partake
of the sacrament if they feared that their trust might be betrayed by the priest.
Today we need a similar confidence online.
