Anti-terrorism algorithms
Does the stuff you post on the internet make you look like a terrorist?
Is the rhythm of your typing sending the wrong signals? The government
wants sites such as Google and Facebook to scan their users more
closely. But if everything we do online is monitored by machines, how
well does the system work?
Should our future robot overlords decide to write a history of how
they overcame their human masters, late 2014 will be a key date in the
timeline. Last week, an official report from the parliamentary
intelligence and security committee handed over responsibility
for the UK’s fight against terrorism, or at least part of it, to
Facebook’s algorithms – the automated scripts that (among other things)
look at your posts and your networks to suggest content you will like,
people you might know and things you might buy.
Assessing the intelligence failures that led to the murder of
Fusilier Lee Rigby at the hands of two fanatics, the committee absolved
MI5 of responsibility, in part because the agency was tracking more than
2,000 possible terrorists at the time – far more than mere humans could
be expected to follow. Instead, they placed a share of the blame on
Facebook – which busily tracks its one billion users on a regular basis –
for not passing on warnings picked up by algorithms the company uses to
remove obscene and extreme content from its site. David Cameron agreed,
and promised new laws, so it’s possible that soon Google, Facebook and
co won’t just be scanning your messages to sell you stuff – they will be
checking you are not plotting the downfall of western civilisation too.
Between the NSA’s automatic systems, social media tracking and more,
everything you do is being overseen by the machines – but what might
make you look suspect? Here are just a few examples.
Say the wrong thing
We already know that saying something stupid on social media can bring unwanted attention from the law. In 2010, a trainee accountant called Paul Chambers
tweeted: “Crap! Robin Hood airport is closed. You’ve got a week and a
bit to get your shit together otherwise I’m blowing the airport sky
high!!” Those 134 characters, seen by an airport worker, led to arrest
by anti-terror police, a conviction and three appeals, and cost Chambers
two jobs before a crowdfunded legal campaign got the conviction
quashed.
With the capability – and maybe soon the legal requirement – for
algorithms to scan every social media post for problematic phrases, the
potential for trouble increases exponentially. One way a machine might
assess your content is through lists of keywords: a message containing
one or two of these might not trigger an alert, but too many, too close
together, and you are in trouble. Take a message such as: “Hey man,
sorry to be a martyr, but can you get round to shipping me that
fertiliser? I really do need it urgently. Thanks, you’re the bomb! See
you Friday, Insha’Allah.”
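To make the "too many, too close together" idea concrete, here is a minimal sketch of a keyword-proximity check. Everything in it is an assumption for illustration: the watch list is just three words lifted from the message above, and the window and threshold values are arbitrary. No real system is claimed to work this way.

```python
import re

# Hypothetical watch list: three words taken from the example message.
WATCH_WORDS = {"martyr", "fertiliser", "bomb"}


def count_clustered_hits(message, window=30):
    """Return the largest number of watch-list words that fall within
    any stretch of `window` consecutive words."""
    words = re.findall(r"[a-z']+", message.lower())
    positions = [i for i, word in enumerate(words) if word in WATCH_WORDS]
    best = 0
    for idx, start in enumerate(positions):
        # Count hits that begin at `start` and lie inside the window.
        hits = sum(1 for p in positions[idx:] if p - start < window)
        best = max(best, hits)
    return best


def is_flagged(message, threshold=3, window=30):
    """Flag a message only when enough watch words cluster together."""
    return count_clustered_hits(message, window) >= threshold


message = (
    "Hey man, sorry to be a martyr, but can you get round to shipping me "
    "that fertiliser? I really do need it urgently. Thanks, you're the "
    "bomb! See you Friday, Insha'Allah."
)

if __name__ == "__main__":
    print(is_flagged(message))
    print(is_flagged("Thanks, you're the bomb!"))
```

With these made-up settings the fertiliser request trips the alert while a lone "you're the bomb" does not, which is the tuning problem in miniature: shrink the window or raise the threshold and the first message slips through too.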
An algorithm designed to flag content that might be inappropriate –
triggering perhaps automated deletion, or account suspension – would
have a much lower threshold than one sending a report to an intelligence
officer suggesting she spend the rest of her day (or week) tracking an
individual. How should the tool be tuned? Too tight and it will miss all
but the most obvious suspicious messages. Too lax and the human
operators will be drowning in cases.
In practice, algorithms designed to police content are set far more
loosely than those meant to catch terrorists. Keywords for intelligence
agencies are more likely to be focused: names of particular individuals,
or phrases picked up from other suspects.
Algorithms can get far cleverer than simply using keywords. One way
is to pick up subtle ways in which messages from known terror suspects
vary from the main population, and scan for those – or even to try to
identify people by the rhythm of their typing. Both are used to a degree
now, but will spread as they become better understood.
However sophisticated these systems are, they always produce false
positives, so if you are unlucky enough to type oddly, or to say the
wrong thing, you might end up in a dragnet.
Data strategist Duncan Ross set out what would happen if someone
could create an algorithm that correctly identified a terrorist from
their communications 99.9% of the time – far, far more accurate than any
real algorithm – with the assumption that there were 100 terrorists in
the UK.
The algorithm would correctly identify the 100 terrorists. But it
would also misidentify 0.1% of the UK’s non-terrorists as terrorists:
that’s a further 60,000 people, leaving the authorities with a
still-huge problem on their hands. Given that Facebook is not merely
dealing with the UK’s 60 million population, but rather a billion users
sending 1.4bn messages, that’s an Everest-sized haystack for security
services to trawl.
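The arithmetic behind Ross's point can be checked in a few lines. This is a rough sketch, reading "99.9% accurate" as meaning the algorithm is right 99.9% of the time for terrorists and non-terrorists alike (so a 0.1% false-positive rate); the function name and that reading are assumptions, while the population and terrorist counts are the article's.

```python
def screening_outcome(population, terrorists, accuracy):
    """Return (true positives, false positives, share of flagged people
    who really are terrorists) for a test that is right `accuracy` of
    the time on both groups."""
    innocents = population - terrorists
    true_positives = terrorists * accuracy
    false_positives = innocents * (1 - accuracy)
    share_real = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_real


# Figures from the article: 60 million people, 100 terrorists, 99.9% accuracy.
tp, fp, share_real = screening_outcome(60_000_000, 100, 0.999)

if __name__ == "__main__":
    print(f"correctly flagged: about {tp:.0f}")
    print(f"wrongly flagged:   about {fp:,.0f}")
    print(f"share of flags that are real: {share_real:.4%}")
```

On these numbers, roughly one flagged person in 600 would be a genuine terrorist, which is why even an implausibly good algorithm still leaves a mountain of cases for humans to sift.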
Share the wrong link
It’s pretty hard for machines right now to know exactly what we mean
when we talk, so it is much easier for them to look for some kind of
absolutely reliable flag that content is suspect. One easy solution is
to use databases of websites known to be connected to extremists, or
child abuse imagery, or similar. If you share such a link, then it is a
pretty reliable sign that something is awry. If you do it more than
once, even more likely that you are a terrorist. Or a sympathiser. Or a
researcher. Or a journalist. Or an employee of a security agency …
If the database is accurate, this system works (sort of). The
problems come if it is crowdsourced. Many major sites, such as
YouTube, work in part through user-led abuse systems: if a user flags
content as inappropriate, they are asked why. If enough people (or a few
super-users) flag content for the same reasons, it triggers either
suspension of the content (or user), or review by a human moderator.
What happens when the pranksters of 4chan
decide, en masse, to flag your favourite parenting website? Other
systems rely on databases supplied by NGOs or private companies, which
are generally good, but far from infallible.
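The user-led flagging described above ("enough people, or a few super-users") can be sketched as a weighted count per reason. This is a hypothetical illustration only: the threshold, the extra weight given to super-users and the function name are all invented, and no real site's rules are being described.

```python
from collections import Counter


def needs_review(flags, threshold=5.0, super_user_weight=3.0):
    """`flags` is a list of (reason, is_super_user) pairs for one item.
    Return the reason whose weighted flag count reaches the threshold,
    or None if no single reason gets there."""
    totals = Counter()
    for reason, is_super_user in flags:
        totals[reason] += super_user_weight if is_super_user else 1.0
    if not totals:
        return None
    reason, score = totals.most_common(1)[0]
    return reason if score >= threshold else None


if __name__ == "__main__":
    # Five ordinary users agreeing is enough to trigger a review...
    print(needs_review([("extremism", False)] * 5))
    # ...and so are two super-users on their own.
    print(needs_review([("extremism", True)] * 2))
```

Nothing in a scheme like this checks whether the flaggers are right, only whether they agree, which is exactly the opening a coordinated prank needs.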
Anyone who has got an “adult content warning” browsing the internet
on their mobile – where first world war memorials, drug advice sites,
and even Ada Lovelace Day have fallen foul of O2 filters, for example – might be a little alarmed.
Know the wrong people
Everyone knows that hanging out with the wrong crowd can get you in
trouble. Online, the crowd you hang out with can get pretty big – and
the intelligence agencies are willing to trawl quite a long way through
it.
We know, post-Snowden, that the NSA will check up to “three hops”
from a target of interest: one hop’s your friends, two hops is friends
of friends, and three hops drags in their friends too. Given that, thanks to Kevin Bacon,
we know six hops is enough to get to pretty much anyone on the planet,
three hops is quite a lot of people. If the NSA decided I was a target
of interest, for example, that could drag in 410 Facebook friends,
66,994 friends of friends, and 10.9 million of their pals. Sorry, guys.
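The "three hops" rule is, in effect, a breadth-first walk through a friendship graph. Below is a small sketch on a made-up seven-person network; the names and the graph are invented, and nothing here reflects how any agency actually stores or queries contact data.

```python
def contacts_by_hop(graph, target, max_hops=3):
    """Breadth-first search from `target`. Returns a dict mapping each
    hop number to the set of people first reached at that distance."""
    seen = {target}
    frontier = {target}
    hops = {}
    for hop in range(1, max_hops + 1):
        next_frontier = set()
        for person in frontier:
            for friend in graph.get(person, ()):
                if friend not in seen:
                    seen.add(friend)
                    next_frontier.add(friend)
        hops[hop] = next_frontier
        frontier = next_frontier
    return hops


# Toy network: each person maps to a list of their friends.
toy_graph = {
    "target": ["ann", "bob"],
    "ann": ["target", "cat"],
    "bob": ["target", "cat", "dan"],
    "cat": ["ann", "bob", "eve"],
    "dan": ["bob"],
    "eve": ["cat", "fay"],
    "fay": ["eve"],
}

if __name__ == "__main__":
    for hop, people in contacts_by_hop(toy_graph, "target").items():
        print(hop, sorted(people))
```

In this toy network fay sits four hops out and escapes the net. In a real social graph each hop multiplies the count rather than adding one or two names, which is how 410 friends become nearly 11 million people.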
Obviously no agency on the planet would manually review 66,994 of
anyone’s contacts (let alone nearly 11 million), but if a few of those
second- or third-degree contacts happened to also be in the networks of
other people of interest to the NSA, then their odds of being
scrutinised rocket.
The potential of these huge, spiderlike networks-of-networks is an
exciting one for the agencies. They don’t always live up to the hype,
though. According to Foreign Policy magazine,
General Keith Alexander, the former head of the NSA, was an
enthusiastic advocate for bulk surveillance programmes. In his bid to
convince colleagues of their worth, he could be seen giving briefings in
the Information Dominance Center,
pointing to complex diagrams showing who knew who – including some
places being called by dozens of people in the network. Maybe the data
had found the kingpin?
“Some of my colleagues and I were sceptical,” a former analyst told
the magazine. “Later, we had a chance to review the information. It
turns out that all [that] those guys were connected to were pizza
shops.”
Have the wrong name
With all the talk of “smart analytics” and “big data”, it is easy to
forget that a lot of automatic systems will unthinkingly dive on
anything that looks like a target. If you are unlucky enough to have the
same name as a major terror suspect, your emails, messages and more
will almost certainly have ended up in at least one intelligence agency
database.
Things get even worse with no-fly lists: because of clerical errors,
false flags on names or similar, for the first few years after 9/11,
some unfortunates were detained on dozens of occasions flying around the
US, and even imprisoned. These included Stanford academic (and US citizen) Rahinah Ibrahim,
who uses a wheelchair. She had been flagged when someone hit the wrong
checkbox on an online form, as she learned only years later through a
court challenge. Only after several court battles was the system tidied
up, and some people still need to fly with letters – to show to humans –
stating that they are absolutely, definitely, not a terrorist, no
matter what the computer says.
Act the wrong way
It is possible that, mindful of companies tracking you for ads,
governments tracking you to keep you safe, and schoolfriends tracking
you down to show baby pictures, you have decided to try to use the
internet a bit more privately.
One way might have been to install software such as Tor,
which, when used properly, anonymises your internet browsing. The US
navy helped develop the software, which receives public money to this
day for its role in protecting activists in dictatorships around the
world. At the same time, though, British and US spies decry the hiding
place it offers to terrorists, serious criminals and others. According
to the Snowden files, GCHQ and the NSA constantly attempted to break and
track the network, created special measures to save traffic of Tor
users, and even constructed some malware tools that would target any Tor
users who happened upon a site hosting the virus. The sophisticated
attack used problems in browser software to allow almost total access to
any compromised computer.
Do nothing at all
In the online era, there is every possibility that you could fall
into surveillance without ever posting, acting or associating
suspiciously. With so much traffic flowing across the internet, it is
sometimes easier for intelligence agencies to collect everything they
see rather than targeting particular people – so sometimes even merely
using the most innocuous or esoteric web services can get your pictures
into agency databases. It is unlikely to lead to your impending arrest,
and could well never be read by an actual human – but it would be there
all the same.
One example is a GCHQ system codenamed OPTIC NERVE
that was designed to capture images from every Yahoo webcam chat picked
up by GCHQ’s bulk-intercept system. The capability was created, Snowden
documents suggested, because some GCHQ targets used the webcam software
– and so the agency picked up everything it could. Our poor spies
quickly discovered that lots of people – up to 11% of users – rely on
such webcam services to exchange “adult” moments, and staff had to be
issued with advice on how to avoid seeing such smut. Such are the
hazards of snooping: you set out to find terrorists, and end up building
(probably) the world’s largest porn collection.
Another place the agencies saw some of their targets was in the world
of online gaming. Noticing suspects playing online role-playing games,
or messing with Angry Birds, the agencies moved to cover those areas
of the internet too. GCHQ documents show the agency analysed how to
read and collect information sent back and forth from that and other
online games, including how to extract and store text in bulk from some
game chatrooms. Other GCHQ analysts managed to wangle the geek’s dream
assignment of becoming human agents in online games, including Second Life and World of Warcraft.
One way to avoid such unwanted attention might be to stick with
console shoot-’em-ups: play this sort of game on Xbox Online, and you
are more likely to see a GCHQ hiring advert than fall foul of
surveillance. If you can’t beat ’em, why not join ’em?
Source:
The Guardian, UK