Anti-terrorism algorithms. Photograph: David Gunn/Getty Images
Does the stuff you post on the internet make you look like a terrorist? 
Is the rhythm of your typing sending the wrong signals? The government 
wants sites such as Google and Facebook to scan their users more 
closely. But if everything we do online is monitored by machines, how 
well does the system work? 
Should our future robot overlords decide to write a history of how 
they overcame their human masters, late 2014 will be a key date in the 
timeline. Last week, an official report from the parliamentary 
intelligence and security committee handed over responsibility
 for the UK’s fight against terrorism, or at least part of it, to 
Facebook’s algorithms – the automated scripts that (among other things) 
look at your posts and your networks to suggest content you will like, 
people you might know and things you might buy.
Assessing the intelligence failures that led to the murder of 
Fusilier Lee Rigby at the hands of two fanatics, the committee absolved 
MI5 of responsibility, in part because the agency was tracking more than
 2,000 possible terrorists at the time – far more than mere humans could
 be expected to follow. Instead, they placed a share of the blame on 
Facebook – which busily tracks its one billion users on a regular basis –
 for not passing on warnings picked up by algorithms the company uses to
 remove obscene and extreme content from its site. David Cameron agreed,
 and promised new laws, so it’s possible that soon Google, Facebook and 
co won’t just be scanning your messages to sell you stuff – they will be
 checking you are not plotting the downfall of western civilisation too.
Between the NSA’s automatic systems, social media tracking and more, 
everything you do is being overseen by the machines – but what might 
make you look suspect? Here are just a few examples.
Say the wrong thing
We already know that saying something stupid on social media can bring unwanted attention from the law. In 2010, a trainee accountant called Paul Chambers
 tweeted: “Crap! Robin Hood airport is closed. You’ve got a week and a 
bit to get your shit together otherwise I’m blowing the airport sky 
high!!” Those 134 characters, seen by an airport worker, led to arrest 
by anti-terror police, a conviction and three appeals, and cost Chambers
 two jobs before a crowdfunded legal campaign got the conviction 
quashed.
With the capability – and maybe soon the legal requirement – for 
algorithms to scan every social media post for problematic phrases, the 
potential for trouble increases exponentially. One way a machine might 
assess your content is through lists of keywords: a message containing 
one or two of these might not trigger an alert, but too many, too close 
together, and you are in trouble. Take a message such as: “Hey man, 
sorry to be a martyr, but can you get round to shipping me that 
fertiliser? I really do need it urgently. Thanks, you’re the bomb! See 
you Friday, Insha’Allah.”
An algorithm designed to flag content that might be inappropriate – 
triggering perhaps automated deletion, or account suspension – would 
have a much lower threshold than one sending a report to an intelligence
 officer suggesting she spend the rest of her day (or week) tracking an 
individual. How should the tool be tuned? Too tight and it will miss all
 but the most obvious suspicious messages. Too lax and the human 
operators will be drowning in cases.
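As a very rough sketch of how such keyword scoring might be tuned (the keyword list and both thresholds below are invented for illustration, not drawn from any real system):

```python
# A minimal sketch of keyword-based message flagging. The keywords and
# thresholds are invented; real systems are far more elaborate.
import re

KEYWORDS = {"bomb", "fertiliser", "martyr", "urgently"}

# Two thresholds: a low one for automated moderation (deletion or
# suspension) and a much higher one before a human analyst is alerted.
MODERATION_THRESHOLD = 2
INTELLIGENCE_THRESHOLD = 4

def keyword_score(message: str) -> int:
    """Count how many distinct keywords appear in the message."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    return len(words & KEYWORDS)

def classify(message: str) -> str:
    score = keyword_score(message)
    if score >= INTELLIGENCE_THRESHOLD:
        return "report to analyst"
    if score >= MODERATION_THRESHOLD:
        return "flag for moderation"
    return "no action"

msg = ("Hey man, sorry to be a martyr, but can you get round to shipping "
       "me that fertiliser? I really do need it urgently. Thanks, you're "
       "the bomb! See you Friday, Insha'Allah.")
print(classify(msg))  # -> "report to analyst": all four keywords, one message
```

Moving those two thresholds is exactly the tuning trade-off described above: raise them and messages slip through; lower them and the human operators drown.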
In practice, algorithms designed to police content are set far more loosely than those used to catch terrorists: keyword lists for intelligence agencies tend to be more focused, built around the names of particular individuals or phrases picked up from other suspects.
Algorithms can get far cleverer than simply using keywords. One way 
is to pick up subtle ways in which messages from known terror suspects 
vary from the main population, and scan for those – or even to try to 
identify people by the rhythm of their typing. Both are used to a degree
 now, but will spread as they become better understood.
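To picture the typing-rhythm idea, here is a minimal sketch of keystroke-dynamics matching. The feature chosen (the average gap between pairs of keys) and the distance threshold are assumptions made for illustration, not a description of any deployed system.

```python
# A minimal sketch of keystroke-dynamics matching: compare the rhythm of a
# typing sample against a stored profile of a known individual.
from statistics import mean

def digraph_latencies(keys: list[tuple[str, float]]) -> dict[str, list[float]]:
    """Map each consecutive character pair to the time gaps observed for it."""
    gaps: dict[str, list[float]] = {}
    for (a, t1), (b, t2) in zip(keys, keys[1:]):
        gaps.setdefault(a + b, []).append(t2 - t1)
    return gaps

def profile_distance(sample, profile) -> float:
    """Average absolute difference in mean latency over shared digraphs."""
    shared = sample.keys() & profile.keys()
    if not shared:
        return float("inf")
    return mean(abs(mean(sample[d]) - mean(profile[d])) for d in shared)

# Stored profile for a known suspect (seconds between key presses).
suspect = {"th": [0.11, 0.12], "he": [0.09], "er": [0.20, 0.21]}
sample = digraph_latencies([("t", 0.00), ("h", 0.12), ("e", 0.21), ("r", 0.40)])
print(profile_distance(sample, suspect) < 0.05)  # crude "same typist?" test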
However sophisticated these systems are, they always produce false 
positives, so if you are unlucky enough to type oddly, or to say the 
wrong thing, you might end up in a dragnet.
Data strategist Duncan Ross set out what would happen if someone 
could create an algorithm that correctly identified a terrorist from 
their communications 99.9% of the time – far, far more accurate than any
 real algorithm – with the assumption that there were 100 terrorists in 
the UK.
The algorithm would correctly identify the 100 terrorists. But it 
would also misidentify 0.1% of the UK’s non-terrorists as terrorists: 
that’s a further 60,000 people, leaving the authorities with a 
still-huge problem on their hands. Given that Facebook is not merely 
dealing with the UK’s 60 million population, but rather a billion users 
sending 1.4bn messages, that’s an Everest-sized haystack for security 
services to trawl.
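Ross’s arithmetic is easy to reproduce with the figures given above:

```python
# Reproducing the base-rate arithmetic from Duncan Ross's example.
population = 60_000_000   # UK population (approx.)
terrorists = 100          # assumed number of actual terrorists
accuracy = 0.999          # the hypothetical classifier is right 99.9% of the time

true_positives = terrorists * accuracy                        # ~100 caught
false_positives = (population - terrorists) * (1 - accuracy)  # ~60,000 innocents

precision = true_positives / (true_positives + false_positives)
print(f"flagged innocents: {false_positives:,.0f}")
print(f"chance a flagged person is actually a terrorist: {precision:.2%}")
# -> roughly 60,000 false positives; well under 1% of flagged people are terrorists
```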
 
Share the wrong link
It’s pretty hard for machines right now to know exactly what we mean 
when we talk, so it is much easier for them to look for some kind of 
absolutely reliable flag that content is suspect. One easy solution is 
to use databases of websites known to be connected to extremists, or 
child abuse imagery, or similar. If you share such a link, then it is a 
pretty reliable sign that something is awry. If you do it more than 
once, even more likely that you are a terrorist. Or a sympathiser. Or a 
researcher. Or a journalist. Or an employee of a security agency …
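A minimal sketch of how such link-matching might work, assuming a blocklist of known-bad domains (the entries and the repeat-share rule here are invented):

```python
# A minimal sketch of blocklist matching on shared links.
from urllib.parse import urlparse

BLOCKLIST = {"extremist-example.org"}  # hypothetical database entry

def shared_domains(message: str) -> set[str]:
    """Extract the domain of every http(s) link in a message."""
    return {urlparse(token).hostname
            for token in message.split()
            if token.startswith(("http://", "https://"))}

def suspicion(messages: list[str]) -> int:
    """Count how many messages shared a blocklisted link."""
    return sum(bool(shared_domains(m) & BLOCKLIST) for m in messages)

history = ["check this out https://extremist-example.org/video",
           "lovely cat pictures https://example.com/cats",
           "and again https://extremist-example.org/page"]
hits = suspicion(history)
# One hit might be a researcher or a journalist; repeats look worse.
print("flag account" if hits > 1 else "ignore")  # -> flag account
```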
If the database is accurate, this system works (sort of). The problems come when the databases are crowdsourced. Many major sites, such as 
YouTube, work in part through user-led abuse systems: if a user flags 
content as inappropriate, they are asked why. If enough people (or a few
 super-users) flag content for the same reasons, it triggers either 
suspension of the content (or user), or review by a human moderator. 
What happens when the pranksters of 4chan
 decide, en masse, to flag your favourite parenting website? Other 
systems rely on databases supplied by NGOs or private companies, which 
are generally good, but far from infallible.
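A toy version of such a flagging system makes the weakness obvious; the weights and thresholds below are invented for illustration:

```python
# A minimal sketch of a crowdsourced flagging system of the kind described
# above: ordinary flags count once, trusted "super-user" flags count more,
# and enough accumulated weight triggers review or suspension.
REVIEW_THRESHOLD = 5      # send to a human moderator
SUSPEND_THRESHOLD = 20    # pull the content automatically

def flag_weight(user: dict) -> int:
    return 5 if user.get("super_user") else 1

def outcome(flags: list[dict]) -> str:
    total = sum(flag_weight(u) for u in flags)
    if total >= SUSPEND_THRESHOLD:
        return "suspend content"
    if total >= REVIEW_THRESHOLD:
        return "queue for moderator"
    return "no action"

# A modest coordinated raid is enough to bury innocent content.
prank_raid = [{"super_user": False} for _ in range(25)]
print(outcome(prank_raid))  # -> "suspend content"
```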
Anyone who has got an “adult content warning” browsing the internet 
on their mobile – where first world war memorials, drug advice sites, 
and even Ada Lovelace Day have fallen foul of O2 filters, for example – might be a little alarmed.
Know the wrong people
Everyone knows that hanging out with the wrong crowd can get you in 
trouble. Online, the crowd you hang out with can get pretty big – and 
the intelligence agencies are willing to trawl quite a long way through 
it.
We know, post-Snowden, that the NSA will check up to “three hops” 
from a target of interest: one hop’s your friends, two hops is friends 
of friends, and three hops drags in their friends too. Given that, thanks to Kevin Bacon,
 we know six hops is enough to get to pretty much anyone on the planet, 
three hops is quite a lot of people. If the NSA decided I was a target 
of interest, for example, that could drag in 410 Facebook friends, 
66,994 friends of friends, and 10.9 million of their pals. Sorry, guys.
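The hop expansion itself is just a breadth-first search over a contact graph; a minimal sketch on a toy network:

```python
# A minimal sketch of "three hop" contact chaining: everyone within three
# hops of the target is swept into the net.
from collections import deque

def within_hops(graph: dict[str, set[str]], target: str, max_hops: int = 3) -> set[str]:
    """Breadth-first search out to max_hops from the target."""
    seen, queue = {target}, deque([(target, 0)])
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue
        for friend in graph.get(person, set()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return seen - {target}

graph = {"target": {"alice"}, "alice": {"bob"}, "bob": {"carol"}, "carol": {"dave"}}
print(within_hops(graph, "target"))  # -> {'alice', 'bob', 'carol'}: dave is 4 hops out
```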
Obviously no agency on the planet would manually review 66,994 of 
anyone’s contacts (let alone nearly 11 million), but if a few of those 
second- or third-degree contacts happened to also be in the networks of 
other people of interest to the NSA, then their odds of being scrutinised rocket.
The potential of these huge, spiderlike networks-of-networks is an 
exciting one for the agencies. They don’t always live up to the hype, 
though. According to Foreign Policy magazine,
 General Keith Alexander, the former head of the NSA, was an 
enthusiastic advocate for bulk surveillance programmes. In his bid to 
convince colleagues of their worth, he could be seen giving briefings in
 the Information Dominance Center,
 pointing to complex diagrams showing who knew who – including some 
places being called by dozens of people in the network. Maybe the data 
had found the kingpin?
“Some of my colleagues and I were sceptical,” a former analyst told 
the magazine. “Later, we had a chance to review the information. It 
turns out that all [that] those guys were connected to were pizza 
shops.”
Have the wrong name
With all the talk of “smart analytics” and “big data”, it is easy to 
forget that a lot of automatic systems will unthinkingly dive on 
anything that looks like a target. If you are unlucky enough to have the
 same name as a major terror suspect, your emails, messages and more 
will almost certainly have ended up in at least one intelligence agency 
database.
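A minimal sketch of why namesakes get swept up: naive watchlist matching compares little more than a normalised string (the watchlist entry here is hypothetical):

```python
# A minimal sketch of naive name matching against a watchlist. Normalising
# case and punctuation, as real systems must, only widens the net.
import re

WATCHLIST = {"john smith"}  # hypothetical entry

def normalise(name: str) -> str:
    return re.sub(r"[^a-z ]", "", name.lower()).strip()

def matches_watchlist(passenger: str) -> bool:
    return normalise(passenger) in WATCHLIST

# Every John Smith in the country now shares the suspect's flag.
for passenger in ["John Smith", "JOHN SMITH!", "Jane Doe"]:
    print(passenger, matches_watchlist(passenger))
```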
Things get even worse with no-fly lists: because of clerical errors, 
false flags on names or similar, for the first few years after 9/11, 
some unfortunates were detained on dozens of occasions flying around the
US, and even imprisoned. These included the Stanford academic Rahinah Ibrahim, who uses a wheelchair. She had been flagged when someone hit the wrong 
checkbox on an online form, as she learned only years later through a 
court challenge. Only after several court battles was the system tidied 
up, and some people still need to fly with letters – to show to humans –
 stating that they are absolutely, definitely, not a terrorist, no 
matter what the computer says.
 
Act the wrong way
It is possible that, mindful of companies tracking you for ads, 
governments tracking you to keep you safe, and schoolfriends tracking 
you down to show baby pictures, you have decided to try to use the 
internet a bit more privately.
One way might have been to install software such as Tor,
 which, when used properly, anonymises your internet browsing. The US 
navy helped develop the software, which receives public money to this 
day for its role in protecting activists in dictatorships around the 
world. At the same time, though, British and US spies decry the hiding 
place it offers to terrorists, serious criminals and others. According 
to the Snowden files, GCHQ and the NSA constantly attempted to break and track the network, created special measures to retain the traffic of Tor users, and even built malware that would target any Tor user who happened upon a site hosting it. That sophisticated attack exploited flaws in browser software to gain almost total access to a compromised computer.
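For the curious, this is roughly what "using Tor properly" looks like from a script’s point of view: a minimal sketch that routes web requests through a locally running Tor daemon. It assumes Tor is listening on its default SOCKS port, 9050, and that the requests library is installed with SOCKS support (pip install requests[socks]).

```python
# Route HTTP requests through a local Tor daemon via its SOCKS proxy.
import requests

# "socks5h" (rather than "socks5") sends DNS lookups through Tor too;
# resolving hostnames locally would leak which sites you visit.
proxies = {
    "http": "socks5h://127.0.0.1:9050",
    "https": "socks5h://127.0.0.1:9050",
}

# The Tor Project's check service reports whether a request arrived via Tor.
response = requests.get("https://check.torproject.org/api/ip", proxies=proxies)
print(response.json())
```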
Do nothing at all
In the online era, there is every possibility that you could fall 
into surveillance without ever posting, acting or associating 
suspiciously. With so much traffic flowing across the internet, it is 
sometimes easier for intelligence agencies to collect everything they 
see rather than targeting particular people – so sometimes even merely 
using the most innocuous or esoteric web services can get your pictures 
into agency databases. It is unlikely to lead to your impending arrest, 
and could well never be read by an actual human – but it would be there 
all the same.
One example is a GCHQ system codenamed OPTIC NERVE
 that was designed to capture images from every Yahoo webcam chat picked
 up by GCHQ’s bulk-intercept system. The capability was created, Snowden
 documents suggested, because some GCHQ targets used the webcam software
 – and so the agency picked up everything it could. Our poor spies 
quickly discovered that lots of people – up to 11% of users – rely on 
such webcam services to exchange “adult” moments, and staff had to be 
issued with advice on how to avoid seeing such smut. Such are the 
hazards of snooping: you set out to find terrorists, and end up building
 (probably) the world’s largest porn collection.
Another place the agencies saw some of their targets was in the world
 of online gaming. Noticing suspects playing online role-playing games, 
or messing with Angry Birds, the agencies responded to cover those areas
of the internet too. GCHQ documents show the agency analysed how to read and collect the information sent back and forth by these and other online games, including how to extract and store text in bulk from some 
game chatrooms. Other GCHQ analysts managed to wangle the geek’s dream 
assignment of becoming human agents in online games, including Second Life and World of Warcraft.
One way to avoid such unwanted attention might be to stick with 
console shoot-’em-ups: play this sort of game on Xbox Live, and you 
are more likely to see a GCHQ hiring advert than fall foul of 
surveillance. If you can’t beat ’em, why not join ’em?
Source:
The Guardian, UK 
 
 