Uh-Oh
Posted by Stephen Green · 21 July 2004
One historic strength of the US intelligence community has been its ability to intercept and decipher enemy communications. The best-known example is the breaking of the Japanese diplomatic and military codes before and during WWII. Our ability to "read their mail" enabled the Navy's stunning victory at Midway -- a battle which put Japan on the defensive for the rest of the war. Our new enemy is even more dispersed than Imperial Japan's vast Greater East Asia Co-Prosperity Sphere, and thus even more reliant on secure communications. You'd think that would play into our biggest strength.

To gain an appreciation of the emerging challenge, consider these facts. A single strand of fiber-optic cable exceeds the capacity of all the telecommunications satellites orbiting the globe. This year alone, e-mail volume is expected to be the equivalent of 40 copies of the fully digitized holdings of the Library of Congress.

Would any readers with actual signals intelligence experience like to weigh in on this, in a strictly unclassified manner? Shoot me an email, or click on the Comments below.

Comments
And would any readers like to give a Library of Congress equivalent in megabytes? I absolutely HATE it when people use analogies like that that don't mean anything to most people. I don't need to know how many 'songs' or 'encyclopedias' some new removable media can hold, I want to know how many bytes it can hold. Unless Windows starts giving file sizes in encyclopedias.
Posted by: dr.dna at July 21, 2004 01:53 AM

To give some scope and scale to the challenge facing our intelligence gatherers, figure millions (soon to be billions) of terabytes per second. First, it's encoded with a network protocol, then it's encrypted, then it's in one of say 150 different languages. 530 billion instant messages of 1K bytes each will be lots of bits to inspect, and that doesn't include email or telephone conversations. B^(
Posted by: geoffg at July 21, 2004 04:17 AM

OK, I'll weigh in on this one. I was an 05H Electronic Warfare Signals Intelligence Morse Interceptor in the Army in the late 1980s. Signals intelligence is a bit of a hobby also, and I am now a computer programmer/analyst. I have worked on communications programs, and am also a ham radio operator.

In a general sense, there are a couple of problems to solve here, and automation can help with some of them. First, you have the related problems of targeting and collection. In other words, you need to know who to listen to (which in this case might be an e-mail address, or an AIM screen name), and to effectively collect that information. These two fields, especially the second one, are amenable to automation. I imagine (not having any access to inside information) that this is happening now, and has been happening since the public birth of the Internet.

The next problem is getting useful and timely information from what you have intercepted. In these days of strong encryption being available to everyone for free, this is a serious problem. But all is not lost.
You can get a lot of information by using what is known as 'traffic analysis'. Let us assume three entities, labeled A, B, and C. They are suspected of being part of a terrorist network, but we aren't sure. A talks to B quite frequently, and likewise B talks to A, but only after talking to C. C communicates only with B, and only infrequently. We then notice that C tends to communicate with B right before some kind of terrorist activity. This then gives us a 'predictive' capability. When we note that C sends a message to B, we can assume something is going to happen, and put people on alert. We can assume that A is either a support wing of the group, or a command group (we don't have enough information yet). We can assume that C is the 'action' part of this terrorist group. This starts narrowing down things. We can also note IP addresses, and possibly work from there to find the physical computers they are using, and work on getting information from there.

Also, remember that people who communicate tend to get sloppy. Call it laziness, call it getting comfortable, but people make mistakes. One common mistake is enciphering or encoding a message in two different systems to go to two different recipients. That opens up a very nice crack in the cryptological armor. And it happens more than you might think. Let's say that A sends a message to B using a certain PGP key. B then sends the message to C using a different key, because A and C can't communicate for security reasons. You now have two identical messages that can be compared. I'm not really a cryptanalyst, and it would be far above my skill level (which is not much better than solving cryptoquotes in the newspaper and some simple polyalphabetic systems), but I do know this is a classic technique for leveraging plaintext from unknown ciphertext.

Another problem that touches on all that stated above is how to separate the wheat from the chaff.
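The A/B/C traffic-analysis example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the message log, the incident times, the four-hour window, and the function names are all hypothetical, and only sender, receiver, and time are used, never message contents.

```python
from collections import Counter

# Hypothetical intercept log: (hour, sender, receiver). Contents are unknown.
MESSAGES = [
    (1, "A", "B"), (2, "B", "A"), (5, "A", "B"),
    (9, "C", "B"), (10, "B", "A"),
    (30, "A", "B"), (33, "C", "B"), (34, "B", "A"),
    (50, "A", "B"), (51, "B", "A"),
]

# Hypothetical hours at which incidents were later observed.
INCIDENTS = [12, 36]


def link_counts(messages):
    """Count messages on each directed (sender, receiver) link."""
    return Counter((src, dst) for _, src, dst in messages)


def precursor_rate(messages, incidents, sender, window=4):
    """Fraction of a sender's messages followed by an incident within `window` hours."""
    sent = [t for t, src, _ in messages if src == sender]
    if not sent:
        return 0.0
    hits = sum(any(0 < inc - t <= window for inc in incidents) for t in sent)
    return hits / len(sent)


if __name__ == "__main__":
    print(link_counts(MESSAGES))
    for node in ("A", "B", "C"):
        print(node, precursor_rate(MESSAGES, INCIDENTS, node))
```

With this made-up log, every message from C precedes an incident while none from A does, which is the kind of pattern that would flag C as the 'action' node.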
Using the Pacific Theater as an example again, at first we really didn't care about the locations of Japan's merchant fleet. We were worried about their aircraft carriers and battleships. Protecting our numerically weaker assets, while trying to reduce the offensive capability of the Imperial Japanese Navy, was the priority. However, as we shifted to trying to starve the Japanese into submission and to cut off supplies to their far-flung outposts, the locations of ships carrying food, fuel, and ammunition became important. And we used that information to guide our submarines onto the likely routes of Japan's Marus. We accomplished in the Pacific, in large part thanks to signals intelligence, what Admiral Doenitz failed to do in the Atlantic.

Sometimes even just knowing where an entity is can provide valuable information. This mostly applies to transmissions over the airwaves, but you must remember that includes things like satellite phones, cell phones, walkie talkies, even wireless networks. Basically, it boils down to this: if you are radiating an identifiable signal (even perhaps an unintentional one), you can be located very quickly using direction finding (DF). There are some limitations, and ways to obscure your location, but none are foolproof. Even spread spectrum emissions are vulnerable, whether they are direct sequence or frequency hopping. Wideband receivers plugged into computers running Fourier transform software would make quick work of this (commonly available ham radio software could almost do this now). Knowing where a target is, even if you don't know what they are saying, can be useful for targeting munitions. The real trick is acting on the information before it becomes stale (i.e., the target changes location).

There are challenges that no doubt need to be met, but it has always been that way.
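To make the direction-finding idea above concrete, here is a minimal sketch of a two-station fix: each station reports only a bearing to the emitter, and the position estimate is where the two bearing lines cross. The flat (east, north) grid, the station positions, and the function name are assumptions for illustration; real DF has to deal with bearing error, more than two stations, and the curvature of the earth.

```python
import math


def fix_from_bearings(p1, brg1, p2, brg2):
    """Intersect two bearing lines on a flat grid.

    p1, p2 are station positions as (east, north).
    brg1, brg2 are bearings in degrees, clockwise from north.
    Returns the (east, north) crossing point, or None if the lines are parallel.
    """
    # Unit direction vectors for each bearing.
    d1 = (math.sin(math.radians(brg1)), math.cos(math.radians(brg1)))
    d2 = (math.sin(math.radians(brg2)), math.cos(math.radians(brg2)))

    # 2-D cross product of the directions; zero means no unique crossing.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-12:
        return None

    # Solve p1 + t*d1 == p2 + s*d2 for t.
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


if __name__ == "__main__":
    # Two hypothetical stations 10 km apart on an east-west baseline.
    print(fix_from_bearings((0.0, 0.0), 45.0, (10.0, 0.0), 315.0))
```

The sketch does not check that the crossing lies in front of both stations; a negative `t` would mean the lines meet behind station one.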
And these issues didn't just spring from the ground full grown in the last three years, so I assume that there has been considerable progress towards solving these problems. I imagine that somewhere deep in the bowels of the NSA headquarters is a sign that says "All your comms are belong to us".
Posted by: Bill at July 21, 2004 07:53 AM

Thanks, Bill! To slightly amplify your initial point "two": as the technology for bandwidth increases, so does the tech for storage. So store EVERYTHING. Every email, every phone call, every packet. At some point a clue is discovered. Say you pick up a shoe bomber on an aircraft with a cell phone in his pocket. Now you pull from storage the record headers of every call from that phone to other phones. Patterns? Attempt to listen to the recordings. Can you identify other phone numbers that might be clues? This isn't real time analysis, but it is a great, if time lagged, way to apply filters to the mass of information coming in. And if a high-level node can be identified in LAST month's mass, then you CAN start tapping the real-time messages flowing in and out of that node THIS month.

For us private citizens of no evil intent, it's a bit scary to think of all our prank calls to the high school; the calls where we breathe heavily at our objects d'crush; the angry calls where we threaten to "Come over there and beat some sense into your skull"; and the mushy calls where we babble over the phone at some oblivious baby, all recorded forever. But unless I'm picked up with a shoe bomb, I'd expect the recordings would never actually be listened to. And that, perhaps, is our safeguard. Not that an official needs a search warrant to RECORD the packets. But we might want to consider laws requiring officials to get a warrant in order to RETRIEVE the recordings and listen. Time to think about it is now -- if not last decade.
Posted by: pouncer at July 21, 2004 08:35 AM

Pouncer: By the way, it is *VERY* bad juju to systematically monitor what is defined as a 'US Person' without a warrant. There is a specific definition, which escapes my memory, but it applies to all US citizens and companies. There are exceptions for identification purposes (you have to know what not to monitor, in order to not monitor it) and training. In fact, that was the excuse used in 'Enemy of the State' to monitor Will Smith's character. But in practice, that wouldn't wash. Training would have been done against other government entities, not against private citizens. Records other than those necessary to identify the signals must be destroyed. And it is taken very seriously. There are some serious fines and prison time associated with monitoring 'US Persons' outside of the narrow exceptions. Recording the communications for possible later retrieval is illegal. The law would have to change, and I am not sure that I want this particular law changed. It is there to protect us from government abuse. None of this applies to foreign nationals not in the US, who are by definition not 'US Persons'. There is a good FAQ on some of this here: Note especially questions 9 through 13. There are some good definitions in there that have refreshed my memory somewhat.

While I do know that the NSA is up to its eyeballs in computers (and indeed was back when I was in the SIGINT business; they were Cray's biggest customer), storing every communication worldwide isn't practical. I suspect that they filter at the front end, and continually tweak those filters to gather as complete a picture as they can.

Another problem that I touched on in my previous post is that of translation. I surmise that they have some automated translation tools, probably much better than Babel, but that important stuff is done by a human. There are nuances that a human will notice that machines can miss.
This would be analogous to the skimming of Japanese intercepts in 1942. The NSA maintains an interesting website with some historical decrypts. Well worth the look, along with reading anything written by David Kahn.
Posted by: Bill at July 21, 2004 09:31 AM

I can't really comment on the policy implications of this but I do know quite a lot about the cryptographic side of things (I've worked in designing cryptographic software both in academia and business). Basically the upshot is this: with the consistent application of some widely available tools, the battle to read what people are sending is lost. Bill is correct in saying that re-encryption of a message in PGP opens a chink in the armour, but it's a very tiny chink and it is exceedingly unlikely that you will ever get enough messages to mount a known-ciphertext attack on a modern cipher. Really your only chance is sloppy key generation, i.e. attack the cryptosystem, not the cipher. It's very unlikely the NSA can break AES, 3DES, RSA or El Gamal, and if you're encrypting VOIP end-to-end then traffic analysis is really all you're left with. Couple anonymous remailers with IPSec and steganographic techniques and a couple of knowledgeable people can set up a comms channel that is well-nigh immune to surveillance. With routine encryption of VPNs, for example, cryptanalysts are at a huge disadvantage. We used to call this the 'all cats look grey in the dark' effect. I could create an innocuous-seeming website with images on it doctored to store a few bytes of information in a fashion that is impossible to prove, even in principle, constitutes a message (this is a so-called subliminal channel). Those few bytes could easily contain a go-code for a terrorist, for example. This 'broadcast' method is particularly effective, in fact.

Let's examine the 'store everything' angles. If we wave away the legal objections, we can try to get a handle on the data storage requirements.
The WP article is a bit useless in quantifying things but we can go order-of-magnitude. 180 billion minutes of phone calls. Let's stipulate this is 64 kbit/s CCITT toll-quality voice (ignore fax, etc). That's 1.8 x 10^11 x 60 x 8 kbytes. Order of magnitude, that is 100 Petabytes of data, or 100 million Gigabytes (you'll need a database on top of this). Let's say we're going to use current top-end 250G drives, and 'cos we're the government we get them for the low! low! price of $100. That's 40 million bucks' worth just in drives a year, plus the technicians to install them at the rate of 1100 per day, plus the very expensive and mushrooming SAN infrastructure to hook them up to your as-yet uncosted global data-gathering facility, plus the 100 MW power station you need to keep this thing running, plus... And that's just voice, for one year. Add in email, HTTP, IM, FTP, NNTP. Now we need the software to classify all this stuff, search tools to retrieve the data in a timely fashion, the agents and cryptologists and traffic analysts, and the facility to house the thousands of terminals that this army of spooks is going to need. And if it's been competently encrypted, you're hosed from the get-go. 'Store everything' is idiotic. It doesn't even pass the laugh test. You can't even collect everything, let alone store it.
Posted by: David Gillies at July 21, 2004 10:46 AM

Bill, that rule restricting the collection on American citizens would be United States Signals Intelligence Directive 18. And yes it is very bad 'juju' to break it.
Posted by: 98GStrat at July 21, 2004 11:43 AM

Google the term "echelon", it's not all conspiracy theory.
Posted by: Joe at July 21, 2004 12:27 PM

David is also correct. It just isn't possible to *STORE* everything. It is possible to check most communications on the fly, tagging and saving those of interest for later analysis.
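David Gillies's back-of-the-envelope storage estimate earlier in this thread can be reproduced with a few lines of arithmetic. The inputs below are the ones stated in his comment; the use of decimal units (1 kbyte = 1,000 bytes, 1 GB = 10^9 bytes) and the function name are assumptions. Without rounding up to 100 petabytes, the figures come out slightly lower than his, but at the same order of magnitude.

```python
def voice_storage_estimate(
    minutes_per_year=180 * 10**9,   # 180 billion minutes of phone calls
    bits_per_second=64_000,         # 64 kbit/s toll-quality voice
    drive_bytes=250 * 10**9,        # 250 GB drives (decimal units assumed)
    drive_cost_dollars=100,         # stipulated price per drive
):
    """Return the yearly storage, drive count, drive cost, and install rate."""
    bytes_per_second = bits_per_second // 8           # 8,000 bytes per second
    total_bytes = minutes_per_year * 60 * bytes_per_second
    drives = total_bytes // drive_bytes
    return {
        "petabytes": total_bytes / 10**15,
        "drives": drives,
        "drive_cost_dollars": drives * drive_cost_dollars,
        "drives_per_day": drives / 365,
    }


if __name__ == "__main__":
    for name, value in voice_storage_estimate().items():
        print(f"{name}: {value:,.1f}")
```

Rounding 86.4 petabytes up to 100 gives the 400,000 drives, $40 million, and roughly 1,100 installs per day quoted in the comment.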
I suspect that if you were really, really, really paranoid about security, you would go with a one time pad type encryption system. They are a bit of a pain in the butt to use, in that the keys have to be distributed in a secure manner; however, they are unbreakable both in theory and in practice, providing that you follow the rules (never ever re-use the keys. The Russians did that in WWII, and that allowed us to read a fraction of their GRU/NKVD traffic). A conventional 1.44M floppy could contain enough key material to last for quite a long time, providing you are only sending text.

As for how much of a crack identical messages with different keys would open in PGP, I'm not qualified to say, other than I know that it is a possible generic attack for any cipher system.

As for an initiation signal, it could be anything. It doesn't even have to be a doctored photo on a website. It could be an undoctored photo of a particular subject on a website or in an e-mail. During WWII, the BBC transmitted certain phrases and songs as messages to the French Resistance. Digitally signing them by subtle manipulations of bits assures authenticity, but isn't strictly necessary for a secret prearranged signal.

Interestingly, on the subject of subtle modifications, sometimes security checks like that are ignored. For many months, the Germans controlled an entire network of spies/saboteurs/resistance fighters inserted on the continent by the English before D-Day. The English didn't catch on, despite the fact that the radio operators sending messages back to them were omitting their security checks (specific mistakes in encoding to prove authenticity of the message, and that the agent wasn't 'turned' by the opposition). The English did eventually catch on, but not before they had dropped thousands of pounds of guns, ammunition, explosives, tobacco, and other things into the waiting hands of the Germans.
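The one-time pad rule mentioned above (never re-use key material) is easy to demonstrate. The sketch below is an illustration only: the messages, the 500-byte average message size used for the floppy estimate, and the function names are assumptions. It shows that when two messages are enciphered with the same pad, XORing the two ciphertexts cancels the pad entirely, so knowing or guessing one plaintext exposes the other.

```python
import secrets


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def otp_encrypt(plaintext, pad):
    """Encipher with a one-time pad; deciphering is the same operation."""
    if len(pad) < len(plaintext):
        raise ValueError("pad is shorter than the message")
    return xor_bytes(plaintext, pad[: len(plaintext)])


def recover_with_crib(c1, c2, known_p1):
    """Given two ciphertexts made with the SAME pad and one known plaintext,
    recover the other plaintext without ever seeing the pad."""
    return xor_bytes(xor_bytes(c1, c2), known_p1)


# Rough capacity check: a 1.44M floppy holds 1,474,560 bytes of key material.
FLOPPY_BYTES = 1_474_560
ASSUMED_MESSAGE_BYTES = 500  # hypothetical average text message size
MESSAGES_PER_FLOPPY = FLOPPY_BYTES // ASSUMED_MESSAGE_BYTES

if __name__ == "__main__":
    pad = secrets.token_bytes(32)
    p1, p2 = b"ATTACK AT DAWN", b"RETREAT AT TEN"
    c1, c2 = otp_encrypt(p1, pad), otp_encrypt(p2, pad)  # the mistake: pad re-used
    print(recover_with_crib(c1, c2, p1))
    print(MESSAGES_PER_FLOPPY, "messages per floppy at the assumed size")
```

Used correctly, each stretch of pad is consumed once and destroyed, which is exactly what makes key distribution the painful part.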
After a few months of sparring on the airwaves, the Germans sent a hilarious message in plaintext to the British, promising to welcome them with open arms. Which is a long-winded example that sometimes even professionals can get sloppy, and sometimes security checks can be ignored.

And sloppy key generation would not be unheard of. In fact, it was and is a common thing. People get lazy. A frequent thing German Enigma operators in certain units would do is use the rotor position that the machine was on after the last message was encoded or decoded as the key for the next message. Very bad. The Russians, probably due more to the logistical problems inherent in distributing keys during wartime than sloppiness, re-used keys for the OTP system used by NKVD agents in the US during WWII, opening a crack for the NSA, giving us the Venona documents (very interesting stuff. The Rosenbergs were guilty as sin).

Another potential attack is the known or assumed plaintext attack. Given the nature of the terrorists, it would not be unreasonable that they might have passages from the Koran in at least some of their messages. Again, I'm not personally qualified as to whether it is or isn't possible.

Even with encrypted VPNs and anonymous remailers and such, an increase or decrease in the traffic in any particular suspicious part of the Internet outside the statistical norms could be an indicator, even if you don't know who sent it, where it is going, and what it says. Especially if something significant happens immediately before or immediately after the increase or decrease. All of the cats in one particular area might look gray, but if there is a sudden increase in the number of cats, something might be afoot. I suspect that when you hear about 'increased chatter' on the news (which makes me cringe every time), some of that might be simple statistical analysis like I outlined above.
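The 'simple statistical analysis' of traffic volume described above might look something like the following. This is a sketch under assumptions, not a description of any real system: the daily message counts, the three-standard-deviation threshold, and the function name are all hypothetical. It flags days whose volume falls well outside the baseline in either direction, without looking at who sent what.

```python
import statistics


def volume_anomalies(baseline, recent, threshold=3.0):
    """Return (index, count) for recent counts far outside the baseline norm.

    A count is flagged when it is more than `threshold` sample standard
    deviations above OR below the baseline mean.
    """
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        # A perfectly flat baseline: any change at all is unusual.
        return [(i, v) for i, v in enumerate(recent) if v != mean]
    return [
        (i, v) for i, v in enumerate(recent)
        if abs(v - mean) / stdev > threshold
    ]


if __name__ == "__main__":
    # Hypothetical daily message counts on one watched network segment.
    baseline = [100, 104, 98, 101, 97, 103, 99, 102]
    recent = [101, 99, 140, 60, 102]
    print(volume_anomalies(baseline, recent))
```

Both the spike to 140 and the drop to 60 are flagged, matching the point that a sudden decrease can be as telling as an increase.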
You can attempt to hide important messages by passing large amounts of dummy traffic, in an attempt to foil this sort of analysis, but dummy traffic always seems to look like, well, dummy traffic. If you are just sending random bytes, that looks statistically different than an actual message (with the exception of the OTP system).

Unfortunately, one of the disadvantages of a free press is that you can't suppress information that, while not technically classified, is helpful to the other side. A prime example is the fact that we were able to intercept the satellite phones used by Osama bin Laden in Afghanistan. Had that information not been reported in 1998 or '99, we might have been able to quickly capture or kill him after 9/11. Not that I am knocking a free press, mind you. I would much rather have to deal with the problems inherent in a free press than with the problems inherent in government control of the press. But it can be frustrating.
Thanks, 98GStrat. It's been 15 years since I got out, and I didn't remember the actual law. By the way, I understand that 05H's have been renumbered as 98H's, and that the 05D's (the DF guys) have been merged into that MOS.
I _am_ an idiot, of course. I never deny the obvious. Idiocy gives me the prerogative to ask stupid questions... Isn't the point that storage capacity is rising along with bandwidth capacity valid? Or is it really the case that the number of channels to be monitored is rising faster than our ability to make any record at all, even that "pen register"?
Posted by: pouncer at July 21, 2004 02:58 PM

Bandwidth is rising a lot faster than storage capacity. But it's the interpretation of the information that's an issue. The ultimate limitation is human factors, not hardware. Bill: if you run dummy traffic over an end-to-end encrypted link it looks indistinguishable from genuine traffic. It also makes the cryptanalyst's task much harder.
Posted by: David Gillies at July 21, 2004 04:43 PM

David, if you run enough dummy traffic over a link, it will look different than real traffic. Having said that, it does complicate the job somewhat, as you have to figure out what is the dummy traffic and what is the real traffic, without being able to read either. If you send too much encrypted dummy traffic (Koran texts, or perhaps sound files of OBL speeches) you run the risk of sending lots of duplicate information, and thus the attendant risk of compromising your cryptosystem. If you just send unencrypted random garbage, it will look statistically different than the real traffic. I don't know this specifically, but I suspect that if you encrypt random garbage there might be a statistical test that, given enough traffic, will tell you that it is random. There is also the fact that dummy traffic tends to get stereotyped, due to (once again) the laziness of operators, and sometimes to overly rigid communications rules. This would all be useless in the face of a properly set up OTP system, but the logistical difficulties preclude it from being used on the type of end-to-end encrypted link we are talking about.
You would have to not only pass the one time keypads (or files), but also tons of dummy pads. And, of course, all this is for naught if the other side is passing physical disks with encrypted files in person.
Posted by: Bill at July 22, 2004 06:01 AM
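The claim in the thread that unencrypted random bytes look statistically different from a real plaintext message can be checked with a very simple measure, the Shannon entropy of the byte distribution. The sketch below is illustrative only; the sample text and function name are assumptions. Note what it does not show: well-encrypted data also scores close to 8 bits per byte, so this particular test separates plaintext from random-looking data, not encrypted real traffic from encrypted dummy traffic.

```python
import math
import os
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical plaintext sample and an equally arbitrary block of random bytes.
    text = b"the quick brown fox jumps over the lazy dog " * 200
    noise = os.urandom(65536)
    print("plain text  :", round(byte_entropy(text), 2), "bits/byte")
    print("random bytes:", round(byte_entropy(noise), 2), "bits/byte")
```

Ordinary text uses only a few dozen distinct byte values unevenly, so it lands well below the random sample.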