NSA XKeyscore Powerpoint
DERIVED FROM: [illegible]
DATED: 20070108
DECLASSIFY ON: 20320108
TOP SECRET//COMINT//REL TO USA, AUS, CAN, GBR, NZL
DNI Exploitation System/Analytic Framework
Performs strong (e.g. email) and soft (content) selection
Provides real-time target activity (tipping)
"Rolling Buffer" of ~3 days of ALL unfiltered data seen by
XKEYSCORE:
Stores full-take data at the collection site - indexed by meta-data
Provides a series of viewers for common data types
Federated Query system - one query scans all sites
Performing full-take allows analysts to find targets that were
previously unknown by mining the meta-data
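The "rolling buffer" idea above can be illustrated with a generic time-based retention queue. This is only a minimal sketch: the `Record` and `RollingBuffer` names, the timestamp field, and the assumption that records arrive in time order are all hypothetical and are not taken from the slides.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Record:
    timestamp: float  # seconds since some epoch (hypothetical field)
    payload: bytes


class RollingBuffer:
    """Keeps records for a fixed retention window and drops older ones."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._records = deque()

    def add(self, record):
        # Assumption: records arrive in non-decreasing timestamp order.
        self._records.append(record)
        self._evict(record.timestamp)

    def _evict(self, now):
        # Drop everything that has fallen out of the retention window.
        cutoff = now - self.retention_seconds
        while self._records and self._records[0].timestamp < cutoff:
            self._records.popleft()

    def __len__(self):
        return len(self._records)


THREE_DAYS = 3 * 24 * 60 * 60
buf = RollingBuffer(THREE_DAYS)
buf.add(Record(0.0, b"a"))
buf.add(Record(THREE_DAYS + 1.0, b"b"))  # the first record is now outside the window
```

A deque keeps eviction cheap because expired records are always at the front when input is time-ordered.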
Small, focused team
Work closely with the analysts
Evolutionary development cycle (deploy early, deploy often)
React to mission requirements
Support staff integrated with developers
Sometimes a delicate balance of mission and research
Massive distributed Linux cluster
Over 500 servers distributed around the world
System can scale linearly - simply add a new
server to the cluster
Federated Query Mechanism
[Diagram: user queries go to the XKEYSCORE web server, which sends the query to each site: F6 Site 1, F6 Site 2, FORNSAT site, SSO site]
Approximately 150 sites
Over 700 servers
[Chart: horizontal axis "Processing Speed"; vertical axis label illegible]
Can look at more data
Can also be configured to go shallow if the data rate is too high
Strong Selection itself gives us only a very
limited capability
A large amount of time spent on the web is
performing actions that are anonymous
We can use this traffic to detect anomalies
which can lead us to intelligence by itself, or
strong selectors for traditional tasking
Plug-ins extract and index metadata into
tables
[sessions] -> [processing engine] -> [metadata tables]
Extracted metadata:
phone numbers
email addresses
log ins
user activity
PLUG-IN             DESCRIPTION
E-mail Addresses    Indexes every e-mail address seen in a session by
                    both username and domain
Extracted Files     Indexes every file seen in a session by both filename
                    and extension
Full Log            Indexes every DNI session collected. Data is
                    indexed by the standard N-tuple (IP, Port,
                    Casenotation etc.)
HTTP Parser         Indexes the client-side HTTP traffic (examples to
                    follow)
Phone Number        Indexes every phone number seen in a session (e.g.
                    address book entries or signature block)
User Activity       Indexes the Webmail and Chat activity to include
                    username, buddylist, machine specific cookies etc.
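Indexing "by both username and domain", as described for the e-mail address entry above, amounts to keeping two inverted indexes over the same extracted values. The following is a generic text-processing sketch; the regular expression, the function name, and the session identifiers are illustrative assumptions, not details from the slides.

```python
import re
from collections import defaultdict

# Simplified pattern for illustration; real address syntax is broader than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def index_email_addresses(session_id, text, by_username, by_domain):
    """Record which sessions mention each username and each domain."""
    for address in EMAIL_RE.findall(text):
        username, domain = address.lower().rsplit("@", 1)
        by_username[username].add(session_id)
        by_domain[domain].add(session_id)


by_username = defaultdict(set)
by_domain = defaultdict(set)
index_email_addresses("s1", "From: alice@example.org To: bob@example.com", by_username, by_domain)
index_email_addresses("s2", "Contact: alice@example.net", by_username, by_domain)
```

Keeping the two indexes separate lets a lookup start from either half of an address without scanning the other.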
Anything you wish to extract
Choose your metadata
Customizable storage times
Ex: HTTP Parser
Connection: keep-alive
How do I find a strong-selector for a known
target?
How do I find a cell of terrorists that has no
connection to known strong-selectors?
Answer: Look for anomalous events
E.g. Someone whose language is out of place for the
region they are in
Someone who is using encryption
Someone searching the web for suspicious stuff
Show me all the encrypted word
documents from Iran
Show me all PGP usage in Iran
Once again data volume too high so
forwarding these back is not possible
No strong-selector
Can perform this kind of retrospective
query, then simply pull content of interest
from site as required
Show me all the VPN startups in
country X, and give me the data so I
can decrypt and discover the users
These events are easily browsable in
XKEYSCORE
No strong-selector
XKEYSCORE extracts and stores authoring
information for many major document types - can
perform a retrospective survey to trace the
document origin since metadata is typically kept for
up to 30 days
No other system performs this on raw unselected
bulk traffic, data volumes prohibit forwarding
Traditionally triggered by a strong-selector
event, but it doesn't have to be this way
Reverse PSC - from anomalous event back to
a strong selector. You cannot perform this
kind of analysis when the data has first been
strong selected.
Tie in with Marina - allow PSC collection after
the event
My target speaks German but is in
Pakistan - how can I find him?
XKEYSCORE's HTTP Activity plugin extracts
and stores all HTML language tags which
can then be searched
Not possible in any other system but
XKEYSCORE, nor could it be -
volumes are too great to forward
No strong-selector
My target uses Google Maps to scope target
locations - can I use this information to
determine his email address? What about the
web-searches - do any stand out and look
suspicious?
XKEYSCORE extracts and databases these events
including all web-based searches which can be
retrospectively queried
No strong-selector
Data volume too high to forward
I have a Jihadist document that
has been passed around through
numerous people. Who wrote this
and where were they?
Show me all the Microsoft Excel spreadsheets
containing MAC addresses coming out of Iraq
so I can perform network mapping
New extractor allows different dictionaries to run on
document/email bodies - these more complex
dictionaries can generate and database this
information
No strong-selector
Data volume is high
Multiple dictionaries targeted at specific data types
Show me all the exploitable machines in
country X
Fingerprints from TAO are loaded into
XKEYSCORE's application/fingerprintID
engine
Data is tagged and databased
No strong-selector
Complex boolean tasking and regular
expressions required
New web services every day
Scanning content for the userid
rather than performing strong
selection means we may detect
activity for applications we
previously had no idea about
Have technology (thanks to R6) - for
English, Arabic and Chinese
Allow queries like:
Show me all the word documents with
references to IAEO
Show me all documents that reference
Osama Bin Laden
Will allow a 'show me more like this'
capability
High Speed Selection
Toolbar
Integration with Marina
GPRS, WLAN integration
SSO CRDB
Workflows
Multi-level Dictionaries
High speeds yet again (algorithmic and Cell
Processor (R4))
Better presentation
Entity Extraction
VoIP
More networking protocols
Additional metadata
Expand on google-earth capability
EXIF tags
Integration of all CES-AppProcs
Easier to install/maintain/upgrade