SANS_CD_OSINT_POSTER_v1
OSINT
IPv4 addresses were deployed in 1981 and have 4.3 billion possible addresses.
40.112.72.205
IPv6 addresses were deployed in 1998 and have 340 undecillion possible addresses.
2001:4998:0024:120d:0000:0000:0001:0001
2001:4998:24:120d::1:1
Note: IPv6 addresses are so long that there are rules for shortening them!
The two IPv6 addresses above are actually the same address. The bottom address has the leading zeros removed from each group and the consecutive all-zero groups collapsed into a double colon (::).
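The address counts and the shortening rules can be checked with Python's standard ipaddress module. This is a minimal sketch; the variable names are illustrative and not part of the poster.

```python
import ipaddress

# Address space sizes: IPv4 addresses are 32 bits, IPv6 addresses are 128 bits.
ipv4_total = 2 ** 32    # about 4.3 billion
ipv6_total = 2 ** 128   # about 340 undecillion (3.4 x 10^38)

# The long and short spellings of the example address from the poster.
full = ipaddress.IPv6Address("2001:4998:0024:120d:0000:0000:0001:0001")
short = ipaddress.IPv6Address("2001:4998:24:120d::1:1")

same_address = (full == short)      # both spellings parse to the same value
compressed_form = full.compressed   # shortened spelling
exploded_form = short.exploded      # fully written-out spelling

print(ipv4_total, ipv6_total)
print(same_address, compressed_form, exploded_form)
```

Comparing parsed address objects rather than raw strings avoids treating two spellings of one address as different indicators.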
• Shodan.io – Shodan is a technology-focused search engine that allows you to see what services are running on a system and other detailed information.
• Search.Censys.io – Censys is a technology-focused search engine that allows you to see what services are running on a system and other detailed information on known vulnerabilities and SSL certificates.
• Greynoise.io – GreyNoise catalogs traffic flowing across the internet, allowing you to determine if activity from an IP address is targeted towards a specific group or indiscriminately scanning the internet.
• AbuseIPDB.com – AbuseIPDB is a central repository for reporting IP addresses that have been associated with malicious online activity.
Poster Created by Matt Edmondson. ©2024 SANS Institute. All Rights Reserved.
sans.org/cyber-defense
CD_OSINT_v1.1_0724
Google Search Operators
• @[term] – Search social media for a handle. Example: @hgtv
• -[term] – Exclude a term. Example: "fixer upper" "stream" -HGTV
• filetype:[type] – Only show results with a certain file type, such as xlsx or pdf. Example: site:hgtv.com filetype:pdf
• ([operations]) – Group operations together to control how a complex search executes. Example: site:gov wifi (password OR key)
• site: – Limit results to a specific site or class of site. Examples: site:gov, site:example.com
• intitle:[term] – Search for one term in the title of a page. Example: intitle:hgtv | site:hgtv.com
• allintitle:[term] – Search for all of the terms that follow the : in the title of the HTML page. Example: allintitle:bbc news | site:bbc.com
• inurl:[term] – Find sites with one term in the URL. Example: inurl:magnolia | site:magnolia.com
• allinurl:[term] – Find sites with multiple terms in the URL. Example: inurl:magnolia table site:magnolia.com
• intext:[term] – Search for occurrences of a term in website text. Example: intext:waco | site:magnolia.com
• allintext:[term] – Search for occurrences of multiple terms in website text. Example: allintext:silo shops site:magnolia.com
• link:[url] – Find pages that link to a page, but this does not seem to work consistently. Example: link:www.magnolia.com site:www.magnolia.com
• before:[Date] / after:[Date] – Find results before or after a specific date or year, or in between a specific date range. Example: before:2020 after:2015-7
• [n]..[n] – Put .. between two numbers to search a range (not super accurate). Example: nikon $300..$500
Shodan Search Filters
• country:"[country code]" – Searches for devices in a specific country
• hostname:"[hostname]" – Searches for devices with a specific hostname
• net:"[IP range]" – Searches for devices within a specific IP range
• os:"[operating system]" – Searches for devices with a specific operating system
• port:"[port number]" – Searches for devices with a specific port
• org:"[organization name]" – Searches for devices associated with a specific organization
• isp:"[internet service provider]" – Searches for devices using a specific internet service provider
• product:"[product name]" – Searches for devices that are using a specific software or hardware product
• version:"[version number]" – Searches for devices running a specific version of software or firmware
• has_screenshot:"true" – Searches for devices with available screenshots
• ssl.cert.subject.cn:"[common name]" – Searches for SSL certificates with a specific common name
• http.title:"[title text]" – Searches for web pages with a specific title
• http.html:"[html content]" – Searches for web pages containing specific HTML content
• before:"[date]" | after:"[date]" – Searches for devices that were online before or after a specific date
• webcam – Searches for internet-connected webcams
Example Shodan Searches
• "default password" – Searches for devices using default passwords
• "MongoDB Server Information" port:27017 -authentication – Finds exposed MongoDB databases
• "in-tank inventory" port:10001 – Finds gas station pump controllers
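Shodan filters can be combined in a single search by separating them with spaces. The sketch below only assembles the query string in the filter:"value" form used in the list above; it does not contact Shodan. The helper name build_shodan_query and the "US" country code are illustrative assumptions, not part of the poster.

```python
def build_shodan_query(terms, filters):
    """Join free-text terms and filter:"value" pairs into one search string.

    terms   -- list of free-text search terms (quote phrases yourself)
    filters -- dict mapping a Shodan filter name to its value
    """
    parts = list(terms)
    parts += [f'{name}:"{value}"' for name, value in filters.items()]
    return " ".join(parts)


# Hypothetical example: narrow the tank-gauge search to one country.
query = build_shodan_query(
    ['"in-tank inventory"'],
    {"port": "10001", "country": "US"},
)
print(query)
```

Building queries from a dictionary keeps dotted filter names such as http.title usable as keys and makes saved searches easy to log alongside their results.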
Limit results to a line that contains a specific term
• Linux – cat file.txt | grep -i sans
• Windows – type file.txt | findstr /I sans
Send the selected output to a new file
• Linux – cat file.txt | grep -i sans > new_file.txt
• Windows – type file.txt | findstr /I sans > new_file.txt
Dark Web Resources on the Internet
• ahmia.fi – Fantastic (and free) dark web search engine
• Tor.taxi – Trusted directory of sites on the dark web
• Tor.link – Directory of sites on the dark web
• Daunt.link – Trusted directory of sites on the dark web
• Torry.io – Dark web search engine
• https://github.com/fastfire/deepdarkCTI – Incredible list of URLs for dark websites including:
When you start the Tor Browser, it automatically opens a proxy listener on port 9050 and/or 9150. Any application that can be configured to use this proxy will have its traffic routed through Tor and will be able to access the dark web.
This functionality has many uses, including using Chrome, Firefox, etc. and their plugins to gather information from dark websites, as well as adding the ability to access the dark web with your Python code.
Discord
QUERY – EXPLANATION
• site:discord.gg – Searches for pages that contain Discord invite links from the primary domain used for Discord invites
• site:discord.gg "gaming community" – Adds specific keywords related to the topic or interest (e.g., "gaming community") to find relevant servers
• filetype:pdf "discord invite" – Looks for PDF documents containing Discord invite links; can also use filetype:txt, filetype:doc, etc.
• site:reddit.com "discord.gg" "cybersecurity" – Combines keywords with site-specific searches to find Discord invites posted on Reddit, which often shares server invites
• inurl:discord.gg "music community" – Uses the inurl operator to search for URLs that contain specific text such as "discord.gg" combined with keywords
• site:forum.example.com "discord invite" – Targets specific forums (replace forum.example.com with the URL of the forum) to find Discord invites
• intitle:"discord invite" "technology" – Uses the intitle operator to search for pages with specific words in the title, useful for finding blog posts or articles mentioning Discord invites
• site:facebook.com "discord.gg" "fitness" – Targets specific community-driven platforms like Facebook to find groups that share Discord server links
• site:discord.gg "FPS gaming" – Searches for pages containing Discord invites specifically mentioning "FPS gaming"
• site:reddit.com/r/cybersecurity "discord.gg" – Targets the Reddit cybersecurity community for shared Discord invites
• filetype:txt "discord.gg" "study group" – Searches for text files containing Discord invites related to study groups
• intitle:"discord invite" "tech news" – Finds blog posts or articles with titles indicating they contain Discord invites related to tech news
Detecting AI Content
As AI models become more sophisticated, detecting AI-generated content has become more challenging.
METHOD – DETAILS
• Look for repetitive phrases and ideas. – AI-generated content often includes repetitive phrases, sentences, or ideas. This happens because the model might generate similar outputs based on its training data. Identifying repetitive patterns, especially unusual repetition, can be a strong indicator of machine-generated text.
• Check for inconsistent writing styles. – AI might produce content with sudden changes in tone or style. For example, a paragraph might switch from formal to informal language abruptly. These inconsistencies occur because AI models might pull from varied data sources, leading to a lack of uniformity in writing style.
• Identify odd word choices or phrasing. – AI models sometimes use words or phrases in ways that seem slightly off or uncommon. This can include the use of synonyms that don't fit the context perfectly or awkward sentence constructions that a native speaker wouldn't typically use.
• Assess the depth of content. – AI-generated text might lack the depth and nuance of human writing. While it can produce coherent and informative content, it often misses the detailed insights, unique perspectives, or expert knowledge that a human writer would provide.
• Verify factual accuracy. – AI-generated content can include inaccurate information. It's essential to cross-check facts with reliable sources. AI may generate plausible-sounding but incorrect facts, especially when discussing niche or complex topics, making thorough verification critical.
• Analyze sentence structure and linguistic patterns. – AI models may use more complex or unnatural sentence structures compared to human writers. Look for overly complex sentences, unusual grammar patterns, or repetitive sentence structures. These linguistic anomalies can signal that the text is machine-generated.
• Monitor posting behavior. – Frequent, high-volume content posting without a corresponding number of human interactions can be a sign of AI-generated content. Analyze the posting frequency and engagement metrics. AI-generated posts might flood a platform with content at a rate that is impractical for human users.
• Examine user engagement patterns. – AI-generated content might attract different engagement patterns compared to human content. For instance, such content might receive fewer meaningful comments or generate engagement that seems automated. Human users tend to engage in more nuanced and varied ways.
Telegram
Telegram is a cloud-based instant messaging app that allows users to send messages, share media, and create groups and channels for broadcasting messages to large audiences. It is known for its focus on privacy, encryption, and speed. Telegram channels are often used by members of the criminal underground to exchange ideas and information such as breach data and information stealer logs. Additionally, they serve as marketplaces where illicit goods and services, including stolen credentials, hacking tools, and illegal substances, are bought and sold. Telegram has a built-in search functionality which can help you find users and channels. You can also use Google dorks to help find Telegram content.
QUERY – EXPLANATION
• site:t.me – Searches for pages that contain Telegram links from the t.me domain
• site:t.me "cybersecurity" – Adds specific keywords related to the topic (e.g., "cybersecurity") to find relevant Telegram links
• filetype:pdf "telegram invite" – Looks for PDF documents containing Telegram invite links; can also use filetype:txt, filetype:doc, etc.
• site:reddit.com "telegram invite" "cybersecurity" – Combines keywords with site-specific searches to find Telegram invites posted on Reddit, often shared in discussions
• inurl:t.me "trading group" – Uses the inurl operator to search for URLs that contain specific text, such as "t.me", combined with keywords
• site:forum.example.com "telegram invite" – Targets specific forums (replace forum.example.com with the URL of the forum) to find Telegram invites
• intitle:"telegram invite" "cryptocurrency" – Uses the intitle operator to search for pages with specific words in the title, useful for finding blog posts or articles mentioning Telegram invites
• site:twitter.com "t.me" "book club" – Targets social media platforms where users share Telegram invites by combining the site operator with keywords
• site:pastebin.com "t.me" – Pastebin and similar sites are often used to share collections of links and can be a source for Telegram invites
• site:facebook.com "telegram invite" "fitness" – Targets specific community-driven platforms like Facebook to find groups that share Telegram server links
• site:t.me "hacking group" – Searches for pages containing Telegram invites specifically mentioning "hacking group"
• site:reddit.com/r/OSINT
Specific Prompt Error Messages to Search For
• "As an AI language model"
• "Cannot provide a phrase"
• "I'm sorry, I cannot generate"
• "Violates OpenAI's content policy"
• "Not a recognized word"
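The prompt error messages listed in this section can also be checked programmatically against text you have already collected. The following is a minimal sketch that assumes the text is available as a Python string; the function name find_prompt_errors and the sample text are hypothetical. A match is only a lead to investigate, not proof that content is AI-generated.

```python
# Phrases taken from the "Specific Prompt Error Messages to Search For" list.
PROMPT_ERROR_PHRASES = [
    "As an AI language model",
    "Cannot provide a phrase",
    "I'm sorry, I cannot generate",
    "Violates OpenAI's content policy",
    "Not a recognized word",
]


def find_prompt_errors(text):
    """Return the listed phrases that occur in text, ignoring case."""
    # Treat typographic apostrophes as plain ones before comparing.
    normalized = text.replace("\u2019", "'").lower()
    return [p for p in PROMPT_ERROR_PHRASES if p.lower() in normalized]


# Hypothetical sample text, for illustration only.
sample = "Great product! As an AI language model, I cannot review items."
matches = find_prompt_errors(sample)
print(matches)
```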
Qcc.com & Qixin.com
make acquiring information from EDGAR much easier. The tool is available here: https://github.com/bellingcat/EDGAR
Executive Summary (AKA Key Findings, Bottom Line Up Front (BLUF), etc.)
Unfortunately, no matter how great a report is, some people will only spend a few minutes reviewing it. To make sure that we convey the most important findings to those individuals, we need an executive summary section at the beginning of our report.
Methodology
One of the key components of digital forensics is that if two forensics practitioners receive the same evidence, analyze it, and report their findings, they should both come to very similar conclusions. OSINT is similar, and the methodology section is where you describe at a high level the methods used in your research and analysis.
Methodology helps others understand how something was accomplished so another practitioner can verify the results. When results are scarce, the methodology section is more important than ever to let others know what you tried. This not only conveys how much effort you put into your research, but can help others avoid wasting time trying the same techniques you did.
Findings
The findings section likely represents the bulk of your report, and in addition to your writeup, it can include images, screenshots, tables, etc.
One of the most important things for an analyst to provide is an explanation of why things are important. Analysts who focus on a particular area often believe that what they know, others know as well, but that's often not the case. You likely have knowledge or context that others viewing the report may not, so you need to convey that information where possible to help answer the "so what?" question.
Appendices
One of the most difficult parts of report writing is meeting the needs of a majority of people who read your report. As a general rule, shorter reports are preferred over large ones. The executive summary meets the needs of those only spending a few minutes with your report, while the main body of the report meets the needs of someone spending a little more time. But what about another OSINT practitioner who receives a task based off your report? The report you generate may be the end of the journey on that particular topic, but someone else may receive that report and use it as the starting point for their own research.
In the case of your report being a starting point, that individual would likely appreciate more details, including some of the "raw" information that anyone else reading the report would likely just gloss over. To help meet this goal, appendices are a fantastic tool. Raw output from tools or other sources that are summarized in your report can be placed in an appendix and included as an attachment with your report. Having an executive summary, a concise report, and additional raw information included as appendices can help ensure that your report meets the diverse needs of all readers.
The Admiralty Code (AKA NATO System) is a relatively simple scheme for categorizing evidence according to its credibility. It was initially used by the British Admiralty for the assessment of evidence used in naval intelligence, but it is now used in many police departments, intelligence agencies, and defense-related organizations, including the U.S. Army. (U.S. Army Field Manual 2-22.3, 2006)
EXPECTED RELIABILITY OF THE SOURCE
A1 B1 C1 D1 E1 F1
A2 B2 C2 D2 E2 F2
A3 B3 C3 D3 E3 F3
A4 B4 C4 D4 E4 F4
A5 B5 C5 D5 E5 F5
A6 B6 C6 D6 E6 F6
• Credible – accept
• Uncertain – investigate/wait
• Non-credible – reject
THE SOURCE
• A Reliable – No doubt of authenticity, trustworthiness, or competency, has a history of complete reliability
• B Usually reliable – Minor doubt about authenticity, trustworthiness, or competency, has a history of valid information most of the time
• C Fairly reliable – Doubt of authenticity, trustworthiness, or competency, but had provided valid information in the past
• D Not usually reliable – Significant doubt about authenticity, trustworthiness, or competency, but had provided valid information in the past
• E Unreliable – Lacking authenticity, trustworthiness, or competency, has a history of invalid information
• F Cannot be judged – No basis for evaluating the reliability of the source
THE CONTENT
• 1 Confirmed – Confirmed by other independent sources, logical in itself, consistent with other information on the subject
• 2 Probably true – Not confirmed, logical in itself, consistent with other information on the subject
• 3 Possibly true – Not confirmed, reasonably logical, agrees with some other information on the subject
• 4 Doubtfully true – Not confirmed, possible but not logical, no other information on the subject
• 5 Improbable – Not confirmed, not logical in itself, contradicted by other information on the subject
• 6 Misinformation – Unintentionally false, not logical in itself, contradicted by other information on the subject, confirmed by other independent sources
• 7 Deception – Deliberately false, contradicted by other information on the subject, confirmed by other independent sources
• 8 Cannot be judged – No basis for evaluating the reliability of the source
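An Admiralty Code rating pairs a source letter with a content number, for example B2. The lookup below is a minimal sketch that expands such a rating into the labels used on this poster; the function name describe_rating and the output wording are illustrative assumptions.

```python
# Labels for the source (letter) and content (number) scales on this poster.
SOURCE_RELIABILITY = {
    "A": "Reliable",
    "B": "Usually reliable",
    "C": "Fairly reliable",
    "D": "Not usually reliable",
    "E": "Unreliable",
    "F": "Cannot be judged",
}
CONTENT_CREDIBILITY = {
    "1": "Confirmed",
    "2": "Probably true",
    "3": "Possibly true",
    "4": "Doubtfully true",
    "5": "Improbable",
    "6": "Misinformation",
    "7": "Deception",
    "8": "Cannot be judged",
}


def describe_rating(rating):
    """Expand a two-character rating such as 'B2' into its two labels."""
    code = rating.strip().upper()
    if (
        len(code) != 2
        or code[0] not in SOURCE_RELIABILITY
        or code[1] not in CONTENT_CREDIBILITY
    ):
        raise ValueError(f"not a valid rating: {rating!r}")
    source = SOURCE_RELIABILITY[code[0]]
    content = CONTENT_CREDIBILITY[code[1]]
    return f"{code}: source {source}, content {content}"


example = describe_rating("b2")
print(example)
```

Expanding the rating next to each finding in a report helps readers who do not know the scheme understand how much weight a piece of evidence deserves.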