Unit 2

Information gathering is the first phase of hacking where hackers gather information about the target through both active and passive techniques. There are two main categories of information gathering - active which directly engages with the target, and passive which uses external sources like search engines. Common information gathering techniques include using social media, search engines, forums, tools like Whois to find owner/server details, reverse IP lookups to find other sites on the same server, and tracing the location of servers. Fingerprinting tools are also used to determine the web server and version in use.

Information Gathering Techniques

Information gathering is the first phase of hacking.

“The more information you have about the target, the greater the chance of successful exploitation.”

In this phase, we gather as much information as possible regarding the target’s online presence, which in turn reveals useful information about the target itself.

The required information depends on whether we are doing a network pentest or a web application pentest.

In the case of a network pentest, our main goal is to gather information about the network; in a web application pentest, we focus on information about the application itself.


All information gathering techniques can be classified into
two main categories:

1. Active information gathering


2. Passive information gathering
Active information gathering: In active information gathering,
we would directly engage with the target, for example, gathering
information about what ports are open on a particular target,
what services they are running, and what operating system they
are using.
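For illustration (a sketch, not taken from these slides), a single Nmap scan answers all three of these questions; -sV probes service versions and -O attempts OS detection, and scanme.nmap.org is a test host that the Nmap project provides for authorized practice scans:

$ sudo nmap -sV -O scanme.nmap.org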

Disadvantages of Active Information Gathering:

1. The techniques involved in active information gathering are very noisy at the other end.

2. They are easily detected by IDS, IPS, and firewalls, and they generate a log of our presence; hence, they are not recommended in some scenarios.
Passive information gathering: In passive information
gathering, we do not directly engage with the target.

We use search engines, social media, and other websites to gather information about the target.

This method is recommended, since it does not generate any log of our presence on the target system.

A common example would be to use LinkedIn, Facebook, and other social networks to gather information about the employees and their interests. This would be very useful when we perform phishing, keylogging, browser exploitation, and other client-side attacks on the employees.
Sources of Information Gathering
There are many sources of information.

The most important ones are as follows:

1. Social media websites
2. Search engines
3. Forums
4. Press releases
5. People search
6. Job sites
Copying Websites Locally

Copying a website locally can be used to investigate it further.

For example, let’s suppose that the file permissions of a configuration file are not set properly. The configuration file might reveal some important information, for example, a username and password, about the target.
There are many tools that can be used to copy websites locally; one of the most comprehensive tools is HTTrack.

If you are on Linux, you can use the wget command to copy a webpage locally:

wget http://www.rafayhackingarticles.net

https://www.youtube.com/watch?v=tvE6C9OVisc

Another great tool is Website Ripper Copier, which has a few additional functions compared to HTTrack.
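If you want wget to copy more than a single page, it also supports recursive mirroring; this is a minimal sketch (the flags are standard wget options, and the URL is only a placeholder):

$ wget --mirror --convert-links --page-requisites --no-parent http://www.example.com

--mirror turns on recursion and timestamping, --convert-links rewrites links so the copy browses correctly offline, --page-requisites pulls in images and stylesheets, and --no-parent keeps the crawl inside the starting directory.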
Information Gathering with Whois:

Whois is a huge database that contains information regarding almost every website on the web; the most common pieces of information are “who owns the website” and “the e-mail of the owner”, which can be used to perform social engineering attacks.

The Whois database is accessible at https://whois.domaintools.com/. A whois client is also available in BackTrack.

You would need to issue the following command from BackTrack to install it:

apt-get install whois

In order to perform a Whois search on a website, you would need to type whois <domainname> from the command line:

whois www.techlotips.com
You would see the output after giving the whois command.

You can see that it reveals some interesting information, such as the e-mail of the owner and the name servers, which show that hostgator.com is hosting this website.
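As a small sketch (not from the slides), the name servers can be filtered out of the raw whois record with grep; the domain is just the example used above:

$ whois techlotips.com | grep -i "name server"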
Finding Other Websites Hosted on the Same
Server
The reverse IP lookup method is used to find the domains hosted on the same server.

Yougetsignal.com allows you to perform a reverse IP lookup on a webserver to detect all other websites present on the same server. All you need to do is enter the domain.

There is another tool called ritx that can also be used to perform a reverse IP lookup.
Tracing the Location
You would need to know the IP address of the webserver in
order to trace the exact location.

There are several methods to figure it out. We will use the simplest one, that is, the ping command.
From your command line, type the following:

ping www.techlotips.com

The output would be as follows:

C:\Users\Rafay Baloch>ping www.techlotips.com

Pinging techlotips.com [50.22.81.62] with 32 bytes of data:
Reply from 50.22.81.62: bytes=32 time=304ms TTL=47
Reply from 50.22.81.62: bytes=32 time=282ms TTL=47
Reply from 50.22.81.62: bytes=32 time=291ms TTL=47
Reply from 50.22.81.62: bytes=32 time=297ms TTL=47
After determining the webserver’s IP, we can use some online
tools to track the exact location of the webserver.

One such tool is IPTracer, which is available at http://www.ip-adress.com/ip_tracer/yourip

Just replace yourip with your target’s IP, and it will show you the exact location of the webserver via Google Maps.

http://www.ip-adress.com/ip_tracer/50.22.81.62
Traceroute
Traceroute is a very popular utility available in both
Windows and Linux.

It is used for network orientation. By network orientation I don’t mean scanning a host for open ports or scanning for services running on a port.

It means figuring out how the network topology, firewalls, load balancers, control points, etc. are implemented on the network.
A traceroute uses the TTL (time to live) field of the IP header: it sends packets with incrementally increasing TTL values in order to determine where each system along the path is.

The time to live value decreases by one every time the packet reaches a hop on the network (i.e., router to server is one hop); the hop at which it reaches zero sends back an error message, revealing itself.
There are 3 different types of traceroutes:

1. ICMP (Internet Control Message Protocol) traceroute (used in Windows by default)
2. TCP (Transmission Control Protocol) traceroute
3. UDP (User Datagram Protocol) traceroute

Traceroute
https://www.youtube.com/watch?v=up3bcBLZS74
ICMP (Internet Control Message Protocol) traceroute:

Microsoft Windows uses ICMP traceroute by default.

However, after a few hops you may get a timeout, which indicates that there might be a device such as an IDS or firewall that is blocking ICMP echo requests.
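For illustration (a sketch, not taken from the slides), the built-in commands look like this; the Linux -I switch asks traceroute to use ICMP echo requests instead of its default UDP probes:

C:\> tracert www.google.com
$ sudo traceroute -I www.google.com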
TCP (Transmission Control Protocol) Traceroute:

Many devices are configured to block ICMP traceroutes. This is where we try a TCP traceroute instead.

$ sudo apt-get install tcptraceroute
$ tcptraceroute www.google.com
UDP (User Datagram Protocol) Traceroute:

$ sudo apt-get install traceroute

$ traceroute www.google.co.in
Enumerating and Fingerprinting the
Webservers

For successful target enumeration, it’s necessary for us to figure out what webserver is running at the back end.
Intercepting a Response

The first thing you should probably try is to send an http request to the webserver and intercept the response; http responses normally reveal the webserver version of many websites.

For that purpose, you would need a web proxy such as Burp Suite. Let’s try to find out the name and version of the webserver running behind www.ptcl.com.pk by using Burp Suite.

Step 1—First, download the free version of Burp Suite from the following website: http://portswigger.net/burp/

Step 2—Next, install Burp Suite and launch it.

Step 3—Next, open Firefox. Note: You can use any browser, but I would recommend Firefox. Go to Tools → Options → Advanced → Network → Settings.

Step 4—Click on “Manual proxy configuration”, enter Burp’s proxy listener address (by default 127.0.0.1, port 8080), and click “OK”.
Step 5—Next, open up Burp Suite again, navigate to the “proxy” tab
and click on the “intercept” tab and click on “intercept is off” to turn it
on.
Step 6—Next, from your Firefox browser, go to www.ptcl.com.pk and
send an http request by refreshing the page. Make sure the intercept is
turned on.

Step 7—Next, we would need to capture the http response in order to view the banner information. Intercepting the response is turned off by default, so we need to turn it on. For that purpose, select the http request, right-click on it, and under “do intercept”, click on “response to this request.”
Step 8—Next, click on the “Forward” button to forward the http
request to the server. In a few seconds, we will receive an http response,
revealing the http server and its version. In this case, it is Microsoft’s
IIS 7.5.
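As a quick command-line alternative to the Burp workflow above (a sketch, not part of the original slides), the same banner can often be read from the Server header with curl:

$ curl -I http://www.example.com

The -I switch makes curl send a HEAD request and print only the response headers, which usually include a line such as “Server: Microsoft-IIS/7.5” when the server does not suppress its banner.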
Acunetix Vulnerability Scanner:

The Acunetix vulnerability scanner also has an excellent webserver fingerprinting feature, and it is freely available from acunetix.com.

Once you’ve downloaded it, launch it and choose to scan a website.

Under “Website”, type your desired website and click “Next”, and it will give you the exact version of the webserver.
WhatWeb

WhatWeb is an all-in-one package for performing active footprinting on a website.

It has more than 900 plug-ins capable of identifying server versions, e-mail addresses, and SQL errors.

The tool is available in BackTrack by default in the /pentest/enumeration/web/whatweb directory, and it can also be installed with:

$ sudo apt-get install whatweb
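A basic run (a sketch; the target below is only a placeholder) takes the site to fingerprint as its argument, with -v for verbose plug-in output:

$ whatweb www.example.com
$ whatweb -v www.example.com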


Google Hacking
Google searches can be a treasure trove for a pentester who uses them effectively.

With Google searches, an attacker may be able to gather some very interesting information, including passwords, about the target.

Google has developed a few search parameters in order to improve targeted searches.
Some Basic Google Search Parameters
Site: The site: parameter restricts results to the web pages of a given domain that Google has indexed. The pages a webmaster does not want crawled are listed in the site’s robots.txt file, which an attacker can easily view.

Ex: <website>/robots.txt, e.g., www.google.co.in/robots.txt (open in a web browser)

https://developers.google.com/search/docs/crawling-indexing/robots/create-robots-txt

Inurl: inurl: is a very useful search query. It can be used to return URLs containing specific keywords.

Ex: site:www.techlotips.com inurl:ceo names

This query will return all URLs on the site containing the given keywords.

Filetype: You can also ask Google to return specific files such as PDF and DOCX by using the filetype: query.

Ex: site:www.msn.com filetype:pdf

Lots of Webmasters of websites that sell e-books and other products forget to block the URL
from being indexed. Using filetype, you can search for these files, and if you are lucky, you
may be able to download products for free.
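A few illustrative dork patterns combining these parameters (a sketch; example.com is only a placeholder target, and results depend entirely on what Google has indexed):

site:example.com filetype:pdf
site:example.com inurl:admin
site:example.com intitle:"index of"
site:example.com -inurl:www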
Google Hacking Database:
The Google Hacking Database has a list of many Google dorks that can be used to find usernames, passwords, e-mail lists, password hashes, and other important information.

https://www.exploit-db.com/

The database is hosted on Exploit-DB, where each entry describes what the dork is meant to find; the site also hosts exploit code that can be downloaded, read, and run.
Xcode Exploit Scanner:
Xcode Exploit Scanner is an automated tool that uses some common Google dorks to scan for vulnerabilities such as SQLI (SQL injection) and XSS (cross-site scripting).
SQLI - a server-side attack
XSS - a client-side attack
(Chapter 12).

File Analysis:
Analyzing the files of the target could also reveal some interesting information such
as the metadata (data about data) of a particular target. (Chapter 8)

Foca:
Foca is a very effective tool that is capable of analyzing files without downloading them. It can search for a wide variety of file extensions across the three big search engines (Google, Yahoo, and Bing). It’s also capable of finding some vulnerabilities such as directory listing and DNS cache snooping.
Harvesting E-Mail Lists

Gathering the e-mail addresses of an organization’s employees can give us a very broad attack vector against the target.

These e-mail lists and usernames can be used later for social engineering attacks and other brute-force attacks.

Luckily, we have lots of built-in tools in BackTrack that can take care of this. One of those tools is theHarvester, written in Python (theHarvester.py).

Now, let’s say that we are performing a pentest on Microsoft.com and that we would like to gather e-mail lists. We will issue the following command:

root@root:/pentest/enumeration/theharvester# ./theHarvester.py -d Microsoft.com -l 500 -b google

The -l parameter allows us to limit the number of search results; for example, here we have limited it to 500 with -l 500.

The -b parameter tells theHarvester which data source to extract the results from, in this case Google.


Harvesting E-Mail Lists

https://www.youtube.com/watch?v=VytCL2ujjcA
Gathering a Wordlist from a Target Website:
After we have gathered e-mail lists from search engines, it is really useful to gather a list of words that we can use for brute-forcing purposes.

CeWL is another excellent tool in BackTrack, which enables you to gather a list of words from the target website; these can later be used for brute-forcing the e-mail addresses we found earlier.

$ cewl google.com -w abc.txt (the output will be written to the abc.txt file)
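A slightly fuller invocation might look like the sketch below; -d and -m are standard CeWL options, and the depth and word-length values are only illustrative choices:

$ cewl -d 2 -m 6 -w abc.txt http://www.example.com

Here -d 2 limits the spider to two link levels, -m 6 keeps only words of at least six characters, and -w writes the resulting wordlist to abc.txt.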
Scanning for Subdomains:
Most webmasters put all their effort into securing their main domain, often ignoring their subdomains. What if an attacker manages to hack into a subdomain and uses it to compromise the main domain?

A very common way of searching for subdomains is by using a simple Google dork. Even though you won’t be able to find all the subdomains with this method, you can find some important ones.

site:msn.com -inurl:www (This query tells the search engine to return results without www, which are normally subdomains.)

theHarvester can also be used for this task; it uses Google to search for subdomains.

Fierce is also an amazing tool for scanning subdomains. It is also capable of bypassing CloudFlare protection.

To scan a host for subdomains, you need to issue the following command from the fierce directory:

$ fierce --domain google.com


Scanning for SSL Version:
SSL stands for Secure Sockets Layer. It is used for encrypting communication.

Since an attacker on the local network could easily sniff the traffic, most highly sensitive communications, such as login pages, use https (port 443).

There are two versions of SSL, namely SSL 2.0 and SSL 3.0. SSL 2.0 is known to be deprecated, as an attacker can easily decrypt the traffic between the client and the server by using various sniffing methods; SSL 3.0 has since been broken as well, so it is highly recommended to use TLS for web pages where highly confidential information is being sent and received.

BackTrack has a great tool, SSLScan, preinstalled, which checks which SSL/TLS versions a server supports.

$ sslscan www.google.com
DNS Enumeration:
Without a domain name, Google.com would just be 173.194.35.144, which is its IP. Imagine having to memorize the IPs of all the websites you visit; surfing the Internet would become really difficult. That’s why the DNS protocol was developed. It is responsible for translating a domain name to an IP address (and back). DNS is one of the most important sources of information on the public and private servers of the target.
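For illustration (a sketch, not from the slides), the standard dig and nslookup utilities perform these lookups from the command line; A, MX, and NS are ordinary DNS record types:

$ dig google.com A
$ dig google.com MX
$ dig google.com NS
$ nslookup google.com

Each query asks the configured resolver for one record type, and together they outline the target’s public DNS footprint.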
