Information Gathering

⚡ Prerequisites

  • Basic familiarity with Linux

  • Basic familiarity with web technologies

📕 Learning Objectives

  • Differences between active and passive information gathering

  • Perform passive and active information gathering with various tools and resources

๐Ÿ—’๏ธ Information gathering (Reconnaissance) is the initial stage of any penetration test and one of the most important phase.

  • It involves finding out as much information as possible about a targeted individual, website, company or system.

  • The more information a pentester has on a target, the easier and more successful the later stages of the pentest will be. How much to gather depends on the scope of the penetration test.

  • E.g.1 - Pentest on a Website: web technology, vulnerabilities, IP address of the hosting server.

  • E.g.2 - Pentest on public-facing assets and some internal systems, where there can be more attack vectors:

    • gain access to the internal network through the public-facing web server (one access vector)

    • during the info-gathering phase, learn more about the company's employees (names, email addresses, credentials); this information, useful for exploitation or initial access, can be obtained through phishing attacks or malicious email attachments (another access vector)

Passive Information Gathering Introduction

๐Ÿ—’๏ธ Passive information gathering involves obtaining as much data as possible without actively interacting with the target.

  • The pentester uses what's available on the Internet.

  • E.g. - Website: utilizing publicly accessible information and resources of that particular website, through the browser, public IP address of the webserver hosting that website, etc.

What passive information?

  • IP addresses, DNS, domain names and domain ownership

  • Email addresses, social media profiles

  • Web technologies, subdomains

Active Information Gathering Introduction

๐Ÿ—’๏ธ Active information gathering involves obtaining as much information as possible by actively engaging with the target.

โ—An authorization is required to conduct active information gathering.

  • The target will be aware of the attacker's engagement.

  • E.g. - Website: perform a port scan of the webserver IP address (found with passive info gathering) using nmap tool to identify the open ports and running services. Identify exploitable vulnerabilities on those services and consequently access the web server.

What active information?

  • Open ports, internal network/organization infrastructure

  • Enumeration of target information

Code of conduct

📌 From The Pentester's Code of Conduct - by Sherri Davidoff

  • Know your scope.

  • Do not exceed your scope.

  • Take responsibility.

  • Only hack when under signed contract.

  • Verify your targets well in advance of the start of an engagement, and have the list in writing.

  • Do a thorough and complete job.

  • Take careful notes.

  • Upload your evidence to a central repository as soon as you can.

  • Know your client.

  • Communicate with your teammates, your client, and your project managers.

  • Know your limitations and do not exceed them.

  • Treat all others with respect.

  • Own your mistakes.

  • Include your best suggestions for a solution when reporting a problem.

  • Google first, then ask questions.

  • Share your knowledge.

  • Above all, exercise common sense.

🚩🔬 The zonetransfer.me domain will be used for training purposes and examples.


Passive Information Gathering

Website Reconnaissance & Footprinting

๐Ÿ—’๏ธ Footprinting is like reconnaissance, with more important information about a particular target.

What to look for in a Website?

IP addresses of the web server

Hidden directories

Names, Email addresses

Phone numbers

Physical Addresses

Web technologies

E.g. - Passive Reconnaissance on hackersploit.org:

host command

  • 2 IP addresses found - the website is behind Cloudflare proxy.

  • Social Links at the bottom of the main page:

robots.txt file - https://hackersploit.org/robots.txt

  • The "Disallow" directive lets the site owner designate which files or folders search engine crawlers should not crawl or index.

  • /wp-content indicates that the website is running WordPress

hackersploit.org/robots.txt
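The Disallow check described above can be sketched in Python. This is a minimal illustration, assuming the file has already been saved locally; the sample text and the helper names `disallowed_paths` and `looks_like_wordpress` are made up for this example and are not the real hackersploit.org content.

```python
def disallowed_paths(robots_txt):
    """Return the paths listed in Disallow rules, in file order."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:  # an empty Disallow value blocks nothing
                paths.append(path)
    return paths


def looks_like_wordpress(paths):
    """Heuristic only: WordPress sites commonly list /wp-admin or /wp-content."""
    return any(p.startswith("/wp-") for p in paths)


# Hypothetical sample, not the real file of any site.
SAMPLE = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-content/uploads/  # media files
Allow: /wp-admin/admin-ajax.php
Disallow:
"""

paths = disallowed_paths(SAMPLE)
print(paths)
print("WordPress hint:", looks_like_wordpress(paths))
```

Disallowed paths are worth noting because they often point at admin panels or directories the owner did not want indexed.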

sitemap.xml file - https://hackersploit.org/sitemap.xml

  • Used to provide search engines with an organized way of indexing the website.

  • List of site pages, categories, author, etc

hackersploit.org/sitemap.xml
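A sitemap can be turned into a flat list of pages with Python's standard XML parser. The sketch below assumes a plain `urlset` sitemap using the standard sitemaps.org namespace; the sample document and the `sitemap_urls` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


# Hypothetical sample, not the real sitemap of any site.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/</loc></url>
  <url><loc>https://example.com/author/admin/</loc></url>
</urlset>
"""

for url in sitemap_urls(SAMPLE):
    print(url)
```

Author and category pages listed this way can reveal usernames and site structure without any request beyond fetching the sitemap itself.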

Browser add-ons for Web Technology footprinting:

  • Wappalyzer - find out the technology stack of the website

Wappalyzer Example

whatweb command

whatweb hackersploit.org

Download the entire website, for analyzing the source code for example:

HTTrack Website copier

Whois Enumeration

Whois lookups are used to identify information regarding a particular domain.

  • Date of registration, Owner, Registrar, Owner Email address, etc

  • WHOIS is a query and response protocol that is widely used for querying databases that store the registered users or assignees of an Internet resource, such as a domain name, an IP address block or an autonomous system, but is also used for a wider range of other information. - Whois - Wikipedia

whois command

whois hackersploit.org
whois 172.64.32.93
domaintools.com example
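Since WHOIS is a simple query and response protocol over TCP port 43, the exchange and a basic parse of the reply can be sketched in Python. The `whois_query` and `parse_whois` names and the sample record are assumptions made for illustration; real registries format their replies differently, so treat the parser as a starting point only.

```python
import socket


def whois_query(server, query, timeout=10):
    """Send one WHOIS query (TCP port 43) and return the raw text reply."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(text):
    """Collect 'Key: value' lines into a dict of lists (keys can repeat)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("%", "#", ">>>")):
            continue  # skip blanks, comments and footer markers
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields.setdefault(key.strip(), []).append(value.strip())
    return fields


# Hypothetical reply, not a real registration record.
SAMPLE = """\
Domain Name: EXAMPLE.ORG
Creation Date: 2018-01-01T00:00:00Z
Registrar: Example Registrar, Inc.
Name Server: NS1.EXAMPLE.NET
Name Server: NS2.EXAMPLE.NET
"""

record = parse_whois(SAMPLE)
print(record["Registrar"], record["Name Server"])
```

The network function is defined but not called here, so the example runs offline against the sample text.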

Website Footprinting with Netcraft

Netcraft provides internet security services for a large number of use cases, including cybercrime detection and disruption, application testing and PCI scanning.

  • It collates information that would otherwise be gathered with several separate tools and presents it in an easy-to-read report.

E.g. - Netcraft - Hackersploit.org - check the information needed for the pentest:

  • Background

  • Network: domain IP address, Nameserver, Domain registrar, IP delegation

  • SSL/TLS Certificate: Issuer, Validity, Transparency, vulnerabilities

  • Hosting History

  • Web Trackers

  • Site Technology: Server-Side, Client-Side, Frameworks, etc

sitereport.netcraft.com example

DNS Reconnaissance

๐Ÿ—’๏ธ DNS Recon is used to identify DNS records associated to a domain, like A record, IP address, mail server IP.

dnsrecon tool - a Python script that provides the ability to perform NS/DNS Records Enumeration, records lookup, subdomain brute force, etc.

dnsrecon --help
dnsrecon -d hackersploit.org
dnsrecon -d zonetransfer.me

dnsdumpster.com site

  • discover hosts related to a domain

  • map the domain in a graph .png image or .xlsx file.

dnsdumpster.com example
dnsdumpster.com export
zonetransfer.me Domain map - from dnsdumpster.com

WAF

Web Application Firewall (WAF) detection with wafw00f.

It does the following:

  • Sends a normal HTTP request and analyses the response; this identifies a number of WAF solutions.

  • If that is not successful, it sends a number of (potentially malicious) HTTP requests and uses simple logic to deduce which WAF it is.

  • If that is also not successful, it analyses the responses previously returned and uses another simple algorithm to guess if a WAF or security solution is actively responding to our attacks.

wafw00f -h
wafw00f -l
wafw00f hackersploit.org
wafw00f hackertube.net
wafw00f hackertube.net -a
wafw00f zonetransfer.me -a
  • This would be definitely tested within the active information gathering phase with a port scan on the webserver IP address.

Subdomain Enumeration with Sublist3r

To identify the subdomains of a specific domain in a passive way, publicly available resources and databases can be utilized.

sublist3r tool - a Python tool that enumerates subdomains of websites using OSINT (Open-Source Intelligence).

  • this example is NOT active enumeration - it is passive (using publicly available resources)

  • it enumerates subdomains using search engines (Google, Yahoo, Bing ...) and other tools (Netcraft, Virustotal, DNSdumpster, ReverseDNS, ThreatCrowd).

sublist3r -h
sublist3r -d ine.com

Google Dorks

๐Ÿ—’๏ธ Google Dorking/Hacking can be utilized to identify public information pertinent to a target.

  • Search filters for specific subdomains, files, etc using google.com.

  • First try to directly search for the specific domain and look for useful information.

"ine.com" on Google Search

site:

  • limit all results to the particular domain/site

  • shows subdomains for that particular domain

"site:ine.com" on Google Search
"site:ine.com employees" standard Google Search

inurl:

  • look for results that contain a specific word in the page URL

  • e.g. - inurl:admin , etc.

"site:ine.com inurl:forum" on Google Search

site:*.site.com

  • show subdomains (indexed by Google) for a particular domain

  • usually they are exposed subdomains

    • sometimes unintended exposed subdomains

"site:*.ine.com" on Google Search

intitle:

  • limit the results to pages with a specific word in the page title

"site:*.ine.com intitle:forum" on Google Search

filetype:

  • limit the results to files of a specific type (file extension)

  • make the search query a bit more specific

"site:*.ine.com filetype:pdf" on Google Search

intitle:index of

  • look for sites with directory listing enabled, searching for index of

  • a common web server misconfiguration that weakens security

  • directory listing allows users to see the content of the directory

"intitle:index of" on Google Search

cache:

  • shows the cached website

"cache:ine.com" on Google Search
  • Other Google dorking examples:

    • inurl:auth_user_file.txt

    • inurl:passwd.txt

    • inurl:wp-config.bak
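The operators above combine freely, so it can help to build queries programmatically. The sketch below is one possible way to do that in Python; `build_dork` and `search_url` are hypothetical helper names, and the code only produces query strings and URLs without sending any request.

```python
from urllib.parse import urlencode


def build_dork(terms="", **operators):
    """Join operator:value pairs (site, inurl, intitle, filetype, ...) into one query."""
    parts = [f"{name}:{value}" for name, value in operators.items()]
    if terms:
        parts.append(terms)
    return " ".join(parts)


def search_url(query):
    """Return a Google search URL with the query properly URL-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_dork(site="ine.com", filetype="pdf")
print(query)
print(search_url(query))
print(build_dork("employees", site="ine.com"))
```

Keeping the queries in a script makes it easy to rerun the same set of dorks against each in-scope domain.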

Google Hacking Database - exploit-db.com

  • use it to search for Dorks by Category to find potentially unsecured files

Wayback Machine

  • a digital archive by the Internet Archive

  • captures/snapshots web pages over time

  • check earlier versions of websites

  • older versions of a website may leak useful sensitive information that has since been removed

ine.com on Wayback Machine
01.09.2012 - ine.com on Wayback Machine
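Wayback Machine snapshots are addressed by a timestamp placed in the URL path, so a link to a capture near a given date can be built directly. This is a small sketch with a hypothetical `wayback_url` helper; it assumes the 01.09.2012 capture above refers to 1 September 2012, and the archive redirects to the closest capture it actually holds.

```python
from datetime import date


def wayback_url(target, when):
    """Build a Wayback Machine URL for the capture closest to the given date."""
    return f"https://web.archive.org/web/{when:%Y%m%d}/{target}"


print(wayback_url("ine.com", date(2012, 9, 1)))
```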

Email Harvesting

theHarvester tool - an open-source Python tool that performs OSINT gathering to help determine a domain's external threat landscape.

  • used to enumerate the emails (names, IPs, URLs, subdomains) belonging to a domain target, using publicly available resources and databases.

  • check the GitHub repository for more information on the Passive and Active information gathering and Installation.

  • In this case the tool is used for Email Harvesting.

theHarvester -h
theHarvester -d hackersploit.org
theHarvester -d hackersploit.org -b dnsdumpster,duckduckgo,crtsh
theHarvester -d zonetransfer.me -b all
  • Harvested email addresses could be used to send phishing emails with malicious attachments during an attack.
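The core of email harvesting is pulling addresses for the target domain out of publicly available text. The following Python sketch shows that step only; it is not how theHarvester is implemented, the regular expression is a deliberately simple approximation, and the sample text and `harvest_emails` name are made up.

```python
import re

# Simple approximation of an email address; not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest_emails(text, domain):
    """Return unique, lower-cased addresses at the domain or its subdomains."""
    found = {match.lower() for match in EMAIL_RE.findall(text)}
    exact = "@" + domain.lower()
    sub = "." + domain.lower()
    return sorted(e for e in found if e.endswith(exact) or e.endswith(sub))


# Hypothetical page text.
SAMPLE = (
    "Contact Admin@Example.com or sales@mail.example.com. "
    "Press: press@other.org. Support: admin@example.com"
)

print(harvest_emails(SAMPLE, "example.com"))
```

Filtering by domain keeps out-of-scope addresses out of the results, which matters when the scope of the engagement is limited to one organization.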

Leaked Password Databases

Leaked email or account passwords can potentially be found and reused in a credential stuffing or password spraying attack, i.e. testing the discovered passwords for authentication on many other services (this step is not part of passive information gathering).

  • Leaked online password databases can be utilized, usually coming from a site data breach containing the users credentials.

haveibeenpwned.com site by Troy Hunt

  • safe, reliable, no signup

  • insert the found target email in the site to check for data breaches

  • for older emails there is a greater chance of finding data breaches!

haveibeenpwned.com - clean
haveibeenpwned.com - breached

Active Information Gathering

DNS Zone Transfers

๐Ÿ—’๏ธ Check my basic DNS theory notes here.

๐Ÿ“Œ More in depth explanations about DNS can be found at the Cloudflare Learning Center.

๐Ÿ”ฌ Training list: Check some PentesterAcademy/INE DNS Network Pentesting Labs (subscription required)

  • Most common DNS record types:

| Record Type | Description |
| --- | --- |
| A | Holds/Resolves the IPv4 address of a domain/hostname |
| AAAA | Holds/Resolves the IPv6 address of a domain/hostname |
| CNAME | Used for domain aliases, forwards one domain/subdomain to another domain |
| MX | Resolves a domain to a mail server |
| TXT | Used for admin text notes, often used for email security |
| NS | Reference to the domain's name servers |
| SOA | Stores admin information about a domain (domain authority) |
| HINFO | Host information |
| SRV | Specific services records |
| PTR | Resolves an IP address to a hostname - reverse lookups |
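Each record type has a numeric code that is placed in the question section of a DNS query. The Python sketch below builds a raw query message by hand to show where the type code goes; `build_query` and `encode_name` are hypothetical helper names, the transaction ID is arbitrary, and nothing is sent over the network.

```python
import struct

# Numeric QTYPE codes for the record types listed above.
QTYPES = {
    "A": 1, "NS": 2, "CNAME": 5, "SOA": 6, "PTR": 12,
    "HINFO": 13, "MX": 15, "TXT": 16, "AAAA": 28, "SRV": 33,
}


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, rtype, txid=0x1234):
    """Build a DNS query message: 12-byte header plus one question."""
    # Flags 0x0100 = standard query with recursion desired; one question.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", QTYPES[rtype], 1)  # class IN
    return header + question


packet = build_query("zonetransfer.me", "MX")
print(len(packet), packet.hex())
```

Tools such as dnsrecon, dnsenum and dig construct messages like this one for each record type they enumerate.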

๐Ÿ—’๏ธ Enumerating DNS records for a particular domain is done through a procedure known as DNS Interrogation.

  • Probe a DNS server to provide additional records and information (domain IP address, subdomains, mail server addresses, etc)

To obtain more records from a DNS server with regards to a particular domain, DNS Zone Transfers may be useful:

  • A zone transfer takes place when a system admin copies or transfers zone files (containing domain records) from one DNS server to another.

  • This functionality can be abused by attackers when left misconfigured, to copy the zone file from the primary DNS to another DNS server.

  • It can give penetration testers a complete picture of the network architecture of an organization and internal network addresses may be found.

An IP address can be mapped to a local (or external) specific domain name using the /etc/hosts file:
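As a rough illustration of the hosts file format (one IP address followed by one or more names per line), the Python sketch below parses hosts-style text into a lookup table. The entries and the `parse_hosts` helper are hypothetical lab values, and the first-match behaviour mirrors how resolvers typically read the file.

```python
def parse_hosts(text):
    """Map each hostname to its IP address from hosts-file style text."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry for a name
    return mapping


# Hypothetical lab entries.
SAMPLE = """\
127.0.0.1   localhost
10.10.10.5  target.local www.target.local  # lab web server
"""

hosts = parse_hosts(SAMPLE)
print(hosts["www.target.local"])
```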

Passive reconnaissance here - using dnsdumpster.com, dnsrecon

Active reconnaissance

dnsenum tool - a multithreaded Perl script to enumerate DNS information of a domain and to discover non-contiguous IP blocks

  • enumerate public DNS records

  • perform automatic DNS zone transfer

  • perform DNS brute force on subdomains

dnsenum --help
  • The two name servers of ZoneTransfer.me are nsztm1.digi.ninja and nsztm2.digi.ninja

    • DNS Zone transfer functionality must be ON on the Name Servers.

    • Identify subdomains and internal IP addresses from the Zone Transfer results.

Check comments below

  • dnsenum can fail if zone transfer is disabled (e.g. Cloudflare NS)

dnsenum hackersploit.org - failed Zone Transfers

dig tool - query DNS name servers

  • AXFR zone transfers are the full DNS zone transfers of all DNS data. The Primary DNS server sends the whole zone file that contains all the DNS records to the Secondary DNS servers. This assures that the secondary DNS server is well synced. It will have all the latest changes that were made to the Master DNS zone.
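A transfer is typically requested with a command such as `dig axfr @nsztm1.digi.ninja zonetransfer.me`, and the reply lists one record per line. The Python sketch below parses output in that general shape and flags A records that point at RFC 1918 private addresses; the sample records are invented, and `parse_axfr` and `internal_hosts` are hypothetical helper names.

```python
import ipaddress

# Private IPv4 ranges defined by RFC 1918.
RFC1918 = [ipaddress.ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def parse_axfr(text):
    """Parse 'name TTL IN TYPE rdata' lines into (name, type, rdata) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):  # skip blanks and dig comments
            continue
        parts = line.split(None, 4)
        if len(parts) == 5 and parts[2] == "IN":
            name, _ttl, _cls, rtype, rdata = parts
            records.append((name.rstrip("."), rtype, rdata))
    return records


def internal_hosts(records):
    """Return (name, ip) pairs whose A record is a private address."""
    hits = []
    for name, rtype, rdata in records:
        if rtype == "A" and any(ipaddress.ip_address(rdata) in net for net in RFC1918):
            hits.append((name, rdata))
    return hits


# Hypothetical transfer output, not real zonetransfer.me data.
SAMPLE = """\
; <<>> DiG <<>> axfr example.test @ns1.example.test
example.test.          7200 IN SOA ns1.example.test. admin.example.test. 1 172800 900 1209600 3600
example.test.          7200 IN NS  ns1.example.test.
www.example.test.      7200 IN A   203.0.113.10
intranet.example.test. 300  IN A   10.0.0.15
vpn.example.test.      300  IN A   192.168.1.20
"""

records = parse_axfr(SAMPLE)
print(internal_hosts(records))
```

Internal addresses exposed this way hint at the organization's private network layout even though the hosts themselves are not reachable from the Internet.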

fierce tool - a semi-lightweight scanner that helps locate non-contiguous IP space and hostnames against specified domains, using DNS primarily

โœ๏ธ "Zone transfers are rare these days, but they give us the keys to the DNS castle." fierce - Geeksforgeeks

Host Discovery with Nmap

nmap - open source security tool for network exploration, security scanning and auditing.

man nmap
  • E.g. - Discover all the devices on a target network using a ping sweep (ping scan) with Nmap.

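    • -sn option - Ping Scan (ping sweep), disable port scan. It finds the responding hosts. When run with root privileges, -sn sends by default (on a local Ethernet network, ARP requests are used instead):

      • an ICMP echo request

      • a TCP SYN to port 443

      • a TCP ACK to port 80

      • an ICMP timestamp request

      • run -sn with sudo to get the full probe set; without privileges nmap only attempts TCP connections to ports 80 and 443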

  • Copy the found IPs for future references and move on to the port scan phase on each of them.
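To see what a ping sweep covers, the Python sketch below expands a CIDR range into the host addresses that would be probed and builds the matching nmap command line. The 192.168.31.0/24 range is only an example value, `sweep_targets` and `ping_sweep_command` are hypothetical helpers, and the command is printed rather than executed.

```python
import ipaddress


def sweep_targets(cidr):
    """List the usable host addresses inside a CIDR range."""
    return [str(host) for host in ipaddress.ip_network(cidr, strict=False).hosts()]


def ping_sweep_command(cidr):
    """Build the nmap ping sweep command as an argument list."""
    return ["sudo", "nmap", "-sn", cidr]


targets = sweep_targets("192.168.31.0/24")
print(len(targets), targets[0], targets[-1])
print(" ".join(ping_sweep_command("192.168.31.0/24")))
```

A /24 network yields 254 usable addresses because the network and broadcast addresses are excluded.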

netdiscover - an active/passive ARP discovering tool

  • it utilizes ARP requests

netdiscover -i eth1 -r 192.168.31.0/24

📌 Nmap Command Examples

Port Scanning With Nmap

🔬 Training list: PentesterAcademy Windows Recon - Host Discovery (subscription required)

Use nmap to identify open ports and the respective running services on a target system.

  • Enumerate as much information as possible

  • E.g. - perform port scanning on TCP and UDP ports, using a few techniques
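The simplest port scanning technique is a TCP connect scan: attempt a full connection to each port and record the ones that accept. The Python sketch below shows the idea with a hypothetical `tcp_connect_scan` helper and demonstrates it against a listener it opens itself on localhost; it is far less capable than nmap and, like any active scan, should only be pointed at systems you are authorized to test.

```python
import socket


def tcp_connect_scan(host, ports, timeout=0.5):
    """Return the ports on host that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            if sock.connect_ex((host, port)) == 0:  # 0 means the connect succeeded
                open_ports.append(port)
    return open_ports


# Demo against a local listener so no external host is touched.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
print(tcp_connect_scan("127.0.0.1", [demo_port]))
listener.close()
```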

Lab with Nmap

🔬 Nmap Host Discovery LAB

  • Windows systems will typically block ICMP ping probes, resulting in a "host down" response from the nmap command.

nmap <WIN_TARGET_IP>

-Pn option - skip host discovery (skip ping)

nmap -Pn <WIN_TARGET_IP>
  • Try to access the webserver with a browser:

Port 80 - HttpFileServer

-p- - Scan the entire range of TCP ports (65535 ports)

  • the scan will take longer

-p <PORTS_LIST> - Scan one or more specific TCP ports:

  • if a port state is filtered, nmap cannot tell whether the port is open because packet filtering (e.g. a firewall) is blocking its probes

-F - fast mode, scan 100 of the most commonly used ports

-v - increase verbosity, see background scanning info

-sU - UDP scan

  • always try to do a UDP port scan (DNS service, etc). Default nmap scan performs only TCP scans.

-sV - probe open ports to determine service/version info

-O - Operating System detection, based on TCP/IP stack fingerprinting of the target's responses

  • the result is sometimes inaccurate

  • a penetration tester can start to identify specific O.S. version vulnerabilities and exploits

-sC - default nmap script scan

  • under each service, nmap will run a series of scripts based on the service

-A - Aggressive scan: OS detection, version detection, script scanning and traceroute (-sV + -O + -sC + --traceroute)

-T# - nmap Timing templates - optimize and speed up scanning (higher is faster)

  • -T0 - paranoid (possible IDS evasion, slow)

  • -T1 - sneaky (possible IDS evasion, slow)

  • -T2 - polite (less bandwidth and target machine resources, slow)

  • -T3 - normal (default scan)

  • -T4 - aggressive (reasonably fast, modern and reliable network)

  • -T5 - insane (extraordinarily fast network)

  • the lower the number the slower the scan

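-oN <FILE> - write the report to a file in normal format; -oX writes XML, -oG writes grepable output, and -oA <BASENAME> writes all three main formats at once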
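The options above are usually combined in one command. The Python sketch below assembles such a command from keyword arguments so the same scan profile can be reused; `build_nmap_command` is a hypothetical helper, the target 10.10.10.5 is a made-up lab address, and the command is only printed, not run.

```python
import shlex


def build_nmap_command(target, ports=None, udp=False, skip_ping=False,
                       versions=False, os_detect=False, scripts=False,
                       timing=None, output=None):
    """Assemble an nmap argument list from the options described above."""
    cmd = ["nmap"]
    if skip_ping:
        cmd.append("-Pn")
    if udp:
        cmd.append("-sU")
    if versions:
        cmd.append("-sV")
    if os_detect:
        cmd.append("-O")
    if scripts:
        cmd.append("-sC")
    if ports == "all":
        cmd.append("-p-")  # every TCP port
    elif ports:
        cmd += ["-p", ports]
    if timing is not None:
        if timing not in range(6):
            raise ValueError("timing template must be between 0 and 5")
        cmd.append(f"-T{timing}")
    if output:
        cmd += ["-oN", output]  # normal-format report file
    cmd.append(target)
    return cmd


command = build_nmap_command("10.10.10.5", ports="80,443", skip_ping=True,
                             versions=True, timing=4, output="scan.txt")
print(shlex.join(command))
```

Building the argument list separately from running it makes it easy to review exactly what will be sent before the scan starts.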

