IPQ BDB filter - Homepage
ibd is used as a prefix for IPQ BDB commands. The latter stands for IP Queue based on a Berkeley DataBase (I pronounce it I-peek-you bədəbə, where ə ‐schwa‐ is a reduced vowel). The diagram on the left illustrates the concept, which may apply to a single server, a cluster, or a local subnet.
The blue part represents netfilter, the kernel module normally configured via the iptables command, running on the bastion host. IP tables define what is going to be filtered.
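A minimal rule of this kind might look as follows; the port and queue number are assumptions for illustration, not this project's shipped configuration:

```shell
# Hypothetical example: divert incoming SMTP packets to user-space queue 0,
# where a daemon (such as ibd-judge) issues the accept/drop verdict.
iptables -A INPUT -p tcp --dport 25 -j NFQUEUE --queue-num 0
```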
There are things you should know about netfilter, and a nice flow chart. Netfilter includes a queueing facility (the NFQUEUE target of iptables) that marshals packets to a user-space daemon. ibd-judge
examines the queued packets and looks up the relevant IPv4 addresses in a Berkeley DB. If a record is found, its probability of being blocked is compared to a random number, and the result determines the verdict.
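The verdict step can be sketched as follows; the function name and the probability handling are my illustration, not ibd-judge's actual code:

```python
import random

def verdict(block_probability):
    """Return True (drop) with the stored probability, else False (accept).

    Sketch of the probabilistic verdict: a record holding, say, a 25%
    block probability lets roughly three out of four attempts through.
    """
    return random.random() < block_probability

# An IP with probability 0 is always accepted; a fully escalated
# record (100%) is always dropped.
assert verdict(0.0) is False
assert verdict(1.0) is True
```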
Behind the firewall, web and/or mail daemons listen for client connections. Bad client behavior can be recognized by a local script or by parsing log files. For example, a wrong userid/password pair, non-existent users or web pages, and delivery to spamtraps are symptoms of bad client behavior. Scripts can use ibd-ban, while ibd-parse reads log files.
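For illustration only, a log-driven report might be triggered like this; the log format and regular expression are invented, and a real deployment would instead configure a pattern for ibd-parse:

```python
import re

# Hypothetical log line format; ibd-parse's real configuration differs.
FAILED_LOGIN = re.compile(
    r"authentication failed for user \S+ from (\d+\.\d+\.\d+\.\d+)")

def offending_ips(log_lines):
    """Yield the client IP of every failed-login line."""
    for line in log_lines:
        m = FAILED_LOGIN.search(line)
        if m:
            yield m.group(1)

log = ["Mar  1 10:00:01 imapd: authentication failed "
       "for user bob from 203.0.113.7"]
print(list(offending_ips(log)))  # ['203.0.113.7']
```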
Reported IPs are either inserted into the db, or, if they already exist,
their probability of being blocked is doubled. Thus, while the first attempts
can be harmless, if the client persists the probability grows high enough
to affect the daemon's next verdict for that IP.
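The escalation rule above can be sketched as follows; the dict stands in for the Berkeley DB, and the starting value is an arbitrary example:

```python
def report(db, ip, initial_probability=0.125):
    """Insert a new record, or double an existing record's block
    probability, capping it at 100%. A sketch of the escalation rule;
    the names and the cap are assumptions."""
    if ip in db:
        db[ip] = min(1.0, db[ip] * 2)
    else:
        db[ip] = initial_probability
    return db[ip]

db = {}
for _ in range(4):
    p = report(db, "203.0.113.7")
print(p)  # 1.0 after four reports: 0.125 -> 0.25 -> 0.5 -> 1.0
```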
As time passes the probability of being blocked decays, and the IP is
eventually rehabilitated without human intervention.
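The decay can be sketched as exponential halving; the formula is my reading of the behavior described here, not necessarily the exact implementation:

```python
def decayed(probability, elapsed_seconds, half_life):
    """Probability after elapsed time, halving every half_life seconds."""
    return probability * 0.5 ** (elapsed_seconds / half_life)

# With a one-day half-life, a fully blocked IP (100%) is down to 25%
# after two days, and fades toward rehabilitation on its own.
print(decayed(1.0, 2 * 86400, 86400))  # 0.25
```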
In the twilight range of middle probabilities, gray clients experience occasional timeouts in their attempts to connect to the server, while the server reserves more bandwidth for good clients.
For new records, the initial probability is specified by the initial count; that is, the number of times the IP must be caught to reach 100% probability, doubling the current value each time. The decay is expressed as the number of seconds required for the probability to halve. These values, as well as a human-readable reason, accompany each invocation of ibd-ban and each regular expression configured for ibd-parse.
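On my reading of that definition (an interpretation, not a quote from the code), an initial count C corresponds to a starting probability of 2^(1−C), so that the C-th catch, with doubling, reaches 100%:

```python
def initial_probability(initial_count):
    """Starting probability implied by an initial count: the first catch
    sets this value, each further catch doubles it, and the
    initial_count-th catch reaches 100%. An assumed interpretation."""
    return 2.0 ** (1 - initial_count)

print(initial_probability(4))  # 0.125: 0.125 -> 0.25 -> 0.5 -> 1.0
```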
I've been running this since December 2008. Version 1 in January 2010
changed the database record structure, and I restarted collecting IPs from
scratch.
Berkeley DB's Concurrent Data Store model is simple and effective for
controlling access to the database.
With a nominal record size of 64 bytes, at the moment (March 2011, testing the v1.03 candidate) my block.db is 36 Mbytes, holding 319691 records at an average of 118.54 bytes each. Maintenance consists of running ibd-del once a day. That command (not shown in the diagram) is used to list or delete selected records.
Size is a minor problem, though. The btree algorithm performs in O(log N) time, where N is the number of keys and the logarithm's base is the average number of keys per page. IPv4 addresses make tiny 4-byte keys, so we may easily fit 1000 keys per page, which means that access time would roughly double whenever the total number of mapped addresses is multiplied by one million! With such performance, I could map the entire IPv4 address space, up to the end of the world.
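The arithmetic behind that claim can be checked directly, assuming roughly 1000 keys per page:

```python
import math

def btree_levels(n_keys, keys_per_page=1000):
    """Approximate page lookups for a btree with the given fanout."""
    return max(1, math.ceil(math.log(n_keys, keys_per_page)))

print(btree_levels(319_691))            # 2 levels for the current database
print(btree_levels(319_691 * 10**6))    # 4 levels for a million times more keys
print(btree_levels(2**32))              # 4 levels for the whole IPv4 space
```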