NAME
Linux::SocketFilter::Assembler - assemble BPF programs from textual code
SYNOPSIS
use Linux::SocketFilter;
use Linux::SocketFilter::Assembler qw( assemble );
use IO::Socket::Packet;
use Socket qw( SOCK_DGRAM );
my $sock = IO::Socket::Packet->new(
   IfIndex => 0,
   Type    => SOCK_DGRAM,
) or die "Cannot socket - $!";
$sock->attach_filter( assemble( <<"EOF" ) );
   LD AD[PROTOCOL]
   JEQ 0x0800, 0, 1
   RET 20
   JEQ 0x86dd, 0, 1
   RET 40
   RET 0
EOF
while( my $addr = $sock->recv( my $buffer, 40 ) ) {
   printf "Packet: %v02x\n", $buffer;
}
DESCRIPTION
Linux sockets allow a filter to be attached, which determines which packets are allowed through and which are blocked. Filters are most often used on PF_PACKET sockets that capture network traffic, to select the traffic of interest to the capturing application. Because it runs directly in the kernel, the filter can discard all, or most, of the uninteresting traffic before it reaches the application, giving higher performance through fewer context switches between kernel and userland.
This module allows filter programs to be written in textual code and assembled into a binary filter, which is then attached to the socket using the SO_ATTACH_FILTER socket option.
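As a rough illustration of what "binary filter" means here, the kernel's struct sock_filter holds each instruction as a 16-bit opcode, two 8-bit jump offsets and a 32-bit operand, eight bytes in all. The sketch below packs a single "RET 0" by hand. The helper name pack_insn is hypothetical, and whether assemble() returns exactly a concatenation of such records is an assumption rather than something this page states.

```perl
use strict;
use warnings;

# Classic BPF opcode values as defined in <linux/filter.h>.
use constant {
    BPF_RET => 0x06,    # instruction class: return
    BPF_K   => 0x00,    # operand source: the literal in the k field
};

# Hypothetical helper: pack one instruction in the struct sock_filter layout,
#   u16 code, u8 jt, u8 jf, u32 k, all in native byte order.
sub pack_insn {
    my ( $code, $jt, $jf, $k ) = @_;
    return pack "S C C L", $code, $jt, $jf, $k;
}

my $ret_zero = pack_insn( BPF_RET | BPF_K, 0, 0, 0 );    # "RET 0"

# The hex dump depends on the byte order of the machine running this.
printf "%d bytes: %s\n", length($ret_zero), unpack( "H*", $ret_zero );
```

A program of N instructions would then occupy 8*N bytes under this layout.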
FILTER MACHINE
The virtual machine on which these programs run is a simple load/store register machine operating on 32-bit words. It has one general-purpose accumulator register (A) and one special-purpose index register (X). It has a number of temporary storage locations, called scratchpads (M[]). It is given read access to the contents of the packet to be filtered in 8-bit (BYTE[]), 16-bit (HALF[]) or 32-bit (WORD[]) sized quantities. It also has an implicit program counter, though direct access to it is not provided.
The filter program is run by the kernel on every packet captured by the socket to which it is attached. It can inspect data in the packet and certain other items of metadata concerning the packet, and decide whether the packet should be accepted by the capture socket. It returns the number of bytes to capture if the packet should be captured, or zero to indicate that the packet should be ignored. Execution starts at the first instruction and proceeds forwards, unless the flow is modified by a jump instruction. The program terminates on a RET instruction, which informs the kernel of the required fate of the packet. The last instruction in the filter must therefore be a RET instruction, though others may appear at earlier points.
In order to guarantee termination of the program in all circumstances, the virtual machine is deliberately not Turing-complete. All jumps, conditional or unconditional, may only jump forwards in the program, so it is not possible to construct a loop of instructions that executes repeatedly.
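Since every jump offset is a non-negative count, the program counter strictly increases, and a program of N instructions runs at most N steps provided no jump leaves the program and the final instruction is a RET. A minimal sketch of a checker for those two conditions follows. The function name check_program and the [ opcode, jt, jf ] representation are assumptions made for illustration, not part of this module.

```perl
use strict;
use warnings;

# Each instruction is [ opcode, jt, jf ]; non-jump opcodes leave jt and jf at 0.
# Returns an error string, or undef if no problem was found.
sub check_program {
    my @prog = @_;
    return "empty program" unless @prog;
    return "last instruction is not RET" unless $prog[-1][0] eq 'RET';
    for my $pc ( 0 .. $#prog ) {
        my ( $op, $jt, $jf ) = @{ $prog[$pc] };
        next unless $op =~ /^J/;
        for my $skip ( $jt, $jf ) {
            # A jump lands on instruction $pc + 1 + $skip
            return "jump at $pc leaves the program"
                if $pc + 1 + $skip > $#prog;
        }
    }
    return undef;
}

# The program from the SYNOPSIS, reduced to opcode and jump offsets.
my @synopsis = (
    [ 'LD',  0, 0 ],
    [ 'JEQ', 0, 1 ],
    [ 'RET', 0, 0 ],
    [ 'JEQ', 0, 1 ],
    [ 'RET', 0, 0 ],
    [ 'RET', 0, 0 ],
);
my $error = check_program(@synopsis);
print defined $error ? "rejected: $error\n" : "looks well-formed\n";
```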
FUNCTIONS
$filter = assemble( $text )
Takes a program (fragment) in text form and returns a binary string representing the instructions, packed ready for attach_filter().
The program consists of \n-separated lines of instructions or comments. Leading whitespace is ignored. Blank lines are ignored. Lines beginning with a ; (after whitespace) are ignored as comments.
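The line rules above can be sketched in a few lines of Perl. This is only an illustration of the described behaviour, not the module's own parser, and the helper name instruction_lines is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return only the instruction lines of a program text.
sub instruction_lines {
    my ($text) = @_;
    my @lines;
    for my $line ( split /\n/, $text ) {
        $line =~ s/^\s+//;        # leading whitespace is ignored
        next if $line eq '';      # blank lines are ignored
        next if $line =~ /^;/;    # lines beginning with ';' are comments
        push @lines, $line;
    }
    return @lines;
}

my @insns = instruction_lines( <<'EOF' );
; accept IPv4 only
   LD AD[PROTOCOL]

   JEQ 0x0800, 0, 1
   RET 20
   RET 0
EOF
print "$_\n" for @insns;
```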
INSTRUCTION FORMAT
Each instruction in the program is formed of an opcode followed by its operands. Where numeric literals are involved, they may be given in decimal, hexadecimal, or octal form. Literals are written as lit in the following descriptions.
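This page does not spell out how the three literal forms are distinguished. Assuming the usual C-style prefixes (0x for hexadecimal, a leading 0 for octal), which is an assumption and not something stated here, a literal parser might look like the following sketch. The name parse_literal is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, assuming C-style prefixes for hexadecimal and octal.
sub parse_literal {
    my ($lit) = @_;
    return hex $lit if $lit =~ /^0x[0-9a-f]+$/i;        # e.g. 0x0800
    return oct $lit if $lit =~ /^0[0-7]+$/;             # e.g. 04000
    return $lit + 0 if $lit =~ /^(?:0|[1-9][0-9]*)$/;   # e.g. 2048
    die "Unrecognised literal '$lit'\n";
}

# Three spellings of the same value.
printf "%d %d %d\n",
    parse_literal("2048"), parse_literal("0x0800"), parse_literal("04000");
```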
LD BYTE[addr]
LD HALF[addr]
LD WORD[addr]
Load the A register from the 8, 16, or 32-bit quantity in the packet buffer at the given address. The address may be given in the forms
lit
X+lit
NET+lit
NET+X+lit
The first two forms load from an immediate or X-indexed address relative to the beginning of the buffer; the NET forms do the same relative to the beginning of the network header.
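To make the three access widths concrete, the sketch below mimics them in Perl with unpack, reading multi-byte quantities in network (big-endian) byte order as classic BPF does. The helper name load_from_packet is hypothetical, and returning undef for an out-of-range read is a choice made for this sketch only.

```perl
use strict;
use warnings;

my %FORMAT = ( BYTE => "C", HALF => "n", WORD => "N" );
my %WIDTH  = ( BYTE => 1,   HALF => 2,   WORD => 4 );

# Hypothetical helper: the value that LD BYTE/HALF/WORD[addr] would place
# in A, for an address counted from the start of $packet.
sub load_from_packet {
    my ( $packet, $size, $addr ) = @_;
    return undef if $addr + $WIDTH{$size} > length($packet);
    return scalar unpack "x$addr $FORMAT{$size}", $packet;
}

# A minimal 20-byte IPv4 header: TCP, 192.168.0.1 -> 192.168.0.2
my $ipv4 = pack "C C n n n C C n N N",
    0x45, 0, 40, 0x1234, 0, 64, 6, 0, 0xC0A80001, 0xC0A80002;

printf "BYTE[0]  = 0x%02x\n", load_from_packet( $ipv4, 'BYTE', 0 );
printf "HALF[2]  = %d\n",     load_from_packet( $ipv4, 'HALF', 2 );
printf "BYTE[9]  = %d\n",     load_from_packet( $ipv4, 'BYTE', 9 );
printf "WORD[12] = 0x%08x\n", load_from_packet( $ipv4, 'WORD', 12 );
```

An X+lit address corresponds to calling the helper with the sum of the X value and the literal.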
LD len
Load the A register with the length of the packet.
LD lit
Load the A register with a literal value.
LD M[lit]
Load the A register with the value from the given scratchpad cell.
LD X
TXA
Load the A register with the value from the X register. (These two instructions are synonymous.)
LD AD[name]
Load the A register with a value from the packet auxiliary data area. The following data points are available.
- PROTOCOL
The ethertype protocol number of the packet.
- PKTTYPE
The type of the packet; see the PACKET_* constants defined in Socket::Packet.
- IFINDEX
The index of the interface the packet was received on or transmitted from.
LDX lit
Load the X register with a literal value.
LDX M[lit]
Load the X register with the value from the given scratchpad cell.
LDX A
TAX
Load the X register with the value from the A register. (These two instructions are synonymous.)
LDMSHX BYTE[lit]
Load the X register with a value obtained from a byte in the packet, masked and shifted (hence the name). The byte at the literal address is masked by 0x0f to obtain the lower 4 bits, then shifted 2 bits upwards. This special-purpose instruction loads the X register with the size, in bytes, of an IPv4 header beginning at the given literal address.
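The mask-and-shift arithmetic is easy to check by hand: the low four bits of the first IPv4 header byte hold the header length in 32-bit words, and shifting left by two multiplies that by four. A minimal Perl rendering of the same computation follows, with the hypothetical helper name ldmshx.

```perl
use strict;
use warnings;

# Hypothetical helper: the value LDMSHX BYTE[addr] would place in X.
sub ldmshx {
    my ( $packet, $addr ) = @_;
    my $byte = unpack "x$addr C", $packet;
    return ( $byte & 0x0f ) << 2;
}

# 0x45: IP version 4, header length 5 words (no options)
# 0x4f: IP version 4, header length 15 words (the maximum)
printf "%d %d\n", ldmshx( "\x45", 0 ), ldmshx( "\x4f", 0 );    # prints "20 60"
```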
ST M[lit]
Store the value of the A register into the given scratchpad cell.
STX M[lit]
Store the value of the X register into the given scratchpad cell.
ADD src # A = A + src
SUB src # A = A - src
MUL src # A = A * src
DIV src # A = A / src
AND src # A = A & src
OR src # A = A | src
LSH src # A = A << src
RSH src # A = A >> src
Perform arithmetic or bitwise operations. In each case, the operands are the A register and the given source, which can be either the X register or a literal. The result is stored in the A register.
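Because the machine operates on 32-bit words, results that do not fit are truncated to 32 bits. The sketch below emulates the eight operations in Perl under that reading. It assumes unsigned arithmetic with truncating division and a perl built with 64-bit integers; the helper name alu is hypothetical, and division by zero is not handled.

```perl
use strict;
use warnings;

my %ALU = (
    ADD => sub { $_[0] + $_[1] },
    SUB => sub { $_[0] - $_[1] },
    MUL => sub { $_[0] * $_[1] },
    DIV => sub { int( $_[0] / $_[1] ) },
    AND => sub { $_[0] & $_[1] },
    OR  => sub { $_[0] | $_[1] },
    LSH => sub { $_[0] << $_[1] },
    RSH => sub { $_[0] >> $_[1] },
);

# Hypothetical helper: new value of A after "OP src", truncated to 32 bits.
sub alu {
    my ( $op, $acc, $src ) = @_;
    return $ALU{$op}->( $acc, $src ) & 0xFFFFFFFF;
}

# Isolate the 13-bit fragment offset from the IPv4 flags/offset halfword,
# as "AND 0x1fff" would after loading that halfword into A.
printf "%d\n", alu( 'AND', 0x2000 | 185, 0x1fff );    # prints 185

# Subtraction wraps around instead of going negative.
printf "0x%08x\n", alu( 'SUB', 0, 1 );                # prints 0xffffffff
```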
JGT src, jt, jf # test if A > src
JGE src, jt, jf # test if A >= src
JEQ src, jt, jf # test if A == src
JSET src, jt, jf # test if A & src is non-zero
Jump conditionally based on comparisons between the A register and the given source, which is either the X register or a literal. If the comparison is true, the jt branch is taken; if false, the jf branch. Each branch is a numeric count of the number of instructions to skip forwards.
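Since each branch counts instructions to skip, an offset of 0 simply falls through to the next instruction. The sketch below computes the index of the instruction reached after a conditional jump and applies it to the two JEQ tests of the SYNOPSIS program, whose instructions are numbered 0 to 5 here. The helper name next_pc is hypothetical.

```perl
use strict;
use warnings;

my %TEST = (
    JGT  => sub { $_[0] >  $_[1] },
    JGE  => sub { $_[0] >= $_[1] },
    JEQ  => sub { $_[0] == $_[1] },
    JSET => sub { ( $_[0] & $_[1] ) != 0 },
);

# Hypothetical helper: index of the instruction executed after the
# conditional jump at index $pc, given the current value of A in $acc.
sub next_pc {
    my ( $pc, $op, $acc, $src, $jt, $jf ) = @_;
    return $pc + 1 + ( $TEST{$op}->( $acc, $src ) ? $jt : $jf );
}

# SYNOPSIS program: 0 LD, 1 JEQ 0x0800, 2 RET 20, 3 JEQ 0x86dd, 4 RET 40, 5 RET 0
for my $proto ( 0x0800, 0x86dd, 0x0806 ) {
    my $pc = next_pc( 1, 'JEQ', $proto, 0x0800, 0, 1 );
    $pc = next_pc( 3, 'JEQ', $proto, 0x86dd, 0, 1 ) if $pc == 3;
    printf "protocol 0x%04x ends at instruction %d\n", $proto, $pc;
}
```

Under this reading the three protocols reach instructions 2 (RET 20), 4 (RET 40) and 5 (RET 0) respectively.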
JA jmp
Jump unconditionally forward by the given number of instructions.
RET lit
Terminate the filter program and return the literal value to the kernel.
RET A
Terminate the filter program and return the value of the A register to the kernel.
AUTHOR
Paul Evans <leonerd@leonerd.org.uk>