--- New manual pages for the DIFFUSE kernel module, exporter and collector, and --- an update to the ipfw.8 manual page to describe the DIFFUSE grammar --- extensions. --- --- Sponsored by: FreeBSD Foundation --- Reviewed by: bz --- MFC after: 1 month --- diff -r a5c28de70e9b sbin/ipfw/diffuse_collector/Makefile --- a/sbin/ipfw/diffuse_collector/Makefile Wed Nov 16 19:06:57 2011 +1100 +++ b/sbin/ipfw/diffuse_collector/Makefile Thu Nov 17 13:26:48 2011 +1100 @@ -7,6 +7,6 @@ SRCS= diffuse_collector.c diffuse_proto.c DPADD= ${LIBUTIL} LDADD= -lutil -MAN= +MAN= diffuse_collector.8 .include diff -r a5c28de70e9b sbin/ipfw/diffuse_collector/diffuse_collector.8 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/sbin/ipfw/diffuse_collector/diffuse_collector.8 Thu Nov 17 13:26:48 2011 +1100 @@ -0,0 +1,160 @@ +.\" +.\" Copyright (c) 2010 +.\" Swinburne University of Technology, Melbourne, Australia. +.\" Copyright (c) 2011 The FreeBSD Foundation +.\" All rights reserved. +.\" +.\" This software was developed at the Centre for Advanced Internet +.\" Architectures, Swinburne University of Technology, by Sebastian Zander, made +.\" possible in part by a gift from The Cisco University Research Program Fund, a +.\" corporate advised fund of Silicon Valley Community Foundation. +.\" +.\" Portions of this documentation were written at the Centre for Advanced +.\" Internet Architectures, Swinburne University of Technology, Melbourne, +.\" Australia by Lawrence Stewart under sponsorship from the FreeBSD Foundation. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. 
Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR +.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd November 16, 2011 +.Dt diffuse_collector 8 +.Os +.Sh NAME +.Nm diffuse_collector +.Nd receive flow information from one or more flow information sources and +manage local firewall and/or traffic shaper actions. +.Sh SYNOPSIS +.Nm +.Op Fl hnv +.Op Fl c Ar config-file +.Op Fl s Ar sctp-port +.Op Fl t Ar tcp-port +.Op Fl u Ar udp-port +.Sh DESCRIPTION +The +.Nm +utility receives flow information directly from classifier nodes via UDP, or +indirectly from +.Xr diffuse_exporter +instances via UDP, TCP or SCTP. +.Nm +uses the flow information received to manage local firewall and/or traffic +shaper actions. +.Pp +.Nm +was written and tested with +.Xr ipfw 4 +like firewalls in mind, but provisions exist in the code to simplify adding +support for different firewall types in future. +.Pp +The following options are available: +.Bl -tag -width "Ar actions-file" -offset indent +.It Fl c Ar config-file +Path to the plain text INI configuration file. 
+.It Fl h +Show help. +.It Fl n +Do not generate firewall commands. +Useful for testing. +.It Fl s Ar sctp-port +SCTP port to listen on. +.It Fl t Ar tcp-port +TCP port to listen on. +.It Fl u Ar udp-port +UDP port to listen on. +.It Fl v +Increase verbosity. +.Sh CONFIGURATION FILE +The +.Nm +configuration file uses an INI style layout with key-value pair configuration +items. +There are currently three sections: "general", "firewall" and "classactions". +The example configuration file mentioned in the +.Sx FILES +section documents the available configuration options and their syntax. +.Sh FILES +.Bl -tag -width "conf" -compact -offset 0 +.It Pa /usr/share/examples/diffuse/collector.conf +An example collector configuration file using +.Xr ipfw 4 +for the underlying firewall. +.El +.Sh EXAMPLES +Listen for flow information on UDP and TCP port 5000 and parse the +/etc/collector.conf configuration file to obtain the rest of the required +configuration: +.Bd -literal -offset indent +.Nm Ns + -c /etc/collector.conf -u 5000 -t 5000 +.Ed +.Sh SEE ALSO +.Xr diffuse 4 , +.Xr dummynet 4 , +.Xr ipfw 4 , +.Xr diffuse_exporter 8 , +.Xr ipfw 8 +.Sh ACKNOWLEDGEMENTS +Development and testing of this software were made possible in part by grants +from the FreeBSD Foundation and The Cisco University Research Program Fund, a +corporate advised fund of Silicon Valley Community Foundation. +.Sh HISTORY +The +.Nm +utility is part of the DIFFUSE architecture and first appeared in +.Fx 10.0 . +.Pp +DIFFUSE +.Ns ( Em DI Ns stributed +.Em F Ns irewall +and +.Em F Ns low-shaper +.Em U Ns sing +.Em S Ns tatistical +.Em E Ns vidence ) +was first released in 2010 by Sebastian Zander whilst working on the DIFFUSE +research project at Swinburne University of Technology's Centre for Advanced +Internet Architectures, Melbourne, Australia, which was made possible in part by +a gift from The Cisco University Research Program Fund, a corporate advised fund +of Silicon Valley Community Foundation. 
+More details are available at: +.Pp +http://caia.swin.edu.au/urp/diffuse/ +.Sh AUTHORS +.An -nosplit +The +.Nm +was written by +.An Sebastian Zander Aq szander@swin.edu.au +and later extended by +.An Lawrence Stewart Aq lstewart@FreeBSD.org . +.Pp +This manual page was written by +.An Sebastian Zander Aq szander@swin.edu.au +and +.An Lawrence Stewart Aq lstewart@FreeBSD.org . +.Sh BUGS +.Bl -dash +.It +IPv6 is currently unsupported. +.El diff -r a5c28de70e9b sbin/ipfw/diffuse_exporter/Makefile --- a/sbin/ipfw/diffuse_exporter/Makefile Wed Nov 16 19:06:57 2011 +1100 +++ b/sbin/ipfw/diffuse_exporter/Makefile Thu Nov 17 13:26:48 2011 +1100 @@ -7,6 +7,6 @@ SRCS= diffuse_exporter.c diffuse_proto.c DPADD= ${LIBUTIL} LDADD= -lutil -MAN= +MAN= diffuse_exporter.8 .include diff -r a5c28de70e9b sbin/ipfw/diffuse_exporter/diffuse_exporter.8 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/sbin/ipfw/diffuse_exporter/diffuse_exporter.8 Thu Nov 17 13:26:48 2011 +1100 @@ -0,0 +1,146 @@ +.\" +.\" Copyright (c) 2010 +.\" Swinburne University of Technology, Melbourne, Australia. +.\" Copyright (c) 2011 The FreeBSD Foundation +.\" All rights reserved. +.\" +.\" This software was developed at the Centre for Advanced Internet +.\" Architectures, Swinburne University of Technology, by Sebastian Zander, made +.\" possible in part by a gift from The Cisco University Research Program Fund, a +.\" corporate advised fund of Silicon Valley Community Foundation. +.\" +.\" Portions of this documentation were written at the Centre for Advanced +.\" Internet Architectures, Swinburne University of Technology, Melbourne, +.\" Australia by Lawrence Stewart under sponsorship from the FreeBSD Foundation. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. 
+.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR +.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd November 15, 2011 +.Dt diffuse_exporter 8 +.Os +.Sh NAME +.Nm diffuse_exporter +.Nd forward flow information from a classifier node to one or more action nodes. +.Sh SYNOPSIS +.Nm +.Op Fl hv +.Op Fl a Ar prot://host:port , Ns ... +.Op Fl c Ar host:port +.Sh DESCRIPTION +The +.Nm +utility acts as an intermediary between a DIFFUSE classifier node and one or +more action nodes. +It listens for UDP messages containing flow information sent by the in-kernel +.Xr diffuse 4 +classifier and forwards the information to remote action nodes using UDP, TCP or +SCTP. +.Pp +The following options are available: +.Bl -tag -width "Ar prot://host:port,..." -offset indent +.It Fl a Ar prot://host:port , Ns ... +Comma separated (no spaces) list of action nodes to forward flow information to. 
+.Ar prot
+must be either "udp", "tcp", or "sctp",
+.Ar host
+is the action node's IP address or fully qualified host name, and
+.Ar port
+is the port the action node is listening on.
+If no
+.Ar port
+is specified, 3191 is used as a default.
+.It Fl c Ar host:port
+The classifier node to listen for flow information from.
+.Ar host
+is the classifier node's IP address or fully qualified host name, and
+.Ar port
+is the port
+.Nm
+will listen on for messages from the classifier node.
+If no
+.Ar port
+is specified, 3191 is used as a default.
+.It Fl h
+Show help.
+.It Fl v
+Increase verbosity.
+.El
+.Sh EXAMPLES
+Listen for flow information from a local classifier on the default port and
+forward it to two different action nodes using TCP and SCTP respectively:
+.Bd -literal -offset indent
+.Nm Ns
+ -a tcp://action1.node:5000,sctp://action2.node:5000 \\\
+ -c localhost
+.Ed
+.Sh SEE ALSO
+.Xr diffuse 4 ,
+.Xr dummynet 4 ,
+.Xr ipfw 4 ,
+.Xr ipfw 8 ,
+.Xr diffuse_collector 8
+.Sh ACKNOWLEDGEMENTS
+Development and testing of this software were made possible in part by grants
+from the FreeBSD Foundation and The Cisco University Research Program Fund, a
+corporate advised fund of Silicon Valley Community Foundation.
+.Sh HISTORY
+The
+.Nm
+utility is part of the DIFFUSE architecture and first appeared in
+.Fx 10.0 .
+.Pp
+DIFFUSE
+.Ns ( Em DI Ns stributed
+.Em F Ns irewall
+and
+.Em F Ns low-shaper
+.Em U Ns sing
+.Em S Ns tatistical
+.Em E Ns vidence )
+was first released in 2010 by Sebastian Zander whilst working on the DIFFUSE
+research project at Swinburne University of Technology's Centre for Advanced
+Internet Architectures, Melbourne, Australia, which was made possible in part by
+a gift from The Cisco University Research Program Fund, a corporate advised fund
+of Silicon Valley Community Foundation.
+More details are available at: +.Pp +http://caia.swin.edu.au/urp/diffuse/ +.Sh AUTHORS +.An -nosplit +The +.Nm +was written by +.An Sebastian Zander Aq szander@swin.edu.au +and later extended by +.An Lawrence Stewart Aq lstewart@FreeBSD.org . +.Pp +This manual page was written by +.An Sebastian Zander Aq szander@swin.edu.au +and +.An Lawrence Stewart Aq lstewart@FreeBSD.org . +.Sh BUGS +.Bl -dash +.It +IPv6 is currently unsupported. +.El diff -r a5c28de70e9b sbin/ipfw/ipfw.8 --- a/sbin/ipfw/ipfw.8 Wed Nov 16 19:06:57 2011 +1100 +++ b/sbin/ipfw/ipfw.8 Thu Nov 17 13:26:48 2011 +1100 @@ -91,6 +91,21 @@ .Oc .Oc .Ar pathname +.Pp +.Ss DIFFUSE +.Nm +.Brq Cm feature | mlclass | export +.Ar name +.Cm config +.Ar config-options +.Nm +.Brq Cm feature | mlclass | export +.Brq Cm delete | show +.Ar name +.Nm +.Cm flowtable +.Brq Cm show | zero | flush +.Op expired .Sh DESCRIPTION The .Nm @@ -98,8 +113,10 @@ .Xr ipfw 4 firewall, the .Xr dummynet 4 -traffic shaper/packet scheduler, and the -in-kernel NAT services. +traffic shaper/packet scheduler, in-kernel NAT services, and +.Xr diffuse 4 , +an extension that provides machine-learning based traffic classification and +distributed firewalling / traffic shaping. .Pp A firewall configuration, or .Em ruleset , @@ -336,6 +353,17 @@ See the .Sx TRAFFIC SHAPER (DUMMYNET) CONFIGURATION Section below for details. +.Ss DIFFUSE CONFIGURATION +The +.Nm +.Cm feature , mlclass +and +.Cm export +commands are used to configure +.Xr diffuse 4 . +See the +.Sx DIFFUSE CONFIGURATION +Section below for details. .Pp If the world and the kernel get out of sync the .Nm @@ -722,6 +750,14 @@ socket bound to port .Ar port . The search terminates. +.It Cm export Ar name +Export flow information for flows classified by +.Xr diffuse 4 . +The information is exported according to the configuration of the export +.Ar name +(see +.Sx DIFFUSE CONFIGURATION Ns ). +The search continues with the next rule. 
 .It Cm fwd | forward Ar ipaddr | tablearg Ns Op , Ns Ar port
 Change the next-hop on matching packets to
 .Ar ipaddr ,
@@ -769,6 +805,12 @@
 .Cm fwd
 a custom kernel needs to be compiled with the option
 .Cd "options IPFIREWALL_FORWARD" .
+.It Cm mlclass Ar name
+Classify flows using machine learning classifier
+.Ar name ,
+which was previously configured with a config command (see
+.Sx DIFFUSE CONFIGURATION Ns ).
+The search continues with the next rule.
 .It Cm nat Ar nat_nr
 Pass packet to a
 nat instance
@@ -1249,6 +1291,19 @@
 .It Cm bridged
 Alias for
 .Cm layer2 .
+.It Cm class-tags Ar tag , Ns Ar ...
+This option can only be specified for rules with an
+.Cm mlclass
+action.
+If specified, packets will be tagged with the specified tag(s) according to
+their class(es).
+For example, a packet that is classified as the first class will be tagged with
+the first tag.
+(See
+.Sx DIFFUSE CONFIGURATION Ns ).
+.Pp
+Tagged packets can then be matched in later rules using the tagged option
+described below.
 .It Cm diverted
 Matches only packets generated by a divert socket.
 .It Cm diverted-loopback
@@ -1289,6 +1344,48 @@
 .Pq Cm ah ,
 and IPsec encapsulated security payload headers
 .Pq Cm esp .
+.It Cm features Ar feature-name , Ns Ar ... Op unidirectional
+Enables the computation of one or more previously configured features
+named
+.Ar feature-name , Ns Ar ...
+(see
+.Sx DIFFUSE CONFIGURATION Ns ).
+Features will only be computed for flows
+whose packets match all rule options prior to
+.Cm features .
+Flows are bidirectional by default, unless
+.Ar unidirectional
+is specified here.
+.Pp
+If not explicitly specified, the features option is automatically generated for
+every rule that has one or more feature matches, match-if-class options, or an
+mlclass action.
+.It Qq Ar feature-stat-name Ns Bro < Ns | Ns <= Ns | Ns = Ns | Ns >= Ns | Ns > Brc Ns value
+Matches a packet if the feature
+.Ar feature-stat-name
+is less than, less than or equal to, equal to, greater than or equal to, or
+greater than the specified value.
+A
+.Ar feature-stat-name
+is expressed as:
+.Pp
+.Op Bro Ar fwd Ns | Ns Ar bck Brc Ns . Ns
+.Ar stat-name . Ns Ar feature-name
+.Pp
+fwd and bck indicate the direction (bidirectional flows),
+.Ar stat-name
+defines the name of the statistics provided by the feature module, and
+.Ar feature-name
+is the name of the configured feature (see
+.Sx DIFFUSE CONFIGURATION Ns ).
+.Pp
+For bidirectional flows, omitting "fwd" or "bck" implies the forward direction
+of flows.
+For unidirectional flows or features computed in both directions of a
+bidirectional flow, neither "fwd" nor "bck" should be specified.
+.Pp
+The whole option must be specified in double quotes if necessary
+to prevent the shell from interpreting the ">" etc.
 .It Cm fib Ar fibnum
 Matches a packet that has been tagged to use
 the given FIB (routing table) number.
@@ -1537,6 +1634,14 @@
 and they are always printed as hexadecimal (unless the
 .Cm -N
 option is used, in which case symbolic resolution will be attempted).
+.It Cm match-if-class Ar classifier Ns : Ns Ar class , Ns Ar ...
+Matches packets that were classified by a previously configured classifier
+.Ar classifier
+as one of the classes listed.
+Each
+.Ar class
+can be a class name or a # followed by the class number (see
+.Sx DIFFUSE CONFIGURATION Ns ).
 .It Cm proto Ar protocol
 Matches packets with the corresponding IP protocol.
 .It Cm recv | xmit | via Brq Ar ifX | Ar if Ns Cm * | Ar ipno | Ar any
@@ -2393,6 +2498,510 @@
 so those packets are dropped in the output path.
 Care should be taken to ensure that link-local packets are not passed to
 .Nm dummynet .
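Reviewer note: the feature match grammar added in this hunk (an optional "fwd" or "bck" prefix, then stat-name, then feature-name, then a comparison operator and a value) can be sketched as a small parser. This is only an illustration of the documented syntax, not the ipfw parser. The names `FeatureMatch` and `parse_feature_match` are hypothetical, and the sketch assumes the value is a plain integer, which the manual page does not state.

```python
import re
from typing import NamedTuple, Optional

# Two-character operators are listed first so "<=" is not read as "<".
# Assumption: stat and feature names contain no '.', '<', '>' or '='.
_MATCH_RE = re.compile(
    r"^(?:(?P<direction>fwd|bck)\.)?"
    r"(?P<stat>[^.<>=]+)\."
    r"(?P<feature>[^.<>=]+)"
    r"(?P<op><=|>=|<|>|=)"
    r"(?P<value>-?\d+)$"
)


class FeatureMatch(NamedTuple):
    direction: Optional[str]  # "fwd", "bck", or None when omitted
    stat: str                 # statistic provided by the feature module
    feature: str              # name of the configured feature instance
    op: str                   # one of <, <=, =, >=, >
    value: int                # assumed to be an integer in this sketch


def parse_feature_match(text: str) -> FeatureMatch:
    """Split a feature match such as 'fwd.max.myplen>1000' into its parts."""
    m = _MATCH_RE.match(text)
    if m is None:
        raise ValueError(f"not a feature match: {text!r}")
    return FeatureMatch(
        m["direction"], m["stat"], m["feature"], m["op"], int(m["value"])
    )
```

A direction of `None` means the prefix was omitted; per the text above, that implies the forward direction for bidirectional flows.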
+.Sh DIFFUSE CONFIGURATION +With the +.Xr diffuse 4 +extension, +.Nm +can be used to classify flows using machine learning and to export the +information to remote hosts, which can perform actions on these classified +flows, e.g. block or rate limit them. +.Pp +.Nm DIFFUSE +operates by first using the firewall to select packets. +Selected packets are grouped into bidirectional or unidirectional flows, for +which characteristics (features) are computed. +Based on the features, flows can then be classified using machine learning. +Classified flows can be exported to remote hosts to trigger remote actions. +.Pp +Packets are classified into flows similar to +.Nm Ns 's +generation of dynamic rules (if keep-state is used). +Bidirectional flows are like dynamic rule entries. +However, +.Nm diffuse +flows can be unidirectional too (if required). +The timeout of flows also works the same way as for dynamic rules. +The timeout values for different protocols are configurable (sysctl variables). +.Pp +.Nm DIFFUSE +adds three new objects: +.Bl -hang -width "classifier" +.It Em feature +A +.Em feature +is a configured instance of a feature module (implemented as a kernel module). +Features are not single statistics but a set of statistics related to some +characteristic of a flow, e.g. packet length. +Features can be computed independently for each packet but they can also be +computed over a series of packets, called a flow. +.It Em classifier +A +.Em classifier +is a configured instance of a machine learning classifier algorithm (implemented +as a kernel module). +A classifier takes a set of feature statistics as input and based on these +assigns a class to the packet or flow. +.It Em export +An +.Em export +is a configured instance of the export function. +The only way to currently export classified flow information from the kernel is +via UDP. 
+An
+.Em export
+sends the information directly or via a
+.Xr diffuse_exporter 8
+instance to one or more action node(s), which can then perform various actions,
+such as blocking or shaping, based on the class of flows.
+.El
+.Ss FEATURES
+Before features can be used they must be configured using the config
+command:
+.Pp
+.Nm
+.Cm feature
+.Ar feature-name
+.Cm config module
+.Ar module-name
+.Op Ar module-options
+.Pp
+The
+.Ar feature-name
+is the name of the new feature instance.
+It cannot be the name of an existing feature instance.
+.Ar module-name
+is the name of a feature module.
+See
+.Xr diffuse 4
+for the list of existing modules.
+.Pp
+Since the "plen" and "iat" modules are used in the examples below we briefly
+explain their statistics here.
+The "plen" module computes minimum, mean, maximum, and variance of packet
+length, and the "iat" module computes minimum, mean, maximum, and variance
+of packet inter-arrival times.
+.Pp
+For each existing feature module
+.Nm diffuse
+will create a default feature instance with default configuration that
+has the same name as the module.
+If the default configuration is deemed adequate the default feature can be used
+as is without any further configuration.
+.Pp
+However, in many cases the default configuration will not be appropriate.
+The following command creates a new feature instance of the "plen" module:
+.Pp
+.Dl "ipfw feature myplen config module plen window 25"
+.Pp
+The
+.Ar module-options
+are module specific, and the -h parameter can be used to view available options,
+e.g. the following command will show the plen module options:
+.Pp
+.Dl "ipfw feature myplen config module plen -h"
+.Pp
+To view the properties of configured features use:
+.Pp
+.Nm
+.Cm feature show
+.Brq Ar feature-name | Cm all
+.Pp
+This shows the details for the configured feature
+.Ar feature-name
+or for all configured features if "all" is specified.
+.Pp
+Configured features can be deleted using:
+.Pp
+.Nm
+.Cm feature delete
+.Ar feature-name
+.Ss CLASSIFIERS
+Before classifiers can be used they must be configured using the config
+command:
+.Pp
+.Nm
+.Cm mlclass
+.Ar classifier-name
+.Cm config algorithm
+.Ar algorithm-name
+.Op use-feature-stats Ar feature Ns , Ns ...
+.Op class-names Ar name Ns , Ns ...
+.Op confirm Ar number
+.Op Brq Cm sample Ar number | Cm rnd-sample Ar prob
+.Op Ar algorithm-options
+.Pp
+The
+.Ar classifier-name
+is the name of the new classifier instance.
+It cannot be the name of an existing classifier instance.
+.Ar algorithm-name
+is the name of a classifier module, which implements the actual classification
+algorithm.
+See
+.Xr diffuse 4
+for a list of classifiers.
+The "nbayes" algorithm used in the following examples classifies packets or
+flows based on the Bayes formula.
+.Pp
+By default the names of the feature statistics used are read from the model
+file.
+However, if the model file does not contain valid feature statistic names or one
+wants to use different statistics, the parameter
+.Cm use-feature-stats
+allows the definition of features used by the classifier as part of the
+classifier configuration.
+Feature statistics are specified as for feature matches (see rule options).
+.Pp
+By default the names of the classes are read from the model file too.
+However, the parameter
+.Cm class-names
+allows the names specified in the model file to be overridden as part of the
+classifier configuration.
+.Pp
+The parameter
+.Cm confirm
+controls how many times a class has to be confirmed before a
+.Cm match-if-class
+will match.
+For example, if
+.Cm confirm
+is set to 2 a
+.Cm match-if-class
+will only match if at least 3 consecutive packets were classified as the same
+class.
+By default
+.Cm confirm
+is zero; as soon as all features are computed each packet can be matched with a
+.Cm match-if-class .
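Reviewer note: the confirm semantics described above (confirm set to N requires N+1 consecutive classifications of the same class) can be sketched as follows. This is an illustration of the documented behaviour only, not the kernel implementation. `ConfirmTracker` is a hypothetical name, the tracker is assumed to hold state for a single flow, and it is assumed that a class change restarts the count.

```python
from typing import Hashable, Optional


class ConfirmTracker:
    """Track consecutive identical classifications for one flow."""

    def __init__(self, confirm: int = 0) -> None:
        if confirm < 0:
            raise ValueError("confirm must be non-negative")
        self.confirm = confirm
        self._last: Optional[Hashable] = None
        self._run = 0  # length of the current run of identical classes

    def update(self, cls: Hashable) -> bool:
        """Record one classification; return True once the class is confirmed."""
        if self._run > 0 and cls == self._last:
            self._run += 1
        else:
            # Assumption: a different class restarts the run at 1.
            self._last = cls
            self._run = 1
        # confirm=0 matches on the first classification,
        # confirm=2 needs three consecutive identical classifications.
        return self._run >= self.confirm + 1
```

With confirm set to 2, the first two packets of a run do not match and the third does, which agrees with the example in the text.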
+.Pp
+By default a classifier classifies every packet for which all needed features
+have been computed.
+Sampling can be used to only classify a subset of packets to reduce CPU load.
+The parameter
+.Cm sample
+can be used to only execute the classifier every
+.Ar number
+packets.
+The parameter
+.Cm rnd-sample
+can be used to randomly sample packets for which the classifier is executed.
+The previous packet's class (if any) is assigned to non-sampled packets.
+.Pp
+To facilitate a quick classification of new flows the first packet
+where all features have been computed is always classified, regardless of the
+sampling parameters.
+Note: specify either
+.Cm sample
+or
+.Cm rnd-sample ,
+but not both.
+.Pp
+Since a classifier configuration totally depends on the classifier model, no
+default classifier instances are generated.
+.Pp
+The following command creates a new classifier instance for the "nbayes"
+algorithm using a fictitious classifier model called example_nbayes.diffuse.
+.Bd -literal -offset indent
+ ipfw mlclass myclass config algorithm nbayes model \\\
+ example_nbayes.diffuse
+.Ed
+.Pp
+The
+.Ar algorithm-options
+depend on the algorithm.
+To view the available options the -h parameter can be used, for example:
+.Pp
+.Dl "ipfw mlclass myclass config algorithm nbayes -h"
+.Pp
+will show all the options the "nbayes" algorithm provides.
+.Pp
+To view the properties of configured classifiers use:
+.Pp
+.Nm
+.Cm mlclass show
+.Brq Ar classifier-name | Cm all
+.Pp
+This shows the details for the configured classifier
+.Ar classifier-name
+or for all configured classifiers if "all" is specified.
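Reviewer note: the "sample" behaviour described earlier in this subsection (run the classifier only every number packets, always classify the first eligible packet, and give non-sampled packets the previous class) can be sketched as below. The function name `classify_with_sampling` is hypothetical. The sketch assumes the input starts at the first packet for which all features are available and that the sampling interval is counted from that packet; the manual page does not specify the phase. "rnd-sample" is not modelled.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

P = TypeVar("P")  # packet representation (anything the classifier accepts)
C = TypeVar("C")  # class label


def classify_with_sampling(
    packets: Sequence[P],
    classify: Callable[[P], C],
    sample: int = 1,
) -> List[Optional[C]]:
    """Return one class per packet, running `classify` every `sample` packets."""
    if sample < 1:
        raise ValueError("sample must be >= 1")
    classes: List[Optional[C]] = []
    previous: Optional[C] = None
    for index, packet in enumerate(packets):
        # Index 0 is always classified, matching the rule that the first
        # packet with all features computed is classified regardless of sampling.
        if index % sample == 0:
            previous = classify(packet)
        # Non-sampled packets inherit the previous packet's class.
        classes.append(previous)
    return classes
```

With `sample=1` every packet is classified, which corresponds to the default behaviour described in the text.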
+.Pp
+Configured classifiers can be deleted using:
+.Pp
+.Nm
+.Cm mlclass delete
+.Ar classifier-name
+.Ss EXPORTS
+Before exports can be used they must be configured using the config command:
+.Pp
+.Nm
+.Cm export
+.Ar export-name
+.Cm config target udp:// Ns Ar host Ns
+.Op : Ns Ar port
+.Op action Ar action-name
+.Op action-params Ar action-params
+.Op min-batch Ar min-batch-number
+.Op max-batch Ar max-batch-number
+.Op max-delay Ar max-delay-number
+.Op confirm Ar confirm-number
+.Op unidirectional
+.Pp
+The
+.Ar export-name
+is the name of the new export instance.
+It cannot be the name of an existing export instance.
+.Ar action-name
+and
+.Ar action-params
+are the action name and parameters that are sent for matching flows.
+Note that the receiver may overrule these with locally specified actions.
+.Pp
+Note that the arguments of
+.Ar action-name
+and
+.Ar action-params
+are currently opaque values for DIFFUSE, i.e. no checking is performed.
+Also note that the length of the arguments is currently limited to
+.Em seven
+characters and longer arguments are truncated.
+.Pp
+The argument of
+.Cm target
+specifies the receiver.
+The protocol must be UDP,
+.Ar host
+is the fully qualified host name, and
+.Ar port
+is the port number of the export target.
+.Pp
+.Ar min-batch-number
+is the minimum number of flows exported in one batch.
+Similarly,
+.Ar max-batch-number
+is the maximum number of flows exported in one batch (must be equal to or larger
+than
+.Ar min-batch-number ).
+These parameters allow controlling the export batch size.
+Note that increasing
+.Ar min-batch-number
+also increases the delay for delivering flow information.
+.Pp
+.Ar max-delay-number
+specifies a maximum delay between the generation of export information and the
+actual export.
+Note that if
+.Ar max-delay-number
+is set (value > 0) the minimum batch size is still enforced, but the maximum
+batch size can now be exceeded (if more records are over the maximum delay than
+the maximum batch size).
+.Pp
+.Ar confirm-number
+is used to specify how many times a flow has to be consecutively classified as the
+.Em same
+class before it is exported.
+For example, if
+.Ar confirm-number
+is set to 2, information is only exported if the class was confirmed twice
+(meaning three consecutive classifications resulting in the same class).
+.Pp
+By default the receiver will treat flows as bidirectional, i.e. apply actions
+to both directions of a flow.
+The
+.Cm unidirectional
+option specifies that the receiver should treat flows as unidirectional.
+However, based on local configuration the receiver may still decide to treat
+flows differently than indicated by the classifier.
+.Pp
+Note that all UDP packets sent by an export rule bypass the firewall.
+.Pp
+The following command creates a new export instance that exports data to
+localhost using the default port (port 3191).
+The minimum batch size is set to 1 and the maximum batch size is set to 5. Note
+that packets sent by an export rule bypass the firewall.
+.Pp
+.Bd -literal -offset indent
+ ipfw export myexp config target udp://localhost min-batch 1 \\\
+ max-batch 5
+.Ed
+.Pp
+To view the properties of configured exports use:
+.Pp
+.Nm
+.Cm export show
+.Brq Ar export-name | Cm all
+.Pp
+This shows the details for the configured export
+.Ar export-name
+or for all configured exports if "all" is specified.
+.Pp
+Configured exports can be deleted using:
+.Pp
+.Nm
+.Cm export delete
+.Ar export-name
+.Pp
+.Ss FLOWTABLE
+.Nm DIFFUSE
+has a flow table to keep state of any flows for which features are computed, and
+which are classified using a classifier.
+The current list of flows in the table can be viewed using the following +command: +.Pp +.Nm +.Cm flowtable show +.Op expired +.Pp +By default only active flows are shown. +If expired is specified, expired flows are also shown. Note, that any expired +flows are eventually deleted. +.Pp +The command outputs the list of features followed by one line for each flow in +the table. +Each line contains the following fields: +.Bl -bullet +.It +Rule number of the rule that generated the flow +.It +Bucket number in the hash table +.It +Number of packets +.It +Number of bytes +.It +Expire time +.It +Protocol, source IP and port (UDP, TCP), destination IP and port (UDP, TCP) +.It +List of all feature statistics with their current values +.It +List of classes (if the flow has been classified) +.El +.Pp +The packet and byte counters can be reset using: +.Pp +.Nm +.Cm flowtable zero +.Pp +All entries can be removed from the flow table using: +.Pp +.Nm +.Cm flowtable flush +.Pp +The flow table has a fixed number of buckets, but it can be re-sized. +First, change the sysctl variable net.inet.ip.diffuse.ft_buckets to the desired +value, and then flush the flow table. +The sysctl variable net.inet.ip.diffuse.ft_curr_buckets shows the actual number +of buckets. +.Ss FLOWS AND FEATURE COMPUTATION +The first rule with a features option a packet hits triggers the flow lookup and +flow state generation. +Later rules for the same packet then use a pointer to the flow entry. +This means whether a flow is bidirectional or unidirectional is determined by +the rule creating the flow, and later rules cannot change this. +.Pp +The use of bidirectional flows is recommended (the default). +It is possible to use unidirectional flows with the "unidirectional" option of +"features". +Do not mix both types unless you know what you are doing. 
+.Pp
+The "features" option is implicitly generated for rules with feature matches,
+"match-if-class" options or a classifier action, even if not explicitly
+specified.
+Implicit "features" options list all the features used by the feature matches or
+the classifier and will create bidirectional flows.
+.Pp
+Features of a flow are updated by every rule a packet hits that has a "features"
+option.
+Each feature is of course only updated once for each packet.
+The maximum number of features per flow is currently limited to
+.Em eight .
+Rules that attempt to update features past this limit do not match, nor will
+subsequent rules that attempt to use the non-existing features.
+.Pp
+Note that since the first rule (with terminating action) that matches a packet
+terminates the search, rules that update features but do not see all packets
+effectively compute the features over a subset of packets.
+This allows hierarchical features to be created, e.g. packet length statistics
+can be computed only for packets smaller/larger than a threshold.
+.Pp
+However, in other cases all features must be computed over all packets.
+An initial rule with non-terminating action does this:
+.Bd -literal -offset indent
+ ipfw add 1 count ip from any to any features plen,iat
+ ...
+.Ed
+.Pp
+Now the "plen" and "iat" features will always be computed over all
+IP packets regardless of the following rules.
+.Pp
+.Ss EXAMPLES
+The first example configures a new feature instance and uses it to match only
+flows with large packets in both directions:
+.Pp
+.Bd -literal -offset indent
+ ipfw feature myplen config module plen window 25
+ ipfw add allow ip from any to any "fwd.max.myplen>1000" \\\
+ "bck.max.myplen>1000"
+ ipfw add deny ip from any to any
+.Ed
+.Pp
+The following rules create a classifier instance and use it to classify flows.
+Then count actions are used to count the matching packets.
+(The example classifier model only works for IPv4.)
+.Bd -literal -offset indent
+ ipfw mlclass myclass config algorithm nbayes model \\\
+ et_vs_other_plenonly.nbayes.diffuse
+ ipfw add count ipv4 from any to any match-if-class myclass:#0
+ ipfw add count ipv4 from any to any match-if-class myclass:#1
+.Ed
+.Pp
+The following rules do the same but use the "class-tags" and "tagged" options:
+.Bd -literal -offset indent
+ ipfw mlclass myclass config algorithm nbayes model \\\
+ et_vs_other_plenonly.nbayes.diffuse
+ ipfw add mlclass myclass ipv4 from any to any class-tags 10,20
+ ipfw add count ipv4 from any to any tagged 10
+ ipfw add count ipv4 from any to any tagged 20
+.Ed
+.Pp
+The following rules create a classifier instance and an export instance.
+Here we explicitly specify the feature statistics and class names.
+Instead of class numbers class names are used in the "match-if-class" here.
+Flows of class "et" are then exported.
+.Bd -literal -offset indent
+ ipfw mlclass myclass config algorithm nbayes model \\\
+ et_vs_other_plenonly.nbayes.diffuse \\\
+ use-feature-stats fwd.min.myplen,fwd.mean.myplen,\\\
+ fwd.max.myplen,fwd.stdev.myplen,bck.min.myplen,\\\
+ bck.mean.myplen,bck.max.myplen,bck.stdev.myplen \\\
+ class-names other,et
+ ipfw export myexp config target udp://localhost min-batch 5
+ ipfw add export myexp ipv4 from any to any match-if-class myclass:et
+.Ed
+.Ss SYSCTLS
+.Nm DIFFUSE
+provides a number of sysctl variables that are described under
+.Sx SYSCTL VARIABLES .
+.Ss LIMITATIONS
+.Nm DIFFUSE
+uses names for features, classifiers and exports instead of numbers
+for easier use.
+Currently, the size of names is limited to
+.Em seven
+characters.
+Longer names will be truncated.
+.Pp
+The same limitation applies to the arguments of
+.Ar action-name
+and
+.Ar action-params
+when configuring exports.
+.Pp
+.Nm DIFFUSE
+only works with IPv4.
+(Parts of it, like the flow table, work with IPv6, but other parts, like the
+flow information export, do not.)
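Reviewer note: the seven-character name limit in LIMITATIONS means two distinct names can silently become the same name. A small pre-flight check along the following lines may be useful when generating rulesets from scripts. The helper `find_truncation_collisions` is hypothetical, and the sketch assumes truncation keeps the first seven characters, which the text implies but does not state explicitly.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

MAX_NAME_LEN = 7  # limit stated in the LIMITATIONS subsection


def find_truncation_collisions(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group names that would become identical after truncation."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        # Assumption: truncation keeps the leading characters of the name.
        groups[name[:MAX_NAME_LEN]].append(name)
    # Report only truncated names shared by at least two distinct full names.
    return {short: full for short, full in groups.items() if len(set(full)) > 1}
```

For example, "myplen_fwd" and "myplen_bck" would both be stored as "myplen_" under this assumption, so only one of them could be configured.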
 .Sh CHECKLIST
 Here are some important points to consider when designing your
 rules:
@@ -2850,6 +3459,40 @@
 Controls whether bridged packets are passed to
 .Nm .
 Default is no.
+.It Va net.inet.ip.diffuse.to_curr_buckets : Va net.inet.ip.diffuse.to_buckets
+The current size of the timeout ring buffer (read-only).
+.It Va net.inet.ip.diffuse.to_buckets : No 512
+The size of the timeout ring buffer.
+Must be a power of 2.
+Must be larger than the largest lifetime (see below).
+.It Va net.inet.ip.diffuse.ft_ack_lifetime : No 300
+.It Va net.inet.ip.diffuse.ft_syn_lifetime : No 20
+.It Va net.inet.ip.diffuse.ft_fin_lifetime : No 1
+.It Va net.inet.ip.diffuse.ft_rst_lifetime : No 1
+.It Va net.inet.ip.diffuse.ft_udp_lifetime : No 5
+.It Va net.inet.ip.diffuse.ft_short_lifetime : No 30
+These variables control the lifetime, in seconds, of flows in the
+.Nm diffuse
+flow table.
+Upon the initial SYN exchange the lifetime is kept short, then increased after
+both SYNs have been seen, then decreased again during the final FIN exchange or
+when a RST is received.
+.It Va net.inet.ip.diffuse.ft_curr_buckets : Va net.inet.ip.diffuse.ft_buckets
+The current number of buckets in the flow table (read-only).
+.It Va net.inet.ip.diffuse.ft_buckets : No 256
+The number of buckets in the flow table (hash table).
+Must be a power of 2, up to 65536.
+It only takes effect when all entries are expired, so you need to flush the flow
+table.
+.It Va net.inet.ip.diffuse.ft_max : No 4096
+Maximum number of flow table entries.
+When you hit this limit, no more flow entries can be installed until old ones
+expire.
+.It Va net.inet.ip.diffuse.ft_count : No XXX
+Current number of flow entries, including expired ones (read-only).
+.It Va net.inet.ip.diffuse.ex_max_qsize : No 256
+This variable controls the maximum size of the flow information export queue.
+If this size is exceeded, records will be silently dropped.
 .El
 .Pp
 .Sh EXAMPLES
@@ -3201,6 +3844,7 @@
 .Xr cpp 1 ,
 .Xr m4 1 ,
 .Xr altq 4 ,
+.Xr diffuse 4 ,
 .Xr divert 4 ,
 .Xr dummynet 4 ,
 .Xr if_bridge 4 ,
@@ -3265,6 +3909,13 @@
 Delay profiles have been developed by Alessandro Cerri and
 Luigi Rizzo, supported by the
 European Commission within Projects Onelab and Onelab2.
+.Pp
+The
+.Xr diffuse 4
+extension that provides machine learning traffic classification and distributed
+firewalling / traffic shaping has been developed by
+.An The Centre for Advanced Internet Architectures (CAIA) Aq http://www.caia.swin.edu.au .
+The author of this extension is Sebastian Zander.
 .Sh BUGS
 The syntax has grown over the years and sometimes it might be confusing.
 Unfortunately, backward compatibility prevents cleaning up mistakes
diff -r a5c28de70e9b share/man/man4/Makefile
--- a/share/man/man4/Makefile	Wed Nov 16 19:06:57 2011 +1100
+++ b/share/man/man4/Makefile	Thu Nov 17 13:26:48 2011 +1100
@@ -101,6 +101,7 @@
 	ddb.4 \
 	de.4 \
 	devctl.4 \
+	diffuse.4 \
 	digi.4 \
 	disc.4 \
 	divert.4 \
diff -r a5c28de70e9b share/man/man4/diffuse.4
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/share/man/man4/diffuse.4	Thu Nov 17 13:26:48 2011 +1100
@@ -0,0 +1,211 @@
+.\"
+.\" Copyright (c) 2010
+.\"	Swinburne University of Technology, Melbourne, Australia.
+.\" Copyright (c) 2011 The FreeBSD Foundation
+.\" All rights reserved.
+.\"
+.\" This software was developed at the Centre for Advanced Internet
+.\" Architectures, Swinburne University of Technology, by Sebastian Zander, made
+.\" possible in part by a gift from The Cisco University Research Program Fund, a
+.\" corporate advised fund of Silicon Valley Community Foundation.
+.\"
+.\" Portions of this documentation were written at the Centre for Advanced
+.\" Internet Architectures, Swinburne University of Technology, Melbourne,
+.\" Australia by Lawrence Stewart under sponsorship from the FreeBSD Foundation.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\"    notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\"    notice, this list of conditions and the following disclaimer in the
+.\"    documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
+.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" $FreeBSD$
+.\"
+.Dd November 16, 2011
+.Dt DIFFUSE 4
+.Os
+.Sh NAME
+.Nm DIFFUSE
+.Nd Distributed Firewall and Flow-shaper Using Statistical Evidence
+.Sh DESCRIPTION
+.Nm
+.Ns ( Em DI Ns stributed
+.Em F Ns irewall
+and
+.Em F Ns low-shaper
+.Em U Ns sing
+.Em S Ns tatistical
+.Em E Ns vidence )
+implements an architecture for distributed network flow classification and
+treatment.
+.Nm
+has two main entities: classifier nodes and action nodes.
+.Pp
+Classifier nodes run the
+.Nm
+kernel module which extends
+.Xr ipfw 4
+to provide machine learning based traffic classification using statistical
+properties (features) of observed traffic flows.
+IPFW rules are configured using new
+.Nm
+specific grammar to export flow information, which is typically relayed to one
+or more action nodes by a
+.Xr diffuse_exporter 8
+instance.
+.Pp
+Action nodes receive flow information from classifier node(s) using a
+.Xr diffuse_collector 8
+instance and perform actions (block, redirect, rate shape, etc.) on packets
+belonging to classified flows.
+.Pp
+The following diagram outlines the typical way in which the architecture would
+be deployed:
+.Bd -literal
+ +-----------------+              +----------------+
+ | Classifier Node |              |  Action Node   |
+ |                 |              |                |
+ | +------------+  |   Flow Info  | +------------+ |
+ | |  Exporter  |------------------>| Collector  | |
+ | +------------+  |              | +------------+ |
+ |       ^         |              |        |       |
+ |       |         |              |        v       |
+ | +------------+  |              | +------------+ |
+ | | Classifier |  |              | | Firewall/  | |
+ | +------------+  |              | |   Shaper   | |
+ |       ^         |              | +------------+ |
+ |       |         |              |        |       |
+ +-------|---------+              +--------|-------+
+         | Traffic measurement             | Traffic manipulation
+         |                                 V
+ <================== Network Traffic ===================>
+.Ed
+.Pp
+Classifier nodes and action nodes are logical entities that only require IP
+connectivity between them.
+They can be located on separate physical machines or co-located on the same
+machine.
+.Ss Feature Modules
+The following feature modules are available as kernel modules whose names
+begin with diffuse_feature_:
+.Bl -tag -width "plenbd"
+.It iat
+Calculates unidirectional interarrival time features
+.It iatbd
+Calculates bidirectional interarrival time features
+.It pcnt
+Calculates packet count features
+.It plen
+Calculates unidirectional packet length features
+.It plenbd
+Calculates bidirectional packet length features
+.It skype
+Calculates Skype-specific features
+.El
+.Ss Classifier Modules
+The following classifier modules are available as kernel modules whose names
+begin with diffuse_classifier_:
+.Bl -tag -width "nbayes"
+.It c45
+C4.5 decision tree classifier implementation
+.It nbayes
+Naive-Bayes classifier implementation
+.El
+.Ss Kernel Options
+The following options in the kernel configuration file are related to
+.Nm :
+.Pp
+.Bl -tag -width "IPFIREWALL_VERBOSE_LIMIT" -offset indent -compact
+.It Dv IPFIREWALL
+enable ipfirewall (required for
+.Nm )
+.It Dv IPFIREWALL_VERBOSE
+enable firewall output
+.It Dv IPFIREWALL_VERBOSE_LIMIT
+limit firewall output
+.It Dv DUMMYNET
+enable dummynet (required for shaping)
+.It Dv HZ
+set the timer granularity (for dummynet)
+.It Dv DIFFUSE
+enable DIFFUSE
+.El
+.Pp
+If loading IPFW and
+.Nm
+as kernel modules, no changes to the kernel configuration file are necessary.
+.Pp
+If you wish to compile
+.Nm
+into the kernel, the following options are required:
+.Bd -literal -offset indent
+options IPFIREWALL
+options DUMMYNET
+options DIFFUSE
+options HZ=1000		# strongly recommended for dummynet
+.Ed
+.Sh SEE ALSO
+.Xr dummynet 4 ,
+.Xr ipfw 4 ,
+.Xr diffuse_collector 8 ,
+.Xr diffuse_exporter 8 ,
+.Xr ipfw 8 ,
+.Xr kldload 8
+.Sh ACKNOWLEDGEMENTS
+Development and testing of this software were made possible in part by grants
+from the FreeBSD Foundation and The Cisco University Research Program Fund, a
+corporate advised fund of Silicon Valley Community Foundation.
+.Sh HISTORY
+The
+.Nm
+kernel module is part of the DIFFUSE architecture and first appeared in
+.Fx 10.0 .
+.Pp
+.Nm
+.Ns ( Em DI Ns stributed
+.Em F Ns irewall
+and
+.Em F Ns low-shaper
+.Em U Ns sing
+.Em S Ns tatistical
+.Em E Ns vidence )
+was first released in 2010 by Sebastian Zander whilst working on the DIFFUSE
+research project at Swinburne University of Technology's Centre for Advanced
+Internet Architectures, Melbourne, Australia, which was made possible in part by
+a gift from The Cisco University Research Program Fund, a corporate advised fund
+of Silicon Valley Community Foundation.
+More details are available at:
+.Pp
+http://caia.swin.edu.au/urp/diffuse/
+.Sh AUTHORS
+.An -nosplit
+.Nm
+was written by
+.An Sebastian Zander Aq szander@swin.edu.au
+and later extended by
+.An Lawrence Stewart Aq lstewart@FreeBSD.org .
+.Pp
+This manual page was written by
+.An Sebastian Zander Aq szander@swin.edu.au
+and
+.An Lawrence Stewart Aq lstewart@FreeBSD.org .
+.Sh BUGS
+.Bl -dash
+.It
+IPv6 is currently unsupported.
+.El