Saturday, October 23, 2010

OSSEC Rules 101

As I've already written about decoders, it's time to cover rules. Rules are phase 3 in our 3 phase plan (pre-decoding, decoding, and rules). The syntax guide for writing rules can be found in the OSSEC documentation: here.

I will still be using the ossec-logtest program to help illustrate the rules. I'll also be starting with the same log message as before:


Oct 8 14:15:33 fedora11 sshd[11773]: Failed password for ddp from 172.16.51.1 port 56588 ssh2


Just to refresh your memory, the beginning of the message gets chopped off during pre-decoding and decoding, so we will be left with "Failed password for ddp from 172.16.51.1 port 56588 ssh2" as the material for creating a rule. Here is the output of feeding the above message through ossec-logtest:


**Phase 1: Completed pre-decoding.
full event: 'Oct 8 14:15:33 fedora11 sshd[11773]: Failed password for ddp from 172.16.51.1 port 56588 ssh2'
hostname: 'fedora11'
program_name: 'sshd'
log: 'Failed password for ddp from 172.16.51.1 port 56588 ssh2'

**Phase 2: Completed decoding.
decoder: 'sshd'
dstuser: 'ddp'
srcip: '172.16.51.1'

This will be our reference for writing the rules.

There are a number of ways for OSSEC to evaluate a log message to determine the rule that will be triggered. Much like decoders, the first rule that matches will be triggered, so the order of the rules is important. Since OpenSSH is a reasonably complicated group of applications, there will probably be a few rules associated with it (much like there are a number of decoders).

All OSSEC rules start the same way:
<rule id="NUMBER" level="NUMBER">

The id is just an identification number to reference the rule. The level determines the severity of the event, and can be any number between 0 and 15. Here's a handy wiki entry detailing the various severity levels.

Since I am using examples that are already in place I can just use the id and level of the real rule:
<rule id="5700" level="0" noalert="1">

The option 'noalert="1"' means that this rule will never trigger an alert. We are going to use this rule to group all ssh rules together, so we don't want an sshd event that has no more specific rule yet to generate an alert from rule 5700.

It can be helpful to write a rule to help identify sshd log messages. We can do this a number of ways, but since we have an sshd decoder already, we'll match on the decoder.
  <decoded_as>sshd</decoded_as>

With this option, the rule will catch all log events that are decoded (in Phase 2) as sshd. Now we will add a description to this rule to help us understand it better later, and close the rule out:
  <description>SSHD messages grouped.</description>
</rule>

Here's the rule all together:
<rule id="5700" level="0" noalert="1">
  <decoded_as>sshd</decoded_as>
  <description>SSHD messages grouped.</description>
</rule>

This is the first rule in the sshd_rules.xml file. Above it is the line:
<group name="syslog,sshd,">

This creates two groups called "syslog" and "sshd." These groups can be used in rules and reports. More on this later.

With rule 5700 in place my log message would match it, but not report itself as matching it (because of the noalert="1" option). So I need another rule to match the specific message I'm seeing. We'll start it the same way as the last rule, and add another option:
<rule id="5716" level="5">
  <if_sid>5700</if_sid>

As you may be able to guess, this is the 17th rule in sshd_rules.xml (5700-5716). It's classified as a level 5 rule because one mistyped password may not be a big deal.

The <if_sid> option is similar to the <parent> option in decoders: this rule will only be checked if rule 5700 matched the log message. This should help optimize the ruleset, and not force OSSEC to check too many unrelated rules when a log message comes in.

The next thing we want to do is somehow match the log message. We'll add the following to do just that:
  <match>^Failed|^error: PAM: Authentication</match>

This line tries to match its value against the log message. Remember the "^" symbol from the decoder post? If not, the symbol means that the character immediately following it should be the first character in the log message. The pipe symbol, "|", is a logical "or." In this case either "^Failed" or "^error: PAM: Authentication" can match in the rule. Without the pipe we'd have to create two rules, one for each string. Using the pipe means we can be lazy and only write one rule to handle both possible messages.

So with this <match> option we're looking for the log message to begin with either "Failed" or "error: PAM: Authentication." With the above log message we do indeed see "Failed." Remember that we are only dealing with the log section, as given to us in the pre-decoding phase (Failed password for ddp from 172.16.51.1 port 56588 ssh2). So this <match> will match our log message.

We want to be able to identify what is happening, and the description is what will be sent in the alert. Here is the description:
  <description>SSHD authentication failed.</description>

We also want to be able to search for these types of events (mis-typed passwords) for reports, so we'll also add a group to identify it as an authentication failure and close out the rule:
  <group>authentication_failed,</group>
</rule>

This group will be in addition to the main groups posted above. The complete group listing for this rule will be "syslog,sshd,authentication_failed."

Here is the complete rule:
<rule id="5716" level="5">
  <if_sid>5700</if_sid>
  <match>^Failed|^error: PAM: Authentication</match>
  <description>SSHD authentication failed.</description>
  <group>authentication_failed,</group>
</rule>

With both rules in place, feeding the log message through ossec-logtest again now shows a Phase 3 result:

**Phase 1: Completed pre-decoding.
       full event: 'Oct  8 14:15:33 fedora11 sshd[11773]: Failed password for ddp from 172.16.51.1 port 56588 ssh2'
       hostname: 'fedora11'
       program_name: 'sshd'
       log: 'Failed password for ddp from 172.16.51.1 port 56588 ssh2'

**Phase 2: Completed decoding.
       decoder: 'sshd'
       dstuser: 'ddp'
       srcip: '172.16.51.1'

**Phase 3: Completed filtering (rules).
       Rule id: '5716'
       Level: '5'
       Description: 'SSHD authentication failed.'
**Alert to be generated.
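
To make the point about the pipe in rule 5716 concrete, here is a rough sketch of what the same check might look like split into two rules, one per string. The ids 100001 and 100002 are made up for illustration (they are not part of the stock ruleset), and I'm assuming these would live in a local rules file rather than in sshd_rules.xml.

```xml
<!-- Hypothetical split of rule 5716; the ids below are invented for this example. -->
<group name="syslog,sshd,">
  <!-- Handles log messages that begin with "Failed" -->
  <rule id="100001" level="5">
    <if_sid>5700</if_sid>
    <match>^Failed</match>
    <description>SSHD authentication failed.</description>
    <group>authentication_failed,</group>
  </rule>

  <!-- Handles log messages that begin with "error: PAM: Authentication" -->
  <rule id="100002" level="5">
    <if_sid>5700</if_sid>
    <match>^error: PAM: Authentication</match>
    <description>SSHD authentication failed.</description>
    <group>authentication_failed,</group>
  </rule>
</group>
```

Everything except the <match> line is duplicated, which is exactly the busywork the pipe saves.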


As I mentioned above one failed login may not be a big deal. Bob mis-types his password at least once a day. But what happens if a bot out there is trying to brute force a password? That's a big deal, and something that should be reported. Here's a full rule to help group rule 5716 events:
<rule id="5720" level="10" frequency="6">
  <if_matched_sid>5716</if_matched_sid>
  <same_source_ip />
  <description>Multiple SSHD authentication failures.</description>
  <group>authentication_failures,</group>
</rule>

Since this is multiple attempts using the wrong password the level is now set to 10. Frequency is an option I haven't introduced yet. Frequency specifies how many times a rule must match before this rule will fire. In this case a failed login has to happen 6 times before rule 5720 will match.

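<if_matched_sid> is similar to <if_sid>. As far as I can tell from the documentation, the difference is that <if_sid> checks whether the current log message matched the given rule, while <if_matched_sid> checks whether the given rule has already fired repeatedly within a window of time. When creating rules with the frequency option, use <if_matched_sid>.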

<same_source_ip /> just tells OSSEC that the 6 failed attempts must come from the same source IP address. If Bob, Lisa, Angela, Todd, Jason, and Herbert all mistype their passwords at the same time from different systems we don't want this alert to fire.

The rest of the rule is pretty standard so I'm not going to explain it.
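
Earlier I said groups can be used in rules. One way that could look, as a sketch only: the same kind of frequency rule, but keyed on the authentication_failed group instead of a single rule id. This assumes <if_matched_group> works as the group counterpart of <if_matched_sid>, which is how I read the documentation. The id 100010 and the description text are made up for the example.

```xml
<!-- Hypothetical rule; id and description are invented for illustration. -->
<group name="syslog,sshd,">
  <rule id="100010" level="10" frequency="6">
    <!-- Count any earlier alert tagged with the authentication_failed group -->
    <if_matched_group>authentication_failed</if_matched_group>
    <!-- All counted events must share one source IP -->
    <same_source_ip />
    <description>Multiple authentication failures (matched by group).</description>
    <group>authentication_failures,</group>
  </rule>
</group>
```

The appeal of this form is that any rule adding itself to authentication_failed gets counted, not just rule 5716.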

Frequency can be very useful, and even combined with a couple of other options. Here is another brute force rule dealing with bad ssh logins:
<rule id="5712" level="10" frequency="6" timeframe="120" ignore="60">
  <if_matched_sid>5710</if_matched_sid>
  <description>SSHD brute force trying to get access to </description>
  <description>the system.</description>
  <same_source_ip />
  <group>authentication_failures,</group>
</rule>

In addition to frequency this rule also uses timeframe and ignore. These options are pretty simple. In this rule, rule 5710 has to match 6 times within a timeframe of 120 seconds. If that 6th event is 121 seconds after the first, this rule will not match. After rule 5712 has fired it will not fire again for 60 seconds thanks to the ignore option. This is to help keep you from getting a flood of alerts.

You'll also note that the description is broken into two lines. This is purely for readability; the lines will be combined in the alert.

I'm not sure why these two rules have different options. I'd like to think that if one of them has the timeframe and ignore options they both should, or neither of them should. I'll have to look into that.
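
If you wanted the 5716-based rule to be bounded by a time window as well, a local rule along these lines is one way to sketch it. This is not a rule from the stock ruleset: the id 100020 and the description are made up, and the timeframe and ignore values are simply copied from rule 5712.

```xml
<!-- Hypothetical variant of rule 5720 with timeframe and ignore added. -->
<group name="syslog,sshd,">
  <rule id="100020" level="10" frequency="6" timeframe="120" ignore="60">
    <!-- Count earlier matches of the single failed login rule -->
    <if_matched_sid>5716</if_matched_sid>
    <!-- All counted events must share one source IP -->
    <same_source_ip />
    <description>Multiple SSHD authentication failures in two minutes.</description>
    <group>authentication_failures,</group>
  </rule>
</group>
```

Since ossec-logtest doesn't keep state between messages, a rule like this would have to be tested against a running OSSEC install.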

Since ossec-logtest doesn't keep state between the messages fed into it I can't really show a 5720 or 5712 alert using it.

There are a lot more options available in rules. For instance you can use <regex> in a rule. You won't be able to pull information out with parentheses in a regex; that's only available in decoders. Options like <category>, <srcip>, and <user> can be used to match specific elements. <category>syscheck</category> can be used to modify a syscheck alert. Here's an example:
<rule id="10999" level="15">
  <category>syscheck</category>
  <match>/var/ossec/etc/ossec.conf</match>
  <description>ossec.conf has been modified!</description>
</rule>

This rule would set off an alert of level 15 if syscheck has noticed that /var/ossec/etc/ossec.conf has been modified.
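
For the <regex> option mentioned above, here is a minimal sketch of the syntax, written against the sample log message from the top of this post. The id 100030 is made up, and I'm assuming \S (non-whitespace) and \d (digit) behave here the way they do in decoders. Note there are no parentheses, since a rule can't extract fields.

```xml
<!-- Hypothetical rule showing <regex> in place of <match>; id invented for illustration. -->
<group name="syslog,sshd,">
  <rule id="100030" level="5">
    <if_sid>5700</if_sid>
    <!-- Matches e.g. "Failed password for ddp from 172.16.51.1 port 56588 ssh2" -->
    <regex>^Failed password for \S+ from \S+ port \d+</regex>
    <description>SSHD failed password (matched with regex).</description>
    <group>authentication_failed,</group>
  </rule>
</group>
```

This is only meant to show the syntax. In practice the plain <match> in rule 5716 already covers this message, and a simple string match is usually preferable when it does the job.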

Since we know Bob mis-types his password at least once per day we can do the following to not have alerts fire when he mistypes his password:
<rule id="110000" level="0">
  <if_sid>5716</if_sid>
  <user>bob</user>
  <description>Ignore bob</description>
</rule>

If we want to limit it even further we can limit this to his source IP:
<rule id="110000" level="0">
  <if_sid>5716</if_sid>
  <user>bob</user>
  <srcip>192.168.1.23</srcip>
  <description>Ignore bob</description>
</rule>

There are two more options I want to mention: alert_by_email and no_email_alert. The first always sends an email when that event is triggered, no matter what level is set. This can be useful if you have determined that an event isn't a good candidate for a high level, but you always want to see an email on it. The second means the alert never sends an email. I'm not sure how useful this one is, but I'm guessing someone wanted it.
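
Both of these are set inside an <options> element in the rule. Here's a rough sketch of each. The ids, levels, descriptions, and the "backup" account are all made up for illustration; only the two option names come from the OSSEC rule syntax.

```xml
<!-- Hypothetical rules; ids, levels, and user names are invented for this example. -->
<group name="syslog,sshd,">
  <!-- Low level, but always emailed -->
  <rule id="100040" level="3">
    <if_sid>5716</if_sid>
    <user>root</user>
    <options>alert_by_email</options>
    <description>Failed SSHD login for root (always emailed).</description>
  </rule>

  <!-- Higher level, but never emailed -->
  <rule id="100041" level="8">
    <if_sid>5716</if_sid>
    <user>backup</user>
    <options>no_email_alert</options>
    <description>Failed SSHD login for backup (logged, not emailed).</description>
  </rule>
</group>
```

The second rule hints at where no_email_alert might earn its keep: an event you want recorded at a high level for reports, without it landing in your inbox every time.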

There's still a lot of information to cover, but it'll have to wait for future blog posts. This one is long enough. If I didn't explain something here well enough please let me know!
