Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch Size Affects Log Parsing #60

Open
wrharding opened this issue Mar 15, 2023 · 4 comments
Open

Batch Size Affects Log Parsing #60

wrharding opened this issue Mar 15, 2023 · 4 comments

Comments

@wrharding
Copy link
Contributor

Sagan Version

Compiled with commit fa32f8f and the following config options:
wharding@ruletest:~/sagan$ ./configure --enable-geoip --enable-esmtp --enable-libpcap --enable-redis --enable-gzip

--[Sagan version 2.1.0 | Help/usage screen]--------------------------------

-h, --help		Help (this screen).
-C, --credits		Sagan credits.
-d, --debug [type]	Types: engine, syslog, load, external, threads, ipc, limits, malformed, xbit, flexbit, brointel, parse_ip, client-stats, track-clients, smtp, normalize, plog, geoip, redis, json.
-D, --daemon		Make process a daemon (fork to the background).
-u, --user [username]	Run as user (defaults to 'sagan').
-c, --chroot [dir]	Chroot Sagan to specified directory.
-f, --config [file]	Sagan configuration file to load.
-F, --file [file]	FIFO over ride.  This reads a file in rather than reading
			from a FIFO.  The file must be in the Sagan format!
-l, --log [file]	sagan.log location [default: /var/log/sagan/sagan.log].
-Q, --quiet		Run Sagan in 'quiet' mode (no console output)
-t, --threads [#]	Set number of threads to use (overrides the sagan.yaml).
-r, --rules [file]	Use a specific rule set (overrides the sagan.yaml).
-T, --test		Test configuration (sagan.yaml) and rule sets.

* libesmtp (SMTP) support is included.
* liblognorm (log normalization) support is included.
* PLOG (syslog sniffer) support is included.
* libfastjson support is included.
* Maxmind GeoIP support is included.
* Syslog output is included.
* Using PCRE JIT.
* Using HiRedis/Redis.

* Compiled on Mar 15 2023 at 14:40:47.

Expected Behavior

Attempting to utilize only JSON for log ingestion should result in JSON being parsed correctly with the following (abbreviated) settings:

sagan-core:

  # Settings used by the Sagan's core engine.

  core:

    sensor-name: "default_sensor_name"   # Unique name for this sensor (no spaces)
    cluster-name: "default_cluster_name" # Cluster name (no spaces)
    default-host: 192.168.2.1
    default-port: 514
    default-proto: udp
    dns-warnings: disabled
    source-lookup: disabled		
    fifo-size: 1048576		# System must support F_GETPIPE_SZ/F_SETPIPE_SZ. 
    classification: "$RULE_PATH/classification.config"
    reference: "$RULE_PATH/reference.config"
    gen-msg-map: "$RULE_PATH/gen-msg.map"
    protocol-map: "$RULE_PATH/protocol.map"
    chown-fifo: yes             # Change ownership of FIFO to the "sagan" user.
    xbit-storage: mmap          # "redis" or "mmap"
    json-message-map: "$RULE_PATH/json-message-CC.map"

    max-threads: 100            # Increase this if the system is under heavy load.
    batch-size: 10
    input-type: json                       # pipe or json
    json-map: "$RULE_PATH/json-input.map"  # mapping file if input-type: json
    json-software: nxlog          # by "software" type. 

    json-message-map: "$RULE_PATH/json-message.map"
    json-parse-data: enabled

Using the following rule should result in an alert for the included log:

alert any any any -> any any (msg: "[CUSTOM] RDP Login to Monitored Server [1/2]"; content: "4624"; classtype: suspicious-login; sid: 12345678; rev:1;)

Logs:

{"WorkstationName":"wharding","VirtualAccount":"%%1843","Version":"2","TransmittedServices":"-","TaskValue":"12544","TargetUserSid":"S-1-5-21-3332521084-1015784887-692711768-60568","TargetUserName":"wharding-user","TargetOutboundUserName":"-","TargetOutboundDomainName":"-","TargetLogonId":"0x21fd1ee80","TargetLinkedLogonId":"0x21fd1eee0","TargetDomainName":"WHARDING.LOCAL","SubjectUserSid":"S-1-5-18","SubjectUserName":"wharding$","SubjectLogonId":"0x3e7","SubjectDomainName":"WHARDING.LOCAL","SourceName":"Security","RestrictedAdminMode":"%%1843","RecordNumber":"1367595664","ProviderGuid":"{54849625-5478-4994-A5BA-3E3B0328C30D}","ProcessName":"C:\\Windows\\System32\\svchost.exe","ProcessId":"0x940","PROGRAM":"Security","PID":"0x940","OpcodeValue":"0","Opcode":"Info","Message":"4624: An account was successfully logged on. Subject: Security ID: S-1-5-18 Account Name: WHARDING$ Account Domain: WHARDING.LOCAL Logon ID: 0x3E7 Logon Information: Logon Type: 10 Restricted Admin Mode: No Virtual Account: No Elevated Token: Yes Impersonation Level: Impersonation New Logon: Security ID: S-1-5-21-3332521084-1015784887-692711768-60568 Account Name: wharding Account Domain: WHARDING.LOCAL Logon ID: 0x21FD1EE80 Linked Logon ID: 0x21FD1EEE0 Network Account Name: - Network Account Domain: - Logon GUID: {5809fc30-4386-f83b-2ec7-c0261664ae97} Process Information: Process ID: 0x940 Process Name: C:\\Windows\\System32\\svchost.exe Network Information: Workstation Name: wharding Source Network Address: 192.168.2.2 Source Port: 0 Detailed Authentication Information: Logon Process: User32 Authentication Package: Negotiate Transited Services: - Package Name (NTLM only): - Key Length: 0 This event is generated when a logon session is created. It is generated on the computer that was accessed. The subject fields indicate the account on the local system which requested the logon. This is most commonly a service such as the Server service, or a local process such as Winlogon.exe or Services.exe. The logon type field indicates the kind of logon that occurred. The most common types are 2 (interactive) and 3 (network). The New Logon fields indicate the account for whom the new logon was created, i.e. the account that was logged on. The network fields indicate where a remote logon request originated. Workstation name is not always available and may be left blank in some cases. The impersonation level field indicates the extent to which a process in the logon session can impersonate. The authentication information fields provide detailed information about this specific logon request. - Logon GUID is a unique identifier that can be used to correlate this event with a KDC event. - Transited services indicate which intermediate services have participated in this logon request. - Package name indicates which sub-protocol was used among the NTLM protocols. - Key length indicates the length of the generated session key. This will be 0 if no session key was requested.","LogonType":"10","LogonProcessName":"User32 ","LogonGuid":"{5809fc30-4386-f83b-2ec7-c0261664ae97}","LmPackageName":"-","Keywords":"9232379236109516800","KeyLength":"0","IpPort":"0","IpAddress":"172.16.64.4","ImpersonationLevel":"%%1833","Hostname":"wharding.localhost","HOST_IP":"192.168.2.1","ExecutionThreadID":"2272","ExecutionProcessID":"832","EventTime":"2023-03-14T23:44:06+00:00","EventReceivedTime":"2023-03-14T23:44:07+00:00","EventID":"4624","ElevatedToken":"%%1842","Channel":"Security","Category":"Logon","AuthenticationPackageName":"Negotiate","ActivityID":"{C93111C1-447B-0002-F611-31C97B44D901}"}

Observed Behavior

No alerts are generated from the log (duplicated the log in my local env):

[*]  ,-._,-.  -[ Sagan Version 2.1.0 - Engine Statistics ]-
[*]  \/)"(\/
[*]   (_o_)    Received/Processed/Ignored : 2/0/0 (0.000%/0.000%)
[*]   /   \/)  Signatures matched         : 0 (0.000%)
[*]  (|| ||)   Bytes Processed            : 7251
[*]   oo-oo    Largest log size (bytes)   : 3626 bytes [0 over]
[*]            Alerts                     : 0 (0.000%)
[*]            After                      : 0 (0.000%)
[*]            Threshold                  : 0 (0.000%)
[*]            Dropped                    : 0 (0.000%)
[*]            Thread Exhaustion          : 0 (0.000%)
[*]            Thread Usage               : 0/0/100 (0.000%)
[*]            JSON detected in logs      : 0 (0.000%)
[*]            Uptime                     : 0 days, 0 hours, 0 minutes, 0 seconds.
[*]            Avg. events per/second     : 0
[*] 
[*]           -[ Sagan Malformed Data Statistics ]-
[*] 
[*]            Host                       : 0 (0.000%)
[*]            Facility                   : 0 (0.000%)
[*]            Priority                   : 0 (0.000%)
[*]            Level                      : 0 (0.000%)
[*]            Tag                        : 0 (0.000%)
[*]            Date                       : 0 (0.000%)
[*]            Time                       : 0 (0.000%)
[*]            Program                    : 0 (0.000%)
[*]            Message                    : 0 (0.000%)
[*]            Null Messages              : 0 (0.000%)
[*]            JSON detected in logs      : 0 (0.000%)
[*] 
[*]           -[ Sagan Processor Statistics ]-
[*] 
[*]            Dropped                    : 0 (0.000%)
[*] 
[*]           -[ Sagan follow_flow Statistics ]-
[*] 
[*]            Total                      : 0
[*]            Dropped                    : 0 (0.000%)
[*] -------------------------------------------------------------------------------

Changing the batch-size to 1 from 10 results in two successful alerts.

[*]  ,-._,-.  -[ Sagan Version 2.1.0 - Engine Statistics ]-
[*]  \/)"(\/
[*]   (_o_)    Received/Processed/Ignored : 2/2/0 (100.000%/0.000%)
[*]   /   \/)  Signatures matched         : 2 (100.000%)
[*]  (|| ||)   Bytes Processed            : 7251
[*]   oo-oo    Largest log size (bytes)   : 3626 bytes [0 over]
[*]            Alerts                     : 2 (100.000%)
[*]            After                      : 0 (0.000%)
[*]            Threshold                  : 0 (0.000%)
[*]            Dropped                    : 0 (0.000%)
[*]            Thread Exhaustion          : 0 (0.000%)
[*]            Thread Usage               : 0/2/100 (0.000%)
[*]            JSON detected in logs      : 0 (0.000%)
[*]            Uptime                     : 0 days, 0 hours, 0 minutes, 1 seconds.
[*]            Avg. events per/second     : 2
[*] 
[*]           -[ Sagan Malformed Data Statistics ]-
[*] 
[*]            Host                       : 0 (0.000%)
[*]            Facility                   : 0 (0.000%)
[*]            Priority                   : 0 (0.000%)
[*]            Level                      : 0 (0.000%)
[*]            Tag                        : 0 (0.000%)
[*]            Date                       : 0 (0.000%)
[*]            Time                       : 0 (0.000%)
[*]            Program                    : 0 (0.000%)
[*]            Message                    : 0 (0.000%)
[*]            Null Messages              : 0 (0.000%)
[*]            JSON detected in logs      : 0 (0.000%)
[*] 
[*]           -[ Sagan Processor Statistics ]-
[*] 
[*]            Dropped                    : 0 (0.000%)
[*] 
[*]           -[ Sagan follow_flow Statistics ]-
[*] 
[*]            Total                      : 0
[*]            Dropped                    : 0 (0.000%)
[*] -------------------------------------------------------------------------------

From the docs it is not clear why this would have an affect: batch-size

@quadrantsec
Copy link
Collaborator

As it stands, the batch size has to be met before it is handed to the processor. If the batch size is set to 10 and you send it two logs, they will sit in the queue waiting for another 8 logs to show up. What is happening, you're sending 2 logs with a batch size of 10 and then control-c'ing (i assume) the program. Since the batch size isn't met, the logs are discarded.

I've run into this myself. I'll have the batch size set to 50 and my rules won't work. I'll be cussing at the keyboard, then realize what is happening.

I can look into forcing the logs to be processed when Sagan gets an interrupt. At least then it might be less confusing.

@quadrantsec
Copy link
Collaborator

Another note: The batch size should only be altered if you're environment is really busy (10k+ EPS). On your test system, always leave the batch size at 1 or you'll find yourself banging your head on the keyboard.

In super busy environments, the batch size can make a big difference in CPU usage.

@wrharding
Copy link
Contributor Author

Sagan is run like so and completes on it's own, but it sounds like your points are still applicable since only 2 logs are ever ingested:

sagan -u root -F test.log

Note, Sagan is run as root only because I'm bad at containers and I'm not sure it's a good idea to make different users within an a container (if it's even possible).

Thank you for the clarification! Would it be possible to make note of this in the documentation? As it stands the verbiage makes it sound like you should run anything other than a batch-size of 1.

@quadrantsec
Copy link
Collaborator

I opened a ticket for documenting this.

I also looked at the code to see if I can change the behavior. It doens't look very straight forward to me but I want to think about it for a day or two. Initially I thought I saw an easy fix but that creates a race condition and the logs still dont get processed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant