
Log analysis for activity detection

This is a Python script that reads SSH and web server log files and flags patterns associated with common attack behaviour: brute-force login attempts, port scans, and probes against URLs that should not be accessible.

The point is not to replace a real SIEM. It is to automate the first ten minutes of looking at a log when something feels off, the part where you grep for failed logins and source IPs and stare at the patterns. That work is repetitive and easy to script.

What it handles

  • .log input from OpenSSH (auth.log) and Apache or Nginx (access.log)
  • Failed SSH logins, repeated requests from the same IP, sequences that look like port scans, requests against known-sensitive paths
  • Output aggregated by source IP and rule type
  • A regex-based rule table that is straightforward to extend
  • CLI invocation, suitable for cron or piping into anything else
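As a rough illustration of the aggregation step, the sketch below counts matched events per source IP and rule, then lists the IPs that cross a threshold. The event shape (a dict with "ip" and "rule" keys), the function names, and the default threshold of 5 are assumptions made for the example, not taken from the script itself.

```python
from collections import Counter


def aggregate(events):
    """Count events per (source IP, rule name) pair.

    Assumes each event is a dict with "ip" and "rule" keys (hypothetical shape).
    Events without a source IP are grouped under "unknown".
    """
    counts = Counter()
    for event in events:
        key = (event.get("ip") or "unknown", event["rule"])
        counts[key] += 1
    return counts


def repeated_offenders(counts, rule, threshold=5):
    """Return the sorted IPs that triggered `rule` at least `threshold` times."""
    return sorted(ip for (ip, name), n in counts.items()
                  if name == rule and n >= threshold)


if __name__ == "__main__":
    sample = [{"ip": "192.168.1.45", "rule": "ssh_failed_login"}] * 5
    sample.append({"ip": "10.0.0.1", "rule": "ssh_failed_login"})
    print(repeated_offenders(aggregate(sample), "ssh_failed_login"))
```

Keying the counter on the (IP, rule) pair keeps one noisy rule from inflating the totals of another.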

How it works

The script reads the file line by line and applies a list of regex rules. Each rule is tagged with a threat type and a severity. Matched events are stored with timestamp, source IP (when available), and the rule that fired. Results print to stdout and optionally to a file.
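Storing a timestamp per event means handling two formats: classic syslog lines from auth.log, which carry no year, and the bracketed timestamp in Apache or Nginx access logs. The sketch below shows one way to parse both. The function names are hypothetical, the year for syslog lines has to be supplied by the caller, and month names are assumed to be in English (the `%b` directive is locale-dependent).

```python
import re
from datetime import datetime

# Matches the bracketed timestamp in a combined-format access log line.
ACCESS_TS = re.compile(r"\[([^\]]+)\]")


def parse_syslog_ts(line, year):
    """Parse the 15-character 'Jun  1 21:31:44' prefix of a syslog line.

    Classic syslog timestamps omit the year, so the caller passes it in.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def parse_access_ts(line):
    """Parse '[01/Jun/2025:21:34:09 +0000]' from an access log line, or return None."""
    match = ACCESS_TS.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
```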

Rules live in a single dictionary at the top of the file, so adding a new detection means adding one entry with its regex.
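A minimal sketch of what such a rule table could look like follows. The rule names, regexes, threat labels, and severities are illustrative assumptions rather than the script's actual rules, and the two patterns cover only the common OpenSSH "Failed password" message and a handful of sensitive paths.

```python
import re
from typing import NamedTuple

IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"


class Rule(NamedTuple):
    pattern: re.Pattern
    threat: str
    severity: str


# Hypothetical rule table: one entry per detection.
RULES = {
    "ssh_failed_login": Rule(
        re.compile(rf"Failed password for (?:invalid user )?\S+ from (?P<ip>{IPV4})"),
        "brute-force",
        "WARNING",
    ),
    "sensitive_path": Rule(
        re.compile(
            rf'^(?P<ip>{IPV4}) .*"(?:GET|POST|HEAD) '
            r"(?P<path>/(?:admin|\.env|wp-login\.php)\S*)"
        ),
        "restricted-url",
        "ALERT",
    ),
}


def match_line(line):
    """Return one event dict for every rule that fires on this line."""
    events = []
    for name, rule in RULES.items():
        match = rule.pattern.search(line)
        if match:
            events.append({
                "rule": name,
                "threat": rule.threat,
                "severity": rule.severity,
                "ip": match.groupdict().get("ip"),  # None if the rule captures no IP
            })
    return events
```

With this layout, a new detection is one more key in `RULES`; the matching loop does not change.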

Example output

[2025-06-01 21:31:44] WARNING - Repeated failed SSH login from 192.168.1.45
[2025-06-01 21:32:05] INFO    - Possible port scan detected from 103.21.244.0
[2025-06-01 21:34:09] ALERT   - Attempted access to /admin (403 Forbidden) from 45.67.89.123
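Lines in this layout can be produced with a single format string. The helper below is a hypothetical sketch, not the script's own formatter; it pads the severity to seven characters so the dashes line up as in the sample above.

```python
from datetime import datetime


def format_event(timestamp, severity, message):
    """Render one report line: '[YYYY-MM-DD HH:MM:SS] SEVERITY - message'."""
    # Severity is left-aligned and padded to 7 characters (the width of "WARNING").
    return f"[{timestamp:%Y-%m-%d %H:%M:%S}] {severity:<7} - {message}"


if __name__ == "__main__":
    print(format_event(datetime(2025, 6, 1, 21, 32, 5), "INFO",
                       "Possible port scan detected from 103.21.244.0"))
```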

Usage

python log_analyzer.py --input /path/to/file.log --output report.txt

Flags:

  • --input: path to the log file
  • --output: write results to a file instead of stdout
  • --verbose: include the surrounding lines for each match
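These three flags map directly onto the standard library's argparse. The sketch below is one plausible wiring, not the script's actual code; treating --input as required and --verbose as a boolean switch are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the three flags described above."""
    parser = argparse.ArgumentParser(
        prog="log_analyzer.py",
        description="Flag suspicious patterns in SSH and web server logs.",
    )
    # Assumption: the input file is mandatory.
    parser.add_argument("--input", required=True, help="path to the log file")
    parser.add_argument("--output", help="write results to this file")
    parser.add_argument("--verbose", action="store_true",
                        help="include the surrounding lines for each match")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.input, args.output, args.verbose)
```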

What I would add next

  • GeoIP enrichment so the source country is visible on flagged IPs
  • Optional firewall integration, so a flagged IP can be pushed straight to a deny list
  • A small web view for browsing results, mostly for convenience
  • Configurable thresholds per rule, so a quiet environment does not generate the same severity as a busy one

The current version covers enough to be useful for small environments and to act as a base for anything more ambitious. Real SIEMs do all of this and more, but they also cost money and require infrastructure. For a homelab or a single VPS, a short Python script is a reasonable starting point.

This post is licensed under CC BY 4.0 by the author.