Log Investigation Techniques

Good log investigation is about narrowing the problem fast: identify the time window, find the failing request, extract the stack trace, remove noise, then group repeating patterns.

Core Workflow

Start with the time window and service name.
Search for ERROR, Exception, request ID, transaction ID, username or endpoint.
Use context lines around the match to understand the sequence.
Summarize duplicate failures to find the main issue.
Correlate with service restarts, deployments, traffic spikes or database/network events.
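The first steps of this workflow can be sketched as a small pipeline. This is a minimal illustration, not a fixed recipe: the sample log, the `YYYY-MM-DD HH:MM:SS` timestamp prefix and the `requestId=` field are all assumptions, so adjust the offsets and patterns to your own log layout.

```shell
# Hypothetical sample log; the timestamp layout and requestId field are assumptions.
log=$(mktemp)
cat > "$log" <<'EOF'
2026-03-08 10:00:01 INFO  requestId=abc123 GET /orders
2026-03-08 10:00:02 ERROR requestId=abc123 java.lang.NullPointerException
2026-03-08 10:00:02 ERROR requestId=abc123     at com.example.OrderService.load
2026-03-08 10:05:00 INFO  requestId=def456 GET /health
2026-03-08 11:00:00 ERROR requestId=zzz999 timeout
EOF

# Step 1: keep only the time window. Plain string comparison is enough
# because timestamps in this layout sort lexically.
window=$(awk -v from='2026-03-08 10:00:00' -v to='2026-03-08 10:30:00' \
  '{ ts = substr($0, 1, 19) } ts >= from && ts <= to' "$log")

# Step 2: find the first failing line inside that window.
first_error=$(printf '%s\n' "$window" | grep -m 1 'ERROR')

# Step 3: pull the request ID out of that line and show the request's full trail.
req=$(printf '%s\n' "$first_error" | grep -o 'requestId=[^ ]*')
trail=$(printf '%s\n' "$window" | grep -F "$req")
printf '%s\n' "$trail"
```

Filtering by time first keeps the later searches small, which matters more as the log grows.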

Most Useful Commands

grep -n 'ERROR' app.log
grep -n 'Exception' app.log
grep -A 20 -B 5 'NullPointerException' app.log
tail -f app.log | grep -E 'ERROR|WARN'
sed -n '1200,1260p' app.log
awk '/ERROR|WARN|Exception/' app.log
zgrep -n 'ERROR' app.log.gz

Practical Patterns

1. Search huge logs safely

Page or stream the file rather than loading it whole into an editor.

less big.log
grep -n 'OutOfMemoryError' big.log

2. Show the stack trace area

grep -n -A 30 -B 5 'Exception' app.log
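A fixed number of context lines either cuts a long trace short or drags in unrelated lines. As an alternative, the sketch below prints only the exception line and the frames that follow it. It assumes Java-style traces (indented `at ...` frames and `Caused by:` lines); the sample log is hypothetical.

```shell
# Hypothetical sample log with one Java-style stack trace.
log=$(mktemp)
cat > "$log" <<'EOF'
2026-03-08 10:00:02 ERROR Request failed
java.lang.NullPointerException: order is null
    at com.example.OrderService.load(OrderService.java:42)
    at com.example.OrderController.get(OrderController.java:17)
Caused by: java.lang.IllegalStateException: cache miss
    at com.example.Cache.get(Cache.java:9)
2026-03-08 10:00:03 INFO  Next request
EOF

# Start at a line that names an exception, keep printing while the following
# lines look like stack frames, and stop at the next ordinary log line.
trace=$(awk '
  /^[ \t]+at / || /^Caused by:/ { if (in_trace) print; next }
  /Exception/                   { in_trace = 1; print; next }
                                { in_trace = 0 }
' "$log")
printf '%s\n' "$trace"
```

Other runtimes format their traces differently, so the two frame patterns are the part to adapt.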

3. Count repeated errors

# Groups only identical lines; strip timestamps or IDs first if every line is unique.
grep 'ERROR' app.log | sort | uniq -c | sort -nr | head
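When every line starts with its own timestamp, `sort | uniq -c` treats each line as distinct and every count comes out as 1. A minimal sketch of stripping that prefix before counting follows; it assumes a `YYYY-MM-DD HH:MM:SS ` prefix, and the sample log is hypothetical.

```shell
# Hypothetical sample log; every line has a different timestamp.
log=$(mktemp)
cat > "$log" <<'EOF'
2026-03-08 10:00:01 ERROR Connection refused to db-1
2026-03-08 10:00:05 ERROR Connection refused to db-1
2026-03-08 10:00:09 ERROR Timeout after 3000 ms
2026-03-08 10:00:12 INFO  Health check ok
2026-03-08 10:00:15 ERROR Connection refused to db-1
EOF

# Drop the leading date and time so identical messages compare equal,
# then count and rank them.
counts=$(grep 'ERROR' "$log" \
  | sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:]{8} //' \
  | sort | uniq -c | sort -nr | head)
printf '%s\n' "$counts"
```

The same idea extends to other variable fields such as request IDs or durations: replace them with a fixed placeholder in the `sed` step before sorting.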

4. Focus on one request or correlation ID

grep 'requestId=abc123' app.log

5. Investigate compressed archives

zgrep -n 'timeout' app.log.2.gz
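An incident often spans the live log and one or more rotated archives. The sketch below searches every generation in one pass and labels each hit with its file. `search_all` is a hypothetical helper name, and the `app.log*` naming scheme and sample contents are assumptions.

```shell
# Hypothetical log directory: one live file and one gzip-compressed rotation.
dir=$(mktemp -d)
printf '%s\n' '2026-03-08 09:00:00 ERROR timeout calling payments' > "$dir/app.log"
printf '%s\n' '2026-03-07 23:59:00 ERROR timeout calling inventory' > "$dir/app.log.1"
gzip "$dir/app.log.1"    # leaves app.log.1.gz in place of app.log.1

# search_all PATTERN FILE...
# Decompress only the .gz files, grep each stream, and prefix every hit
# with the name of the file it came from.
search_all() {
  pattern=$1; shift
  for f in "$@"; do
    case "$f" in
      *.gz) gzip -dc "$f" ;;
      *)    cat "$f" ;;
    esac | grep -- "$pattern" | sed "s|^|${f##*/}: |"
  done
}

hits=$(search_all 'timeout' "$dir"/app.log*)
printf '%s\n' "$hits"
```

Streaming through `gzip -dc` avoids unpacking archives to disk, which helps when the log volume is nearly full.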

6. See only today's lines

grep "$(date +%F)" app.log    # date +%F prints today as YYYY-MM-DD, e.g. 2026-03-08

What to Look For

First error before the cascade of follow-up errors
Retry loops, timeouts and connection refused messages
Memory or GC pressure before slow responses
Restart events before or after the failure window
Common thread names, request IDs or host names
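Two of these signals are easy to surface mechanically: the earliest error, and the point where the error rate jumps. The sketch below is one way to get both, assuming a `YYYY-MM-DD HH:MM:SS` timestamp prefix; the sample log is hypothetical.

```shell
# Hypothetical sample log: one root failure followed by a cascade.
log=$(mktemp)
cat > "$log" <<'EOF'
2026-03-08 10:00:58 WARN  pool exhausted, retrying
2026-03-08 10:01:02 ERROR Connection refused to db-1
2026-03-08 10:01:03 ERROR OrderService failed: no connection
2026-03-08 10:01:04 ERROR CartService failed: no connection
2026-03-08 10:02:10 ERROR OrderService failed: no connection
EOF

# The earliest ERROR (with its line number) is usually closer to the
# root cause than the follow-up errors.
first=$(grep -n -m 1 'ERROR' "$log")
printf '%s\n' "$first"

# Errors per minute: characters 1-16 are "YYYY-MM-DD HH:MM" in this layout.
per_minute=$(grep 'ERROR' "$log" | cut -c1-16 | sort | uniq -c)
printf '%s\n' "$per_minute"
```

With the line number from the first command, `sed -n` can then print the lines just before it, which is where warnings such as retries or pool exhaustion tend to appear.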