How to Use the Command 'strace' (with Examples)

Linux
December 17, 2024

‘strace’ is a diagnostic and debugging utility for Linux that traces the system calls a process makes and the signals it receives. System calls form the interface between an application process and the Linux kernel, so ‘strace’ shows exactly how a binary interacts with the kernel. With it, developers and system administrators can pinpoint performance bottlenecks, debug tricky bugs, and gain a deeper understanding of how applications interact with the system.

Start tracing a specific process by its PID

Code:

strace -p pid

Motivation:

Tracing a running process using its Process ID (PID) is essential for debugging and performance analysis in real-time, especially for processes that are long-running or experiencing unexpected behavior. Identifying the exact system calls and their sequence helps diagnose application problems without having to stop the service.

Explanation:

-p pid: This flag indicates that ‘strace’ should attach to an already running process, identified by its PID. The tool then begins logging the system calls made by this process dynamically, allowing developers to troubleshoot live issues without having to restart the process.

Example output:

open("/etc/ld.so.cache", O_RDONLY) = 3
fstat(3, {...}) = 0
mmap(NULL, 2116, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f4fd4590000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
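
In practice the PID usually has to be looked up first. The following is a minimal sketch, assuming pgrep (from procps) is installed; trace_by_name is a hypothetical helper name, not part of ‘strace’. Attaching generally requires that you own the target process or run as root.

```shell
# Hypothetical helper: attach strace to the oldest process whose name
# exactly matches the first argument.
trace_by_name() {
    # pgrep -o picks the oldest match, -x requires an exact name match.
    pid=$(pgrep -o -x "$1") || {
        echo "no running process named '$1'" >&2
        return 1
    }
    strace -p "$pid"
}
```

Calling trace_by_name with a process name attaches to that process; pressing Ctrl-C detaches ‘strace’ and leaves the process running.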

Trace a process and filter output by system call

Code:

strace -p pid -e open,close,read

Motivation:

Filtering system calls lets you focus on a specific area of interest, especially in situations where only certain behaviors, such as file manipulations, need scrutiny. This reduces the noise in strace’s output and makes analysis significantly easier when diagnosing issues like file I/O problems.

Explanation:

-p pid: Attaches ‘strace’ to the process identified by its PID.
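-e open,close,read: The -e flag specifies an expression that filters the output to include only the listed system calls, in this case open, close, and read. On current Linux systems many programs open files through openat rather than open, so it is often worth adding openat to the list.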

Example output:

open("/var/log/syslog", O_RDONLY) = 5
read(5, "Jul 31 10:15:23 machine start[338"..., 4096) = 4096
close(5) = 0

Count time, calls, and errors for each system call and report a summary on program exit

Code:

strace -p pid -c

Motivation:

Collecting system call statistics provides a snapshot of a process’s behavior. Knowing how often each system call is made, how much time it takes, and how often it fails helps identify both common and problematic areas at a glance, supporting performance tuning or debugging tasks.

Explanation:

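-p pid: Attaches ‘strace’ to the existing process with the given PID.
-c: Instructs ‘strace’ to collect system call statistics instead of printing each call, and to print a summary with the number of calls, the number of errors, and the total time taken by each system call. The summary is printed when ‘strace’ exits, that is, when the traced process ends or you detach with Ctrl-C.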

Example output:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 81.34    0.001234          88        14           write
 10.88    0.000165          55         3        3 open
  7.78    0.000118          59         2           read
------ ----------- ----------- --------- --------- ----------------
100.00    0.001517                    19        3 total
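
For long summaries it can help to pull out only the system calls that reported errors. The following is a minimal sketch that reads a summary in the format above from standard input; failing_syscalls is a hypothetical helper name, and the embedded summary is sample data for illustration.

```shell
# Hypothetical helper: read an `strace -c` summary on stdin and list the
# system calls that reported at least one error.
failing_syscalls() {
    awk '
        # Data rows with a filled-in errors column have six fields:
        # % time, seconds, usecs/call, calls, errors, syscall.
        NF == 6 && $1 ~ /^[0-9.]+$/ {
            printf "%s: %d of %d calls failed\n", $6, $5, $4
        }
    '
}

# Sample summary for illustration; real input would come from strace.
failing_syscalls <<'EOF'
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 81.34    0.001234          88        14           write
 10.88    0.000165          55         3        3 open
  7.78    0.000118          59         2           read
------ ----------- ----------- --------- --------- ----------------
100.00    0.001517                    19        3 total
EOF
```

For the sample data this prints "open: 3 of 3 calls failed". To feed it a real summary, the summary can be saved with the -o option (for example strace -c -o summary.txt -p pid) and then redirected into the function.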

Show the time spent in every system call and specify the maximum string size to print

Code:

strace -p pid -T -s 32

Motivation:

Knowing how much time is spent in each system call can highlight performance issues within processes. Additionally, tailoring the string output size allows users to see more or less of the data involved in system calls, aiding in gathering useful, succinct information for debugging or performance assessment.

Explanation:

-p pid: Traces an existing process specified by its PID.
-T: Appends the time spent in each system call, in seconds, inside angle brackets at the end of the line. This is a duration, not a timestamp.
-s 32: Sets the maximum length of strings printed in system call arguments to 32 characters, which is also the default. A larger value shows more of the data passed to calls such as read and write; a smaller value keeps the output concise.

Example output:

open("/path/to/file", O_RDONLY) = 3 <0.000012>
read(3, "This is a sample", 4096) = 16 <0.000023>
close(3) = 0 <0.000011>
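
When a trace is long, sorting by the duration suffix makes the slowest calls easy to spot. The following is a minimal sketch that reads -T output from standard input; slowest_calls is a hypothetical helper name, and the embedded trace is sample data for illustration.

```shell
# Hypothetical helper: read `strace -T` output on stdin and print the N
# slowest calls (default 5), longest first, each prefixed by its duration.
slowest_calls() {
    awk '
        # The duration is the <seconds> suffix that -T adds to each line.
        match($0, /<[0-9]+\.[0-9]+>$/) {
            print substr($0, RSTART + 1, RLENGTH - 2), $0
        }
    ' | LC_ALL=C sort -rn | head -n "${1:-5}"
}

# Sample trace for illustration; real input would come from strace.
slowest_calls 2 <<'EOF'
open("/path/to/file", O_RDONLY) = 3 <0.000012>
read(3, "This is a sample", 4096) = 16 <0.000023>
close(3) = 0 <0.000011>
EOF
```

For the sample data the read call is listed first, followed by the open call. Lines without a duration suffix, such as signal notifications, are skipped.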

Start tracing a program by executing it

Code:

strace program

Motivation:

Launching a program under ‘strace’ from the start is invaluable when debugging startup issues, or when you do not yet know at which point in its execution the program misbehaves. Tracing from beginning to end gives full visibility into the program’s system interactions from launch to exit.

Explanation:

program: The executable command or script to launch. ‘strace’ traces the process from its very first system call (the execve that starts the program), so the log covers its entire runtime.

Example output:

execve("/usr/bin/program", ["program"], [/* 30 vars */]) = 0
brk(NULL) = 0x56213c8c4000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)

Start tracing file operations of a program

Code:

strace -e trace=file program

Motivation:

When diagnosing issues related to file interactions, such as access permissions or missing files, capturing only file-related system calls reduces complexity and narrows the focus to the file operations themselves, leaving out irrelevant traces.

Explanation:

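-e trace=file: Configures ‘strace’ to log only system calls that take a file name as an argument, such as open, stat, chmod, and unlink. Calls that operate on an already open file descriptor, such as read, write, and close, are not included.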
program: The command for the program that will be traced for file operations.

Example output:

open("/some/file/path", O_RDONLY) = 3
stat("/some/file/path", {st_mode=S_IFREG|0644, st_size=20, ...}) = 0
access("/some/other/path", R_OK) = -1 ENOENT (No such file or directory)
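
A common use of file tracing is finding paths that a program looked for but did not find. The following is a minimal sketch that reads a file trace from standard input and lists each path that failed with ENOENT; missing_files is a hypothetical helper name, and the embedded trace is sample data for illustration.

```shell
# Hypothetical helper: read strace output on stdin and print each path
# whose system call failed with ENOENT, once per path.
missing_files() {
    # Take the first quoted string on the line as the path. This matches
    # calls such as open, openat, access, and stat.
    sed -n 's/^[^"]*"\([^"]*\)".* = -1 ENOENT .*/\1/p' | LC_ALL=C sort -u
}

# Sample trace for illustration; real input would come from strace.
missing_files <<'EOF'
open("/some/file/path", O_RDONLY) = 3
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/some/missing/file", O_RDONLY) = -1 ENOENT (No such file or directory)
EOF
```

For the sample data this prints /etc/ld.so.preload and /some/missing/file. Many ENOENT results are harmless, for example when a program probes several optional configuration paths, so the list is a starting point rather than a list of errors.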

Start tracing network operations of a program as well as all its forked and child processes, saving the output to a file

Code:

strace -f -e trace=network -o trace.txt program

Motivation:

Networking issues, such as unexpected delays or miscommunications, require visibility into network system calls across the main and subsidiary processes. Capturing these network interactions, including sockets and related operations, often reveals the crux of networking problems. Persisting these traces to a file facilitates comprehensive analysis over a longer duration.

Explanation:

-f: Also traces child processes created by the traced program through fork, vfork, or clone.
-e trace=network: Filters the output to include only network-related system calls, such as socket, connect, sendto, and recvfrom.
-o trace.txt: Writes the trace to ‘trace.txt’ instead of standard error, preserving it for later review.
program: The command to run and trace.

Example output:

(Contents of trace.txt)

socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("192.168.1.1")}, 16) = 0
send(3, "GET / HTTP/1.1\r\nHost: example.co"..., 35, 0) = 35
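
Once the trace is saved, a quick first question is which remote endpoints the program tried to reach. The following is a minimal sketch that reads a network trace from standard input and lists the IPv4 address and port of each connect call; connect_targets is a hypothetical helper name, the embedded trace is sample data for illustration, and IPv6 or Unix-domain sockets are not handled.

```shell
# Hypothetical helper: read strace output on stdin and print the unique
# IPv4 "address:port" pairs passed to connect().
connect_targets() {
    # A leading PID column, if present, is skipped by the initial ".*".
    sed -n 's/.*connect(.*sin_port=htons(\([0-9]*\)), sin_addr=inet_addr("\([0-9.]*\)").*/\2:\1/p' |
        LC_ALL=C sort -u
}

# Sample trace for illustration; real input would come from trace.txt.
connect_targets <<'EOF'
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("192.168.1.1")}, 16) = 0
EOF
```

For the sample data this prints 192.168.1.1:80. With a real trace the function would be run as connect_targets < trace.txt.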

Conclusion:

Throughout these examples, ‘strace’ showcases its utility as a versatile and essential tool for system call tracing, enabling improved debugging, performance tuning, and forensic analysis by providing detailed insight into program and process behavior. By using ‘strace’ effectively in these use cases, developers and system administrators can attain a deeper understanding of how software interfaces with the operating system, thereby enhancing reliability and performance.
