How to Use the Command 'perf' (with examples)
- Linux
- December 17, 2024
The perf command in Linux is a powerful tool that helps developers and system administrators understand the performance of processes running on a Linux system. It provides a flexible interface for collecting and displaying performance monitoring data from the Linux kernel, which is invaluable when diagnosing bottlenecks, tuning applications, and optimizing system performance. The perf tool suite covers a wide variety of tasks, from counting CPU events to profiling function calls within complex software stacks. This article walks through several practical examples of the perf command.
Use case 1: Display Basic Performance Counter Stats for a Command
Code:
perf stat gcc hello.c
Motivation:
When developing software, it helps to understand the performance characteristics of your compilation process. With perf stat, developers can gather statistics such as CPU cycles, instructions per cycle (IPC), and branch misses for a single run of a command. This gives a quick overview of execution metrics that can guide decisions about compiler options and resource use.
Explanation:
- perf stat: Invokes the stat subcommand of perf, which collects and displays performance counter statistics.
- gcc: Calls the GNU Compiler Collection, an open-source compiler for C, C++, and other languages.
- hello.c: The source file being compiled, here a C program; substitute whatever file you are working with.
Example Output:
Performance counter stats for 'gcc hello.c':
150.221935 task-clock (msec) # 0.843 CPUs utilized
2837 context-switches # 0.019 M/sec
4 cpu-migrations # 0.027 K/sec
513 page-faults # 3.418 K/sec
5102563150 cycles # 3.396 GHz
8283035643 instructions # 1.62 insn per cycle
1310891026 branches # 8725.076 M/sec
63473248 branch-misses # 6.92% of all branches
0.178179165 seconds time elapsed
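If you want to post-process the counters rather than read them by eye, perf stat can emit CSV with the -x option. The following is a minimal sketch that is not part of the original example: ipc_from_csv is a hypothetical helper name, stats.csv is an arbitrary file name, and the assumed column layout (counter value first, event name third) may vary between perf versions, so check your own output before relying on it.

```shell
# Compute instructions per cycle (IPC) from `perf stat -x,` CSV on stdin.
# Assumption: field 1 is the counter value and field 3 is the event name.
ipc_from_csv() {
    awk -F, '
        $3 ~ /^cycles/       { cycles = $1 }
        $3 ~ /^instructions/ { insns  = $1 }
        END {
            if (cycles > 0) {
                printf "%.2f\n", insns / cycles
            } else {
                print "no cycles counter found" > "/dev/stderr"
                exit 1
            }
        }'
}

# Usage (assumes perf and gcc are installed):
#   perf stat -x, -e cycles,instructions -o stats.csv gcc hello.c
#   ipc_from_csv < stats.csv
```

Comment lines and blank lines that perf writes to the output file have no third field, so the helper simply skips them.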
Use case 2: Display System-Wide Real-Time Performance Counter Profile
Code:
sudo perf top
Motivation:
For system administrators and developers, a real-time view of system performance is often crucial. The perf top command samples the running system continuously and shows where CPU time is being spent. It is particularly helpful for identifying code that consumes excessive CPU time and for understanding system load dynamics as they happen.
Explanation:
- sudo: Runs the command with superuser privileges, which system-wide profiling usually requires.
- perf top: Runs the top subcommand of perf, which shows a live, continuously updated list of the functions consuming the most CPU across the system. It resembles the standard top utility, but reports at the level of symbols and shared objects rather than whole processes.
Example Output:
Overhead Shared Object Symbol
15.36% [kernel] [k] system_call_fastpath
13.25% [kernel] [k] __do_softirq
6.78% [kernel] [k] update_curr
5.19% libc-2.31.so [.] __memcpy_avx_unaligned
4.35% [kernel] [k] free_one_page
...
Use case 3: Run a Command and Record Its Profile into perf.data
Code:
sudo perf record command
Motivation:
Profiling a command can provide insight into its performance characteristics over a particular run. This is particularly useful for developers when optimizing an application or diagnosing performance issues. Once the data is recorded, you can analyze it in detail later with other perf tools to see where optimizations can be applied.
Explanation:
- sudo: Ensures that the command runs with the privileges needed to record performance data comprehensively.
- perf record: Records performance data for the execution of a command.
- command: Replace this with the specific command you want to profile, such as the executable or script you are troubleshooting.
Example Output:
perf record does not display the profile itself. Instead, it writes the recorded samples to a file named perf.data in the current directory and prints a short summary:
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 7.812 MB perf.data (5000 samples) ]
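To see not only which functions are hot but also how they were reached, perf record can capture call graphs with -g, and -o selects an output file other than the default perf.data. The wrapper below is a minimal sketch under those assumptions; profile_cmd is a hypothetical helper name, not something perf provides.

```shell
# Record a call-graph profile of a command into a chosen output file.
# profile_cmd is a hypothetical wrapper around `perf record -g -o FILE -- CMD`.
profile_cmd() {
    if [ "$#" -lt 2 ]; then
        echo "usage: profile_cmd OUTPUT_FILE COMMAND [ARGS...]" >&2
        return 2
    fi
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found in PATH" >&2
        return 127
    fi
    out=$1
    shift
    # `--` separates perf's own options from the profiled command's arguments.
    perf record -g -o "$out" -- "$@"
}

# Usage (may need sudo depending on your system's perf settings):
#   profile_cmd build.data gcc hello.c
```

Writing each run to its own file keeps earlier recordings from being overwritten when you compare several runs.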
Use case 4: Record the Profile of an Existing Process into perf.data
Code:
sudo perf record -p pid
Motivation:
When a performance issue is noticed in a long-running process, it may be beneficial to profile that specific process to understand its behavior better. For instance, if a web server is running slow, recording and analyzing its performance can shed light on specific bottlenecks. You can use this approach for live systems where restarting the process is not feasible.
Explanation:
- sudo: Runs with superuser privileges to access the necessary system resources and existing processes.
- perf record: Records performance data.
- -p pid: Specifies which process to record, where pid is the process ID of the live process you wish to analyze.
Example Output:
As in use case 3, this writes a perf.data file rather than displaying a profile. Recording continues until you stop it (for example with Ctrl+C), at which point perf prints a summary like this:
[ perf record: Woken up 10 times to write data ]
[ perf record: Captured and wrote 20.000 MB perf.data (15000 samples) ]
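On a live system it is often more convenient to sample for a fixed window than to stop the recording by hand. A common idiom is to give perf record a sleep command after the options: perf then profiles the target process until sleep exits. The following is a minimal sketch; profile_pid and the perf-PID.data naming scheme are hypothetical choices for illustration.

```shell
# Record an existing process for a fixed number of seconds (default 10).
# profile_pid is a hypothetical helper; it writes to perf-<PID>.data.
profile_pid() {
    pid=$1
    secs=${2:-10}
    case $pid in
        ''|*[!0-9]*)
            echo "usage: profile_pid PID [SECONDS]" >&2
            return 2
            ;;
    esac
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found in PATH" >&2
        return 127
    fi
    # perf samples the target PID for as long as `sleep` is running.
    perf record -p "$pid" -o "perf-$pid.data" -- sleep "$secs"
}

# Usage (typically needs sudo to attach to another user's process):
#   profile_pid 1234 30
```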
Use case 5: Read perf.data (Created by perf record) and Display the Profile
Code:
sudo perf report
Motivation:
After collecting data for either a specific command or a running process, the next step is to examine it. The perf report command reads the perf.data file and provides a detailed breakdown of the captured samples, which helps pinpoint performance bottlenecks or inefficiencies in an application.
Explanation:
- sudo: Needed here because the earlier examples recorded perf.data as root, so the file is owned by root. If you recorded as a regular user, you can usually omit it.
- perf report: Processes the perf.data file generated during recording and presents a summarized, human-readable report of the performance data.
Example Output:
The report may include function call graphs, execution times, and code sections that are resource-intensive:
# To display the report, please use [UP/DOWN] to navigate, [ENTER] to drill down, [q] to quit
# Detailed analysis below
#
# Total Lost Samples: 0
#
# Samples: 20K of event 'cycles:ppp'
# Event count (approx.): 125.678 million
#
# Overhead Command Shared Object Symbol
# ........ ........ ................. ..................
#
35.15% main ./app [.] execute_task
27.89% main ./liballoc.so.1 [.] allocate_memory
15.76% main ./kernel [k] sys_execve
11.23% main ./libc.so.6 [.] memcpy
...
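For scripts or quick comparisons, perf report can also print a plain-text report with --stdio instead of opening the interactive view. The helper below is a minimal sketch that trims such a report to its first few entries; top_symbols is a hypothetical name, and it assumes the layout shown above, where header lines start with # and entries are already sorted by overhead.

```shell
# Print the first N data rows (default 5) of a `perf report --stdio` listing
# read from stdin, skipping '#' header lines and blank lines.
top_symbols() {
    n=${1:-5}
    awk -v n="$n" '
        /^#/ || NF == 0 { next }   # skip header comments and blank lines
        count < n       { print; count++ }
    '
}

# Usage (assumes a perf.data file in the current directory):
#   sudo perf report --stdio | top_symbols 10
```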
Conclusion:
The perf command offers a wide range of capabilities for analyzing the performance of processes on Linux systems. From providing real-time insights and recording extensive profiling data to producing detailed reports, perf is a crucial toolkit for those aiming to optimize performance and diagnose bottlenecks effectively. By applying these use cases, users can better harness perf to improve application performance and system efficiency.