Understanding Linux Performance with Perf (with examples)
- November 5, 2023
Introduction
Linux offers a range of tools and utilities for performance analysis. One of them is perf, a framework for collecting and analyzing performance counter measurements on Linux. It provides detailed information about system performance, allowing developers and system administrators to identify bottlenecks and optimize their applications.
In this article, we will explore several use cases of the perf command, with code examples, to demonstrate its capabilities. We will cover displaying basic performance counter stats, system-wide real-time profiling, recording profiles of commands and existing processes, and reading and displaying the recorded profiles.
Use Case 1: Displaying Basic Performance Counter Stats
The perf stat command displays basic performance counter statistics for a specific command. It gives insight into aspects of program execution such as CPU cycles, cache misses, and branch instructions.
Code Example:
perf stat gcc hello.c
Motivation:
Displaying basic performance counter stats can help developers understand the runtime characteristics of their code. It provides valuable information about the efficiency of the program, including potential areas where optimizations can be implemented.
Explanation:
- perf stat: Command to display performance counter statistics.
- gcc hello.c: The command to be executed, in this case compiling the hello.c file with the gcc compiler.
Example Output:
Performance counter stats for 'gcc hello.c':
22,189.40 msec task-clock # 1.000 CPUs utilized
1,673 context-switches # 0.075 K/sec
24 cpu-migrations # 0.001 K/sec
244 page-faults # 0.011 K/sec
60,071,519,456 cycles # 2.709 GHz
<more counters...>
1.436846016 seconds time elapsed
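The rate column in the example output above can be reproduced by hand, which helps when reading unfamiliar counters. The sketch below assumes each rate is the raw event count divided by the task-clock time in seconds, scaled to thousands. The helper name rate_k_per_sec is made up for illustration, and the inputs are copied from the example output.

```shell
# Hypothetical helper: reproduce the "K/sec" column of perf stat by hand.
# Assumption: rate = event count / task-clock seconds, shown in thousands.
rate_k_per_sec() {
    # $1 = raw event count, $2 = task-clock in milliseconds
    awk -v count="$1" -v ms="$2" \
        'BEGIN { printf "%.3f\n", count / (ms / 1000) / 1000 }'
}

rate_k_per_sec 1673 22189.40   # context-switches
rate_k_per_sec 24   22189.40   # cpu-migrations
rate_k_per_sec 244  22189.40   # page-faults
```

With the values shown, this prints 0.075, 0.001, and 0.011, matching the K/sec column of the example.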
Use Case 2: System-Wide Real-Time Performance Counter Profile
The perf top command displays a system-wide, real-time performance counter profile. It provides a continuously updated view of the functions consuming the most CPU time, in both kernel and user space, along with their share of the sampled events. This helps identify the highest-impact areas of the system and optimize them accordingly.
Code Example:
sudo perf top
Motivation:
Monitoring the system-wide performance counters in real-time helps in identifying the most resource-intensive components of the system. It provides a quick overview of the system’s behavior and areas that may require optimization.
Explanation:
- perf top: Command to profile system-wide performance counters in real time.
- sudo: Runs perf with root privileges, which system-wide profiling usually requires (depending on the kernel's perf_event_paranoid setting).
Example Output:
... output truncated ...
2.07% httpd [kernel]
1.68% bash [kernel]
1.55% mysqld [kernel]
1.34% perf [kernel]
1.23% sshd [kernel]
... output truncated ...
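Whether sudo is actually needed for perf top depends on the kernel's perf_event_paranoid setting. The sketch below assumes the usual meaning of that setting, where a value of 1 or higher blocks system-wide profiling for unprivileged users. The function name system_wide_needs_root is hypothetical, and some distributions add further restrictions or grant capabilities to specific users, so treat this as a rough check only.

```shell
# Hypothetical helper: does system-wide profiling need root at this level?
# Assumption: perf_event_paranoid >= 1 blocks unprivileged system-wide access.
system_wide_needs_root() {
    [ "$1" -ge 1 ]
}

paranoid_file=/proc/sys/kernel/perf_event_paranoid
if [ -r "$paranoid_file" ]; then
    level=$(cat "$paranoid_file")
    if system_wide_needs_root "$level"; then
        echo "perf_event_paranoid=$level: run 'sudo perf top'"
    else
        echo "perf_event_paranoid=$level: plain 'perf top' should work"
    fi
else
    echo "cannot read $paranoid_file (not Linux, or access is restricted)"
fi
```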
Use Case 3: Recording Command Profiles
The perf record command runs a command and records its performance profile into a file, named perf.data by default. This file can be analyzed later to understand the program's behavior, including CPU usage, event counts, and, if call graphs were recorded with the -g option, the function call graph.
Code Example:
sudo perf record command
Motivation:
Recording command profiles can help analyze the performance and behavior of specific commands or applications. It allows developers to collect detailed information about the program’s execution, which can be analyzed later to identify performance bottlenecks and optimize the code.
Explanation:
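- perf record: Command to record the performance profile.
- sudo: Runs perf with root privileges. This is not always required for profiling your own command; it depends on the kernel's perf_event_paranoid setting and the events being recorded.
- command: The command to be executed, whose performance profile will be recorded.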
Example Output:
The output of the perf record command is typically minimal. It reports how many times perf woke up to write data, followed by the size, name, and sample count of the generated perf.data file.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (3612 samples) ]
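A slightly fuller invocation might look like the sketch below: -F sets the sampling frequency, -g records call graphs, -o names the output file, and -- separates perf's own options from the command being profiled. The 99 Hz rate, the file name profile.data, and the sleep 1 workload are placeholders chosen for illustration, not recommendations from this article.

```shell
# Placeholders for illustration: adjust to your workload.
FREQ=99                 # samples per second
OUT=profile.data        # output file instead of the default perf.data

if command -v perf >/dev/null 2>&1; then
    # -F: sampling frequency, -g: record call graphs, -o: output file.
    # Everything after "--" is the command being profiled.
    perf record -F "$FREQ" -g -o "$OUT" -- sleep 1 \
        || echo "perf record failed (permissions or unsupported events?)"
else
    echo "perf is not installed; would run:"
    echo "perf record -F $FREQ -g -o $OUT -- sleep 1"
fi
```

A file written this way can be opened later with perf report -i profile.data.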
Use Case 4: Recording Process Profiles
The perf record command can also record the performance profile of an existing process. This allows us to analyze the behavior and performance characteristics of a running process without restarting it.
Code Example:
sudo perf record -p pid
Motivation:
Recording process profiles is useful when analyzing the performance of a long-running process or debugging a specific issue in real-time. It allows us to record and inspect the performance profile without interrupting the process.
Explanation:
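- perf record: Command to record the performance profile.
- sudo: Runs perf with root privileges, which is typically needed to attach to a process owned by another user.
- -p pid: The process ID of the target process whose profile is recorded. Recording continues until perf is interrupted, for example with Ctrl+C.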
Example Output:
The output of the perf record command is typically minimal. It reports how many times perf woke up to write data, followed by the size, name, and sample count of the generated perf.data file.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (3612 samples) ]
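When attaching with -p, a common way to bound the recording is to pass a short sleep as the command, so that perf stops when sleep exits. The sketch below starts a throwaway background process as a stand-in target; the variable target_pid, the file name attach.data, and the 2-second window are illustrative assumptions.

```shell
# Stand-in target: a background process that lives long enough to sample.
sleep 30 &
target_pid=$!

if command -v perf >/dev/null 2>&1; then
    # Sample the target for about 2 seconds; perf stops when "sleep 2" exits.
    perf record -p "$target_pid" -o attach.data -- sleep 2 \
        || echo "perf record failed (may need sudo)"
else
    echo "perf is not installed; would run: perf record -p $target_pid -- sleep 2"
fi

# Clean up the stand-in process.
kill "$target_pid" 2>/dev/null || true
```

Because the stand-in target only sleeps, it will yield few or no samples; with a real, busy process the same pattern captures a fixed-length profile.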
Use Case 5: Reading and Displaying Profiles
Once a profile has been recorded with perf record, the perf report command reads it and displays the information in an organized manner. It provides insight into event counts per command, shared object, and symbol, and into function call stacks when they were recorded, for analyzing the program's performance.
Code Example:
sudo perf report
Motivation:
Analyzing the recorded profiles is crucial for identifying performance bottlenecks and optimizing the code. The perf report command provides an easy-to-use interface for exploring the recorded information, including the function call graph, event counts, and other important metrics.
Explanation:
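- perf report: Command to read and display the recorded profile. By default it reads perf.data from the current directory.
- sudo: Runs perf with root privileges. This is typically needed when perf.data was written by root, as in the sudo perf record examples above.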
Example Output:
The perf report command displays the recorded profile in an interactive interface. It includes information about the function call graph, event counts, and various other statistics for analyzing the program's performance.
# Overhead Command Shared Object
# ........ .............. .................
#
92.28% perf [kernel]
0.42% lsmod [kernel]
0.32% cpup [kernel]
0.29% insmod [kernel]
<more functions...>
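For scripted analysis, the same table can be printed as plain text (perf report has a --stdio option for this) and filtered with standard tools. The sketch below feeds the sample rows from the example output through awk and keeps the entries above a threshold. The 1% cutoff and the function name hot_entries are illustrative, and the column layout is assumed to match the example.

```shell
# Hypothetical filter: keep rows whose overhead exceeds a threshold (percent).
hot_entries() {
    # $1 = minimum overhead in percent; reads report text on stdin
    awk -v min="$1" '
        /^#/ { next }                      # skip header and comment lines
        $1 ~ /^[0-9.]+%$/ {                # data rows start with "NN.NN%"
            pct = $1; sub(/%/, "", pct)
            if (pct + 0 > min) print $2, $1
        }'
}

# Sample rows copied from the example output above.
hot_entries 1 <<'EOF'
# Overhead Command Shared Object
92.28% perf [kernel]
0.42% lsmod [kernel]
0.32% cpup [kernel]
0.29% insmod [kernel]
EOF

# With a real profile: perf report --stdio | hot_entries 1
```

On the sample rows this prints only the perf entry, since it is the only one above 1%.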
Conclusion
The perf command provides powerful tools for analyzing and understanding the performance of Linux systems. By using the various features and options of perf, developers can gain valuable insights into their code's behavior, identify performance bottlenecks, and optimize their applications.