How to Use the Command 'perf' (with examples)

perf is the Linux profiling tool built on the kernel's perf_events subsystem. It reads hardware counters (CPU cycles, instructions, branch misses) and software events (context switches, page faults), and it can sample where a program or the whole system spends its time. This information helps when diagnosing performance bottlenecks, tuning applications, and comparing builds. The perf suite covers tasks ranging from counting events for a single command to attributing CPU time to individual functions across user space and the kernel. This article walks through five common use cases.

Use case 1: Display Basic Performance Counter Stats for a Command

Code:

perf stat gcc hello.c

Motivation:

Compile time is part of the development loop, so it helps to know how the compiler uses the CPU. perf stat runs a command to completion and prints counter totals such as task clock, CPU cycles, instructions per cycle (IPC), and branch misses. These totals give a quick baseline for comparing compiler flags or build configurations before reaching for a full profile.

Explanation:

  • perf stat: This component of the command invokes the stat subcommand in perf, which collects and displays performance counter statistics.
  • gcc: This calls the GNU Compiler Collection, an open-source compiler used for compiling C, C++, and other languages.
  • hello.c: This is the source file being compiled, typically a C program, though you can substitute any specific file you are working with.

Example Output:

 Performance counter stats for 'gcc hello.c':

       150.221935      task-clock (msec)         #    0.843 CPUs utilized
             2837      context-switches          #    0.019 M/sec
                4      cpu-migrations            #    0.027 K/sec
              513      page-faults               #    3.418 K/sec
      5102563150      cycles                    #    3.396 GHz
      8283035643      instructions              #    1.62  insn per cycle
      1310891026      branches                  #  8725.076 M/sec
        63473248      branch-misses             #    4.84% of all branches

       0.178179165 seconds time elapsed
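
The derived columns on the right of this output are simple ratios of the raw counters. As a rough sketch, the helper below recomputes two of them from the machine-readable output of perf stat (-x, selects comma-separated fields and -o writes them to a file). The function name summarize_counters and the file name counters.csv are made up for illustration, and the script assumes the counter value sits in field 1 and the event name in field 3, which may differ between perf versions or on hybrid CPUs.

```shell
#!/bin/sh
# Recompute IPC and branch-miss rate from `perf stat -x,` CSV lines on stdin.
# Assumption: field 1 holds the counter value and field 3 the event name.
summarize_counters() {
    awk -F, '
        # "+ 0" forces a number, so "<not counted>" style values become 0
        $3 ~ /^cycles/        { cycles   = $1 + 0 }
        $3 ~ /^instructions/  { insns    = $1 + 0 }
        $3 ~ /^branches/      { branches = $1 + 0 }
        $3 ~ /^branch-misses/ { misses   = $1 + 0 }
        END {
            if (cycles > 0)   printf "ipc=%.2f\n", insns / cycles
            if (branches > 0) printf "branch_miss_pct=%.2f\n", 100 * misses / branches
        }'
}

# Typical use on a machine with perf installed:
#   perf stat -x, -o counters.csv -e cycles,instructions,branches,branch-misses gcc hello.c
#   summarize_counters < counters.csv
```

Fed the cycle and instruction counts from the sample output above, the helper reports an IPC of 1.62, matching the value perf prints.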

Use case 2: Display System-Wide Real-Time Performance Counter Profile

Code:

sudo perf top

Motivation:

When a machine is busy and you do not yet know why, a live view is the fastest starting point. perf top samples the whole system and continuously lists the functions, in both the kernel and user space, that account for the most CPU time. It is particularly helpful for spotting a hot function or library call while the load is still happening.

Explanation:

  • sudo: This prefix runs the command with superuser privileges. System-wide profiling usually requires them, although the exact rule depends on the kernel.perf_event_paranoid setting.
  • perf top: This command runs the top tool within perf, which displays a live, sorted list of the most frequently sampled symbols. It resembles the standard top utility, but it ranks functions rather than processes.

Example Output:

Overhead  Shared Object         Symbol
  15.36%  [kernel]              [k] system_call_fastpath
  13.25%  [kernel]              [k] __do_softirq
   6.78%  [kernel]              [k] update_curr
   5.19%  libc-2.31.so          [.] __memcpy_avx_unaligned
   4.35%  [kernel]              [k] free_one_page
  ...
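
Whether sudo is actually needed for a system-wide view depends on the kernel.perf_event_paranoid sysctl. The sketch below turns its value into a short explanation. The function name describe_paranoid is invented for this example, the level meanings follow the kernel's sysctl documentation as I understand it, and levels of 3 or higher are distribution-specific additions, so treat the wording for those as an assumption.

```shell
#!/bin/sh
# Explain what a given kernel.perf_event_paranoid level means for
# unprivileged users. Illustrative helper, not part of perf itself.
describe_paranoid() {
    level=$1
    if [ "$level" -le 0 ]; then
        printf '%s\n' "level $level: system-wide profiling allowed without root"
    elif [ "$level" -eq 1 ]; then
        printf '%s\n' "level $level: own processes only, kernel samples included; system-wide needs root"
    elif [ "$level" -eq 2 ]; then
        printf '%s\n' "level $level: own processes only, user-space samples; system-wide needs root"
    else
        # Assumption: stricter levels are distribution patches.
        printf '%s\n' "level $level: distribution-specific; unprivileged use may be blocked, system-wide needs root"
    fi
}

# On Linux:
#   describe_paranoid "$(cat /proc/sys/kernel/perf_event_paranoid)"
```

If the reported level already allows what you need, you can drop sudo and avoid running the profiler as root.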

Use case 3: Run a Command and Record Its Profile into perf.data

Code:

sudo perf record command

Motivation:

Profiling a command execution can provide insights into its performance characteristics over a particular run. This is particularly useful for developers when optimizing an application or diagnosing performance issues. By recording this data, you can analyze it in detail later with other perf tools, allowing for a more thorough understanding of where optimizations can be applied.

Explanation:

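  • sudo: Runs perf with superuser privileges so that it can also collect kernel-level samples. Note that the profiled command then runs as root as well; for user-space profiling of your own program, sudo is often unnecessary.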
  • perf record: This invocation of perf records performance data related to the execution of a command.
  • command: Replace this with the specific command you want to profile, such as any executable or script file you are troubleshooting.

Example Output:

perf record prints only a short summary when the command finishes. The samples themselves are written to a file named perf.data in the current directory.

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 7.812 MB perf.data (5000 samples) ]
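
In practice, perf record is often run with a few extra options: -g to capture call graphs, -F to set the sampling frequency, and -o to choose the output file instead of the default perf.data. The wrapper below is only a sketch of that pattern. The function name record_cmd and the PERF override (which lets you preview the command line with PERF=echo) are conventions of this example, not features of perf.

```shell
#!/bin/sh
# Record a command with call graphs at 99 samples per second.
# Usage: record_cmd OUTPUT_FILE COMMAND [ARGS...]
# PERF may be overridden (e.g. PERF=echo) to print the command line
# instead of running perf; that override exists only in this sketch.
record_cmd() {
    out=$1
    shift
    # "--" separates perf's own options from the command being profiled.
    "${PERF:-perf}" record -g -F 99 -o "$out" -- "$@"
}

# Example:
#   record_cmd hello.perf.data gcc hello.c
#   perf report -i hello.perf.data
```

Writing to a named file keeps a later recording from overwriting an earlier one, which is useful when comparing two runs.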

Use case 4: Record the Profile of an Existing Process into perf.data

Code:

sudo perf record -p pid

Motivation:

When a performance issue shows up in a long-running process, you can attach to that process and profile it in place. For instance, if a web server is responding slowly, recording it under real load can reveal the specific bottleneck. This approach suits live systems where restarting the process is not feasible.

Explanation:

  • sudo: Runs perf with superuser privileges, which are needed to attach to processes owned by other users and to collect kernel samples on most default configurations.
  • perf record: This subcommand records performance data.
  • -p pid: This option selects the process to record, where pid is the numeric ID of the running process. Recording continues until you press Ctrl+C or the process exits.

Example Output:

As in use case 3, the samples go to perf.data rather than to the terminal. When you stop the recording, perf prints a summary like this:

[ perf record: Woken up 10 times to write data ]
[ perf record: Captured and wrote 20.000 MB perf.data (15000 samples) ]
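
To avoid having to stop the recording by hand, a common idiom is to give perf record a sleep command to wait on: the samples still come from the process named by -p, and the recording ends when sleep exits. The following is a minimal sketch of that idiom. The function name record_pid, the 10-second default, and the PERF override used for previewing are all assumptions of this example.

```shell
#!/bin/sh
# Profile a running process for a fixed number of seconds.
# Usage: record_pid PID [SECONDS]
# PERF may be overridden (e.g. PERF=echo) to preview the command line;
# that override exists only in this sketch.
record_pid() {
    pid=$1
    seconds=${2:-10}
    # kill -0 sends no signal; it only checks that the PID exists
    # and that we are allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "cannot attach to PID $pid: no such process or no permission" >&2
        return 1
    fi
    # perf waits on `sleep`, but samples the process given to -p.
    "${PERF:-perf}" record -g -p "$pid" -- sleep "$seconds"
}

# Example: record_pid 1234 30    (1234 is a placeholder PID)
```

Bounding the duration also keeps perf.data to a predictable size on a busy server.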

Use case 5: Read perf.data (Created by perf record) and Display the Profile

Code:

sudo perf report

Motivation:

Recording is only half of the job: the samples in perf.data have to be turned into something readable. perf report reads that file and shows how the samples are distributed across commands, shared objects, and functions, which points directly at the code responsible for most of the measured time.

Explanation:

  • sudo: Needed only when perf.data was recorded as root and is therefore not readable by your own user. A file you recorded without sudo can be read without it.
  • perf report: This command reads the perf.data file in the current directory and presents a sorted, human-readable summary of the recorded samples.

Example Output:

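The report lists each symbol with its share of the samples, so the most expensive functions appear first. Call graphs are included only if the data was recorded with call-graph collection enabled (for example with -g):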

# To display the report, please use [UP/DOWN] to navigate, [ENTER] to drill down, [q] to quit
# Detailed analysis below
#
# Total Lost Samples: 0
#
# Samples: 20K of event 'cycles:ppp'
# Event count (approx.): 125.678 million
#
# Overhead  Command  Shared Object     Symbol
# ........ ........ ................. ..................
#
   35.15%  main     ./app              [.] execute_task
   27.89%  main     ./liballoc.so.1    [.] allocate_memory
   15.76%  main     [kernel.kallsyms]  [k] sys_execve
   11.23%  main     ./libc.so.6        [.] memcpy
    ...
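
For scripts and logs, perf report can print the same table as plain text with --stdio instead of opening the interactive browser. As a hedged sketch, the filter below keeps only rows at or above a given overhead. The function name hot_symbols is made up, and the parsing assumes the default column layout shown above, where the symbol is the last whitespace-separated field; symbols that contain spaces would need more careful handling.

```shell
#!/bin/sh
# Print "overhead symbol" for report rows at or above a threshold (percent).
# Usage: perf report --stdio | hot_symbols [MIN_PERCENT]
hot_symbols() {
    awk -v min="${1:-10}" '
        /^#/ || NF == 0 { next }          # skip header comments and blank lines
        $1 ~ /%$/ {
            pct = $1
            sub(/%$/, "", pct)            # "35.15%" -> "35.15"
            if (pct + 0 >= min + 0) print $1, $NF
        }'
}

# Example:
#   perf report --stdio | hot_symbols 20
```

A filter like this makes it easy to track the handful of dominant functions from one recording to the next.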

Conclusion:

The perf command covers the main steps of performance analysis on Linux: perf stat counts events for a command, perf top shows live hot spots, perf record captures a profile of a command or a running process, and perf report turns that profile into a readable breakdown. Together, these five use cases are enough to find where a program or system spends its time and to verify whether an optimization helped.
