Thursday, June 6, 2019

Linux Introspection & Profiling


  • SystemTap

SystemTap is a GPLv2-licensed, system-wide tool that lets you gather tracing and profiling data from a running Linux system.
Understanding SystemTap

    SystemTap connects to the Linux kernel and monitors the available events, which are exposed through the kprobes kernel facility.
    When an event occurs, the kernel can run a handler, which is executed as a subroutine.
    The event and its handler together are referred to as a probe.
    stap works by running scripts. To run, a script must be compiled into a kernel module and loaded into the kernel. The module then does its work and is unloaded when it finishes.
    The stap command handles both the compilation and the running of scripts.

https://www.golinuxcloud.com/systemtap-tutorial-linux-example/
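
The event/handler pairing described above can be pictured with a small user-space toy. This is only a sketch of the idea, not SystemTap itself: the `ProbeRegistry` class, its `probe` and `fire` methods, and the event strings are all hypothetical names chosen for illustration, and nothing here touches the kernel.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical toy model: a "probe" is an event name paired with a handler.
# This is NOT SystemTap's API; real probes run inside a kernel module.
Handler = Callable[[dict], None]


class ProbeRegistry:
    def __init__(self) -> None:
        self._probes: Dict[str, List[Handler]] = defaultdict(list)

    def probe(self, event: str, handler: Handler) -> None:
        """Attach a handler to an event; the pair forms one probe."""
        self._probes[event].append(handler)

    def fire(self, event: str, context: dict) -> int:
        """Run every handler attached to the event; return how many ran."""
        handlers = self._probes.get(event, [])
        for handler in handlers:
            handler(context)
        return len(handlers)


calls: List[str] = []
registry = ProbeRegistry()

# Register one probe: when the (illustrative) event fires, record the filename.
registry.probe("syscall.open", lambda ctx: calls.append(ctx["filename"]))

registry.fire("syscall.open", {"filename": "/etc/hosts"})
registry.fire("syscall.read", {"fd": 3})  # no probe attached, so nothing runs

print(calls)  # prints ['/etc/hosts']
```

In SystemTap the same pairing is written in its own scripting language and compiled by stap; the toy only shows why an event with no attached handler costs nothing to "fire".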


  • Linux introspection and SystemTap


SystemTap is a dynamic method of monitoring and tracing the operation of a running Linux kernel.
An interface and language for dynamic kernel analysis

Kernel tracing
SystemTap is similar to an older technology called DTrace, which originated in the Sun Solaris operating system. Within DTrace, developers can write scripts in the D programming language (a subset of the C language but modified to support trace-specific behaviors). A DTrace script contains a number of probes and associated actions that occur when the probe "fires." For example, a probe can represent something as simple as invoking a system call or more complicated interactions such as a particular line of code being executed.

https://www.ibm.com/developerworks/linux/library/l-systemtap/

  • SystemTap provides a command line interface and a scripting language to examine the activities of a running Linux system, particularly the kernel, in fine detail. SystemTap scripts are written in the SystemTap scripting language, are then compiled to C-code kernel modules and inserted into the kernel.

https://doc.opensuse.org/documentation/leap/tuning/html/book.sle.tuning/cha.tuning.systemtap.html

  • OProfile

OProfile is a low overhead, system-wide performance monitoring tool provided by the oprofile package.
It uses the performance monitoring hardware on the processor to retrieve information about the kernel and executables on the system, such as when memory is referenced, the number of second-level cache requests, and the number of hardware interrupts received. OProfile is also able to profile applications that run in a Java Virtual Machine (JVM).
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/developer_guide/oprofile

  • Learn eBPF Tracing: Tutorial and Examples

eBPF can be used for many things: network performance, firewalls, security, tracing, and device drivers.
The term tracing refers to performance analysis and observability tools that can produce per-event info.

What is eBPF, bcc, bpftrace, and iovisor?
eBPF does to Linux what JavaScript does to HTML. (Sort of.) So instead of a static HTML website, JavaScript lets you define mini programs that run on events like mouse clicks, which are run in a safe virtual machine in the browser. And with eBPF, instead of a fixed kernel, you can now write mini programs that run on events like disk I/O, which are run in a safe virtual machine in the kernel.
http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html

  • 5.4. Performance Counters for Linux (PCL) Tools and perf

Performance Counters for Linux (PCL) is a new kernel-based subsystem that provides a framework for collecting and analyzing performance data.

The PCL subsystem can be used to measure hardware events, including retired instructions and processor clock cycles. It can also measure software events, including major page faults and context switches. For example, PCL counters can compute the Instructions Per Clock (IPC) from a process's counts of instructions retired and processor clock cycles. A low IPC ratio indicates the code makes poor use of the CPU. Other hardware events can also be used to diagnose poor CPU performance.
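
As a worked example of the IPC calculation mentioned above, here is a minimal Python sketch. The function name `instructions_per_clock` is hypothetical and the counts are made-up numbers for illustration; on a real system they would come from the hardware counters.

```python
def instructions_per_clock(instructions: int, cycles: int) -> float:
    """Return IPC: instructions retired divided by processor clock cycles."""
    if cycles <= 0:
        raise ValueError("cycles must be a positive count")
    return instructions / cycles


# Made-up counter values, for illustration only.
ipc = instructions_per_clock(instructions=3_000_000, cycles=2_000_000)
print(f"IPC = {ipc:.2f}")  # prints "IPC = 1.50"
```

What counts as a "low" ratio depends on the processor, so the value is most useful when comparing runs of the same code on the same machine.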

Both OProfile and Performance Counters for Linux (PCL) use the same hardware Performance Monitoring Unit (PMU).

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/developer_guide/perf

  • Perf is a profiler tool for Linux 2.6+ based systems that abstracts away CPU hardware differences in Linux performance measurements and presents a simple command-line interface.

https://www.dedoimedo.com/computers/linux-perf.html

  • perf Examples

perf, the Linux profiler, is also known as Performance Counters for Linux (PCL) or perf_events.
perf_events is an event-oriented observability tool that can help you solve advanced performance and troubleshooting problems.
perf_events is part of the Linux kernel.
While it uses many Linux tracing features, some are not yet exposed via the perf command and need to be used via the ftrace interface instead.
Questions it can help answer include:

    Why is the kernel on-CPU so much? What code-paths?
    Which code-paths are causing CPU level 2 cache misses?
    Are the CPUs stalled on memory I/O?
    Which code-paths are allocating memory, and how much?
    What is triggering TCP retransmits?
    Is a certain kernel function being called, and how often?
    For what reasons are threads leaving the CPU?


http://www.brendangregg.com/perf.html#Tracepoints

  • Valgrind

Valgrind is an instrumentation framework for building dynamic analysis tools that can be used to profile applications in detail. The default installation already provides five standard tools. Valgrind tools are generally used to investigate memory management and threading problems.
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/developer_guide/profiling#valgrind


  • FTRACE

The ftrace framework provides users with several tracing capabilities, accessible through an interface much simpler than SystemTap's. This framework uses a set of virtual files in the debugfs file system; these files enable specific tracers. The ftrace function tracer outputs each function called in the kernel in real time; other tracers within the ftrace framework can also be used to analyze wakeup latency, task switches, kernel events, and the like.
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/developer_guide/ftrace
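
Because ftrace is driven through virtual files, selecting a tracer amounts to ordinary file reads and writes. The Python sketch below illustrates that under stated assumptions: the helper names `available_tracers` and `set_tracer` are hypothetical, the path `/sys/kernel/debug/tracing` is the usual debugfs location but may differ on a given system, and using the real files normally requires root. The demo at the bottom therefore runs against a throwaway directory that merely imitates the layout.

```python
import tempfile
from pathlib import Path
from typing import List

# Usual debugfs location of the ftrace control files (an assumption; the
# mount point can differ, and access normally requires root).
TRACING_DIR = Path("/sys/kernel/debug/tracing")


def available_tracers(tracing_dir: Path = TRACING_DIR) -> List[str]:
    """List the tracer names found in the available_tracers file."""
    return (tracing_dir / "available_tracers").read_text().split()


def set_tracer(name: str, tracing_dir: Path = TRACING_DIR) -> None:
    """Select a tracer by writing its name to the current_tracer file."""
    if name not in available_tracers(tracing_dir):
        raise ValueError(f"tracer {name!r} is not listed as available")
    (tracing_dir / "current_tracer").write_text(name)


if __name__ == "__main__":
    # Stand-in directory so the sketch runs without root or debugfs.
    with tempfile.TemporaryDirectory() as tmp:
        fake = Path(tmp)
        (fake / "available_tracers").write_text("function_graph function nop\n")
        (fake / "current_tracer").write_text("nop\n")

        set_tracer("function", fake)
        print((fake / "current_tracer").read_text())  # prints "function"
```

On a real system the collected output would then be read from the `trace` file in the same directory.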