From 3e3ede49ce93bc5549d1a1a756410071d6e3a7b1 Mon Sep 17 00:00:00 2001 From: "Guilherme G. Piccoli" Date: Sat, 3 Feb 2024 12:21:13 -0300 Subject: docs: Document possible_cpus parameter The number of possible CPUs is set be kernel in early boot time through some discovery mechanisms, like ACPI in x86. We have a parameter both in x86 and S390 to override that - there are some cases of BIOSes exposing more possible CPUs than the available ones, so this parameter is a good testing mechanism, but for some reason wasn't mentioned so far in the kernel parameters guide - let's fix that. Cc: Changwoo Min Signed-off-by: "Guilherme G. Piccoli" Reviewed-by: Randy Dunlap Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20240203152208.1461293-1-gpiccoli@igalia.com --- Documentation/admin-guide/kernel-parameters.txt | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'Documentation/admin-guide') diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 31b3a25680d0..e553740190ea 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -4644,6 +4644,11 @@ may be specified. Format: ,.... + possible_cpus= [SMP,S390,X86] + Format: + Set the number of possible CPUs, overriding the + regular discovery mechanisms (such as ACPI/FW, etc). + powersave=off [PPC] This option disables power saving features. It specifically disables cpuidle and sets the platform machine description specific power_save -- cgit v1.2.3 From f9197538d71a5d3ccd57451e048a0eb302bddc03 Mon Sep 17 00:00:00 2001 From: Thorsten Blum Date: Mon, 5 Feb 2024 14:24:10 +0100 Subject: Documentation: admin-guide: tainted-kernels.rst: Add missing article and comma - Add missing article "the" - s/above example/example above/ - Add missing comma after introductory clause to improve readability Signed-off-by: Thorsten Blum Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20240205132409.1957-1-thorsten.blum@toblux.com --- Documentation/admin-guide/tainted-kernels.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation/admin-guide') diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst index 92a8a07f5c43..f92551539e8a 100644 --- a/Documentation/admin-guide/tainted-kernels.rst +++ b/Documentation/admin-guide/tainted-kernels.rst @@ -34,7 +34,7 @@ name of the command ('Comm:') that triggered the event:: You'll find a 'Not tainted: ' there if the kernel was not tainted at the time of the event; if it was, then it will print 'Tainted: ' and characters -either letters or blanks. In above example it looks like this:: +either letters or blanks. In the example above it looks like this:: Tainted: P W O @@ -52,7 +52,7 @@ At runtime, you can query the tainted state by reading tainted; any other number indicates the reasons why it is. The easiest way to decode that number is the script ``tools/debugging/kernel-chktaint``, which your distribution might ship as part of a package called ``linux-tools`` or -``kernel-tools``; if it doesn't you can download the script from +``kernel-tools``; if it doesn't, you can download the script from `git.kernel.org `_ and execute it with ``sh kernel-chktaint``, which would print something like this on the machine that had the statements in the logs that were quoted earlier:: -- cgit v1.2.3 From 3a5f1c3d832838b23a2df5f203bacf318d7362f2 Mon Sep 17 00:00:00 2001 From: Hunter Chasens Date: Wed, 7 Feb 2024 12:10:07 -0500 Subject: docs: admin-guide: Update bootloader and installation instructions Updates the bootloader and installation instructions in admin-guide/README.rst to align with modern practices. Details of Changes: - Added guidance on using EFISTUB for UEFI/EFI systems. - Noted that LILO is no longer in active development and provides alternatives. - Kept LILO instructions but marked as Legacy LILO Instructions. Suggest removal in future patch. Signed-off-by: Hunter Chasens Signed-off-by: Jonathan Corbet [jc: repaired added whitespace warnings] Link: https://lore.kernel.org/r/20240207171007.45405-1-hunter.chasens18@ncf.edu --- Documentation/admin-guide/README.rst | 69 +++++++++++++++++++++++------------- 1 file changed, 45 insertions(+), 24 deletions(-) (limited to 'Documentation/admin-guide') diff --git a/Documentation/admin-guide/README.rst b/Documentation/admin-guide/README.rst index 9a969c0157f1..f2bebff6a733 100644 --- a/Documentation/admin-guide/README.rst +++ b/Documentation/admin-guide/README.rst @@ -262,9 +262,11 @@ Compiling the kernel - Make sure you have at least gcc 5.1 available. For more information, refer to :ref:`Documentation/process/changes.rst `. - - Do a ``make`` to create a compressed kernel image. It is also - possible to do ``make install`` if you have lilo installed to suit the - kernel makefiles, but you may want to check your particular lilo setup first. + - Do a ``make`` to create a compressed kernel image. It is also possible to do + ``make install`` if you have lilo installed or if your distribution has an + install script recognised by the kernel's installer. Most popular + distributions will have a recognized install script. You may want to + check your distribution's setup first. To do the actual install, you have to be root, but none of the normal build should require that. Don't take the name of root in vain. @@ -301,32 +303,51 @@ Compiling the kernel image (e.g. .../linux/arch/x86/boot/bzImage after compilation) to the place where your regular bootable kernel is found. - - Booting a kernel directly from a floppy without the assistance of a - bootloader such as LILO, is no longer supported. - - If you boot Linux from the hard drive, chances are you use LILO, which - uses the kernel image as specified in the file /etc/lilo.conf. The - kernel image file is usually /vmlinuz, /boot/vmlinuz, /bzImage or - /boot/bzImage. To use the new kernel, save a copy of the old image - and copy the new image over the old one. Then, you MUST RERUN LILO - to update the loading map! If you don't, you won't be able to boot - the new kernel image. - - Reinstalling LILO is usually a matter of running /sbin/lilo. - You may wish to edit /etc/lilo.conf to specify an entry for your - old kernel image (say, /vmlinux.old) in case the new one does not - work. See the LILO docs for more information. - - After reinstalling LILO, you should be all set. Shutdown the system, + - Booting a kernel directly from a storage device without the assistance + of a bootloader such as LILO or GRUB, is no longer supported in BIOS + (non-EFI systems). On UEFI/EFI systems, however, you can use EFISTUB + which allows the motherboard to boot directly to the kernel. + On modern workstations and desktops, it's generally recommended to use a + bootloader as difficulties can arise with multiple kernels and secure boot. + For more details on EFISTUB, + see "Documentation/admin-guide/efi-stub.rst". + + - It's important to note that as of 2016 LILO (LInux LOader) is no longer in + active development, though as it was extremely popular, it often comes up + in documentation. Popular alternatives include GRUB2, rEFInd, Syslinux, + systemd-boot, or EFISTUB. For various reasons, it's not recommended to use + software that's no longer in active development. + + - Chances are your distribution includes an install script and running + ``make install`` will be all that's needed. Should that not be the case + you'll have to identify your bootloader and reference its documentation or + configure your EFI. + +Legacy LILO Instructions +------------------------ + + + - If you use LILO the kernel images are specified in the file /etc/lilo.conf. + The kernel image file is usually /vmlinuz, /boot/vmlinuz, /bzImage or + /boot/bzImage. To use the new kernel, save a copy of the old image and copy + the new image over the old one. Then, you MUST RERUN LILO to update the + loading map! If you don't, you won't be able to boot the new kernel image. + + - Reinstalling LILO is usually a matter of running /sbin/lilo. You may wish + to edit /etc/lilo.conf to specify an entry for your old kernel image + (say, /vmlinux.old) in case the new one does not work. See the LILO docs + for more information. + + - After reinstalling LILO, you should be all set. Shutdown the system, reboot, and enjoy! - If you ever need to change the default root device, video mode, - etc. in the kernel image, use your bootloader's boot options - where appropriate. No need to recompile the kernel to change - these parameters. + - If you ever need to change the default root device, video mode, etc. in the + kernel image, use your bootloader's boot options where appropriate. No need + to recompile the kernel to change these parameters. - Reboot with the new kernel and enjoy. + If something goes wrong ----------------------- -- cgit v1.2.3 From 23764f18f72515aa46e6b56a7b9af6e57de8f8c0 Mon Sep 17 00:00:00 2001 From: Carlos Bilbao Date: Tue, 9 Jan 2024 09:56:42 -0600 Subject: docs: Correct formatting of title in admin-guide/index.rst Adjust the title of "The Linux kernel user's and administrator's guide" to adhere to the expected reStructuredText (rst) formatting, using double equal signs for the main header. Signed-off-by: Carlos Bilbao Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20240109155643.3489369-2-carlos.bilbao@amd.com --- Documentation/admin-guide/index.rst | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation/admin-guide') diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst index fb40a1f6f79e..ed8a629e59c8 100644 --- a/Documentation/admin-guide/index.rst +++ b/Documentation/admin-guide/index.rst @@ -1,3 +1,4 @@ +================================================= The Linux kernel user's and administrator's guide ================================================= -- cgit v1.2.3 From 781296727646489709e27076ed79dd77e37fcf8d Mon Sep 17 00:00:00 2001 From: Thorsten Leemhuis Date: Fri, 1 Mar 2024 09:41:06 +0100 Subject: docs: new text on bisecting which also covers bug validation Add a second document on bisecting regressions explaining the whole process from beginning to end -- while also describing how to validate if a problem is still present in mainline. This "two in one" approach is possible, as checking whenever a bug is in mainline is one of the first steps before performing a bisection anyway and thus needs to be described. Due to this approach the text also works quite nicely in conjunction with Documentation/admin-guide/reporting-issues.rst, as it covers all typical cases where users will need to build a kernel in exactly the same order. The text targets users that normally run kernels from their Linux distributor who might never have compiled their own kernel. This aim is why the first kernel built while following this guide is generated from the latest mainline codebase. This will rule out that the regression (a) was fixed already and (b) is caused by config change a vendor distributor performed; checking mainline will furthermore (c) determine if the issue is something that needs to be reported to the regular developers or the stable team (this is needed even when readers bisect within a stable series). Only then are readers instructed to build their own variant of the 'good' kernel to validate the trimmed .config file created during early in the guide, as performing a bisection with a broken one would be a waste of time. There is a small downside of this order: readers might have to go back to testing mainline, if it turns out there is a problem with their .config. But that should be rare -- and if the regression was already fixed readers might not get to this point anyway. Hence in the end this order should mean that readers built less kernels overall. This sequence allows the text to easily cover the "check if a bug is present in the upstream kernel" case while only making things a tiny bit more complicated. The text tries to prevent readers from running into many mistakes users are known to frequently make. The steps required for this might look superfluous for people that are already familiar with bisections -- but anyone with that knowledge should be able to adapt the instructions to their use-case or will not need this text at all. Style and structure of the text match the one Documentation/admin-guide/quickly-build-trimmed-linux.rst uses. Quite a few paragraphs are even copied from there and not changed at all or only slightly. This will complicate maintenance, as some future changes to one of these documents will have to be replicated in the other. But this is the lesser evil: solutions like "sending readers from one document over to the other" or "extracting the common parts into a separate document" might work in other cases, but would be too confusing here given the topic and the target audience. Signed-off-by: Thorsten Leemhuis [jc: Undo spurious removal of subsection header line] Signed-off-by: Jonathan Corbet Message-ID: <02b084a06de4ad61ac4ecd92b9265d4df4d03d71.1709282441.git.linux@leemhuis.info> --- Documentation/admin-guide/index.rst | 1 + .../verify-bugs-and-bisect-regressions.rst | 1919 ++++++++++++++++++++ MAINTAINERS | 1 + 3 files changed, 1921 insertions(+) create mode 100644 Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst (limited to 'Documentation/admin-guide') diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst index ed8a629e59c8..cb11d569b597 100644 --- a/Documentation/admin-guide/index.rst +++ b/Documentation/admin-guide/index.rst @@ -38,6 +38,7 @@ problems and bugs in particular. reporting-issues reporting-regressions quickly-build-trimmed-linux + verify-bugs-and-bisect-regressions bug-hunting bug-bisect tainted-kernels diff --git a/Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst b/Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst new file mode 100644 index 000000000000..54bde8bac95c --- /dev/null +++ b/Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst @@ -0,0 +1,1919 @@ +.. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0) +.. [see the bottom of this file for redistribution information] + +========================================= +How to verify bugs and bisect regressions +========================================= + +This document describes how to check if some Linux kernel problem occurs in code +currently supported by developers -- to then explain how to locate the change +causing the issue, if it is a regression (e.g. did not happen with earlier +versions). + +The text aims at people running kernels from mainstream Linux distributions on +commodity hardware who want to report a kernel bug to the upstream Linux +developers. Despite this intent, the instructions work just as well for users +who are already familiar with building their own kernels: they help avoid +mistakes occasionally made even by experienced developers. + +.. + Note: if you see this note, you are reading the text's source file. You + might want to switch to a rendered version: it makes it a lot easier to + read and navigate this document -- especially when you want to look something + up in the reference section, then jump back to where you left off. +.. + Find the latest rendered version of this text here: + https://docs.kernel.org/admin-guide/verify-bugs-and-bisect-regressions.rst.html + +The essence of the process (aka 'TL;DR') +======================================== + +*[If you are new to building or bisecting Linux, ignore this section and head +over to the* ":ref:`step-by-step guide`" *below. It utilizes +the same commands as this section while describing them in brief fashion. The +steps are nevertheless easy to follow and together with accompanying entries +in a reference section mention many alternatives, pitfalls, and additional +aspects, all of which might be essential in your present case.]* + +**In case you want to check if a bug is present in code currently supported by +developers**, execute just the *preparations* and *segment 1*; while doing so, +consider the newest Linux kernel you regularly use to be the 'working' kernel. +In the following example that's assumed to be 6.0.13, which is why the sources +of v6.0 will be used to prepare the .config file. + +**In case you face a regression**, follow the steps at least till the end of +*segment 2*. Then you can submit a preliminary report -- or continue with +*segment 3*, which describes how to perform a bisection needed for a +full-fledged regression report. In the following example 6.0.13 is assumed to be +the 'working' kernel and 6.1.5 to be the first 'broken', which is why v6.0 +will be considered the 'good' release and used to prepare the .config file. + +* **Preparations**: set up everything to build your own kernels:: + + # * Remove any software that depends on externally maintained kernel modules + # or builds any automatically during bootup. + # * Ensure Secure Boot permits booting self-compiled Linux kernels. + # * If you are not already running the 'working' kernel, reboot into it. + # * Install compilers and everything else needed for building Linux. + # * Ensure to have 15 Gigabyte free space in your home directory. + git clone -o mainline --no-checkout \ + https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ~/linux/ + cd ~/linux/ + git remote add -t master stable \ + https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git + git checkout --detach v6.0 + # * Hint: if you used an existing clone, ensure no stale .config is around. + make olddefconfig + # * Ensure the former command picked the .config of the 'working' kernel. + # * Connect external hardware (USB keys, tokens, ...), start a VM, bring up + # VPNs, mount network shares, and briefly try the feature that is broken. + yes '' | make localmodconfig + ./scripts/config --set-str CONFIG_LOCALVERSION '-local' + ./scripts/config -e CONFIG_LOCALVERSION_AUTO + # * Note, when short on storage space, check the guide for an alternative: + ./scripts/config -d DEBUG_INFO_NONE -e KALLSYMS_ALL -e DEBUG_KERNEL \ + -e DEBUG_INFO -e DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT -e KALLSYMS + # * Hint: at this point you might want to adjust the build configuration; + # you'll have to, if you are running Debian. + make olddefconfig + cp .config ~/kernel-config-working + +* **Segment 1**: build a kernel from the latest mainline codebase. + + This among others checks if the problem was fixed already and which developers + later need to be told about the problem; in case of a regression, this rules + out a .config change as root of the problem. + + a) Checking out latest mainline code:: + + cd ~/linux/ + git checkout --force --detach mainline/master + + b) Build, install, and boot a kernel:: + + cp ~/kernel-config-working .config + make olddefconfig + make -j $(nproc --all) + # * Make sure there is enough disk space to hold another kernel: + df -h /boot/ /lib/modules/ + # * Note: on Arch Linux, its derivatives and a few other distributions + # the following commands will do nothing at all or only part of the + # job. See the step-by-step guide for further details. + command -v installkernel && sudo make modules_install install + # * Check how much space your self-built kernel actually needs, which + # enables you to make better estimates later: + du -ch /boot/*$(make -s kernelrelease)* | tail -n 1 + du -sh /lib/modules/$(make -s kernelrelease)/ + # * Hint: the output of the following command will help you pick the + # right kernel from the boot menu: + make -s kernelrelease | tee -a ~/kernels-built + reboot + # * Once booted, ensure you are running the kernel you just built by + # checking if the output of the next two commands matches: + tail -n 1 ~/kernels-built + uname -r + + c) Check if the problem occurs with this kernel as well. + +* **Segment 2**: ensure the 'good' kernel is also a 'working' kernel. + + This among others verifies the trimmed .config file actually works well, as + bisecting with it otherwise would be a waste of time: + + a) Start by checking out the sources of the 'good' version:: + + cd ~/linux/ + git checkout --force --detach v6.0 + + b) Build, install, and boot a kernel as described earlier in *segment 1, + section b* -- just feel free to skip the 'du' commands, as you have a rough + estimate already. + + c) Ensure the feature that regressed with the 'broken' kernel actually works + with this one. + +* **Segment 3**: perform and validate the bisection. + + a) In case your 'broken' version is a stable/longterm release, add the Git + branch holding it:: + + git remote set-branches --add stable linux-6.1.y + git fetch stable + + b) Initialize the bisection:: + + cd ~/linux/ + git bisect start + git bisect good v6.0 + git bisect bad v6.1.5 + + c) Build, install, and boot a kernel as described earlier in *segment 1, + section b*. + + In case building or booting the kernel fails for unrelated reasons, run + ``git bisect skip``. In all other outcomes, check if the regressed feature + works with the newly built kernel. If it does, tell Git by executing + ``git bisect good``; if it does not, run ``git bisect bad`` instead. + + All three commands will make Git checkout another commit; then re-execute + this step (e.g. build, install, boot, and test a kernel to then tell Git + the outcome). Do so again and again until Git shows which commit broke + things. If you run short of disk space during this process, check the + "Supplementary tasks" section below. + + d) Once your finished the bisection, put a few things away:: + + cd ~/linux/ + git bisect log > ~/bisect-log + cp .config ~/bisection-config-culprit + git bisect reset + + e) Try to verify the bisection result:: + + git checkout --force --detach mainline/master + git revert --no-edit cafec0cacaca0 + + This is optional, as some commits are impossible to revert. But if the + second command worked flawlessly, build, install, and boot one more kernel + kernel, which should not show the regression. + +* **Supplementary tasks**: cleanup during and after the process. + + a) To avoid running out of disk space during a bisection, you might need to + remove some kernels you built earlier. You most likely want to keep those + you built during segment 1 and 2 around for a while, but you will most + likely no longer need kernels tested during the actual bisection + (Segment 3 c). You can list them in build order using:: + + ls -ltr /lib/modules/*-local* + + To then for example erase a kernel that identifies itself as + '6.0-rc1-local-gcafec0cacaca0', use this:: + + sudo rm -rf /lib/modules/6.0-rc1-local-gcafec0cacaca0 + sudo kernel-install -v remove 6.0-rc1-local-gcafec0cacaca0 + # * Note, if kernel-install is missing, you will have to + # manually remove the kernel image and related files. + + b) If you performed a bisection and successfully validated the result, feel + free to remove all kernels built during the actual bisection (Segment 3 c); + the kernels you built earlier and later you might want to keep around for + a week or two. + +.. _introguide_bissbs: + +Step-by-step guide on how to verify bugs and bisect regressions +=============================================================== + +This guide describes how to set up your own Linux kernels for investigating bugs +or regressions you intent to report. How far you want to follow the instructions +depends on your issue: + +Execute all steps till the end of *segment 1* to **verify if your kernel problem +is present in code supported by Linux kernel developers**. If it is, you are all +set to report the bug -- unless it did not happen with earlier kernel versions, +as then your want to at least continue with *segment 2* to **check if the issue +qualifies as regression** which receive priority treatment. Depending on the +outcome you then are ready to report a bug or submit a preliminary regression +report; instead of the latter your could also head straight on and follow +*segment 3* to **perform a bisection** for a full-fledged regression report +developers are obliged to act upon. + + :ref:`Preparations: set up everything to build your own kernels.` + + :ref:`Segment 1: try to reproduce the problem with the latest codebase.` + + :ref:`Segment 2: check if the kernels you build work fine.` + + :ref:`Segment 3: perform a bisection and validate the result.` + + :ref:`Supplementary tasks: cleanup during and after following this guide.` + +The steps in each segment illustrate the important aspects of the process, while +a comprehensive reference section holds additional details. The latter sometimes +also outlines alternative approaches, pitfalls, as well as problems that might +occur at the particular step -- and how to get things rolling again. + +For further details on how to report Linux kernel issues or regressions check +out Documentation/admin-guide/reporting-issues.rst, which works in conjunction +with this document. It among others explains why you need to verify bugs with +the latest 'mainline' kernel, even if you face a problem with a kernel from a +'stable/longterm' series; for users facing a regression it also explains that +sending a preliminary report after finishing segment 2 might be wise, as the +regression and its culprit might be known already. For further details on +what actually qualifies as a regression check out +Documentation/admin-guide/reporting-regressions.rst. + +.. _introprep_bissbs: + +Preparations: set up everything to build your own kernels +--------------------------------------------------------- + +.. _backup_bissbs: + +* Create a fresh backup and put system repair and restore tools at hand, just + to be prepared for the unlikely case of something going sideways. + + [:ref:`details`] + +.. _vanilla_bissbs: + +* Remove all software that depends on externally developed kernel drivers or + builds them automatically. That includes but is not limited to DKMS, openZFS, + VirtualBox, and Nvidia's graphics drivers (including the GPLed kernel module). + + [:ref:`details`] + +.. _secureboot_bissbs: + +* On platforms with 'Secure Boot' or similar solutions, prepare everything to + ensure the system will permit your self-compiled kernel to boot. The + quickest and easiest way to achieve this on commodity x86 systems is to + disable such techniques in the BIOS setup utility; alternatively, remove + their restrictions through a process initiated by + ``mokutil --disable-validation``. + + [:ref:`details`] + +.. _rangecheck_bissbs: + +* Determine the kernel versions considered 'good' and 'bad' throughout this + guide. + + Do you follow this guide to verify if a bug is present in the code developers + care for? Then consider the mainline release your 'working' kernel (the newest + one you regularly use) is based on to be the 'good' version; if your 'working' + kernel for example is '6.0.11', then your 'good' kernel is 'v6.0'. + + In case you face a regression, it depends on the version range where the + regression was introduced: + + * Something which used to work in Linux 6.0 broke when switching to Linux + 6.1-rc1? Then henceforth regard 'v6.0' as the last known 'good' version + and 'v6.1-rc1' as the first 'bad' one. + + * Some function stopped working when updating from 6.0.11 to 6.1.4? Then for + the time being consider 'v6.0' as the last 'good' version and 'v6.1.4' as + the 'bad' one. Note, at this point it is merely assumed that 6.0 is fine; + this assumption will be checked in segment 2. + + * A feature you used in 6.0.11 does not work at all or worse in 6.1.13? In + that case you want to bisect within a stable/longterm series: consider + 'v6.0.11' as the last known 'good' version and 'v6.0.13' as the first 'bad' + one. Note, in this case you still want to compile and test a mainline kernel + as explained in segment 1: the outcome will determine if you need to report + your issue to the regular developers or the stable team. + + *Note, do not confuse 'good' version with 'working' kernel; the latter term + throughout this guide will refer to the last kernel that has been working + fine.* + + [:ref:`details`] + +.. _bootworking_bissbs: + +* Boot into the 'working' kernel and briefly use the apparently broken feature. + + [:ref:`details`] + +.. _diskspace_bissbs: + +* Ensure to have enough free space for building Linux. 15 Gigabyte in your home + directory should typically suffice. If you have less available, be sure to pay + attention to later steps about retrieving the Linux sources and handling of + debug symbols: both explain approaches reducing the amount of space, which + should allow you to master these tasks with about 4 Gigabytes free space. + + [:ref:`details`] + +.. _buildrequires_bissbs: + +* Install all software required to build a Linux kernel. Often you will need: + 'bc', 'binutils' ('ld' et al.), 'bison', 'flex', 'gcc', 'git', 'openssl', + 'pahole', 'perl', and the development headers for 'libelf' and 'openssl'. The + reference section shows how to quickly install those on various popular Linux + distributions. + + [:ref:`details`] + +.. _sources_bissbs: + +* Retrieve the mainline Linux sources; then change into the directory holding + them, as all further commands in this guide are meant to be executed from + there. + + *Note, the following describe how to retrieve the sources using a full + mainline clone, which downloads about 2,75 GByte as of early 2024. The* + :ref:`reference section describes two alternatives ` *: + one downloads less than 500 MByte, the other works better with unreliable + internet connections.* + + Execute the following command to retrieve a fresh mainline codebase:: + + git clone -o mainline --no-checkout \ + https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ~/linux/ + cd ~/linux/ + + [:ref:`details`] + +.. _oldconfig_bissbs: + +* Start preparing a kernel build configuration (the '.config' file). + + Before doing so, ensure you are still running the 'working' kernel an earlier + step told you to boot; if you are unsure, check the current kernel release + identifier using ``uname -r``. + + Afterwards check out the source code for the version earlier established as + 'good' and create a .config file:: + + git checkout --detach v6.0 + make olddefconfig + + The second command will try to locate the build configuration file for the + running kernel and then adjust it for the needs of the kernel sources you + checked out. While doing so, it will print a few lines you need to check. + + Look out for a line starting with '# using defaults found in'. It should be + followed by a path to a file in '/boot/' that contains the release identifier + of your currently working kernel. If the line instead continues with something + like 'arch/x86/configs/x86_64_defconfig', then the build infra failed to find + the .config file for your running kernel -- in which case you have to put one + there manually, as explained in the reference section. + + In case you can not find such a line, look for one containing '# configuration + written to .config'. If that's the case you have a stale build configuration + lying around. Unless you intend to use it, delete it; afterwards run + 'make olddefconfig' again and check if it now picked up the right config file + as base. + + [:ref:`details`] + +.. _localmodconfig_bissbs: + +* Disable any kernel modules apparently superfluous for your setup. This is + optional, but especially wise for bisections, as it speeds up the build + process enormously -- at least unless the .config file picked up in the + previous step was already tailored to your and your hardware needs, in which + case you should skip this step. + + To prepare the trimming, connect external hardware you occasionally use (USB + keys, tokens, ...), quickly start a VM, and bring up VPNs. And if you rebooted + since you started that guide, ensure that you tried using the feature causing + trouble since you started the system. Only then trim your .config:: + + yes '' | make localmodconfig + + There is a catch to this, as the 'apparently' in initial sentence of this step + and the preparation instructions already hinted at: + + The 'localmodconfig' target easily disables kernel modules for features only + used occasionally -- like modules for external peripherals not yet connected + since booting, virtualization software not yet utilized, VPN tunnels, and a + few other things. That's because some tasks rely on kernel modules Linux only + loads when you execute tasks like the aforementioned ones for the first time. + + This drawback of localmodconfig is nothing you should lose sleep over, but + something to keep in mind: if something is misbehaving with the kernels built + during this guide, this is most likely the reason. You can reduce or nearly + eliminate the risk with tricks outlined in the reference section; but when + building a kernel just for quick testing purposes this is usually not worth + spending much effort on, as long as it boots and allows to properly test the + feature that causes trouble. + + [:ref:`details`] + +.. _tagging_bissbs: + +* Ensure all the kernels you will build are clearly identifiable using a special + tag and a unique version number:: + + ./scripts/config --set-str CONFIG_LOCALVERSION '-local' + ./scripts/config -e CONFIG_LOCALVERSION_AUTO + + [:ref:`details`] + +.. _debugsymbols_bissbs: + +* Decide how to handle debug symbols. + + In the context of this document it is often wise to enable them, as there is a + decent chance you will need to decode a stack trace from a 'panic', 'Oops', + 'warning', or 'BUG':: + + ./scripts/config -d DEBUG_INFO_NONE -e KALLSYMS_ALL -e DEBUG_KERNEL \ + -e DEBUG_INFO -e DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT -e KALLSYMS + + But if you are extremely short on storage space, you might want to disable + debug symbols instead:: + + ./scripts/config -d DEBUG_INFO -d DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT \ + -d DEBUG_INFO_DWARF4 -d DEBUG_INFO_DWARF5 -e CONFIG_DEBUG_INFO_NONE + + [:ref:`details`] + +.. _configmods_bissbs: + +* Check if you may want or need to adjust some other kernel configuration + options: + + * Are you running Debian? Then you want to avoid known problems by performing + additional adjustments explained in the reference section. + + [:ref:`details`]. + + * If you want to influence other aspects of the configuration, do so now + by using make targets like 'menuconfig' or 'xconfig'. + + [:ref:`details`]. + +.. _saveconfig_bissbs: + +* Reprocess the .config after the latest adjustments and store it in a safe + place:: + + make olddefconfig + cp .config ~/kernel-config-working + + [:ref:`details`] + +.. _introlatestcheck_bissbs: + +Segment 1: try to reproduce the problem with the latest codebase +---------------------------------------------------------------- + +The following steps verify if the problem occurs with the code currently +supported by developers. In case you face a regression, it also checks that the +problem is not caused by some .config change, as reporting the issue then would +be a waste of time. [:ref:`details`] + +.. _checkoutmaster_bissbs: + +* Check out the latest Linux codebase:: + + cd ~/linux/ + git checkout --force --detach mainline/master + + [:ref:`details`] + +.. _build_bissbs: + +* Build the image and the modules of your first kernel using the config file you + prepared:: + + cp ~/kernel-config-working .config + make olddefconfig + make -j $(nproc --all) + + If you want your kernel packaged up as deb, rpm, or tar file, see the + reference section for alternatives, which obviously will require other + steps to install as well. + + [:ref:`details`] + +.. _install_bissbs: + +* Install your newly built kernel. + + Before doing so, consider checking if there is still enough room for it:: + + df -h /boot/ /lib/modules/ + + 150 MByte in /boot/ and 200 in /lib/modules/ usually suffice. Those are rough + estimates assuming the worst case. How much your kernels actually require will + be determined later. + + Now install the kernel, which will be saved in parallel to the kernels from + your Linux distribution:: + + command -v installkernel && sudo make modules_install install + + On many commodity Linux distributions this will take care of everything + required to boot your kernel. You might want to ensure that's the case by + checking if your boot loader's configuration was updated; furthermore ensure + an initramfs (also known as initrd) exists, which on many distributions can be + achieved by running ``ls -l /boot/init*$(make -s kernelrelease)*``. Those + steps are recommended, as there are quite a few Linux distribution where above + command is insufficient: + + * On Arch Linux, its derivatives, many immutable Linux distributions, and a + few others the above command does nothing at, as they lack 'installkernel' + executable. + + * Some distributions install the kernel, but don't add an entry for your + kernel in your boot loader's configuration -- the kernel thus won't show up + in the boot menu. + + * Some distributions add a boot loader menu entry, but don't create an + initramfs on installation -- in that case your kernel most likely will be + unable to mount the root partition during bootup. + + If any of that applies to you, see the reference section for further guidance. + Once you figured out what to do, consider writing down the necessary + installation steps: if you will build more kernels as described in + segment 2 and 3, you will have to execute these commands every time that + ``command -v installkernel [...]`` comes up again. + + [:ref:`details`] + +.. _storagespace_bissbs: + +* In case you plan to follow this guide further, check how much storage space + the kernel, its modules, and other related files like the initramfs consume:: + + du -ch /boot/*$(make -s kernelrelease)* | tail -n 1 + du -sh /lib/modules/$(make -s kernelrelease)/ + + Write down or remember those two values for later: they enable you to prevent + running out of disk space accidentally during a bisection. + + [:ref:`details`] + +.. _kernelrelease_bissbs: + +* Show and store the kernelrelease identifier of the kernel you just built:: + + make -s kernelrelease | tee -a ~/kernels-built + + Remember the identifier momentarily, as it will help you pick the right kernel + from the boot menu upon restarting. + +.. _recheckbroken_bissbs: + +* Reboot into the kernel you just built and check if the feature that is + expected to be broken really is. + + Start by making sure the kernel you booted is the one you just built. When + unsure, check if the output of these commands show the exact same release + identifier:: + + tail -n 1 ~/kernels-built + uname -r + + Now verify if the feature that causes trouble works with your newly built + kernel. If things work while investigating a regression, check the reference + section for further details. + + [:ref:`details`] + +.. _recheckstablebroken_bissbs: + +* Are you facing a problem within a stable/longterm release, but failed to + reproduce it with the mainline kernel you just built? Then check if the latest + codebase for the particular series might already fix the problem. To do so, + add the stable series Git branch for your 'good' kernel and check out the + latest version:: + + cd ~/linux/ + git remote set-branches --add stable linux-6.0.y + git fetch stable + git checkout --force --detach linux-6.0.y + + Now use the checked out code to build and install another kernel using the + commands the earlier steps already described in more detail:: + + cp ~/kernel-config-working .config + make olddefconfig + make -j $(nproc --all) + # * Check if the free space suffices holding another kernel: + df -h /boot/ /lib/modules/ + command -v installkernel && sudo make modules_install install + make -s kernelrelease | tee -a ~/kernels-built + reboot + + Now verify if you booted the kernel you intended to start, to then check if + everything works fine with this kernel:: + + tail -n 1 ~/kernels-built + uname -r + + [:ref:`details`] + +Do you follow this guide to verify if a problem is present in the code +currently supported by Linux kernel developers? Then you are done at this +point. If you later want to remove the kernel you just built, check out +:ref:`Supplementary tasks: cleanup during and after following this guide.`. + +In case you face a regression, move on and execute at least the next segment +as well. + +.. _introworkingcheck_bissbs: + +Segment 2: check if the kernels you build work fine +--------------------------------------------------- + +In case of a regression, you now want to ensure the trimmed configuration file +you created earlier works as expected; a bisection with the .config file +otherwise would be a waste of time. [:ref:`details`] + +.. _recheckworking_bissbs: + +* Build your own variant of the 'working' kernel and check if the feature that + regressed works as expected with it. + + Start by checking out the sources for the version earlier established as + 'good':: + + cd ~/linux/ + git checkout --detach v6.0 + + Now use the checked out code to configure, build, and install another kernel + using the commands the previous subsection explained in more detail:: + + cp ~/kernel-config-working .config + make olddefconfig + make -j $(nproc --all) + # * Check if the free space suffices holding another kernel: + df -h /boot/ /lib/modules/ + command -v installkernel && sudo make modules_install install + make -s kernelrelease | tee -a ~/kernels-built + reboot + + When the system booted, you may want to verify once again that the + kernel you started is the one you just built: + + tail -n 1 ~/kernels-built + uname -r + + Now check if this kernel works as expected; if not, consult the reference + section for further instructions. + + [:ref:`details`] + +.. _introbisect_bissbs: + +Segment 3: perform the bisection and validate the result +-------------------------------------------------------- + +With all the preparations and precaution builds taken care of, you are now ready +to begin the bisection. This will make you build quite a few kernels -- usually +about 15 in case you encountered a regression when updating to a newer series +(say from 6.0.11 to 6.1.3). But do not worry, due to the trimmed build +configuration created earlier this works a lot faster than many people assume: +overall on average it will often just take about 10 to 15 minutes to compile +each kernel on commodity x86 machines. + +* In case your 'bad' version is a stable/longterm release (say v6.1.5), add its + stable branch, unless you already did so earlier:: + + cd ~/linux/ + git remote add -t master stable \ + https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable.git + git remote set-branches --add stable linux-6.1.y + git fetch stable + +.. _bisectstart_bissbs: + +* Start the bisection and tell Git about the versions earlier established as + 'good' and 'bad':: + + cd ~/linux/ + git bisect start + git bisect good v6.0 + git bisect bad v6.1.5 + + [:ref:`details`] + +.. _bisectbuild_bissbs: + +* Now use the code Git checked out to build, install, and boot a kernel using + the commands introduced earlier:: + + cp ~/kernel-config-working .config + make olddefconfig + make -j $(nproc --all) + # * Check if the free space suffices holding another kernel: + df -h /boot/ /lib/modules/ + command -v installkernel && sudo make modules_install install + make -s kernelrelease | tee -a ~/kernels-built + reboot + + If compilation fails for some reason, run ``git bisect skip`` and restart + executing the stack of commands from the beginning. + + In case you skipped the "test latest codebase" step in the guide, check its + description as for why the 'df [...]' and 'make -s kernelrelease [...]' + commands are here. + + Important note: the latter command from this point on will print release + identifiers that might look odd or wrong to you -- which they are not, as it's + totally normal to see release identifiers like '6.0-rc1-local-gcafec0cacaca0' + if you bisect between versions 6.1 and 6.2 for example. + + [:ref:`details`] + +.. _bisecttest_bissbs: + +* Now check if the feature that regressed works in the kernel you just built. + + You again might want to start by making sure the kernel you booted is the one + you just built:: + + cd ~/linux/ + tail -n 1 ~/kernels-built + uname -r + + Now verify if the feature that regressed works at this kernel bisection point. + If it does, run this:: + + git bisect good + + If it does not, run this:: + + git bisect bad + + Be sure about what you tell Git, as getting this wrong just once will send the + rest of the bisection totally off course. + + While the bisection is ongoing, Git will use the information you provided to + find and check out another bisection point for you to test. While doing so, it + will print something like 'Bisecting: 675 revisions left to test after this + (roughly 10 steps)' to indicate how many further changes it expects to be + tested. Now build and install another kernel using the instructions from the + previous step; afterwards follow the instructions in this step again. + + Repeat this again and again until you finish the bisection -- that's the case + when Git after tagging a change as 'good' or 'bad' prints something like + 'cafecaca0c0dacafecaca0c0dacafecaca0c0da is the first bad commit'; right + afterwards it will show some details about the culprit including the patch + description of the change. The latter might fill your terminal screen, so you + might need to scroll up to see the message mentioning the culprit; + alternatively, run ``git bisect log > ~/bisection-log``. + + [:ref:`details`] + +.. _bisectlog_bissbs: + +* Store Git's bisection log and the current .config file in a safe place before + telling Git to reset the sources to the state before the bisection:: + + cd ~/linux/ + git bisect log > ~/bisection-log + cp .config ~/bisection-config-culprit + git bisect reset + + [:ref:`details`] + +.. _revert_bissbs: + +* Try reverting the culprit on top of latest mainline to see if this fixes your + regression. + + This is optional, as it might be impossible or hard to realize. The former is + the case, if the bisection determined a merge commit as the culprit; the + latter happens if other changes depend on the culprit. But if the revert + succeeds, it is worth building another kernel, as it validates the result of + a bisection, which can easily deroute; it furthermore will let kernel + developers know, if they can resolve the regression with a quick revert. + + Begin by checking out the latest codebase depending on the range you bisected: + + * Did you face a regression within a stable/longterm series (say between + 6.0.11 and 6.0.13) that does not happen in mainline? Then check out the + latest codebase for the affected series like this:: + + git fetch stable + git checkout --force --detach linux-6.0.y + + * In all other cases check out latest mainline:: + + git fetch mainline + git checkout --force --detach mainline/master + + If you bisected a regression within a stable/longterm series that also + happens in mainline, there is one more thing to do: look up the mainline + commit-id. To do so, use a command like ``git show abcdcafecabcd`` to + view the patch description of the culprit. There will be a line near + the top which looks like 'commit cafec0cacaca0 upstream.' or + 'Upstream commit cafec0cacaca0'; use that commit-id in the next command + and not the one the bisection blamed. + + Now try reverting the culprit by specifying its commit id:: + + git revert --no-edit cafec0cacaca0 + + If that fails, give up trying and move on to the next step. But if it works, + build a kernel again using the familiar command sequence:: + + cp ~/kernel-config-working .config + make olddefconfig && + make -j $(nproc --all) && + # * Check if the free space suffices holding another kernel: + df -h /boot/ /lib/modules/ + command -v installkernel && sudo make modules_install install + Make -s kernelrelease | tee -a ~/kernels-built + reboot + + Now check one last time if the feature that made you perform a bisection work + with that kernel. + + [:ref:`details`] + +.. _introclosure_bissbs: + +Supplementary tasks: cleanup during and after the bisection +----------------------------------------------------------- + +During and after following this guide you might want or need to remove some of +the kernels you installed: the boot menu otherwise will become confusing or +space might run out. + +.. _makeroom_bissbs: + +* To remove one of the kernels you installed, look up its 'kernelrelease' + identifier. This guide stores them in '~/kernels-built', but the following + command will print them as well:: + + ls -ltr /lib/modules/*-local* + + You in most situations want to remove the oldest kernels built during the + actual bisection (e.g. segment 3 of this guide). The two ones you created + beforehand (e.g. to test the latest codebase and the version considered + 'good') might become handy to verify something later -- thus better keep them + around, unless you are really short on storage space. + + To remove the modules of a kernel with the kernelrelease identifier + '*6.0-rc1-local-gcafec0cacaca0*', start by removing the directory holding its + modules:: + + sudo rm -rf /lib/modules/6.0-rc1-local-gcafec0cacaca0 + + Afterwards try the following command:: + + sudo kernel-install -v remove 6.0-rc1-local-gcafec0cacaca0 + + On quite a few distributions this will delete all other kernel files installed + while also removing the kernel's entry from the boot menu. But on some + distributions this command does not exist or will fail; in that case consult + the reference section, as your Linux distribution needs special care. + + [:ref:`details`] + +.. _finishingtouch_bissbs: + +* Once you have finished the bisection, do not immediately remove anything you + set up, as you might need a few things again. What is safe to remove depends + on the outcome of the bisection: + + * Could you initially reproduce the regression with the latest codebase and + after the bisection were able to fix the problem by reverting the culprit on + top of the latest codebase? Then you want to keep those two kernels around + for a while, but safely remove all others with a '-local' in the release + identifier. + + * Did the bisection end on a merge-commit or seems questionable for other + reasons? Then you want to keep as many kernels as possible around for a few + days: it's pretty likely that you will be asked to recheck something. + + * In other cases it likely is a good idea to keep the following kernels around + for some time: the one built from the latest codebase, the one created from + the version considered 'good', and the last three or four you compiled + during the actual bisection process. + + [:ref:`details`] + +.. _submit_improvements: + +This concludes the step-by-step guide. + +Did you run into trouble following any of the above steps not cleared up by the +reference section below? Did you spot errors? Or do you have ideas how to +improve the guide? Then please take a moment and let the maintainer of this +document know by email (Thorsten Leemhuis ), ideally while +CCing the Linux docs mailing list (linux-doc@vger.kernel.org). Such feedback is +vital to improve this document further, which is in everybody's interest, as it +will enable more people to master the task described here -- and hopefully also +improve similar guides inspired by this one. + + +Reference section for the step-by-step guide +============================================ + +This section holds additional information for almost all the items in the above +step-by-step guide. + +.. _backup_bisref: + +Prepare for emergencies +----------------------- + + *Create a fresh backup and put system repair and restore tools at hand.* + [:ref:`... `] + +Remember, you are dealing with computers, which sometimes do unexpected things +-- especially if you fiddle with crucial parts like the kernel of an operating +system. That's what you are about to do in this process. Hence, better prepare +for something going sideways, even if that should not happen. + +[:ref:`back to step-by-step guide `] + +.. _vanilla_bisref: + +Remove anything related to externally maintained kernel modules +--------------------------------------------------------------- + + *Remove all software that depends on externally developed kernel drivers or + builds them automatically.* [:ref:`...`] + +Externally developed kernel modules can easily cause trouble during a bisection. + +But there is a more important reason why this guide contains this step: most +kernel developers will not care about reports about regressions occurring with +kernels that utilize such modules. That's because such kernels are not +considered 'vanilla' anymore, as Documentation/admin-guide/reporting-issues.rst +explains in more detail. + +[:ref:`back to step-by-step guide `] + +.. _secureboot_bisref: + +Deal with techniques like Secure Boot +------------------------------------- + + *On platforms with 'Secure Boot' or similar techniques, prepare everything to + ensure the system will permit your self-compiled kernel to boot later.* + [:ref:`... `] + +Many modern systems allow only certain operating systems to start; that's why +they reject booting self-compiled kernels by default. + +You ideally deal with this by making your platform trust your self-built kernels +with the help of a certificate. How to do that is not described +here, as it requires various steps that would take the text too far away from +its purpose; 'Documentation/admin-guide/module-signing.rst' and various web +sides already explain everything needed in more detail. + +Temporarily disabling solutions like Secure Boot is another way to make your own +Linux boot. On commodity x86 systems it is possible to do this in the BIOS Setup +utility; the required steps vary a lot between machines and therefore cannot be +described here. + +On mainstream x86 Linux distributions there is a third and universal option: +disable all Secure Boot restrictions for your Linux environment. You can +initiate this process by running ``mokutil --disable-validation``; this will +tell you to create a one-time password, which is safe to write down. Now +restart; right after your BIOS performed all self-tests the bootloader Shim will +show a blue box with a message 'Press any key to perform MOK management'. Hit +some key before the countdown exposes, which will open a menu. Choose 'Change +Secure Boot state'. Shim's 'MokManager' will now ask you to enter three +randomly chosen characters from the one-time password specified earlier. Once +you provided them, confirm you really want to disable the validation. +Afterwards, permit MokManager to reboot the machine. + +[:ref:`back to step-by-step guide `] + +.. _bootworking_bisref: + +Boot the last kernel that was working +------------------------------------- + + *Boot into the last working kernel and briefly recheck if the feature that + regressed really works.* [:ref:`...`] + +This will make later steps that cover creating and trimming the configuration do +the right thing. + +[:ref:`back to step-by-step guide `] + +.. _buildrequires_bisref: + +.. _diskspace_bisref: + +Space requirements +------------------ + + *Ensure to have enough free space for building Linux.* + [:ref:`... `] + +The numbers mentioned are rough estimates with a big extra charge to be on the +safe side, so often you will need less. + +If you have space constraints, be sure to hay attention to the :ref:`step about +debug symbols' ` and its :ref:`accompanying reference +section' `, as disabling then will reduce the consumed disk +space by quite a few gigabytes. + +[:ref:`back to step-by-step guide `] + +.. _rangecheck_bisref: + +Bisection range +--------------- + + *Determine the kernel versions considered 'good' and 'bad' throughout this + guide.* [:ref:`...`] + +Establishing the range of commits to be checked is mostly straightforward, +except when a regression occurred when switching from a release of one stable +series to a release of a later series (e.g. from 6.0.11 to 6.1.4). In that case +Git will need some hand holding, as there is no straight line of descent. + +That's because with the release of 6.0 mainline carried on to 6.1 while the +stable series 6.0.y branched to the side. It's therefore theoretically possible +that the issue you face with 6.1.4 only worked in 6.0.11, as it was fixed by a +commit that went into one of the 6.0.y releases, but never hit mainline or the +6.1.y series. Thankfully that normally should not happen due to the way the +stable/longterm maintainers maintain the code. It's thus pretty safe to assume +6.0 as a 'good' kernel. That assumption will be tested anyway, as that kernel +will be built and tested in the segment '2' of this guide; Git would force you +to do this as well, if you tried bisecting between 6.0.11 and 6.1.13. + +[:ref:`back to step-by-step guide `] + +.. _sources_bisref: + +Install build requirements +-------------------------- + + *Install all software required to build a Linux kernel.* + [:ref:`...`] + +The kernel is pretty stand-alone, but besides tools like the compiler you will +sometimes need a few libraries to build one. How to install everything needed +depends on your Linux distribution and the configuration of the kernel you are +about to build. + +Here are a few examples what you typically need on some mainstream +distributions: + +* Debian, Ubuntu, and derivatives:: + + sudo apt install bc binutils bison dwarves flex gcc git make openssl \ + pahole perl-base libssl-dev libelf-dev + +* Fedora and derivatives:: + + sudo dnf install binutils /usr/include/{libelf.h,openssl/pkcs7.h} \ + /usr/bin/{bc,bison,flex,gcc,git,openssl,make,perl,pahole} + +* openSUSE and derivatives:: + + sudo zypper install bc binutils bison dwarves flex gcc git make perl-base \ + openssl openssl-devel libelf-dev + +In case you wonder why these lists include openssl and its development headers: +they are needed for the Secure Boot support, which many distributions enable in +their kernel configuration for x86 machines. + +Sometimes you will need tools for compression formats like bzip2, gzip, lz4, +lzma, lzo, xz, or zstd as well. + +In case you want to adjust the build configuration with make targets like +'menuconfig' or 'xconfig' later, ensure to also install development headers for +ncurses and Qt5. + +You furthermore might need additional libraries and their development headers +for tasks not covered in this guide. For example, zlib will be needed when +building kernel tools from the tools/ directory;. + +[:ref:`back to step-by-step guide `] + +Download the sources using git +------------------------------ + + *Retrieve the Linux mainline sources.* + [:ref:`...`] + +The step-by-step guide outlines how to retrieve the Linux sources using a full +Git clone of Linus' mainline repository. If you have an unreliable internet +connection, you instead might want to use a 'Git bundle' to retrieve the +sources; if downloading the complete repository would take too long or requires +too much storage space, use a shallow clone instead. + +Downloading Linux mainline using a bundle +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Switch to you home directory and follow the instructions `kernel.org provides +for this case `_. + +Afterwards add the stable Git repository as remote and all required +stable/branches as explained in the step-by-step guide. + +Downloading Linux mainline using a shallow clone +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +First, execute the following command to retrieve the latest mainline codebase:: + + git clone -o mainline --no-checkout --depth 1 -b master \ + https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ~/linux/ + cd ~/linux/ + +Now deepen your clone's history to the second predecessor of the mainline +release of your 'good' version. In case the latter are 6.0 or 6.0.11, 5.19 would +be the first predecessor and 5.18 the second -- hence deepen the history up to +the latter:: + + git fetch --shallow-exclude=v5.18 mainline + +Afterwards add the stable Git repository as remote and all required stable +branches as explained in the step-by-step guide. + +Note, shallow clones have a few peculiar characteristics: + + * For bisections the history needs to be deepend a few mainline versions + farther than it seems necessary, as explained above already. That's because + Git otherwise will be unable to revert or describe most of the commits within + a range (say v6.1..v6.2), as they are internally based on earlier kernels + releases (like v6.0-rc2 or 5.19-rc3). + + * This document in most places uses ``git fetch`` with ``--shallow-exclude=`` + to specify the earliest version you care about (or to be precise: its git + tag). You alternatively can use the parameter ``--shallow-since=`` to specify + an absolute (say ``'2023-07-15'``) or relative (``'12 months'``) date to + define the depth of the history you want to download. When using them while + bisecting mainline, ensure to deepen the history to at least 7 months before + the release of the mainline release your 'good' kernel is based on. + + * Be warned, when deepening your clone you might encounter an error like + 'fatal: error in object: unshallow cafecaca0c0dacafecaca0c0dacafecaca0c0da'. + In that case run ``git repack -d`` and try again. + +[:ref:`back to step-by-step guide `] +[:ref:`back to section intro `] + +.. _oldconfig_bisref: + +Start defining the build configuration for your kernel +------------------------------------------------------ + + *Start preparing a kernel build configuration (the '.config' file).* + [:ref:`... `] + +*Note, this is the first of multiple steps in this guide that create or modify +build artifacts. The commands used in this guide store them right in the source +tree to keep things simple. In case you prefer storing the build artifacts +separately, create a directory like '~/linux-builddir/' and add the parameter +``O=~/linux-builddir/`` to all make calls used throughout this guide. You will +have to point other commands there as well -- among them the ``./scripts/config +[...]`` commands, which will require ``--file ~/linux-builddir/.config`` to +locate the right build configuration.* + +Two things can easily go wrong when creating a .config file as advised: + + * The oldconfig target will use a .config file from your build directory, if + one is already present there (e.g. '~/linux/.config'). That's totally fine if + that's what you intend (see next step), but in all other cases you want to + delete it. This for example is important in case you followed this guide + further, but due to problems come back here to redo the configuration from + scratch. + + * Sometimes olddefconfig is unable to locate the .config file for your running + kernel and will use defaults, as briefly outlined in the guide. In that case + check if your distribution ships the configuration somewhere and manually put + it in the right place (e.g. '~/linux/.config') if it does. On distributions + where /proc/config.gz exists this can be achieved using this command:: + + zcat /proc/config.gz > .config + + Once you put it there, run ``make olddefconfig`` again to adjust it to the + needs of the kernel about to be built. + +Note, the olddefconfig target will set any undefined build options to their +default value. If you prefer to set such configuration options manually, use +``make oldconfig`` instead. Then for each undefined configuration option you +will be asked how to proceed; in case you are unsure what to answer, simply hit +'enter' to apply the default value. Note though that for bisections you normally +want to go with the defaults, as you otherwise might enable a new feature that +causes a problem looking like regressions (for example due to security +restrictions). + +Occasionally odd things happen when trying to use a config file prepared for one +kernel (say 6.1) on an older mainline release -- especially if it is much older +(say v5.15). That's one of the reasons why the previous step in the guide told +you to boot the kernel where everything works. If you manually add a .config +file you thus want to ensure it's from the working kernel and not from a one +that shows the regression. + +In case you want to build kernels for another machine, locate its kernel build +configuration; usually ``ls /boot/config-$(uname -r)`` will print its name. Copy +that file to the build machine and store it as ~/linux/.config; afterwards run +``make olddefconfig`` to adjust it. + +[:ref:`back to step-by-step guide `] + +.. _localmodconfig_bisref: + +Trim the build configuration for your kernel +-------------------------------------------- + + *Disable any kernel modules apparently superfluous for your setup.* + [:ref:`... `] + +As explained briefly in the step-by-step guide already: with localmodconfig it +can easily happen that your self-built kernels will lack modules for tasks you +did not perform at least once before utilizing this make target. That happens +when a task requires kernel modules which are only autoloaded when you execute +it for the first time. So when you never performed that task since starting your +kernel the modules will not have been loaded -- and from localmodonfig's point +of view look superfluous, which thus disables them to reduce the amount of code +to be compiled. + +You can try to avoid this by performing typical tasks that often will autoload +additional kernel modules: start a VM, establish VPN connections, loop-mount a +CD/DVD ISO, mount network shares (CIFS, NFS, ...), and connect all external +devices (2FA keys, headsets, webcams, ...) as well as storage devices with file +systems you otherwise do not utilize (btrfs, ext4, FAT, NTFS, XFS, ...). But it +is hard to think of everything that might be needed -- even kernel developers +often forget one thing or another at this point. + +Do not let that risk bother you, especially when compiling a kernel only for +testing purposes: everything typically crucial will be there. And if you forget +something important you can turn on a missing feature manually later and quickly +run the commands again to compile and install a kernel that has everything you +need. + +But if you plan to build and use self-built kernels regularly, you might want to +reduce the risk by recording which modules your system loads over the course of +a few weeks. You can automate this with `modprobed-db +`_. Afterwards use ``LSMOD=`` to +point localmodconfig to the list of modules modprobed-db noticed being used:: + + yes '' | make LSMOD='${HOME}'/.config/modprobed.db localmodconfig + +That parameter also allows you to build trimmed kernels for another machine in +case you copied a suitable .config over to use as base (see previous step). Just +run ``lsmod > lsmod_foo-machine`` on that system and copy the generated file to +your build's host home directory. Then run these commands instead of the one the +step-by-step guide mentions:: + + yes '' | make LSMOD=~/lsmod_foo-machine localmodconfig + +[:ref:`back to step-by-step guide `] + +.. _tagging_bisref: + +Tag the kernels about to be build +--------------------------------- + + *Ensure all the kernels you will build are clearly identifiable using a + special tag and a unique version identifier.* [:ref:`... `] + +This allows you to differentiate your distribution's kernels from those created +during this process, as the file or directories for the latter will contain +'-local' in the name; it also helps picking the right entry in the boot menu and +not lose track of you kernels, as their version numbers will look slightly +confusing during the bisection. + +[:ref:`back to step-by-step guide `] + +.. _debugsymbols_bisref: + +Decide to enable or disable debug symbols +----------------------------------------- + + *Decide how to handle debug symbols.* [:ref:`... `] + +Having debug symbols available can be important when your kernel throws a +'panic', 'Oops', 'warning', or 'BUG' later when running, as then you will be +able to find the exact place where the problem occurred in the code. But +collecting and embedding the needed debug information takes time and consumes +quite a bit of space: in late 2022 the build artifacts for a typical x86 kernel +trimmed with localmodconfig consumed around 5 Gigabyte of space with debug +symbols, but less than 1 when they were disabled. The resulting kernel image and +modules are bigger as well, which increases storage requirements for /boot/ and +load times. + +In case you want a small kernel and are unlikely to decode a stack trace later, +you thus might want to disable debug symbols to avoid those downsides. If it +later turns out that you need them, just enable them as shown and rebuild the +kernel. + +You on the other hand definitely want to enable them for this process, if there +is a decent chance that you need to decode a stack trace later. The section +'Decode failure messages' in Documentation/admin-guide/reporting-issues.rst +explains this process in more detail. + +[:ref:`back to step-by-step guide `] + +.. _configmods_bisref: + +Adjust build configuration +-------------------------- + + *Check if you may want or need to adjust some other kernel configuration + options:* + +Depending on your needs you at this point might want or have to adjust some +kernel configuration options. + +.. _configmods_distros_bisref: + +Distro specific adjustments +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + *Are you running* [:ref:`... `] + +The following sections help you to avoid build problems that are known to occur +when following this guide on a few commodity distributions. + +**Debian:** + + * Remove a stale reference to a certificate file that would cause your build to + fail:: + + ./scripts/config --set-str SYSTEM_TRUSTED_KEYS '' + + Alternatively, download the needed certificate and make that configuration + option point to it, as `the Debian handbook explains in more detail + `_ + -- or generate your own, as explained in + Documentation/admin-guide/module-signing.rst. + +[:ref:`back to step-by-step guide `] + +.. _configmods_individual_bisref: + +Individual adjustments +~~~~~~~~~~~~~~~~~~~~~~ + + *If you want to influence the other aspects of the configuration, do so + now.* [:ref:`... `] + +You at this point can use a command like ``make menuconfig`` to enable or +disable certain features using a text-based user interface; to use a graphical +configuration utility, call the make target ``xconfig`` or ``gconfig`` instead. +All of them require development libraries from toolkits they are based on +(ncurses, Qt5, Gtk2); an error message will tell you if something required is +missing. + +[:ref:`back to step-by-step guide `] + +.. _saveconfig_bisref: + +Put the .config file aside +-------------------------- + + *Reprocess the .config after the latest changes and store it in a safe place.* + [:ref:`... `] + +Put the .config you prepared aside, as you want to copy it back to the build +directory every time during this guide before you start building another +kernel. That's because going back and forth between different versions can alter +.config files in odd ways; those occasionally cause side effects that could +confuse testing or in some cases render the result of your bisection +meaningless. + +[:ref:`back to step-by-step guide `] + +.. _introlatestcheck_bisref: + +Try to reproduce the regression +----------------------------------------- + + *Verify the regression is not caused by some .config change and check if it + still occurs with the latest codebase.* [:ref:`... `] + +For some readers it might seem unnecessary to check the latest codebase at this +point, especially if you did that already with a kernel prepared by your +distributor or face a regression within a stable/longterm series. But it's +highly recommended for these reasons: + +* You will run into any problems caused by your setup before you actually begin + a bisection. That will make it a lot easier to differentiate between 'this + most likely is some problem in my setup' and 'this change needs to be skipped + during the bisection, as the kernel sources at that stage contain an unrelated + problem that causes building or booting to fail'. + +* These steps will rule out if your problem is caused by some change in the + build configuration between the 'working' and the 'broken' kernel. This for + example can happen when your distributor enabled an additional security + feature in the newer kernel which was disabled or not yet supported by the + older kernel. That security feature might get into the way of something you + do -- in which case your problem from the perspective of the Linux kernel + upstream developers is not a regression, as + Documentation/admin-guide/reporting-regressions.rst explains in more detail. + You thus would waste your time if you'd try to bisect this. + +* If the cause for your regression was already fixed in the latest mainline + codebase, you'd perform the bisection for nothing. This holds true for a + regression you encountered with a stable/longterm release as well, as they are + often caused by problems in mainline changes that were backported -- in which + case the problem will have to be fixed in mainline first. Maybe it already was + fixed there and the fix is already in the process of being backported. + +* For regressions within a stable/longterm series it's furthermore crucial to + know if the issue is specific to that series or also happens in the mainline + kernel, as the report needs to be sent to different people: + + * Regressions specific to a stable/longterm series are the stable team's + responsibility; mainline Linux developers might or might not care. + + * Regressions also happening in mainline are something the regular Linux + developers and maintainers have to handle; the stable team does not care + and does not need to be involved in the report, they just should be told + to backport the fix once it's ready. + + Your report might be ignored if you send it to the wrong party -- and even + when you get a reply there is a decent chance that developers tell you to + evaluate which of the two cases it is before they take a closer look. + +[:ref:`back to step-by-step guide `] + +.. _checkoutmaster_bisref: + +Checkout the latest Linux codebase +---------------------------------- + + *Checkout the latest Linux codebase.* + [:ref:`... `] + +In case you later want to recheck if an ever newer codebase might fix the +problem, remember to run that ``git fetch --shallow-exclude [...]`` command +again mentioned earlier to update your local Git repository. + +[:ref:`back to step-by-step guide `] + +.. _build_bisref: + +Build your kernel +----------------- + + *Build the image and the modules of your first kernel using the config file + you prepared.* [:ref:`... `] + +A lot can go wrong at this stage, but the instructions below will help you help +yourself. Another subsection explains how to directly package your kernel up as +deb, rpm or tar file. + +Dealing with build errors +~~~~~~~~~~~~~~~~~~~~~~~~~ + +When a build error occurs, it might be caused by some aspect of your machine's +setup that often can be fixed quickly; other times though the problem lies in +the code and can only be fixed by a developer. A close examination of the +failure messages coupled with some research on the internet will often tell you +which of the two it is. To perform such a investigation, restart the build +process like this:: + + make V=1 + +The ``V=1`` activates verbose output, which might be needed to see the actual +error. To make it easier to spot, this command also omits the ``-j $(nproc +--all)`` used earlier to utilize every CPU core in the system for the job -- but +this parallelism also results in some clutter when failures occur. + +After a few seconds the build process should run into the error again. Now try +to find the most crucial line describing the problem. Then search the internet +for the most important and non-generic section of that line (say 4 to 8 words); +avoid or remove anything that looks remotely system-specific, like your username +or local path names like ``/home/username/linux/``. First try your regular +internet search engine with that string, afterwards search Linux kernel mailing +lists via `lore.kernel.org/all/ `_. + +This most of the time will find something that will explain what is wrong; quite +often one of the hits will provide a solution for your problem, too. If you +do not find anything that matches your problem, try again from a different angle +by modifying your search terms or using another line from the error messages. + +In the end, most trouble you are to run into has likely been encountered and +reported by others already. That includes issues where the cause is not your +system, but lies the code. If you run into one of those, you might thus find a +solution (e.g. a patch) or workaround for your problem, too. + +Package your kernel up +~~~~~~~~~~~~~~~~~~~~~~ + +The step-by-step guide uses the default make targets (e.g. 'bzImage' and +'modules' on x86) to build the image and the modules of your kernel, which later +steps of the guide then install. You instead can also directly build everything +and directly package it up by using one of the following targets: + + * ``make -j $(nproc --all) bindeb-pkg`` to generate a deb package + + * ``make -j $(nproc --all) binrpm-pkg`` to generate a rpm package + + * ``make -j $(nproc --all) tarbz2-pkg`` to generate a bz2 compressed tarball + +This is just a selection of available make targets for this purpose, see +``make help`` for others. You can also use these targets after running +``make -j $(nproc --all)``, as they will pick up everything already built. + +If you employ the targets to generate deb or rpm packages, ignore the +step-by-step guide's instructions on installing and removing your kernel; +instead install and remove the packages using the package utility for the format +(e.g. dpkg and rpm) or a package management utility build on top of them (apt, +aptitude, dnf/yum, zypper, ...). Be aware that the packages generated using +these two make targets are designed to work on various distributions utilizing +those formats, they thus will sometimes behave differently than your +distribution's kernel packages. + +[:ref:`back to step-by-step guide `] + +.. _install_bisref: + +Put the kernel in place +----------------------- + + *Install the kernel you just built.* [:ref:`... `] + +What you need to do after executing the command in the step-by-step guide +depends on the existence and the implementation of an ``installkernel`` +executable. Many commodity Linux distributions ship such a kernel installer in +'/sbin/' that does everything needed, hence there is nothing left for you +except rebooting. But some distributions contain an installkernel that does +only part of the job -- and a few lack it completely and leave all the work to +you. + +If ``installkernel`` is found, the kernel's build system will delegate the +actual installation of your kernel's image and related files to this executable. +On almost all Linux distributions it will store the image as '/boot/vmlinuz- +' and put a 'System.map-' alongside it. Your kernel will thus be installed in parallel to any +existing ones, unless you already have one with exactly the same release name. + +Installkernel on many distributions will afterwards generate an 'initramfs' +(often also called 'initrd'), which commodity distributions rely on for booting; +hence be sure to keep the order of the two make targets used in the step-by-step +guide, as things will go sideways if you install your kernel's image before its +modules. Often installkernel will then add your kernel to the bootloader +configuration, too. You have to take care of one or both of these tasks +yourself, if your distributions installkernel doesn't handle them. + +A few distributions like Arch Linux and its derivatives totally lack an +installkernel executable. On those just install the modules using the kernel's +build system and then install the image and the System.map file manually:: + + sudo make modules_install + sudo install -m 0600 $(make -s image_name) /boot/vmlinuz-$(make -s kernelrelease) + sudo install -m 0600 System.map /boot/System.map-$(make -s kernelrelease) + +If your distribution boots with the help of an initramfs, now generate one for +your kernel using the tools your distribution provides for this process. +Afterwards add your kernel to your bootloader configuration and reboot. + +[:ref:`back to step-by-step guide `] + +.. _storagespace_bisref: + +Storage requirements per kernel +------------------------------- + + *Check how much storage space the kernel, its modules, and other related files + like the initramfs consume.* [:ref:`... `] + +The kernels built during a bisection consume quite a bit of space in /boot/ and +/lib/modules/, especially if you enabled debug symbols. That makes it easy to +fill up volumes during a bisection -- and due to that even kernels which used to +work earlier might fail to boot. To prevent that you will need to know how much +space each installed kernel typically requires. + +Note, most of the time the pattern '/boot/*$(make -s kernelrelease)*' used in +the guide will match all files needed to boot your kernel -- but neither the +path nor the naming scheme are mandatory. On some distributions you thus will +need to look in different places. + +[:ref:`back to step-by-step guide `] + +.. _recheckbroken_bisref: + +Check the kernel built from the latest codebase +----------------------------------------------- + + *Reboot into the kernel you just built and check if the feature that regressed + is really broken there.* [:ref:`... `] + +There are a couple of reasons why the regression you face might not show up with +your own kernel built from the latest codebase. These are the most frequent: + +* The cause for the regression was fixed meanwhile. + +* The regression with the broken kernel was caused by a change in the build + configuration the provider of your kernel carried out. + +* Your problem might be a race condition that does not show up with your kernel; + the trimmed build configuration, a different setting for debug symbols, the + compiler used, and various other things can cause this. + +* In case you encountered the regression with a stable/longterm kernel it might + be a problem that is specific to that series; the next step in this guide will + check this. + +[:ref:`back to step-by-step guide `] + +.. _recheckstablebroken_bisref: + +Check the kernel built from the latest stable/longterm codebase +--------------------------------------------------------------- + + *Are you facing a regression within a stable/longterm release, but failed to + reproduce it with the kernel you just built using the latest mainline sources? + Then check if the latest codebase for the particular series might already fix + the problem.* [:ref:`... `] + +If this kernel does not show the regression either, there most likely is no need +for a bisection. + +[:ref:`back to step-by-step guide `] + +.. _introworkingcheck_bisref: + +Ensure the 'good' version is really working well +------------------------------------------------ + + *Check if the kernels you build work fine.* + [:ref:`... `] + +This section will reestablish a known working base. Skipping it might be +appealing, but is usually a bad idea, as it does something important: + +It will ensure the .config file you prepared earlier actually works as expected. +That is in your own interest, as trimming the configuration is not foolproof -- +and you might be building and testing ten or more kernels for nothing before +starting to suspect something might be wrong with the build configuration. + +That alone is reason enough to spend the time on this, but not the only reason. + +Many readers of this guide normally run kernels that are patched, use add-on +modules, or both. Those kernels thus are not considered 'vanilla' -- therefore +it's possible that the thing that regressed might never have worked in vanilla +builds of the 'good' version in the first place. + +There is a third reason for those that noticed a regression between +stable/longterm kernels of different series (e.g. v6.0.13..v6.1.5): it will +ensure the kernel version you assumed to be 'good' earlier in the process (e.g. +v6.0) actually is working. + +[:ref:`back to step-by-step guide `] + +.. _recheckworking_bisref: + +Build your own version of the 'good' kernel +------------------------------------------- + + *Build your own variant of the working kernel and check if the feature that + regressed works as expected with it.* [:ref:`... `] + +In case the feature that broke with newer kernels does not work with your first +self-built kernel, find and resolve the cause before moving on. There are a +multitude of reasons why this might happen. Some ideas where to look: + +* Maybe localmodconfig did something odd and disabled the module required to + test the feature? Then you might want to recreate a .config file based on the + one from the last working kernel and skip trimming it down; manually disabling + some features in the .config might work as well to reduce the build time. + +* Maybe it's not a kernel regression and something that is caused by some fluke, + a broken initramfs (also known as initrd), new firmware files, or an updated + userland software? + +* Maybe it was a feature added to your distributor's kernel which vanilla Linux + at that point never supported? + +Note, if you found and fixed problems with the .config file, you want to use it +to build another kernel from the latest codebase, as your earlier tests with +mainline and the latest version from an affected stable/longterm series most +likely has been flawed. + +[:ref:`back to step-by-step guide `] + +.. _bisectstart_bisref: + +Start the bisection +------------------- + + *Start the bisection and tell Git about the versions earlier established as + 'good' and 'bad'.* [:ref:`... `] + +This will start the bisection process; the last of the commands will make Git +checkout a commit round about half-way between the 'good' and the 'bad' changes +for your to test. + +[:ref:`back to step-by-step guide `] + +.. _bisectbuild_bisref: + +Build a kernel from the bisection point +--------------------------------------- + + *Build, install, and boot a kernel from the code Git checked out using the + same commands you used earlier.* [:ref:`... `] + +There are two things worth of note here: + +* Occasionally building the kernel will fail or it might not boot due some + problem in the code at the bisection point. In that case run this command:: + + git bisect skip + + Git will then check out another commit nearby which with a bit of luck should + work better. Afterwards restart executing this step. + +* Those slightly odd looking version identifiers can happen during bisections, + because the Linux kernel subsystems prepare their changes for a new mainline + release (say 6.2) before its predecessor (e.g. 6.1) is finished. They thus + base them on a somewhat earlier point like v6.1-rc1 or even v6.0 -- and then + get merged for 6.2 without rebasing nor squashing them once 6.1 is out. This + leads to those slightly odd looking version identifiers coming up during + bisections. + +[:ref:`back to step-by-step guide `] + +.. _bisecttest_bisref: + +Bisection checkpoint +-------------------- + + *Check if the feature that regressed works in the kernel you just built.* + [:ref:`... `] + +Ensure what you tell Git is accurate: getting it wrong just one time will bring +the rest of the bisection totally of course, hence all testing after that point +will be for nothing. + +[:ref:`back to step-by-step guide `] + +.. _bisectlog_bisref: + +Put the bisection log away +-------------------------- + + *Store Git's bisection log and the current .config file in a safe place.* + [:ref:`... `] + +As indicated above: declaring just one kernel wrongly as 'good' or 'bad' will +render the end result of a bisection useless. In that case you'd normally have +to restart the bisection from scratch. The log can prevent that, as it might +allow someone to point out where a bisection likely went sideways -- and then +instead of testing ten or more kernels you might only have to build a few to +resolve things. + +The .config file is put aside, as there is a decent chance that developers might +ask for it after you reported the regression. + +[:ref:`back to step-by-step guide `] + +.. _revert_bisref: + +Try reverting the culprit +------------------------- + + *Try reverting the culprit on top of the latest codebase to see if this fixes + your regression.* [:ref:`... `] + +This is an optional step, but whenever possible one you should try: there is a +decent chance that developers will ask you to perform this step when you bring +the bisection result up. So give it a try, you are in the flow already, building +one more kernel shouldn't be a big deal at this point. + +The step-by-step guide covers everything relevant already except one slightly +rare thing: did you bisected a regression that also happened with mainline using +a stable/longterm series, but Git failed to revert the commit in mainline? Then +try to revert the culprit in the affected stable/longterm series -- and if that +succeeds, test that kernel version instead. + +[:ref:`back to step-by-step guide `] + + +Supplementary tasks: cleanup during and after the bisection +----------------------------------------------------------- + +.. _makeroom_bisref: + +Cleaning up during the bisection +-------------------------------- + + *To remove one of the kernels you installed, look up its 'kernelrelease' + identifier.* [:ref:`... `] + +The kernels you install during this process are easy to remove later, as its +parts are only stored in two places and clearly identifiable. You thus do not +need to worry to mess up your machine when you install a kernel manually (and +thus bypass your distribution's packaging system): all parts of your kernels are +relatively easy to remove later. + +One of the two places is a directory in /lib/modules/, which holds the modules +for each installed kernel. This directory is named after the kernel's release +identifier; hence, to remove all modules for one of the kernels you built, +simply remove its modules directory in /lib/modules/. + +The other place is /boot/, where typically two up to five files will be placed +during installation of a kernel. All of them usually contain the release name in +their file name, but how many files and their exact name depends somewhat on +your distribution's installkernel executable and its initramfs generator. On +some distributions the ``kernel-install remove...`` command mentioned in the +step-by-step guide will delete all of these files for you while also removing +the menu entry for the kernel from your bootloader configuration. On others you +have to take care of these two tasks yourself. The following command should +interactively remove the three main files of a kernel with the release name +'6.0-rc1-local-gcafec0cacaca0':: + + rm -i /boot/{System.map,vmlinuz,initr}-6.0-rc1-local-gcafec0cacaca0 + +Afterwards check for other files in /boot/ that have +'6.0-rc1-local-gcafec0cacaca0' in their name and consider deleting them as well. +Now remove the boot entry for the kernel from your bootloader's configuration; +the steps to do that vary quite a bit between Linux distributions. + +Note, be careful with wildcards like '*' when deleting files or directories +for kernels manually: you might accidentally remove files of a 6.0.11 kernel +when all you want is to remove 6.0 or 6.0.1. + +[:ref:`back to step-by-step guide `] + +Cleaning up after the bisection +------------------------------- + +.. _finishingtouch_bisref: + + *Once you have finished the bisection, do not immediately remove anything + you set up, as you might need a few things again.* + [:ref:`... `] + +When you are really short of storage space removing the kernels as described in +the step-by-step guide might not free as much space as you would like. In that +case consider running ``rm -rf ~/linux/*`` as well now. This will remove the +build artifacts and the Linux sources, but will leave the Git repository +(~/linux/.git/) behind -- a simple ``git reset --hard`` thus will bring the +sources back. + +Removing the repository as well would likely be unwise at this point: there is a +decent chance developers will ask you to build another kernel to perform +additional tests. This is often required to debug an issue or check proposed +fixes. Before doing so you want to run the ``git fetch mainline`` command again +followed by ``git checkout mainline/master`` to bring your clone up to date and +checkout the latest codebase. Then apply the patch using ``git apply +`` or ``git am `` and build yet another kernel using the +familiar commands. + +Additional tests are also the reason why you want to keep the +~/kernel-config-working file around for a few weeks. + +[:ref:`back to step-by-step guide `] + + +Additional reading material +=========================== + +Further sources +--------------- + +* The `man page for 'git bisect' `_ and + `fighting regressions with 'git bisect' `_ + in the Git documentation. +* `Working with git bisect `_ + from kernel developer Nathan Chancellor. +* `Using Git bisect to figure out when brokenness was introduced `_. +* `Fully automated bisecting with 'git bisect run' `_. + +.. + end-of-content +.. + This document is maintained by Thorsten Leemhuis . If + you spot a typo or small mistake, feel free to let him know directly and + he'll fix it. You are free to do the same in a mostly informal way if you + want to contribute changes to the text -- but for copyright reasons please CC + linux-doc@vger.kernel.org and 'sign-off' your contribution as + Documentation/process/submitting-patches.rst explains in the section 'Sign + your work - the Developer's Certificate of Origin'. +.. + This text is available under GPL-2.0+ or CC-BY-4.0, as stated at the top + of the file. If you want to distribute this text under CC-BY-4.0 only, + please use 'The Linux kernel development community' for author attribution + and link this as source: + https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst + +.. + Note: Only the content of this RST file as found in the Linux kernel sources + is available under CC-BY-4.0, as versions of this text that were processed + (for example by the kernel's build system) might contain content taken from + files which use a more restrictive license. diff --git a/MAINTAINERS b/MAINTAINERS index 36d69d90aa8e..741d9142b343 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6373,6 +6373,7 @@ L: linux-doc@vger.kernel.org S: Maintained F: Documentation/admin-guide/quickly-build-trimmed-linux.rst F: Documentation/admin-guide/reporting-issues.rst +F: Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst DOCUMENTATION SCRIPTS M: Mauro Carvalho Chehab -- cgit v1.2.3 From 0c8e9b538ed7ecf4159b080ab0dafca3941c69db Mon Sep 17 00:00:00 2001 From: Thorsten Leemhuis Date: Wed, 6 Mar 2024 10:21:12 +0100 Subject: docs: verify/bisect: fixes, finetuning, and support for Arch Assorted changes for the recently added document. Improvements: * Add instructions for installing required software on Arch Linux. Fixes: * Move a 'git remote add -t master stable [...]' from a totally wrong to the right place. * Fix two anchors. * Add two required packages to the openSUSE install instructions. Fine tuning: * Improve the reference section about downloading Linux mainline sources to make it more obvious that those are alternatives. * Include the full instructions for git bundles to ensure the remote gets the right name; that way the text also works stand alone. * Install ncurses and qt headers for use of menuconfig and xconfig by default, but tell users that they are free to omit them. * Mention ahead of time which version number are meant as example in commands used during the step-by-step guide. * Mention that 'kernel-install remove' might do a incomplete job. Signed-off-by: Thorsten Leemhuis Signed-off-by: Jonathan Corbet Message-ID: <6592c9ef4244faa484b4113f088dbc1beca61015.1709716794.git.linux@leemhuis.info> --- .../verify-bugs-and-bisect-regressions.rst | 135 +++++++++++++-------- 1 file changed, 84 insertions(+), 51 deletions(-) (limited to 'Documentation/admin-guide') diff --git a/Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst b/Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst index 54bde8bac95c..58211840ac6f 100644 --- a/Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst +++ b/Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst @@ -192,8 +192,8 @@ will be considered the 'good' release and used to prepare the .config file. sudo rm -rf /lib/modules/6.0-rc1-local-gcafec0cacaca0 sudo kernel-install -v remove 6.0-rc1-local-gcafec0cacaca0 - # * Note, if kernel-install is missing, you will have to - # manually remove the kernel image and related files. + # * Note, on some distributions kernel-install is missing + # or does only part of the job. b) If you performed a bisection and successfully validated the result, feel free to remove all kernels built during the actual bisection (Segment 3 c); @@ -348,11 +348,14 @@ Preparations: set up everything to build your own kernels one downloads less than 500 MByte, the other works better with unreliable internet connections.* - Execute the following command to retrieve a fresh mainline codebase:: + Execute the following command to retrieve a fresh mainline codebase while + preparing things to add stable/longterm branches later:: git clone -o mainline --no-checkout \ https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ~/linux/ cd ~/linux/ + git remote add -t master stable \ + https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git [:ref:`details`] @@ -365,7 +368,7 @@ Preparations: set up everything to build your own kernels identifier using ``uname -r``. Afterwards check out the source code for the version earlier established as - 'good' and create a .config file:: + 'good' (in this example this is assumed to be 6.0) and create a .config file:: git checkout --detach v6.0 make olddefconfig @@ -462,8 +465,10 @@ Preparations: set up everything to build your own kernels [:ref:`details`]. - * If you want to influence other aspects of the configuration, do so now - by using make targets like 'menuconfig' or 'xconfig'. + * If you want to influence other aspects of the configuration, do so now using + your preferred tool. Note, to use make targets like 'menuconfig' or + 'nconfig', you will need to install the development files of ncurses; for + 'xconfig' you likewise need the Qt5 or Qt6 headers. [:ref:`details`]. @@ -601,8 +606,8 @@ be a waste of time. [:ref:`details`] * Are you facing a problem within a stable/longterm release, but failed to reproduce it with the mainline kernel you just built? Then check if the latest codebase for the particular series might already fix the problem. To do so, - add the stable series Git branch for your 'good' kernel and check out the - latest version:: + add the stable series Git branch for your 'good' kernel (again, this here is + assumed to be 6.0) and check out the latest version:: cd ~/linux/ git remote set-branches --add stable linux-6.0.y @@ -652,7 +657,7 @@ otherwise would be a waste of time. [:ref:`details`] regressed works as expected with it. Start by checking out the sources for the version earlier established as - 'good':: + 'good' (once again assumed to be 6.0 here):: cd ~/linux/ git checkout --detach v6.0 @@ -697,15 +702,13 @@ each kernel on commodity x86 machines. stable branch, unless you already did so earlier:: cd ~/linux/ - git remote add -t master stable \ - https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable.git git remote set-branches --add stable linux-6.1.y git fetch stable .. _bisectstart_bissbs: * Start the bisection and tell Git about the versions earlier established as - 'good' and 'bad':: + 'good' (6.0 in the following example command) and 'bad' (6.1.5):: cd ~/linux/ git bisect start @@ -884,8 +887,9 @@ space might run out. On quite a few distributions this will delete all other kernel files installed while also removing the kernel's entry from the boot menu. But on some - distributions this command does not exist or will fail; in that case consult - the reference section, as your Linux distribution needs special care. + distributions kernel-install does not exist or leaves boot-loader entries or + kernel image and related files behind; in that case remove them as described + in the reference section. [:ref:`details`] @@ -1015,8 +1019,6 @@ the right thing. [:ref:`back to step-by-step guide `] -.. _buildrequires_bisref: - .. _diskspace_bisref: Space requirements @@ -1060,7 +1062,7 @@ to do this as well, if you tried bisecting between 6.0.11 and 6.1.13. [:ref:`back to step-by-step guide `] -.. _sources_bisref: +.. _buildrequires_bisref: Install build requirements -------------------------- @@ -1076,72 +1078,103 @@ about to build. Here are a few examples what you typically need on some mainstream distributions: +* Arch Linux and derivatives:: + + sudo pacman --needed -S bc binutils bison flex gcc git kmod libelf openssl \ + pahole perl zlib ncurses qt6-base + * Debian, Ubuntu, and derivatives:: - sudo apt install bc binutils bison dwarves flex gcc git make openssl \ - pahole perl-base libssl-dev libelf-dev + sudo apt install bc binutils bison dwarves flex gcc git kmod libelf-dev \ + libssl-dev make openssl pahole perl-base pkg-config zlib1g-dev \ + libncurses-dev qt6-base-dev g++ * Fedora and derivatives:: - sudo dnf install binutils /usr/include/{libelf.h,openssl/pkcs7.h} \ - /usr/bin/{bc,bison,flex,gcc,git,openssl,make,perl,pahole} + sudo dnf install binutils \ + /usr/bin/{bc,bison,flex,gcc,git,openssl,make,perl,pahole,rpmbuild} \ + /usr/include/{libelf.h,openssl/pkcs7.h,zlib.h,ncurses.h,qt6/QtGui/QAction} * openSUSE and derivatives:: - sudo zypper install bc binutils bison dwarves flex gcc git make perl-base \ - openssl openssl-devel libelf-dev - -In case you wonder why these lists include openssl and its development headers: -they are needed for the Secure Boot support, which many distributions enable in -their kernel configuration for x86 machines. + sudo zypper install bc binutils bison dwarves flex gcc git \ + kernel-install-tools libelf-devel make modutils openssl openssl-devel \ + perl-base zlib-devel rpm-build ncurses-devel qt6-base-devel -Sometimes you will need tools for compression formats like bzip2, gzip, lz4, -lzma, lzo, xz, or zstd as well. - -In case you want to adjust the build configuration with make targets like -'menuconfig' or 'xconfig' later, ensure to also install development headers for -ncurses and Qt5. +These commands install a few packages that are often, but not always needed. You +for example might want to skip installing the development headers for ncurses, +which you will only need in case you later might want to adjust the kernel build +configuration using make the targets 'menuconfig' or 'nconfig'; likewise omit +the headers of Qt6 is you do not plan to adjust the .config using 'xconfig'. You furthermore might need additional libraries and their development headers -for tasks not covered in this guide. For example, zlib will be needed when -building kernel tools from the tools/ directory;. +for tasks not covered in this guide -- for example when building utilities from +the kernel's tools/ directory. [:ref:`back to step-by-step guide `] -Download the sources using git +.. _sources_bisref: + +Download the sources using Git ------------------------------ *Retrieve the Linux mainline sources.* [:ref:`...`] -The step-by-step guide outlines how to retrieve the Linux sources using a full -Git clone of Linus' mainline repository. If you have an unreliable internet -connection, you instead might want to use a 'Git bundle' to retrieve the -sources; if downloading the complete repository would take too long or requires -too much storage space, use a shallow clone instead. +The step-by-step guide outlines how to download the Linux sources using a full +Git clone of Linus' mainline repository. There is nothing more to say about +that -- but there are two alternatives ways to retrieve the sources that might +work better for you: -Downloading Linux mainline using a bundle -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + * If you have an unreliable internet connection, consider + :ref:`using a 'Git bundle'`. -Switch to you home directory and follow the instructions `kernel.org provides -for this case `_. + * If downloading the complete repository would take too long or requires too + much storage space, consider :ref:`using a 'shallow + clone'`. -Afterwards add the stable Git repository as remote and all required -stable/branches as explained in the step-by-step guide. +.. _sources_bundle_bisref: -Downloading Linux mainline using a shallow clone -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Downloading Linux mainline sources using a bundle +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Use the following commands to retrieve the Linux mainline sources using a +bundle:: + + wget -c \ + https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/clone.bundle + git clone --no-checkout clone.bundle ~/linux/ + cd ~/linux/ + git remote remove origin + git remote add mainline \ + https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git + git fetch mainline + git remote add -t master stable \ + https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git + +In case the 'wget' command fails, just re-execute it, it will pick up where +it left off. + +[:ref:`back to step-by-step guide `] +[:ref:`back to section intro `] + +.. _sources_shallow_bisref: + +Downloading Linux mainline sources using a shallow clone +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ First, execute the following command to retrieve the latest mainline codebase:: git clone -o mainline --no-checkout --depth 1 -b master \ https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ~/linux/ cd ~/linux/ + git remote add -t master stable \ + https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git Now deepen your clone's history to the second predecessor of the mainline release of your 'good' version. In case the latter are 6.0 or 6.0.11, 5.19 would be the first predecessor and 5.18 the second -- hence deepen the history up to -the latter:: +that version:: git fetch --shallow-exclude=v5.18 mainline @@ -1150,7 +1183,7 @@ branches as explained in the step-by-step guide. Note, shallow clones have a few peculiar characteristics: - * For bisections the history needs to be deepend a few mainline versions + * For bisections the history needs to be deepened a few mainline versions farther than it seems necessary, as explained above already. That's because Git otherwise will be unable to revert or describe most of the commits within a range (say v6.1..v6.2), as they are internally based on earlier kernels -- cgit v1.2.3