diff options
Diffstat (limited to 'Documentation')
192 files changed, 6610 insertions, 1907 deletions
diff --git a/Documentation/ABI/obsolete/sysfs-class-cxl b/Documentation/ABI/removed/sysfs-class-cxl index 8cba1b626985..266c413b96e8 100644 --- a/Documentation/ABI/obsolete/sysfs-class-cxl +++ b/Documentation/ABI/removed/sysfs-class-cxl @@ -1,5 +1,4 @@ -The cxl driver is no longer maintained, and will be removed from the kernel in -the near future. +The cxl driver was removed in 6.15. Please note that attributes that are shared between devices are stored in the directory pointed to by the symlink device/. @@ -10,7 +9,7 @@ For example, the real path of the attribute /sys/class/cxl/afu0.0s/irqs_max is Slave contexts (eg. /sys/class/cxl/afu0.0s): What: /sys/class/cxl/<afu>/afu_err_buf -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only AFU Error Buffer contents. The contents of this file are @@ -21,7 +20,7 @@ Description: read only What: /sys/class/cxl/<afu>/irqs_max -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read/write Decimal value of maximum number of interrupts that can be @@ -32,7 +31,7 @@ Description: read/write Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>/irqs_min -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Decimal value of the minimum number of interrupts that @@ -42,7 +41,7 @@ Description: read only Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>/mmio_size -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Decimal value of the size of the MMIO space that may be mmapped @@ -50,7 +49,7 @@ Description: read only Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>/modes_supported -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only List of the modes this AFU supports. One per line. @@ -58,7 +57,7 @@ Description: read only Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>/mode -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read/write The current mode the AFU is using. Will be one of the modes @@ -68,7 +67,7 @@ Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>/prefault_mode -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read/write Set the mode for prefaulting in segments into the segment table @@ -88,7 +87,7 @@ Description: read/write Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>/reset -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: write only Writing 1 here will reset the AFU provided there are not @@ -96,14 +95,14 @@ Description: write only Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>/api_version -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Decimal value of the current version of the kernel/user API. Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>/api_version_compatible -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Decimal value of the lowest version of the userspace API @@ -117,7 +116,7 @@ An AFU may optionally export one or more PCIe like configuration records, known as AFU configuration records, which will show up here (if present). What: /sys/class/cxl/<afu>/cr<config num>/vendor -Date: February 2015 +Date: February 2015, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Hexadecimal value of the vendor ID found in this AFU @@ -125,7 +124,7 @@ Description: read only Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>/cr<config num>/device -Date: February 2015 +Date: February 2015, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Hexadecimal value of the device ID found in this AFU @@ -133,7 +132,7 @@ Description: read only Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>/cr<config num>/class -Date: February 2015 +Date: February 2015, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Hexadecimal value of the class code found in this AFU @@ -141,7 +140,7 @@ Description: read only Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>/cr<config num>/config -Date: February 2015 +Date: February 2015, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only This binary file provides raw access to the AFU configuration @@ -155,7 +154,7 @@ Users: https://github.com/ibm-capi/libcxl Master contexts (eg. /sys/class/cxl/afu0.0m) What: /sys/class/cxl/<afu>m/mmio_size -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Decimal value of the size of the MMIO space that may be mmapped @@ -163,14 +162,14 @@ Description: read only Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>m/pp_mmio_len -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Decimal value of the Per Process MMIO space length. Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<afu>m/pp_mmio_off -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only (not in a guest) @@ -181,21 +180,21 @@ Users: https://github.com/ibm-capi/libcxl Card info (eg. /sys/class/cxl/card0) What: /sys/class/cxl/<card>/caia_version -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Identifies the CAIA Version the card implements. Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<card>/psl_revision -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Identifies the revision level of the PSL. Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<card>/base_image -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only (not in a guest) @@ -206,7 +205,7 @@ Description: read only Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<card>/image_loaded -Date: September 2014 +Date: September 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only (not in a guest) @@ -215,7 +214,7 @@ Description: read only Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<card>/load_image_on_perst -Date: December 2014 +Date: December 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read/write (not in a guest) @@ -232,7 +231,7 @@ Description: read/write Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<card>/reset -Date: October 2014 +Date: October 2014, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: write only Writing 1 will issue a PERST to card provided there are no @@ -243,7 +242,7 @@ Description: write only Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<card>/perst_reloads_same_image -Date: July 2015 +Date: July 2015, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read/write (not in a guest) @@ -257,7 +256,7 @@ Description: read/write Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<card>/psl_timebase_synced -Date: March 2016 +Date: March 2016, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Returns 1 if the psl timebase register is synchronized @@ -265,7 +264,7 @@ Description: read only Users: https://github.com/ibm-capi/libcxl What: /sys/class/cxl/<card>/tunneled_ops_supported -Date: May 2018 +Date: May 2018, removed February 2025 Contact: linuxppc-dev@lists.ozlabs.org Description: read only Returns 1 if tunneled operations are supported in capi mode, diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node index 402af4b2b905..a02707cb7cbc 100644 --- a/Documentation/ABI/stable/sysfs-devices-node +++ b/Documentation/ABI/stable/sysfs-devices-node @@ -177,6 +177,12 @@ Description: The cache write policy: 0 for write-back, 1 for write-through, other or unknown. +What: /sys/devices/system/node/nodeX/memory_side_cache/indexY/address_mode +Date: March 2025 +Contact: Dave Jiang <dave.jiang@intel.com> +Description: + The address mode: 0 for reserved, 1 for extended-linear. + What: /sys/devices/system/node/nodeX/x86/sgx_total_bytes Date: November 2021 Contact: Jarkko Sakkinen <jarkko@kernel.org> diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram index 1ef69e0271f9..36c57de0a10a 100644 --- a/Documentation/ABI/testing/sysfs-block-zram +++ b/Documentation/ABI/testing/sysfs-block-zram @@ -22,14 +22,6 @@ Description: device. The reset operation frees all the memory associated with this device. -What: /sys/block/zram<id>/max_comp_streams -Date: February 2014 -Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> -Description: - The max_comp_streams file is read-write and specifies the - number of backend's zcomp_strm compression streams (number of - concurrent compress operations). - What: /sys/block/zram<id>/comp_algorithm Date: February 2014 Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm index 53cb454b60d0..a341b08ae70b 100644 --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm @@ -257,3 +257,18 @@ Contact: Jinlong Mao (QUIC) <quic_jinlmao@quicinc.com>, Tao Zhang (QUIC) <quic_t Description: (RW) Set/Get the MSR(mux select register) for the CMB subunit TPDM. + +What: /sys/bus/coresight/devices/<tpdm-name>/mcmb_trig_lane +Date: Feb 2025 +KernelVersion 6.15 +Contact: Jinlong Mao (QUIC) <quic_jinlmao@quicinc.com>, Tao Zhang (QUIC) <quic_taozha@quicinc.com> +Description: + (RW) Set/Get which lane participates in the output pattern + match cross trigger mechanism for the MCMB subunit TPDM. + +What: /sys/bus/coresight/devices/<tpdm-name>/mcmb_lanes_select +Date: Feb 2025 +KernelVersion 6.15 +Contact: Jinlong Mao (QUIC) <quic_jinlmao@quicinc.com>, Tao Zhang (QUIC) <quic_taozha@quicinc.com> +Description: + (RW) Set/Get the enablement of the individual lane. diff --git a/Documentation/ABI/testing/sysfs-bus-counter b/Documentation/ABI/testing/sysfs-bus-counter index 73ac84c0bca7..3e8259e56d38 100644 --- a/Documentation/ABI/testing/sysfs-bus-counter +++ b/Documentation/ABI/testing/sysfs-bus-counter @@ -34,6 +34,14 @@ Contact: linux-iio@vger.kernel.org Description: Count data of Count Y represented as a string. +What: /sys/bus/counter/devices/counterX/countY/compare +KernelVersion: 6.15 +Contact: linux-iio@vger.kernel.org +Description: + If the counter device supports compare registers -- registers + used to compare counter channels against a particular count -- + the compare count for channel Y is provided by this attribute. + What: /sys/bus/counter/devices/counterX/countY/capture KernelVersion: 6.1 Contact: linux-iio@vger.kernel.org @@ -301,6 +309,7 @@ Description: What: /sys/bus/counter/devices/counterX/cascade_counts_enable_component_id What: /sys/bus/counter/devices/counterX/external_input_phase_clock_select_component_id +What: /sys/bus/counter/devices/counterX/countY/compare_component_id What: /sys/bus/counter/devices/counterX/countY/capture_component_id What: /sys/bus/counter/devices/counterX/countY/ceiling_component_id What: /sys/bus/counter/devices/counterX/countY/floor_component_id diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index 3f5627a1210a..99bb3faf7a0e 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -1,5 +1,5 @@ What: /sys/bus/cxl/flush -Date: Januarry, 2022 +Date: January, 2022 KernelVersion: v5.18 Contact: linux-cxl@vger.kernel.org Description: @@ -18,6 +18,24 @@ Description: specification. +What: /sys/bus/cxl/devices/memX/payload_max +Date: December, 2020 +KernelVersion: v5.12 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) Maximum size (in bytes) of the mailbox command payload + registers. Linux caps this at 1MB if the device reports a + larger size. + + +What: /sys/bus/cxl/devices/memX/label_storage_size +Date: May, 2021 +KernelVersion: v5.13 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) Size (in bytes) of the Label Storage Area (LSA). + + What: /sys/bus/cxl/devices/memX/ram/size Date: December, 2020 KernelVersion: v5.12 @@ -33,7 +51,7 @@ Date: May, 2023 KernelVersion: v6.8 Contact: linux-cxl@vger.kernel.org Description: - (RO) For CXL host platforms that support "QoS Telemmetry" + (RO) For CXL host platforms that support "QoS Telemetry" this attribute conveys a comma delimited list of platform specific cookies that identifies a QoS performance class for the volatile partition of the CXL mem device. These @@ -60,7 +78,7 @@ Date: May, 2023 KernelVersion: v6.8 Contact: linux-cxl@vger.kernel.org Description: - (RO) For CXL host platforms that support "QoS Telemmetry" + (RO) For CXL host platforms that support "QoS Telemetry" this attribute conveys a comma delimited list of platform specific cookies that identifies a QoS performance class for the persistent partition of the CXL mem device. These @@ -321,14 +339,13 @@ KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it - translates from a host physical address range, to a device local - address range. Device-local address ranges are further split - into a 'ram' (volatile memory) range and 'pmem' (persistent - memory) range. The 'mode' attribute emits one of 'ram', 'pmem', - 'mixed', or 'none'. The 'mixed' indication is for error cases - when a decoder straddles the volatile/persistent partition - boundary, and 'none' indicates the decoder is not actively - decoding, or no DPA allocation policy has been set. + translates from a host physical address range, to a device + local address range. Device-local address ranges are further + split into a 'ram' (volatile memory) range and 'pmem' + (persistent memory) range. The 'mode' attribute emits one of + 'ram', 'pmem', or 'none'. The 'none' indicates the decoder is + not actively decoding, or no DPA allocation policy has been + set. 'mode' can be written, when the decoder is in the 'disabled' state, with either 'ram' or 'pmem' to set the boundaries for the @@ -423,7 +440,7 @@ Date: May, 2023 KernelVersion: v6.5 Contact: linux-cxl@vger.kernel.org Description: - (RO) For CXL host platforms that support "QoS Telemmetry" this + (RO) For CXL host platforms that support "QoS Telemetry" this root-decoder-only attribute conveys a platform specific cookie that identifies a QoS performance class for the CXL Window. This class-id can be compared against a similar "qos_class" @@ -586,3 +603,15 @@ Description: See Documentation/ABI/stable/sysfs-devices-node. access0 provides the number to the closest initiator and access1 provides the number to the closest CPU. + + +What: /sys/bus/cxl/devices/nvdimm-bridge0/ndbusX/nmemY/cxl/dirty_shutdown +Date: Feb, 2025 +KernelVersion: v6.15 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) The device dirty shutdown count value, which is the number + of times the device could have incurred in potential data loss. + The count is persistent across power loss and wraps back to 0 + upon overflow. If this file is not present, the device does not + have the necessary support for dirty tracking. diff --git a/Documentation/ABI/testing/sysfs-bus-iio b/Documentation/ABI/testing/sysfs-bus-iio index 25d366d452a5..722aa989baac 100644 --- a/Documentation/ABI/testing/sysfs-bus-iio +++ b/Documentation/ABI/testing/sysfs-bus-iio @@ -2268,7 +2268,7 @@ Description: representing the sensor unique ID number. What: /sys/bus/iio/devices/iio:deviceX/filter_type_available -What: /sys/bus/iio/devices/iio:deviceX/in_voltage-voltage_filter_mode_available +What: /sys/bus/iio/devices/iio:deviceX/in_voltage-voltage_filter_type_available KernelVersion: 6.1 Contact: linux-iio@vger.kernel.org Description: @@ -2290,6 +2290,16 @@ Description: * "sinc3+pf2" - Sinc3 + device specific Post Filter 2. * "sinc3+pf3" - Sinc3 + device specific Post Filter 3. * "sinc3+pf4" - Sinc3 + device specific Post Filter 4. + * "wideband" - filter with wideband low ripple passband + and sharp transition band. + +What: /sys/bus/iio/devices/iio:deviceX/filter_type +What: /sys/bus/iio/devices/iio:deviceX/in_voltageY-voltageZ_filter_type +KernelVersion: 6.1 +Contact: linux-iio@vger.kernel.org +Description: + Specifies which filter type apply to the channel. The possible + values are given by the filter_type_available attribute. What: /sys/.../events/in_proximity_thresh_either_runningperiod KernelVersion: 6.6 diff --git a/Documentation/ABI/testing/sysfs-bus-iio-adc-ad4130 b/Documentation/ABI/testing/sysfs-bus-iio-adc-ad4130 new file mode 100644 index 000000000000..d3fad27421d6 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-bus-iio-adc-ad4130 @@ -0,0 +1,20 @@ +What: /sys/bus/iio/devices/iio:deviceX/in_voltage-voltage_filter_mode_available +KernelVersion: 6.2 +Contact: linux-iio@vger.kernel.org +Description: + Reading returns a list with the possible filter modes. + + This ABI is only kept for backwards compatibility and the values + returned are identical to filter_type_available attribute + documented in Documentation/ABI/testing/sysfs-bus-iio. Please, + use filter_type_available like ABI to provide filter options for + new drivers. + +What: /sys/bus/iio/devices/iio:deviceX/in_voltageY-voltageZ_filter_mode +KernelVersion: 6.2 +Contact: linux-iio@vger.kernel.org +Description: + This ABI is only kept for backwards compatibility and the values + returned are identical to in_voltageY-voltageZ_filter_type + attribute documented in Documentation/ABI/testing/sysfs-bus-iio. + Please, use in_voltageY-voltageZ_filter_type for new drivers. diff --git a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc index c12316dfd973..a6e400364932 100644 --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc @@ -17,7 +17,7 @@ Description: Read only. Returns the firmware version of Intel MAX10 What: /sys/bus/.../drivers/intel-m10-bmc/.../mac_address Date: January 2021 KernelVersion: 5.12 -Contact: Peter Colberg <peter.colberg@intel.com> +Contact: Peter Colberg <peter.colberg@altera.com> Description: Read only. Returns the first MAC address in a block of sequential MAC addresses assigned to the board that is managed by the Intel MAX10 BMC. It is stored in @@ -28,7 +28,7 @@ Description: Read only. Returns the first MAC address in a block What: /sys/bus/.../drivers/intel-m10-bmc/.../mac_count Date: January 2021 KernelVersion: 5.12 -Contact: Peter Colberg <peter.colberg@intel.com> +Contact: Peter Colberg <peter.colberg@altera.com> Description: Read only. Returns the number of sequential MAC addresses assigned to the board managed by the Intel MAX10 BMC. This value is stored in FLASH and is mirrored diff --git a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update index 9051695d2211..c69fd3894eb4 100644 --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update @@ -1,7 +1,7 @@ What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../security/sr_root_entry_hash Date: Sep 2022 KernelVersion: 5.20 -Contact: Peter Colberg <peter.colberg@intel.com> +Contact: Peter Colberg <peter.colberg@altera.com> Description: Read only. Returns the root entry hash for the static region if one is programmed, else it returns the string: "hash not programmed". This file is only @@ -11,7 +11,7 @@ Description: Read only. Returns the root entry hash for the static What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../security/pr_root_entry_hash Date: Sep 2022 KernelVersion: 5.20 -Contact: Peter Colberg <peter.colberg@intel.com> +Contact: Peter Colberg <peter.colberg@altera.com> Description: Read only. Returns the root entry hash for the partial reconfiguration region if one is programmed, else it returns the string: "hash not programmed". This file @@ -21,7 +21,7 @@ Description: Read only. Returns the root entry hash for the partial What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../security/bmc_root_entry_hash Date: Sep 2022 KernelVersion: 5.20 -Contact: Peter Colberg <peter.colberg@intel.com> +Contact: Peter Colberg <peter.colberg@altera.com> Description: Read only. Returns the root entry hash for the BMC image if one is programmed, else it returns the string: "hash not programmed". This file is only visible if the @@ -31,7 +31,7 @@ Description: Read only. Returns the root entry hash for the BMC image What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../security/sr_canceled_csks Date: Sep 2022 KernelVersion: 5.20 -Contact: Peter Colberg <peter.colberg@intel.com> +Contact: Peter Colberg <peter.colberg@altera.com> Description: Read only. Returns a list of indices for canceled code signing keys for the static region. The standard bitmap list format is used (e.g. "1,2-6,9"). @@ -39,7 +39,7 @@ Description: Read only. Returns a list of indices for canceled code What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../security/pr_canceled_csks Date: Sep 2022 KernelVersion: 5.20 -Contact: Peter Colberg <peter.colberg@intel.com> +Contact: Peter Colberg <peter.colberg@altera.com> Description: Read only. Returns a list of indices for canceled code signing keys for the partial reconfiguration region. The standard bitmap list format is used (e.g. "1,2-6,9"). @@ -47,7 +47,7 @@ Description: Read only. Returns a list of indices for canceled code What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../security/bmc_canceled_csks Date: Sep 2022 KernelVersion: 5.20 -Contact: Peter Colberg <peter.colberg@intel.com> +Contact: Peter Colberg <peter.colberg@altera.com> Description: Read only. Returns a list of indices for canceled code signing keys for the BMC. The standard bitmap list format is used (e.g. "1,2-6,9"). @@ -55,7 +55,7 @@ Description: Read only. Returns a list of indices for canceled code What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../security/flash_count Date: Sep 2022 KernelVersion: 5.20 -Contact: Peter Colberg <peter.colberg@intel.com> +Contact: Peter Colberg <peter.colberg@altera.com> Description: Read only. Returns number of times the secure update staging area has been flashed. Format: "%u". diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-cma b/Documentation/ABI/testing/sysfs-kernel-mm-cma index dfd755201142..aaf2a5d8b13b 100644 --- a/Documentation/ABI/testing/sysfs-kernel-mm-cma +++ b/Documentation/ABI/testing/sysfs-kernel-mm-cma @@ -29,3 +29,16 @@ Date: Feb 2024 Contact: Anshuman Khandual <anshuman.khandual@arm.com> Description: the number of pages CMA API succeeded to release + +What: /sys/kernel/mm/cma/<cma-heap-name>/total_pages +Date: Jun 2024 +Contact: Frank van der Linden <fvdl@google.com> +Description: + The size of the CMA area in pages. + +What: /sys/kernel/mm/cma/<cma-heap-name>/available_pages +Date: Jun 2024 +Contact: Frank van der Linden <fvdl@google.com> +Description: + The number of pages in the CMA area that are still + available for CMA allocation. diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-damon b/Documentation/ABI/testing/sysfs-kernel-mm-damon index b057eddefbfc..293197f180ad 100644 --- a/Documentation/ABI/testing/sysfs-kernel-mm-damon +++ b/Documentation/ABI/testing/sysfs-kernel-mm-damon @@ -91,6 +91,36 @@ Description: Writing a value to this file sets the update interval of the DAMON context in microseconds as the value. Reading this file returns the value. +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/monitoring_attrs/intervals/intrvals_goal/access_bp +Date: Feb 2025 +Contact: SeongJae Park <sj@kernel.org> +Description: Writing a value to this file sets the monitoring intervals + auto-tuning target DAMON-observed access events ratio within + the given time interval (aggrs in same directory), in bp + (1/10,000). Reading this file returns the value. + +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/monitoring_attrs/intervals/intrvals_goal/aggrs +Date: Feb 2025 +Contact: SeongJae Park <sj@kernel.org> +Description: Writing a value to this file sets the time interval to achieve + the monitoring intervals auto-tuning target DAMON-observed + access events ratio (access_bp in same directory) within. + Reading this file returns the value. + +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/monitoring_attrs/intervals/intrvals_goal/min_sample_us +Date: Feb 2025 +Contact: SeongJae Park <sj@kernel.org> +Description: Writing a value to this file sets the minimum value of + auto-tuned sampling interval in microseconds. Reading this + file returns the value. + +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/monitoring_attrs/intervals/intrvals_goal/max_sample_us +Date: Feb 2025 +Contact: SeongJae Park <sj@kernel.org> +Description: Writing a value to this file sets the maximum value of + auto-tuned sampling interval in microseconds. Reading this + file returns the value. + What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/monitoring_attrs/nr_regions/min WDate: Mar 2022 @@ -345,6 +375,20 @@ Description: If 'addr' is written to the 'type' file, writing to or reading from this file sets or gets the end address of the address range for the filter. +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/min +Date: Feb 2025 +Contact: SeongJae Park <sj@kernel.org> +Description: If 'hugepage_size' is written to the 'type' file, writing to + or reading from this file sets or gets the minimum size of the + hugepage for the filter. + +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/max +Date: Feb 2025 +Contact: SeongJae Park <sj@kernel.org> +Description: If 'hugepage_size' is written to the 'type' file, writing to + or reading from this file sets or gets the maximum size of the + hugepage for the filter. + What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/target_idx Date: Dec 2022 Contact: SeongJae Park <sj@kernel.org> @@ -365,6 +409,22 @@ Description: Writing 'Y' or 'N' to this file sets whether to allow or reject applying the scheme's action to the memory that satisfies the 'type' and the 'matching' of the directory. +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters +Date: Feb 2025 +Contact: SeongJae Park <sj@kernel.org> +Description: Directory for DAMON core layer-handled DAMOS filters. Files + under this directory works same to those of + /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters + directory. + +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/ops_filters +Date: Feb 2025 +Contact: SeongJae Park <sj@kernel.org> +Description: Directory for DAMON operations set layer-handled DAMOS filters. + Files under this directory works same to those of + /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters + directory. + What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/stats/nr_tried Date: Mar 2022 Contact: SeongJae Park <sj@kernel.org> diff --git a/Documentation/ABI/testing/sysfs-kernel-reboot b/Documentation/ABI/testing/sysfs-kernel-reboot index 837330fb2511..e117aba46be0 100644 --- a/Documentation/ABI/testing/sysfs-kernel-reboot +++ b/Documentation/ABI/testing/sysfs-kernel-reboot @@ -30,3 +30,11 @@ KernelVersion: 5.11 Contact: Matteo Croce <mcroce@microsoft.com> Description: Don't wait for any other CPUs on reboot and avoid anything that could hang. + +What: /sys/kernel/reboot/hw_protection +Date: April 2025 +KernelVersion: 6.15 +Contact: Ahmad Fatoum <a.fatoum@pengutronix.de> +Description: Hardware protection action taken on critical events like + overtemperature or imminent voltage loss. + Valid values are: reboot shutdown diff --git a/Documentation/ABI/testing/sysfs-pps-gen-tio b/Documentation/ABI/testing/sysfs-pps-gen-tio new file mode 100644 index 000000000000..3c34ff17a335 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-pps-gen-tio @@ -0,0 +1,6 @@ +What: /sys/class/pps-gen/pps-genx/enable +Date: April 2025 +KernelVersion: 6.15 +Contact: Subramanian Mohan<subramanian.mohan@intel.com> +Description: + Enable or disable PPS TIO generator output. diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst index 1ef5784c1b84..53faeed7c190 100644 --- a/Documentation/RCU/whatisRCU.rst +++ b/Documentation/RCU/whatisRCU.rst @@ -971,6 +971,16 @@ unfortunately any spinlock in a ``SLAB_TYPESAFE_BY_RCU`` object must be initialized after each and every call to kmem_cache_alloc(), which renders reference-free spinlock acquisition completely unsafe. Therefore, when using ``SLAB_TYPESAFE_BY_RCU``, make proper use of a reference counter. +If using refcount_t, the specialized refcount_{add|inc}_not_zero_acquire() +and refcount_set_release() APIs should be used to ensure correct operation +ordering when verifying object identity and when initializing newly +allocated objects. Acquire fence in refcount_{add|inc}_not_zero_acquire() +ensures that identity checks happen *after* reference count is taken. +refcount_set_release() should be called after a newly allocated object is +fully initialized and release fence ensures that new values are visible +*before* refcount can be successfully taken by other users. Once +refcount_set_release() is called, the object should be considered visible +by other tasks. (Those willing to initialize their locks in a kmem_cache constructor may also use locking, including cache-friendly sequence locking.) diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst index 1576fb93f06c..9bdb30901a93 100644 --- a/Documentation/admin-guide/blockdev/zram.rst +++ b/Documentation/admin-guide/blockdev/zram.rst @@ -54,7 +54,7 @@ The list of possible return codes: If you use 'echo', the returned value is set by the 'echo' utility, and, in general case, something like:: - echo 3 > /sys/block/zram0/max_comp_streams + echo foo > /sys/block/zram0/comp_algorithm if [ $? -ne 0 ]; then handle_error fi @@ -73,21 +73,7 @@ This creates 4 devices: /dev/zram{0,1,2,3} num_devices parameter is optional and tells zram how many devices should be pre-created. Default: 1. -2) Set max number of compression streams -======================================== - -Regardless of the value passed to this attribute, ZRAM will always -allocate multiple compression streams - one per online CPU - thus -allowing several concurrent compression operations. The number of -allocated compression streams goes down when some of the CPUs -become offline. There is no single-compression-stream mode anymore, -unless you are running a UP system or have only 1 CPU online. - -To find out how many streams are currently available:: - - cat /sys/block/zram0/max_comp_streams - -3) Select compression algorithm +2) Select compression algorithm =============================== Using comp_algorithm device attribute one can see available and @@ -107,7 +93,7 @@ Examples:: For the time being, the `comp_algorithm` content shows only compression algorithms that are supported by zram. -4) Set compression algorithm parameters: Optional +3) Set compression algorithm parameters: Optional ================================================= Compression algorithms may support specific parameters which can be @@ -138,7 +124,7 @@ better the compression ratio, it even can take negatives values for some algorithms), for other algorithms `level` is acceleration level (the higher the value the lower the compression ratio). -5) Set Disksize +4) Set Disksize =============== Set disk size by writing the value to sysfs node 'disksize'. @@ -158,7 +144,7 @@ There is little point creating a zram of greater than twice the size of memory since we expect a 2:1 compression ratio. Note that zram uses about 0.1% of the size of the disk when not in use so a huge zram is wasteful. -6) Set memory limit: Optional +5) Set memory limit: Optional ============================= Set memory limit by writing the value to sysfs node 'mem_limit'. @@ -177,7 +163,7 @@ Examples:: # To disable memory limit echo 0 > /sys/block/zram0/mem_limit -7) Activate +6) Activate =========== :: @@ -188,7 +174,7 @@ Examples:: mkfs.ext4 /dev/zram1 mount /dev/zram1 /tmp -8) Add/remove zram devices +7) Add/remove zram devices ========================== zram provides a control interface, which enables dynamic (on-demand) device @@ -208,7 +194,7 @@ execute:: echo X > /sys/class/zram-control/hot_remove -9) Stats +8) Stats ======== Per-device statistics are exported as various nodes under /sys/block/zram<id>/ @@ -228,8 +214,6 @@ mem_limit WO specifies the maximum amount of memory ZRAM can writeback_limit WO specifies the maximum amount of write IO zram can write out to backing device as 4KB unit writeback_limit_enable RW show and set writeback_limit feature -max_comp_streams RW the number of possible concurrent compress - operations comp_algorithm RW show and change the compression algorithm algorithm_params WO setup compression algorithm parameters compact WO trigger memory compaction @@ -310,7 +294,7 @@ a single line of text and contains the following stats separated by whitespace: Unit: 4K bytes ============== ============================================================= -10) Deactivate +9) Deactivate ============== :: @@ -318,7 +302,7 @@ a single line of text and contains the following stats separated by whitespace: swapoff /dev/zram0 umount /dev/zram1 -11) Reset +10) Reset ========= Write any positive value to 'reset' sysfs node:: diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst index 02b8206a3594..d6b1db8cc7eb 100644 --- a/Documentation/admin-guide/cgroup-v1/memory.rst +++ b/Documentation/admin-guide/cgroup-v1/memory.rst @@ -610,6 +610,10 @@ memory.stat file includes following statistics: 'rss + mapped_file" will give you resident set size of cgroup. + Note that some kernel configurations might account complete larger + allocations (e.g., THP) towards 'rss' and 'mapped_file', even if + only some, but not all that memory is mapped. + (Note: file and shmem may be shared among other cgroups. In that case, mapped_file is accounted only when the memory cgroup is owner of page cache.) diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst index f293a13b42ed..1a16ce68a4d7 100644 --- a/Documentation/admin-guide/cgroup-v2.rst +++ b/Documentation/admin-guide/cgroup-v2.rst @@ -1445,7 +1445,10 @@ The following nested keys are defined. anon Amount of memory used in anonymous mappings such as - brk(), sbrk(), and mmap(MAP_ANONYMOUS) + brk(), sbrk(), and mmap(MAP_ANONYMOUS). Note that + some kernel configurations might account complete larger + allocations (e.g., THP) if only some, but not all the + memory of such an allocation is mapped anymore. file Amount of memory used to cache filesystem data, @@ -1488,7 +1491,10 @@ The following nested keys are defined. Amount of application memory swapped out to zswap. file_mapped - Amount of cached filesystem data mapped with mmap() + Amount of cached filesystem data mapped with mmap(). Note + that some kernel configurations might account complete + larger allocations (e.g., THP) if only some, but not + not all the memory of such an allocation is mapped. file_dirty Amount of cached filesystem data that was modified but @@ -1560,6 +1566,12 @@ The following nested keys are defined. workingset_nodereclaim Number of times a shadow node has been reclaimed + pswpin (npn) + Number of pages swapped into memory + + pswpout (npn) + Number of pages swapped out of memory + pgscan (npn) Amount of scanned pages (in an inactive LRU list) @@ -1575,6 +1587,9 @@ The following nested keys are defined. pgscan_khugepaged (npn) Amount of scanned pages by khugepaged (in an inactive LRU list) + pgscan_proactive (npn) + Amount of scanned pages proactively (in an inactive LRU list) + pgsteal_kswapd (npn) Amount of reclaimed pages by kswapd @@ -1584,6 +1599,9 @@ The following nested keys are defined. pgsteal_khugepaged (npn) Amount of reclaimed pages by khugepaged + pgsteal_proactive (npn) + Amount of reclaimed pages proactively + pgfault (npn) Total number of page faults incurred @@ -1661,6 +1679,9 @@ The following nested keys are defined. pgdemote_khugepaged Number of pages demoted by khugepaged. + pgdemote_proactive + Number of pages demoted by proactively. + hugetlb Amount of memory used by hugetlb pages. This metric only shows up if hugetlb usage is accounted for in memory.current (i.e. diff --git a/Documentation/admin-guide/device-mapper/dm-crypt.rst b/Documentation/admin-guide/device-mapper/dm-crypt.rst index 9f8139ff97d6..4467f6d4b632 100644 --- a/Documentation/admin-guide/device-mapper/dm-crypt.rst +++ b/Documentation/admin-guide/device-mapper/dm-crypt.rst @@ -146,6 +146,11 @@ integrity:<bytes>:<type> integrity for the encrypted device. The additional space is then used for storing authentication tag (and persistent IV if needed). +integrity_key_size:<bytes> + Optionally set the integrity key size if it differs from the digest size. + It allows the use of wrapped key algorithms where the key size is + independent of the cryptographic key size. + sector_size:<bytes> Use <bytes> as the encryption unit instead of 512 bytes sectors. This option can be in range 512 - 4096 bytes and must be power of two. diff --git a/Documentation/admin-guide/device-mapper/dm-integrity.rst b/Documentation/admin-guide/device-mapper/dm-integrity.rst index d8a5f14d0e3c..c2e18ecc065c 100644 --- a/Documentation/admin-guide/device-mapper/dm-integrity.rst +++ b/Documentation/admin-guide/device-mapper/dm-integrity.rst @@ -92,6 +92,11 @@ Target arguments: allowed. This mode is useful for data recovery if the device cannot be activated in any of the other standard modes. + I - inline mode - in this mode, dm-integrity will store integrity + data directly in the underlying device sectors. + The underlying device must have an integrity profile that + allows storing user integrity data and provides enough + space for the selected integrity tag. 5. the number of additional arguments diff --git a/Documentation/admin-guide/device-mapper/verity.rst b/Documentation/admin-guide/device-mapper/verity.rst index a65c1602cb23..8c3f1f967a3c 100644 --- a/Documentation/admin-guide/device-mapper/verity.rst +++ b/Documentation/admin-guide/device-mapper/verity.rst @@ -87,6 +87,15 @@ panic_on_corruption Panic the device when a corrupted block is discovered. This option is not compatible with ignore_corruption and restart_on_corruption. +restart_on_error + Restart the system when an I/O error is detected. + This option can be combined with the restart_on_corruption option. + +panic_on_error + Panic the device when an I/O error is detected. This option is + not compatible with the restart_on_error option but can be combined + with the panic_on_corruption option. + ignore_zero_blocks Do not verify blocks that are expected to contain zeroes and always return zeroes instead. This may be useful if the partition contains unused blocks @@ -142,8 +151,15 @@ root_hash_sig_key_desc <key_description> already in the secondary trusted keyring. try_verify_in_tasklet - If verity hashes are in cache, verify data blocks in kernel tasklet instead - of workqueue. This option can reduce IO latency. + If verity hashes are in cache and the IO size does not exceed the limit, + verify data blocks in bottom half instead of workqueue. This option can + reduce IO latency. The size limits can be configured via + /sys/module/dm_verity/parameters/use_bh_bytes. The four parameters + correspond to limits for IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT, + IOPRIO_CLASS_BE and IOPRIO_CLASS_IDLE in turn. + For example: + <none>,<rt>,<be>,<idle> + 4096,4096,4096,4096 Theory of operation =================== diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst index ff0b440ef2dc..451874b8135d 100644 --- a/Documentation/admin-guide/hw-vuln/index.rst +++ b/Documentation/admin-guide/hw-vuln/index.rst @@ -22,3 +22,4 @@ are configurable at compile, boot or run time. srso gather_data_sampling reg-file-data-sampling + rsb diff --git a/Documentation/admin-guide/hw-vuln/rsb.rst b/Documentation/admin-guide/hw-vuln/rsb.rst new file mode 100644 index 000000000000..21dbf9cf25f8 --- /dev/null +++ b/Documentation/admin-guide/hw-vuln/rsb.rst @@ -0,0 +1,268 @@ +.. SPDX-License-Identifier: GPL-2.0 + +======================= +RSB-related mitigations +======================= + +.. warning:: + Please keep this document up-to-date, otherwise you will be + volunteered to update it and convert it to a very long comment in + bugs.c! + +Since 2018 there have been many Spectre CVEs related to the Return Stack +Buffer (RSB) (sometimes referred to as the Return Address Stack (RAS) or +Return Address Predictor (RAP) on AMD). + +Information about these CVEs and how to mitigate them is scattered +amongst a myriad of microarchitecture-specific documents. + +This document attempts to consolidate all the relevant information in +once place and clarify the reasoning behind the current RSB-related +mitigations. It's meant to be as concise as possible, focused only on +the current kernel mitigations: what are the RSB-related attack vectors +and how are they currently being mitigated? + +It's *not* meant to describe how the RSB mechanism operates or how the +exploits work. More details about those can be found in the references +below. + +Rather, this is basically a glorified comment, but too long to actually +be one. So when the next CVE comes along, a kernel developer can +quickly refer to this as a refresher to see what we're actually doing +and why. + +At a high level, there are two classes of RSB attacks: RSB poisoning +(Intel and AMD) and RSB underflow (Intel only). They must each be +considered individually for each attack vector (and microarchitecture +where applicable). + +---- + +RSB poisoning (Intel and AMD) +============================= + +SpectreRSB +~~~~~~~~~~ + +RSB poisoning is a technique used by SpectreRSB [#spectre-rsb]_ where +an attacker poisons an RSB entry to cause a victim's return instruction +to speculate to an attacker-controlled address. This can happen when +there are unbalanced CALLs/RETs after a context switch or VMEXIT. + +* All attack vectors can potentially be mitigated by flushing out any + poisoned RSB entries using an RSB filling sequence + [#intel-rsb-filling]_ [#amd-rsb-filling]_ when transitioning between + untrusted and trusted domains. But this has a performance impact and + should be avoided whenever possible. + + .. DANGER:: + **FIXME**: Currently we're flushing 32 entries. However, some CPU + models have more than 32 entries. The loop count needs to be + increased for those. More detailed information is needed about RSB + sizes. + +* On context switch, the user->user mitigation requires ensuring the + RSB gets filled or cleared whenever IBPB gets written [#cond-ibpb]_ + during a context switch: + + * AMD: + On Zen 4+, IBPB (or SBPB [#amd-sbpb]_ if used) clears the RSB. + This is indicated by IBPB_RET in CPUID [#amd-ibpb-rsb]_. + + On Zen < 4, the RSB filling sequence [#amd-rsb-filling]_ must be + always be done in addition to IBPB [#amd-ibpb-no-rsb]_. This is + indicated by X86_BUG_IBPB_NO_RET. + + * Intel: + IBPB always clears the RSB: + + "Software that executed before the IBPB command cannot control + the predicted targets of indirect branches executed after the + command on the same logical processor. The term indirect branch + in this context includes near return instructions, so these + predicted targets may come from the RSB." [#intel-ibpb-rsb]_ + +* On context switch, user->kernel attacks are prevented by SMEP. User + space can only insert user space addresses into the RSB. Even + non-canonical addresses can't be inserted due to the page gap at the + end of the user canonical address space reserved by TASK_SIZE_MAX. + A SMEP #PF at instruction fetch prevents the kernel from speculatively + executing user space. + + * AMD: + "Finally, branches that are predicted as 'ret' instructions get + their predicted targets from the Return Address Predictor (RAP). + AMD recommends software use a RAP stuffing sequence (mitigation + V2-3 in [2]) and/or Supervisor Mode Execution Protection (SMEP) + to ensure that the addresses in the RAP are safe for + speculation. Collectively, we refer to these mitigations as "RAP + Protection"." [#amd-smep-rsb]_ + + * Intel: + "On processors with enhanced IBRS, an RSB overwrite sequence may + not suffice to prevent the predicted target of a near return + from using an RSB entry created in a less privileged predictor + mode. Software can prevent this by enabling SMEP (for + transitions from user mode to supervisor mode) and by having + IA32_SPEC_CTRL.IBRS set during VM exits." [#intel-smep-rsb]_ + +* On VMEXIT, guest->host attacks are mitigated by eIBRS (and PBRSB + mitigation if needed): + + * AMD: + "When Automatic IBRS is enabled, the internal return address + stack used for return address predictions is cleared on VMEXIT." + [#amd-eibrs-vmexit]_ + + * Intel: + "On processors with enhanced IBRS, an RSB overwrite sequence may + not suffice to prevent the predicted target of a near return + from using an RSB entry created in a less privileged predictor + mode. Software can prevent this by enabling SMEP (for + transitions from user mode to supervisor mode) and by having + IA32_SPEC_CTRL.IBRS set during VM exits. Processors with + enhanced IBRS still support the usage model where IBRS is set + only in the OS/VMM for OSes that enable SMEP. To do this, such + processors will ensure that guest behavior cannot control the + RSB after a VM exit once IBRS is set, even if IBRS was not set + at the time of the VM exit." [#intel-eibrs-vmexit]_ + + Note that some Intel CPUs are susceptible to Post-barrier Return + Stack Buffer Predictions (PBRSB) [#intel-pbrsb]_, where the last + CALL from the guest can be used to predict the first unbalanced RET. + In this case the PBRSB mitigation is needed in addition to eIBRS. + +AMD RETBleed / SRSO / Branch Type Confusion +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +On AMD, poisoned RSB entries can also be created by the AMD RETBleed +variant [#retbleed-paper]_ [#amd-btc]_ or by Speculative Return Stack +Overflow [#amd-srso]_ (Inception [#inception-paper]_). The kernel +protects itself by replacing every RET in the kernel with a branch to a +single safe RET. + +---- + +RSB underflow (Intel only) +========================== + +RSB Alternate (RSBA) ("Intel Retbleed") +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Some Intel Skylake-generation CPUs are susceptible to the Intel variant +of RETBleed [#retbleed-paper]_ (Return Stack Buffer Underflow +[#intel-rsbu]_). If a RET is executed when the RSB buffer is empty due +to mismatched CALLs/RETs or returning from a deep call stack, the branch +predictor can fall back to using the Branch Target Buffer (BTB). If a +user forces a BTB collision then the RET can speculatively branch to a +user-controlled address. + +* Note that RSB filling doesn't fully mitigate this issue. If there + are enough unbalanced RETs, the RSB may still underflow and fall back + to using a poisoned BTB entry. + +* On context switch, user->user underflow attacks are mitigated by the + conditional IBPB [#cond-ibpb]_ on context switch which effectively + clears the BTB: + + * "The indirect branch predictor barrier (IBPB) is an indirect branch + control mechanism that establishes a barrier, preventing software + that executed before the barrier from controlling the predicted + targets of indirect branches executed after the barrier on the same + logical processor." [#intel-ibpb-btb]_ + +* On context switch and VMEXIT, user->kernel and guest->host RSB + underflows are mitigated by IBRS or eIBRS: + + * "Enabling IBRS (including enhanced IBRS) will mitigate the "RSBU" + attack demonstrated by the researchers. As previously documented, + Intel recommends the use of enhanced IBRS, where supported. This + includes any processor that enumerates RRSBA but not RRSBA_DIS_S." + [#intel-rsbu]_ + + However, note that eIBRS and IBRS do not mitigate intra-mode attacks. + Like RRSBA below, this is mitigated by clearing the BHB on kernel + entry. + + As an alternative to classic IBRS, call depth tracking (combined with + retpolines) can be used to track kernel returns and fill the RSB when + it gets close to being empty. + +Restricted RSB Alternate (RRSBA) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Some newer Intel CPUs have Restricted RSB Alternate (RRSBA) behavior, +which, similar to RSBA described above, also falls back to using the BTB +on RSB underflow. The only difference is that the predicted targets are +restricted to the current domain when eIBRS is enabled: + +* "Restricted RSB Alternate (RRSBA) behavior allows alternate branch + predictors to be used by near RET instructions when the RSB is + empty. When eIBRS is enabled, the predicted targets of these + alternate predictors are restricted to those belonging to the + indirect branch predictor entries of the current prediction domain. + [#intel-eibrs-rrsba]_ + +When a CPU with RRSBA is vulnerable to Branch History Injection +[#bhi-paper]_ [#intel-bhi]_, an RSB underflow could be used for an +intra-mode BTI attack. This is mitigated by clearing the BHB on +kernel entry. + +However if the kernel uses retpolines instead of eIBRS, it needs to +disable RRSBA: + +* "Where software is using retpoline as a mitigation for BHI or + intra-mode BTI, and the processor both enumerates RRSBA and + enumerates RRSBA_DIS controls, it should disable this behavior." + [#intel-retpoline-rrsba]_ + +---- + +References +========== + +.. [#spectre-rsb] `Spectre Returns! Speculation Attacks using the Return Stack Buffer <https://arxiv.org/pdf/1807.07940.pdf>`_ + +.. [#intel-rsb-filling] "Empty RSB Mitigation on Skylake-generation" in `Retpoline: A Branch Target Injection Mitigation <https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/retpoline-branch-target-injection-mitigation.html#inpage-nav-5-1>`_ + +.. [#amd-rsb-filling] "Mitigation V2-3" in `Software Techniques for Managing Speculation <https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/software-techniques-for-managing-speculation.pdf>`_ + +.. [#cond-ibpb] Whether IBPB is written depends on whether the prev and/or next task is protected from Spectre attacks. It typically requires opting in per task or system-wide. For more details see the documentation for the ``spectre_v2_user`` cmdline option in Documentation/admin-guide/kernel-parameters.txt. + +.. [#amd-sbpb] IBPB without flushing of branch type predictions. Only exists for AMD. + +.. [#amd-ibpb-rsb] "Function 8000_0008h -- Processor Capacity Parameters and Extended Feature Identification" in `AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions <https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24594.pdf>`_. SBPB behaves the same way according to `this email <https://lore.kernel.org/5175b163a3736ca5fd01cedf406735636c99a>`_. + +.. [#amd-ibpb-no-rsb] `Spectre Attacks: Exploiting Speculative Execution <https://comsec.ethz.ch/wp-content/files/ibpb_sp25.pdf>`_ + +.. [#intel-ibpb-rsb] "Introduction" in `Post-barrier Return Stack Buffer Predictions / CVE-2022-26373 / INTEL-SA-00706 <https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/post-barrier-return-stack-buffer-predictions.html>`_ + +.. [#amd-smep-rsb] "Existing Mitigations" in `Technical Guidance for Mitigating Branch Type Confusion <https://www.amd.com/content/dam/amd/en/documents/resources/technical-guidance-for-mitigating-branch-type-confusion.pdf>`_ + +.. [#intel-smep-rsb] "Enhanced IBRS" in `Indirect Branch Restricted Speculation <https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/indirect-branch-restricted-speculation.html>`_ + +.. [#amd-eibrs-vmexit] "Extended Feature Enable Register (EFER)" in `AMD64 Architecture Programmer's Manual Volume 2: System Programming <https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf>`_ + +.. [#intel-eibrs-vmexit] "Enhanced IBRS" in `Indirect Branch Restricted Speculation <https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/indirect-branch-restricted-speculation.html>`_ + +.. [#intel-pbrsb] `Post-barrier Return Stack Buffer Predictions / CVE-2022-26373 / INTEL-SA-00706 <https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/post-barrier-return-stack-buffer-predictions.html>`_ + +.. [#retbleed-paper] `RETBleed: Arbitrary Speculative Code Execution with Return Instruction <https://comsec.ethz.ch/wp-content/files/retbleed_sec22.pdf>`_ + +.. [#amd-btc] `Technical Guidance for Mitigating Branch Type Confusion <https://www.amd.com/content/dam/amd/en/documents/resources/technical-guidance-for-mitigating-branch-type-confusion.pdf>`_ + +.. [#amd-srso] `Technical Update Regarding Speculative Return Stack Overflow <https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf>`_ + +.. [#inception-paper] `Inception: Exposing New Attack Surfaces with Training in Transient Execution <https://comsec.ethz.ch/wp-content/files/inception_sec23.pdf>`_ + +.. [#intel-rsbu] `Return Stack Buffer Underflow / Return Stack Buffer Underflow / CVE-2022-29901, CVE-2022-28693 / INTEL-SA-00702 <https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/return-stack-buffer-underflow.html>`_ + +.. [#intel-ibpb-btb] `Indirect Branch Predictor Barrier' <https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/indirect-branch-predictor-barrier.html>`_ + +.. [#intel-eibrs-rrsba] "Guidance for RSBU" in `Return Stack Buffer Underflow / Return Stack Buffer Underflow / CVE-2022-29901, CVE-2022-28693 / INTEL-SA-00702 <https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/return-stack-buffer-underflow.html>`_ + +.. [#bhi-paper] `Branch History Injection: On the Effectiveness of Hardware Mitigations Against Cross-Privilege Spectre-v2 Attacks <http://download.vusec.net/papers/bhi-spectre-bhb_sec22.pdf>`_ + +.. [#intel-bhi] `Branch History Injection and Intra-mode Branch Target Injection / CVE-2022-0001, CVE-2022-0002 / INTEL-SA-00598 <https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html>`_ + +.. [#intel-retpoline-rrsba] "Retpoline" in `Branch History Injection and Intra-mode Branch Target Injection / CVE-2022-0001, CVE-2022-0002 / INTEL-SA-00598 <https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html>`_ diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 3435a062a208..d9fd26b95b34 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1407,18 +1407,15 @@ earlyprintk=serial[,0x...[,baudrate]] earlyprintk=ttySn[,baudrate] earlyprintk=dbgp[debugController#] + earlyprintk=mmio32,membase[,{nocfg|baudrate}] earlyprintk=pciserial[,force],bus:device.function[,{nocfg|baudrate}] earlyprintk=xdbc[xhciController#] earlyprintk=bios - earlyprintk=mmio,membase[,{nocfg|baudrate}] earlyprintk is useful when the kernel crashes before the normal console is initialized. It is not enabled by default because it has some cosmetic problems. - Only 32-bit memory addresses are supported for "mmio" - and "pciserial" devices. - Use "nocfg" to skip UART configuration, assume BIOS/firmware has configured UART correctly. @@ -1866,7 +1863,7 @@ hpet_mmap= [X86, HPET_MMAP] Allow userspace to mmap HPET registers. Default set by CONFIG_HPET_MMAP_DEFAULT. - hugepages= [HW] Number of HugeTLB pages to allocate at boot. + hugepages= [HW,EARLY] Number of HugeTLB pages to allocate at boot. If this follows hugepagesz (below), it specifies the number of pages of hugepagesz to be allocated. If this is the first HugeTLB parameter on the command @@ -1878,15 +1875,24 @@ <node>:<integer>[,<node>:<integer>] hugepagesz= - [HW] The size of the HugeTLB pages. This is used in - conjunction with hugepages (above) to allocate huge - pages of a specific size at boot. The pair - hugepagesz=X hugepages=Y can be specified once for - each supported huge page size. Huge page sizes are - architecture dependent. See also + [HW,EARLY] The size of the HugeTLB pages. This is + used in conjunction with hugepages (above) to + allocate huge pages of a specific size at boot. The + pair hugepagesz=X hugepages=Y can be specified once + for each supported huge page size. Huge page sizes + are architecture dependent. See also Documentation/admin-guide/mm/hugetlbpage.rst. Format: size[KMG] + hugepage_alloc_threads= + [HW] The number of threads that should be used to + allocate hugepages during boot. This option can be + used to improve system bootup time when allocating + a large amount of huge pages. + The default value is 25% of the available hardware threads. + + Note that this parameter only applies to non-gigantic huge pages. + hugetlb_cma= [HW,CMA,EARLY] The size of a CMA area used for allocation of gigantic hugepages. Or using node format, the size of a CMA area per node can be specified. @@ -1897,6 +1903,13 @@ hugepages using the CMA allocator. If enabled, the boot-time allocation of gigantic hugepages is skipped. + hugetlb_cma_only= + [HW,CMA,EARLY] When allocating new HugeTLB pages, only + try to allocate from the CMA areas. + + This option does nothing if hugetlb_cma= is not also + specified. + hugetlb_free_vmemmap= [KNL] Requires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP enabled. @@ -1938,6 +1951,12 @@ which allow the hypervisor to 'idle' the guest on lock contention. + hw_protection= [HW] + Format: reboot | shutdown + + Hardware protection action taken on critical events like + overtemperature or imminent voltage loss. + i2c_bus= [HW] Override the default board specific I2C bus speed or register an additional I2C bus that is not registered from board initialization code. @@ -4243,10 +4262,10 @@ nosmp [SMP,EARLY] Tells an SMP kernel to act as a UP kernel, and disable the IO APIC. legacy for "maxcpus=0". - nosmt [KNL,MIPS,PPC,S390,EARLY] Disable symmetric multithreading (SMT). + nosmt [KNL,MIPS,PPC,EARLY] Disable symmetric multithreading (SMT). Equivalent to smt=1. - [KNL,X86,PPC] Disable symmetric multithreading (SMT). + [KNL,X86,PPC,S390] Disable symmetric multithreading (SMT). nosmt=force: Force disable SMT, cannot be undone via the sysfs control file. @@ -7266,6 +7285,8 @@ This is just one of many ways that can clear memory. Make sure your system keeps the content of memory across reboots before relying on this option. + NB: Both the mapped address and size must be page aligned for the architecture. + See also Documentation/trace/debugging.rst @@ -7511,6 +7532,22 @@ Note that genuine overcurrent events won't be reported either. + unaligned_scalar_speed= + [RISCV] + Format: {slow | fast | unsupported} + Allow skipping scalar unaligned access speed tests. This + is useful for testing alternative code paths and to skip + the tests in environments where they run too slowly. All + CPUs must have the same scalar unaligned access speed. + + unaligned_vector_speed= + [RISCV] + Format: {slow | fast | unsupported} + Allow skipping vector unaligned access speed tests. This + is useful for testing alternative code paths and to skip + the tests in environments where they run too slowly. All + CPUs must have the same vector unaligned access speed. + unknown_nmi_panic [X86] Cause panic on unknown NMI. diff --git a/Documentation/admin-guide/mm/cma_debugfs.rst b/Documentation/admin-guide/mm/cma_debugfs.rst index 7367e6294ef6..4120e9cb0cd5 100644 --- a/Documentation/admin-guide/mm/cma_debugfs.rst +++ b/Documentation/admin-guide/mm/cma_debugfs.rst @@ -12,10 +12,16 @@ its CMA name like below: The structure of the files created under that directory is as follows: - - [RO] base_pfn: The base PFN (Page Frame Number) of the zone. + - [RO] base_pfn: The base PFN (Page Frame Number) of the CMA area. + This is the same as ranges/0/base_pfn. - [RO] count: Amount of memory in the CMA area. - [RO] order_per_bit: Order of pages represented by one bit. - - [RO] bitmap: The bitmap of page states in the zone. + - [RO] bitmap: The bitmap of allocated pages in the area. + This is the same as ranges/0/base_pfn. + - [RO] ranges/N/base_pfn: The base PFN of contiguous range N + in the CMA area. + - [RO] ranges/N/bitmap: The bit map of allocated pages in + range N in the CMA area. - [WO] alloc: Allocate N pages from that CMA area. For example:: echo 5 > <debugfs>/cma/<cma_name>/alloc diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst index 47a44bd348ab..ced2013db3df 100644 --- a/Documentation/admin-guide/mm/damon/usage.rst +++ b/Documentation/admin-guide/mm/damon/usage.rst @@ -64,6 +64,7 @@ comma (","). │ │ │ │ :ref:`0 <sysfs_context>`/avail_operations,operations │ │ │ │ │ :ref:`monitoring_attrs <sysfs_monitoring_attrs>`/ │ │ │ │ │ │ intervals/sample_us,aggr_us,update_us + │ │ │ │ │ │ │ intervals_goal/access_bp,aggrs,min_sample_us,max_sample_us │ │ │ │ │ │ nr_regions/min,max │ │ │ │ │ :ref:`targets <sysfs_targets>`/nr_targets │ │ │ │ │ │ :ref:`0 <sysfs_target>`/pid_target @@ -82,8 +83,8 @@ comma (","). │ │ │ │ │ │ │ │ :ref:`goals <sysfs_schemes_quota_goals>`/nr_goals │ │ │ │ │ │ │ │ │ 0/target_metric,target_value,current_value │ │ │ │ │ │ │ :ref:`watermarks <sysfs_watermarks>`/metric,interval_us,high,mid,low - │ │ │ │ │ │ │ :ref:`filters <sysfs_filters>`/nr_filters - │ │ │ │ │ │ │ │ 0/type,matching,allow,memcg_path,addr_start,addr_end,target_idx + │ │ │ │ │ │ │ :ref:`{core_,ops_,}filters <sysfs_filters>`/nr_filters + │ │ │ │ │ │ │ │ 0/type,matching,allow,memcg_path,addr_start,addr_end,target_idx,min,max │ │ │ │ │ │ │ :ref:`stats <sysfs_schemes_stats>`/nr_tried,sz_tried,nr_applied,sz_applied,sz_ops_filter_passed,qt_exceeds │ │ │ │ │ │ │ :ref:`tried_regions <sysfs_schemes_tried_regions>`/total_bytes │ │ │ │ │ │ │ │ 0/start,end,nr_accesses,age,sz_filter_passed @@ -132,6 +133,11 @@ Users can write below commands for the kdamond to the ``state`` file. - ``off``: Stop running. - ``commit``: Read the user inputs in the sysfs files except ``state`` file again. +- ``update_tuned_intervals``: Update the contents of ``sample_us`` and + ``aggr_us`` files of the kdamond with the auto-tuning applied ``sampling + interval`` and ``aggregation interval`` for the files. Please refer to + :ref:`intervals_goal section <damon_usage_sysfs_monitoring_intervals_goal>` + for more details. - ``commit_schemes_quota_goals``: Read the DAMON-based operation schemes' :ref:`quota goals <sysfs_schemes_quota_goals>`. - ``update_schemes_stats``: Update the contents of stats files for each @@ -213,6 +219,25 @@ writing to and rading from the files. For more details about the intervals and monitoring regions range, please refer to the Design document (:doc:`/mm/damon/design`). +.. _damon_usage_sysfs_monitoring_intervals_goal: + +contexts/<N>/monitoring_attrs/intervals/intervals_goal/ +------------------------------------------------------- + +Under the ``intervals`` directory, one directory for automated tuning of +``sample_us`` and ``aggr_us``, namely ``intervals_goal`` directory also exists. +Under the directory, four files for the auto-tuning control, namely +``access_bp``, ``aggrs``, ``min_sample_us`` and ``max_sample_us`` exist. +Please refer to the :ref:`design document of the feature +<damon_design_monitoring_intervals_autotuning>` for the internal of the tuning +mechanism. Reading and writing the four files under ``intervals_goal`` +directory shows and updates the tuning parameters that described in the +:ref:design doc <damon_design_monitoring_intervals_autotuning>` with the same +names. The tuning starts with the user-set ``sample_us`` and ``aggr_us``. The +tuning-applied current values of the two intervals can be read from the +``sample_us`` and ``aggr_us`` files after writing ``update_tuned_intervals`` to +the ``state`` file. + .. _sysfs_targets: contexts/<N>/targets/ @@ -282,9 +307,10 @@ to ``N-1``. Each directory represents each DAMON-based operation scheme. schemes/<N>/ ------------ -In each scheme directory, five directories (``access_pattern``, ``quotas``, -``watermarks``, ``filters``, ``stats``, and ``tried_regions``) and three files -(``action``, ``target_nid`` and ``apply_interval``) exist. +In each scheme directory, seven directories (``access_pattern``, ``quotas``, +``watermarks``, ``core_filters``, ``ops_filters``, ``filters``, ``stats``, and +``tried_regions``) and three files (``action``, ``target_nid`` and +``apply_interval``) exist. The ``action`` file is for setting and getting the scheme's :ref:`action <damon_design_damos_action>`. The keywords that can be written to and read @@ -395,33 +421,43 @@ The ``interval`` should written in microseconds unit. .. _sysfs_filters: -schemes/<N>/filters/ --------------------- +schemes/<N>/{core\_,ops\_,}filters/ +----------------------------------- -The directory for the :ref:`filters <damon_design_damos_filters>` of the given +Directories for :ref:`filters <damon_design_damos_filters>` of the given DAMON-based operation scheme. -In the beginning, this directory has only one file, ``nr_filters``. Writing a +``core_filters`` and ``ops_filters`` directories are for the filters handled by +the DAMON core layer and operations set layer, respectively. ``filters`` +directory can be used for installing filters regardless of their handled +layers. Filters that requested by ``core_filters`` and ``ops_filters`` will be +installed before those of ``filters``. All three directories have same files. + +Use of ``filters`` directory can make expecting evaluation orders of given +filters with the files under directory bit confusing. Users are hence +recommended to use ``core_filters`` and ``ops_filters`` directories. The +``filters`` directory could be deprecated in future. + +In the beginning, the directory has only one file, ``nr_filters``. Writing a number (``N``) to the file creates the number of child directories named ``0`` to ``N-1``. Each directory represents each filter. The filters are evaluated in the numeric order. -Each filter directory contains seven files, namely ``type``, ``matching``, -``allow``, ``memcg_path``, ``addr_start``, ``addr_end``, and ``target_idx``. -To ``type`` file, you can write one of five special keywords: ``anon`` for -anonymous pages, ``memcg`` for specific memory cgroup, ``young`` for young -pages, ``addr`` for specific address range (an open-ended interval), or -``target`` for specific DAMON monitoring target filtering. Meaning of the -types are same to the description on the :ref:`design doc -<damon_design_damos_filters>`. - -In case of the memory cgroup filtering, you can specify the memory cgroup of -the interest by writing the path of the memory cgroup from the cgroups mount -point to ``memcg_path`` file. In case of the address range filtering, you can -specify the start and end address of the range to ``addr_start`` and -``addr_end`` files, respectively. For the DAMON monitoring target filtering, -you can specify the index of the target between the list of the DAMON context's -monitoring targets list to ``target_idx`` file. +Each filter directory contains nine files, namely ``type``, ``matching``, +``allow``, ``memcg_path``, ``addr_start``, ``addr_end``, ``min``, ``max`` +and ``target_idx``. To ``type`` file, you can write the type of the filter. +Refer to :ref:`the design doc <damon_design_damos_filters>` for available type +names, their meaning and on what layer those are handled. + +For ``memcg`` type, you can specify the memory cgroup of the interest by +writing the path of the memory cgroup from the cgroups mount point to +``memcg_path`` file. For ``addr`` type, you can specify the start and end +address of the range (open-ended interval) to ``addr_start`` and ``addr_end`` +files, respectively. For ``hugepage_size`` type, you can specify the minimum +and maximum size of the range (closed interval) to ``min`` and ``max`` files, +respectively. For ``target`` type, you can specify the index of the target +between the list of the DAMON context's monitoring targets list to +``target_idx`` file. You can write ``Y`` or ``N`` to ``matching`` file to specify whether the filter is for memory that matches the ``type``. You can write ``Y`` or ``N`` to @@ -431,6 +467,7 @@ the ``type`` and ``matching`` should be allowed or not. For example, below restricts a DAMOS action to be applied to only non-anonymous pages of all memory cgroups except ``/having_care_already``.:: + # cd ops_filters/0/ # echo 2 > nr_filters # # disallow anonymous pages echo anon > 0/type diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst index f34a0d798d5b..67a941903fd2 100644 --- a/Documentation/admin-guide/mm/hugetlbpage.rst +++ b/Documentation/admin-guide/mm/hugetlbpage.rst @@ -145,7 +145,17 @@ hugepages It will allocate 1 2M hugepage on node0 and 2 2M hugepages on node1. If the node number is invalid, the parameter will be ignored. +hugepage_alloc_threads + Specify the number of threads that should be used to allocate hugepages + during boot. This parameter can be used to improve system bootup time + when allocating a large amount of huge pages. + The default value is 25% of the available hardware threads. + Example to use 8 allocation threads:: + + hugepage_alloc_threads=8 + + Note that this parameter only applies to non-gigantic huge pages. default_hugepagesz Specify the default huge page size. This parameter can only be specified once on the command line. default_hugepagesz can diff --git a/Documentation/admin-guide/mm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst index caba0f52dd36..afce291649dd 100644 --- a/Documentation/admin-guide/mm/pagemap.rst +++ b/Documentation/admin-guide/mm/pagemap.rst @@ -21,7 +21,8 @@ There are four components to pagemap: * Bit 56 page exclusively mapped (since 4.2) * Bit 57 pte is uffd-wp write-protected (since 5.13) (see Documentation/admin-guide/mm/userfaultfd.rst) - * Bits 58-60 zero + * Bit 58 pte is a guard region (since 6.15) (see madvise (2) man page) + * Bits 59-60 zero * Bit 61 page is file-page or shared-anon (since 3.5) * Bit 62 page swapped * Bit 63 page present @@ -37,12 +38,28 @@ There are four components to pagemap: precisely which pages are mapped (or in swap) and comparing mapped pages between processes. + Traditionally, bit 56 indicates that a page is mapped exactly once and bit + 56 is clear when a page is mapped multiple times, even when mapped in the + same process multiple times. In some kernel configurations, the semantics + for pages part of a larger allocation (e.g., THP) can differ: bit 56 is set + if all pages part of the corresponding large allocation are *certainly* + mapped in the same process, even if the page is mapped multiple times in that + process. Bit 56 is clear when any page page of the larger allocation + is *maybe* mapped in a different process. In some cases, a large allocation + might be treated as "maybe mapped by multiple processes" even though this + is no longer the case. + Efficient users of this interface will use ``/proc/pid/maps`` to determine which areas of memory are actually mapped and llseek to skip over unmapped regions. * ``/proc/kpagecount``. This file contains a 64-bit count of the number of - times each page is mapped, indexed by PFN. + times each page is mapped, indexed by PFN. Some kernel configurations do + not track the precise number of times a page part of a larger allocation + (e.g., THP) is mapped. In these configurations, the average number of + mappings per page in this larger allocation is returned instead. However, + if any page of the large allocation is mapped, the returned value will + be at least 1. The page-types tool in the tools/mm directory can be used to query the number of times a page is mapped. diff --git a/Documentation/admin-guide/mm/zswap.rst b/Documentation/admin-guide/mm/zswap.rst index 3598dcd7dbe7..fd3370aa43fe 100644 --- a/Documentation/admin-guide/mm/zswap.rst +++ b/Documentation/admin-guide/mm/zswap.rst @@ -60,15 +60,13 @@ accessed. The compressed memory pool grows on demand and shrinks as compressed pages are freed. The pool is not preallocated. By default, a zpool of type selected in ``CONFIG_ZSWAP_ZPOOL_DEFAULT`` Kconfig option is created, but it can be overridden at boot time by setting the ``zpool`` attribute, -e.g. ``zswap.zpool=zbud``. It can also be changed at runtime using the sysfs +e.g. ``zswap.zpool=zsmalloc``. It can also be changed at runtime using the sysfs ``zpool`` attribute, e.g.:: - echo zbud > /sys/module/zswap/parameters/zpool + echo zsmalloc > /sys/module/zswap/parameters/zpool -The zbud type zpool allocates exactly 1 page to store 2 compressed pages, which -means the compression ratio will always be 2:1 or worse (because of half-full -zbud pages). The zsmalloc type zpool has a more complex compressed page -storage method, and it can achieve greater storage densities. +The zsmalloc type zpool has a complex compressed page storage method, and it +can achieve great storage densities. When a swap page is passed from swapout to zswap, zswap maintains a mapping of the swap entry, a combination of the swap type and swap offset, to the zpool diff --git a/Documentation/admin-guide/sysctl/fs.rst b/Documentation/admin-guide/sysctl/fs.rst index 08e89e031714..6c54718c9d04 100644 --- a/Documentation/admin-guide/sysctl/fs.rst +++ b/Documentation/admin-guide/sysctl/fs.rst @@ -347,3 +347,28 @@ filesystems: ``/proc/sys/fs/fuse/max_pages_limit`` is a read/write file for setting/getting the maximum number of pages that can be used for servicing requests in FUSE. + +``/proc/sys/fs/fuse/default_request_timeout`` is a read/write file for +setting/getting the default timeout (in seconds) for a fuse server to +reply to a kernel-issued request in the event where the server did not +specify a timeout at mount. If the server set a timeout, +then default_request_timeout will be ignored. The default +"default_request_timeout" is set to 0. 0 indicates no default timeout. +The maximum value that can be set is 65535. + +``/proc/sys/fs/fuse/max_request_timeout`` is a read/write file for +setting/getting the maximum timeout (in seconds) for a fuse server to +reply to a kernel-issued request. A value greater than 0 automatically opts +the server into a timeout that will be set to at most "max_request_timeout", +even if the server did not specify a timeout and default_request_timeout is +set to 0. If max_request_timeout is greater than 0 and the server set a timeout +greater than max_request_timeout or default_request_timeout is set to a value +greater than max_request_timeout, the system will use max_request_timeout as the +timeout. 0 indicates no max request timeout. The maximum value that can be set +is 65535. + +For timeouts, if the server does not respond to the request by the time +the set timeout elapses, then the connection to the fuse server will be aborted. +Please note that the timeouts are not 100% precise (eg you may set 60 seconds but +the timeout may kick in after 70 seconds). The upper margin of error for the +timeout is roughly FUSE_TIMEOUT_TIMER_FREQ seconds. diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst index f48eaa98d22d..8290177b4f75 100644 --- a/Documentation/admin-guide/sysctl/vm.rst +++ b/Documentation/admin-guide/sysctl/vm.rst @@ -28,6 +28,7 @@ Currently, these files are in /proc/sys/vm: - compact_memory - compaction_proactiveness - compact_unevictable_allowed +- defrag_mode - dirty_background_bytes - dirty_background_ratio - dirty_bytes @@ -145,6 +146,14 @@ On CONFIG_PREEMPT_RT the default value is 0 in order to avoid a page fault, due to compaction, which would block the task from becoming active until the fault is resolved. +defrag_mode +=========== + +When set to 1, the page allocator tries harder to avoid fragmentation +and maintain the ability to produce huge pages / higher-order pages. + +It is recommended to enable this right after boot, as fragmentation, +once it occurred, can be long-lasting or even permanent. dirty_background_bytes ====================== diff --git a/Documentation/arch/arm64/ptdump.rst b/Documentation/arch/arm64/ptdump.rst index 5dcfc5d7cddf..51eb902ba41a 100644 --- a/Documentation/arch/arm64/ptdump.rst +++ b/Documentation/arch/arm64/ptdump.rst @@ -22,8 +22,6 @@ offlining of memory being accessed by the ptdump code. In order to dump the kernel page tables, enable the following configurations and mount debugfs:: - CONFIG_GENERIC_PTDUMP=y - CONFIG_PTDUMP_CORE=y CONFIG_PTDUMP_DEBUGFS=y mount -t debugfs nodev /sys/kernel/debug diff --git a/Documentation/arch/powerpc/cxl.rst b/Documentation/arch/powerpc/cxl.rst deleted file mode 100644 index 778adda740d2..000000000000 --- a/Documentation/arch/powerpc/cxl.rst +++ /dev/null @@ -1,470 +0,0 @@ -==================================== -Coherent Accelerator Interface (CXL) -==================================== - -Introduction -============ - - The coherent accelerator interface is designed to allow the - coherent connection of accelerators (FPGAs and other devices) to a - POWER system. These devices need to adhere to the Coherent - Accelerator Interface Architecture (CAIA). - - IBM refers to this as the Coherent Accelerator Processor Interface - or CAPI. In the kernel it's referred to by the name CXL to avoid - confusion with the ISDN CAPI subsystem. - - Coherent in this context means that the accelerator and CPUs can - both access system memory directly and with the same effective - addresses. - - **This driver is deprecated and will be removed in a future release.** - -Hardware overview -================= - - :: - - POWER8/9 FPGA - +----------+ +---------+ - | | | | - | CPU | | AFU | - | | | | - | | | | - | | | | - +----------+ +---------+ - | PHB | | | - | +------+ | PSL | - | | CAPP |<------>| | - +---+------+ PCIE +---------+ - - The POWER8/9 chip has a Coherently Attached Processor Proxy (CAPP) - unit which is part of the PCIe Host Bridge (PHB). This is managed - by Linux by calls into OPAL. Linux doesn't directly program the - CAPP. - - The FPGA (or coherently attached device) consists of two parts. - The POWER Service Layer (PSL) and the Accelerator Function Unit - (AFU). The AFU is used to implement specific functionality behind - the PSL. The PSL, among other things, provides memory address - translation services to allow each AFU direct access to userspace - memory. - - The AFU is the core part of the accelerator (eg. the compression, - crypto etc function). The kernel has no knowledge of the function - of the AFU. Only userspace interacts directly with the AFU. - - The PSL provides the translation and interrupt services that the - AFU needs. This is what the kernel interacts with. For example, if - the AFU needs to read a particular effective address, it sends - that address to the PSL, the PSL then translates it, fetches the - data from memory and returns it to the AFU. If the PSL has a - translation miss, it interrupts the kernel and the kernel services - the fault. The context to which this fault is serviced is based on - who owns that acceleration function. - - - POWER8 and PSL Version 8 are compliant to the CAIA Version 1.0. - - POWER9 and PSL Version 9 are compliant to the CAIA Version 2.0. - - This PSL Version 9 provides new features such as: - - * Interaction with the nest MMU on the P9 chip. - * Native DMA support. - * Supports sending ASB_Notify messages for host thread wakeup. - * Supports Atomic operations. - * etc. - - Cards with a PSL9 won't work on a POWER8 system and cards with a - PSL8 won't work on a POWER9 system. - -AFU Modes -========= - - There are two programming modes supported by the AFU. Dedicated - and AFU directed. AFU may support one or both modes. - - When using dedicated mode only one MMU context is supported. In - this mode, only one userspace process can use the accelerator at - time. - - When using AFU directed mode, up to 16K simultaneous contexts can - be supported. This means up to 16K simultaneous userspace - applications may use the accelerator (although specific AFUs may - support fewer). In this mode, the AFU sends a 16 bit context ID - with each of its requests. This tells the PSL which context is - associated with each operation. If the PSL can't translate an - operation, the ID can also be accessed by the kernel so it can - determine the userspace context associated with an operation. - - -MMIO space -========== - - A portion of the accelerator MMIO space can be directly mapped - from the AFU to userspace. Either the whole space can be mapped or - just a per context portion. The hardware is self describing, hence - the kernel can determine the offset and size of the per context - portion. - - -Interrupts -========== - - AFUs may generate interrupts that are destined for userspace. These - are received by the kernel as hardware interrupts and passed onto - userspace by a read syscall documented below. - - Data storage faults and error interrupts are handled by the kernel - driver. - - -Work Element Descriptor (WED) -============================= - - The WED is a 64-bit parameter passed to the AFU when a context is - started. Its format is up to the AFU hence the kernel has no - knowledge of what it represents. Typically it will be the - effective address of a work queue or status block where the AFU - and userspace can share control and status information. - - - - -User API -======== - -1. AFU character devices -^^^^^^^^^^^^^^^^^^^^^^^^ - - For AFUs operating in AFU directed mode, two character device - files will be created. /dev/cxl/afu0.0m will correspond to a - master context and /dev/cxl/afu0.0s will correspond to a slave - context. Master contexts have access to the full MMIO space an - AFU provides. Slave contexts have access to only the per process - MMIO space an AFU provides. - - For AFUs operating in dedicated process mode, the driver will - only create a single character device per AFU called - /dev/cxl/afu0.0d. This will have access to the entire MMIO space - that the AFU provides (like master contexts in AFU directed). - - The types described below are defined in include/uapi/misc/cxl.h - - The following file operations are supported on both slave and - master devices. - - A userspace library libcxl is available here: - - https://github.com/ibm-capi/libcxl - - This provides a C interface to this kernel API. - -open ----- - - Opens the device and allocates a file descriptor to be used with - the rest of the API. - - A dedicated mode AFU only has one context and only allows the - device to be opened once. - - An AFU directed mode AFU can have many contexts, the device can be - opened once for each context that is available. - - When all available contexts are allocated the open call will fail - and return -ENOSPC. - - Note: - IRQs need to be allocated for each context, which may limit - the number of contexts that can be created, and therefore - how many times the device can be opened. The POWER8 CAPP - supports 2040 IRQs and 3 are used by the kernel, so 2037 are - left. If 1 IRQ is needed per context, then only 2037 - contexts can be allocated. If 4 IRQs are needed per context, - then only 2037/4 = 509 contexts can be allocated. - - -ioctl ------ - - CXL_IOCTL_START_WORK: - Starts the AFU context and associates it with the current - process. Once this ioctl is successfully executed, all memory - mapped into this process is accessible to this AFU context - using the same effective addresses. No additional calls are - required to map/unmap memory. The AFU memory context will be - updated as userspace allocates and frees memory. This ioctl - returns once the AFU context is started. - - Takes a pointer to a struct cxl_ioctl_start_work - - :: - - struct cxl_ioctl_start_work { - __u64 flags; - __u64 work_element_descriptor; - __u64 amr; - __s16 num_interrupts; - __s16 reserved1; - __s32 reserved2; - __u64 reserved3; - __u64 reserved4; - __u64 reserved5; - __u64 reserved6; - }; - - flags: - Indicates which optional fields in the structure are - valid. - - work_element_descriptor: - The Work Element Descriptor (WED) is a 64-bit argument - defined by the AFU. Typically this is an effective - address pointing to an AFU specific structure - describing what work to perform. - - amr: - Authority Mask Register (AMR), same as the powerpc - AMR. This field is only used by the kernel when the - corresponding CXL_START_WORK_AMR value is specified in - flags. If not specified the kernel will use a default - value of 0. - - num_interrupts: - Number of userspace interrupts to request. This field - is only used by the kernel when the corresponding - CXL_START_WORK_NUM_IRQS value is specified in flags. - If not specified the minimum number required by the - AFU will be allocated. The min and max number can be - obtained from sysfs. - - reserved fields: - For ABI padding and future extensions - - CXL_IOCTL_GET_PROCESS_ELEMENT: - Get the current context id, also known as the process element. - The value is returned from the kernel as a __u32. - - -mmap ----- - - An AFU may have an MMIO space to facilitate communication with the - AFU. If it does, the MMIO space can be accessed via mmap. The size - and contents of this area are specific to the particular AFU. The - size can be discovered via sysfs. - - In AFU directed mode, master contexts are allowed to map all of - the MMIO space and slave contexts are allowed to only map the per - process MMIO space associated with the context. In dedicated - process mode the entire MMIO space can always be mapped. - - This mmap call must be done after the START_WORK ioctl. - - Care should be taken when accessing MMIO space. Only 32 and 64-bit - accesses are supported by POWER8. Also, the AFU will be designed - with a specific endianness, so all MMIO accesses should consider - endianness (recommend endian(3) variants like: le64toh(), - be64toh() etc). These endian issues equally apply to shared memory - queues the WED may describe. - - -read ----- - - Reads events from the AFU. Blocks if no events are pending - (unless O_NONBLOCK is supplied). Returns -EIO in the case of an - unrecoverable error or if the card is removed. - - read() will always return an integral number of events. - - The buffer passed to read() must be at least 4K bytes. - - The result of the read will be a buffer of one or more events, - each event is of type struct cxl_event, of varying size:: - - struct cxl_event { - struct cxl_event_header header; - union { - struct cxl_event_afu_interrupt irq; - struct cxl_event_data_storage fault; - struct cxl_event_afu_error afu_error; - }; - }; - - The struct cxl_event_header is defined as - - :: - - struct cxl_event_header { - __u16 type; - __u16 size; - __u16 process_element; - __u16 reserved1; - }; - - type: - This defines the type of event. The type determines how - the rest of the event is structured. These types are - described below and defined by enum cxl_event_type. - - size: - This is the size of the event in bytes including the - struct cxl_event_header. The start of the next event can - be found at this offset from the start of the current - event. - - process_element: - Context ID of the event. - - reserved field: - For future extensions and padding. - - If the event type is CXL_EVENT_AFU_INTERRUPT then the event - structure is defined as - - :: - - struct cxl_event_afu_interrupt { - __u16 flags; - __u16 irq; /* Raised AFU interrupt number */ - __u32 reserved1; - }; - - flags: - These flags indicate which optional fields are present - in this struct. Currently all fields are mandatory. - - irq: - The IRQ number sent by the AFU. - - reserved field: - For future extensions and padding. - - If the event type is CXL_EVENT_DATA_STORAGE then the event - structure is defined as - - :: - - struct cxl_event_data_storage { - __u16 flags; - __u16 reserved1; - __u32 reserved2; - __u64 addr; - __u64 dsisr; - __u64 reserved3; - }; - - flags: - These flags indicate which optional fields are present in - this struct. Currently all fields are mandatory. - - address: - The address that the AFU unsuccessfully attempted to - access. Valid accesses will be handled transparently by the - kernel but invalid accesses will generate this event. - - dsisr: - This field gives information on the type of fault. It is a - copy of the DSISR from the PSL hardware when the address - fault occurred. The form of the DSISR is as defined in the - CAIA. - - reserved fields: - For future extensions - - If the event type is CXL_EVENT_AFU_ERROR then the event structure - is defined as - - :: - - struct cxl_event_afu_error { - __u16 flags; - __u16 reserved1; - __u32 reserved2; - __u64 error; - }; - - flags: - These flags indicate which optional fields are present in - this struct. Currently all fields are Mandatory. - - error: - Error status from the AFU. Defined by the AFU. - - reserved fields: - For future extensions and padding - - -2. Card character device (powerVM guest only) -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - In a powerVM guest, an extra character device is created for the - card. The device is only used to write (flash) a new image on the - FPGA accelerator. Once the image is written and verified, the - device tree is updated and the card is reset to reload the updated - image. - -open ----- - - Opens the device and allocates a file descriptor to be used with - the rest of the API. The device can only be opened once. - -ioctl ------ - -CXL_IOCTL_DOWNLOAD_IMAGE / CXL_IOCTL_VALIDATE_IMAGE: - Starts and controls flashing a new FPGA image. Partial - reconfiguration is not supported (yet), so the image must contain - a copy of the PSL and AFU(s). Since an image can be quite large, - the caller may have to iterate, splitting the image in smaller - chunks. - - Takes a pointer to a struct cxl_adapter_image:: - - struct cxl_adapter_image { - __u64 flags; - __u64 data; - __u64 len_data; - __u64 len_image; - __u64 reserved1; - __u64 reserved2; - __u64 reserved3; - __u64 reserved4; - }; - - flags: - These flags indicate which optional fields are present in - this struct. Currently all fields are mandatory. - - data: - Pointer to a buffer with part of the image to write to the - card. - - len_data: - Size of the buffer pointed to by data. - - len_image: - Full size of the image. - - -Sysfs Class -=========== - - A cxl sysfs class is added under /sys/class/cxl to facilitate - enumeration and tuning of the accelerators. Its layout is - described in Documentation/ABI/obsolete/sysfs-class-cxl - - -Udev rules -========== - - The following udev rules could be used to create a symlink to the - most logical chardev to use in any programming mode (afuX.Yd for - dedicated, afuX.Ys for afu directed), since the API is virtually - identical for each:: - - SUBSYSTEM=="cxl", ATTRS{mode}=="dedicated_process", SYMLINK="cxl/%b" - SUBSYSTEM=="cxl", ATTRS{mode}=="afu_directed", \ - KERNEL=="afu[0-9]*.[0-9]*s", SYMLINK="cxl/%b" diff --git a/Documentation/arch/powerpc/index.rst b/Documentation/arch/powerpc/index.rst index 995268530f21..0560cbae5fa1 100644 --- a/Documentation/arch/powerpc/index.rst +++ b/Documentation/arch/powerpc/index.rst @@ -12,7 +12,6 @@ powerpc bootwrapper cpu_families cpu_features - cxl dawr-power9 dexcr dscr diff --git a/Documentation/arch/riscv/hwprobe.rst b/Documentation/arch/riscv/hwprobe.rst index f273ea15a8e8..53607d962653 100644 --- a/Documentation/arch/riscv/hwprobe.rst +++ b/Documentation/arch/riscv/hwprobe.rst @@ -183,6 +183,9 @@ The following keys are defined: defined in the Atomic Compare-and-Swap (CAS) instructions manual starting from commit 5059e0ca641c ("update to ratified"). + * :c:macro:`RISCV_HWPROBE_EXT_ZICNTR`: The Zicntr extension version 2.0 + is supported as defined in the RISC-V ISA manual. + * :c:macro:`RISCV_HWPROBE_EXT_ZICOND`: The Zicond extension is supported as defined in the RISC-V Integer Conditional (Zicond) operations extension manual starting from commit 95cf1f9 ("Add changes requested by Ved @@ -192,6 +195,9 @@ The following keys are defined: supported as defined in the RISC-V ISA manual starting from commit d8ab5c78c207 ("Zihintpause is ratified"). + * :c:macro:`RISCV_HWPROBE_EXT_ZIHPM`: The Zihpm extension version 2.0 + is supported as defined in the RISC-V ISA manual. + * :c:macro:`RISCV_HWPROBE_EXT_ZVE32X`: The Vector sub-extension Zve32x is supported, as defined by version 1.0 of the RISC-V Vector extension manual. @@ -239,9 +245,32 @@ The following keys are defined: ratified in commit 98918c844281 ("Merge pull request #1217 from riscv/zawrs") of riscv-isa-manual. + * :c:macro:`RISCV_HWPROBE_EXT_ZAAMO`: The Zaamo extension is supported as + defined in the in the RISC-V ISA manual starting from commit e87412e621f1 + ("integrate Zaamo and Zalrsc text (#1304)"). + + * :c:macro:`RISCV_HWPROBE_EXT_ZALRSC`: The Zalrsc extension is supported as + defined in the in the RISC-V ISA manual starting from commit e87412e621f1 + ("integrate Zaamo and Zalrsc text (#1304)"). + * :c:macro:`RISCV_HWPROBE_EXT_SUPM`: The Supm extension is supported as defined in version 1.0 of the RISC-V Pointer Masking extensions. + * :c:macro:`RISCV_HWPROBE_EXT_ZFBFMIN`: The Zfbfmin extension is supported as + defined in the RISC-V ISA manual starting from commit 4dc23d6229de + ("Added Chapter title to BF16"). + + * :c:macro:`RISCV_HWPROBE_EXT_ZVFBFMIN`: The Zvfbfmin extension is supported as + defined in the RISC-V ISA manual starting from commit 4dc23d6229de + ("Added Chapter title to BF16"). + + * :c:macro:`RISCV_HWPROBE_EXT_ZVFBFWMA`: The Zvfbfwma extension is supported as + defined in the RISC-V ISA manual starting from commit 4dc23d6229de + ("Added Chapter title to BF16"). + + * :c:macro:`RISCV_HWPROBE_EXT_ZICBOM`: The Zicbom extension is supported, as + ratified in commit 3dd606f ("Create cmobase-v1.0.pdf") of riscv-CMOs. + * :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: Deprecated. Returns similar values to :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF`, but the key was mistakenly classified as a bitmask rather than a value. @@ -303,3 +332,6 @@ The following keys are defined: * :c:macro:`RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR`: The xtheadvector vendor extension is supported in the T-Head ISA extensions spec starting from commit a18c801634 ("Add T-Head VECTOR vendor extension. "). + +* :c:macro:`RISCV_HWPROBE_KEY_ZICBOM_BLOCK_SIZE`: An unsigned int which + represents the size of the Zicbom block in bytes. diff --git a/Documentation/arch/x86/cpuinfo.rst b/Documentation/arch/x86/cpuinfo.rst index 6ef426a52cdc..f80e2a558d2a 100644 --- a/Documentation/arch/x86/cpuinfo.rst +++ b/Documentation/arch/x86/cpuinfo.rst @@ -79,8 +79,9 @@ feature flags. How are feature flags created? ============================== -a: Feature flags can be derived from the contents of CPUID leaves. ------------------------------------------------------------------- +Feature flags can be derived from the contents of CPUID leaves +-------------------------------------------------------------- + These feature definitions are organized mirroring the layout of CPUID leaves and grouped in words with offsets as mapped in enum cpuid_leafs in cpufeatures.h (see arch/x86/include/asm/cpufeatures.h for details). @@ -89,8 +90,9 @@ cpufeatures.h, and if it is detected at run time, the flags will be displayed accordingly in /proc/cpuinfo. For example, the flag "avx2" comes from X86_FEATURE_AVX2 in cpufeatures.h. -b: Flags can be from scattered CPUID-based features. ----------------------------------------------------- +Flags can be from scattered CPUID-based features +------------------------------------------------ + Hardware features enumerated in sparsely populated CPUID leaves get software-defined values. Still, CPUID needs to be queried to determine if a given feature is present. This is done in init_scattered_cpuid_features(). @@ -104,8 +106,9 @@ has only one feature and would waste 31 bits of space in the x86_capability[] array. Since there is a struct cpuinfo_x86 for each possible CPU, the wasted memory is not trivial. -c: Flags can be created synthetically under certain conditions for hardware features. -------------------------------------------------------------------------------------- +Flags can be created synthetically under certain conditions for hardware features +--------------------------------------------------------------------------------- + Examples of conditions include whether certain features are present in MSR_IA32_CORE_CAPS or specific CPU models are identified. If the needed conditions are met, the features are enabled by the set_cpu_cap or @@ -114,8 +117,8 @@ the feature X86_FEATURE_SPLIT_LOCK_DETECT will be enabled and "split_lock_detect" will be displayed. The flag "ring3mwait" will be displayed only when running on INTEL_XEON_PHI_[KNL|KNM] processors. -d: Flags can represent purely software features. ------------------------------------------------- +Flags can represent purely software features +-------------------------------------------- These flags do not represent hardware features. Instead, they represent a software feature implemented in the kernel. For example, Kernel Page Table Isolation is purely software feature and its feature flag X86_FEATURE_PTI is @@ -130,14 +133,18 @@ x86_cap/bug_flags[] arrays in kernel/cpu/capflags.c. The names in the resulting x86_cap/bug_flags[] are used to populate /proc/cpuinfo. The naming of flags in the x86_cap/bug_flags[] are as follows: -a: The name of the flag is from the string in X86_FEATURE_<name> by default. ----------------------------------------------------------------------------- -By default, the flag <name> in /proc/cpuinfo is extracted from the respective -X86_FEATURE_<name> in cpufeatures.h. For example, the flag "avx2" is from -X86_FEATURE_AVX2. +Flags do not appear by default in /proc/cpuinfo +----------------------------------------------- + +Feature flags are omitted by default from /proc/cpuinfo as it does not make +sense for the feature to be exposed to userspace in most cases. For example, +X86_FEATURE_ALWAYS is defined in cpufeatures.h but that flag is an internal +kernel feature used in the alternative runtime patching functionality. So the +flag does not appear in /proc/cpuinfo. + +Specify a flag name if absolutely needed +---------------------------------------- -b: The naming can be overridden. --------------------------------- If the comment on the line for the #define X86_FEATURE_* starts with a double-quote character (""), the string inside the double-quote characters will be the name of the flags. For example, the flag "sse4_1" comes from @@ -148,36 +155,31 @@ needed. For instance, /proc/cpuinfo is a userspace interface and must remain constant. If, for some reason, the naming of X86_FEATURE_<name> changes, one shall override the new naming with the name already used in /proc/cpuinfo. -c: The naming override can be "", which means it will not appear in /proc/cpuinfo. ----------------------------------------------------------------------------------- -The feature shall be omitted from /proc/cpuinfo if it does not make sense for -the feature to be exposed to userspace. For example, X86_FEATURE_ALWAYS is -defined in cpufeatures.h but that flag is an internal kernel feature used -in the alternative runtime patching functionality. So, its name is overridden -with "". Its flag will not appear in /proc/cpuinfo. - Flags are missing when one or more of these happen ================================================== -a: The hardware does not enumerate support for it. --------------------------------------------------- +The hardware does not enumerate support for it +---------------------------------------------- + For example, when a new kernel is running on old hardware or the feature is not enabled by boot firmware. Even if the hardware is new, there might be a problem enabling the feature at run time, the flag will not be displayed. -b: The kernel does not know about the flag. -------------------------------------------- +The kernel does not know about the flag +--------------------------------------- + For example, when an old kernel is running on new hardware. -c: The kernel disabled support for it at compile-time. ------------------------------------------------------- +The kernel disabled support for it at compile-time +-------------------------------------------------- + For example, if 5-level-paging is not enabled when building (i.e., CONFIG_X86_5LEVEL is not selected) the flag "la57" will not show up [#f1]_. Even though the feature will still be detected via CPUID, the kernel disables it by clearing via setup_clear_cpu_cap(X86_FEATURE_LA57). -d: The feature is disabled at boot-time. ----------------------------------------- +The feature is disabled at boot-time +------------------------------------ A feature can be disabled either using a command-line parameter or because it failed to be enabled. The command-line parameter clearcpuid= can be used to disable features using the feature number as defined in @@ -190,8 +192,9 @@ disable specific features. The list of parameters includes, but is not limited to, nofsgsbase, nosgx, noxsave, etc. 5-level paging can also be disabled using "no5lvl". -e: The feature was known to be non-functional. ----------------------------------------------- +The feature was known to be non-functional +------------------------------------------ + The feature was known to be non-functional because a dependency was missing at runtime. For example, AVX flags will not show up if XSAVE feature is disabled since they depend on XSAVE feature. Another example would be broken diff --git a/Documentation/block/ublk.rst b/Documentation/block/ublk.rst index 1e0e7358e14a..854f823b46c2 100644 --- a/Documentation/block/ublk.rst +++ b/Documentation/block/ublk.rst @@ -309,18 +309,35 @@ with specified IO tag in the command data: ``UBLK_IO_COMMIT_AND_FETCH_REQ`` to the server, ublkdrv needs to copy the server buffer (pages) read to the IO request pages. -Future development -================== - Zero copy --------- -Zero copy is a generic requirement for nbd, fuse or similar drivers. A -problem [#xiaoguang]_ Xiaoguang mentioned is that pages mapped to userspace -can't be remapped any more in kernel with existing mm interfaces. This can -occurs when destining direct IO to ``/dev/ublkb*``. Also, he reported that -big requests (IO size >= 256 KB) may benefit a lot from zero copy. - +ublk zero copy relies on io_uring's fixed kernel buffer, which provides +two APIs: `io_buffer_register_bvec()` and `io_buffer_unregister_bvec`. + +ublk adds IO command of `UBLK_IO_REGISTER_IO_BUF` to call +`io_buffer_register_bvec()` for ublk server to register client request +buffer into io_uring buffer table, then ublk server can submit io_uring +IOs with the registered buffer index. IO command of `UBLK_IO_UNREGISTER_IO_BUF` +calls `io_buffer_unregister_bvec()` to unregister the buffer, which is +guaranteed to be live between calling `io_buffer_register_bvec()` and +`io_buffer_unregister_bvec()`. Any io_uring operation which supports this +kind of kernel buffer will grab one reference of the buffer until the +operation is completed. + +ublk server implementing zero copy or user copy has to be CAP_SYS_ADMIN and +be trusted, because it is ublk server's responsibility to make sure IO buffer +filled with data for handling read command, and ublk server has to return +correct result to ublk driver when handling READ command, and the result +has to match with how many bytes filled to the IO buffer. Otherwise, +uninitialized kernel IO buffer will be exposed to client application. + +ublk server needs to align the parameter of `struct ublk_param_dma_align` +with backend for zero copy to work correctly. + +For reaching best IO performance, ublk server should align its segment +parameter of `struct ublk_param_segment` with backend for avoiding +unnecessary IO split, which usually hurts io_uring performance. References ========== @@ -332,5 +349,3 @@ References .. [#userspace_nbdublk] https://gitlab.com/rwmjones/libnbd/-/tree/nbdublk .. [#userspace_readme] https://github.com/ming1/ubdsrv/blob/master/README - -.. [#xiaoguang] https://lore.kernel.org/linux-block/YoOr6jBfgVm8GvWg@stefanha-x1.localdomain/ diff --git a/Documentation/core-api/refcount-vs-atomic.rst b/Documentation/core-api/refcount-vs-atomic.rst index 79a009ce11df..94e628c1eb49 100644 --- a/Documentation/core-api/refcount-vs-atomic.rst +++ b/Documentation/core-api/refcount-vs-atomic.rst @@ -86,7 +86,19 @@ Memory ordering guarantee changes: * none (both fully unordered) -case 2) - increment-based ops that return no value +case 2) - non-"Read/Modify/Write" (RMW) ops with release ordering +----------------------------------------------------------------- + +Function changes: + + * atomic_set_release() --> refcount_set_release() + +Memory ordering guarantee changes: + + * none (both provide RELEASE ordering) + + +case 3) - increment-based ops that return no value -------------------------------------------------- Function changes: @@ -98,7 +110,7 @@ Memory ordering guarantee changes: * none (both fully unordered) -case 3) - decrement-based RMW ops that return no value +case 4) - decrement-based RMW ops that return no value ------------------------------------------------------ Function changes: @@ -110,7 +122,7 @@ Memory ordering guarantee changes: * fully unordered --> RELEASE ordering -case 4) - increment-based RMW ops that return a value +case 5) - increment-based RMW ops that return a value ----------------------------------------------------- Function changes: @@ -126,7 +138,20 @@ Memory ordering guarantees changes: result of obtaining pointer to the object! -case 5) - generic dec/sub decrement-based RMW ops that return a value +case 6) - increment-based RMW ops with acquire ordering that return a value +--------------------------------------------------------------------------- + +Function changes: + + * atomic_inc_not_zero() --> refcount_inc_not_zero_acquire() + * no atomic counterpart --> refcount_add_not_zero_acquire() + +Memory ordering guarantees changes: + + * fully ordered --> ACQUIRE ordering on success + + +case 7) - generic dec/sub decrement-based RMW ops that return a value --------------------------------------------------------------------- Function changes: @@ -139,7 +164,7 @@ Memory ordering guarantees changes: * fully ordered --> RELEASE ordering + ACQUIRE ordering on success -case 6) other decrement-based RMW ops that return a value +case 8) other decrement-based RMW ops that return a value --------------------------------------------------------- Function changes: @@ -154,7 +179,7 @@ Memory ordering guarantees changes: .. note:: atomic_add_unless() only provides full order on success. -case 7) - lock-based RMW +case 9) - lock-based RMW ------------------------ Function changes: diff --git a/Documentation/core-api/xarray.rst b/Documentation/core-api/xarray.rst index f6a3eef4fe7f..c6c91cbd0c3c 100644 --- a/Documentation/core-api/xarray.rst +++ b/Documentation/core-api/xarray.rst @@ -489,7 +489,19 @@ Storing ``NULL`` into any index of a multi-index entry will set the entry at every index to ``NULL`` and dissolve the tie. A multi-index entry can be split into entries occupying smaller ranges by calling xas_split_alloc() without the xa_lock held, followed by taking the lock -and calling xas_split(). +and calling xas_split() or calling xas_try_split() with xa_lock. The +difference between xas_split_alloc()+xas_split() and xas_try_alloc() is +that xas_split_alloc() + xas_split() split the entry from the original +order to the new order in one shot uniformly, whereas xas_try_split() +iteratively splits the entry containing the index non-uniformly. +For example, to split an order-9 entry, which takes 2^(9-6)=8 slots, +assuming ``XA_CHUNK_SHIFT`` is 6, xas_split_alloc() + xas_split() need +8 xa_node. xas_try_split() splits the order-9 entry into +2 order-8 entries, then split one order-8 entry, based on the given index, +to 2 order-7 entries, ..., and split one order-1 entry to 2 order-0 entries. +When splitting the order-6 entry and a new xa_node is needed, xas_try_split() +will try to allocate one if possible. As a result, xas_try_split() would only +need 1 xa_node instead of 8. Functions and structures ======================== diff --git a/Documentation/dev-tools/checkpatch.rst b/Documentation/dev-tools/checkpatch.rst index abb3ff682076..76bd0ddb0041 100644 --- a/Documentation/dev-tools/checkpatch.rst +++ b/Documentation/dev-tools/checkpatch.rst @@ -342,24 +342,6 @@ API usage See: https://www.kernel.org/doc/html/latest/RCU/whatisRCU.html#full-list-of-rcu-apis - **DEPRECATED_VARIABLE** - EXTRA_{A,C,CPP,LD}FLAGS are deprecated and should be replaced by the new - flags added via commit f77bf01425b1 ("kbuild: introduce ccflags-y, - asflags-y and ldflags-y"). - - The following conversion scheme maybe used:: - - EXTRA_AFLAGS -> asflags-y - EXTRA_CFLAGS -> ccflags-y - EXTRA_CPPFLAGS -> cppflags-y - EXTRA_LDFLAGS -> ldflags-y - - See: - - 1. https://lore.kernel.org/lkml/20070930191054.GA15876@uranus.ravnborg.org/ - 2. https://lore.kernel.org/lkml/1313384834-24433-12-git-send-email-lacombar@gmail.com/ - 3. https://www.kernel.org/doc/html/latest/kbuild/makefiles.html#compilation-flags - **DEVICE_ATTR_FUNCTIONS** The function names used in DEVICE_ATTR is unusual. Typically, the store and show functions are used with <attr>_store and diff --git a/Documentation/devicetree/bindings/arm/arm,coresight-tmc.yaml b/Documentation/devicetree/bindings/arm/arm,coresight-tmc.yaml index cb8dceaca70e..4787d7c6bac2 100644 --- a/Documentation/devicetree/bindings/arm/arm,coresight-tmc.yaml +++ b/Documentation/devicetree/bindings/arm/arm,coresight-tmc.yaml @@ -101,6 +101,29 @@ properties: and ETF configurations. $ref: /schemas/graph.yaml#/properties/port + memory-region: + items: + - description: Reserved trace buffer memory for ETR and ETF sinks. + For ETR, this reserved memory region is used for trace data capture. + Same region is used for trace data retention as well after a panic + or watchdog reset. + This reserved memory region is used as trace buffer or used for trace + data retention only if specifically selected by the user in sysfs + interface. + The default memory usage models for ETR in sysfs/perf modes are + otherwise unaltered. + + For ETF, this reserved memory region is used by default for + retention of trace data synced from internal SRAM after a panic + or watchdog reset. + - description: Reserved meta data memory. Used for ETR and ETF sinks + for storing metadata. + + memory-region-names: + items: + - const: tracedata + - const: metadata + required: - compatible - reg @@ -115,6 +138,9 @@ examples: etr@20070000 { compatible = "arm,coresight-tmc", "arm,primecell"; reg = <0x20070000 0x1000>; + memory-region = <&etr_trace_mem_reserved>, + <&etr_mdata_mem_reserved>; + memory-region-names = "tracedata", "metadata"; clocks = <&oscclk6a>; clock-names = "apb_pclk"; diff --git a/Documentation/devicetree/bindings/arm/qcom,coresight-ctcu.yaml b/Documentation/devicetree/bindings/arm/qcom,coresight-ctcu.yaml new file mode 100644 index 000000000000..843b52eaf872 --- /dev/null +++ b/Documentation/devicetree/bindings/arm/qcom,coresight-ctcu.yaml @@ -0,0 +1,84 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/arm/qcom,coresight-ctcu.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: CoreSight TMC Control Unit + +maintainers: + - Yuanfang Zhang <quic_yuanfang@quicinc.com> + - Mao Jinlong <quic_jinlmao@quicinc.com> + - Jie Gan <quic_jiegan@quicinc.com> + +description: | + The Trace Memory Controller(TMC) is used for Embedded Trace Buffer(ETB), + Embedded Trace FIFO(ETF) and Embedded Trace Router(ETR) configurations. + The configuration mode (ETB, ETF, ETR) is discovered at boot time when + the device is probed. + + The Coresight TMC Control unit controls various Coresight behaviors. + It works as a helper device when connected to TMC ETR device. + It is responsible for controlling the data filter function based on + the source device's Trace ID for TMC ETR device. The trace data with + that Trace id can get into ETR's buffer while other trace data gets + ignored. + +properties: + compatible: + enum: + - qcom,sa8775p-ctcu + + reg: + maxItems: 1 + + clocks: + maxItems: 1 + + clock-names: + items: + - const: apb + + in-ports: + $ref: /schemas/graph.yaml#/properties/ports + + patternProperties: + '^port(@[0-1])?$': + description: Input connections from CoreSight Trace bus + $ref: /schemas/graph.yaml#/properties/port + +required: + - compatible + - reg + - in-ports + +additionalProperties: false + +examples: + - | + ctcu@1001000 { + compatible = "qcom,sa8775p-ctcu"; + reg = <0x1001000 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb"; + + in-ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + ctcu_in_port0: endpoint { + remote-endpoint = <&etr0_out_port>; + }; + }; + + port@1 { + reg = <1>; + ctcu_in_port1: endpoint { + remote-endpoint = <&etr1_out_port>; + }; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/arm/qcom,coresight-tpda.yaml b/Documentation/devicetree/bindings/arm/qcom,coresight-tpda.yaml index 76163abed655..5ed40f21b8eb 100644 --- a/Documentation/devicetree/bindings/arm/qcom,coresight-tpda.yaml +++ b/Documentation/devicetree/bindings/arm/qcom,coresight-tpda.yaml @@ -55,8 +55,7 @@ properties: - const: arm,primecell reg: - minItems: 1 - maxItems: 2 + maxItems: 1 clocks: maxItems: 1 diff --git a/Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml b/Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml index 8eec07d9d454..07d21a3617f5 100644 --- a/Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml +++ b/Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml @@ -41,8 +41,7 @@ properties: - const: arm,primecell reg: - minItems: 1 - maxItems: 2 + maxItems: 1 qcom,dsb-element-bits: description: diff --git a/Documentation/devicetree/bindings/dma/atmel,at91sam9g45-dma.yaml b/Documentation/devicetree/bindings/dma/atmel,at91sam9g45-dma.yaml new file mode 100644 index 000000000000..a58dc407311b --- /dev/null +++ b/Documentation/devicetree/bindings/dma/atmel,at91sam9g45-dma.yaml @@ -0,0 +1,68 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/dma/atmel,at91sam9g45-dma.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Atmel Direct Memory Access Controller (DMA) + +maintainers: + - Ludovic Desroches <ludovic.desroches@microchip.com> + +description: + The Atmel Direct Memory Access Controller (DMAC) transfers data from a source + peripheral to a destination peripheral over one or more AMBA buses. One channel + is required for each source/destination pair. In the most basic configuration, + the DMAC has one master interface and one channel. The master interface reads + the data from a source and writes it to a destination. Two AMBA transfers are + required for each DMAC data transfer. This is also known as a dual-access transfer. + The DMAC is programmed via the APB interface. + +properties: + compatible: + enum: + - atmel,at91sam9g45-dma + - atmel,at91sam9rl-dma + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + "#dma-cells": + description: + Must be <2>, used to represent the number of integer cells in the dma + property of client devices. The two cells in order are + 1. The first cell represents the channel number. + 2. The second cell is 0 for RX and 1 for TX transfers. + const: 2 + + clocks: + maxItems: 1 + + clock-names: + const: dma_clk + +required: + - compatible + - reg + - interrupts + - "#dma-cells" + - clocks + - clock-names + +additionalProperties: false + +examples: + - | + dma-controller@ffffec00 { + compatible = "atmel,at91sam9g45-dma"; + reg = <0xffffec00 0x200>; + interrupts = <21>; + #dma-cells = <2>; + clocks = <&pmc 2 20>; + clock-names = "dma_clk"; + }; + +... diff --git a/Documentation/devicetree/bindings/dma/atmel,sama5d4-dma.yaml b/Documentation/devicetree/bindings/dma/atmel,sama5d4-dma.yaml index 9ca1c5d1f00f..73fc13b902b3 100644 --- a/Documentation/devicetree/bindings/dma/atmel,sama5d4-dma.yaml +++ b/Documentation/devicetree/bindings/dma/atmel,sama5d4-dma.yaml @@ -32,6 +32,9 @@ properties: - microchip,sam9x60-dma - microchip,sam9x7-dma - const: atmel,sama5d4-dma + - items: + - const: microchip,sama7d65-dma + - const: microchip,sama7g5-dma "#dma-cells": description: | diff --git a/Documentation/devicetree/bindings/dma/atmel-dma.txt b/Documentation/devicetree/bindings/dma/atmel-dma.txt deleted file mode 100644 index f69bcf5a6343..000000000000 --- a/Documentation/devicetree/bindings/dma/atmel-dma.txt +++ /dev/null @@ -1,42 +0,0 @@ -* Atmel Direct Memory Access Controller (DMA) - -Required properties: -- compatible: Should be "atmel,<chip>-dma". -- reg: Should contain DMA registers location and length. -- interrupts: Should contain DMA interrupt. -- #dma-cells: Must be <2>, used to represent the number of integer cells in -the dmas property of client devices. - -Example: - -dma0: dma@ffffec00 { - compatible = "atmel,at91sam9g45-dma"; - reg = <0xffffec00 0x200>; - interrupts = <21>; - #dma-cells = <2>; -}; - -DMA clients connected to the Atmel DMA controller must use the format -described in the dma.txt file, using a three-cell specifier for each channel: -a phandle plus two integer cells. -The three cells in order are: - -1. A phandle pointing to the DMA controller. -2. The memory interface (16 most significant bits), the peripheral interface -(16 less significant bits). -3. Parameters for the at91 DMA configuration register which are device -dependent: - - bit 7-0: peripheral identifier for the hardware handshaking interface. The - identifier can be different for tx and rx. - - bit 11-8: FIFO configuration. 0 for half FIFO, 1 for ALAP, 2 for ASAP. - -Example: - -i2c0@i2c@f8010000 { - compatible = "atmel,at91sam9x5-i2c"; - reg = <0xf8010000 0x100>; - interrupts = <9 4 6>; - dmas = <&dma0 1 7>, - <&dma0 1 8>; - dma-names = "tx", "rx"; -}; diff --git a/Documentation/devicetree/bindings/dma/fsl,edma.yaml b/Documentation/devicetree/bindings/dma/fsl,edma.yaml index 4f925469533e..950e8fa4f4ab 100644 --- a/Documentation/devicetree/bindings/dma/fsl,edma.yaml +++ b/Documentation/devicetree/bindings/dma/fsl,edma.yaml @@ -28,6 +28,14 @@ properties: - fsl,imx95-edma5 - nxp,s32g2-edma - items: + - enum: + - fsl,imx94-edma3 + - const: fsl,imx93-edma3 + - items: + - enum: + - fsl,imx94-edma5 + - const: fsl,imx95-edma5 + - items: - const: fsl,ls1028a-edma - const: fsl,vf610-edma - items: diff --git a/Documentation/devicetree/bindings/dma/fsl,elo-dma.yaml b/Documentation/devicetree/bindings/dma/fsl,elo-dma.yaml new file mode 100644 index 000000000000..92288d76d51b --- /dev/null +++ b/Documentation/devicetree/bindings/dma/fsl,elo-dma.yaml @@ -0,0 +1,137 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/dma/fsl,elo-dma.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Freescale Elo DMA Controller + +maintainers: + - J. Neuschäfer <j.ne@posteo.net> + +description: + This is a little-endian 4-channel DMA controller, used in Freescale mpc83xx + series chips such as mpc8315, mpc8349, mpc8379 etc. + +properties: + compatible: + items: + - enum: + - fsl,mpc8313-dma + - fsl,mpc8315-dma + - fsl,mpc8323-dma + - fsl,mpc8347-dma + - fsl,mpc8349-dma + - fsl,mpc8360-dma + - fsl,mpc8377-dma + - fsl,mpc8378-dma + - fsl,mpc8379-dma + - const: fsl,elo-dma + + reg: + items: + - description: + DMA General Status Register, i.e. DGSR which contains status for + all the 4 DMA channels. + + cell-index: + $ref: /schemas/types.yaml#/definitions/uint32 + description: Controller index. 0 for controller @ 0x8100. + + ranges: true + + "#address-cells": + const: 1 + + "#size-cells": + const: 1 + + interrupts: + maxItems: 1 + description: Controller interrupt. + +required: + - compatible + - reg + +patternProperties: + "^dma-channel@[0-9a-f]+$": + type: object + additionalProperties: false + + properties: + compatible: + oneOf: + # native DMA channel + - items: + - enum: + - fsl,mpc8315-dma-channel + - fsl,mpc8323-dma-channel + - fsl,mpc8347-dma-channel + - fsl,mpc8349-dma-channel + - fsl,mpc8360-dma-channel + - fsl,mpc8377-dma-channel + - fsl,mpc8378-dma-channel + - fsl,mpc8379-dma-channel + - const: fsl,elo-dma-channel + + # audio DMA channel, see fsl,ssi.yaml + - const: fsl,ssi-dma-channel + + reg: + maxItems: 1 + + cell-index: + description: DMA channel index starts at 0. + + interrupts: + maxItems: 1 + description: + Per-channel interrupt. Only necessary if no controller interrupt has + been provided. + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/irq.h> + + dma@82a8 { + compatible = "fsl,mpc8349-dma", "fsl,elo-dma"; + reg = <0x82a8 4>; + #address-cells = <1>; + #size-cells = <1>; + ranges = <0 0x8100 0x1a4>; + interrupts = <71 IRQ_TYPE_LEVEL_LOW>; + cell-index = <0>; + + dma-channel@0 { + compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; + reg = <0 0x80>; + cell-index = <0>; + interrupts = <71 IRQ_TYPE_LEVEL_LOW>; + }; + + dma-channel@80 { + compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; + reg = <0x80 0x80>; + cell-index = <1>; + interrupts = <71 IRQ_TYPE_LEVEL_LOW>; + }; + + dma-channel@100 { + compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; + reg = <0x100 0x80>; + cell-index = <2>; + interrupts = <71 IRQ_TYPE_LEVEL_LOW>; + }; + + dma-channel@180 { + compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; + reg = <0x180 0x80>; + cell-index = <3>; + interrupts = <71 IRQ_TYPE_LEVEL_LOW>; + }; + }; + +... diff --git a/Documentation/devicetree/bindings/dma/fsl,elo3-dma.yaml b/Documentation/devicetree/bindings/dma/fsl,elo3-dma.yaml new file mode 100644 index 000000000000..0f5e475657a7 --- /dev/null +++ b/Documentation/devicetree/bindings/dma/fsl,elo3-dma.yaml @@ -0,0 +1,125 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/dma/fsl,elo3-dma.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Freescale Elo3 DMA Controller + +maintainers: + - J. Neuschäfer <j.ne@posteo.net> + +description: + DMA controller which has same function as EloPlus except that Elo3 has 8 + channels while EloPlus has only 4, it is used in Freescale Txxx and Bxxx + series chips, such as t1040, t4240, b4860. + +properties: + compatible: + const: fsl,elo3-dma + + reg: + items: + - description: + DMA General Status Registers starting from DGSR0, for channel 1~4 + - description: + DMA General Status Registers starting from DGSR1, for channel 5~8 + + ranges: true + + "#address-cells": + const: 1 + + "#size-cells": + const: 1 + + interrupts: + maxItems: 1 + +patternProperties: + "^dma-channel@[0-9a-f]+$": + type: object + additionalProperties: false + + properties: + compatible: + enum: + # native DMA channel + - fsl,eloplus-dma-channel + + # audio DMA channel, see fsl,ssi.yaml + - fsl,ssi-dma-channel + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + description: + Per-channel interrupt. Only necessary if no controller interrupt has + been provided. + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/irq.h> + + dma@100300 { + compatible = "fsl,elo3-dma"; + reg = <0x100300 0x4>, + <0x100600 0x4>; + #address-cells = <1>; + #size-cells = <1>; + ranges = <0x0 0x100100 0x500>; + + dma-channel@0 { + compatible = "fsl,eloplus-dma-channel"; + reg = <0x0 0x80>; + interrupts = <28 IRQ_TYPE_EDGE_FALLING 0 0>; + }; + + dma-channel@80 { + compatible = "fsl,eloplus-dma-channel"; + reg = <0x80 0x80>; + interrupts = <29 IRQ_TYPE_EDGE_FALLING 0 0>; + }; + + dma-channel@100 { + compatible = "fsl,eloplus-dma-channel"; + reg = <0x100 0x80>; + interrupts = <30 IRQ_TYPE_EDGE_FALLING 0 0>; + }; + + dma-channel@180 { + compatible = "fsl,eloplus-dma-channel"; + reg = <0x180 0x80>; + interrupts = <31 IRQ_TYPE_EDGE_FALLING 0 0>; + }; + + dma-channel@300 { + compatible = "fsl,eloplus-dma-channel"; + reg = <0x300 0x80>; + interrupts = <76 IRQ_TYPE_EDGE_FALLING 0 0>; + }; + + dma-channel@380 { + compatible = "fsl,eloplus-dma-channel"; + reg = <0x380 0x80>; + interrupts = <77 IRQ_TYPE_EDGE_FALLING 0 0>; + }; + + dma-channel@400 { + compatible = "fsl,eloplus-dma-channel"; + reg = <0x400 0x80>; + interrupts = <78 IRQ_TYPE_EDGE_FALLING 0 0>; + }; + + dma-channel@480 { + compatible = "fsl,eloplus-dma-channel"; + reg = <0x480 0x80>; + interrupts = <79 IRQ_TYPE_EDGE_FALLING 0 0>; + }; + }; + +... diff --git a/Documentation/devicetree/bindings/dma/fsl,eloplus-dma.yaml b/Documentation/devicetree/bindings/dma/fsl,eloplus-dma.yaml new file mode 100644 index 000000000000..8992f244c4db --- /dev/null +++ b/Documentation/devicetree/bindings/dma/fsl,eloplus-dma.yaml @@ -0,0 +1,132 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/dma/fsl,eloplus-dma.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Freescale EloPlus DMA Controller + +maintainers: + - J. Neuschäfer <j.ne@posteo.net> + +description: + This is a 4-channel DMA controller with extended addresses and chaining, + mainly used in Freescale mpc85xx/86xx, Pxxx and BSC series chips, such as + mpc8540, mpc8641 p4080, bsc9131 etc. + +properties: + compatible: + oneOf: + - items: + - enum: + - fsl,mpc8540-dma + - fsl,mpc8541-dma + - fsl,mpc8548-dma + - fsl,mpc8555-dma + - fsl,mpc8560-dma + - fsl,mpc8572-dma + - fsl,mpc8641-dma + - const: fsl,eloplus-dma + - const: fsl,eloplus-dma + + reg: + items: + - description: + DMA General Status Register, i.e. DGSR which contains + status for all the 4 DMA channels + + cell-index: + $ref: /schemas/types.yaml#/definitions/uint32 + description: + controller index. 0 for controller @ 0x21000, 1 for controller @ 0xc000 + + ranges: true + + "#address-cells": + const: 1 + + "#size-cells": + const: 1 + + interrupts: + maxItems: 1 + description: Controller interrupt. + +patternProperties: + "^dma-channel@[0-9a-f]+$": + type: object + additionalProperties: false + + properties: + compatible: + oneOf: + # native DMA channel + - items: + - enum: + - fsl,mpc8540-dma-channel + - fsl,mpc8541-dma-channel + - fsl,mpc8548-dma-channel + - fsl,mpc8555-dma-channel + - fsl,mpc8560-dma-channel + - fsl,mpc8572-dma-channel + - const: fsl,eloplus-dma-channel + + # audio DMA channel, see fsl,ssi.yaml + - const: fsl,ssi-dma-channel + + reg: + maxItems: 1 + + cell-index: + description: DMA channel index starts at 0. + + interrupts: + maxItems: 1 + description: + Per-channel interrupt. Only necessary if no controller interrupt has + been provided. + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/irq.h> + + dma@21300 { + compatible = "fsl,mpc8540-dma", "fsl,eloplus-dma"; + reg = <0x21300 4>; + #address-cells = <1>; + #size-cells = <1>; + ranges = <0 0x21100 0x200>; + cell-index = <0>; + + dma-channel@0 { + compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; + reg = <0 0x80>; + cell-index = <0>; + interrupts = <20 IRQ_TYPE_EDGE_FALLING>; + }; + + dma-channel@80 { + compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; + reg = <0x80 0x80>; + cell-index = <1>; + interrupts = <21 IRQ_TYPE_EDGE_FALLING>; + }; + + dma-channel@100 { + compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; + reg = <0x100 0x80>; + cell-index = <2>; + interrupts = <22 IRQ_TYPE_EDGE_FALLING>; + }; + + dma-channel@180 { + compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; + reg = <0x180 0x80>; + cell-index = <3>; + interrupts = <23 IRQ_TYPE_EDGE_FALLING>; + }; + }; + +... diff --git a/Documentation/devicetree/bindings/dma/fsl,mxs-dma.yaml b/Documentation/devicetree/bindings/dma/fsl,mxs-dma.yaml index a17cf2360dd4..75a7d9556699 100644 --- a/Documentation/devicetree/bindings/dma/fsl,mxs-dma.yaml +++ b/Documentation/devicetree/bindings/dma/fsl,mxs-dma.yaml @@ -31,6 +31,12 @@ properties: - fsl,imx6q-dma-apbh - fsl,imx6sx-dma-apbh - fsl,imx7d-dma-apbh + - fsl,imx8dxl-dma-apbh + - fsl,imx8mm-dma-apbh + - fsl,imx8mn-dma-apbh + - fsl,imx8mp-dma-apbh + - fsl,imx8mq-dma-apbh + - fsl,imx8qm-dma-apbh - fsl,imx8qxp-dma-apbh - const: fsl,imx28-dma-apbh - enum: diff --git a/Documentation/devicetree/bindings/dma/snps,dw-axi-dmac.yaml b/Documentation/devicetree/bindings/dma/snps,dw-axi-dmac.yaml index 525f5f3932f5..935735a59afd 100644 --- a/Documentation/devicetree/bindings/dma/snps,dw-axi-dmac.yaml +++ b/Documentation/devicetree/bindings/dma/snps,dw-axi-dmac.yaml @@ -59,6 +59,8 @@ properties: minimum: 1 maximum: 8 + dma-noncoherent: true + resets: minItems: 1 maxItems: 2 diff --git a/Documentation/devicetree/bindings/eeprom/at24.yaml b/Documentation/devicetree/bindings/eeprom/at24.yaml index c9e4afbdc448..0ac68646c077 100644 --- a/Documentation/devicetree/bindings/eeprom/at24.yaml +++ b/Documentation/devicetree/bindings/eeprom/at24.yaml @@ -130,10 +130,13 @@ properties: - const: giantec,gt24c32a - const: atmel,24c32 - items: - - const: onnn,n24s64b + - enum: + - onnn,n24s64b + - puya,p24c64f - const: atmel,24c64 - items: - enum: + - giantec,gt24p128e - giantec,gt24p128f - renesas,r1ex24128 - samsung,s524ad0xd1 diff --git a/Documentation/devicetree/bindings/i2c/i2c-exynos5.yaml b/Documentation/devicetree/bindings/i2c/i2c-exynos5.yaml index 70cc2ee9ee27..8d47b290b4ed 100644 --- a/Documentation/devicetree/bindings/i2c/i2c-exynos5.yaml +++ b/Documentation/devicetree/bindings/i2c/i2c-exynos5.yaml @@ -30,6 +30,7 @@ properties: - items: - enum: - samsung,exynos5433-hsi2c + - samsung,exynos7870-hsi2c - tesla,fsd-hsi2c - const: samsung,exynos7-hsi2c - items: diff --git a/Documentation/devicetree/bindings/i2c/i2c-imx-lpi2c.yaml b/Documentation/devicetree/bindings/i2c/i2c-imx-lpi2c.yaml index 1dcb9c78de3b..969030a6f82a 100644 --- a/Documentation/devicetree/bindings/i2c/i2c-imx-lpi2c.yaml +++ b/Documentation/devicetree/bindings/i2c/i2c-imx-lpi2c.yaml @@ -26,6 +26,7 @@ properties: - fsl,imx8qm-lpi2c - fsl,imx8ulp-lpi2c - fsl,imx93-lpi2c + - fsl,imx94-lpi2c - fsl,imx95-lpi2c - const: fsl,imx7ulp-lpi2c diff --git a/Documentation/devicetree/bindings/i2c/i2c-rk3x.yaml b/Documentation/devicetree/bindings/i2c/i2c-rk3x.yaml index a9dae5b52f28..8101afa6f146 100644 --- a/Documentation/devicetree/bindings/i2c/i2c-rk3x.yaml +++ b/Documentation/devicetree/bindings/i2c/i2c-rk3x.yaml @@ -37,6 +37,7 @@ properties: - rockchip,px30-i2c - rockchip,rk3308-i2c - rockchip,rk3328-i2c + - rockchip,rk3562-i2c - rockchip,rk3568-i2c - rockchip,rk3576-i2c - rockchip,rk3588-i2c diff --git a/Documentation/devicetree/bindings/i2c/qcom,i2c-qup.yaml b/Documentation/devicetree/bindings/i2c/qcom,i2c-qup.yaml index f43947514d48..758d8f6321e1 100644 --- a/Documentation/devicetree/bindings/i2c/qcom,i2c-qup.yaml +++ b/Documentation/devicetree/bindings/i2c/qcom,i2c-qup.yaml @@ -40,6 +40,9 @@ properties: - const: tx - const: rx + interconnects: + maxItems: 1 + interrupts: maxItems: 1 @@ -52,9 +55,15 @@ properties: - const: default - const: sleep + power-domains: + maxItems: 1 + reg: maxItems: 1 + required-opps: + maxItems: 1 + required: - compatible - clock-names @@ -67,7 +76,9 @@ unevaluatedProperties: false examples: - | #include <dt-bindings/clock/qcom,gcc-msm8998.h> + #include <dt-bindings/interconnect/qcom,msm8996.h> #include <dt-bindings/interrupt-controller/arm-gic.h> + #include <dt-bindings/power/qcom-rpmpd.h> i2c@c175000 { compatible = "qcom,i2c-qup-v2.2.1"; @@ -82,6 +93,9 @@ examples: pinctrl-names = "default", "sleep"; pinctrl-0 = <&blsp1_i2c1_default>; pinctrl-1 = <&blsp1_i2c1_sleep>; + power-domains = <&rpmpd MSM8909_VDDCX>; + required-opps = <&rpmpd_opp_svs_krait>; + interconnects = <&pnoc MASTER_BLSP_1 &bimc SLAVE_EBI_CH0>; clock-frequency = <400000>; #address-cells = <1>; diff --git a/Documentation/devicetree/bindings/i2c/samsung,s3c2410-i2c.yaml b/Documentation/devicetree/bindings/i2c/samsung,s3c2410-i2c.yaml index bbc568485627..6ba7d793504c 100644 --- a/Documentation/devicetree/bindings/i2c/samsung,s3c2410-i2c.yaml +++ b/Documentation/devicetree/bindings/i2c/samsung,s3c2410-i2c.yaml @@ -22,6 +22,7 @@ properties: - samsung,exynos5-sata-phy-i2c - items: - enum: + - samsung,exynos7870-i2c - samsung,exynos7885-i2c - samsung,exynos850-i2c - const: samsung,s3c2440-i2c diff --git a/Documentation/devicetree/bindings/i2c/snps,designware-i2c.yaml b/Documentation/devicetree/bindings/i2c/snps,designware-i2c.yaml index e5d05263c45a..bc5d0fb5abfe 100644 --- a/Documentation/devicetree/bindings/i2c/snps,designware-i2c.yaml +++ b/Documentation/devicetree/bindings/i2c/snps,designware-i2c.yaml @@ -27,6 +27,11 @@ properties: oneOf: - description: Generic Synopsys DesignWare I2C controller const: snps,designware-i2c + - description: Renesas RZ/N1D I2C controller + items: + - const: renesas,r9a06g032-i2c # RZ/N1D + - const: renesas,rzn1-i2c # RZ/N1 + - const: snps,designware-i2c - description: Microsemi Ocelot SoCs I2C controller items: - const: mscc,ocelot-i2c diff --git a/Documentation/devicetree/bindings/i2c/spacemit,k1-i2c.yaml b/Documentation/devicetree/bindings/i2c/spacemit,k1-i2c.yaml new file mode 100644 index 000000000000..3d6aefb0d0f1 --- /dev/null +++ b/Documentation/devicetree/bindings/i2c/spacemit,k1-i2c.yaml @@ -0,0 +1,61 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/i2c/spacemit,k1-i2c.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: I2C controller embedded in SpacemiT's K1 SoC + +maintainers: + - Troy Mitchell <troymitchell988@gmail.com> + +properties: + compatible: + const: spacemit,k1-i2c + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + clocks: + items: + - description: I2C Functional Clock + - description: APB Bus Clock + + clock-names: + items: + - const: func + - const: bus + + clock-frequency: + description: | + K1 support three different modes which running different frequencies + standard speed mode: up to 100000 (100Hz) + fast speed mode : up to 400000 (400Hz) + high speed mode : up to 3300000 (3.3Mhz) + default: 400000 + maximum: 3300000 + +required: + - compatible + - reg + - interrupts + - clocks + +unevaluatedProperties: false + +examples: + - | + i2c@d4010800 { + compatible = "spacemit,k1-i2c"; + reg = <0xd4010800 0x38>; + interrupt-parent = <&plic>; + interrupts = <36>; + clocks =<&ccu 32>, <&ccu 84>; + clock-names = "func", "bus"; + clock-frequency = <100000>; + }; + +... diff --git a/Documentation/devicetree/bindings/i2c/ti,omap4-i2c.yaml b/Documentation/devicetree/bindings/i2c/ti,omap4-i2c.yaml index 8c2e35fabf5b..58d32ceeacfc 100644 --- a/Documentation/devicetree/bindings/i2c/ti,omap4-i2c.yaml +++ b/Documentation/devicetree/bindings/i2c/ti,omap4-i2c.yaml @@ -47,6 +47,11 @@ properties: $ref: /schemas/types.yaml#/definitions/string deprecated: true + mux-states: + description: + mux controller node to route the I2C signals from SoC to clients. + maxItems: 1 + required: - compatible - reg @@ -87,4 +92,5 @@ examples: interrupts = <GIC_SPI 200 IRQ_TYPE_LEVEL_HIGH>; #address-cells = <1>; #size-cells = <0>; + mux-states = <&i2c_mux 1>; }; diff --git a/Documentation/devicetree/bindings/i3c/silvaco,i3c-master.yaml b/Documentation/devicetree/bindings/i3c/silvaco,i3c-master.yaml index c56ff77677f1..4fbdcdac0aee 100644 --- a/Documentation/devicetree/bindings/i3c/silvaco,i3c-master.yaml +++ b/Documentation/devicetree/bindings/i3c/silvaco,i3c-master.yaml @@ -14,7 +14,9 @@ allOf: properties: compatible: - const: silvaco,i3c-master-v1 + enum: + - nuvoton,npcm845-i3c + - silvaco,i3c-master-v1 reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/i3c/snps,dw-i3c-master.yaml b/Documentation/devicetree/bindings/i3c/snps,dw-i3c-master.yaml index 4fc13e3c0f75..5f6467375811 100644 --- a/Documentation/devicetree/bindings/i3c/snps,dw-i3c-master.yaml +++ b/Documentation/devicetree/bindings/i3c/snps,dw-i3c-master.yaml @@ -34,6 +34,9 @@ properties: interrupts: maxItems: 1 + power-domains: + maxItems: 1 + required: - compatible - reg diff --git a/Documentation/devicetree/bindings/iio/adc/adi,ad4030.yaml b/Documentation/devicetree/bindings/iio/adc/adi,ad4030.yaml new file mode 100644 index 000000000000..54e7349317b7 --- /dev/null +++ b/Documentation/devicetree/bindings/iio/adc/adi,ad4030.yaml @@ -0,0 +1,110 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +# Copyright 2024 Analog Devices Inc. +# Copyright 2024 BayLibre, SAS. +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/iio/adc/adi,ad4030.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Analog Devices AD4030 and AD4630 ADC families + +maintainers: + - Michael Hennerich <michael.hennerich@analog.com> + - Nuno Sa <nuno.sa@analog.com> + +description: | + Analog Devices AD4030 single channel and AD4630/AD4632 dual channel precision + SAR ADC families + + * https://www.analog.com/media/en/technical-documentation/data-sheets/ad4030-24-4032-24.pdf + * https://www.analog.com/media/en/technical-documentation/data-sheets/ad4630-24_ad4632-24.pdf + * https://www.analog.com/media/en/technical-documentation/data-sheets/ad4630-16-4632-16.pdf + +properties: + compatible: + enum: + - adi,ad4030-24 + - adi,ad4032-24 + - adi,ad4630-16 + - adi,ad4630-24 + - adi,ad4632-16 + - adi,ad4632-24 + + reg: + maxItems: 1 + + spi-max-frequency: + maximum: 102040816 + + spi-rx-bus-width: + enum: [1, 2, 4] + + vdd-5v-supply: true + vdd-1v8-supply: true + vio-supply: true + + ref-supply: + description: + Optional External unbuffered reference. Used when refin-supply is not + connected. + + refin-supply: + description: + Internal buffered Reference. Used when ref-supply is not connected. + + cnv-gpios: + description: + The Convert Input (CNV). It initiates the sampling conversions. + maxItems: 1 + + reset-gpios: + description: + The Reset Input (/RST). Used for asynchronous device reset. + maxItems: 1 + + interrupts: + description: + The BUSY pin is used to signal that the conversions results are available + to be transferred when in SPI Clocking Mode. This nodes should be + connected to an interrupt that is triggered when the BUSY line goes low. + maxItems: 1 + + interrupt-names: + const: busy + +required: + - compatible + - reg + - vdd-5v-supply + - vdd-1v8-supply + - vio-supply + - cnv-gpios + +oneOf: + - required: + - ref-supply + - required: + - refin-supply + +unevaluatedProperties: false + +examples: + - | + #include <dt-bindings/gpio/gpio.h> + + spi { + #address-cells = <1>; + #size-cells = <0>; + + adc@0 { + compatible = "adi,ad4030-24"; + reg = <0>; + spi-max-frequency = <80000000>; + vdd-5v-supply = <&supply_5V>; + vdd-1v8-supply = <&supply_1_8V>; + vio-supply = <&supply_1_8V>; + ref-supply = <&supply_5V>; + cnv-gpios = <&gpio0 0 GPIO_ACTIVE_HIGH>; + reset-gpios = <&gpio0 1 GPIO_ACTIVE_LOW>; + }; + }; diff --git a/Documentation/devicetree/bindings/iio/adc/adi,ad4695.yaml b/Documentation/devicetree/bindings/iio/adc/adi,ad4695.yaml index 7d2229dee444..cbde7a0505d2 100644 --- a/Documentation/devicetree/bindings/iio/adc/adi,ad4695.yaml +++ b/Documentation/devicetree/bindings/iio/adc/adi,ad4695.yaml @@ -84,6 +84,10 @@ properties: description: The Reset Input (RESET). Should be configured GPIO_ACTIVE_LOW. maxItems: 1 + pwms: + description: PWM signal connected to the CNV pin. + maxItems: 1 + interrupts: minItems: 1 items: @@ -106,6 +110,15 @@ properties: The first cell is the GPn number: 0 to 3. The second cell takes standard GPIO flags. + '#trigger-source-cells': + description: | + First cell indicates the output signal: 0 = BUSY, 1 = ALERT. + Second cell indicates which GPn pin is used: 0, 2 or 3. + + For convenience, macros for these values are available in + dt-bindings/iio/adc/adi,ad4695.h. + const: 2 + "#address-cells": const: 1 diff --git a/Documentation/devicetree/bindings/iio/adc/adi,ad4851.yaml b/Documentation/devicetree/bindings/iio/adc/adi,ad4851.yaml new file mode 100644 index 000000000000..c6676d91b4e6 --- /dev/null +++ b/Documentation/devicetree/bindings/iio/adc/adi,ad4851.yaml @@ -0,0 +1,153 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +# Copyright 2024 Analog Devices Inc. +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/iio/adc/adi,ad4851.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Analog Devices AD485X family + +maintainers: + - Sergiu Cuciurean <sergiu.cuciurean@analog.com> + - Dragos Bogdan <dragos.bogdan@analog.com> + - Antoniu Miclaus <antoniu.miclaus@analog.com> + +description: | + Analog Devices AD485X fully buffered, 8-channel simultaneous sampling, + 16/20-bit, 1 MSPS data acquisition system (DAS) with differential, wide + common-mode range inputs. + + https://www.analog.com/media/en/technical-documentation/data-sheets/ad4855.pdf + https://www.analog.com/media/en/technical-documentation/data-sheets/ad4856.pdf + https://www.analog.com/media/en/technical-documentation/data-sheets/ad4857.pdf + https://www.analog.com/media/en/technical-documentation/data-sheets/ad4858.pdf + +$ref: /schemas/spi/spi-peripheral-props.yaml# + +properties: + compatible: + enum: + - adi,ad4851 + - adi,ad4852 + - adi,ad4853 + - adi,ad4854 + - adi,ad4855 + - adi,ad4856 + - adi,ad4857 + - adi,ad4858 + - adi,ad4858i + + reg: + maxItems: 1 + + vcc-supply: true + + vee-supply: true + + vdd-supply: true + + vddh-supply: true + + vddl-supply: true + + vio-supply: true + + vrefbuf-supply: true + + vrefio-supply: true + + pwms: + description: PWM connected to the CNV pin. + maxItems: 1 + + io-backends: + maxItems: 1 + + pd-gpios: + maxItems: 1 + + spi-max-frequency: + maximum: 25000000 + + '#address-cells': + const: 1 + + '#size-cells': + const: 0 + +patternProperties: + "^channel(@[0-7])?$": + $ref: adc.yaml + type: object + description: Represents the channels which are connected to the ADC. + + properties: + reg: + description: + The channel number, as specified in the datasheet (from 0 to 7). + minimum: 0 + maximum: 7 + + diff-channels: + description: + Each channel can be configured as a bipolar differential channel. + The ADC uses the same positive and negative inputs for this. + This property must be specified as 'reg' (or the channel number) for + both positive and negative inputs (i.e. diff-channels = <reg reg>). + Since the configuration is bipolar differential, the 'bipolar' + property is required. + items: + minimum: 0 + maximum: 7 + + bipolar: true + + required: + - reg + + additionalProperties: false + +required: + - compatible + - reg + - vcc-supply + - vee-supply + - vdd-supply + - vio-supply + - pwms + +unevaluatedProperties: false + +examples: + - | + spi { + #address-cells = <1>; + #size-cells = <0>; + + adc@0{ + #address-cells = <1>; + #size-cells = <0>; + compatible = "adi,ad4858"; + reg = <0>; + spi-max-frequency = <10000000>; + vcc-supply = <&vcc>; + vdd-supply = <&vdd>; + vee-supply = <&vee>; + vddh-supply = <&vddh>; + vddl-supply = <&vddl>; + vio-supply = <&vio>; + pwms = <&pwm_gen 0 0>; + io-backends = <&iio_backend>; + + channel@0 { + reg = <0>; + diff-channels = <0 0>; + bipolar; + }; + + channel@1 { + reg = <1>; + }; + }; + }; +... diff --git a/Documentation/devicetree/bindings/iio/adc/adi,ad7191.yaml b/Documentation/devicetree/bindings/iio/adc/adi,ad7191.yaml new file mode 100644 index 000000000000..801ed319ee82 --- /dev/null +++ b/Documentation/devicetree/bindings/iio/adc/adi,ad7191.yaml @@ -0,0 +1,149 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +# Copyright 2025 Analog Devices Inc. +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/iio/adc/adi,ad7191.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Analog Devices AD7191 ADC + +maintainers: + - Alisa-Dariana Roman <alisa.roman@analog.com> + +description: | + Bindings for the Analog Devices AD7191 ADC device. Datasheet can be + found here: + https://www.analog.com/media/en/technical-documentation/data-sheets/AD7191.pdf + The device's PDOWN pin must be connected to the SPI controller's chip select + pin. + +properties: + compatible: + enum: + - adi,ad7191 + + reg: + maxItems: 1 + + spi-cpol: true + + spi-cpha: true + + clocks: + maxItems: 1 + description: + Must be present when CLKSEL pin is tied HIGH to select external clock + source (either a crystal between MCLK1 and MCLK2 pins, or a + CMOS-compatible clock driving MCLK2 pin). Must be absent when CLKSEL pin + is tied LOW to use the internal 4.92MHz clock. + + interrupts: + maxItems: 1 + + avdd-supply: + description: AVdd voltage supply + + dvdd-supply: + description: DVdd voltage supply + + vref-supply: + description: Vref voltage supply + + odr-gpios: + description: + ODR1 and ODR2 pins for output data rate selection. Should be defined if + adi,odr-value is absent. + minItems: 2 + maxItems: 2 + + adi,odr-value: + $ref: /schemas/types.yaml#/definitions/uint32 + description: | + Should be present if ODR pins are pin-strapped. Possible values: + 120 Hz (ODR1=0, ODR2=0) + 60 Hz (ODR1=0, ODR2=1) + 50 Hz (ODR1=1, ODR2=0) + 10 Hz (ODR1=1, ODR2=1) + If defined, odr-gpios must be absent. + enum: [120, 60, 50, 10] + + pga-gpios: + description: + PGA1 and PGA2 pins for gain selection. Should be defined if adi,pga-value + is absent. + minItems: 2 + maxItems: 2 + + adi,pga-value: + $ref: /schemas/types.yaml#/definitions/uint32 + description: | + Should be present if PGA pins are pin-strapped. Possible values: + Gain 1 (PGA1=0, PGA2=0) + Gain 8 (PGA1=0, PGA2=1) + Gain 64 (PGA1=1, PGA2=0) + Gain 128 (PGA1=1, PGA2=1) + If defined, pga-gpios must be absent. + enum: [1, 8, 64, 128] + + temp-gpios: + description: TEMP pin for temperature sensor enable. + maxItems: 1 + + chan-gpios: + description: CHAN pin for input channel selection. + maxItems: 1 + +required: + - compatible + - reg + - interrupts + - avdd-supply + - dvdd-supply + - vref-supply + - spi-cpol + - spi-cpha + - temp-gpios + - chan-gpios + +allOf: + - $ref: /schemas/spi/spi-peripheral-props.yaml# + - oneOf: + - required: + - adi,odr-value + - required: + - odr-gpios + - oneOf: + - required: + - adi,pga-value + - required: + - pga-gpios + +unevaluatedProperties: false + +examples: + - | + #include <dt-bindings/gpio/gpio.h> + #include <dt-bindings/interrupt-controller/irq.h> + + spi { + #address-cells = <1>; + #size-cells = <0>; + + adc@0 { + compatible = "adi,ad7191"; + reg = <0>; + spi-max-frequency = <1000000>; + spi-cpol; + spi-cpha; + clocks = <&ad7191_mclk>; + interrupts = <25 IRQ_TYPE_EDGE_FALLING>; + interrupt-parent = <&gpio>; + avdd-supply = <&avdd>; + dvdd-supply = <&dvdd>; + vref-supply = <&vref>; + adi,pga-value = <1>; + odr-gpios = <&gpio 23 GPIO_ACTIVE_HIGH>, <&gpio 24 GPIO_ACTIVE_HIGH>; + temp-gpios = <&gpio 22 GPIO_ACTIVE_HIGH>; + chan-gpios = <&gpio 27 GPIO_ACTIVE_HIGH>; + }; + }; diff --git a/Documentation/devicetree/bindings/iio/adc/adi,ad7380.yaml b/Documentation/devicetree/bindings/iio/adc/adi,ad7380.yaml index ada08005b3cd..ff4f5c21c548 100644 --- a/Documentation/devicetree/bindings/iio/adc/adi,ad7380.yaml +++ b/Documentation/devicetree/bindings/iio/adc/adi,ad7380.yaml @@ -27,6 +27,7 @@ description: | * https://www.analog.com/en/products/ad7388-4.html * https://www.analog.com/en/products/adaq4370-4.html * https://www.analog.com/en/products/adaq4380-4.html + * https://www.analog.com/en/products/adaq4381-4.html $ref: /schemas/spi/spi-peripheral-props.yaml# @@ -50,6 +51,7 @@ properties: - adi,ad7388-4 - adi,adaq4370-4 - adi,adaq4380-4 + - adi,adaq4381-4 reg: maxItems: 1 @@ -201,6 +203,7 @@ allOf: - adi,ad7380-4 - adi,adaq4370-4 - adi,adaq4380-4 + - adi,adaq4381-4 then: properties: refio-supply: false @@ -218,6 +221,7 @@ allOf: enum: - adi,adaq4370-4 - adi,adaq4380-4 + - adi,adaq4381-4 then: required: - vs-p-supply diff --git a/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml b/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml index e1f450b80db2..cf74f84d6103 100644 --- a/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml +++ b/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml @@ -17,13 +17,25 @@ description: | interface for the actual ADC, while this IP core will interface to the data-lines of the ADC and handle the streaming of data into memory via DMA. + In some cases, the AXI ADC interface is used to perform specialized + operation to a particular ADC, e.g access the physical bus through + specific registers to write ADC registers. + In this case, we use a different compatible which indicates the target + IP core's name. + The following IP is currently supported: + - AXI AD7606x: specialized version of the IP core for all the chips from + the ad7606 family. https://wiki.analog.com/resources/fpga/docs/axi_adc_ip + https://analogdevicesinc.github.io/hdl/library/axi_ad485x/index.html + http://analogdevicesinc.github.io/hdl/library/axi_ad7606x/index.html properties: compatible: enum: - adi,axi-adc-10.0.a + - adi,axi-ad7606x + - adi,axi-ad485x reg: maxItems: 1 @@ -47,17 +59,48 @@ properties: '#io-backend-cells': const: 0 + '#address-cells': + const: 1 + + '#size-cells': + const: 0 + +patternProperties: + "^adc@[0-9a-f]+$": + type: object + properties: + reg: + maxItems: 1 + additionalProperties: true + required: + - compatible + - reg + required: - compatible - dmas - reg - clocks +allOf: + - if: + properties: + compatible: + not: + contains: + const: adi,axi-ad7606x + then: + properties: + '#address-cells': false + '#size-cells': false + patternProperties: + "^adc@[0-9a-f]+$": false + additionalProperties: false examples: - | - axi-adc@44a00000 { + adc@44a00000 { compatible = "adi,axi-adc-10.0.a"; reg = <0x44a00000 0x10000>; dmas = <&rx_dma 0>; @@ -65,4 +108,31 @@ examples: clocks = <&axi_clk>; #io-backend-cells = <0>; }; + - | + #include <dt-bindings/gpio/gpio.h> + parallel_bus_controller@44a00000 { + compatible = "adi,axi-ad7606x"; + reg = <0x44a00000 0x10000>; + dmas = <&rx_dma 0>; + dma-names = "rx"; + clocks = <&ext_clk>; + #address-cells = <1>; + #size-cells = <0>; + + adc@0 { + compatible = "adi,ad7606b"; + reg = <0>; + pwms = <&axi_pwm_gen 0 0>; + pwm-names = "convst1"; + avcc-supply = <&adc_vref>; + vdrive-supply = <&vdd_supply>; + reset-gpios = <&gpio0 91 GPIO_ACTIVE_HIGH>; + standby-gpios = <&gpio0 90 GPIO_ACTIVE_LOW>; + adi,range-gpios = <&gpio0 89 GPIO_ACTIVE_HIGH>; + adi,oversampling-ratio-gpios = <&gpio0 88 GPIO_ACTIVE_HIGH + &gpio0 87 GPIO_ACTIVE_HIGH + &gpio0 86 GPIO_ACTIVE_HIGH>; + io-backends = <¶llel_bus_controller>; + }; + }; ... diff --git a/Documentation/devicetree/bindings/iio/adc/nxp,imx93-adc.yaml b/Documentation/devicetree/bindings/iio/adc/nxp,imx93-adc.yaml index dfc3f512918f..c2e5ff418920 100644 --- a/Documentation/devicetree/bindings/iio/adc/nxp,imx93-adc.yaml +++ b/Documentation/devicetree/bindings/iio/adc/nxp,imx93-adc.yaml @@ -19,7 +19,14 @@ description: properties: compatible: - const: nxp,imx93-adc + oneOf: + - enum: + - nxp,imx93-adc + - items: + - enum: + - nxp,imx94-adc + - nxp,imx95-adc + - const: nxp,imx93-adc reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/iio/adc/rockchip-saradc.yaml b/Documentation/devicetree/bindings/iio/adc/rockchip-saradc.yaml index fd93ed3991e0..41e0c56ef8e3 100644 --- a/Documentation/devicetree/bindings/iio/adc/rockchip-saradc.yaml +++ b/Documentation/devicetree/bindings/iio/adc/rockchip-saradc.yaml @@ -15,6 +15,8 @@ properties: - const: rockchip,saradc - const: rockchip,rk3066-tsadc - const: rockchip,rk3399-saradc + - const: rockchip,rk3528-saradc + - const: rockchip,rk3562-saradc - const: rockchip,rk3588-saradc - items: - const: rockchip,rk3576-saradc diff --git a/Documentation/devicetree/bindings/iio/adc/ti,ads7138.yaml b/Documentation/devicetree/bindings/iio/adc/ti,ads7138.yaml new file mode 100644 index 000000000000..a51893e207d4 --- /dev/null +++ b/Documentation/devicetree/bindings/iio/adc/ti,ads7138.yaml @@ -0,0 +1,63 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/iio/adc/ti,ads7138.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Texas Instruments ADS7128/ADS7138 analog-to-digital converter (ADC) + +maintainers: + - Tobias Sperling <tobias.sperling@softing.com> + +description: | + The ADS7128 and ADS7138 chips are 12-bit, 8 channel analog-to-digital + converters (ADC) with build-in digital window comparator (DWC), using the + I2C interface. + ADS7128 differs in the addition of further hardware features, like a + root-mean-square (RMS) and a zero-crossing-detect (ZCD) module. + + Datasheets: + https://www.ti.com/product/ADS7128 + https://www.ti.com/product/ADS7138 + +properties: + compatible: + enum: + - ti,ads7128 + - ti,ads7138 + + reg: + maxItems: 1 + + avdd-supply: + description: + The regulator used as analog supply voltage as well as reference voltage. + + interrupts: + description: + Interrupt on ALERT pin, triggers on low level. + maxItems: 1 + +required: + - compatible + - reg + - avdd-supply + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/irq.h> + i2c { + #address-cells = <1>; + #size-cells = <0>; + + adc@10 { + compatible = "ti,ads7138"; + reg = <0x10>; + avdd-supply = <®_stb_3v3>; + interrupt-parent = <&gpio2>; + interrupts = <12 IRQ_TYPE_LEVEL_LOW>; + }; + }; +... diff --git a/Documentation/devicetree/bindings/iio/dac/adi,ad5380.yaml b/Documentation/devicetree/bindings/iio/dac/adi,ad5380.yaml index 9eb9928500e2..3e323f1a5458 100644 --- a/Documentation/devicetree/bindings/iio/dac/adi,ad5380.yaml +++ b/Documentation/devicetree/bindings/iio/dac/adi,ad5380.yaml @@ -55,18 +55,18 @@ examples: #address-cells = <1>; #size-cells = <0>; dac@0 { - reg = <0>; - compatible = "adi,ad5390-5"; - vref-supply = <&dacvref>; + reg = <0>; + compatible = "adi,ad5390-5"; + vref-supply = <&dacvref>; }; }; - | i2c { - #address-cells = <1>; - #size-cells = <0>; - dac@42 { - reg = <0x42>; - compatible = "adi,ad5380-3"; - }; + #address-cells = <1>; + #size-cells = <0>; + dac@42 { + reg = <0x42>; + compatible = "adi,ad5380-3"; + }; }; ... diff --git a/Documentation/devicetree/bindings/iio/frequency/adf4371.yaml b/Documentation/devicetree/bindings/iio/frequency/adf4371.yaml index 1cb2adaf66f9..53d607441612 100644 --- a/Documentation/devicetree/bindings/iio/frequency/adf4371.yaml +++ b/Documentation/devicetree/bindings/iio/frequency/adf4371.yaml @@ -30,8 +30,9 @@ properties: clock-names: description: - Must be "clkin" - maxItems: 1 + Must be "clkin" if the input reference is single ended or "clkin-diff" + if the input reference is differential. + enum: [clkin, clkin-diff] adi,mute-till-lock-en: type: boolean diff --git a/Documentation/devicetree/bindings/iio/humidity/sciosense,ens210.yaml b/Documentation/devicetree/bindings/iio/humidity/sciosense,ens210.yaml index ed0ea938f7f8..1e25cf781cf1 100644 --- a/Documentation/devicetree/bindings/iio/humidity/sciosense,ens210.yaml +++ b/Documentation/devicetree/bindings/iio/humidity/sciosense,ens210.yaml @@ -43,13 +43,13 @@ additionalProperties: false examples: - | i2c { - #address-cells = <1>; - #size-cells = <0>; + #address-cells = <1>; + #size-cells = <0>; - temperature-sensor@43 { - compatible = "sciosense,ens210"; - reg = <0x43>; - }; + temperature-sensor@43 { + compatible = "sciosense,ens210"; + reg = <0x43>; + }; }; ... diff --git a/Documentation/devicetree/bindings/iio/imu/adi,adis16550.yaml b/Documentation/devicetree/bindings/iio/imu/adi,adis16550.yaml new file mode 100644 index 000000000000..a4c273c7a67f --- /dev/null +++ b/Documentation/devicetree/bindings/iio/imu/adi,adis16550.yaml @@ -0,0 +1,74 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/iio/imu/adi,adis16550.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Analog Devices ADIS16550 and similar IMUs + +maintainers: + - Nuno Sa <nuno.sa@analog.com> + - Ramona Gradinariu <ramona.gradinariu@analog.com> + - Antoniu Miclaus <antoniu.miclaus@analog.com> + - Robert Budai <robert.budai@analog.com> + +properties: + compatible: + enum: + - adi,adis16550 + + reg: + maxItems: 1 + + spi-cpha: true + + spi-cpol: true + + spi-max-frequency: + maximum: 15000000 + + vdd-supply: true + + interrupts: + maxItems: 1 + + reset-gpios: + description: + Active low RESET pin. + maxItems: 1 + + clocks: + description: If not provided, then the internal clock is used. + maxItems: 1 + +required: + - compatible + - reg + - interrupts + - spi-cpha + - spi-cpol + - spi-max-frequency + - vdd-supply + +allOf: + - $ref: /schemas/spi/spi-peripheral-props.yaml# + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/irq.h> + spi { + #address-cells = <1>; + #size-cells = <0>; + imu@0 { + compatible = "adi,adis16550"; + reg = <0>; + spi-max-frequency = <15000000>; + spi-cpol; + spi-cpha; + vdd-supply = <&vdd>; + interrupts = <4 IRQ_TYPE_EDGE_FALLING>; + interrupt-parent = <&gpio>; + }; + }; diff --git a/Documentation/devicetree/bindings/iio/light/brcm,apds9160.yaml b/Documentation/devicetree/bindings/iio/light/brcm,apds9160.yaml new file mode 100644 index 000000000000..bb1cc4404a55 --- /dev/null +++ b/Documentation/devicetree/bindings/iio/light/brcm,apds9160.yaml @@ -0,0 +1,78 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/iio/light/brcm,apds9160.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Broadcom Combined Proximity & Ambient light sensor + +maintainers: + - Mikael Gonella-Bolduc <m.gonella.bolduc@gmail.com> + +description: | + Datasheet: https://docs.broadcom.com/docs/APDS-9160-003-DS + +properties: + compatible: + enum: + - brcm,apds9160 + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + vdd-supply: true + + ps-cancellation-duration: + $ref: /schemas/types.yaml#/definitions/uint32 + description: + Proximity sensor cancellation pulse duration in half clock cycles. + This parameter determines a cancellation pulse duration. + The cancellation is applied in the integration phase to cancel out + unwanted reflected light from very near objects such as tempered glass + in front of the sensor. + default: 0 + maximum: 63 + + ps-cancellation-current-picoamp: + description: + Proximity sensor crosstalk cancellation current in picoampere. + This parameter adjusts the current in steps of 2400 pA up to 276000 pA. + The provided value must be a multiple of 2400 and in one of these ranges + [60000 - 96000] + [120000 - 156000] + [180000 - 216000] + [240000 - 276000] + This parameter is used in conjunction with the cancellation duration. + minimum: 60000 + maximum: 276000 + multipleOf: 2400 + +required: + - compatible + - reg + - vdd-supply + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/irq.h> + + i2c { + #address-cells = <1>; + #size-cells = <0>; + + light-sensor@53 { + compatible = "brcm,apds9160"; + reg = <0x53>; + vdd-supply = <&vdd_reg>; + interrupts = <29 IRQ_TYPE_EDGE_FALLING>; + interrupt-parent = <&pinctrl>; + ps-cancellation-duration = <10>; + ps-cancellation-current-picoamp = <62400>; + }; + }; +... diff --git a/Documentation/devicetree/bindings/iio/light/dynaimage,al3010.yaml b/Documentation/devicetree/bindings/iio/light/dynaimage,al3010.yaml index a3a979553e32..f1048c30e73e 100644 --- a/Documentation/devicetree/bindings/iio/light/dynaimage,al3010.yaml +++ b/Documentation/devicetree/bindings/iio/light/dynaimage,al3010.yaml @@ -4,14 +4,16 @@ $id: http://devicetree.org/schemas/iio/light/dynaimage,al3010.yaml# $schema: http://devicetree.org/meta-schemas/core.yaml# -title: Dyna-Image AL3010 sensor +title: Dyna-Image AL3000a/AL3010 sensor maintainers: - David Heidelberg <david@ixit.cz> properties: compatible: - const: dynaimage,al3010 + enum: + - dynaimage,al3000a + - dynaimage,al3010 reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/iio/magnetometer/silabs,si7210.yaml b/Documentation/devicetree/bindings/iio/magnetometer/silabs,si7210.yaml new file mode 100644 index 000000000000..d4a3f7981c36 --- /dev/null +++ b/Documentation/devicetree/bindings/iio/magnetometer/silabs,si7210.yaml @@ -0,0 +1,48 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/iio/magnetometer/silabs,si7210.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Si7210 magnetic position and temperature sensor + +maintainers: + - Antoni Pokusinski <apokusinski01@gmail.com> + +description: | + Silabs Si7210 I2C Hall effect magnetic position and temperature sensor. + https://www.silabs.com/documents/public/data-sheets/si7210-datasheet.pdf + +properties: + compatible: + const: silabs,si7210 + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + vdd-supply: + description: Regulator that provides power to the sensor + +required: + - compatible + - reg + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/irq.h> + i2c { + #address-cells = <1>; + #size-cells = <0>; + magnetometer@30 { + compatible = "silabs,si7210"; + reg = <0x30>; + interrupt-parent = <&gpio1>; + interrupts = <4 IRQ_TYPE_EDGE_FALLING>; + vdd-supply = <&vdd_3v3_reg>; + }; + }; diff --git a/Documentation/devicetree/bindings/iio/temperature/maxim,max31865.yaml b/Documentation/devicetree/bindings/iio/temperature/maxim,max31865.yaml index 7cc365e0ebc8..7c0c6ab6fc69 100644 --- a/Documentation/devicetree/bindings/iio/temperature/maxim,max31865.yaml +++ b/Documentation/devicetree/bindings/iio/temperature/maxim,max31865.yaml @@ -40,15 +40,15 @@ unevaluatedProperties: false examples: - | spi { - #address-cells = <1>; - #size-cells = <0>; - - temperature-sensor@0 { - compatible = "maxim,max31865"; - reg = <0>; - spi-max-frequency = <400000>; - spi-cpha; - maxim,3-wire; - }; + #address-cells = <1>; + #size-cells = <0>; + + temperature-sensor@0 { + compatible = "maxim,max31865"; + reg = <0>; + spi-max-frequency = <400000>; + spi-cpha; + maxim,3-wire; + }; }; ... diff --git a/Documentation/devicetree/bindings/iio/temperature/ti,tmp117.yaml b/Documentation/devicetree/bindings/iio/temperature/ti,tmp117.yaml index 58aa1542776b..fbba5e934861 100644 --- a/Documentation/devicetree/bindings/iio/temperature/ti,tmp117.yaml +++ b/Documentation/devicetree/bindings/iio/temperature/ti,tmp117.yaml @@ -44,8 +44,8 @@ examples: #size-cells = <0>; tmp117@48 { - compatible = "ti,tmp117"; - reg = <0x48>; - vcc-supply = <&pmic_reg_3v3>; + compatible = "ti,tmp117"; + reg = <0x48>; + vcc-supply = <&pmic_reg_3v3>; }; }; diff --git a/Documentation/devicetree/bindings/input/gpio-matrix-keypad.txt b/Documentation/devicetree/bindings/input/gpio-matrix-keypad.txt deleted file mode 100644 index 570dc10f0cd7..000000000000 --- a/Documentation/devicetree/bindings/input/gpio-matrix-keypad.txt +++ /dev/null @@ -1,49 +0,0 @@ -* GPIO driven matrix keypad device tree bindings - -GPIO driven matrix keypad is used to interface a SoC with a matrix keypad. -The matrix keypad supports multiple row and column lines, a key can be -placed at each intersection of a unique row and a unique column. The matrix -keypad can sense a key-press and key-release by means of GPIO lines and -report the event using GPIO interrupts to the cpu. - -Required Properties: -- compatible: Should be "gpio-matrix-keypad" -- row-gpios: List of gpios used as row lines. The gpio specifier - for this property depends on the gpio controller to - which these row lines are connected. -- col-gpios: List of gpios used as column lines. The gpio specifier - for this property depends on the gpio controller to - which these column lines are connected. -- linux,keymap: The definition can be found at - bindings/input/matrix-keymap.txt - -Optional Properties: -- linux,no-autorepeat: do no enable autorepeat feature. -- wakeup-source: use any event on keypad as wakeup event. - (Legacy property supported: "linux,wakeup") -- debounce-delay-ms: debounce interval in milliseconds -- col-scan-delay-us: delay, measured in microseconds, that is needed - before we can scan keypad after activating column gpio -- drive-inactive-cols: drive inactive columns during scan, - default is to turn inactive columns into inputs. - -Example: - matrix-keypad { - compatible = "gpio-matrix-keypad"; - debounce-delay-ms = <5>; - col-scan-delay-us = <2>; - - row-gpios = <&gpio2 25 0 - &gpio2 26 0 - &gpio2 27 0>; - - col-gpios = <&gpio2 21 0 - &gpio2 22 0>; - - linux,keymap = <0x0000008B - 0x0100009E - 0x02000069 - 0x0001006A - 0x0101001C - 0x0201006C>; - }; diff --git a/Documentation/devicetree/bindings/input/gpio-matrix-keypad.yaml b/Documentation/devicetree/bindings/input/gpio-matrix-keypad.yaml new file mode 100644 index 000000000000..ebfff9e42a36 --- /dev/null +++ b/Documentation/devicetree/bindings/input/gpio-matrix-keypad.yaml @@ -0,0 +1,103 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- + +$id: http://devicetree.org/schemas/input/gpio-matrix-keypad.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: GPIO matrix keypad + +maintainers: + - Marek Vasut <marek.vasut@gmail.com> + +description: + GPIO driven matrix keypad is used to interface a SoC with a matrix keypad. + The matrix keypad supports multiple row and column lines, a key can be + placed at each intersection of a unique row and a unique column. The matrix + keypad can sense a key-press and key-release by means of GPIO lines and + report the event using GPIO interrupts to the cpu. + +allOf: + - $ref: /schemas/input/matrix-keymap.yaml# + +properties: + compatible: + const: gpio-matrix-keypad + + row-gpios: + description: + List of GPIOs used as row lines. The gpio specifier for this property + depends on the gpio controller to which these row lines are connected. + + col-gpios: + description: + List of GPIOs used as column lines. The gpio specifier for this property + depends on the gpio controller to which these column lines are connected. + + linux,keymap: true + + linux,no-autorepeat: + type: boolean + description: Do not enable autorepeat feature. + + gpio-activelow: + type: boolean + description: + Force GPIO polarity to active low. + In the absence of this property GPIOs are treated as active high. + + debounce-delay-ms: + description: Debounce interval in milliseconds. + default: 0 + + col-scan-delay-us: + description: + Delay, measured in microseconds, that is needed + before we can scan keypad after activating column gpio. + default: 0 + + all-cols-on-delay-us: + description: + Delay, measured in microseconds, that is needed + after activating all column gpios. + default: 0 + + drive-inactive-cols: + type: boolean + description: + Drive inactive columns during scan, + default is to turn inactive columns into inputs. + + wakeup-source: true + +required: + - compatible + - row-gpios + - col-gpios + - linux,keymap + +additionalProperties: false + +examples: + - | + matrix-keypad { + compatible = "gpio-matrix-keypad"; + debounce-delay-ms = <5>; + col-scan-delay-us = <2>; + + row-gpios = <&gpio2 25 0 + &gpio2 26 0 + &gpio2 27 0>; + + col-gpios = <&gpio2 21 0 + &gpio2 22 0>; + + linux,keymap = <0x0000008B + 0x0100009E + 0x02000069 + 0x0001006A + 0x0101001C + 0x0201006C>; + + wakeup-source; + }; diff --git a/Documentation/devicetree/bindings/input/qcom,pm8921-keypad.yaml b/Documentation/devicetree/bindings/input/qcom,pm8921-keypad.yaml index 88764adcd696..e03611eef93d 100644 --- a/Documentation/devicetree/bindings/input/qcom,pm8921-keypad.yaml +++ b/Documentation/devicetree/bindings/input/qcom,pm8921-keypad.yaml @@ -62,28 +62,28 @@ unevaluatedProperties: false examples: - | - #include <dt-bindings/input/input.h> - #include <dt-bindings/interrupt-controller/irq.h> - pmic { - #address-cells = <1>; - #size-cells = <0>; + #include <dt-bindings/input/input.h> + #include <dt-bindings/interrupt-controller/irq.h> + pmic { + #address-cells = <1>; + #size-cells = <0>; - keypad@148 { - compatible = "qcom,pm8921-keypad"; - reg = <0x148>; - interrupt-parent = <&pmicintc>; - interrupts = <74 IRQ_TYPE_EDGE_RISING>, <75 IRQ_TYPE_EDGE_RISING>; - linux,keymap = < - MATRIX_KEY(0, 0, KEY_VOLUMEUP) - MATRIX_KEY(0, 1, KEY_VOLUMEDOWN) - MATRIX_KEY(0, 2, KEY_CAMERA_FOCUS) - MATRIX_KEY(0, 3, KEY_CAMERA) - >; - keypad,num-rows = <1>; - keypad,num-columns = <5>; - debounce = <15>; - scan-delay = <32>; - row-hold = <91500>; - }; - }; + keypad@148 { + compatible = "qcom,pm8921-keypad"; + reg = <0x148>; + interrupt-parent = <&pmicintc>; + interrupts = <74 IRQ_TYPE_EDGE_RISING>, <75 IRQ_TYPE_EDGE_RISING>; + linux,keymap = < + MATRIX_KEY(0, 0, KEY_VOLUMEUP) + MATRIX_KEY(0, 1, KEY_VOLUMEDOWN) + MATRIX_KEY(0, 2, KEY_CAMERA_FOCUS) + MATRIX_KEY(0, 3, KEY_CAMERA) + >; + keypad,num-rows = <1>; + keypad,num-columns = <5>; + debounce = <15>; + scan-delay = <32>; + row-hold = <91500>; + }; + }; ... diff --git a/Documentation/devicetree/bindings/input/qcom,pm8921-pwrkey.yaml b/Documentation/devicetree/bindings/input/qcom,pm8921-pwrkey.yaml index 12c74c083258..64590894857a 100644 --- a/Documentation/devicetree/bindings/input/qcom,pm8921-pwrkey.yaml +++ b/Documentation/devicetree/bindings/input/qcom,pm8921-pwrkey.yaml @@ -52,24 +52,24 @@ unevaluatedProperties: false examples: - | - #include <dt-bindings/interrupt-controller/irq.h> - ssbi { - #address-cells = <1>; - #size-cells = <0>; + #include <dt-bindings/interrupt-controller/irq.h> + ssbi { + #address-cells = <1>; + #size-cells = <0>; - pmic@0 { - reg = <0x0>; - #address-cells = <1>; - #size-cells = <0>; + pmic@0 { + reg = <0x0>; + #address-cells = <1>; + #size-cells = <0>; - pwrkey@1c { - compatible = "qcom,pm8921-pwrkey"; - reg = <0x1c>; - interrupt-parent = <&pmicint>; - interrupts = <50 IRQ_TYPE_EDGE_RISING>, <51 IRQ_TYPE_EDGE_RISING>; - debounce = <15625>; - pull-up; - }; - }; - }; + pwrkey@1c { + compatible = "qcom,pm8921-pwrkey"; + reg = <0x1c>; + interrupt-parent = <&pmicint>; + interrupts = <50 IRQ_TYPE_EDGE_RISING>, <51 IRQ_TYPE_EDGE_RISING>; + debounce = <15625>; + pull-up; + }; + }; + }; ... diff --git a/Documentation/devicetree/bindings/input/touchscreen/apple,z2-multitouch.yaml b/Documentation/devicetree/bindings/input/touchscreen/apple,z2-multitouch.yaml new file mode 100644 index 000000000000..402ca6bffd34 --- /dev/null +++ b/Documentation/devicetree/bindings/input/touchscreen/apple,z2-multitouch.yaml @@ -0,0 +1,70 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/input/touchscreen/apple,z2-multitouch.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Apple touchscreens attached using the Z2 protocol + +maintainers: + - Sasha Finkelstein <fnkl.kernel@gmail.com> + +description: A series of touschscreen controllers used in Apple products + +allOf: + - $ref: touchscreen.yaml# + - $ref: /schemas/spi/spi-peripheral-props.yaml# + +properties: + compatible: + enum: + - apple,j293-touchbar + - apple,j493-touchbar + + interrupts: + maxItems: 1 + + reset-gpios: + maxItems: 1 + + firmware-name: + maxItems: 1 + + apple,z2-cal-blob: + $ref: /schemas/types.yaml#/definitions/uint8-array + maxItems: 4096 + description: + Calibration blob supplied by the bootloader + +required: + - compatible + - interrupts + - reset-gpios + - firmware-name + - touchscreen-size-x + - touchscreen-size-y + +unevaluatedProperties: false + +examples: + - | + #include <dt-bindings/gpio/gpio.h> + #include <dt-bindings/interrupt-controller/irq.h> + + spi { + #address-cells = <1>; + #size-cells = <0>; + + touchscreen@0 { + compatible = "apple,j293-touchbar"; + reg = <0>; + spi-max-frequency = <11500000>; + reset-gpios = <&pinctrl_ap 139 GPIO_ACTIVE_LOW>; + interrupts-extended = <&pinctrl_ap 194 IRQ_TYPE_EDGE_FALLING>; + firmware-name = "apple/dfrmtfw-j293.bin"; + touchscreen-size-x = <23045>; + touchscreen-size-y = <640>; + }; + }; + +... diff --git a/Documentation/devicetree/bindings/input/touchscreen/goodix,gt9916.yaml b/Documentation/devicetree/bindings/input/touchscreen/goodix,gt9916.yaml index d90f045ac06c..c40d92b7f4af 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/goodix,gt9916.yaml +++ b/Documentation/devicetree/bindings/input/touchscreen/goodix,gt9916.yaml @@ -19,6 +19,7 @@ allOf: properties: compatible: enum: + - goodix,gt9897 - goodix,gt9916 reg: diff --git a/Documentation/devicetree/bindings/input/touchscreen/ti,ads7843.yaml b/Documentation/devicetree/bindings/input/touchscreen/ti,ads7843.yaml index 604921733d2c..8f6335d7da1c 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/ti,ads7843.yaml +++ b/Documentation/devicetree/bindings/input/touchscreen/ti,ads7843.yaml @@ -164,20 +164,20 @@ examples: #size-cells = <0>; touchscreen@0 { - compatible = "ti,tsc2046"; - reg = <0>; /* CS0 */ - interrupt-parent = <&gpio1>; - interrupts = <8 0>; /* BOOT6 / GPIO 8 */ - pendown-gpio = <&gpio1 8 0>; - spi-max-frequency = <1000000>; - vcc-supply = <®_vcc3>; - wakeup-source; - - ti,pressure-max = /bits/ 16 <255>; - ti,x-max = /bits/ 16 <8000>; - ti,x-min = /bits/ 16 <0>; - ti,x-plate-ohms = /bits/ 16 <40>; - ti,y-max = /bits/ 16 <4800>; - ti,y-min = /bits/ 16 <0>; - }; + compatible = "ti,tsc2046"; + reg = <0>; /* CS0 */ + interrupt-parent = <&gpio1>; + interrupts = <8 0>; /* BOOT6 / GPIO 8 */ + pendown-gpio = <&gpio1 8 0>; + spi-max-frequency = <1000000>; + vcc-supply = <®_vcc3>; + wakeup-source; + + ti,pressure-max = /bits/ 16 <255>; + ti,x-max = /bits/ 16 <8000>; + ti,x-min = /bits/ 16 <0>; + ti,x-plate-ohms = /bits/ 16 <40>; + ti,y-max = /bits/ 16 <4800>; + ti,y-min = /bits/ 16 <0>; + }; }; diff --git a/Documentation/devicetree/bindings/mfd/aspeed-lpc.yaml b/Documentation/devicetree/bindings/mfd/aspeed-lpc.yaml index 5dfe77aca167..d88854e60b7f 100644 --- a/Documentation/devicetree/bindings/mfd/aspeed-lpc.yaml +++ b/Documentation/devicetree/bindings/mfd/aspeed-lpc.yaml @@ -1,5 +1,5 @@ # SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) -# # Copyright (c) 2021 Aspeed Tehchnology Inc. +# # Copyright (c) 2021 Aspeed Technology Inc. %YAML 1.2 --- $id: http://devicetree.org/schemas/mfd/aspeed-lpc.yaml# diff --git a/Documentation/devicetree/bindings/phy/allwinner,sun50i-a64-usb-phy.yaml b/Documentation/devicetree/bindings/phy/allwinner,sun50i-a64-usb-phy.yaml index 21209126ed00..580c3296a18d 100644 --- a/Documentation/devicetree/bindings/phy/allwinner,sun50i-a64-usb-phy.yaml +++ b/Documentation/devicetree/bindings/phy/allwinner,sun50i-a64-usb-phy.yaml @@ -20,7 +20,9 @@ properties: - allwinner,sun20i-d1-usb-phy - allwinner,sun50i-a64-usb-phy - items: - - const: allwinner,sun50i-a100-usb-phy + - enum: + - allwinner,sun50i-a100-usb-phy + - allwinner,sun55i-a523-usb-phy - const: allwinner,sun20i-d1-usb-phy reg: diff --git a/Documentation/devicetree/bindings/phy/phy-rockchip-naneng-combphy.yaml b/Documentation/devicetree/bindings/phy/phy-rockchip-naneng-combphy.yaml index 1b3de6678c08..888e6b2aac5a 100644 --- a/Documentation/devicetree/bindings/phy/phy-rockchip-naneng-combphy.yaml +++ b/Documentation/devicetree/bindings/phy/phy-rockchip-naneng-combphy.yaml @@ -12,6 +12,7 @@ maintainers: properties: compatible: enum: + - rockchip,rk3562-naneng-combphy - rockchip,rk3568-naneng-combphy - rockchip,rk3576-naneng-combphy - rockchip,rk3588-naneng-combphy diff --git a/Documentation/devicetree/bindings/phy/qcom,ipq5332-uniphy-pcie-phy.yaml b/Documentation/devicetree/bindings/phy/qcom,ipq5332-uniphy-pcie-phy.yaml new file mode 100644 index 000000000000..e39168d55d23 --- /dev/null +++ b/Documentation/devicetree/bindings/phy/qcom,ipq5332-uniphy-pcie-phy.yaml @@ -0,0 +1,76 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/phy/qcom,ipq5332-uniphy-pcie-phy.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Qualcomm UNIPHY PCIe 28LP PHY + +maintainers: + - Nitheesh Sekar <quic_nsekar@quicinc.com> + - Varadarajan Narayanan <quic_varada@quicinc.com> + +description: + PCIe and USB combo PHY found in Qualcomm IPQ5332 SoC + +properties: + compatible: + enum: + - qcom,ipq5332-uniphy-pcie-phy + + reg: + maxItems: 1 + + clocks: + items: + - description: pcie pipe clock + - description: pcie ahb clock + + resets: + items: + - description: phy reset + - description: ahb reset + - description: cfg reset + + "#phy-cells": + const: 0 + + "#clock-cells": + const: 0 + + num-lanes: + $ref: /schemas/types.yaml#/definitions/uint32 + enum: [1, 2] + +required: + - compatible + - reg + - clocks + - resets + - "#phy-cells" + - "#clock-cells" + - num-lanes + +additionalProperties: false + +examples: + - | + #include <dt-bindings/clock/qcom,ipq5332-gcc.h> + + pcie0_phy: phy@4b0000 { + compatible = "qcom,ipq5332-uniphy-pcie-phy"; + reg = <0x004b0000 0x800>; + + clocks = <&gcc GCC_PCIE3X1_0_PIPE_CLK>, + <&gcc GCC_PCIE3X1_PHY_AHB_CLK>; + + resets = <&gcc GCC_PCIE3X1_0_PHY_BCR>, + <&gcc GCC_PCIE3X1_PHY_AHB_CLK_ARES>, + <&gcc GCC_PCIE3X1_0_PHY_PHY_BCR>; + + #clock-cells = <0>; + + #phy-cells = <0>; + + num-lanes = <1>; + }; diff --git a/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml b/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml index 89391649e0b5..2c6c9296e4c0 100644 --- a/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml +++ b/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml @@ -17,6 +17,7 @@ properties: compatible: enum: - qcom,qcs615-qmp-gen3x1-pcie-phy + - qcom,qcs8300-qmp-gen4x2-pcie-phy - qcom,sa8775p-qmp-gen4x2-pcie-phy - qcom,sa8775p-qmp-gen4x4-pcie-phy - qcom,sar2130p-qmp-gen3x2-pcie-phy @@ -45,6 +46,7 @@ properties: - qcom,x1e80100-qmp-gen4x2-pcie-phy - qcom,x1e80100-qmp-gen4x4-pcie-phy - qcom,x1e80100-qmp-gen4x8-pcie-phy + - qcom,x1p42100-qmp-gen4x4-pcie-phy reg: minItems: 1 @@ -124,6 +126,7 @@ allOf: enum: - qcom,sc8280xp-qmp-gen3x4-pcie-phy - qcom,x1e80100-qmp-gen4x4-pcie-phy + - qcom,x1p42100-qmp-gen4x4-pcie-phy then: properties: reg: @@ -180,6 +183,7 @@ allOf: - qcom,x1e80100-qmp-gen4x2-pcie-phy - qcom,x1e80100-qmp-gen4x4-pcie-phy - qcom,x1e80100-qmp-gen4x8-pcie-phy + - qcom,x1p42100-qmp-gen4x4-pcie-phy then: properties: clocks: @@ -192,6 +196,7 @@ allOf: compatible: contains: enum: + - qcom,qcs8300-qmp-gen4x2-pcie-phy - qcom,sa8775p-qmp-gen4x2-pcie-phy - qcom,sa8775p-qmp-gen4x4-pcie-phy then: @@ -217,12 +222,6 @@ allOf: minItems: 2 reset-names: minItems: 2 - else: - properties: - resets: - maxItems: 1 - reset-names: - maxItems: 1 - if: properties: diff --git a/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-ufs-phy.yaml b/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-ufs-phy.yaml index 72bed2933b03..a58370a6a5d3 100644 --- a/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-ufs-phy.yaml +++ b/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-ufs-phy.yaml @@ -44,6 +44,7 @@ properties: - qcom,sm8475-qmp-ufs-phy - qcom,sm8550-qmp-ufs-phy - qcom,sm8650-qmp-ufs-phy + - qcom,sm8750-qmp-ufs-phy reg: maxItems: 1 @@ -111,6 +112,7 @@ allOf: - qcom,sm8475-qmp-ufs-phy - qcom,sm8550-qmp-ufs-phy - qcom,sm8650-qmp-ufs-phy + - qcom,sm8750-qmp-ufs-phy then: properties: clocks: diff --git a/Documentation/devicetree/bindings/phy/rockchip,rk3588-hdptx-phy.yaml b/Documentation/devicetree/bindings/phy/rockchip,rk3588-hdptx-phy.yaml index 84fe59dbcf48..7a307f45cdec 100644 --- a/Documentation/devicetree/bindings/phy/rockchip,rk3588-hdptx-phy.yaml +++ b/Documentation/devicetree/bindings/phy/rockchip,rk3588-hdptx-phy.yaml @@ -11,8 +11,13 @@ maintainers: properties: compatible: - enum: - - rockchip,rk3588-hdptx-phy + oneOf: + - enum: + - rockchip,rk3588-hdptx-phy + - items: + - enum: + - rockchip,rk3576-hdptx-phy + - const: rockchip,rk3588-hdptx-phy reg: maxItems: 1 @@ -34,24 +39,12 @@ properties: const: 0 resets: - items: - - description: PHY reset line - - description: APB reset line - - description: INIT reset line - - description: CMN reset line - - description: LANE reset line - - description: ROPLL reset line - - description: LCPLL reset line + minItems: 4 + maxItems: 7 reset-names: - items: - - const: phy - - const: apb - - const: init - - const: cmn - - const: lane - - const: ropll - - const: lcpll + minItems: 4 + maxItems: 7 rockchip,grf: $ref: /schemas/types.yaml#/definitions/phandle @@ -67,6 +60,39 @@ required: - reset-names - rockchip,grf +allOf: + - if: + properties: + compatible: + contains: + enum: + - rockchip,rk3576-hdptx-phy + then: + properties: + resets: + minItems: 4 + maxItems: 4 + reset-names: + items: + - const: apb + - const: init + - const: cmn + - const: lane + else: + properties: + resets: + minItems: 7 + maxItems: 7 + reset-names: + items: + - const: phy + - const: apb + - const: init + - const: cmn + - const: lane + - const: ropll + - const: lcpll + additionalProperties: false examples: diff --git a/Documentation/devicetree/bindings/phy/rockchip,rk3588-mipi-dcphy.yaml b/Documentation/devicetree/bindings/phy/rockchip,rk3588-mipi-dcphy.yaml new file mode 100644 index 000000000000..c8ff5ba22a86 --- /dev/null +++ b/Documentation/devicetree/bindings/phy/rockchip,rk3588-mipi-dcphy.yaml @@ -0,0 +1,87 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/phy/rockchip,rk3588-mipi-dcphy.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Rockchip MIPI D-/C-PHY with Samsung IP block + +maintainers: + - Guochun Huang <hero.huang@rock-chips.com> + - Heiko Stuebner <heiko@sntech.de> + +properties: + compatible: + enum: + - rockchip,rk3576-mipi-dcphy + - rockchip,rk3588-mipi-dcphy + + reg: + maxItems: 1 + + "#phy-cells": + const: 1 + description: | + Argument is mode to operate in. Supported modes are: + - PHY_TYPE_DPHY + - PHY_TYPE_CPHY + See include/dt-bindings/phy/phy.h for constants. + + clocks: + maxItems: 2 + + clock-names: + items: + - const: pclk + - const: ref + + resets: + maxItems: 4 + + reset-names: + items: + - const: m_phy + - const: apb + - const: grf + - const: s_phy + + rockchip,grf: + $ref: /schemas/types.yaml#/definitions/phandle + description: + Phandle to the syscon managing the 'mipi dcphy general register files'. + +required: + - compatible + - reg + - clocks + - clock-names + - resets + - reset-names + - "#phy-cells" + +additionalProperties: false + +examples: + - | + #include <dt-bindings/clock/rockchip,rk3588-cru.h> + #include <dt-bindings/reset/rockchip,rk3588-cru.h> + + soc { + #address-cells = <2>; + #size-cells = <2>; + + phy@feda0000 { + compatible = "rockchip,rk3588-mipi-dcphy"; + reg = <0x0 0xfeda0000 0x0 0x10000>; + clocks = <&cru PCLK_MIPI_DCPHY0>, + <&cru CLK_USBDPPHY_MIPIDCPPHY_REF>; + clock-names = "pclk", "ref"; + resets = <&cru SRST_M_MIPI_DCPHY0>, + <&cru SRST_P_MIPI_DCPHY0>, + <&cru SRST_P_MIPI_DCPHY0_GRF>, + <&cru SRST_S_MIPI_DCPHY0>; + reset-names = "m_phy", "apb", "grf", "s_phy"; + rockchip,grf = <&mipidcphy0_grf>; + #phy-cells = <1>; + }; + }; diff --git a/Documentation/devicetree/bindings/phy/samsung,ufs-phy.yaml b/Documentation/devicetree/bindings/phy/samsung,ufs-phy.yaml index f402e31bf58d..d70ffeb6e824 100644 --- a/Documentation/devicetree/bindings/phy/samsung,ufs-phy.yaml +++ b/Documentation/devicetree/bindings/phy/samsung,ufs-phy.yaml @@ -18,6 +18,7 @@ properties: - google,gs101-ufs-phy - samsung,exynos7-ufs-phy - samsung,exynosautov9-ufs-phy + - samsung,exynosautov920-ufs-phy - tesla,fsd-ufs-phy reg: diff --git a/Documentation/devicetree/bindings/phy/samsung,usb3-drd-phy.yaml b/Documentation/devicetree/bindings/phy/samsung,usb3-drd-phy.yaml index 16321cdd4919..27295acbba76 100644 --- a/Documentation/devicetree/bindings/phy/samsung,usb3-drd-phy.yaml +++ b/Documentation/devicetree/bindings/phy/samsung,usb3-drd-phy.yaml @@ -83,14 +83,19 @@ properties: pll-supply: description: Power supply for the USB PLL. + dvdd-usb20-supply: description: DVDD power supply for the USB 2.0 phy. + vddh-usb20-supply: description: VDDh power supply for the USB 2.0 phy. + vdd33-usb20-supply: description: 3.3V power supply for the USB 2.0 phy. + vdda-usbdp-supply: description: VDDa power supply for the USB DP phy. + vddh-usbdp-supply: description: VDDh power supply for the USB DP phy. @@ -109,6 +114,8 @@ allOf: contains: const: google,gs101-usb31drd-phy then: + $ref: /schemas/usb/usb-switch.yaml# + properties: clocks: items: @@ -117,6 +124,7 @@ allOf: - description: Gate of control interface AXI clock - description: Gate of control interface APB clock - description: Gate of SCL APB clock + clock-names: items: - const: phy @@ -124,12 +132,17 @@ allOf: - const: ctrl_aclk - const: ctrl_pclk - const: scl_pclk + reg: minItems: 3 + reg-names: minItems: 3 + required: - reg-names + - orientation-switch + - port - pll-supply - dvdd-usb20-supply - vddh-usb20-supply @@ -149,6 +162,7 @@ allOf: clocks: minItems: 5 maxItems: 5 + clock-names: items: - const: phy @@ -156,8 +170,10 @@ allOf: - const: phy_utmi - const: phy_pipe - const: itp + reg: maxItems: 1 + reg-names: maxItems: 1 @@ -174,16 +190,19 @@ allOf: clocks: minItems: 2 maxItems: 2 + clock-names: items: - const: phy - const: ref + reg: maxItems: 1 + reg-names: maxItems: 1 -additionalProperties: false +unevaluatedProperties: false examples: - | diff --git a/Documentation/devicetree/bindings/power/wakeup-source.txt b/Documentation/devicetree/bindings/power/wakeup-source.txt index 27f1797be963..66bb016305f9 100644 --- a/Documentation/devicetree/bindings/power/wakeup-source.txt +++ b/Documentation/devicetree/bindings/power/wakeup-source.txt @@ -23,7 +23,7 @@ List of legacy properties and respective binding document 1. "gpio-key,wakeup" Documentation/devicetree/bindings/input/gpio-keys{,-polled}.txt 2. "has-tpo" Documentation/devicetree/bindings/rtc/rtc-opal.txt -3. "linux,wakeup" Documentation/devicetree/bindings/input/gpio-matrix-keypad.txt +3. "linux,wakeup" Documentation/devicetree/bindings/input/gpio-matrix-keypad.yaml Documentation/devicetree/bindings/mfd/tc3589x.txt Documentation/devicetree/bindings/input/touchscreen/ti,ads7843.yaml 4. "linux,keypad-wakeup" Documentation/devicetree/bindings/input/qcom,pm8921-keypad.yaml diff --git a/Documentation/devicetree/bindings/powerpc/fsl/dma.txt b/Documentation/devicetree/bindings/powerpc/fsl/dma.txt deleted file mode 100644 index c11ad5c6db21..000000000000 --- a/Documentation/devicetree/bindings/powerpc/fsl/dma.txt +++ /dev/null @@ -1,204 +0,0 @@ -* Freescale DMA Controllers - -** Freescale Elo DMA Controller - This is a little-endian 4-channel DMA controller, used in Freescale mpc83xx - series chips such as mpc8315, mpc8349, mpc8379 etc. - -Required properties: - -- compatible : must include "fsl,elo-dma" -- reg : DMA General Status Register, i.e. DGSR which contains - status for all the 4 DMA channels -- ranges : describes the mapping between the address space of the - DMA channels and the address space of the DMA controller -- cell-index : controller index. 0 for controller @ 0x8100 -- interrupts : interrupt specifier for DMA IRQ - -- DMA channel nodes: - - compatible : must include "fsl,elo-dma-channel" - However, see note below. - - reg : DMA channel specific registers - - cell-index : DMA channel index starts at 0. - -Optional properties: - - interrupts : interrupt specifier for DMA channel IRQ - (on 83xx this is expected to be identical to - the interrupts property of the parent node) - -Example: - dma@82a8 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "fsl,mpc8349-dma", "fsl,elo-dma"; - reg = <0x82a8 4>; - ranges = <0 0x8100 0x1a4>; - interrupt-parent = <&ipic>; - interrupts = <71 8>; - cell-index = <0>; - dma-channel@0 { - compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; - cell-index = <0>; - reg = <0 0x80>; - interrupt-parent = <&ipic>; - interrupts = <71 8>; - }; - dma-channel@80 { - compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; - cell-index = <1>; - reg = <0x80 0x80>; - interrupt-parent = <&ipic>; - interrupts = <71 8>; - }; - dma-channel@100 { - compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; - cell-index = <2>; - reg = <0x100 0x80>; - interrupt-parent = <&ipic>; - interrupts = <71 8>; - }; - dma-channel@180 { - compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; - cell-index = <3>; - reg = <0x180 0x80>; - interrupt-parent = <&ipic>; - interrupts = <71 8>; - }; - }; - -** Freescale EloPlus DMA Controller - This is a 4-channel DMA controller with extended addresses and chaining, - mainly used in Freescale mpc85xx/86xx, Pxxx and BSC series chips, such as - mpc8540, mpc8641 p4080, bsc9131 etc. - -Required properties: - -- compatible : must include "fsl,eloplus-dma" -- reg : DMA General Status Register, i.e. DGSR which contains - status for all the 4 DMA channels -- cell-index : controller index. 0 for controller @ 0x21000, - 1 for controller @ 0xc000 -- ranges : describes the mapping between the address space of the - DMA channels and the address space of the DMA controller - -- DMA channel nodes: - - compatible : must include "fsl,eloplus-dma-channel" - However, see note below. - - cell-index : DMA channel index starts at 0. - - reg : DMA channel specific registers - - interrupts : interrupt specifier for DMA channel IRQ - -Example: - dma@21300 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "fsl,mpc8540-dma", "fsl,eloplus-dma"; - reg = <0x21300 4>; - ranges = <0 0x21100 0x200>; - cell-index = <0>; - dma-channel@0 { - compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; - reg = <0 0x80>; - cell-index = <0>; - interrupt-parent = <&mpic>; - interrupts = <20 2>; - }; - dma-channel@80 { - compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; - reg = <0x80 0x80>; - cell-index = <1>; - interrupt-parent = <&mpic>; - interrupts = <21 2>; - }; - dma-channel@100 { - compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; - reg = <0x100 0x80>; - cell-index = <2>; - interrupt-parent = <&mpic>; - interrupts = <22 2>; - }; - dma-channel@180 { - compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; - reg = <0x180 0x80>; - cell-index = <3>; - interrupt-parent = <&mpic>; - interrupts = <23 2>; - }; - }; - -** Freescale Elo3 DMA Controller - DMA controller which has same function as EloPlus except that Elo3 has 8 - channels while EloPlus has only 4, it is used in Freescale Txxx and Bxxx - series chips, such as t1040, t4240, b4860. - -Required properties: - -- compatible : must include "fsl,elo3-dma" -- reg : contains two entries for DMA General Status Registers, - i.e. DGSR0 which includes status for channel 1~4, and - DGSR1 for channel 5~8 -- ranges : describes the mapping between the address space of the - DMA channels and the address space of the DMA controller - -- DMA channel nodes: - - compatible : must include "fsl,eloplus-dma-channel" - - reg : DMA channel specific registers - - interrupts : interrupt specifier for DMA channel IRQ - -Example: -dma@100300 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "fsl,elo3-dma"; - reg = <0x100300 0x4>, - <0x100600 0x4>; - ranges = <0x0 0x100100 0x500>; - dma-channel@0 { - compatible = "fsl,eloplus-dma-channel"; - reg = <0x0 0x80>; - interrupts = <28 2 0 0>; - }; - dma-channel@80 { - compatible = "fsl,eloplus-dma-channel"; - reg = <0x80 0x80>; - interrupts = <29 2 0 0>; - }; - dma-channel@100 { - compatible = "fsl,eloplus-dma-channel"; - reg = <0x100 0x80>; - interrupts = <30 2 0 0>; - }; - dma-channel@180 { - compatible = "fsl,eloplus-dma-channel"; - reg = <0x180 0x80>; - interrupts = <31 2 0 0>; - }; - dma-channel@300 { - compatible = "fsl,eloplus-dma-channel"; - reg = <0x300 0x80>; - interrupts = <76 2 0 0>; - }; - dma-channel@380 { - compatible = "fsl,eloplus-dma-channel"; - reg = <0x380 0x80>; - interrupts = <77 2 0 0>; - }; - dma-channel@400 { - compatible = "fsl,eloplus-dma-channel"; - reg = <0x400 0x80>; - interrupts = <78 2 0 0>; - }; - dma-channel@480 { - compatible = "fsl,eloplus-dma-channel"; - reg = <0x480 0x80>; - interrupts = <79 2 0 0>; - }; -}; - -Note on DMA channel compatible properties: The compatible property must say -"fsl,elo-dma-channel" or "fsl,eloplus-dma-channel" to be used by the Elo DMA -driver (fsldma). Any DMA channel used by fsldma cannot be used by another -DMA driver, such as the SSI sound drivers for the MPC8610. Therefore, any DMA -channel that should be used for another driver should not use -"fsl,elo-dma-channel" or "fsl,eloplus-dma-channel". For the SSI drivers, for -example, the compatible property should be "fsl,ssi-dma-channel". See ssi.txt -for more information. diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml index a63b994e0763..bcab59e0cc2e 100644 --- a/Documentation/devicetree/bindings/riscv/extensions.yaml +++ b/Documentation/devicetree/bindings/riscv/extensions.yaml @@ -224,6 +224,12 @@ properties: as ratified at commit 4a69197e5617 ("Update to ratified state") of riscv-svvptc. + - const: zaamo + description: | + The standard Zaamo extension for atomic memory operations as + ratified at commit e87412e621f1 ("integrate Zaamo and Zalrsc text + (#1304)") of the unprivileged ISA specification. + - const: zabha description: | The Zabha extension for Byte and Halfword Atomic Memory Operations @@ -236,6 +242,12 @@ properties: is supported as ratified at commit 5059e0ca641c ("update to ratified") of the riscv-zacas. + - const: zalrsc + description: | + The standard Zalrsc extension for load-reserved/store-conditional as + ratified at commit e87412e621f1 ("integrate Zaamo and Zalrsc text + (#1304)") of the unprivileged ISA specification. + - const: zawrs description: | The Zawrs extension for entering a low-power state or for trapping @@ -329,6 +341,12 @@ properties: instructions, as ratified in commit 056b6ff ("Zfa is ratified") of riscv-isa-manual. + - const: zfbfmin + description: + The standard Zfbfmin extension which provides minimal support for + 16-bit half-precision brain floating-point instructions, as ratified + in commit 4dc23d62 ("Added Chapter title to BF16") of riscv-isa-manual. + - const: zfh description: The standard Zfh extension for 16-bit half-precision binary @@ -525,6 +543,18 @@ properties: in commit 6f702a2 ("Vector extensions are now ratified") of riscv-v-spec. + - const: zvfbfmin + description: + The standard Zvfbfmin extension for minimal support for vectored + 16-bit half-precision brain floating-point instructions, as ratified + in commit 4dc23d62 ("Added Chapter title to BF16") of riscv-isa-manual. + + - const: zvfbfwma + description: + The standard Zvfbfwma extension for vectored half-precision brain + floating-point widening multiply-accumulate instructions, as ratified + in commit 4dc23d62 ("Added Chapter title to BF16") of riscv-isa-manual. + - const: zvfh description: The standard Zvfh extension for vectored half-precision @@ -639,6 +669,12 @@ properties: https://github.com/T-head-Semi/thead-extension-spec/blob/95358cb2cca9489361c61d335e03d3134b14133f/xtheadvector.adoc. allOf: + - if: + contains: + const: d + then: + contains: + const: f # Zcb depends on Zca - if: contains: @@ -673,6 +709,119 @@ properties: then: contains: const: zca + # Zfbfmin depends on F + - if: + contains: + const: zfbfmin + then: + contains: + const: f + # Zvfbfmin depends on V or Zve32f + - if: + contains: + const: zvfbfmin + then: + oneOf: + - contains: + const: v + - contains: + const: zve32f + # Zvfbfwma depends on Zfbfmin and Zvfbfmin + - if: + contains: + const: zvfbfwma + then: + allOf: + - contains: + const: zfbfmin + - contains: + const: zvfbfmin + # Zacas depends on Zaamo + - if: + contains: + const: zacas + then: + contains: + const: zaamo + + - if: + contains: + const: zve32x + then: + contains: + const: zicsr + + - if: + contains: + const: zve32f + then: + allOf: + - contains: + const: f + - contains: + const: zve32x + + - if: + contains: + const: zve64x + then: + contains: + const: zve32x + + - if: + contains: + const: zve64f + then: + allOf: + - contains: + const: f + - contains: + const: zve32f + - contains: + const: zve64x + + - if: + contains: + const: zve64d + then: + allOf: + - contains: + const: d + - contains: + const: zve64f + + - if: + contains: + anyOf: + - const: zvbc + - const: zvkn + - const: zvknc + - const: zvkng + - const: zvknhb + - const: zvksc + then: + contains: + anyOf: + - const: v + - const: zve64x + + - if: + contains: + anyOf: + - const: zvbb + - const: zvkb + - const: zvkg + - const: zvkned + - const: zvknha + - const: zvksed + - const: zvksh + - const: zvks + - const: zvkt + then: + contains: + anyOf: + - const: v + - const: zve32x allOf: # Zcf extension does not exist on rv64 diff --git a/Documentation/devicetree/bindings/rtc/adi,max31335.yaml b/Documentation/devicetree/bindings/rtc/adi,max31335.yaml index 0125cf6727cc..bce7558d0d87 100644 --- a/Documentation/devicetree/bindings/rtc/adi,max31335.yaml +++ b/Documentation/devicetree/bindings/rtc/adi,max31335.yaml @@ -18,7 +18,9 @@ allOf: properties: compatible: - const: adi,max31335 + enum: + - adi,max31331 + - adi,max31335 reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/rtc/nxp,pcf2127.yaml b/Documentation/devicetree/bindings/rtc/nxp,pcf2127.yaml index 2d9fe5a75b06..11fcf0ca1ae0 100644 --- a/Documentation/devicetree/bindings/rtc/nxp,pcf2127.yaml +++ b/Documentation/devicetree/bindings/rtc/nxp,pcf2127.yaml @@ -8,6 +8,7 @@ title: NXP PCF2127 Real Time Clock allOf: - $ref: rtc.yaml# + - $ref: /schemas/spi/spi-peripheral-props.yaml# maintainers: - Alexandre Belloni <alexandre.belloni@bootlin.com> @@ -34,7 +35,7 @@ required: - compatible - reg -additionalProperties: false +unevaluatedProperties: false examples: - | diff --git a/Documentation/devicetree/bindings/rtc/qcom-pm8xxx-rtc.yaml b/Documentation/devicetree/bindings/rtc/qcom-pm8xxx-rtc.yaml index d274bb7a534b..68ef3208c886 100644 --- a/Documentation/devicetree/bindings/rtc/qcom-pm8xxx-rtc.yaml +++ b/Documentation/devicetree/bindings/rtc/qcom-pm8xxx-rtc.yaml @@ -50,6 +50,11 @@ properties: items: - const: offset + qcom,no-alarm: + type: boolean + description: + RTC alarm is not owned by the OS + wakeup-source: true required: diff --git a/Documentation/devicetree/bindings/serial/8250.yaml b/Documentation/devicetree/bindings/serial/8250.yaml index 0bde2379e864..dc0d52920575 100644 --- a/Documentation/devicetree/bindings/serial/8250.yaml +++ b/Documentation/devicetree/bindings/serial/8250.yaml @@ -77,7 +77,6 @@ properties: - altr,16550-FIFO64 - altr,16550-FIFO128 - fsl,16550-FIFO64 - - fsl,ns16550 - andestech,uart16550 - nxp,lpc1850-uart - opencores,uart16550-rtlsvn105 @@ -86,6 +85,7 @@ properties: - items: - enum: - ns16750 + - fsl,ns16550 - cavium,octeon-3860-uart - xlnx,xps-uart16550-2.00.b - ralink,rt2880-uart diff --git a/Documentation/devicetree/bindings/serial/fsl-lpuart.yaml b/Documentation/devicetree/bindings/serial/fsl-lpuart.yaml index 3f9ace89dee9..c42261b5a80a 100644 --- a/Documentation/devicetree/bindings/serial/fsl-lpuart.yaml +++ b/Documentation/devicetree/bindings/serial/fsl-lpuart.yaml @@ -30,6 +30,7 @@ properties: - items: - enum: - fsl,imx93-lpuart + - fsl,imx94-lpuart - fsl,imx95-lpuart - const: fsl,imx8ulp-lpuart - const: fsl,imx7ulp-lpuart diff --git a/Documentation/devicetree/bindings/serial/nvidia,tegra264-utc.yaml b/Documentation/devicetree/bindings/serial/nvidia,tegra264-utc.yaml new file mode 100644 index 000000000000..572cc574da64 --- /dev/null +++ b/Documentation/devicetree/bindings/serial/nvidia,tegra264-utc.yaml @@ -0,0 +1,73 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/serial/nvidia,tegra264-utc.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: NVIDIA Tegra UTC (UART Trace Controller) client + +maintainers: + - Kartik Rajput <kkartik@nvidia.com> + - Thierry Reding <thierry.reding@gmail.com> + - Jonathan Hunter <jonathanh@nvidia.com> + +description: + Represents a client interface of the Tegra UTC (UART Trace Controller). The + Tegra UTC allows multiple clients within the Tegra SoC to share a physical + UART interface. It supports up to 16 clients. Each client operates as an + independent UART endpoint with a dedicated interrupt and 128-character TX/RX + FIFOs. + + The Tegra UTC clients use 8-N-1 configuration and operates on a baudrate + configured by the bootloader at the controller level. + +allOf: + - $ref: serial.yaml# + +properties: + compatible: + const: nvidia,tegra264-utc + + reg: + items: + - description: TX region. + - description: RX region. + + reg-names: + items: + - const: tx + - const: rx + + interrupts: + maxItems: 1 + + tx-threshold: + minimum: 1 + maximum: 128 + + rx-threshold: + minimum: 1 + maximum: 128 + +required: + - compatible + - reg + - reg-names + - interrupts + - tx-threshold + - rx-threshold + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/arm-gic.h> + + tegra_utc: serial@c4e0000 { + compatible = "nvidia,tegra264-utc"; + reg = <0xc4e0000 0x8000>, <0xc4e8000 0x8000>; + reg-names = "tx", "rx"; + interrupts = <GIC_SPI 514 IRQ_TYPE_LEVEL_HIGH>; + tx-threshold = <4>; + rx-threshold = <4>; + }; diff --git a/Documentation/devicetree/bindings/serial/pl011.yaml b/Documentation/devicetree/bindings/serial/pl011.yaml index 9571041030b7..3fcf2d042372 100644 --- a/Documentation/devicetree/bindings/serial/pl011.yaml +++ b/Documentation/devicetree/bindings/serial/pl011.yaml @@ -92,6 +92,9 @@ properties: 3000ms. default: 3000 + power-domains: + maxItems: 1 + resets: maxItems: 1 diff --git a/Documentation/devicetree/bindings/serial/samsung_uart.yaml b/Documentation/devicetree/bindings/serial/samsung_uart.yaml index 070eba9f19d3..83d9986d8e98 100644 --- a/Documentation/devicetree/bindings/serial/samsung_uart.yaml +++ b/Documentation/devicetree/bindings/serial/samsung_uart.yaml @@ -42,6 +42,10 @@ properties: - samsung,exynosautov9-uart - samsung,exynosautov920-uart - const: samsung,exynos850-uart + - items: + - enum: + - samsung,exynos7870-uart + - const: samsung,exynos8895-uart reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/serial/snps-dw-apb-uart.yaml b/Documentation/devicetree/bindings/serial/snps-dw-apb-uart.yaml index 1c163cb5dff1..1aa3480d8d81 100644 --- a/Documentation/devicetree/bindings/serial/snps-dw-apb-uart.yaml +++ b/Documentation/devicetree/bindings/serial/snps-dw-apb-uart.yaml @@ -16,6 +16,20 @@ allOf: - if: properties: compatible: + items: + - enum: + - renesas,r9a06g032-uart + - renesas,r9a06g033-uart + - const: renesas,rzn1-uart + - const: snps,dw-apb-uart + then: + properties: + dmas: false + dma-names: false + + - if: + properties: + compatible: contains: const: starfive,jh7110-uart then: @@ -35,6 +49,12 @@ properties: - renesas,r9a06g032-uart - renesas,r9a06g033-uart - const: renesas,rzn1-uart + - const: snps,dw-apb-uart + - items: + - enum: + - renesas,r9a06g032-uart + - renesas,r9a06g033-uart + - const: renesas,rzn1-uart - items: - enum: - brcm,bcm11351-dw-apb-uart @@ -51,6 +71,7 @@ properties: - rockchip,rk3368-uart - rockchip,rk3399-uart - rockchip,rk3528-uart + - rockchip,rk3562-uart - rockchip,rk3568-uart - rockchip,rk3576-uart - rockchip,rk3588-uart diff --git a/Documentation/devicetree/bindings/serial/sprd-uart.yaml b/Documentation/devicetree/bindings/serial/sprd-uart.yaml index a2a5056eba04..5bf2656afcfd 100644 --- a/Documentation/devicetree/bindings/serial/sprd-uart.yaml +++ b/Documentation/devicetree/bindings/serial/sprd-uart.yaml @@ -17,13 +17,18 @@ properties: oneOf: - items: - enum: - - sprd,sc9632-uart + - sprd,ums9632-uart + - const: sprd,sc9632-uart + - items: + - enum: - sprd,sc9860-uart - sprd,sc9863a-uart - sprd,ums512-uart - sprd,ums9620-uart - const: sprd,sc9836-uart - - const: sprd,sc9836-uart + - enum: + - sprd,sc9632-uart + - sprd,sc9836-uart reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/thermal/allwinner,sun8i-a83t-ths.yaml b/Documentation/devicetree/bindings/thermal/allwinner,sun8i-a83t-ths.yaml index dad8de900495..3e61689f6dd4 100644 --- a/Documentation/devicetree/bindings/thermal/allwinner,sun8i-a83t-ths.yaml +++ b/Documentation/devicetree/bindings/thermal/allwinner,sun8i-a83t-ths.yaml @@ -142,38 +142,38 @@ unevaluatedProperties: false examples: - | thermal-sensor@1f04000 { - compatible = "allwinner,sun8i-a83t-ths"; - reg = <0x01f04000 0x100>; - interrupts = <0 31 0>; - nvmem-cells = <&ths_calibration>; - nvmem-cell-names = "calibration"; - #thermal-sensor-cells = <1>; + compatible = "allwinner,sun8i-a83t-ths"; + reg = <0x01f04000 0x100>; + interrupts = <0 31 0>; + nvmem-cells = <&ths_calibration>; + nvmem-cell-names = "calibration"; + #thermal-sensor-cells = <1>; }; - | thermal-sensor@1c25000 { - compatible = "allwinner,sun8i-h3-ths"; - reg = <0x01c25000 0x400>; - clocks = <&ccu 0>, <&ccu 1>; - clock-names = "bus", "mod"; - resets = <&ccu 2>; - interrupts = <0 31 0>; - nvmem-cells = <&ths_calibration>; - nvmem-cell-names = "calibration"; - #thermal-sensor-cells = <0>; + compatible = "allwinner,sun8i-h3-ths"; + reg = <0x01c25000 0x400>; + clocks = <&ccu 0>, <&ccu 1>; + clock-names = "bus", "mod"; + resets = <&ccu 2>; + interrupts = <0 31 0>; + nvmem-cells = <&ths_calibration>; + nvmem-cell-names = "calibration"; + #thermal-sensor-cells = <0>; }; - | thermal-sensor@5070400 { - compatible = "allwinner,sun50i-h6-ths"; - reg = <0x05070400 0x100>; - clocks = <&ccu 0>; - clock-names = "bus"; - resets = <&ccu 2>; - interrupts = <0 15 0>; - nvmem-cells = <&ths_calibration>; - nvmem-cell-names = "calibration"; - #thermal-sensor-cells = <1>; + compatible = "allwinner,sun50i-h6-ths"; + reg = <0x05070400 0x100>; + clocks = <&ccu 0>; + clock-names = "bus"; + resets = <&ccu 2>; + interrupts = <0 15 0>; + nvmem-cells = <&ths_calibration>; + nvmem-cell-names = "calibration"; + #thermal-sensor-cells = <1>; }; ... diff --git a/Documentation/devicetree/bindings/thermal/brcm,avs-tmon.yaml b/Documentation/devicetree/bindings/thermal/brcm,avs-tmon.yaml index 081486b44382..2f62551a49c1 100644 --- a/Documentation/devicetree/bindings/thermal/brcm,avs-tmon.yaml +++ b/Documentation/devicetree/bindings/thermal/brcm,avs-tmon.yaml @@ -18,6 +18,7 @@ properties: compatible: items: - enum: + - brcm,avs-tmon-bcm74110 - brcm,avs-tmon-bcm7216 - brcm,avs-tmon-bcm7445 - const: brcm,avs-tmon diff --git a/Documentation/devicetree/bindings/thermal/imx-thermal.yaml b/Documentation/devicetree/bindings/thermal/imx-thermal.yaml index 337560562337..949b154856c5 100644 --- a/Documentation/devicetree/bindings/thermal/imx-thermal.yaml +++ b/Documentation/devicetree/bindings/thermal/imx-thermal.yaml @@ -80,19 +80,19 @@ examples: #include <dt-bindings/interrupt-controller/arm-gic.h> efuse@21bc000 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "fsl,imx6sx-ocotp", "syscon"; - reg = <0x021bc000 0x4000>; - clocks = <&clks IMX6SX_CLK_OCOTP>; - - tempmon_calib: calib@38 { - reg = <0x38 4>; - }; - - tempmon_temp_grade: temp-grade@20 { - reg = <0x20 4>; - }; + #address-cells = <1>; + #size-cells = <1>; + compatible = "fsl,imx6sx-ocotp", "syscon"; + reg = <0x021bc000 0x4000>; + clocks = <&clks IMX6SX_CLK_OCOTP>; + + tempmon_calib: calib@38 { + reg = <0x38 4>; + }; + + tempmon_temp_grade: temp-grade@20 { + reg = <0x20 4>; + }; }; anatop@20c8000 { @@ -103,12 +103,12 @@ examples: <0 127 IRQ_TYPE_LEVEL_HIGH>; tempmon { - compatible = "fsl,imx6sx-tempmon"; - interrupts = <GIC_SPI 49 IRQ_TYPE_LEVEL_HIGH>; - fsl,tempmon = <&anatop>; - nvmem-cells = <&tempmon_calib>, <&tempmon_temp_grade>; - nvmem-cell-names = "calib", "temp_grade"; - clocks = <&clks IMX6SX_CLK_PLL3_USB_OTG>; - #thermal-sensor-cells = <0>; + compatible = "fsl,imx6sx-tempmon"; + interrupts = <GIC_SPI 49 IRQ_TYPE_LEVEL_HIGH>; + fsl,tempmon = <&anatop>; + nvmem-cells = <&tempmon_calib>, <&tempmon_temp_grade>; + nvmem-cell-names = "calib", "temp_grade"; + clocks = <&clks IMX6SX_CLK_PLL3_USB_OTG>; + #thermal-sensor-cells = <0>; }; }; diff --git a/Documentation/devicetree/bindings/thermal/imx8mm-thermal.yaml b/Documentation/devicetree/bindings/thermal/imx8mm-thermal.yaml index bef0e95e7416..df6c7c5d519f 100644 --- a/Documentation/devicetree/bindings/thermal/imx8mm-thermal.yaml +++ b/Documentation/devicetree/bindings/thermal/imx8mm-thermal.yaml @@ -63,10 +63,10 @@ examples: #include <dt-bindings/clock/imx8mm-clock.h> thermal-sensor@30260000 { - compatible = "fsl,imx8mm-tmu"; - reg = <0x30260000 0x10000>; - clocks = <&clk IMX8MM_CLK_TMU_ROOT>; - #thermal-sensor-cells = <0>; + compatible = "fsl,imx8mm-tmu"; + reg = <0x30260000 0x10000>; + clocks = <&clk IMX8MM_CLK_TMU_ROOT>; + #thermal-sensor-cells = <0>; }; ... diff --git a/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml b/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml index b9829bb22cc0..f9d8012c8cf5 100644 --- a/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml +++ b/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml @@ -75,6 +75,8 @@ properties: - description: v2 of TSENS with combined interrupt enum: + - qcom,ipq5332-tsens + - qcom,ipq5424-tsens - qcom,ipq8074-tsens - description: v2 of TSENS with combined interrupt @@ -212,6 +214,18 @@ properties: - const: s9_p2_backup - const: s10_p1_backup - const: s10_p2_backup + - minItems: 8 + items: + - const: mode + - const: base0 + - const: base1 + - pattern: '^tsens_sens[0-9]+_off$' + - pattern: '^tsens_sens[0-9]+_off$' + - pattern: '^tsens_sens[0-9]+_off$' + - pattern: '^tsens_sens[0-9]+_off$' + - pattern: '^tsens_sens[0-9]+_off$' + - pattern: '^tsens_sens[0-9]+_off$' + - pattern: '^tsens_sens[0-9]+_off$' "#qcom,sensors": description: @@ -271,6 +285,8 @@ allOf: compatible: contains: enum: + - qcom,ipq5332-tsens + - qcom,ipq5424-tsens - qcom,ipq8074-tsens then: properties: @@ -286,6 +302,8 @@ allOf: compatible: contains: enum: + - qcom,ipq5332-tsens + - qcom,ipq5424-tsens - qcom,ipq8074-tsens - qcom,tsens-v0_1 - qcom,tsens-v1 diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml index 0f435be1dbd8..0de0a9757ccc 100644 --- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml +++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml @@ -82,9 +82,8 @@ patternProperties: $ref: /schemas/types.yaml#/definitions/string description: | The action the OS should perform after the critical temperature is reached. - By default the system will shutdown as a safe action to prevent damage - to the hardware, if the property is not set. - The shutdown action should be always the default and preferred one. + If the property is not set, it is up to the system to select the correct + action. The recommended and preferred default is shutdown. Choose 'reboot' with care, as the hardware may be in thermal stress, thus leading to infinite reboots that may cause damage to the hardware. Make sure the firmware/bootloader will act as the last resort and take diff --git a/Documentation/devicetree/bindings/trivial-devices.yaml b/Documentation/devicetree/bindings/trivial-devices.yaml index 8704b0dfc9c0..8da408107e55 100644 --- a/Documentation/devicetree/bindings/trivial-devices.yaml +++ b/Documentation/devicetree/bindings/trivial-devices.yaml @@ -193,6 +193,8 @@ properties: - maxim,max20751 # mCube 3-axis 8-bit digital accelerometer - mcube,mc3230 + # mCube 3-axis 8-bit digital accelerometer + - mcube,mc3510c # Measurement Specialities I2C temperature and humidity sensor - meas,htu21 # Measurement Specialities I2C temperature and humidity sensor diff --git a/Documentation/devicetree/bindings/usb/generic-xhci.yaml b/Documentation/devicetree/bindings/usb/generic-xhci.yaml index 6ceafa4af292..a2b94a138999 100644 --- a/Documentation/devicetree/bindings/usb/generic-xhci.yaml +++ b/Documentation/devicetree/bindings/usb/generic-xhci.yaml @@ -51,6 +51,8 @@ properties: - const: core - const: reg + dma-coherent: true + power-domains: maxItems: 1 diff --git a/Documentation/devicetree/bindings/usb/microchip,usb2514.yaml b/Documentation/devicetree/bindings/usb/microchip,usb2514.yaml index b14e6f37b298..4e3901efed3f 100644 --- a/Documentation/devicetree/bindings/usb/microchip,usb2514.yaml +++ b/Documentation/devicetree/bindings/usb/microchip,usb2514.yaml @@ -9,16 +9,19 @@ title: Microchip USB2514 Hub Controller maintainers: - Fabio Estevam <festevam@gmail.com> -allOf: - - $ref: usb-device.yaml# - properties: compatible: - enum: - - usb424,2412 - - usb424,2417 - - usb424,2514 - - usb424,2517 + oneOf: + - enum: + - usb424,2412 + - usb424,2417 + - usb424,2514 + - usb424,2517 + - items: + - enum: + - usb424,2512 + - usb424,2513 + - const: usb424,2514 reg: true @@ -28,6 +31,9 @@ properties: vdd-supply: description: 3.3V power supply. + vdda-supply: + description: 3.3V analog power supply. + clocks: description: External 24MHz clock connected to the CLKIN pin. maxItems: 1 @@ -43,6 +49,18 @@ patternProperties: $ref: /schemas/usb/usb-device.yaml additionalProperties: true +allOf: + - $ref: usb-device.yaml# + - if: + not: + properties: + compatible: + contains: + const: usb424,2514 + then: + properties: + vdda-supply: false + unevaluatedProperties: false examples: @@ -60,6 +78,7 @@ examples: clocks = <&clks IMX6QDL_CLK_CKO>; reset-gpios = <&gpio7 12 GPIO_ACTIVE_LOW>; vdd-supply = <®_3v3_hub>; + vdda-supply = <®_3v3a_hub>; #address-cells = <1>; #size-cells = <0>; diff --git a/Documentation/devicetree/bindings/usb/parade,ps8830.yaml b/Documentation/devicetree/bindings/usb/parade,ps8830.yaml new file mode 100644 index 000000000000..935d57f5d26f --- /dev/null +++ b/Documentation/devicetree/bindings/usb/parade,ps8830.yaml @@ -0,0 +1,140 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/usb/parade,ps8830.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Parade PS883x USB and DisplayPort Retimer + +maintainers: + - Abel Vesa <abel.vesa@linaro.org> + +properties: + compatible: + enum: + - parade,ps8830 + + reg: + maxItems: 1 + + clocks: + items: + - description: XO Clock + + reset-gpios: + maxItems: 1 + + vdd-supply: + description: power supply (1.07V) + + vdd33-supply: + description: power supply (3.3V) + + vdd33-cap-supply: + description: power supply (3.3V) + + vddar-supply: + description: power supply (1.07V) + + vddat-supply: + description: power supply (1.07V) + + vddio-supply: + description: power supply (1.2V or 1.8V) + + orientation-switch: true + retimer-switch: true + + ports: + $ref: /schemas/graph.yaml#/properties/ports + properties: + port@0: + $ref: /schemas/graph.yaml#/properties/port + description: Super Speed (SS) Output endpoint to the Type-C connector + + port@1: + $ref: /schemas/graph.yaml#/$defs/port-base + description: Super Speed (SS) Input endpoint from the Super-Speed PHY + unevaluatedProperties: false + + port@2: + $ref: /schemas/graph.yaml#/properties/port + description: + Sideband Use (SBU) AUX lines endpoint to the Type-C connector for the purpose of + handling altmode muxing and orientation switching. + +required: + - compatible + - reg + - clocks + - reset-gpios + - vdd-supply + - vdd33-supply + - vdd33-cap-supply + - vddat-supply + - vddio-supply + - orientation-switch + - retimer-switch + +allOf: + - $ref: usb-switch.yaml# + +additionalProperties: false + +examples: + - | + #include <dt-bindings/gpio/gpio.h> + + i2c { + #address-cells = <1>; + #size-cells = <0>; + + typec-mux@8 { + compatible = "parade,ps8830"; + reg = <0x8>; + + clocks = <&clk_rtmr_xo>; + + vdd-supply = <&vreg_rtmr_1p15>; + vdd33-supply = <&vreg_rtmr_3p3>; + vdd33-cap-supply = <&vreg_rtmr_3p3>; + vddar-supply = <&vreg_rtmr_1p15>; + vddat-supply = <&vreg_rtmr_1p15>; + vddio-supply = <&vreg_rtmr_1p8>; + + reset-gpios = <&tlmm 10 GPIO_ACTIVE_LOW>; + + retimer-switch; + orientation-switch; + + ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + + endpoint { + remote-endpoint = <&typec_con_ss>; + }; + }; + + port@1 { + reg = <1>; + + endpoint { + remote-endpoint = <&usb_phy_ss>; + }; + }; + + port@2 { + reg = <2>; + + endpoint { + remote-endpoint = <&typec_dp_aux>; + }; + }; + }; + }; + }; +... diff --git a/Documentation/devicetree/bindings/usb/qcom,dwc3.yaml b/Documentation/devicetree/bindings/usb/qcom,dwc3.yaml index a2b3cf625e5b..64137c1619a6 100644 --- a/Documentation/devicetree/bindings/usb/qcom,dwc3.yaml +++ b/Documentation/devicetree/bindings/usb/qcom,dwc3.yaml @@ -404,6 +404,7 @@ allOf: minItems: 2 maxItems: 3 interrupt-names: + minItems: 2 items: - const: pwr_event - const: qusb2_phy @@ -425,6 +426,7 @@ allOf: minItems: 3 maxItems: 4 interrupt-names: + minItems: 3 items: - const: pwr_event - const: qusb2_phy diff --git a/Documentation/devicetree/bindings/usb/richtek,rt1711h.yaml b/Documentation/devicetree/bindings/usb/richtek,rt1711h.yaml index 8da4d2ad1a91..ae611f7e57ca 100644 --- a/Documentation/devicetree/bindings/usb/richtek,rt1711h.yaml +++ b/Documentation/devicetree/bindings/usb/richtek,rt1711h.yaml @@ -30,6 +30,9 @@ properties: interrupts: maxItems: 1 + vbus-supply: + description: VBUS power supply + wakeup-source: type: boolean diff --git a/Documentation/devicetree/bindings/usb/rockchip,dwc3.yaml b/Documentation/devicetree/bindings/usb/rockchip,dwc3.yaml index a21cc098542d..fba2cb05ecba 100644 --- a/Documentation/devicetree/bindings/usb/rockchip,dwc3.yaml +++ b/Documentation/devicetree/bindings/usb/rockchip,dwc3.yaml @@ -26,6 +26,7 @@ select: contains: enum: - rockchip,rk3328-dwc3 + - rockchip,rk3562-dwc3 - rockchip,rk3568-dwc3 - rockchip,rk3576-dwc3 - rockchip,rk3588-dwc3 @@ -37,6 +38,7 @@ properties: items: - enum: - rockchip,rk3328-dwc3 + - rockchip,rk3562-dwc3 - rockchip,rk3568-dwc3 - rockchip,rk3576-dwc3 - rockchip,rk3588-dwc3 @@ -72,6 +74,7 @@ properties: - enum: - grf_clk - utmi + - pipe - const: pipe power-domains: @@ -115,6 +118,22 @@ allOf: properties: compatible: contains: + const: rockchip,rk3562-dwc3 + then: + properties: + clocks: + minItems: 4 + maxItems: 4 + clock-names: + items: + - const: ref_clk + - const: suspend_clk + - const: bus_clk + - const: pipe + - if: + properties: + compatible: + contains: enum: - rockchip,rk3568-dwc3 - rockchip,rk3576-dwc3 diff --git a/Documentation/devicetree/bindings/usb/samsung,exynos-dwc3.yaml b/Documentation/devicetree/bindings/usb/samsung,exynos-dwc3.yaml index 2b3430cebe99..256bee2a03ca 100644 --- a/Documentation/devicetree/bindings/usb/samsung,exynos-dwc3.yaml +++ b/Documentation/devicetree/bindings/usb/samsung,exynos-dwc3.yaml @@ -11,12 +11,17 @@ maintainers: properties: compatible: - enum: - - google,gs101-dwusb3 - - samsung,exynos5250-dwusb3 - - samsung,exynos5433-dwusb3 - - samsung,exynos7-dwusb3 - - samsung,exynos850-dwusb3 + oneOf: + - enum: + - google,gs101-dwusb3 + - samsung,exynos5250-dwusb3 + - samsung,exynos5433-dwusb3 + - samsung,exynos7-dwusb3 + - samsung,exynos7870-dwusb3 + - samsung,exynos850-dwusb3 + - items: + - const: samsung,exynos990-dwusb3 + - const: samsung,exynos850-dwusb3 '#address-cells': const: 1 @@ -52,7 +57,6 @@ required: - clock-names - ranges - '#size-cells' - - vdd10-supply - vdd33-supply allOf: @@ -72,6 +76,8 @@ allOf: - const: susp_clk - const: link_aclk - const: link_pclk + required: + - vdd10-supply - if: properties: @@ -86,6 +92,8 @@ allOf: clock-names: items: - const: usbdrd30 + required: + - vdd10-supply - if: properties: @@ -103,6 +111,8 @@ allOf: - const: susp_clk - const: phyclk - const: pipe_pclk + required: + - vdd10-supply - if: properties: @@ -119,6 +129,24 @@ allOf: - const: usbdrd30 - const: usbdrd30_susp_clk - const: usbdrd30_axius_clk + required: + - vdd10-supply + + - if: + properties: + compatible: + contains: + const: samsung,exynos7870-dwusb3 + then: + properties: + clocks: + minItems: 3 + maxItems: 3 + clock-names: + items: + - const: bus_early + - const: ref + - const: ctrl - if: properties: @@ -134,6 +162,8 @@ allOf: items: - const: bus_early - const: ref + required: + - vdd10-supply additionalProperties: false diff --git a/Documentation/devicetree/bindings/usb/snps,dwc3-common.yaml b/Documentation/devicetree/bindings/usb/snps,dwc3-common.yaml index c956053fd036..71249b6ba616 100644 --- a/Documentation/devicetree/bindings/usb/snps,dwc3-common.yaml +++ b/Documentation/devicetree/bindings/usb/snps,dwc3-common.yaml @@ -65,6 +65,17 @@ properties: mode. type: boolean + snps,reserved-endpoints: + description: + Reserve endpoints for other needs, e.g, for tracing control and output. + When set, the driver will avoid using them for the regular USB transfers. + $ref: /schemas/types.yaml#/definitions/uint8-array + minItems: 1 + maxItems: 30 + items: + minimum: 2 + maximum: 31 + snps,dis-start-transfer-quirk: description: When set, disable isoc START TRANSFER command failure SW work-around diff --git a/Documentation/devicetree/bindings/usb/usb-device.yaml b/Documentation/devicetree/bindings/usb/usb-device.yaml index da890ee60ce6..c67695681033 100644 --- a/Documentation/devicetree/bindings/usb/usb-device.yaml +++ b/Documentation/devicetree/bindings/usb/usb-device.yaml @@ -39,8 +39,10 @@ properties: reg: description: the number of the USB hub port or the USB host-controller - port to which this device is attached. The range is 1-255. - maxItems: 1 + port to which this device is attached. + items: + - minimum: 1 + maximum: 255 "#address-cells": description: should be 1 for hub nodes with device nodes, diff --git a/Documentation/devicetree/bindings/watchdog/allwinner,sun4i-a10-wdt.yaml b/Documentation/devicetree/bindings/watchdog/allwinner,sun4i-a10-wdt.yaml index 64c8f7393809..b35ac03d5172 100644 --- a/Documentation/devicetree/bindings/watchdog/allwinner,sun4i-a10-wdt.yaml +++ b/Documentation/devicetree/bindings/watchdog/allwinner,sun4i-a10-wdt.yaml @@ -32,6 +32,7 @@ properties: - items: - const: allwinner,sun20i-d1-wdt-reset - const: allwinner,sun20i-d1-wdt + - const: allwinner,sun55i-a523-wdt reg: maxItems: 1 @@ -60,6 +61,7 @@ if: - allwinner,sun20i-d1-wdt-reset - allwinner,sun50i-r329-wdt - allwinner,sun50i-r329-wdt-reset + - allwinner,sun55i-a523-wdt then: properties: diff --git a/Documentation/devicetree/bindings/watchdog/fsl-imx7ulp-wdt.yaml b/Documentation/devicetree/bindings/watchdog/fsl-imx7ulp-wdt.yaml index a09686b3030d..6ec391b9723a 100644 --- a/Documentation/devicetree/bindings/watchdog/fsl-imx7ulp-wdt.yaml +++ b/Documentation/devicetree/bindings/watchdog/fsl-imx7ulp-wdt.yaml @@ -22,6 +22,10 @@ properties: - const: fsl,imx8ulp-wdt - const: fsl,imx7ulp-wdt - const: fsl,imx93-wdt + - items: + - enum: + - fsl,imx94-wdt + - const: fsl,imx93-wdt reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/watchdog/renesas,wdt.yaml b/Documentation/devicetree/bindings/watchdog/renesas,wdt.yaml index 29ada89fdcdc..3e0a8747a357 100644 --- a/Documentation/devicetree/bindings/watchdog/renesas,wdt.yaml +++ b/Documentation/devicetree/bindings/watchdog/renesas,wdt.yaml @@ -75,6 +75,10 @@ properties: - renesas,r8a779h0-wdt # R-Car V4M - const: renesas,rcar-gen4-wdt # R-Car Gen4 + - items: + - const: renesas,r9a09g047-wdt # RZ/G3E + - const: renesas,r9a09g057-wdt # RZ/V2H(P) + - const: renesas,r9a09g057-wdt # RZ/V2H(P) reg: diff --git a/Documentation/driver-api/cxl/maturity-map.rst b/Documentation/driver-api/cxl/maturity-map.rst index df8e2ac2a320..a2288f9df658 100644 --- a/Documentation/driver-api/cxl/maturity-map.rst +++ b/Documentation/driver-api/cxl/maturity-map.rst @@ -130,7 +130,7 @@ Mailbox commands * [0] Switch CCI * [3] Timestamp * [1] PMEM labels -* [0] PMEM GPF / Dirty Shutdown +* [3] PMEM GPF / Dirty Shutdown * [0] Scan Media PMU diff --git a/Documentation/driver-api/phy/phy.rst b/Documentation/driver-api/phy/phy.rst index 81785c084f3e..719a2b3fd2ab 100644 --- a/Documentation/driver-api/phy/phy.rst +++ b/Documentation/driver-api/phy/phy.rst @@ -198,8 +198,7 @@ pm_runtime_get_sync of PHY provider device because of parent-child relationship. It should also be noted that phy_power_on and phy_power_off performs phy_pm_runtime_get_sync and phy_pm_runtime_put respectively. There are exported APIs like phy_pm_runtime_get, phy_pm_runtime_get_sync, -phy_pm_runtime_put, phy_pm_runtime_put_sync, phy_pm_runtime_allow and -phy_pm_runtime_forbid for performing PM operations. +phy_pm_runtime_put and phy_pm_runtime_put_sync for performing PM operations. PHY Mappings ============ diff --git a/Documentation/driver-api/pps.rst b/Documentation/driver-api/pps.rst index 71ad04c82d6c..598729f9cd27 100644 --- a/Documentation/driver-api/pps.rst +++ b/Documentation/driver-api/pps.rst @@ -206,8 +206,7 @@ To do so the class pps-gen has been added. PPS generators can be registered in the kernel by defining a struct pps_gen_source_info as follows:: - static struct pps_gen_source_info pps_gen_dummy_info = { - .name = "dummy", + static const struct pps_gen_source_info pps_gen_dummy_info = { .use_system_clock = true, .get_time = pps_gen_dummy_get_time, .enable = pps_gen_dummy_enable, @@ -286,3 +285,27 @@ delay between assert and clear edge as small as possible to reduce system latencies. But if it is too small slave won't be able to capture clear edge transition. The default of 30us should be good enough in most situations. The delay can be selected using 'delay' pps_gen_parport module parameter. + + +Intel Timed I/O PPS signal generator +------------------------------------ + +Intel Timed I/O is a high precision device, present on 2019 and newer Intel +CPUs, that can generate PPS signals. + +Timed I/O and system time are both driven by same hardware clock. The signal +is generated with a precision of ~20 nanoseconds. The generated PPS signal +is used to synchronize an external device with system clock. For example, +it can be used to share your clock with a device that receives PPS signal, +generated by Timed I/O device. There are dedicated Timed I/O pins to deliver +the PPS signal to an external device. + +Usage of Intel Timed I/O as PPS generator: + +Start generating PPS signal:: + + $echo 1 > /sys/class/pps-gen/pps-genx/enable + +Stop generating PPS signal:: + + $echo 0 > /sys/class/pps-gen/pps-genx/enable diff --git a/Documentation/driver-api/serial/driver.rst b/Documentation/driver-api/serial/driver.rst index 84b43061c11b..fa1ebfcd4472 100644 --- a/Documentation/driver-api/serial/driver.rst +++ b/Documentation/driver-api/serial/driver.rst @@ -101,6 +101,6 @@ Modem control lines via GPIO Some helpers are provided in order to set/get modem control lines via GPIO. .. kernel-doc:: drivers/tty/serial/serial_mctrl_gpio.c - :identifiers: mctrl_gpio_init mctrl_gpio_free mctrl_gpio_to_gpiod + :identifiers: mctrl_gpio_init mctrl_gpio_to_gpiod mctrl_gpio_set mctrl_gpio_get mctrl_gpio_enable_ms - mctrl_gpio_disable_ms + mctrl_gpio_disable_ms_sync mctrl_gpio_disable_ms_no_sync diff --git a/Documentation/driver-api/soundwire/bra.rst b/Documentation/driver-api/soundwire/bra.rst new file mode 100644 index 000000000000..8500253fa3e8 --- /dev/null +++ b/Documentation/driver-api/soundwire/bra.rst @@ -0,0 +1,336 @@ +========================== +Bulk Register Access (BRA) +========================== + +Conventions +----------- + +Capitalized words used in this documentation are intentional and refer +to concepts of the SoundWire 1.x specification. + +Introduction +------------ + +The SoundWire 1.x specification provides a mechanism to speed-up +command/control transfers by reclaiming parts of the audio +bandwidth. The Bulk Register Access (BRA) protocol is a standard +solution based on the Bulk Payload Transport (BPT) definitions. + +The regular control channel uses Column 0 and can only send/retrieve +one byte per frame with write/read commands. With a typical 48kHz +frame rate, only 48kB/s can be transferred. + +The optional Bulk Register Access capability can transmit up to 12 +Mbits/s and reduce transfer times by several orders of magnitude, but +has multiple design constraints: + + (1) Each frame can only support a read or a write transfer, with a + 10-byte overhead per frame (header and footer response). + + (2) The read/writes SHALL be from/to contiguous register addresses + in the same frame. A fragmented register space decreases the + efficiency of the protocol by requiring multiple BRA transfers + scheduled in different frames. + + (3) The targeted Peripheral device SHALL support the optional Data + Port 0, and likewise the Manager SHALL expose audio-like Ports + to insert BRA packets in the audio payload using the concepts of + Sample Interval, HSTART, HSTOP, etc. + + (4) The BRA transport efficiency depends on the available + bandwidth. If there are no on-going audio transfers, the entire + frame minus Column 0 can be reclaimed for BRA. The frame shape + also impacts efficiency: since Column0 cannot be used for + BTP/BRA, the frame should rely on a large number of columns and + minimize the number of rows. The bus clock should be as high as + possible. + + (5) The number of bits transferred per frame SHALL be a multiple of + 8 bits. Padding bits SHALL be inserted if necessary at the end + of the data. + + (6) The regular read/write commands can be issued in parallel with + BRA transfers. This is convenient to e.g. deal with alerts, jack + detection or change the volume during firmware download, but + accessing the same address with two independent protocols has to + be avoided to avoid undefined behavior. + + (7) Some implementations may not be capable of handling the + bandwidth of the BRA protocol, e.g. in the case of a slow I2C + bus behind the SoundWire IP. In this case, the transfers may + need to be spaced in time or flow-controlled. + + (8) Each BRA packet SHALL be marked as 'Active' when valid data is + to be transmitted. This allows for software to allocate a BRA + stream but not transmit/discard data while processing the + results or preparing the next batch of data, or allowing the + peripheral to deal with the previous transfer. In addition BRA + transfer can be started early on without data being ready. + + (9) Up to 470 bytes may be transmitted per frame. + + (10) The address is represented with 32 bits and does not rely on + the paging registers used for the regular command/control + protocol in Column 0. + + +Error checking +-------------- + +Firmware download is one of the key usages of the Bulk Register Access +protocol. To make sure the binary data integrity is not compromised by +transmission or programming errors, each BRA packet provides: + + (1) A CRC on the 7-byte header. This CRC helps the Peripheral Device + check if it is addressed and set the start address and number of + bytes. The Peripheral Device provides a response in Byte 7. + + (2) A CRC on the data block (header excluded). This CRC is + transmitted as the last-but-one byte in the packet, prior to the + footer response. + +The header response can be one of: + (a) Ack + (b) Nak + (c) Not Ready + +The footer response can be one of: + (1) Ack + (2) Nak (CRC failure) + (3) Good (operation completed) + (4) Bad (operation failed) + +Example frame +------------- + +The example below is not to scale and makes simplifying assumptions +for clarity. The different chunks in the BRA packets are not required +to start on a new SoundWire Row, and the scale of data may vary. + + :: + + +---+--------------------------------------------+ + + | | + + | BRA HEADER | + + | | + + +--------------------------------------------+ + + C | HEADER CRC | + + O +--------------------------------------------+ + + M | HEADER RESPONSE | + + M +--------------------------------------------+ + + A | | + + N | | + + D | DATA | + + | | + + | | + + | | + + +--------------------------------------------+ + + | DATA CRC | + + +--------------------------------------------+ + + | FOOTER RESPONSE | + +---+--------------------------------------------+ + + +Assuming the frame uses N columns, the configuration shown above can +be programmed by setting the DP0 registers as: + + - HSTART = 1 + - HSTOP = N - 1 + - Sampling Interval = N + - WordLength = N - 1 + +Addressing restrictions +----------------------- + +The Device Number specified in the Header follows the SoundWire +definitions, and broadcast and group addressing are permitted. For now +the Linux implementation only allows for a single BPT transfer to a +single device at a time. This might be revisited at a later point as +an optimization to send the same firmware to multiple devices, but +this would only be beneficial for single-link solutions. + +In the case of multiple Peripheral devices attached to different +Managers, the broadcast and group addressing is not supported by the +SoundWire specification. Each device must be handled with separate BRA +streams, possibly in parallel - the links are really independent. + +Unsupported features +-------------------- + +The Bulk Register Access specification provides a number of +capabilities that are not supported in known implementations, such as: + + (1) Transfers initiated by a Peripheral Device. The BRA Initiator is + always the Manager Device. + + (2) Flow-control capabilities and retransmission based on the + 'NotReady' header response require extra buffering in the + SoundWire IP and are not implemented. + +Bi-directional handling +----------------------- + +The BRA protocol can handle writes as well as reads, and in each +packet the header and footer response are provided by the Peripheral +Target device. On the Peripheral device, the BRA protocol is handled +by a single DP0 data port, and at the low-level the bus ownership can +will change for header/footer response as well as the data transmitted +during a read. + +On the host side, most implementations rely on a Port-like concept, +with two FIFOs consuming/generating data transfers in parallel +(Host->Peripheral and Peripheral->Host). The amount of data +consumed/produced by these FIFOs is not symmetrical, as a result +hardware typically inserts markers to help software and hardware +interpret raw data + +Each packet will typically have: + + (1) a 'Start of Packet' indicator. + + (2) an 'End of Packet' indicator. + + (3) a packet identifier to correlate the data requested and + transmitted, and the error status for each frame + +Hardware implementations can check errors at the frame level, and +retry a transfer in case of errors. However, as for the flow-control +case, this requires extra buffering and intelligence in the +hardware. The Linux support assumes that the entire transfer is +cancelled if a single error is detected in one of the responses. + +Abstraction required +~~~~~~~~~~~~~~~~~~~~ + +There are no standard registers or mandatory implementation at the +Manager level, so the low-level BPT/BRA details must be hidden in +Manager-specific code. For example the Cadence IP format above is not +known to the codec drivers. + +Likewise, codec drivers should not have to know the frame size. The +computation of CRC and handling of responses is handled in helpers and +Manager-specific code. + +The host BRA driver may also have restrictions on pages allocated for +DMA, or other host-DSP communication protocols. The codec driver +should not be aware of any of these restrictions, since it might be +reused in combination with different implementations of Manager IPs. + +Concurrency between BRA and regular read/write +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The existing 'nread/nwrite' API already relies on a notion of start +address and number of bytes, so it would be possible to extend this +API with a 'hint' requesting BPT/BRA be used. + +However BRA transfers could be quite long, and the use of a single +mutex for regular read/write and BRA is a show-stopper. Independent +operation of the control/command and BRA transfers is a fundamental +requirement, e.g. to change the volume level with the existing regmap +interface while downloading firmware. The integration must however +ensure that there are no concurrent access to the same address with +the command/control protocol and the BRA protocol. + +In addition, the 'sdw_msg' structure hard-codes support for 16-bit +addresses and paging registers which are irrelevant for BPT/BRA +support based on native 32-bit addresses. A separate API with +'sdw_bpt_msg' makes more sense. + +One possible strategy to speed-up all initialization tasks would be to +start a BRA transfer for firmware download, then deal with all the +"regular" read/writes in parallel with the command channel, and last +to wait for the BRA transfers to complete. This would allow for a +degree of overlap instead of a purely sequential solution. As such, +the BRA API must support async transfers and expose a separate wait +function. + + +Peripheral/bus interface +------------------------ + +The bus interface for BPT/BRA is made of two functions: + + - sdw_bpt_send_async(bpt_message) + + This function sends the data using the Manager + implementation-defined capabilities (typically DMA or IPC + protocol). + + Queueing is currently not supported, the caller + needs to wait for completion of the requested transfer. + + - sdw_bpt_wait() + + This function waits for the entire message provided by the + codec driver in the 'send_async' stage. Intermediate status for + smaller chunks will not be provided back to the codec driver, + only a return code will be provided. + +Regmap use +~~~~~~~~~~ + +Existing codec drivers rely on regmap to download firmware to +Peripherals. regmap exposes an async interface similar to the +send/wait API suggested above, so at a high-level it would seem +natural to combine BRA and regmap. The regmap layer could check if BRA +is available or not, and use a regular read-write command channel in +the latter case. + +The regmap integration will be handled in a second step. + +BRA stream model +---------------- + +For regular audio transfers, the machine driver exposes a dailink +connecting CPU DAI(s) and Codec DAI(s). + +This model is not required BRA support: + + (1) The SoundWire DAIs are mainly wrappers for SoundWire Data + Ports, with possibly some analog or audio conversion + capabilities bolted behind the Data Port. In the context of + BRA, the DP0 is the destination. DP0 registers are standard and + can be programmed blindly without knowing what Peripheral is + connected to each link. In addition, if there are multiple + Peripherals on a link and some of them do not support DP0, the + write commands to program DP0 registers will generate harmless + COMMAND_IGNORED responses that will be wired-ORed with + responses from Peripherals which support DP0. In other words, + the DP0 programming can be done with broadcast commands, and + the information on the Target device can be added only in the + BRA Header. + + (2) At the CPU level, the DAI concept is not useful for BRA; the + machine driver will not create a dailink relying on DP0. The + only concept that is needed is the notion of port. + + (3) The stream concept relies on a set of master_rt and slave_rt + concepts. All of these entities represent ports and not DAIs. + + (4) With the assumption that a single BRA stream is used per link, + that stream can connect master ports as well as all peripheral + DP0 ports. + + (5) BRA transfers only make sense in the context of one + Manager/Link, so the BRA stream handling does not rely on the + concept of multi-link aggregation allowed by regular DAI links. + +Audio DMA support +----------------- + +Some DMAs, such as HDaudio, require an audio format field to be +set. This format is in turn used to define acceptable bursts. BPT/BRA +support is not fully compatible with these definitions in that the +format and bandwidth may vary between read and write commands. + +In addition, on Intel HDaudio Intel platforms the DMAs need to be +programmed with a PCM format matching the bandwidth of the BPT/BRA +transfer. The format is based on 192kHz 32-bit samples, and the number +of channels varies to adjust the bandwidth. The notion of channel is +completely notional since the data is not typical audio +PCM. Programming such channels helps reserve enough bandwidth and adjust +FIFO sizes to avoid xruns. + +Alignment requirements are currently not enforced at the core level +but at the platform-level, e.g. for Intel the data sizes must be +multiples of 32 bytes. diff --git a/Documentation/driver-api/soundwire/bra_cadence.rst b/Documentation/driver-api/soundwire/bra_cadence.rst new file mode 100644 index 000000000000..4560752e8f47 --- /dev/null +++ b/Documentation/driver-api/soundwire/bra_cadence.rst @@ -0,0 +1,66 @@ +Cadence IP BRA support +---------------------- + +Format requirements +~~~~~~~~~~~~~~~~~~~ + +The Cadence IP relies on PDI0 for TX and PDI1 for RX. The data needs +to be formatted with the following conventions: + + (1) all Data is stored in bits 15..0 of the 32-bit PDI FIFOs. + + (2) the start of packet is BIT(31). + + (3) the end of packet is BIT(30). + + (4) A packet ID is stored in bits 19..16. This packet ID is + determined by software and is typically a rolling counter. + + (5) Padding shall be inserted as needed so that the Header CRC, + Header response, Footer CRC, Footer response are always in + Byte0. Padding is inserted by software for writes, and on reads + software shall discard the padding added by the hardware. + +Example format +~~~~~~~~~~~~~~ + +The following table represents the sequence provided to PDI0 for a +write command followed by a read command. + +:: + + +---+---+--------+---------------+---------------+ + + 1 | 0 | ID = 0 | WR HDR[1] | WR HDR[0] | + + | | | WR HDR[3] | WR HDR[2] | + + | | | WR HDR[5] | WR HDR[4] | + + | | | pad | WR HDR CRC | + + | | | WR Data[1] | WR Data[0] | + + | | | WR Data[3] | WR Data[2] | + + | | | WR Data[n-2] | WR Data[n-3] | + + | | | pad | WR Data[n-1] | + + 0 | 1 | | pad | WR Data CRC | + +---+---+--------+---------------+---------------+ + + 1 | 0 | ID = 1 | RD HDR[1] | RD HDR[0] | + + | | | RD HDR[3] | RD HDR[2] | + + | | | RD HDR[5] | RD HDR[4] | + + 0 | 1 | | pad | RD HDR CRC | + +---+---+--------+---------------+---------------+ + + +The table below represents the data received on PDI1 for the same +write command followed by a read command. + +:: + + +---+---+--------+---------------+---------------+ + + 1 | 0 | ID = 0 | pad | WR Hdr Rsp | + + 0 | 1 | | pad | WR Ftr Rsp | + +---+---+--------+---------------+---------------+ + + 1 | 0 | ID = 0 | pad | Rd Hdr Rsp | + + | | | RD Data[1] | RD Data[0] | + + | | | RD Data[3] | RD Data[2] | + + | | | RD HDR[n-2] | RD Data[n-3] | + + | | | pad | RD Data[n-1] | + + | | | pad | RD Data CRC | + + 0 | 1 | | pad | RD Ftr Rsp | + +---+---+--------+---------------+---------------+ diff --git a/Documentation/driver-api/soundwire/index.rst b/Documentation/driver-api/soundwire/index.rst index 234911a0db99..ef8d90dfbdde 100644 --- a/Documentation/driver-api/soundwire/index.rst +++ b/Documentation/driver-api/soundwire/index.rst @@ -9,6 +9,8 @@ SoundWire Documentation stream error_handling locking + bra + bra_cadence .. only:: subproject and html diff --git a/Documentation/driver-api/soundwire/stream.rst b/Documentation/driver-api/soundwire/stream.rst index 2a794484f62c..d66201299633 100644 --- a/Documentation/driver-api/soundwire/stream.rst +++ b/Documentation/driver-api/soundwire/stream.rst @@ -291,7 +291,7 @@ per stream. From ASoC DPCM framework, this stream state maybe linked to .. code-block:: c - int sdw_alloc_stream(char * stream_name); + int sdw_alloc_stream(char * stream_name, enum sdw_stream_type type); The SoundWire core provides a sdw_startup_stream() helper function, typically called during a dailink .startup() callback, which performs diff --git a/Documentation/driver-api/soundwire/summary.rst b/Documentation/driver-api/soundwire/summary.rst index 01dcb954f6d7..df78053743b5 100644 --- a/Documentation/driver-api/soundwire/summary.rst +++ b/Documentation/driver-api/soundwire/summary.rst @@ -184,14 +184,6 @@ function that provides capabilities information. Bus needs to know a set of Slave capabilities to program Slave registers and to control the Bus reconfigurations. -Future enhancements to be done -============================== - - (1) Bulk Register Access (BRA) transfers. - - - (2) Multiple data lane support. - Links ===== diff --git a/Documentation/driver-api/thermal/sysfs-api.rst b/Documentation/driver-api/thermal/sysfs-api.rst index c803b89b7248..f73de211bdce 100644 --- a/Documentation/driver-api/thermal/sysfs-api.rst +++ b/Documentation/driver-api/thermal/sysfs-api.rst @@ -413,18 +413,21 @@ This function serves as an arbitrator to set the state of a cooling device. It sets the cooling device to the deepest cooling state if possible. -5. thermal_emergency_poweroff -============================= +5. Critical Events +================== + +On an event of critical trip temperature crossing, the thermal framework +will trigger a hardware protection power-off (shutdown) or reboot, +depending on configuration. -On an event of critical trip temperature crossing the thermal framework -shuts down the system by calling hw_protection_shutdown(). The -hw_protection_shutdown() first attempts to perform an orderly shutdown -but accepts a delay after which it proceeds doing a forced power-off -or as last resort an emergency_restart. +At first, the kernel will attempt an orderly power-off or reboot, but +accepts a delay after which it proceeds to do a forced power-off or +reboot, respectively. If this fails, ``emergency_restart()`` is invoked +as last resort. The delay should be carefully profiled so as to give adequate time for -orderly poweroff. +orderly power-off or reboot. -If the delay is set to 0 emergency poweroff will not be supported. So a -carefully profiled non-zero positive value is a must for emergency -poweroff to be triggered. +If the delay is set to 0, the emergency action will not be supported. So a +carefully profiled non-zero positive value is a must for the emergency +action to be triggered. diff --git a/Documentation/driver-api/tty/tty_driver.rst b/Documentation/driver-api/tty/tty_driver.rst index cc529f863406..7138222a70f2 100644 --- a/Documentation/driver-api/tty/tty_driver.rst +++ b/Documentation/driver-api/tty/tty_driver.rst @@ -25,6 +25,8 @@ freed. For reference, both allocation and deallocation functions are explained here in detail: +.. kernel-doc:: include/linux/tty_driver.h + :identifiers: tty_alloc_driver .. kernel-doc:: drivers/tty/tty_io.c :identifiers: __tty_alloc_driver tty_driver_kref_put @@ -35,7 +37,7 @@ Here comes the documentation of flags accepted by tty_alloc_driver() (or __tty_alloc_driver()): .. kernel-doc:: include/linux/tty_driver.h - :doc: TTY Driver Flags + :identifiers: tty_driver_flag ---- diff --git a/Documentation/driver-api/tty/tty_struct.rst b/Documentation/driver-api/tty/tty_struct.rst index c72f5a4293b2..29caf1c1ca5f 100644 --- a/Documentation/driver-api/tty/tty_struct.rst +++ b/Documentation/driver-api/tty/tty_struct.rst @@ -72,7 +72,7 @@ TTY Struct Flags ================ .. kernel-doc:: include/linux/tty.h - :doc: TTY Struct Flags + :identifiers: tty_struct_flags TTY Struct Reference ==================== diff --git a/Documentation/driver-api/usb/writing_musb_glue_layer.rst b/Documentation/driver-api/usb/writing_musb_glue_layer.rst index e755c8551bba..0bb96ecdf527 100644 --- a/Documentation/driver-api/usb/writing_musb_glue_layer.rst +++ b/Documentation/driver-api/usb/writing_musb_glue_layer.rst @@ -613,7 +613,7 @@ endpoints configuration from the hardware, so we use line 12 instruction to bypass reading the configuration from silicon, and rely on a hard-coded table that describes the endpoints configuration instead:: - static struct musb_fifo_cfg jz4740_musb_fifo_cfg[] = { + static const struct musb_fifo_cfg jz4740_musb_fifo_cfg[] = { { .hw_ep_num = 1, .style = FIFO_TX, .maxpacket = 512, }, { .hw_ep_num = 1, .style = FIFO_RX, .maxpacket = 512, }, { .hw_ep_num = 2, .style = FIFO_TX, .maxpacket = 64, }, diff --git a/Documentation/features/core/mseal_sys_mappings/arch-support.txt b/Documentation/features/core/mseal_sys_mappings/arch-support.txt new file mode 100644 index 000000000000..c6cab9760d57 --- /dev/null +++ b/Documentation/features/core/mseal_sys_mappings/arch-support.txt @@ -0,0 +1,30 @@ +# +# Feature name: mseal-system-mappings +# Kconfig: ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS +# description: arch supports mseal system mappings +# + ----------------------- + | arch |status| + ----------------------- + | alpha: | TODO | + | arc: | N/A | + | arm: | N/A | + | arm64: | ok | + | csky: | N/A | + | hexagon: | N/A | + | loongarch: | TODO | + | m68k: | N/A | + | microblaze: | N/A | + | mips: | TODO | + | nios2: | N/A | + | openrisc: | N/A | + | parisc: | TODO | + | powerpc: | TODO | + | riscv: | TODO | + | s390: | ok | + | sh: | N/A | + | sparc: | TODO | + | um: | TODO | + | x86: | ok | + | xtensa: | N/A | + ----------------------- diff --git a/Documentation/filesystems/9p.rst b/Documentation/filesystems/9p.rst index 3078f3c9256a..be3504ca034a 100644 --- a/Documentation/filesystems/9p.rst +++ b/Documentation/filesystems/9p.rst @@ -40,7 +40,7 @@ For remote file server:: mount -t 9p 10.10.1.2 /mnt/9 -For Plan 9 From User Space applications (http://swtch.com/plan9):: +For Plan 9 From User Space applications (https://9fans.github.io/plan9port/):: mount -t 9p `namespace`/acme /mnt/9 -o trans=unix,uname=$USER @@ -165,8 +165,8 @@ Options do not necessarily validate cached values on the server. In other words changes on the server are not guaranteed to be reflected on the client system. Only use this mode of operation if you - have an exclusive mount and the server will modify the filesystem - underneath you. + have an exclusive mount and the server will not modify the + filesystem underneath you. debug=n specifies debug level. The debug level is a bitmask. diff --git a/Documentation/filesystems/dax.rst b/Documentation/filesystems/dax.rst index 719e90f1988e..08dd5e254cc5 100644 --- a/Documentation/filesystems/dax.rst +++ b/Documentation/filesystems/dax.rst @@ -207,7 +207,6 @@ implement direct_access. These block devices may be used for inspiration: - brd: RAM backed block device driver -- dcssblk: s390 dcss block device driver - pmem: NVDIMM persistent memory driver diff --git a/Documentation/filesystems/ext4/super.rst b/Documentation/filesystems/ext4/super.rst index a1eb4a11a1d0..1b240661bfa3 100644 --- a/Documentation/filesystems/ext4/super.rst +++ b/Documentation/filesystems/ext4/super.rst @@ -328,9 +328,13 @@ The ext4 superblock is laid out as follows in - s_checksum_type - Metadata checksum algorithm type. The only valid value is 1 (crc32c). * - 0x176 - - __le16 - - s_reserved_pad - - + - \_\_u8 + - s\_encryption\_level + - Versioning level for encryption. + * - 0x177 + - \_\_u8 + - s\_reserved\_pad + - Padding to next 32bits. * - 0x178 - __le64 - s_kbytes_written @@ -466,9 +470,13 @@ The ext4 superblock is laid out as follows in - s_last_error_time_hi - Upper 8 bits of the s_last_error_time field. * - 0x27A - - __u8 - - s_pad[2] - - Zero padding. + - \_\_u8 + - s\_first\_error\_errcode + - + * - 0x27B + - \_\_u8 + - s\_last\_error\_errcode + - * - 0x27C - __le16 - s_encoding diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst index 09f0aed5a08b..2a17865dfe39 100644 --- a/Documentation/filesystems/proc.rst +++ b/Documentation/filesystems/proc.rst @@ -128,6 +128,16 @@ process running on the system, which is named after the process ID (PID). The link 'self' points to the process reading the file system. Each process subdirectory has the entries listed in Table 1-1. +A process can read its own information from /proc/PID/* with no extra +permissions. When reading /proc/PID/* information for other processes, reading +process is required to have either CAP_SYS_PTRACE capability with +PTRACE_MODE_READ access permissions, or, alternatively, CAP_PERFMON +capability. This applies to all read-only information like `maps`, `environ`, +`pagemap`, etc. The only exception is `mem` file due to its read-write nature, +which requires CAP_SYS_PTRACE capabilities with more elevated +PTRACE_MODE_ATTACH permissions; CAP_PERFMON capability does not grant access +to /proc/PID/mem for other processes. + Note that an open file descriptor to /proc/<pid> or to any of its contained files or subdirectories does not prevent <pid> being reused for some other process in the event that <pid> exits. Operations on @@ -502,9 +512,25 @@ process, its PSS will be 1500. "Pss_Dirty" is the portion of PSS which consists of dirty pages. ("Pss_Clean" is not included, but it can be calculated by subtracting "Pss_Dirty" from "Pss".) -Note that even a page which is part of a MAP_SHARED mapping, but has only -a single pte mapped, i.e. is currently used by only one process, is accounted -as private and not as shared. +Traditionally, a page is accounted as "private" if it is mapped exactly once, +and a page is accounted as "shared" when mapped multiple times, even when +mapped in the same process multiple times. Note that this accounting is +independent of MAP_SHARED. + +In some kernel configurations, the semantics of pages part of a larger +allocation (e.g., THP) can differ: a page is accounted as "private" if all +pages part of the corresponding large allocation are *certainly* mapped in the +same process, even if the page is mapped multiple times in that process. A +page is accounted as "shared" if any page page of the larger allocation +is *maybe* mapped in a different process. In some cases, a large allocation +might be treated as "maybe mapped by multiple processes" even though this +is no longer the case. + +Some kernel configurations do not track the precise number of times a page part +of a larger allocation is mapped. In this case, when calculating the PSS, the +average number of mappings per page in this larger allocation might be used +as an approximation for the number of mappings of a page. The PSS calculation +will be imprecise in this case. "Referenced" indicates the amount of memory currently marked as referenced or accessed. @@ -686,6 +712,11 @@ Where: node locality page counters (N0 == node0, N1 == node1, ...) and the kernel page size, in KB, that is backing the mapping up. +Note that some kernel configurations do not track the precise number of times +a page part of a larger allocation (e.g., THP) is mapped. In these +configurations, "mapmax" might corresponds to the average number of mappings +per page in such a larger allocation instead. + 1.2 Kernel data --------------- @@ -1060,6 +1091,8 @@ Example output. You may not have all of these fields. FilePmdMapped: 0 kB CmaTotal: 0 kB CmaFree: 0 kB + Unaccepted: 0 kB + Balloon: 0 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 @@ -1132,9 +1165,15 @@ Dirty Writeback Memory which is actively being written back to the disk AnonPages - Non-file backed pages mapped into userspace page tables + Non-file backed pages mapped into userspace page tables. Note that + some kernel configurations might consider all pages part of a + larger allocation (e.g., THP) as "mapped", as soon as a single + page is mapped. Mapped - files which have been mmapped, such as libraries + files which have been mmapped, such as libraries. Note that some + kernel configurations might consider all pages part of a larger + allocation (e.g., THP) as "mapped", as soon as a single page is + mapped. Shmem Total memory used by shared memory (shmem) and tmpfs KReclaimable @@ -1228,6 +1267,10 @@ CmaTotal Memory reserved for the Contiguous Memory Allocator (CMA) CmaFree Free remaining memory in the CMA reserves +Unaccepted + Memory that has not been accepted by the guest +Balloon + Memory returned to Host by VM Balloon Drivers HugePages_Total, HugePages_Free, HugePages_Rsvd, HugePages_Surp, Hugepagesize, Hugetlb See Documentation/admin-guide/mm/hugetlbpage.rst. DirectMap4k, DirectMap2M, DirectMap1G diff --git a/Documentation/iio/ad4030.rst b/Documentation/iio/ad4030.rst new file mode 100644 index 000000000000..b57424b650a8 --- /dev/null +++ b/Documentation/iio/ad4030.rst @@ -0,0 +1,180 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +============= +AD4030 driver +============= + +ADC driver for Analog Devices Inc. AD4030 and similar devices. The module name +is ``ad4030``. + + +Supported devices +================= + +The following chips are supported by this driver: + +* `AD4030-24 <https://www.analog.com/AD4030-24>`_ +* `AD4032-24 <https://www.analog.com/AD4032-24>`_ +* `AD4630-16 <https://www.analog.com/AD4630-16>`_ +* `AD4630-24 <https://www.analog.com/AD4630-24>`_ +* `AD4632-16 <https://www.analog.com/AD4632-16>`_ +* `AD4632-24 <https://www.analog.com/AD4632-24>`_ + +IIO channels +============ + +Each "hardware" channel as described in the datasheet is split in 2 IIO +channels: + +- One channel for the differential data +- One channel for the common byte. + +The possible IIO channels depending on the numbers of "hardware" channel are: + ++------------------------------------+------------------------------------+ +| 1 channel ADC | 2 channels ADC | ++====================================+====================================+ +| - voltage0-voltage1 (differential) | - voltage0-voltage1 (differential) | +| - voltage2 (common-mode) | - voltage2-voltage3 (differential) | +| | - voltage4 (common-mode) | +| | - voltage5 (common-mode) | ++------------------------------------+------------------------------------+ + +Labels +------ + +For ease of use, the IIO channels provide a label. For a differential channel, +the label is ``differentialN`` where ``N`` is the "hardware" channel id. For a +common-mode channel, the label is ``common-modeN`` where ``N`` is the +"hardware" channel id. + +The possible labels are: + ++-----------------+-----------------+ +| 1 channel ADC | 2 channels ADC | ++=================+=================+ +| - differential0 | - differential0 | +| - common-mode0 | - differential1 | +| | - common-mode0 | +| | - common-mode1 | ++-----------------+-----------------+ + +Supported features +================== + +SPI wiring modes +---------------- + +The driver currently supports the following SPI wiring configurations: + +One lane mode +^^^^^^^^^^^^^ + +In this mode, each channel has its own SDO line to send the conversion results. +At the moment this mode can only be used on AD4030 which has one channel so only +one SDO line is used. + +.. code-block:: + + +-------------+ +-------------+ + | ADC | | HOST | + | | | | + | CNV |<--------| CNV | + | CS |<--------| CS | + | SDI |<--------| SDO | + | SDO0 |-------->| SDI | + | SCLK |<--------| SCLK | + +-------------+ +-------------+ + +Interleaved mode +^^^^^^^^^^^^^^^^ + +In this mode, both channels conversion results are bit interleaved one SDO line. +As such the wiring is the same as `One lane mode`_. + +SPI Clock mode +-------------- + +Only the SPI clocking mode is supported. + +Output modes +------------ + +There are more exposed IIO channels than channels as describe in the devices +datasheet. This is due to the `Differential data + common-mode`_ encoding +2 types of information in one conversion result. As such a "device" channel +provides 2 IIO channels, one for the differential data and one for the common +byte. + +Differential data +^^^^^^^^^^^^^^^^^ + +This mode is selected when: + +- Only differential channels are enabled in a buffered read +- Oversampling attribute is set to 1 + +Differential data + common-mode +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This mode is selected when: + +- Differential and common-mode channels are enabled in a buffered read +- Oversampling attribute is set to 1 + +For the 24-bits chips, this mode is also available with 16-bits differential +data but is not selectable yet. + +Averaged differential data +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This mode is selected when: + +- Only differential channels are selected enabled in a buffered read +- Oversampling attribute is greater than 1 + +Digital Gain and Offset +----------------------- + +Each differential data channel has a 16-bits unsigned configurable hardware +gain applied to it. By default it's equal to 1. Note that applying gain can +cause numerical saturation. + +Each differential data channel has a signed configurable hardware offset. +For the ADCs ending in ``-24``, the gain is encoded on 24-bits. +Likewise, the ADCs ending in ``-16`` have a gain encoded on 16-bits. Note that +applying an offset can cause numerical saturation. + +The final differential data returned by the ADC is computed by first applying +the gain, then the offset. + +The gain is controlled by the ``calibscale`` IIO attribute while the offset is +controlled by the ``calibbias`` attribute. + +Reference voltage +----------------- + +The chip supports an external reference voltage via the ``REF`` input or an +internal buffered reference voltage via the ``REFIN`` input. The driver looks +at the device tree to determine which is being used. If ``ref-supply`` is +present, then the external reference voltage is used and the internal buffer is +disabled. If ``refin-supply`` is present, then the internal buffered reference +voltage is used. + +Reset +----- + +Both hardware and software reset are supported. The driver looks first at the +device tree to see if the ``reset-gpio`` is populated. +If not present, the driver will fallback to a software reset by wiring to the +device's registers. + +Unimplemented features +---------------------- + +- ``BUSY`` indication +- Additional wiring modes +- Additional clock modes +- Differential data 16-bits + common-mode for 24-bits chips +- Overrange events +- Test patterns diff --git a/Documentation/iio/ad4695.rst b/Documentation/iio/ad4695.rst index 9ec8bf466c15..f40593bcc37d 100644 --- a/Documentation/iio/ad4695.rst +++ b/Documentation/iio/ad4695.rst @@ -47,6 +47,36 @@ In this mode, CNV and CS are tied together and there is a single SDO line. To use this mode, in the device tree, omit the ``cnv-gpios`` and ``spi-rx-bus-width`` properties. +SPI offload wiring +^^^^^^^^^^^^^^^^^^ + +When used with a SPI offload, the supported wiring configuration is: + +.. code-block:: + + +-------------+ +-------------+ + | GP0/BUSY |-------->| TRIGGER | + | CS |<--------| CS | + | | | | + | ADC | | SPI | + | | | | + | SDI |<--------| SDO | + | SDO |-------->| SDI | + | SCLK |<--------| SCLK | + | | | | + | | +-------------+ + | CNV |<-----+--| PWM | + | | +--| GPIO | + +-------------+ +-------------+ + +In this case, both the ``cnv-gpios`` and ``pwms`` properties are required. +The ``#trigger-source-cells = <2>`` property is also required to connect back +to the SPI offload. The SPI offload will have ``trigger-sources`` property +with cells to indicate the busy signal and which GPx pin is used, e.g +``<&ad4695 AD4695_TRIGGER_EVENT_BUSY AD4695_TRIGGER_PIN_GP0>``. + +.. seealso:: `SPI offload support`_ + Channel configuration --------------------- @@ -149,15 +179,62 @@ Gain/offset calibration System calibration is supported using the channel gain and offset registers via the ``calibscale`` and ``calibbias`` attributes respectively. +Oversampling +------------ + +The chip supports per-channel oversampling when SPI offload is being used, with +available oversampling ratios (OSR) of 1 (default), 4, 16, and 64. Enabling +oversampling on a channel raises the effective number of bits of sampled data to +17 (OSR == 4), 18 (16), or 19 (64), respectively. This can be set via the +``oversampling_ratio`` attribute. + +Setting the oversampling ratio for a channel also changes the sample rate for +that channel, since it requires multiple conversions per 1 sample. Specifically, +the new sampling frequency is the PWM sampling frequency divided by the +particular OSR. This is set automatically by the driver when setting the +``oversampling_ratio`` attribute. For example, if the device's current +``sampling_frequency`` is 10000 and an OSR of 4 is set on channel ``voltage0``, +the new reported sampling rate for that channel will be 2500 (ignoring PWM API +rounding), while all others will remain at 10000. Subsequently setting the +sampling frequency to a higher value on that channel will adjust the CNV trigger +period for all channels, e.g. if ``voltage0``'s sampling frequency is adjusted +from 2500 (with an OSR of 4) to 10000, the value reported by +``in_voltage0_sampling_frequency`` will be 10000, but all other channels will +now report 40000. + +For simplicity, the sampling frequency of the device should be set (considering +the highest desired OSR value to be used) first, before configuring oversampling +for specific channels. + Unimplemented features ---------------------- - Additional wiring modes - Threshold events -- Oversampling - GPIO support - CRC support +SPI offload support +=================== + +To be able to achieve the maximum sample rate, the driver can be used with the +`AXI SPI Engine`_ to provide SPI offload support. + +.. _AXI SPI Engine: http://analogdevicesinc.github.io/hdl/projects/ad469x_fmc/index.html + +.. seealso:: `SPI offload wiring`_ + +When SPI offload is being used, some attributes will be different. + +* ``trigger`` directory is removed. +* ``in_voltage0_sampling_frequency`` attributes are added for setting the sample + rate. +* ``in_voltage0_sampling_frequency_available`` attributes are added for querying + the max sample rate. +* ``timestamp`` channel is removed. +* Buffer data format may be different compared to when offload is not used, + e.g. the ``buffer0/in_voltage0_type`` attribute. + Device buffers ============== @@ -165,3 +242,28 @@ This driver supports hardware triggered buffers. This uses the "advanced sequencer" feature of the chip to trigger a burst of conversions. Also see :doc:`iio_devbuf` for more general information. + +Effective sample rate for buffered reads +---------------------------------------- + +When SPI offload is not used, the sample rate is determined by the trigger that +is manually configured in userspace. All enabled channels will be read in a +burst when the trigger is received. + +When SPI offload is used, the sample rate is configured per channel. All +channels will have the same rate, so only one ``in_voltageY_sampling_frequency`` +attribute needs to be set. Since this rate determines the delay between each +individual conversion, the effective sample rate for each sample is actually +the sum of the periods of each enabled channel in a buffered read. In other +words, it is the value of the ``in_voltageY_sampling_frequency`` attribute +divided by the number of enabled channels. So if 4 channels are enabled, with +the ``in_voltageY_sampling_frequency`` attributes set to 1 MHz, the effective +sample rate is 250 kHz. + +With oversampling enabled, the effective sample rate also depends on the OSR +assigned to each channel. For example, if one of the 4 channels mentioned in the +previous case is configured with an OSR of 4, the effective sample rate for that +channel becomes (1 MHz / 4 ) = 250 kHz. The effective sample rate for all +four channels is then 1 / ( (3 / 1 MHz) + ( 1 / 250 kHz) ) ~= 142.9 kHz. Note +that in this case "sample" refers to one read of all enabled channels (i.e. one +full cycle through the auto-sequencer). diff --git a/Documentation/iio/ad7191.rst b/Documentation/iio/ad7191.rst new file mode 100644 index 000000000000..977d4fea14b0 --- /dev/null +++ b/Documentation/iio/ad7191.rst @@ -0,0 +1,119 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +============= +AD7191 driver +============= + +Device driver for Analog Devices AD7191 ADC. + +Supported devices +================= + +* `AD7191 <https://www.analog.com/AD7191>`_ + +The AD7191 is a high precision, low noise, 24-bit Σ-Δ ADC with integrated PGA. +It features two differential input channels, an internal temperature sensor, and +configurable sampling rates. + +Devicetree +========== + +Pin Configuration +----------------- + +The driver supports both pin-strapped and GPIO-controlled configurations for ODR +(Output Data Rate) and PGA (Programmable Gain Amplifier) settings. These +configurations are mutually exclusive - you must use either pin-strapped or GPIO +control for each setting, not both. + +ODR Configuration +^^^^^^^^^^^^^^^^^ + +The ODR can be configured either through GPIO control or pin-strapping: + +- When using GPIO control, specify the "odr-gpios" property in the device tree +- For pin-strapped configuration, specify the "adi,odr-value" property in the + device tree + +Available ODR settings: + + - 120 Hz (ODR1=0, ODR2=0) + - 60 Hz (ODR1=0, ODR2=1) + - 50 Hz (ODR1=1, ODR2=0) + - 10 Hz (ODR1=1, ODR2=1) + +PGA Configuration +^^^^^^^^^^^^^^^^^ + +The PGA can be configured either through GPIO control or pin-strapping: + +- When using GPIO control, specify the "pga-gpios" property in the device tree +- For pin-strapped configuration, specify the "adi,pga-value" property in the + device tree + +Available PGA gain settings: + + - 1x (PGA1=0, PGA2=0) + - 8x (PGA1=0, PGA2=1) + - 64x (PGA1=1, PGA2=0) + - 128x (PGA1=1, PGA2=1) + +Clock Configuration +------------------- + +The AD7191 supports both internal and external clock sources: + +- When CLKSEL pin is tied LOW: Uses internal 4.92MHz clock (no clock property + needed) +- When CLKSEL pin is tied HIGH: Requires external clock source + - Can be a crystal between MCLK1 and MCLK2 pins + - Or a CMOS-compatible clock driving MCLK2 pin + - Must specify the "clocks" property in device tree when using external clock + +SPI Interface Requirements +-------------------------- + +The AD7191 has specific SPI interface requirements: + +- The DOUT/RDY output is dual-purpose and requires SPI bus locking +- DOUT/RDY must be connected to an interrupt-capable GPIO +- The SPI controller's chip select must be connected to the PDOWN pin of the ADC +- When CS (PDOWN) is high, the device powers down and resets internal circuitry +- SPI mode 3 operation (CPOL=1, CPHA=1) is required + +Power Supply Requirements +------------------------- + +The device requires the following power supplies: + +- AVdd: Analog power supply +- DVdd: Digital power supply +- Vref: Reference voltage supply (external) + +All power supplies must be specified in the device tree. + +Channel Configuration +===================== + +The device provides three channels: + +1. Temperature Sensor + - 24-bit unsigned + - Internal temperature measurement + - Temperature in millidegrees Celsius + +2. Differential Input (AIN1-AIN2) + - 24-bit unsigned + - Differential voltage measurement + - Configurable gain via PGA + +3. Differential Input (AIN3-AIN4) + - 24-bit unsigned + - Differential voltage measurement + - Configurable gain via PGA + +Buffer Support +============== + +This driver supports IIO triggered buffers. See Documentation/iio/iio_devbuf.rst +for more information about IIO triggered buffers. diff --git a/Documentation/iio/ad7380.rst b/Documentation/iio/ad7380.rst index c46127700e14..24a92a1c4371 100644 --- a/Documentation/iio/ad7380.rst +++ b/Documentation/iio/ad7380.rst @@ -29,6 +29,7 @@ The following chips are supported by this driver: * `AD7388-4 <https://www.analog.com/en/products/ad7388-4.html>`_ * `ADAQ4370-4 <https://www.analog.com/en/products/adaq4370-4.html>`_ * `ADAQ4380-4 <https://www.analog.com/en/products/adaq4380-4.html>`_ +* `ADAQ4381-4 <https://www.analog.com/en/products/adaq4381-4.html>`_ Supported features @@ -52,8 +53,8 @@ declared in the device tree as ``refin-supply``. ADAQ devices ~~~~~~~~~~~~ -adaq4370-4 and adaq4380-4 don't have an external reference, but use a 3.3V -internal reference derived from one of its supplies (``refin-supply``) +ADAQ devices don't have an external reference, but use a 3.3V internal reference +derived from one of its supplies (``refin-supply``) All other devices from ad738x family ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -92,6 +93,38 @@ must restart iiod using the following command: root:~# systemctl restart iiod +Alert +----- + +2 channels variants of the ad738x family, can use the SDOB line as an alert pin +when configured in 1 SDO line mode. 4 channels variants, can use SDOD as an +alert pin when configured in 1 or 2 SDO line(s) mode, although only 1 SDO line +mode is currently supported by the driver (see `SPI wiring modes`_). + +At the end of a conversion the active-low alert pin gets asserted if the +conversion result exceeds the alert high limit or falls below the alert low +limit. It is cleared, on a falling edge of CS. The alert pin is common to all +channels. + +User can enable alert using the regular iio events attribute: + +.. code-block:: bash + + events/thresh_either_en + +The high and low thresholds are common to all channels and can also be set using +regular iio events attributes: + +.. code-block:: bash + + events/in_thresh_falling_value + events/in_thresh_rising_value + +If debugfs is available, user can read the ALERT register to determine the +faulty channel and direction. + +In most use cases, user will hardwire the alert pin to trigger a shutdown. + Channel selection and sequencer (single-end chips only) ------------------------------------------------------- @@ -144,8 +177,25 @@ Unimplemented features - Rolling average oversampling - Power down mode - CRC indication -- Alert +SPI offload support +=================== + +To be able to achieve the maximum sample rate, the driver can be used with the +`AXI SPI Engine`_ to provide SPI offload support. + +.. _AXI SPI Engine: http://analogdevicesinc.github.io/hdl/projects/pulsar_adc/index.html + +When SPI offload is being used, some attributes will be different. + +* ``trigger`` directory is removed. +* ``in_voltage0_sampling_frequency`` attribute is added for setting the sample + rate. +* ``in_voltage0_sampling_frequency_available`` attribute is added for querying + the max sample rate. +* ``timestamp`` channel is removed. +* Buffer data format may be different compared to when offload is not used, + e.g. the ``in_voltage0_type`` attribute. Device buffers ============== diff --git a/Documentation/iio/ad7944.rst b/Documentation/iio/ad7944.rst index 0d26e56aba88..e6dbe4d7f58c 100644 --- a/Documentation/iio/ad7944.rst +++ b/Documentation/iio/ad7944.rst @@ -46,6 +46,8 @@ CS mode, 3-wire, without busy indicator To select this mode in the device tree, set the ``adi,spi-mode`` property to ``"single"`` and omit the ``cnv-gpios`` property. +This is the only wiring configuration supported when using `SPI offload support`_. + CS mode, 4-wire, without busy indicator ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -106,7 +108,6 @@ Unimplemented features ---------------------- - ``BUSY`` indication -- ``TURBO`` mode Device attributes @@ -147,6 +148,27 @@ AD7986 is a fully-differential ADC and has the following attributes: In "chain" mode, additional chips will appear as additional voltage input channels, e.g. ``in_voltage2-voltage3_raw``. +SPI offload support +=================== + +To be able to achieve the maximum sample rate, the driver can be used with the +`AXI SPI Engine`_ to provide SPI offload support. + +.. _AXI SPI Engine: http://analogdevicesinc.github.io/hdl/projects/pulsar_adc/index.html + +When SPI offload is being used, some attributes will be different. + +* ``trigger`` directory is removed. +* ``in_voltage0_sampling_frequency`` attribute is added for setting the sample + rate. +* ``in_voltage0_sampling_frequency_available`` attribute is added for querying + the max sample rate. +* ``timestamp`` channel is removed. +* Buffer data format may be different compared to when offload is not used, + e.g. the ``in_voltage0_type`` attribute. + +If the ``turbo-gpios`` property is present in the device tree, the driver will +turn on TURBO during buffered reads and turn it off otherwise. Device buffers ============== diff --git a/Documentation/iio/adis16550.rst b/Documentation/iio/adis16550.rst new file mode 100644 index 000000000000..25db7b8060c4 --- /dev/null +++ b/Documentation/iio/adis16550.rst @@ -0,0 +1,376 @@ +.. SPDX-License-Identifier: GPL-2.0 + +================ +ADIS16550 driver +================ + +This driver supports Analog Device's IMUs on SPI bus. + +1. Supported devices +==================== + +* `ADIS16550 <https://www.analog.com/ADIS16550>`_ + +The ADIS16550 is a complete inertial system that includes a triaxis gyroscope +and a triaxis accelerometer. The factory calibration characterizes each sensor for +sensitivity, bias, and alignment. As a result, each sensor has its own dynamic +compensation formulas that provide accurate sensor measurements. + +2. Device attributes +==================== + +Accelerometer, gyroscope measurements are always provided. Furthermore, the +driver offers the capability to retrieve the delta angle and the delta velocity +measurements computed by the device. + +The delta angle measurements represent a calculation of angular displacement +between each sample update, while the delta velocity measurements represent a +calculation of linear velocity change between each sample update. + +Finally, temperature data are provided which show a coarse measurement of +the temperature inside of the IMU device. This data is most useful for +monitoring relative changes in the thermal environment. + +Each IIO device, has a device folder under ``/sys/bus/iio/devices/iio:deviceX``, +where X is the IIO index of the device. Under these folders reside a set of +device files, depending on the characteristics and features of the hardware +device in questions. These files are consistently generalized and documented in +the IIO ABI documentation. + +The following tables show the adis16550 related device files, found in the +specific device folder path ``/sys/bus/iio/devices/iio:deviceX``. + ++-------------------------------------------+----------------------------------------------------------+ +| 3-Axis Accelerometer related device files | Description | ++-------------------------------------------+----------------------------------------------------------+ +| in_accel_scale | Scale for the accelerometer channels. | ++-------------------------------------------+----------------------------------------------------------+ +| in_accel_filter_low_pass_3db_frequency | Bandwidth for the accelerometer channels. | ++-------------------------------------------+----------------------------------------------------------+ +| in_accel_x_calibbias | Calibration offset for the X-axis accelerometer channel. | ++-------------------------------------------+----------------------------------------------------------+ +| in_accel_x_calibscale | Calibration scale for the X-axis accelerometer channel. | ++-------------------------------------------+----------------------------------------------------------+ +| in_accel_x_raw | Raw X-axis accelerometer channel value. | ++-------------------------------------------+----------------------------------------------------------+ +| in_accel_y_calibbias | Calibration offset for the Y-axis accelerometer channel. | ++-------------------------------------------+----------------------------------------------------------+ +| in_accel_y_calibscale | Calibration scale for the Y-axis accelerometer channel. | ++-------------------------------------------+----------------------------------------------------------+ +| in_accel_y_raw | Raw Y-axis accelerometer channel value. | ++-------------------------------------------+----------------------------------------------------------+ +| in_accel_z_calibbias | Calibration offset for the Z-axis accelerometer channel. | ++-------------------------------------------+----------------------------------------------------------+ +| in_accel_z_calibscale | Calibration scale for the Z-axis accelerometer channel. | ++-------------------------------------------+----------------------------------------------------------+ +| in_accel_z_raw | Raw Z-axis accelerometer channel value. | ++-------------------------------------------+----------------------------------------------------------+ +| in_deltavelocity_scale | Scale for delta velocity channels. | ++-------------------------------------------+----------------------------------------------------------+ +| in_deltavelocity_x_raw | Raw X-axis delta velocity channel value. | ++-------------------------------------------+----------------------------------------------------------+ +| in_deltavelocity_y_raw | Raw Y-axis delta velocity channel value. | ++-------------------------------------------+----------------------------------------------------------+ +| in_deltavelocity_z_raw | Raw Z-axis delta velocity channel value. | ++-------------------------------------------+----------------------------------------------------------+ + ++--------------------------------------------+------------------------------------------------------+ +| 3-Axis Gyroscope related device files | Description | ++--------------------------------------------+------------------------------------------------------+ +| in_anglvel_scale | Scale for the gyroscope channels. | ++--------------------------------------------+------------------------------------------------------+ +| in_anglvel_filter_low_pass_3db_frequency | Scale for the gyroscope channels. | ++--------------------------------------------+------------------------------------------------------+ +| in_anglvel_x_calibbias | Calibration offset for the X-axis gyroscope channel. | ++--------------------------------------------+------------------------------------------------------+ +| in_anglvel_x_calibscale | Calibration scale for the X-axis gyroscope channel. | ++--------------------------------------------+------------------------------------------------------+ +| in_anglvel_x_raw | Raw X-axis gyroscope channel value. | ++--------------------------------------------+------------------------------------------------------+ +| in_anglvel_y_calibbias | Calibration offset for the Y-axis gyroscope channel. | ++--------------------------------------------+------------------------------------------------------+ +| in_anglvel_y_calibscale | Calibration scale for the Y-axis gyroscope channel. | ++--------------------------------------------+------------------------------------------------------+ +| in_anglvel_y_raw | Raw Y-axis gyroscope channel value. | ++--------------------------------------------+------------------------------------------------------+ +| in_anglvel_z_calibbias | Calibration offset for the Z-axis gyroscope channel. | ++--------------------------------------------+------------------------------------------------------+ +| in_anglvel_z_calibscale | Calibration scale for the Z-axis gyroscope channel. | ++--------------------------------------------+------------------------------------------------------+ +| in_anglvel_z_raw | Raw Z-axis gyroscope channel value. | ++--------------------------------------------+------------------------------------------------------+ +| in_deltaangl_scale | Scale for delta angle channels. | ++--------------------------------------------+------------------------------------------------------+ +| in_deltaangl_x_raw | Raw X-axis delta angle channel value. | ++--------------------------------------------+------------------------------------------------------+ +| in_deltaangl_y_raw | Raw Y-axis delta angle channel value. | ++--------------------------------------------+------------------------------------------------------+ +| in_deltaangl_z_raw | Raw Z-axis delta angle channel value. | ++--------------------------------------------+------------------------------------------------------+ + ++----------------------------------+-------------------------------------------+ +| Temperature sensor related files | Description | ++----------------------------------+-------------------------------------------+ +| in_temp0_raw | Raw temperature channel value. | ++----------------------------------+-------------------------------------------+ +| in_temp0_offset | Offset for the temperature sensor channel.| ++----------------------------------+-------------------------------------------+ +| in_temp0_scale | Scale for the temperature sensor channel. | ++----------------------------------+-------------------------------------------+ + ++----------------------------+--------------------------------------------------------------------------------+ +| Miscellaneous device files | Description | ++----------------------------+--------------------------------------------------------------------------------+ +| name | Name of the IIO device. | ++----------------------------+--------------------------------------------------------------------------------+ +| sampling_frequency | Currently selected sample rate. | ++----------------------------+--------------------------------------------------------------------------------+ + +The following table shows the adis16550 related device debug files, found in the +specific device debug folder path ``/sys/kernel/debug/iio/iio:deviceX``. + ++----------------------+-------------------------------------------------------------------------+ +| Debugfs device files | Description | ++----------------------+-------------------------------------------------------------------------+ +| serial_number | The serial number of the chip in hexadecimal format. | ++----------------------+-------------------------------------------------------------------------+ +| product_id | Chip specific product id (16550). | ++----------------------+-------------------------------------------------------------------------+ +| flash_count | The number of flash writes performed on the device. | ++----------------------+-------------------------------------------------------------------------+ +| firmware_revision | String containing the firmware revision in the following format ##.##. | ++----------------------+-------------------------------------------------------------------------+ +| firmware_date | String containing the firmware date in the following format mm-dd-yyyy. | ++----------------------+-------------------------------------------------------------------------+ + +Channels processed values +------------------------- + +A channel value can be read from its _raw attribute. The value returned is the +raw value as reported by the devices. To get the processed value of the channel, +apply the following formula: + +.. code-block:: bash + + processed value = (_raw + _offset) * _scale + +Where _offset and _scale are device attributes. If no _offset attribute is +present, simply assume its value is 0. + +The adis16550 driver offers data for 5 types of channels, the table below shows +the measurement units for the processed value, which are defined by the IIO +framework: + ++--------------------------------------+---------------------------+ +| Channel type | Measurement unit | ++--------------------------------------+---------------------------+ +| Acceleration on X, Y, and Z axis | Meters per Second squared | ++--------------------------------------+---------------------------+ +| Angular velocity on X, Y and Z axis | Radians per second | ++--------------------------------------+---------------------------+ +| Delta velocity on X. Y, and Z axis | Meters per Second | ++--------------------------------------+---------------------------+ +| Delta angle on X, Y, and Z axis | Radians | ++--------------------------------------+---------------------------+ +| Temperature | Millidegrees Celsius | ++--------------------------------------+---------------------------+ + +Usage examples +-------------- + +Show device name: + +.. code-block:: bash + + root:/sys/bus/iio/devices/iio:device0> cat name + adis16550 + +Show accelerometer channels value: + +.. code-block:: bash + + root:/sys/bus/iio/devices/iio:device0> cat in_accel_x_raw + 6903851 + root:/sys/bus/iio/devices/iio:device0> cat in_accel_y_raw + 5650550 + root:/sys/bus/iio/devices/iio:device0> cat in_accel_z_raw + 104873530 + root:/sys/bus/iio/devices/iio:device0> cat in_accel_scale + 0.000000095 + +- X-axis acceleration = in_accel_x_raw * in_accel_scale = 0.655865845 m/s^2 +- Y-axis acceleration = in_accel_y_raw * in_accel_scale = 0.53680225 m/s^2 +- Z-axis acceleration = in_accel_z_raw * in_accel_scale = 9.96298535 m/s^2 + +Show gyroscope channels value: + +.. code-block:: bash + + root:/sys/bus/iio/devices/iio:device0> cat in_anglvel_x_raw + 193309 + root:/sys/bus/iio/devices/iio:device0> cat in_anglvel_y_raw + -763676 + root:/sys/bus/iio/devices/iio:device0> cat in_anglvel_z_raw + -358108 + root:/sys/bus/iio/devices/iio:device0> cat in_anglvel_scale + 0.000000003 + +- X-axis angular velocity = in_anglvel_x_raw * in_anglvel_scale = 0.000579927 rad/s +- Y-axis angular velocity = in_anglvel_y_raw * in_anglvel_scale = −0.002291028 rad/s +- Z-axis angular velocity = in_anglvel_z_raw * in_anglvel_scale = −0.001074324 rad/s + +Set calibration offset for accelerometer channels: + +.. code-block:: bash + + root:/sys/bus/iio/devices/iio:device0> cat in_accel_x_calibbias + 0 + + root:/sys/bus/iio/devices/iio:device0> echo 5000 > in_accel_x_calibbias + root:/sys/bus/iio/devices/iio:device0> cat in_accel_x_calibbias + 5000 + +Set calibration offset for gyroscope channels: + +.. code-block:: bash + + root:/sys/bus/iio/devices/iio:device0> cat in_anglvel_y_calibbias + 0 + + root:/sys/bus/iio/devices/iio:device0> echo -5000 > in_anglvel_y_calibbias + root:/sys/bus/iio/devices/iio:device0> cat in_anglvel_y_calibbias + -5000 + +Set sampling frequency: + +.. code-block:: bash + + root:/sys/bus/iio/devices/iio:device0> cat sampling_frequency + 4000.000000 + + root:/sys/bus/iio/devices/iio:device0> echo 1000 > sampling_frequency + 1000.000000 + +Set bandwidth for accelerometer channels: + +.. code-block:: bash + + root:/sys/bus/iio/devices/iio:device0> cat in_accel_filter_low_pass_3db_frequency + 0 + + root:/sys/bus/iio/devices/iio:device0> echo 100 > in_accel_filter_low_pass_3db_frequency + root:/sys/bus/iio/devices/iio:device0> cat in_accel_filter_low_pass_3db_frequency + 100 + +Show serial number: + +.. code-block:: bash + + root:/sys/kernel/debug/iio/iio:device0> cat serial_number + 0x000000b6 + +Show product id: + +.. code-block:: bash + + root:/sys/kernel/debug/iio/iio:device0> cat product_id + 16550 + +Show flash count: + +.. code-block:: bash + + root:/sys/kernel/debug/iio/iio:device0> cat flash_count + 13 + +Show firmware revision: + +.. code-block:: bash + + root:/sys/kernel/debug/iio/iio:device0> cat firmware_revision + 1.5 + +Show firmware date: + +.. code-block:: bash + + root:/sys/kernel/debug/iio/iio:device0> cat firmware_date + 28-04-2021 + +3. Device buffers +================= + +This driver supports IIO buffers. + +The device supports retrieving the raw acceleration, gyroscope, delta velocity, +delta angle and temperature measurements using buffers. + +However, when retrieving acceleration or gyroscope data using buffers, delta +readings will not be available and vice versa. This is because the device only +allows to read either acceleration and gyroscope data or delta velocity and +delta angle data at a time and switching between these two burst data selection +modes is time consuming. + +Usage examples +-------------- + +Set device trigger in current_trigger, if not already set: + +.. code-block:: bash + + root:/sys/bus/iio/devices/iio:device0> cat trigger/current_trigger + + root:/sys/bus/iio/devices/iio:device0> echo adis16550-dev0 > trigger/current_trigger + root:/sys/bus/iio/devices/iio:device0> cat trigger/current_trigger + adis16550-dev0 + +Select channels for buffer read: + +.. code-block:: bash + + root:/sys/bus/iio/devices/iio:device0> echo 1 > scan_elements/in_deltavelocity_x_en + root:/sys/bus/iio/devices/iio:device0> echo 1 > scan_elements/in_deltavelocity_y_en + root:/sys/bus/iio/devices/iio:device0> echo 1 > scan_elements/in_deltavelocity_z_en + root:/sys/bus/iio/devices/iio:device0> echo 1 > scan_elements/in_temp0_en + +Set the number of samples to be stored in the buffer: + +.. code-block:: bash + + root:/sys/bus/iio/devices/iio:device0> echo 10 > buffer/length + +Enable buffer readings: + +.. code-block:: bash + + root:/sys/bus/iio/devices/iio:device0> echo 1 > buffer/enable + +Obtain buffered data: + +.. code-block:: bash + + root:/sys/bus/iio/devices/iio:device0> hexdump -C /dev/iio\:device0 + ... + 0000cdf0 00 00 0d 2f 00 00 08 43 00 00 09 09 00 00 a4 5f |.../...C......._| + 0000ce00 00 00 0d 2f 00 00 07 de 00 00 08 db 00 00 a4 4b |.../...........K| + 0000ce10 00 00 0d 2f 00 00 07 58 00 00 08 a3 00 00 a4 55 |.../...X.......U| + 0000ce20 00 00 0d 2f 00 00 06 d6 00 00 08 5c 00 00 a4 62 |.../.......\...b| + 0000ce30 00 00 0d 2f 00 00 06 45 00 00 08 37 00 00 a4 47 |.../...E...7...G| + 0000ce40 00 00 0d 2f 00 00 05 d4 00 00 08 30 00 00 a3 fa |.../.......0....| + 0000ce50 00 00 0d 2f 00 00 05 d0 00 00 08 12 00 00 a3 d3 |.../............| + 0000ce60 00 00 0d 2f 00 00 05 dd 00 00 08 2e 00 00 a3 e9 |.../............| + 0000ce70 00 00 0d 2f 00 00 05 cc 00 00 08 51 00 00 a3 d5 |.../.......Q....| + 0000ce80 00 00 0d 2f 00 00 05 ba 00 00 08 22 00 00 a3 9a |.../......."....| + 0000ce90 00 00 0d 2f 00 00 05 9c 00 00 07 d9 00 00 a3 40 |.../...........@| + 0000cea0 00 00 0d 2f 00 00 05 68 00 00 07 94 00 00 a2 e4 |.../...h........| + 0000ceb0 00 00 0d 2f 00 00 05 25 00 00 07 8d 00 00 a2 ce |.../...%........| + ... + +See ``Documentation/iio/iio_devbuf.rst`` for more information about how buffered +data is structured. + +4. IIO Interfacing Tools +======================== + +See ``Documentation/iio/iio_tools.rst`` for the description of the available IIO +interfacing tools. diff --git a/Documentation/iio/adxl380.rst b/Documentation/iio/adxl380.rst index 376dee5fe1dd..66c8a4d4f767 100644 --- a/Documentation/iio/adxl380.rst +++ b/Documentation/iio/adxl380.rst @@ -94,7 +94,7 @@ apply the following formula: Where _offset and _scale are device attributes. If no _offset attribute is present, simply assume its value is 0. -The adis16475 driver offers data for 2 types of channels, the table below shows +The ADXL380 driver offers data for 2 types of channels, the table below shows the measurement units for the processed value, which are defined by the IIO framework: diff --git a/Documentation/iio/iio_adc.rst b/Documentation/iio/iio_adc.rst new file mode 100644 index 000000000000..f2f19a691907 --- /dev/null +++ b/Documentation/iio/iio_adc.rst @@ -0,0 +1,305 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +========================= +IIO Abstractions for ADCs +========================= + +1. Overview +=========== + +The IIO subsystem supports many Analog to Digital Converters (ADCs). Some ADCs +have features and characteristics that are supported in specific ways by IIO +device drivers. This documentation describes common ADC features and explains +how they are supported by the IIO subsystem. + +1. ADC Channel Types +==================== + +ADCs can have distinct types of inputs, each of them measuring analog voltages +in a slightly different way. An ADC digitizes the analog input voltage over a +span that is often given by the provided voltage reference, the input type, and +the input polarity. The input range allowed to an ADC channel is needed to +determine the scale factor and offset needed to obtain the measured value in +real-world units (millivolts for voltage measurement, milliamps for current +measurement, etc.). + +Elaborate designs may have nonlinear characteristics or integrated components +(such as amplifiers and reference buffers) that might also have to be considered +to derive the allowed input range for an ADC. For clarity, the sections below +assume the input range only depends on the provided voltage references, input +type, and input polarity. + +There are three general types of ADC inputs (single-ended, differential, +pseudo-differential) and two possible polarities (unipolar, bipolar). The input +type (single-ended, differential, pseudo-differential) is one channel +characteristic, and is completely independent of the polarity (unipolar, +bipolar) aspect. A comprehensive article about ADC input types (on which this +doc is heavily based on) can be found at +https://www.analog.com/en/resources/technical-articles/sar-adc-input-types.html. + +1.1 Single-ended channels +------------------------- + +Single-ended channels digitize the analog input voltage relative to ground and +can be either unipolar or bipolar. + +1.1.1 Single-ended Unipolar Channels +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:: + + ---------- VREF ------------- + ´ ` ´ ` _____________ + / \ / \ / | + / \ / \ --- < IN ADC | + \ / \ / \ | + `-´ `-´ \ VREF | + -------- GND (0V) ----------- +-----------+ + ^ + | + External VREF + +The input voltage to a **single-ended unipolar** channel is allowed to swing +from GND to VREF (where VREF is a voltage reference with electrical potential +higher than system ground). The maximum input voltage is also called VFS +(Voltage input Full-Scale), with VFS being determined by VREF. The voltage +reference may be provided from an external supply or derived from the chip power +source. + +A single-ended unipolar channel could be described in device tree like the +following example:: + + adc@0 { + ... + #address-cells = <1>; + #size-cells = <0>; + + channel@0 { + reg = <0>; + }; + }; + +One is always allowed to include ADC channel nodes in the device tree. Though, +if the device has a uniform set of inputs (e.g. all inputs are single-ended), +then declaring the channel nodes is optional. + +One caveat for devices that support mixed single-ended and differential channels +is that single-ended channel nodes also need to provide a ``single-channel`` +property when ``reg`` is an arbitrary number that doesn't match the input pin +number. + +See ``Documentation/devicetree/bindings/iio/adc/adc.yaml`` for the complete +documentation of ADC specific device tree properties. + + +1.1.2 Single-ended Bipolar Channels +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:: + + ---------- +VREF ------------ + ´ ` ´ ` _____________________ + / \ / \ / | + / \ / \ --- < IN ADC | + \ / \ / \ | + `-´ `-´ \ +VREF -VREF | + ---------- -VREF ------------ +-------------------+ + ^ ^ + | | + External +VREF ------+ External -VREF + +For a **single-ended bipolar** channel, the analog voltage input can go from +-VREF to +VREF (where -VREF is the voltage reference that has the lower +electrical potential while +VREF is the reference with the higher one). Some ADC +chips derive the lower reference from +VREF, others get it from a separate +input. Often, +VREF and -VREF are symmetric but they don't need to be so. When +-VREF is lower than system ground, these inputs are also called single-ended +true bipolar. Also, while there is a relevant difference between bipolar and +true bipolar from the electrical perspective, IIO makes no explicit distinction +between them. + +Here's an example device tree description of a single-ended bipolar channel:: + + adc@0 { + ... + #address-cells = <1>; + #size-cells = <0>; + + channel@0 { + reg = <0>; + bipolar; + }; + }; + +1.2 Differential channels +------------------------- + +A differential voltage measurement digitizes the voltage level at the positive +input (IN+) relative to the negative input (IN-) over the -VREF to +VREF span. +In other words, a differential channel measures the potential difference between +IN+ and IN-, which is often denoted by the IN+ - IN- formula. + +1.2.1 Differential Bipolar Channels +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:: + + -------- +VREF ------ +-------------------+ + ´ ` ´ ` / | + / \ / \ / --- < IN+ | + `-´ `-´ | | + -------- -VREF ------ | | + | ADC | + -------- +VREF ------ | | + ´ ` ´ ` | | + \ / \ / \ --- < IN- | + `-´ `-´ \ +VREF -VREF | + -------- -VREF ------ +-------------------+ + ^ ^ + | +---- External -VREF + External +VREF + +The analog signals to **differential bipolar** inputs are also allowed to swing +from -VREF to +VREF. The bipolar part of the name means that the resulting value +of the difference (IN+ - IN-) can be positive or negative. If -VREF is below +system GND, these are also called differential true bipolar inputs. + +Device tree example of a differential bipolar channel:: + + adc@0 { + ... + #address-cells = <1>; + #size-cells = <0>; + + channel@0 { + reg = <0>; + bipolar; + diff-channels = <0 1>; + }; + }; + +In the ADC driver, ``differential = 1`` is set into ``struct iio_chan_spec`` for +the channel. Even though, there are three general input types, ``differential`` +is only used to distinguish between differential and non-differential (either +single-ended or pseudo-differential) input types. See +``include/linux/iio/iio.h`` for more information. + +1.2.2 Differential Unipolar Channels +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +For **differential unipolar** channels, the analog voltage at the positive input +must also be higher than the voltage at the negative input. Thus, the actual +input range allowed to a differential unipolar channel is IN- to +VREF. Because +IN+ is allowed to swing with the measured analog signal and the input setup must +guarantee IN+ will not go below IN- (nor IN- will raise above IN+), most +differential unipolar channel setups have IN- fixed to a known voltage that does +not fall within the voltage range expected for the measured signal. That leads +to a setup that is equivalent to a pseudo-differential channel. Thus, +differential unipolar setups can often be supported as pseudo-differential +unipolar channels. + +1.3 Pseudo-differential Channels +-------------------------------- + +There is a third ADC input type which is called pseudo-differential or +single-ended to differential configuration. A pseudo-differential channel is +similar to a differential channel in that it also measures IN+ relative to IN-. +However, unlike bipolar differential channels, the negative input is limited to +a narrow voltage range (taken as a constant voltage) while only IN+ is allowed +to swing. A pseudo-differential channel can be made out from a differential pair +of inputs by restricting the negative input to a known voltage while allowing +only the positive input to swing. Sometimes, the input provided to IN- is called +common-mode voltage. Besides, some parts have a COM pin that allows single-ended +inputs to be referenced to a common-mode voltage, making them +pseudo-differential channels. Often, the common mode input voltage can be +described in the device tree as a voltage regulator (e.g. ``com-supply``) since +it is basically a constant voltage source. + +1.3.1 Pseudo-differential Unipolar Channels +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:: + + -------- +VREF ------ +-------------------+ + ´ ` ´ ` / | + / \ / \ / --- < IN+ | + `-´ `-´ | | + --------- IN- ------- | ADC | + | | + Common-mode voltage --> --- < IN- | + \ +VREF -VREF | + +-------------------+ + ^ ^ + | +---- External -VREF + External +VREF + +A **pseudo-differential unipolar** input has the limitations a differential +unipolar channel would have, meaning the analog voltage to the positive input +IN+ must stay within IN- to +VREF. The fixed voltage to IN- is often called +common-mode voltage and it must be within -VREF to +VREF as would be expected +from the signal to any differential channel negative input. + +The voltage measured from IN+ is relative to IN- but, unlike differential +channels, pseudo-differential setups are intended to gauge single-ended input +signals. To enable applications to calculate IN+ voltage with respect to system +ground, the IIO channel may provide an ``_offset`` sysfs attribute to be added +to ADC output when converting raw data to voltage units. In many setups, the +common-mode voltage input is at GND level and the ``_offset`` attribute is +omitted due to being always zero. + +Device tree example for pseudo-differential unipolar channel:: + + adc@0 { + ... + #address-cells = <1>; + #size-cells = <0>; + + channel@0 { + reg = <0>; + single-channel = <0>; + common-mode-channel = <1>; + }; + }; + +Do not set ``differential`` in the channel ``iio_chan_spec`` struct of +pseudo-differential channels. + +1.3.2 Pseudo-differential Bipolar Channels +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:: + + -------- +VREF ------ +-------------------+ + ´ ` ´ ` / | + / \ / \ / --- < IN+ | + `-´ `-´ | | + -------- -VREF ------ | ADC | + | | + Common-mode voltage --> --- < IN- | + \ +VREF -VREF | + +-------------------+ + ^ ^ + | +---- External -VREF + External +VREF + +A **pseudo-differential bipolar** input is not limited by the level at IN- but +it will be limited to -VREF or to GND on the lower end of the input range +depending on the particular ADC. Similar to their unipolar counter parts, +pseudo-differential bipolar channels ought to declare an ``_offset`` attribute +to enable the conversion of raw ADC data to voltage units. For the setup with +IN- connected to GND, ``_offset`` is often omitted. + +Device tree example for pseudo-differential bipolar channel:: + + adc@0 { + ... + #address-cells = <1>; + #size-cells = <0>; + + channel@0 { + reg = <0>; + bipolar; + single-channel = <0>; + common-mode-channel = <1>; + }; + }; diff --git a/Documentation/iio/index.rst b/Documentation/iio/index.rst index 5710f5b9e958..bbb2edce8272 100644 --- a/Documentation/iio/index.rst +++ b/Documentation/iio/index.rst @@ -7,6 +7,7 @@ Industrial I/O .. toctree:: :maxdepth: 1 + iio_adc iio_configfs iio_devbuf iio_dmabuf_api @@ -19,13 +20,16 @@ Industrial I/O Kernel Drivers :maxdepth: 1 ad4000 + ad4030 ad4695 + ad7191 ad7380 ad7606 ad7625 ad7944 adis16475 adis16480 + adis16550 adxl380 bno055 ep93xx_adc diff --git a/Documentation/kbuild/bash-completion.rst b/Documentation/kbuild/bash-completion.rst new file mode 100644 index 000000000000..2b52dbcd0933 --- /dev/null +++ b/Documentation/kbuild/bash-completion.rst @@ -0,0 +1,65 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +========================== +Bash completion for Kbuild +========================== + +The kernel build system is written using Makefiles, and Bash completion +for the `make` command is available through the `bash-completion`_ project. + +However, the Makefiles for the kernel build are complex. The generic completion +rules for the `make` command do not provide meaningful suggestions for the +kernel build system, except for the options of the `make` command itself. + +To enhance completion for various variables and targets, the kernel source +includes its own completion script at `scripts/bash-completion/make`. + +This script provides additional completions when working within the kernel tree. +Outside the kernel tree, it defaults to the generic completion rules for the +`make` command. + +Prerequisites +============= + +The script relies on helper functions provided by `bash-completion`_ project. +Please ensure it is installed on your system. On most distributions, you can +install the `bash-completion` package through the standard package manager. + +How to use +========== + +You can source the script directly:: + + $ source scripts/bash-completion/make + +Or, you can copy it into the search path for Bash completion scripts. +For example:: + + $ mkdir -p ~/.local/share/bash-completion/completions + $ cp scripts/bash-completion/make ~/.local/share/bash-completion/completions/ + +Details +======= + +The additional completion for Kbuild is enabled in the following cases: + + - You are in the root directory of the kernel source. + - You are in the top-level build directory created by the O= option + (checked via the `source` symlink pointing to the kernel source). + - The -C make option specifies the kernel source or build directory. + - The -f make option specifies a file in the kernel source or build directory. + +If none of the above are met, it falls back to the generic completion rules. + +The completion supports: + + - Commonly used targets, such as `all`, `menuconfig`, `dtbs`, etc. + - Make (or environment) variables, such as `ARCH`, `LLVM`, etc. + - Single-target builds (`foo/bar/baz.o`) + - Configuration files (`*_defconfig` and `*.config`) + +Some variables offer intelligent behavior. For instance, `CROSS_COMPILE=` +followed by a TAB displays installed toolchains. The list of defconfig files +shown depends on the value of the `ARCH=` variable. + +.. _bash-completion: https://github.com/scop/bash-completion/ diff --git a/Documentation/kbuild/index.rst b/Documentation/kbuild/index.rst index e82af05cd652..3731ab22bfe7 100644 --- a/Documentation/kbuild/index.rst +++ b/Documentation/kbuild/index.rst @@ -23,6 +23,8 @@ Kernel Build System llvm gendwarfksyms + bash-completion + .. only:: subproject and html Indices diff --git a/Documentation/kbuild/kconfig-language.rst b/Documentation/kbuild/kconfig-language.rst index 2619fdf56e68..a91abb8f6840 100644 --- a/Documentation/kbuild/kconfig-language.rst +++ b/Documentation/kbuild/kconfig-language.rst @@ -194,16 +194,6 @@ applicable everywhere (see syntax). ability to hook into a secondary subsystem while allowing the user to configure that subsystem out without also having to unset these drivers. - Note: If the combination of FOO=y and BAZ=m causes a link error, - you can guard the function call with IS_REACHABLE():: - - foo_init() - { - if (IS_REACHABLE(CONFIG_BAZ)) - baz_register(&foo); - ... - } - Note: If the feature provided by BAZ is highly desirable for FOO, FOO should imply not only BAZ, but also its dependency BAR:: @@ -588,7 +578,9 @@ uses the slightly counterintuitive:: depends on BAR || !BAR This means that there is either a dependency on BAR that disallows -the combination of FOO=y with BAR=m, or BAR is completely disabled. +the combination of FOO=y with BAR=m, or BAR is completely disabled. The BAR +module must provide all the stubs for !BAR case. + For a more formalized approach if there are multiple drivers that have the same dependency, a helper symbol can be used, like:: @@ -599,6 +591,21 @@ the same dependency, a helper symbol can be used, like:: config BAR_OPTIONAL def_tristate BAR || !BAR +Much less favorable way to express optional dependency is IS_REACHABLE() within +the module code, useful for example when the module BAR does not provide +!BAR stubs:: + + foo_init() + { + if (IS_REACHABLE(CONFIG_BAR)) + bar_register(&foo); + ... + } + +IS_REACHABLE() is generally discouraged, because the code will be silently +discarded, when CONFIG_BAR=m and this code is built-in. This is not what users +usually expect when enabling BAR as module. + Kconfig recursive dependency limitations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/Documentation/kbuild/makefiles.rst b/Documentation/kbuild/makefiles.rst index d36519f194dc..3b9a8bc671e2 100644 --- a/Documentation/kbuild/makefiles.rst +++ b/Documentation/kbuild/makefiles.rst @@ -318,9 +318,6 @@ ccflags-y, asflags-y and ldflags-y These three flags apply only to the kbuild makefile in which they are assigned. They are used for all the normal cc, as and ld invocations happening during a recursive build. - Note: Flags with the same behaviour were previously named: - EXTRA_CFLAGS, EXTRA_AFLAGS and EXTRA_LDFLAGS. - They are still supported but their usage is deprecated. ccflags-y specifies options for compiling with $(CC). @@ -670,6 +667,20 @@ cc-cross-prefix endif endif +$(RUSTC) support functions +-------------------------- + +rustc-min-version + rustc-min-version tests if the value of $(CONFIG_RUSTC_VERSION) is greater + than or equal to the provided value and evaluates to y if so. + + Example:: + + rustflags-$(call rustc-min-version, 108500) := -Cfoo + + In this example, rustflags-y will be assigned the value -Cfoo if + $(CONFIG_RUSTC_VERSION) is >= 1.85.0. + $(LD) support functions ----------------------- diff --git a/Documentation/kbuild/modules.rst b/Documentation/kbuild/modules.rst index a42f00d8cb90..d0703605bfa4 100644 --- a/Documentation/kbuild/modules.rst +++ b/Documentation/kbuild/modules.rst @@ -318,7 +318,7 @@ Several Subdirectories | |__ include | |__ hardwareif.h |__ include - |__ complex.h + |__ complex.h To build the module complex.ko, we then need the following kbuild file:: diff --git a/Documentation/kbuild/reproducible-builds.rst b/Documentation/kbuild/reproducible-builds.rst index f2dcc39044e6..a7762486c93f 100644 --- a/Documentation/kbuild/reproducible-builds.rst +++ b/Documentation/kbuild/reproducible-builds.rst @@ -46,21 +46,6 @@ The kernel embeds the building user and host names in `KBUILD_BUILD_USER and KBUILD_BUILD_HOST`_ variables. If you are building from a git commit, you could use its committer address. -Absolute filenames ------------------- - -When the kernel is built out-of-tree, debug information may include -absolute filenames for the source files. This must be overridden by -including the ``-fdebug-prefix-map`` option in the `KCFLAGS`_ variable. - -Depending on the compiler used, the ``__FILE__`` macro may also expand -to an absolute filename in an out-of-tree build. Kbuild automatically -uses the ``-fmacro-prefix-map`` option to prevent this, if it is -supported. - -The Reproducible Builds web site has more information about these -`prefix-map options`_. - Generated files in source packages ---------------------------------- @@ -131,7 +116,5 @@ See ``scripts/setlocalversion`` for details. .. _KBUILD_BUILD_TIMESTAMP: kbuild.html#kbuild-build-timestamp .. _KBUILD_BUILD_USER and KBUILD_BUILD_HOST: kbuild.html#kbuild-build-user-kbuild-build-host -.. _KCFLAGS: kbuild.html#kcflags -.. _prefix-map options: https://reproducible-builds.org/docs/build-path/ .. _Reproducible Builds project: https://reproducible-builds.org/ .. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/ diff --git a/Documentation/mm/balance.rst b/Documentation/mm/balance.rst index abaa78561c31..c4962c89a7f5 100644 --- a/Documentation/mm/balance.rst +++ b/Documentation/mm/balance.rst @@ -81,7 +81,7 @@ Page stealing from process memory and shm is done if stealing the page would alleviate memory pressure on any zone in the page's node that has fallen below its watermark. -watemark[WMARK_MIN/WMARK_LOW/WMARK_HIGH]/low_on_memory/zone_wake_kswapd: These +watermark[WMARK_MIN/WMARK_LOW/WMARK_HIGH]/low_on_memory/zone_wake_kswapd: These are per-zone fields, used to determine when a zone needs to be balanced. When the number of pages falls below watermark[WMARK_MIN], the hysteric field low_on_memory gets set. This stays set till the number of free pages becomes diff --git a/Documentation/mm/damon/design.rst b/Documentation/mm/damon/design.rst index e28c6a1b40ae..f12d33749329 100644 --- a/Documentation/mm/damon/design.rst +++ b/Documentation/mm/damon/design.rst @@ -313,6 +313,10 @@ sufficient for the given purpose, it shouldn't be unnecessarily further lowered. It is recommended to be set proportional to ``aggregation interval``. By default, the ratio is set as ``1/20``, and it is still recommended. +Based on the manual tuning guide, DAMON provides more intuitive knob-based +intervals auto tuning mechanism. Please refer to :ref:`the design document of +the feature <damon_design_monitoring_intervals_autotuning>` for detail. + Refer to below documents for an example tuning based on the above guide. .. toctree:: @@ -321,6 +325,52 @@ Refer to below documents for an example tuning based on the above guide. monitoring_intervals_tuning_example +.. _damon_design_monitoring_intervals_autotuning: + +Monitoring Intervals Auto-tuning +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +DAMON provides automatic tuning of the ``sampling interval`` and ``aggregation +interval`` based on the :ref:`the tuning guide idea +<damon_design_monitoring_params_tuning_guide>`. The tuning mechanism allows +users to set the aimed amount of access events to observe via DAMON within +given time interval. The target can be specified by the user as a ratio of +DAMON-observed access events to the theoretical maximum amount of the events +(``access_bp``) that measured within a given number of aggregations +(``aggrs``). + +The DAMON-observed access events are calculated in byte granularity based on +DAMON :ref:`region assumption <damon_design_region_based_sampling>`. For +example, if a region of size ``X`` bytes of ``Y`` ``nr_accesses`` is found, it +means ``X * Y`` access events are observed by DAMON. Theoretical maximum +access events for the region is calculated in same way, but replacing ``Y`` +with theoretical maximum ``nr_accesses``, which can be calculated as +``aggregation interval / sampling interval``. + +The mechanism calculates the ratio of access events for ``aggrs`` aggregations, +and increases or decrease the ``sampleing interval`` and ``aggregation +interval`` in same ratio, if the observed access ratio is lower or higher than +the target, respectively. The ratio of the intervals change is decided in +proportion to the distance between current samples ratio and the target ratio. + +The user can further set the minimum and maximum ``sampling interval`` that can +be set by the tuning mechanism using two parameters (``min_sample_us`` and +``max_sample_us``). Because the tuning mechanism changes ``sampling interval`` +and ``aggregation interval`` in same ratio always, the minimum and maximum +``aggregation interval`` after each of the tuning changes can automatically set +together. + +The tuning is turned off by default, and need to be set explicitly by the user. +As a rule of thumbs and the Parreto principle, 4% access samples ratio target +is recommended. Note that Parreto principle (80/20 rule) has applied twice. +That is, assumes 4% (20% of 20%) DAMON-observed access events ratio (source) +to capture 64% (80% multipled by 80%) real access events (outcomes). + +To know how user-space can use this feature via :ref:`DAMON sysfs interface +<sysfs_interface>`, refer to :ref:`intervals_goal <sysfs_scheme>` part of +the documentation. + + .. _damon_design_damos: Operation Schemes @@ -569,11 +619,22 @@ number of filters for each scheme. Each filter specifies - whether it is to allow (include) or reject (exclude) applying the scheme's action to the memory (``allow``). -When multiple filters are installed, each filter is evaluated in the installed -order. If a part of memory is matched to one of the filter, next filters are -ignored. If the memory passes through the filters evaluation stage because it -is not matched to any of the filters, applying the scheme's action to it is -allowed, same to the behavior when no filter exists. +For efficient handling of filters, some types of filters are handled by the +core layer, while others are handled by operations set. In the latter case, +hence, support of the filter types depends on the DAMON operations set. In +case of the core layer-handled filters, the memory regions that excluded by the +filter are not counted as the scheme has tried to the region. In contrast, if +a memory regions is filtered by an operations set layer-handled filter, it is +counted as the scheme has tried. This difference affects the statistics. + +When multiple filters are installed, the group of filters that handled by the +core layer are evaluated first. After that, the group of filters that handled +by the operations layer are evaluated. Filters in each of the groups are +evaluated in the installed order. If a part of memory is matched to one of the +filter, next filters are ignored. If the part passes through the filters +evaluation stage because it is not matched to any of the filters, applying the +scheme's action to it depends on the last filter's allowance type. If the last +filter was for allowing, the part of memory will be rejected, and vice versa. For example, let's assume 1) a filter for allowing anonymous pages and 2) another filter for rejecting young pages are installed in the order. If a page @@ -585,39 +646,29 @@ second reject-filter blocks it. If the page is neither anonymous nor young, the page will pass through the filters evaluation stage since there is no matching filter, and the action will be applied to the page. -Note that the action can equally be applied to memory that either explicitly -filter-allowed or filters evaluation stage passed. It means that installing -allow-filters at the end of the list makes no practical change but only -filters-checking overhead. - -For efficient handling of filters, some types of filters are handled by the -core layer, while others are handled by operations set. In the latter case, -hence, support of the filter types depends on the DAMON operations set. In -case of the core layer-handled filters, the memory regions that excluded by the -filter are not counted as the scheme has tried to the region. In contrast, if -a memory regions is filtered by an operations set layer-handled filter, it is -counted as the scheme has tried. This difference affects the statistics. - Below ``type`` of filters are currently supported. -- anonymous page - - Applied to pages that containing data that not stored in files. - - Handled by operations set layer. Supported by only ``paddr`` set. -- memory cgroup - - Applied to pages that belonging to a given cgroup. - - Handled by operations set layer. Supported by only ``paddr`` set. -- young page - - Applied to pages that are accessed after the last access check from the - scheme. - - Handled by operations set layer. Supported by only ``paddr`` set. -- address range - - Applied to pages that belonging to a given address range. - - Handled by the core logic. -- DAMON monitoring target - - Applied to pages that belonging to a given DAMON monitoring target. - - Handled by the core logic. - -To know how user-space can set the watermarks via :ref:`DAMON sysfs interface +- Core layer handled + - addr + - Applied to pages that belonging to a given address range. + - target + - Applied to pages that belonging to a given DAMON monitoring target. +- Operations layer handled, supported by only ``paddr`` operations set. + - anon + - Applied to pages that containing data that not stored in files. + - active + - Applied to active pages. + - memcg + - Applied to pages that belonging to a given cgroup. + - young + - Applied to pages that are accessed after the last access check from the + scheme. + - hugepage_size + - Applied to pages that managed in a given size range. + - unmapped + - Applied to pages that unmapped. + +To know how user-space can set the filters via :ref:`DAMON sysfs interface <sysfs_interface>`, refer to :ref:`filters <sysfs_filters>` part of the documentation. diff --git a/Documentation/mm/damon/monitoring_intervals_tuning_example.rst b/Documentation/mm/damon/monitoring_intervals_tuning_example.rst index 334a854efb40..7207cbed591f 100644 --- a/Documentation/mm/damon/monitoring_intervals_tuning_example.rst +++ b/Documentation/mm/damon/monitoring_intervals_tuning_example.rst @@ -36,7 +36,7 @@ Then, list the DAMON-found regions of different access patterns, sorted by the "access temperature". "Access temperature" is a metric representing the access-hotness of a region. It is calculated as a weighted sum of the access frequency and the age of the region. If the access frequency is 0 %, the -temperature is multipled by minus one. That is, if a region is not accessed, +temperature is multiplied by minus one. That is, if a region is not accessed, it gets minus temperature and it gets lower as not accessed for longer time. The sorting is in temperature-ascendint order, so the region at the top of the list is the coldest, and the one at the bottom is the hottest one. :: @@ -58,11 +58,11 @@ list is the coldest, and the one at the bottom is the hottest one. :: The list shows not seemingly hot regions, and only minimum access pattern diversity. Every region has zero access frequency. The number of region is 10, which is the default ``min_nr_regions value``. Size of each region is also -nearly idential. We can suspect this is because “adaptive regions adjustment” +nearly identical. We can suspect this is because “adaptive regions adjustment” mechanism was not well working. As the guide suggested, we can get relative hotness of regions using ``age`` as the recency information. That would be better than nothing, but given the fact that the longest age is only about 6 -seconds while we waited about ten minuts, it is unclear how useful this will +seconds while we waited about ten minutes, it is unclear how useful this will be. The temperature ranges to total size of regions of each range histogram @@ -190,7 +190,7 @@ for sampling and aggregation intervals, respectively). :: The number of regions having different access patterns has significantly increased. Size of each region is also more varied. Total size of non-zero access frequency regions is also significantly increased. Maybe this is already -good enough to make some meaningful memory management efficieny changes. +good enough to make some meaningful memory management efficiency changes. 800ms/16s intervals: Another bias ================================= diff --git a/Documentation/mm/hmm.rst b/Documentation/mm/hmm.rst index f6d53c37a2ca..7d61b7a8b65b 100644 --- a/Documentation/mm/hmm.rst +++ b/Documentation/mm/hmm.rst @@ -400,7 +400,7 @@ Exclusive access memory Some devices have features such as atomic PTE bits that can be used to implement atomic access to system memory. To support atomic operations to a shared virtual memory page such a device needs access to that page which is exclusive of any -userspace access from the CPU. The ``make_device_exclusive_range()`` function +userspace access from the CPU. The ``make_device_exclusive()`` function can be used to make a memory range inaccessible from userspace. This replaces all mappings for pages in the given range with special swap diff --git a/Documentation/mm/index.rst b/Documentation/mm/index.rst index 0be1c7503a01..d3ada3e45e10 100644 --- a/Documentation/mm/index.rst +++ b/Documentation/mm/index.rst @@ -62,5 +62,4 @@ documentation, or deleted if it has served its purpose. unevictable-lru vmalloced-kernel-stacks vmemmap_dedup - z3fold zsmalloc diff --git a/Documentation/mm/physical_memory.rst b/Documentation/mm/physical_memory.rst index 71fd4a6acf42..d3ac106e6b14 100644 --- a/Documentation/mm/physical_memory.rst +++ b/Documentation/mm/physical_memory.rst @@ -338,10 +338,272 @@ Statistics Zones ===== +As we have mentioned, each zone in memory is described by a ``struct zone`` +which is an element of the ``node_zones`` array of the node it belongs to. +``struct zone`` is the core data structure of the page allocator. A zone +represents a range of physical memory and may have holes. + +The page allocator uses the GFP flags, see :ref:`mm-api-gfp-flags`, specified by +a memory allocation to determine the highest zone in a node from which the +memory allocation can allocate memory. The page allocator first allocates memory +from that zone, if the page allocator can't allocate the requested amount of +memory from the zone, it will allocate memory from the next lower zone in the +node, the process continues up to and including the lowest zone. For example, if +a node contains ``ZONE_DMA32``, ``ZONE_NORMAL`` and ``ZONE_MOVABLE`` and the +highest zone of a memory allocation is ``ZONE_MOVABLE``, the order of the zones +from which the page allocator allocates memory is ``ZONE_MOVABLE`` > +``ZONE_NORMAL`` > ``ZONE_DMA32``. + +At runtime, free pages in a zone are in the Per-CPU Pagesets (PCP) or free areas +of the zone. The Per-CPU Pagesets are a vital mechanism in the kernel's memory +management system. By handling most frequent allocations and frees locally on +each CPU, the Per-CPU Pagesets improve performance and scalability, especially +on systems with many cores. The page allocator in the kernel employs a two-step +strategy for memory allocation, starting with the Per-CPU Pagesets before +falling back to the buddy allocator. Pages are transferred between the Per-CPU +Pagesets and the global free areas (managed by the buddy allocator) in batches. +This minimizes the overhead of frequent interactions with the global buddy +allocator. + +Architecture specific code calls free_area_init() to initializes zones. + +Zone structure +-------------- +The zones structure ``struct zone`` is defined in ``include/linux/mmzone.h``. +Here we briefly describe fields of this structure: -.. admonition:: Stub +General +~~~~~~~ - This section is incomplete. Please list and describe the appropriate fields. +``_watermark`` + The watermarks for this zone. When the amount of free pages in a zone is below + the min watermark, boosting is ignored, an allocation may trigger direct + reclaim and direct compaction, it is also used to throttle direct reclaim. + When the amount of free pages in a zone is below the low watermark, kswapd is + woken up. When the amount of free pages in a zone is above the high watermark, + kswapd stops reclaiming (a zone is balanced) when the + ``NUMA_BALANCING_MEMORY_TIERING`` bit of ``sysctl_numa_balancing_mode`` is not + set. The promo watermark is used for memory tiering and NUMA balancing. When + the amount of free pages in a zone is above the promo watermark, kswapd stops + reclaiming when the ``NUMA_BALANCING_MEMORY_TIERING`` bit of + ``sysctl_numa_balancing_mode`` is set. The watermarks are set by + ``__setup_per_zone_wmarks()``. The min watermark is calculated according to + ``vm.min_free_kbytes`` sysctl. The other three watermarks are set according + to the distance between two watermarks. The distance itself is calculated + taking ``vm.watermark_scale_factor`` sysctl into account. + +``watermark_boost`` + The number of pages which are used to boost watermarks to increase reclaim + pressure to reduce the likelihood of future fallbacks and wake kswapd now + as the node may be balanced overall and kswapd will not wake naturally. + +``nr_reserved_highatomic`` + The number of pages which are reserved for high-order atomic allocations. + +``nr_free_highatomic`` + The number of free pages in reserved highatomic pageblocks + +``lowmem_reserve`` + The array of the amounts of the memory reserved in this zone for memory + allocations. For example, if the highest zone a memory allocation can + allocate memory from is ``ZONE_MOVABLE``, the amount of memory reserved in + this zone for this allocation is ``lowmem_reserve[ZONE_MOVABLE]`` when + attempting to allocate memory from this zone. This is a mechanism the page + allocator uses to prevent allocations which could use ``highmem`` from using + too much ``lowmem``. For some specialised workloads on ``highmem`` machines, + it is dangerous for the kernel to allow process memory to be allocated from + the ``lowmem`` zone. This is because that memory could then be pinned via the + ``mlock()`` system call, or by unavailability of swapspace. + ``vm.lowmem_reserve_ratio`` sysctl determines how aggressive the kernel is in + defending these lower zones. This array is recalculated by + ``setup_per_zone_lowmem_reserve()`` at runtime if ``vm.lowmem_reserve_ratio`` + sysctl changes. + +``node`` + The index of the node this zone belongs to. Available only when + ``CONFIG_NUMA`` is enabled because there is only one zone in a UMA system. + +``zone_pgdat`` + Pointer to the ``struct pglist_data`` of the node this zone belongs to. + +``per_cpu_pageset`` + Pointer to the Per-CPU Pagesets (PCP) allocated and initialized by + ``setup_zone_pageset()``. By handling most frequent allocations and frees + locally on each CPU, PCP improves performance and scalability on systems with + many cores. + +``pageset_high_min`` + Copied to the ``high_min`` of the Per-CPU Pagesets for faster access. + +``pageset_high_max`` + Copied to the ``high_max`` of the Per-CPU Pagesets for faster access. + +``pageset_batch`` + Copied to the ``batch`` of the Per-CPU Pagesets for faster access. The + ``batch``, ``high_min`` and ``high_max`` of the Per-CPU Pagesets are used to + calculate the number of elements the Per-CPU Pagesets obtain from the buddy + allocator under a single hold of the lock for efficiency. They are also used + to decide if the Per-CPU Pagesets return pages to the buddy allocator in page + free process. + +``pageblock_flags`` + The pointer to the flags for the pageblocks in the zone (see + ``include/linux/pageblock-flags.h`` for flags list). The memory is allocated + in ``setup_usemap()``. Each pageblock occupies ``NR_PAGEBLOCK_BITS`` bits. + Defined only when ``CONFIG_FLATMEM`` is enabled. The flags is stored in + ``mem_section`` when ``CONFIG_SPARSEMEM`` is enabled. + +``zone_start_pfn`` + The start pfn of the zone. It is initialized by + ``calculate_node_totalpages()``. + +``managed_pages`` + The present pages managed by the buddy system, which is calculated as: + ``managed_pages`` = ``present_pages`` - ``reserved_pages``, ``reserved_pages`` + includes pages allocated by the memblock allocator. It should be used by page + allocator and vm scanner to calculate all kinds of watermarks and thresholds. + It is accessed using ``atomic_long_xxx()`` functions. It is initialized in + ``free_area_init_core()`` and then is reinitialized when memblock allocator + frees pages into buddy system. + +``spanned_pages`` + The total pages spanned by the zone, including holes, which is calculated as: + ``spanned_pages`` = ``zone_end_pfn`` - ``zone_start_pfn``. It is initialized + by ``calculate_node_totalpages()``. + +``present_pages`` + The physical pages existing within the zone, which is calculated as: + ``present_pages`` = ``spanned_pages`` - ``absent_pages`` (pages in holes). It + may be used by memory hotplug or memory power management logic to figure out + unmanaged pages by checking (``present_pages`` - ``managed_pages``). Write + access to ``present_pages`` at runtime should be protected by + ``mem_hotplug_begin/done()``. Any reader who can't tolerant drift of + ``present_pages`` should use ``get_online_mems()`` to get a stable value. It + is initialized by ``calculate_node_totalpages()``. + +``present_early_pages`` + The present pages existing within the zone located on memory available since + early boot, excluding hotplugged memory. Defined only when + ``CONFIG_MEMORY_HOTPLUG`` is enabled and initialized by + ``calculate_node_totalpages()``. + +``cma_pages`` + The pages reserved for CMA use. These pages behave like ``ZONE_MOVABLE`` when + they are not used for CMA. Defined only when ``CONFIG_CMA`` is enabled. + +``name`` + The name of the zone. It is a pointer to the corresponding element of + the ``zone_names`` array. + +``nr_isolate_pageblock`` + Number of isolated pageblocks. It is used to solve incorrect freepage counting + problem due to racy retrieving migratetype of pageblock. Protected by + ``zone->lock``. Defined only when ``CONFIG_MEMORY_ISOLATION`` is enabled. + +``span_seqlock`` + The seqlock to protect ``zone_start_pfn`` and ``spanned_pages``. It is a + seqlock because it has to be read outside of ``zone->lock``, and it is done in + the main allocator path. However, the seqlock is written quite infrequently. + Defined only when ``CONFIG_MEMORY_HOTPLUG`` is enabled. + +``initialized`` + The flag indicating if the zone is initialized. Set by + ``init_currently_empty_zone()`` during boot. + +``free_area`` + The array of free areas, where each element corresponds to a specific order + which is a power of two. The buddy allocator uses this structure to manage + free memory efficiently. When allocating, it tries to find the smallest + sufficient block, if the smallest sufficient block is larger than the + requested size, it will be recursively split into the next smaller blocks + until the required size is reached. When a page is freed, it may be merged + with its buddy to form a larger block. It is initialized by + ``zone_init_free_lists()``. + +``unaccepted_pages`` + The list of pages to be accepted. All pages on the list are ``MAX_PAGE_ORDER``. + Defined only when ``CONFIG_UNACCEPTED_MEMORY`` is enabled. + +``flags`` + The zone flags. The least three bits are used and defined by + ``enum zone_flags``. ``ZONE_BOOSTED_WATERMARK`` (bit 0): zone recently boosted + watermarks. Cleared when kswapd is woken. ``ZONE_RECLAIM_ACTIVE`` (bit 1): + kswapd may be scanning the zone. ``ZONE_BELOW_HIGH`` (bit 2): zone is below + high watermark. + +``lock`` + The main lock that protects the internal data structures of the page allocator + specific to the zone, especially protects ``free_area``. + +``percpu_drift_mark`` + When free pages are below this point, additional steps are taken when reading + the number of free pages to avoid per-cpu counter drift allowing watermarks + to be breached. It is updated in ``refresh_zone_stat_thresholds()``. + +Compaction control +~~~~~~~~~~~~~~~~~~ + +``compact_cached_free_pfn`` + The PFN where compaction free scanner should start in the next scan. + +``compact_cached_migrate_pfn`` + The PFNs where compaction migration scanner should start in the next scan. + This array has two elements: the first one is used in ``MIGRATE_ASYNC`` mode, + and the other one is used in ``MIGRATE_SYNC`` mode. + +``compact_init_migrate_pfn`` + The initial migration PFN which is initialized to 0 at boot time, and to the + first pageblock with migratable pages in the zone after a full compaction + finishes. It is used to check if a scan is a whole zone scan or not. + +``compact_init_free_pfn`` + The initial free PFN which is initialized to 0 at boot time and to the last + pageblock with free ``MIGRATE_MOVABLE`` pages in the zone. It is used to check + if it is the start of a scan. + +``compact_considered`` + The number of compactions attempted since last failure. It is reset in + ``defer_compaction()`` when a compaction fails to result in a page allocation + success. It is increased by 1 in ``compaction_deferred()`` when a compaction + should be skipped. ``compaction_deferred()`` is called before + ``compact_zone()`` is called, ``compaction_defer_reset()`` is called when + ``compact_zone()`` returns ``COMPACT_SUCCESS``, ``defer_compaction()`` is + called when ``compact_zone()`` returns ``COMPACT_PARTIAL_SKIPPED`` or + ``COMPACT_COMPLETE``. + +``compact_defer_shift`` + The number of compactions skipped before trying again is + ``1<<compact_defer_shift``. It is increased by 1 in ``defer_compaction()``. + It is reset in ``compaction_defer_reset()`` when a direct compaction results + in a page allocation success. Its maximum value is ``COMPACT_MAX_DEFER_SHIFT``. + +``compact_order_failed`` + The minimum compaction failed order. It is set in ``compaction_defer_reset()`` + when a compaction succeeds and in ``defer_compaction()`` when a compaction + fails to result in a page allocation success. + +``compact_blockskip_flush`` + Set to true when compaction migration scanner and free scanner meet, which + means the ``PB_migrate_skip`` bits should be cleared. + +``contiguous`` + Set to true when the zone is contiguous (in other words, no hole). + +Statistics +~~~~~~~~~~ + +``vm_stat`` + VM statistics for the zone. The items tracked are defined by + ``enum zone_stat_item``. + +``vm_numa_event`` + VM NUMA event statistics for the zone. The items tracked are defined by + ``enum numa_stat_item``. + +``per_cpu_zonestats`` + Per-CPU VM statistics for the zone. It records VM statistics and VM NUMA event + statistics on a per-CPU basis. It reduces updates to the global ``vm_stat`` + and ``vm_numa_event`` fields of the zone to improve performance. .. _pages: diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst index 81417fa2ed20..e6756e78b476 100644 --- a/Documentation/mm/process_addrs.rst +++ b/Documentation/mm/process_addrs.rst @@ -716,9 +716,14 @@ calls :c:func:`!rcu_read_lock` to ensure that the VMA is looked up in an RCU critical section, then attempts to VMA lock it via :c:func:`!vma_start_read`, before releasing the RCU lock via :c:func:`!rcu_read_unlock`. -VMA read locks hold the read lock on the :c:member:`!vma->vm_lock` semaphore for -their duration and the caller of :c:func:`!lock_vma_under_rcu` must release it -via :c:func:`!vma_end_read`. +In cases when the user already holds mmap read lock, :c:func:`!vma_start_read_locked` +and :c:func:`!vma_start_read_locked_nested` can be used. These functions do not +fail due to lock contention but the caller should still check their return values +in case they fail for other reasons. + +VMA read locks increment :c:member:`!vma.vm_refcnt` reference counter for their +duration and the caller of :c:func:`!lock_vma_under_rcu` must drop it via +:c:func:`!vma_end_read`. VMA **write** locks are acquired via :c:func:`!vma_start_write` in instances where a VMA is about to be modified, unlike :c:func:`!vma_start_read` the lock is always @@ -726,9 +731,9 @@ acquired. An mmap write lock **must** be held for the duration of the VMA write lock, releasing or downgrading the mmap write lock also releases the VMA write lock so there is no :c:func:`!vma_end_write` function. -Note that a semaphore write lock is not held across a VMA lock. Rather, a -sequence number is used for serialisation, and the write semaphore is only -acquired at the point of write lock to update this. +Note that when write-locking a VMA lock, the :c:member:`!vma.vm_refcnt` is temporarily +modified so that readers can detect the presense of a writer. The reference counter is +restored once the vma sequence number used for serialisation is updated. This ensures the semantics we require - VMA write locks provide exclusive write access to the VMA. @@ -738,7 +743,7 @@ Implementation details The VMA lock mechanism is designed to be a lightweight means of avoiding the use of the heavily contended mmap lock. It is implemented using a combination of a -read/write semaphore and sequence numbers belonging to the containing +reference counter and sequence numbers belonging to the containing :c:struct:`!struct mm_struct` and the VMA. Read locks are acquired via :c:func:`!vma_start_read`, which is an optimistic @@ -779,28 +784,31 @@ release of any VMA locks on its release makes sense, as you would never want to keep VMAs locked across entirely separate write operations. It also maintains correct lock ordering. -Each time a VMA read lock is acquired, we acquire a read lock on the -:c:member:`!vma->vm_lock` read/write semaphore and hold it, while checking that -the sequence count of the VMA does not match that of the mm. +Each time a VMA read lock is acquired, we increment :c:member:`!vma.vm_refcnt` +reference counter and check that the sequence count of the VMA does not match +that of the mm. -If it does, the read lock fails. If it does not, we hold the lock, excluding -writers, but permitting other readers, who will also obtain this lock under RCU. +If it does, the read lock fails and :c:member:`!vma.vm_refcnt` is dropped. +If it does not, we keep the reference counter raised, excluding writers, but +permitting other readers, who can also obtain this lock under RCU. Importantly, maple tree operations performed in :c:func:`!lock_vma_under_rcu` are also RCU safe, so the whole read lock operation is guaranteed to function correctly. -On the write side, we acquire a write lock on the :c:member:`!vma->vm_lock` -read/write semaphore, before setting the VMA's sequence number under this lock, -also simultaneously holding the mmap write lock. +On the write side, we set a bit in :c:member:`!vma.vm_refcnt` which can't be +modified by readers and wait for all readers to drop their reference count. +Once there are no readers, the VMA's sequence number is set to match that of +the mm. During this entire operation mmap write lock is held. This way, if any read locks are in effect, :c:func:`!vma_start_write` will sleep until these are finished and mutual exclusion is achieved. -After setting the VMA's sequence number, the lock is released, avoiding -complexity with a long-term held write lock. +After setting the VMA's sequence number, the bit in :c:member:`!vma.vm_refcnt` +indicating a writer is cleared. From this point on, VMA's sequence number will +indicate VMA's write-locked state until mmap write lock is dropped or downgraded. -This clever combination of a read/write semaphore and sequence count allows for +This clever combination of a reference counter and sequence count allows for fast RCU-based per-VMA lock acquisition (especially on page fault, though utilised elsewhere) with minimal complexity around lock ordering. diff --git a/Documentation/mm/transhuge.rst b/Documentation/mm/transhuge.rst index a2cd8800d527..0e7f8e4cd2e3 100644 --- a/Documentation/mm/transhuge.rst +++ b/Documentation/mm/transhuge.rst @@ -116,14 +116,27 @@ pages: succeeds on tail pages. - map/unmap of a PMD entry for the whole THP increment/decrement - folio->_entire_mapcount, increment/decrement folio->_large_mapcount - and also increment/decrement folio->_nr_pages_mapped by ENTIRELY_MAPPED - when _entire_mapcount goes from -1 to 0 or 0 to -1. + folio->_entire_mapcount and folio->_large_mapcount. + + We also maintain the two slots for tracking MM owners (MM ID and + corresponding mapcount), and the current status ("maybe mapped shared" vs. + "mapped exclusively"). + + With CONFIG_PAGE_MAPCOUNT, we also increment/decrement + folio->_nr_pages_mapped by ENTIRELY_MAPPED when _entire_mapcount goes + from -1 to 0 or 0 to -1. - map/unmap of individual pages with PTE entry increment/decrement - page->_mapcount, increment/decrement folio->_large_mapcount and also - increment/decrement folio->_nr_pages_mapped when page->_mapcount goes - from -1 to 0 or 0 to -1 as this counts the number of pages mapped by PTE. + folio->_large_mapcount. + + We also maintain the two slots for tracking MM owners (MM ID and + corresponding mapcount), and the current status ("maybe mapped shared" vs. + "mapped exclusively"). + + With CONFIG_PAGE_MAPCOUNT, we also increment/decrement + page->_mapcount and increment/decrement folio->_nr_pages_mapped when + page->_mapcount goes from -1 to 0 or 0 to -1 as this counts the number + of pages mapped by PTE. split_huge_page internally has to distribute the refcounts in the head page to the tail pages before clearing all PG_head/tail bits from the page @@ -151,8 +164,8 @@ clear where references should go after split: it will stay on the head page. Note that split_huge_pmd() doesn't have any limitations on refcounting: pmd can be split at any point and never fails. -Partial unmap and deferred_split_folio() -======================================== +Partial unmap and deferred_split_folio() (anon THP only) +======================================================== Unmapping part of THP (with munmap() or other way) is not going to free memory immediately. Instead, we detect that a subpage of THP is not in use @@ -167,3 +180,13 @@ a THP crosses a VMA boundary. The function deferred_split_folio() is used to queue a folio for splitting. The splitting itself will happen when we get memory pressure via shrinker interface. + +With CONFIG_PAGE_MAPCOUNT, we reliably detect partial mappings based on +folio->_nr_pages_mapped. + +With CONFIG_NO_PAGE_MAPCOUNT, we detect partial mappings based on the +average per-page mapcount in a THP: if the average is < 1, an anon THP is +certainly partially mapped. As long as only a single process maps a THP, +this detection is reliable. With long-running child processes, there can +be scenarios where partial mappings can currently not be detected, and +might need asynchronous detection during memory reclaim in the future. diff --git a/Documentation/mm/z3fold.rst b/Documentation/mm/z3fold.rst deleted file mode 100644 index 25b5935d06c7..000000000000 --- a/Documentation/mm/z3fold.rst +++ /dev/null @@ -1,28 +0,0 @@ -====== -z3fold -====== - -z3fold is a special purpose allocator for storing compressed pages. -It is designed to store up to three compressed pages per physical page. -It is a zbud derivative which allows for higher compression -ratio keeping the simplicity and determinism of its predecessor. - -The main differences between z3fold and zbud are: - -* unlike zbud, z3fold allows for up to PAGE_SIZE allocations -* z3fold can hold up to 3 compressed pages in its page -* z3fold doesn't export any API itself and is thus intended to be used - via the zpool API. - -To keep the determinism and simplicity, z3fold, just like zbud, always -stores an integral number of compressed pages per page, but it can store -up to 3 pages unlike zbud which can store at most 2. Therefore the -compression ratio goes to around 2.7x while zbud's one is around 1.7x. - -Unlike zbud (but like zsmalloc for that matter) z3fold_alloc() does not -return a dereferenceable pointer. Instead, it returns an unsigned long -handle which encodes actual location of the allocated object. - -Keeping effective compression ratio close to zsmalloc's, z3fold doesn't -depend on MMU enabled and provides more predictable reclaim behavior -which makes it a better fit for small and response-critical systems. diff --git a/Documentation/mm/zsmalloc.rst b/Documentation/mm/zsmalloc.rst index 76902835e68e..d2bbecd78e14 100644 --- a/Documentation/mm/zsmalloc.rst +++ b/Documentation/mm/zsmalloc.rst @@ -27,9 +27,8 @@ Instead, it returns an opaque handle (unsigned long) which encodes actual location of the allocated object. The reason for this indirection is that zsmalloc does not keep zspages permanently mapped since that would cause issues on 32-bit systems where the VA region for kernel space mappings -is very small. So, before using the allocating memory, the object has to -be mapped using zs_map_object() to get a usable pointer and subsequently -unmapped using zs_unmap_object(). +is very small. So, using the allocated memory should be done through the +proper handle-based APIs. stat ==== diff --git a/Documentation/netlink/specs/rt_addr.yaml b/Documentation/netlink/specs/rt_addr.yaml index 5dd5469044c7..df6b23f06a22 100644 --- a/Documentation/netlink/specs/rt_addr.yaml +++ b/Documentation/netlink/specs/rt_addr.yaml @@ -78,45 +78,46 @@ definitions: attribute-sets: - name: addr-attrs + name-prefix: ifa- attributes: - - name: ifa-address + name: address type: binary display-hint: ipv4 - - name: ifa-local + name: local type: binary display-hint: ipv4 - - name: ifa-label + name: label type: string - - name: ifa-broadcast + name: broadcast type: binary display-hint: ipv4 - - name: ifa-anycast + name: anycast type: binary - - name: ifa-cacheinfo + name: cacheinfo type: binary struct: ifa-cacheinfo - - name: ifa-multicast + name: multicast type: binary - - name: ifa-flags + name: flags type: u32 enum: ifa-flags enum-as-flags: true - - name: ifa-rt-priority + name: rt-priority type: u32 - - name: ifa-target-netnsid + name: target-netnsid type: binary - - name: ifa-proto + name: proto type: u8 @@ -137,10 +138,10 @@ operations: - ifa-prefixlen - ifa-scope - ifa-index - - ifa-address - - ifa-label - - ifa-local - - ifa-cacheinfo + - address + - label + - local + - cacheinfo - name: deladdr doc: Remove address @@ -154,8 +155,8 @@ operations: - ifa-prefixlen - ifa-scope - ifa-index - - ifa-address - - ifa-local + - address + - local - name: getaddr doc: Dump address information. @@ -169,7 +170,7 @@ operations: value: 20 attributes: *ifaddr-all - - name: getmaddrs + name: getmulticast doc: Get / dump IPv4/IPv6 multicast addresses. attribute-set: addr-attrs fixed-header: ifaddrmsg @@ -182,11 +183,12 @@ operations: reply: value: 58 attributes: &mcaddr-attrs - - ifa-multicast - - ifa-cacheinfo + - multicast + - cacheinfo dump: request: value: 58 + attributes: - ifa-family reply: value: 58 diff --git a/Documentation/netlink/specs/rt_route.yaml b/Documentation/netlink/specs/rt_route.yaml index a674103e5bc4..292469c7d4b9 100644 --- a/Documentation/netlink/specs/rt_route.yaml +++ b/Documentation/netlink/specs/rt_route.yaml @@ -80,165 +80,167 @@ definitions: attribute-sets: - name: route-attrs + name-prefix: rta- attributes: - - name: rta-dst + name: dst type: binary display-hint: ipv4 - - name: rta-src + name: src type: binary display-hint: ipv4 - - name: rta-iif + name: iif type: u32 - - name: rta-oif + name: oif type: u32 - - name: rta-gateway + name: gateway type: binary display-hint: ipv4 - - name: rta-priority + name: priority type: u32 - - name: rta-prefsrc + name: prefsrc type: binary display-hint: ipv4 - - name: rta-metrics + name: metrics type: nest - nested-attributes: rta-metrics + nested-attributes: metrics - - name: rta-multipath + name: multipath type: binary - - name: rta-protoinfo # not used + name: protoinfo # not used type: binary - - name: rta-flow + name: flow type: u32 - - name: rta-cacheinfo + name: cacheinfo type: binary struct: rta-cacheinfo - - name: rta-session # not used + name: session # not used type: binary - - name: rta-mp-algo # not used + name: mp-algo # not used type: binary - - name: rta-table + name: table type: u32 - - name: rta-mark + name: mark type: u32 - - name: rta-mfc-stats + name: mfc-stats type: binary - - name: rta-via + name: via type: binary - - name: rta-newdst + name: newdst type: binary - - name: rta-pref + name: pref type: u8 - - name: rta-encap-type + name: encap-type type: u16 - - name: rta-encap + name: encap type: binary # tunnel specific nest - - name: rta-expires + name: expires type: u32 - - name: rta-pad + name: pad type: binary - - name: rta-uid + name: uid type: u32 - - name: rta-ttl-propagate + name: ttl-propagate type: u8 - - name: rta-ip-proto + name: ip-proto type: u8 - - name: rta-sport + name: sport type: u16 - - name: rta-dport + name: dport type: u16 - - name: rta-nh-id + name: nh-id type: u32 - - name: rta-flowlabel + name: flowlabel type: u32 byte-order: big-endian display-hint: hex - - name: rta-metrics + name: metrics + name-prefix: rtax- attributes: - - name: rtax-unspec + name: unspec type: unused value: 0 - - name: rtax-lock + name: lock type: u32 - - name: rtax-mtu + name: mtu type: u32 - - name: rtax-window + name: window type: u32 - - name: rtax-rtt + name: rtt type: u32 - - name: rtax-rttvar + name: rttvar type: u32 - - name: rtax-ssthresh + name: ssthresh type: u32 - - name: rtax-cwnd + name: cwnd type: u32 - - name: rtax-advmss + name: advmss type: u32 - - name: rtax-reordering + name: reordering type: u32 - - name: rtax-hoplimit + name: hoplimit type: u32 - - name: rtax-initcwnd + name: initcwnd type: u32 - - name: rtax-features + name: features type: u32 - - name: rtax-rto-min + name: rto-min type: u32 - - name: rtax-initrwnd + name: initrwnd type: u32 - - name: rtax-quickack + name: quickack type: u32 - - name: rtax-cc-algo + name: cc-algo type: string - - name: rtax-fastopen-no-cookie + name: fastopen-no-cookie type: u32 operations: @@ -254,18 +256,18 @@ operations: value: 26 attributes: - rtm-family - - rta-src + - src - rtm-src-len - - rta-dst + - dst - rtm-dst-len - - rta-iif - - rta-oif - - rta-ip-proto - - rta-sport - - rta-dport - - rta-mark - - rta-uid - - rta-flowlabel + - iif + - oif + - ip-proto + - sport + - dport + - mark + - uid + - flowlabel reply: value: 24 attributes: &all-route-attrs @@ -278,34 +280,34 @@ operations: - rtm-scope - rtm-type - rtm-flags - - rta-dst - - rta-src - - rta-iif - - rta-oif - - rta-gateway - - rta-priority - - rta-prefsrc - - rta-metrics - - rta-multipath - - rta-flow - - rta-cacheinfo - - rta-table - - rta-mark - - rta-mfc-stats - - rta-via - - rta-newdst - - rta-pref - - rta-encap-type - - rta-encap - - rta-expires - - rta-pad - - rta-uid - - rta-ttl-propagate - - rta-ip-proto - - rta-sport - - rta-dport - - rta-nh-id - - rta-flowlabel + - dst + - src + - iif + - oif + - gateway + - priority + - prefsrc + - metrics + - multipath + - flow + - cacheinfo + - table + - mark + - mfc-stats + - via + - newdst + - pref + - encap-type + - encap + - expires + - pad + - uid + - ttl-propagate + - ip-proto + - sport + - dport + - nh-id + - flowlabel dump: request: value: 26 diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst index ebb868f50ac2..eab601ab2db0 100644 --- a/Documentation/networking/netdevices.rst +++ b/Documentation/networking/netdevices.rst @@ -338,10 +338,35 @@ operations directly under the netdev instance lock. Devices drivers are encouraged to rely on the instance lock where possible. For the (mostly software) drivers that need to interact with the core stack, -there are two sets of interfaces: ``dev_xxx`` and ``netif_xxx`` (e.g., -``dev_set_mtu`` and ``netif_set_mtu``). The ``dev_xxx`` functions handle -acquiring the instance lock themselves, while the ``netif_xxx`` functions -assume that the driver has already acquired the instance lock. +there are two sets of interfaces: ``dev_xxx``/``netdev_xxx`` and ``netif_xxx`` +(e.g., ``dev_set_mtu`` and ``netif_set_mtu``). The ``dev_xxx``/``netdev_xxx`` +functions handle acquiring the instance lock themselves, while the +``netif_xxx`` functions assume that the driver has already acquired +the instance lock. + +Notifiers and netdev instance lock +================================== + +For device drivers that implement shaping or queue management APIs, +some of the notifiers (``enum netdev_cmd``) are running under the netdev +instance lock. + +For devices with locked ops, currently only the following notifiers are +running under the lock: +* ``NETDEV_REGISTER`` +* ``NETDEV_UP`` +* ``NETDEV_CHANGE`` + +The following notifiers are running without the lock: +* ``NETDEV_UNREGISTER`` + +There are no clear expectations for the remaining notifiers. Notifiers not on +the list may run with or without the instance lock, potentially even invoking +the same notifier type with and without the lock from different code paths. +The goal is to eventually ensure that all (or most, with a few documented +exceptions) notifiers run under the instance lock. Please extend this +documentation whenever you make explicit assumption about lock being held +from a notifier. NETDEV_INTERNAL symbol namespace ================================ diff --git a/Documentation/rust/arch-support.rst b/Documentation/rust/arch-support.rst index 54be7ddf3e57..6e6a515d0899 100644 --- a/Documentation/rust/arch-support.rst +++ b/Documentation/rust/arch-support.rst @@ -15,6 +15,7 @@ support corresponds to ``S`` values in the ``MAINTAINERS`` file. ============= ================ ============================================== Architecture Level of support Constraints ============= ================ ============================================== +``arm`` Maintained ARMv7 Little Endian only. ``arm64`` Maintained Little Endian only. ``loongarch`` Maintained \- ``riscv`` Maintained ``riscv64`` and LLVM/Clang only. diff --git a/Documentation/subsystem-apis.rst b/Documentation/subsystem-apis.rst index b52ad5b969d4..ff4fe8c936c8 100644 --- a/Documentation/subsystem-apis.rst +++ b/Documentation/subsystem-apis.rst @@ -71,6 +71,7 @@ Other subsystems accounting/index cpu-freq/index + edac/index fpga/index i2c/index iio/index diff --git a/Documentation/trace/coresight/coresight.rst b/Documentation/trace/coresight/coresight.rst index d4f93d6a2d63..806699871b80 100644 --- a/Documentation/trace/coresight/coresight.rst +++ b/Documentation/trace/coresight/coresight.rst @@ -462,44 +462,35 @@ queried by the perf command line tool: cs_etm// [Kernel PMU event] - linaro@linaro-nano:~$ - Regardless of the number of tracers available in a system (usually equal to the amount of processor cores), the "cs_etm" PMU will be listed only once. A Coresight PMU works the same way as any other PMU, i.e the name of the PMU is -listed along with configuration options within forward slashes '/'. Since a -Coresight system will typically have more than one sink, the name of the sink to -work with needs to be specified as an event option. -On newer kernels the available sinks are listed in sysFS under +provided along with configuration options within forward slashes '/' (see +`Config option formats`_). + +Advanced Perf framework usage +----------------------------- + +Sink selection +~~~~~~~~~~~~~~ + +An appropriate sink will be selected automatically for use with Perf, but since +there will typically be more than one sink, the name of the sink to use may be +specified as a special config option prefixed with '@'. + +The available sinks are listed in sysFS under ($SYSFS)/bus/event_source/devices/cs_etm/sinks/:: root@localhost:/sys/bus/event_source/devices/cs_etm/sinks# ls tmc_etf0 tmc_etr0 tpiu0 -On older kernels, this may need to be found from the list of coresight devices, -available under ($SYSFS)/bus/coresight/devices/:: - - root:~# ls /sys/bus/coresight/devices/ - etm0 etm1 etm2 etm3 etm4 etm5 funnel0 - funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0 root@linaro-nano:~# perf record -e cs_etm/@tmc_etr0/u --per-thread program -As mentioned above in section "Device Naming scheme", the names of the devices could -look different from what is used in the example above. One must use the device names -as it appears under the sysFS. - -The syntax within the forward slashes '/' is important. The '@' character -tells the parser that a sink is about to be specified and that this is the sink -to use for the trace session. - More information on the above and other example on how to use Coresight with the perf tools can be found in the "HOWTO.md" file of the openCSD gitHub repository [#third]_. -Advanced perf framework usage ------------------------------ - AutoFDO analysis using the perf tools ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -508,7 +499,7 @@ perf can be used to record and analyze trace of programs. Execution can be recorded using 'perf record' with the cs_etm event, specifying the name of the sink to record to, e.g:: - perf record -e cs_etm/@tmc_etr0/u --per-thread + perf record -e cs_etm//u --per-thread The 'perf report' and 'perf script' commands can be used to analyze execution, synthesizing instruction and branch events from the instruction trace. @@ -572,7 +563,7 @@ sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tuto Bubble sorting array of 30000 elements 5910 ms - $ perf record -e cs_etm/@tmc_etr0/u --per-thread taskset -c 2 ./sort + $ perf record -e cs_etm//u --per-thread taskset -c 2 ./sort Bubble sorting array of 30000 elements 12543 ms [ perf record: Woken up 35 times to write data ] diff --git a/Documentation/trace/coresight/panic.rst b/Documentation/trace/coresight/panic.rst new file mode 100644 index 000000000000..a58aa914c241 --- /dev/null +++ b/Documentation/trace/coresight/panic.rst @@ -0,0 +1,362 @@ +=================================================== +Using Coresight for Kernel panic and Watchdog reset +=================================================== + +Introduction +------------ +This documentation is about using Linux coresight trace support to +debug kernel panic and watchdog reset scenarios. + +Coresight trace during Kernel panic +----------------------------------- +From the coresight driver point of view, addressing the kernel panic +situation has four main requirements. + +a. Support for allocation of trace buffer pages from reserved memory area. + Platform can advertise this using a new device tree property added to + relevant coresight nodes. + +b. Support for stopping coresight blocks at the time of panic + +c. Saving required metadata in the specified format + +d. Support for reading trace data captured at the time of panic + +Allocation of trace buffer pages from reserved RAM +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +A new optional device tree property "memory-region" is added to the +Coresight TMC device nodes, that would give the base address and size of trace +buffer. + +Static allocation of trace buffers would ensure that both IOMMU enabled +and disabled cases are handled. Also, platforms that support persistent +RAM will allow users to read trace data in the subsequent boot without +booting the crashdump kernel. + +Note: +For ETR sink devices, this reserved region will be used for both trace +capture and trace data retrieval. +For ETF sink devices, internal SRAM would be used for trace capture, +and they would be synced to reserved region for retrieval. + + +Disabling coresight blocks at the time of panic +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +In order to avoid the situation of losing relevant trace data after a +kernel panic, it would be desirable to stop the coresight blocks at the +time of panic. + +This can be achieved by configuring the comparator, CTI and sink +devices as below:: + + Trigger on panic + Comparator --->External out --->CTI -->External In---->ETR/ETF stop + +Saving metadata at the time of kernel panic +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Coresight metadata involves all additional data that are required for a +successful trace decode in addition to the trace data. This involves +ETR/ETF/ETB register snapshot etc. + +A new optional device property "memory-region" is added to +the ETR/ETF/ETB device nodes for this. + +Reading trace data captured at the time of panic +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Trace data captured at the time of panic, can be read from rebooted kernel +or from crashdump kernel using a special device file /dev/crash_tmc_xxx. +This device file is created only when there is a valid crashdata available. + +General flow of trace capture and decode incase of kernel panic +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +1. Enable source and sink on all the cores using the sysfs interface. + ETR sinks should have trace buffers allocated from reserved memory, + by selecting "resrv" buffer mode from sysfs. + +2. Run relevant tests. + +3. On a kernel panic, all coresight blocks are disabled, necessary + metadata is synced by kernel panic handler. + + System would eventually reboot or boot a crashdump kernel. + +4. For platforms that supports crashdump kernel, raw trace data can be + dumped using the coresight sysfs interface from the crashdump kernel + itself. Persistent RAM is not a requirement in this case. + +5. For platforms that supports persistent RAM, trace data can be dumped + using the coresight sysfs interface in the subsequent Linux boot. + Crashdump kernel is not a requirement in this case. Persistent RAM + ensures that trace data is intact across reboot. + +Coresight trace during Watchdog reset +------------------------------------- +The main difference between addressing the watchdog reset and kernel panic +case are below, + +a. Saving coresight metadata need to be taken care by the + SCP(system control processor) firmware in the specified format, + instead of kernel. + +b. Reserved memory region given by firmware for trace buffer and metadata + has to be in persistent RAM. + Note: This is a requirement for watchdog reset case but optional + in kernel panic case. + +Watchdog reset can be supported only on platforms that meet the above +two requirements. + +Sample commands for testing a Kernel panic case with ETR sink +------------------------------------------------------------- + +1. Boot Linux kernel with "crash_kexec_post_notifiers" added to the kernel + bootargs. This is mandatory if the user would like to read the tracedata + from the crashdump kernel. + +2. Enable the preloaded ETM configuration:: + + #echo 1 > /sys/kernel/config/cs-syscfg/configurations/panicstop/enable + +3. Configure CTI using sysfs interface:: + + #./cti_setup.sh + + #cat cti_setup.sh + + + cd /sys/bus/coresight/devices/ + + ap_cti_config () { + #ETM trig out[0] trigger to Channel 0 + echo 0 4 > channels/trigin_attach + } + + etf_cti_config () { + #ETF Flush in trigger from Channel 0 + echo 0 1 > channels/trigout_attach + echo 1 > channels/trig_filter_enable + } + + etr_cti_config () { + #ETR Flush in from Channel 0 + echo 0 1 > channels/trigout_attach + echo 1 > channels/trig_filter_enable + } + + ctidevs=`find . -name "cti*"` + + for i in $ctidevs + do + cd $i + + connection=`find . -name "ete*"` + if [ ! -z "$connection" ] + then + echo "AP CTI config for $i" + ap_cti_config + fi + + connection=`find . -name "tmc_etf*"` + if [ ! -z "$connection" ] + then + echo "ETF CTI config for $i" + etf_cti_config + fi + + connection=`find . -name "tmc_etr*"` + if [ ! -z "$connection" ] + then + echo "ETR CTI config for $i" + etr_cti_config + fi + + cd .. + done + +Note: CTI connections are SOC specific and hence the above script is +added just for reference. + +4. Choose reserved buffer mode for ETR buffer:: + + #echo "resrv" > /sys/bus/coresight/devices/tmc_etr0/buf_mode_preferred + +5. Enable stop on flush trigger configuration:: + + #echo 1 > /sys/bus/coresight/devices/tmc_etr0/stop_on_flush + +6. Start Coresight tracing on cores 1 and 2 using sysfs interface + +7. Run some application on core 1:: + + #taskset -c 1 dd if=/dev/urandom of=/dev/null & + +8. Invoke kernel panic on core 2:: + + #echo 1 > /proc/sys/kernel/panic + #taskset -c 2 echo c > /proc/sysrq-trigger + +9. From rebooted kernel or crashdump kernel, read crashdata:: + + #dd if=/dev/crash_tmc_etr0 of=/trace/cstrace.bin + +10. Run opencsd decoder tools/scripts to generate the instruction trace. + +Sample instruction trace dump +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Core1 dump:: + + A etm4_enable_hw: ffff800008ae1dd4 + CONTEXT EL2 etm4_enable_hw: ffff800008ae1dd4 + I etm4_enable_hw: ffff800008ae1dd4: + d503201f nop + I etm4_enable_hw: ffff800008ae1dd8: + d503201f nop + I etm4_enable_hw: ffff800008ae1ddc: + d503201f nop + I etm4_enable_hw: ffff800008ae1de0: + d503201f nop + I etm4_enable_hw: ffff800008ae1de4: + d503201f nop + I etm4_enable_hw: ffff800008ae1de8: + d503233f paciasp + I etm4_enable_hw: ffff800008ae1dec: + a9be7bfd stp x29, x30, [sp, #-32]! + I etm4_enable_hw: ffff800008ae1df0: + 910003fd mov x29, sp + I etm4_enable_hw: ffff800008ae1df4: + a90153f3 stp x19, x20, [sp, #16] + I etm4_enable_hw: ffff800008ae1df8: + 2a0003f4 mov w20, w0 + I etm4_enable_hw: ffff800008ae1dfc: + 900085b3 adrp x19, ffff800009b95000 <reserved_mem+0xc48> + I etm4_enable_hw: ffff800008ae1e00: + 910f4273 add x19, x19, #0x3d0 + I etm4_enable_hw: ffff800008ae1e04: + f8747a60 ldr x0, [x19, x20, lsl #3] + E etm4_enable_hw: ffff800008ae1e08: + b4000140 cbz x0, ffff800008ae1e30 <etm4_starting_cpu+0x50> + I 149.039572921 etm4_enable_hw: ffff800008ae1e30: + a94153f3 ldp x19, x20, [sp, #16] + I 149.039572921 etm4_enable_hw: ffff800008ae1e34: + 52800000 mov w0, #0x0 // #0 + I 149.039572921 etm4_enable_hw: ffff800008ae1e38: + a8c27bfd ldp x29, x30, [sp], #32 + + ..snip + + 149.052324811 chacha_block_generic: ffff800008642d80: + 9100a3e0 add x0, + I 149.052324811 chacha_block_generic: ffff800008642d84: + b86178a2 ldr w2, [x5, x1, lsl #2] + I 149.052324811 chacha_block_generic: ffff800008642d88: + 8b010803 add x3, x0, x1, lsl #2 + I 149.052324811 chacha_block_generic: ffff800008642d8c: + b85fc063 ldur w3, [x3, #-4] + I 149.052324811 chacha_block_generic: ffff800008642d90: + 0b030042 add w2, w2, w3 + I 149.052324811 chacha_block_generic: ffff800008642d94: + b8217882 str w2, [x4, x1, lsl #2] + I 149.052324811 chacha_block_generic: ffff800008642d98: + 91000421 add x1, x1, #0x1 + I 149.052324811 chacha_block_generic: ffff800008642d9c: + f100443f cmp x1, #0x11 + + +Core 2 dump:: + + A etm4_enable_hw: ffff800008ae1dd4 + CONTEXT EL2 etm4_enable_hw: ffff800008ae1dd4 + I etm4_enable_hw: ffff800008ae1dd4: + d503201f nop + I etm4_enable_hw: ffff800008ae1dd8: + d503201f nop + I etm4_enable_hw: ffff800008ae1ddc: + d503201f nop + I etm4_enable_hw: ffff800008ae1de0: + d503201f nop + I etm4_enable_hw: ffff800008ae1de4: + d503201f nop + I etm4_enable_hw: ffff800008ae1de8: + d503233f paciasp + I etm4_enable_hw: ffff800008ae1dec: + a9be7bfd stp x29, x30, [sp, #-32]! + I etm4_enable_hw: ffff800008ae1df0: + 910003fd mov x29, sp + I etm4_enable_hw: ffff800008ae1df4: + a90153f3 stp x19, x20, [sp, #16] + I etm4_enable_hw: ffff800008ae1df8: + 2a0003f4 mov w20, w0 + I etm4_enable_hw: ffff800008ae1dfc: + 900085b3 adrp x19, ffff800009b95000 <reserved_mem+0xc48> + I etm4_enable_hw: ffff800008ae1e00: + 910f4273 add x19, x19, #0x3d0 + I etm4_enable_hw: ffff800008ae1e04: + f8747a60 ldr x0, [x19, x20, lsl #3] + E etm4_enable_hw: ffff800008ae1e08: + b4000140 cbz x0, ffff800008ae1e30 <etm4_starting_cpu+0x50> + I 149.046243445 etm4_enable_hw: ffff800008ae1e30: + a94153f3 ldp x19, x20, [sp, #16] + I 149.046243445 etm4_enable_hw: ffff800008ae1e34: + 52800000 mov w0, #0x0 // #0 + I 149.046243445 etm4_enable_hw: ffff800008ae1e38: + a8c27bfd ldp x29, x30, [sp], #32 + I 149.046243445 etm4_enable_hw: ffff800008ae1e3c: + d50323bf autiasp + E 149.046243445 etm4_enable_hw: ffff800008ae1e40: + d65f03c0 ret + A ete_sysreg_write: ffff800008adfa18 + + ..snip + + I 149.05422547 panic: ffff800008096300: + a90363f7 stp x23, x24, [sp, #48] + I 149.05422547 panic: ffff800008096304: + 6b00003f cmp w1, w0 + I 149.05422547 panic: ffff800008096308: + 3a411804 ccmn w0, #0x1, #0x4, ne // ne = any + N 149.05422547 panic: ffff80000809630c: + 540001e0 b.eq ffff800008096348 <panic+0xe0> // b.none + I 149.05422547 panic: ffff800008096310: + f90023f9 str x25, [sp, #64] + E 149.05422547 panic: ffff800008096314: + 97fe44ef bl ffff8000080276d0 <panic_smp_self_stop> + A panic: ffff80000809634c + I 149.05422547 panic: ffff80000809634c: + 910102d5 add x21, x22, #0x40 + I 149.05422547 panic: ffff800008096350: + 52800020 mov w0, #0x1 // #1 + E 149.05422547 panic: ffff800008096354: + 94166b8b bl ffff800008631180 <bust_spinlocks> + N 149.054225518 bust_spinlocks: ffff800008631180: + 340000c0 cbz w0, ffff800008631198 <bust_spinlocks+0x18> + I 149.054225518 bust_spinlocks: ffff800008631184: + f000a321 adrp x1, ffff800009a98000 <pbufs.0+0xbb8> + I 149.054225518 bust_spinlocks: ffff800008631188: + b9405c20 ldr w0, [x1, #92] + I 149.054225518 bust_spinlocks: ffff80000863118c: + 11000400 add w0, w0, #0x1 + I 149.054225518 bust_spinlocks: ffff800008631190: + b9005c20 str w0, [x1, #92] + E 149.054225518 bust_spinlocks: ffff800008631194: + d65f03c0 ret + A panic: ffff800008096358 + +Perf based testing +------------------ + +Starting perf session +~~~~~~~~~~~~~~~~~~~~~ +ETF:: + + perf record -e cs_etm/panicstop,@tmc_etf1/ -C 1 + perf record -e cs_etm/panicstop,@tmc_etf2/ -C 2 + +ETR:: + + perf record -e cs_etm/panicstop,@tmc_etr0/ -C 1,2 + +Reading trace data after panic +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Same sysfs based method explained above can be used to retrieve and +decode the trace data after the reboot on kernel panic. diff --git a/Documentation/trace/debugging.rst b/Documentation/trace/debugging.rst index 54fb16239d70..d54bc500af80 100644 --- a/Documentation/trace/debugging.rst +++ b/Documentation/trace/debugging.rst @@ -136,6 +136,8 @@ kernel, so only the same kernel is guaranteed to work if the mapping is preserved. Switching to a different kernel version may find a different layout and mark the buffer as invalid. +NB: Both the mapped address and size must be page aligned for the architecture. + Using trace_printk() in the boot instance ----------------------------------------- By default, the content of trace_printk() goes into the top level tracing diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst index 2b74f96d09d5..c9e88bf65709 100644 --- a/Documentation/trace/ftrace.rst +++ b/Documentation/trace/ftrace.rst @@ -3077,7 +3077,7 @@ Notice that we lost the sys_nanosleep. # cat set_ftrace_filter hrtimer_run_queues hrtimer_run_pending - hrtimer_init + hrtimer_setup hrtimer_cancel hrtimer_try_to_cancel hrtimer_forward @@ -3115,7 +3115,7 @@ Again, now we want to append. # cat set_ftrace_filter hrtimer_run_queues hrtimer_run_pending - hrtimer_init + hrtimer_setup hrtimer_cancel hrtimer_try_to_cancel hrtimer_forward diff --git a/Documentation/translations/zh_CN/mm/hmm.rst b/Documentation/translations/zh_CN/mm/hmm.rst index 0669f947d0bc..22c210f4e94f 100644 --- a/Documentation/translations/zh_CN/mm/hmm.rst +++ b/Documentation/translations/zh_CN/mm/hmm.rst @@ -326,7 +326,7 @@ devm_memunmap_pages() 和 devm_release_mem_region() 当资源可以绑定到 ``s 一些设备具有诸如原子PTE位的功能,可以用来实现对系统内存的原子访问。为了支持对一 个共享的虚拟内存页的原子操作,这样的设备需要对该页的访问是排他的,而不是来自CPU -的任何用户空间访问。 ``make_device_exclusive_range()`` 函数可以用来使一 +的任何用户空间访问。 ``make_device_exclusive()`` 函数可以用来使一 个内存范围不能从用户空间访问。 这将用特殊的交换条目替换给定范围内的所有页的映射。任何试图访问交换条目的行为都会 diff --git a/Documentation/translations/zh_CN/mm/index.rst b/Documentation/translations/zh_CN/mm/index.rst index c8726bce8f74..a71116be058f 100644 --- a/Documentation/translations/zh_CN/mm/index.rst +++ b/Documentation/translations/zh_CN/mm/index.rst @@ -58,7 +58,6 @@ Linux内存管理文档 remap_file_pages split_page_table_lock vmalloced-kernel-stacks - z3fold zsmalloc TODOLIST: diff --git a/Documentation/translations/zh_CN/mm/z3fold.rst b/Documentation/translations/zh_CN/mm/z3fold.rst deleted file mode 100644 index 9569a6d88270..000000000000 --- a/Documentation/translations/zh_CN/mm/z3fold.rst +++ /dev/null @@ -1,31 +0,0 @@ -:Original: Documentation/mm/z3fold.rst - -:翻译: - - 司延腾 Yanteng Si <siyanteng@loongson.cn> - -:校译: - - -====== -z3fold -====== - -z3fold是一个专门用于存储压缩页的分配器。它被设计为每个物理页最多可以存储三个压缩页。 -它是zbud的衍生物,允许更高的压缩率,保持其前辈的简单性和确定性。 - -z3fold和zbud的主要区别是: - -* 与zbud不同的是,z3fold允许最大的PAGE_SIZE分配。 -* z3fold在其页面中最多可以容纳3个压缩页面 -* z3fold本身没有输出任何API,因此打算通过zpool的API来使用 - -为了保持确定性和简单性,z3fold,就像zbud一样,总是在每页存储一个整数的压缩页,但是 -它最多可以存储3页,不像zbud最多可以存储2页。因此压缩率达到2.7倍左右,而zbud的压缩 -率是1.7倍左右。 - -不像zbud(但也像zsmalloc),z3fold_alloc()那样不返回一个可重复引用的指针。相反,它 -返回一个无符号长句柄,它编码了被分配对象的实际位置。 - -保持有效的压缩率接近于zsmalloc,z3fold不依赖于MMU的启用,并提供更可预测的回收行 -为,这使得它更适合于小型和反应迅速的系统。 diff --git a/Documentation/usb/CREDITS b/Documentation/usb/CREDITS index 81ea3eb29e96..ce6450a6ed7c 100644 --- a/Documentation/usb/CREDITS +++ b/Documentation/usb/CREDITS @@ -161,7 +161,7 @@ THANKS file in Inaky's driver): - The people at the linux-usb mailing list, for reading so many messages :) Ok, no more kidding; for all your advises! - - All the people at the USB Implementors Forum for their + - All the people at the USB Implementers Forum for their help and assistance. - Nathan Myers <ncm@cantrip.org>, for his advice! (hope you diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst index 3d1cd7ad9d67..7a1409ecc238 100644 --- a/Documentation/userspace-api/ioctl/ioctl-number.rst +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst @@ -376,9 +376,9 @@ Code Seq# Include File Comments 0xB8 all uapi/linux/mshv.h Microsoft Hyper-V /dev/mshv driver <mailto:linux-hyperv@vger.kernel.org> 0xC0 00-0F linux/usb/iowarrior.h -0xCA 00-0F uapi/misc/cxl.h +0xCA 00-0F uapi/misc/cxl.h Dead since 6.15 0xCA 10-2F uapi/misc/ocxl.h -0xCA 80-BF uapi/scsi/cxlflash_ioctl.h Dead since 6.14 +0xCA 80-BF uapi/scsi/cxlflash_ioctl.h Dead since 6.15 0xCB 00-1F CBM serial IEC bus in development: <mailto:michael.klein@puffin.lb.shuttle.de> 0xCC 00-0F drivers/misc/ibmvmc.h pseries VMC driver diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst index 70289d6815d2..b0df15865dec 100644 --- a/Documentation/userspace-api/iommufd.rst +++ b/Documentation/userspace-api/iommufd.rst @@ -63,6 +63,13 @@ Following IOMMUFD objects are exposed to userspace: space usually has mappings from guest-level I/O virtual addresses to guest- level physical addresses. +- IOMMUFD_FAULT, representing a software queue for an HWPT reporting IO page + faults using the IOMMU HW's PRI (Page Request Interface). This queue object + provides user space an FD to poll the page fault events and also to respond + to those events. A FAULT object must be created first to get a fault_id that + could be then used to allocate a fault-enabled HWPT via the IOMMU_HWPT_ALLOC + command by setting the IOMMU_HWPT_FAULT_ID_VALID bit in its flags field. + - IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance, passed to or shared with a VM. It may be some HW-accelerated virtualization features and some SW resources used by the VM. For examples: @@ -109,6 +116,14 @@ Following IOMMUFD objects are exposed to userspace: vIOMMU, which is a separate ioctl call from attaching the same device to an HWPT_PAGING that the vIOMMU holds. +- IOMMUFD_OBJ_VEVENTQ, representing a software queue for a vIOMMU to report its + events such as translation faults occurred to a nested stage-1 (excluding I/O + page faults that should go through IOMMUFD_OBJ_FAULT) and HW-specific events. + This queue object provides user space an FD to poll/read the vIOMMU events. A + vIOMMU object must be created first to get its viommu_id, which could be then + used to allocate a vEVENTQ. Each vIOMMU can support multiple types of vEVENTS, + but is confined to one vEVENTQ per vEVENTQ type. + All user-visible objects are destroyed via the IOMMU_DESTROY uAPI. The diagrams below show relationships between user-visible objects and kernel @@ -251,8 +266,10 @@ User visible objects are backed by following datastructures: - iommufd_device for IOMMUFD_OBJ_DEVICE. - iommufd_hwpt_paging for IOMMUFD_OBJ_HWPT_PAGING. - iommufd_hwpt_nested for IOMMUFD_OBJ_HWPT_NESTED. +- iommufd_fault for IOMMUFD_OBJ_FAULT. - iommufd_viommu for IOMMUFD_OBJ_VIOMMU. - iommufd_vdevice for IOMMUFD_OBJ_VDEVICE. +- iommufd_veventq for IOMMUFD_OBJ_VEVENTQ. Several terminologies when looking at these datastructures: diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst index 41102f74c5e2..1dabfc29be0d 100644 --- a/Documentation/userspace-api/mseal.rst +++ b/Documentation/userspace-api/mseal.rst @@ -130,6 +130,27 @@ Use cases - Chrome browser: protect some security sensitive data structures. +- System mappings: + The system mappings are created by the kernel and includes vdso, vvar, + vvar_vclock, vectors (arm compat-mode), sigpage (arm compat-mode), uprobes. + + Those system mappings are readonly only or execute only, memory sealing can + protect them from ever changing to writable or unmmap/remapped as different + attributes. This is useful to mitigate memory corruption issues where a + corrupted pointer is passed to a memory management system. + + If supported by an architecture (CONFIG_ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS), + the CONFIG_MSEAL_SYSTEM_MAPPINGS seals all system mappings of this + architecture. + + The following architectures currently support this feature: x86-64, arm64, + and s390. + + WARNING: This feature breaks programs which rely on relocating + or unmapping system mappings. Known broken software at the time + of writing includes CHECKPOINT_RESTORE, UML, gVisor, rr. Therefore + this config can't be enabled universally. + When not to use mseal ===================== Applications can apply sealing to any virtual memory region from userspace, diff --git a/Documentation/userspace-api/perf_ring_buffer.rst b/Documentation/userspace-api/perf_ring_buffer.rst index bde9d8cbc106..dc71544532ce 100644 --- a/Documentation/userspace-api/perf_ring_buffer.rst +++ b/Documentation/userspace-api/perf_ring_buffer.rst @@ -627,7 +627,7 @@ regular ring buffer. AUX events and AUX trace data are two different things. Let's see an example:: - perf record -a -e cycles -e cs_etm/@tmc_etr0/ -- sleep 2 + perf record -a -e cycles -e cs_etm// -- sleep 2 The above command enables two events: one is the event *cycles* from PMU and another is the AUX event *cs_etm* from Arm CoreSight, both are saved @@ -766,7 +766,7 @@ only record AUX trace data at a specific time point which users are interested in. E.g. below gives an example of how to take snapshots with 1 second interval with Arm CoreSight:: - perf record -e cs_etm/@tmc_etr0/u -S -a program & + perf record -e cs_etm//u -S -a program & PERFPID=$! while true; do kill -USR2 $PERFPID diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 1f8625b7646a..47c7c3f92314 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7447,6 +7447,75 @@ Unused bitfields in the bitarrays must be set to zero. This capability connects the vcpu to an in-kernel XIVE device. +6.76 KVM_CAP_HYPERV_SYNIC +------------------------- + +:Architectures: x86 +:Target: vcpu + +This capability, if KVM_CHECK_EXTENSION indicates that it is +available, means that the kernel has an implementation of the +Hyper-V Synthetic interrupt controller(SynIC). Hyper-V SynIC is +used to support Windows Hyper-V based guest paravirt drivers(VMBus). + +In order to use SynIC, it has to be activated by setting this +capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this +will disable the use of APIC hardware virtualization even if supported +by the CPU, as it's incompatible with SynIC auto-EOI behavior. + +6.77 KVM_CAP_HYPERV_SYNIC2 +-------------------------- + +:Architectures: x86 +:Target: vcpu + +This capability enables a newer version of Hyper-V Synthetic interrupt +controller (SynIC). The only difference with KVM_CAP_HYPERV_SYNIC is that KVM +doesn't clear SynIC message and event flags pages when they are enabled by +writing to the respective MSRs. + +6.78 KVM_CAP_HYPERV_DIRECT_TLBFLUSH +----------------------------------- + +:Architectures: x86 +:Target: vcpu + +This capability indicates that KVM running on top of Hyper-V hypervisor +enables Direct TLB flush for its guests meaning that TLB flush +hypercalls are handled by Level 0 hypervisor (Hyper-V) bypassing KVM. +Due to the different ABI for hypercall parameters between Hyper-V and +KVM, enabling this capability effectively disables all hypercall +handling by KVM (as some KVM hypercall may be mistakenly treated as TLB +flush hypercalls by Hyper-V) so userspace should disable KVM identification +in CPUID and only exposes Hyper-V identification. In this case, guest +thinks it's running on Hyper-V and only use Hyper-V hypercalls. + +6.79 KVM_CAP_HYPERV_ENFORCE_CPUID +--------------------------------- + +:Architectures: x86 +:Target: vcpu + +When enabled, KVM will disable emulated Hyper-V features provided to the +guest according to the bits Hyper-V CPUID feature leaves. Otherwise, all +currently implemented Hyper-V features are provided unconditionally when +Hyper-V identification is set in the HYPERV_CPUID_INTERFACE (0x40000001) +leaf. + +6.80 KVM_CAP_ENFORCE_PV_FEATURE_CPUID +------------------------------------- + +:Architectures: x86 +:Target: vcpu + +When enabled, KVM will disable paravirtual features provided to the +guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf +(0x40000001). Otherwise, a guest may use the paravirtual features +regardless of what has actually been exposed through the CPUID leaf. + +.. _KVM_CAP_DIRTY_LOG_RING: + + .. _cap_enable_vm: 7. Capabilities that can be enabled on VMs @@ -7927,10 +7996,10 @@ by POWER10 processor. 7.24 KVM_CAP_VM_COPY_ENC_CONTEXT_FROM ------------------------------------- -Architectures: x86 SEV enabled -Type: vm -Parameters: args[0] is the fd of the source vm -Returns: 0 on success; ENOTTY on error +:Architectures: x86 SEV enabled +:Type: vm +:Parameters: args[0] is the fd of the source vm +:Returns: 0 on success; ENOTTY on error This capability enables userspace to copy encryption context from the vm indicated by the fd to the vm this is called on. @@ -7963,24 +8032,6 @@ default. See Documentation/arch/x86/sgx.rst for more details. -7.26 KVM_CAP_PPC_RPT_INVALIDATE -------------------------------- - -:Capability: KVM_CAP_PPC_RPT_INVALIDATE -:Architectures: ppc -:Type: vm - -This capability indicates that the kernel is capable of handling -H_RPT_INVALIDATE hcall. - -In order to enable the use of H_RPT_INVALIDATE in the guest, -user space might have to advertise it for the guest. For example, -IBM pSeries (sPAPR) guest starts using it if "hcall-rpt-invalidate" is -present in the "ibm,hypertas-functions" device-tree property. - -This capability is enabled for hypervisors on platforms like POWER9 -that support radix MMU. - 7.27 KVM_CAP_EXIT_ON_EMULATION_FAILURE -------------------------------------- @@ -8038,24 +8089,9 @@ indicated by the fd to the VM this is called on. This is intended to support intra-host migration of VMs between userspace VMMs, upgrading the VMM process without interrupting the guest. -7.30 KVM_CAP_PPC_AIL_MODE_3 -------------------------------- - -:Capability: KVM_CAP_PPC_AIL_MODE_3 -:Architectures: ppc -:Type: vm - -This capability indicates that the kernel supports the mode 3 setting for the -"Address Translation Mode on Interrupt" aka "Alternate Interrupt Location" -resource that is controlled with the H_SET_MODE hypercall. - -This capability allows a guest kernel to use a better-performance mode for -handling interrupts and system calls. - 7.31 KVM_CAP_DISABLE_QUIRKS2 ---------------------------- -:Capability: KVM_CAP_DISABLE_QUIRKS2 :Parameters: args[0] - set of KVM quirks to disable :Architectures: x86 :Type: vm @@ -8210,27 +8246,6 @@ This capability is aimed to mitigate the threat that malicious VMs can cause CPU stuck (due to event windows don't open up) and make the CPU unavailable to host or other VMs. -7.34 KVM_CAP_MEMORY_FAULT_INFO ------------------------------- - -:Architectures: x86 -:Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP. - -The presence of this capability indicates that KVM_RUN will fill -kvm_run.memory_fault if KVM cannot resolve a guest page fault VM-Exit, e.g. if -there is a valid memslot but no backing VMA for the corresponding host virtual -address. - -The information in kvm_run.memory_fault is valid if and only if KVM_RUN returns -an error with errno=EFAULT or errno=EHWPOISON *and* kvm_run.exit_reason is set -to KVM_EXIT_MEMORY_FAULT. - -Note: Userspaces which attempt to resolve memory faults so that they can retry -KVM_RUN are encouraged to guard against repeatedly receiving the same -error/annotated fault. - -See KVM_EXIT_MEMORY_FAULT for more information. - 7.35 KVM_CAP_X86_APIC_BUS_CYCLES_NS ----------------------------------- @@ -8248,19 +8263,220 @@ by KVM_CHECK_EXTENSION. Note: Userspace is responsible for correctly configuring CPUID 0x15, a.k.a. the core crystal clock frequency, if a non-zero CPUID 0x15 is exposed to the guest. -7.36 KVM_CAP_X86_GUEST_MODE ------------------------------- +7.36 KVM_CAP_DIRTY_LOG_RING/KVM_CAP_DIRTY_LOG_RING_ACQ_REL +---------------------------------------------------------- + +:Architectures: x86, arm64 +:Type: vm +:Parameters: args[0] - size of the dirty log ring + +KVM is capable of tracking dirty memory using ring buffers that are +mmapped into userspace; there is one dirty ring per vcpu. + +The dirty ring is available to userspace as an array of +``struct kvm_dirty_gfn``. Each dirty entry is defined as:: + + struct kvm_dirty_gfn { + __u32 flags; + __u32 slot; /* as_id | slot_id */ + __u64 offset; + }; + +The following values are defined for the flags field to define the +current state of the entry:: + + #define KVM_DIRTY_GFN_F_DIRTY BIT(0) + #define KVM_DIRTY_GFN_F_RESET BIT(1) + #define KVM_DIRTY_GFN_F_MASK 0x3 + +Userspace should call KVM_ENABLE_CAP ioctl right after KVM_CREATE_VM +ioctl to enable this capability for the new guest and set the size of +the rings. Enabling the capability is only allowed before creating any +vCPU, and the size of the ring must be a power of two. The larger the +ring buffer, the less likely the ring is full and the VM is forced to +exit to userspace. The optimal size depends on the workload, but it is +recommended that it be at least 64 KiB (4096 entries). + +Just like for dirty page bitmaps, the buffer tracks writes to +all user memory regions for which the KVM_MEM_LOG_DIRTY_PAGES flag was +set in KVM_SET_USER_MEMORY_REGION. Once a memory region is registered +with the flag set, userspace can start harvesting dirty pages from the +ring buffer. + +An entry in the ring buffer can be unused (flag bits ``00``), +dirty (flag bits ``01``) or harvested (flag bits ``1X``). The +state machine for the entry is as follows:: + + dirtied harvested reset + 00 -----------> 01 -------------> 1X -------+ + ^ | + | | + +------------------------------------------+ + +To harvest the dirty pages, userspace accesses the mmapped ring buffer +to read the dirty GFNs. If the flags has the DIRTY bit set (at this stage +the RESET bit must be cleared), then it means this GFN is a dirty GFN. +The userspace should harvest this GFN and mark the flags from state +``01b`` to ``1Xb`` (bit 0 will be ignored by KVM, but bit 1 must be set +to show that this GFN is harvested and waiting for a reset), and move +on to the next GFN. The userspace should continue to do this until the +flags of a GFN have the DIRTY bit cleared, meaning that it has harvested +all the dirty GFNs that were available. + +Note that on weakly ordered architectures, userspace accesses to the +ring buffer (and more specifically the 'flags' field) must be ordered, +using load-acquire/store-release accessors when available, or any +other memory barrier that will ensure this ordering. + +It's not necessary for userspace to harvest the all dirty GFNs at once. +However it must collect the dirty GFNs in sequence, i.e., the userspace +program cannot skip one dirty GFN to collect the one next to it. + +After processing one or more entries in the ring buffer, userspace +calls the VM ioctl KVM_RESET_DIRTY_RINGS to notify the kernel about +it, so that the kernel will reprotect those collected GFNs. +Therefore, the ioctl must be called *before* reading the content of +the dirty pages. + +The dirty ring can get full. When it happens, the KVM_RUN of the +vcpu will return with exit reason KVM_EXIT_DIRTY_LOG_FULL. + +The dirty ring interface has a major difference comparing to the +KVM_GET_DIRTY_LOG interface in that, when reading the dirty ring from +userspace, it's still possible that the kernel has not yet flushed the +processor's dirty page buffers into the kernel buffer (with dirty bitmaps, the +flushing is done by the KVM_GET_DIRTY_LOG ioctl). To achieve that, one +needs to kick the vcpu out of KVM_RUN using a signal. The resulting +vmexit ensures that all dirty GFNs are flushed to the dirty rings. + +NOTE: KVM_CAP_DIRTY_LOG_RING_ACQ_REL is the only capability that +should be exposed by weakly ordered architecture, in order to indicate +the additional memory ordering requirements imposed on userspace when +reading the state of an entry and mutating it from DIRTY to HARVESTED. +Architecture with TSO-like ordering (such as x86) are allowed to +expose both KVM_CAP_DIRTY_LOG_RING and KVM_CAP_DIRTY_LOG_RING_ACQ_REL +to userspace. + +After enabling the dirty rings, the userspace needs to detect the +capability of KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP to see whether the +ring structures can be backed by per-slot bitmaps. With this capability +advertised, it means the architecture can dirty guest pages without +vcpu/ring context, so that some of the dirty information will still be +maintained in the bitmap structure. KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP +can't be enabled if the capability of KVM_CAP_DIRTY_LOG_RING_ACQ_REL +hasn't been enabled, or any memslot has been existing. + +Note that the bitmap here is only a backup of the ring structure. The +use of the ring and bitmap combination is only beneficial if there is +only a very small amount of memory that is dirtied out of vcpu/ring +context. Otherwise, the stand-alone per-slot bitmap mechanism needs to +be considered. + +To collect dirty bits in the backup bitmap, userspace can use the same +KVM_GET_DIRTY_LOG ioctl. KVM_CLEAR_DIRTY_LOG isn't needed as long as all +the generation of the dirty bits is done in a single pass. Collecting +the dirty bitmap should be the very last thing that the VMM does before +considering the state as complete. VMM needs to ensure that the dirty +state is final and avoid missing dirty pages from another ioctl ordered +after the bitmap collection. + +NOTE: Multiple examples of using the backup bitmap: (1) save vgic/its +tables through command KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} on +KVM device "kvm-arm-vgic-its". (2) restore vgic/its tables through +command KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_RESTORE_TABLES} on KVM device +"kvm-arm-vgic-its". VGICv3 LPI pending status is restored. (3) save +vgic3 pending table through KVM_DEV_ARM_VGIC_{GRP_CTRL, SAVE_PENDING_TABLES} +command on KVM device "kvm-arm-vgic-v3". + +7.37 KVM_CAP_PMU_CAPABILITY +--------------------------- :Architectures: x86 -:Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP. +:Type: vm +:Parameters: arg[0] is bitmask of PMU virtualization capabilities. +:Returns: 0 on success, -EINVAL when arg[0] contains invalid bits -The presence of this capability indicates that KVM_RUN will update the -KVM_RUN_X86_GUEST_MODE bit in kvm_run.flags to indicate whether the -vCPU was executing nested guest code when it exited. +This capability alters PMU virtualization in KVM. -KVM exits with the register state of either the L1 or L2 guest -depending on which executed at the time of an exit. Userspace must -take care to differentiate between these cases. +Calling KVM_CHECK_EXTENSION for this capability returns a bitmask of +PMU virtualization capabilities that can be adjusted on a VM. + +The argument to KVM_ENABLE_CAP is also a bitmask and selects specific +PMU virtualization capabilities to be applied to the VM. This can +only be invoked on a VM prior to the creation of VCPUs. + +At this time, KVM_PMU_CAP_DISABLE is the only capability. Setting +this capability will disable PMU virtualization for that VM. Usermode +should adjust CPUID leaf 0xA to reflect that the PMU is disabled. + +7.38 KVM_CAP_VM_DISABLE_NX_HUGE_PAGES +------------------------------------- + +:Architectures: x86 +:Type: vm +:Parameters: arg[0] must be 0. +:Returns: 0 on success, -EPERM if the userspace process does not + have CAP_SYS_BOOT, -EINVAL if args[0] is not 0 or any vCPUs have been + created. + +This capability disables the NX huge pages mitigation for iTLB MULTIHIT. + +The capability has no effect if the nx_huge_pages module parameter is not set. + +This capability may only be set before any vCPUs are created. + +7.39 KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE +--------------------------------------- + +:Architectures: arm64 +:Type: vm +:Parameters: arg[0] is the new split chunk size. +:Returns: 0 on success, -EINVAL if any memslot was already created. + +This capability sets the chunk size used in Eager Page Splitting. + +Eager Page Splitting improves the performance of dirty-logging (used +in live migrations) when guest memory is backed by huge-pages. It +avoids splitting huge-pages (into PAGE_SIZE pages) on fault, by doing +it eagerly when enabling dirty logging (with the +KVM_MEM_LOG_DIRTY_PAGES flag for a memory region), or when using +KVM_CLEAR_DIRTY_LOG. + +The chunk size specifies how many pages to break at a time, using a +single allocation for each chunk. Bigger the chunk size, more pages +need to be allocated ahead of time. + +The chunk size needs to be a valid block size. The list of acceptable +block sizes is exposed in KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES as a +64-bit bitmap (each bit describing a block size). The default value is +0, to disable the eager page splitting. + +7.40 KVM_CAP_EXIT_HYPERCALL +--------------------------- + +:Architectures: x86 +:Type: vm + +This capability, if enabled, will cause KVM to exit to userspace +with KVM_EXIT_HYPERCALL exit reason to process some hypercalls. + +Calling KVM_CHECK_EXTENSION for this capability will return a bitmask +of hypercalls that can be configured to exit to userspace. +Right now, the only such hypercall is KVM_HC_MAP_GPA_RANGE. + +The argument to KVM_ENABLE_CAP is also a bitmask, and must be a subset +of the result of KVM_CHECK_EXTENSION. KVM will forward to userspace +the hypercalls whose corresponding bit is in the argument, and return +ENOSYS for the others. + +7.41 KVM_CAP_ARM_SYSTEM_SUSPEND +------------------------------- + +:Architectures: arm64 +:Type: vm + +When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of +type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request. 7.37 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS ------------------------------------- @@ -8297,21 +8513,6 @@ H_RANDOM hypercall backed by a hardware random-number generator. If present, the kernel H_RANDOM handler can be enabled for guest use with the KVM_CAP_PPC_ENABLE_HCALL capability. -8.2 KVM_CAP_HYPERV_SYNIC ------------------------- - -:Architectures: x86 - -This capability, if KVM_CHECK_EXTENSION indicates that it is -available, means that the kernel has an implementation of the -Hyper-V Synthetic interrupt controller(SynIC). Hyper-V SynIC is -used to support Windows Hyper-V based guest paravirt drivers(VMBus). - -In order to use SynIC, it has to be activated by setting this -capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this -will disable the use of APIC hardware virtualization even if supported -by the CPU, as it's incompatible with SynIC auto-EOI behavior. - 8.3 KVM_CAP_PPC_MMU_RADIX ------------------------- @@ -8362,20 +8563,6 @@ may be incompatible with the MIPS VZ ASE. virtualization, including standard guest virtual memory segments. == ========================================================================== -8.6 KVM_CAP_MIPS_TE -------------------- - -:Architectures: mips - -This capability, if KVM_CHECK_EXTENSION on the main kvm handle indicates that -it is available, means that the trap & emulate implementation is available to -run guest code in user mode, even if KVM_CAP_MIPS_VZ indicates that hardware -assisted virtualisation is also available. KVM_VM_MIPS_TE (0) must be passed -to KVM_CREATE_VM to create a VM which utilises it. - -If KVM_CHECK_EXTENSION on a kvm VM handle indicates that this capability is -available, it means that the VM is using trap & emulate. - 8.7 KVM_CAP_MIPS_64BIT ---------------------- @@ -8457,16 +8644,6 @@ virtual SMT modes that can be set using KVM_CAP_PPC_SMT. If bit N (counting from the right) is set, then a virtual SMT mode of 2^N is available. -8.11 KVM_CAP_HYPERV_SYNIC2 --------------------------- - -:Architectures: x86 - -This capability enables a newer version of Hyper-V Synthetic interrupt -controller (SynIC). The only difference with KVM_CAP_HYPERV_SYNIC is that KVM -doesn't clear SynIC message and event flags pages when they are enabled by -writing to the respective MSRs. - 8.12 KVM_CAP_HYPERV_VP_INDEX ---------------------------- @@ -8481,7 +8658,6 @@ capability is absent, userspace can still query this msr's value. ------------------------------- :Architectures: s390 -:Parameters: none This capability indicates if the flic device will be able to get/set the AIS states for migration via the KVM_DEV_FLIC_AISM_ALL attribute and allows @@ -8555,21 +8731,6 @@ This capability indicates that KVM supports paravirtualized Hyper-V IPI send hypercalls: HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx. -8.21 KVM_CAP_HYPERV_DIRECT_TLBFLUSH ------------------------------------ - -:Architectures: x86 - -This capability indicates that KVM running on top of Hyper-V hypervisor -enables Direct TLB flush for its guests meaning that TLB flush -hypercalls are handled by Level 0 hypervisor (Hyper-V) bypassing KVM. -Due to the different ABI for hypercall parameters between Hyper-V and -KVM, enabling this capability effectively disables all hypercall -handling by KVM (as some KVM hypercall may be mistakenly treated as TLB -flush hypercalls by Hyper-V) so userspace should disable KVM identification -in CPUID and only exposes Hyper-V identification. In this case, guest -thinks it's running on Hyper-V and only use Hyper-V hypercalls. - 8.22 KVM_CAP_S390_VCPU_RESETS ----------------------------- @@ -8647,142 +8808,6 @@ In combination with KVM_CAP_X86_USER_SPACE_MSR, this allows user space to trap and emulate MSRs that are outside of the scope of KVM as well as limit the attack surface on KVM's MSR emulation code. -8.28 KVM_CAP_ENFORCE_PV_FEATURE_CPUID -------------------------------------- - -Architectures: x86 - -When enabled, KVM will disable paravirtual features provided to the -guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf -(0x40000001). Otherwise, a guest may use the paravirtual features -regardless of what has actually been exposed through the CPUID leaf. - -.. _KVM_CAP_DIRTY_LOG_RING: - -8.29 KVM_CAP_DIRTY_LOG_RING/KVM_CAP_DIRTY_LOG_RING_ACQ_REL ----------------------------------------------------------- - -:Architectures: x86, arm64 -:Parameters: args[0] - size of the dirty log ring - -KVM is capable of tracking dirty memory using ring buffers that are -mmapped into userspace; there is one dirty ring per vcpu. - -The dirty ring is available to userspace as an array of -``struct kvm_dirty_gfn``. Each dirty entry is defined as:: - - struct kvm_dirty_gfn { - __u32 flags; - __u32 slot; /* as_id | slot_id */ - __u64 offset; - }; - -The following values are defined for the flags field to define the -current state of the entry:: - - #define KVM_DIRTY_GFN_F_DIRTY BIT(0) - #define KVM_DIRTY_GFN_F_RESET BIT(1) - #define KVM_DIRTY_GFN_F_MASK 0x3 - -Userspace should call KVM_ENABLE_CAP ioctl right after KVM_CREATE_VM -ioctl to enable this capability for the new guest and set the size of -the rings. Enabling the capability is only allowed before creating any -vCPU, and the size of the ring must be a power of two. The larger the -ring buffer, the less likely the ring is full and the VM is forced to -exit to userspace. The optimal size depends on the workload, but it is -recommended that it be at least 64 KiB (4096 entries). - -Just like for dirty page bitmaps, the buffer tracks writes to -all user memory regions for which the KVM_MEM_LOG_DIRTY_PAGES flag was -set in KVM_SET_USER_MEMORY_REGION. Once a memory region is registered -with the flag set, userspace can start harvesting dirty pages from the -ring buffer. - -An entry in the ring buffer can be unused (flag bits ``00``), -dirty (flag bits ``01``) or harvested (flag bits ``1X``). The -state machine for the entry is as follows:: - - dirtied harvested reset - 00 -----------> 01 -------------> 1X -------+ - ^ | - | | - +------------------------------------------+ - -To harvest the dirty pages, userspace accesses the mmapped ring buffer -to read the dirty GFNs. If the flags has the DIRTY bit set (at this stage -the RESET bit must be cleared), then it means this GFN is a dirty GFN. -The userspace should harvest this GFN and mark the flags from state -``01b`` to ``1Xb`` (bit 0 will be ignored by KVM, but bit 1 must be set -to show that this GFN is harvested and waiting for a reset), and move -on to the next GFN. The userspace should continue to do this until the -flags of a GFN have the DIRTY bit cleared, meaning that it has harvested -all the dirty GFNs that were available. - -Note that on weakly ordered architectures, userspace accesses to the -ring buffer (and more specifically the 'flags' field) must be ordered, -using load-acquire/store-release accessors when available, or any -other memory barrier that will ensure this ordering. - -It's not necessary for userspace to harvest the all dirty GFNs at once. -However it must collect the dirty GFNs in sequence, i.e., the userspace -program cannot skip one dirty GFN to collect the one next to it. - -After processing one or more entries in the ring buffer, userspace -calls the VM ioctl KVM_RESET_DIRTY_RINGS to notify the kernel about -it, so that the kernel will reprotect those collected GFNs. -Therefore, the ioctl must be called *before* reading the content of -the dirty pages. - -The dirty ring can get full. When it happens, the KVM_RUN of the -vcpu will return with exit reason KVM_EXIT_DIRTY_LOG_FULL. - -The dirty ring interface has a major difference comparing to the -KVM_GET_DIRTY_LOG interface in that, when reading the dirty ring from -userspace, it's still possible that the kernel has not yet flushed the -processor's dirty page buffers into the kernel buffer (with dirty bitmaps, the -flushing is done by the KVM_GET_DIRTY_LOG ioctl). To achieve that, one -needs to kick the vcpu out of KVM_RUN using a signal. The resulting -vmexit ensures that all dirty GFNs are flushed to the dirty rings. - -NOTE: KVM_CAP_DIRTY_LOG_RING_ACQ_REL is the only capability that -should be exposed by weakly ordered architecture, in order to indicate -the additional memory ordering requirements imposed on userspace when -reading the state of an entry and mutating it from DIRTY to HARVESTED. -Architecture with TSO-like ordering (such as x86) are allowed to -expose both KVM_CAP_DIRTY_LOG_RING and KVM_CAP_DIRTY_LOG_RING_ACQ_REL -to userspace. - -After enabling the dirty rings, the userspace needs to detect the -capability of KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP to see whether the -ring structures can be backed by per-slot bitmaps. With this capability -advertised, it means the architecture can dirty guest pages without -vcpu/ring context, so that some of the dirty information will still be -maintained in the bitmap structure. KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP -can't be enabled if the capability of KVM_CAP_DIRTY_LOG_RING_ACQ_REL -hasn't been enabled, or any memslot has been existing. - -Note that the bitmap here is only a backup of the ring structure. The -use of the ring and bitmap combination is only beneficial if there is -only a very small amount of memory that is dirtied out of vcpu/ring -context. Otherwise, the stand-alone per-slot bitmap mechanism needs to -be considered. - -To collect dirty bits in the backup bitmap, userspace can use the same -KVM_GET_DIRTY_LOG ioctl. KVM_CLEAR_DIRTY_LOG isn't needed as long as all -the generation of the dirty bits is done in a single pass. Collecting -the dirty bitmap should be the very last thing that the VMM does before -considering the state as complete. VMM needs to ensure that the dirty -state is final and avoid missing dirty pages from another ioctl ordered -after the bitmap collection. - -NOTE: Multiple examples of using the backup bitmap: (1) save vgic/its -tables through command KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} on -KVM device "kvm-arm-vgic-its". (2) restore vgic/its tables through -command KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_RESTORE_TABLES} on KVM device -"kvm-arm-vgic-its". VGICv3 LPI pending status is restored. (3) save -vgic3 pending table through KVM_DEV_ARM_VGIC_{GRP_CTRL, SAVE_PENDING_TABLES} -command on KVM device "kvm-arm-vgic-v3". - 8.30 KVM_CAP_XEN_HVM -------------------- @@ -8847,10 +8872,9 @@ clearing the PVCLOCK_TSC_STABLE_BIT flag in Xen pvclock sources. This will be done when the KVM_CAP_XEN_HVM ioctl sets the KVM_XEN_HVM_CONFIG_PVCLOCK_TSC_UNSTABLE flag. -8.31 KVM_CAP_PPC_MULTITCE -------------------------- +8.31 KVM_CAP_SPAPR_MULTITCE +--------------------------- -:Capability: KVM_CAP_PPC_MULTITCE :Architectures: ppc :Type: vm @@ -8882,72 +8906,9 @@ This capability indicates that the KVM virtual PTP service is supported in the host. A VMM can check whether the service is available to the guest on migration. -8.33 KVM_CAP_HYPERV_ENFORCE_CPUID ---------------------------------- - -Architectures: x86 - -When enabled, KVM will disable emulated Hyper-V features provided to the -guest according to the bits Hyper-V CPUID feature leaves. Otherwise, all -currently implemented Hyper-V features are provided unconditionally when -Hyper-V identification is set in the HYPERV_CPUID_INTERFACE (0x40000001) -leaf. - -8.34 KVM_CAP_EXIT_HYPERCALL ---------------------------- - -:Capability: KVM_CAP_EXIT_HYPERCALL -:Architectures: x86 -:Type: vm - -This capability, if enabled, will cause KVM to exit to userspace -with KVM_EXIT_HYPERCALL exit reason to process some hypercalls. - -Calling KVM_CHECK_EXTENSION for this capability will return a bitmask -of hypercalls that can be configured to exit to userspace. -Right now, the only such hypercall is KVM_HC_MAP_GPA_RANGE. - -The argument to KVM_ENABLE_CAP is also a bitmask, and must be a subset -of the result of KVM_CHECK_EXTENSION. KVM will forward to userspace -the hypercalls whose corresponding bit is in the argument, and return -ENOSYS for the others. - -8.35 KVM_CAP_PMU_CAPABILITY ---------------------------- - -:Capability: KVM_CAP_PMU_CAPABILITY -:Architectures: x86 -:Type: vm -:Parameters: arg[0] is bitmask of PMU virtualization capabilities. -:Returns: 0 on success, -EINVAL when arg[0] contains invalid bits - -This capability alters PMU virtualization in KVM. - -Calling KVM_CHECK_EXTENSION for this capability returns a bitmask of -PMU virtualization capabilities that can be adjusted on a VM. - -The argument to KVM_ENABLE_CAP is also a bitmask and selects specific -PMU virtualization capabilities to be applied to the VM. This can -only be invoked on a VM prior to the creation of VCPUs. - -At this time, KVM_PMU_CAP_DISABLE is the only capability. Setting -this capability will disable PMU virtualization for that VM. Usermode -should adjust CPUID leaf 0xA to reflect that the PMU is disabled. - -8.36 KVM_CAP_ARM_SYSTEM_SUSPEND -------------------------------- - -:Capability: KVM_CAP_ARM_SYSTEM_SUSPEND -:Architectures: arm64 -:Type: vm - -When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of -type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request. - 8.37 KVM_CAP_S390_PROTECTED_DUMP -------------------------------- -:Capability: KVM_CAP_S390_PROTECTED_DUMP :Architectures: s390 :Type: vm @@ -8957,27 +8918,9 @@ PV guests. The `KVM_PV_DUMP` command is available for the dump related UV data. Also the vcpu ioctl `KVM_S390_PV_CPU_COMMAND` is available and supports the `KVM_PV_DUMP_CPU` subcommand. -8.38 KVM_CAP_VM_DISABLE_NX_HUGE_PAGES -------------------------------------- - -:Capability: KVM_CAP_VM_DISABLE_NX_HUGE_PAGES -:Architectures: x86 -:Type: vm -:Parameters: arg[0] must be 0. -:Returns: 0 on success, -EPERM if the userspace process does not - have CAP_SYS_BOOT, -EINVAL if args[0] is not 0 or any vCPUs have been - created. - -This capability disables the NX huge pages mitigation for iTLB MULTIHIT. - -The capability has no effect if the nx_huge_pages module parameter is not set. - -This capability may only be set before any vCPUs are created. - 8.39 KVM_CAP_S390_CPU_TOPOLOGY ------------------------------ -:Capability: KVM_CAP_S390_CPU_TOPOLOGY :Architectures: s390 :Type: vm @@ -8999,37 +8942,9 @@ structure. When getting the Modified Change Topology Report value, the attr->addr must point to a byte where the value will be stored or retrieved from. -8.40 KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE ---------------------------------------- - -:Capability: KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE -:Architectures: arm64 -:Type: vm -:Parameters: arg[0] is the new split chunk size. -:Returns: 0 on success, -EINVAL if any memslot was already created. - -This capability sets the chunk size used in Eager Page Splitting. - -Eager Page Splitting improves the performance of dirty-logging (used -in live migrations) when guest memory is backed by huge-pages. It -avoids splitting huge-pages (into PAGE_SIZE pages) on fault, by doing -it eagerly when enabling dirty logging (with the -KVM_MEM_LOG_DIRTY_PAGES flag for a memory region), or when using -KVM_CLEAR_DIRTY_LOG. - -The chunk size specifies how many pages to break at a time, using a -single allocation for each chunk. Bigger the chunk size, more pages -need to be allocated ahead of time. - -The chunk size needs to be a valid block size. The list of acceptable -block sizes is exposed in KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES as a -64-bit bitmap (each bit describing a block size). The default value is -0, to disable the eager page splitting. - 8.41 KVM_CAP_VM_TYPES --------------------- -:Capability: KVM_CAP_MEMORY_ATTRIBUTES :Architectures: x86 :Type: system ioctl @@ -9046,6 +8961,67 @@ Do not use KVM_X86_SW_PROTECTED_VM for "real" VMs, and especially not in production. The behavior and effective ABI for software-protected VMs is unstable. +8.42 KVM_CAP_PPC_RPT_INVALIDATE +------------------------------- + +:Architectures: ppc + +This capability indicates that the kernel is capable of handling +H_RPT_INVALIDATE hcall. + +In order to enable the use of H_RPT_INVALIDATE in the guest, +user space might have to advertise it for the guest. For example, +IBM pSeries (sPAPR) guest starts using it if "hcall-rpt-invalidate" is +present in the "ibm,hypertas-functions" device-tree property. + +This capability is enabled for hypervisors on platforms like POWER9 +that support radix MMU. + +8.43 KVM_CAP_PPC_AIL_MODE_3 +--------------------------- + +:Architectures: ppc + +This capability indicates that the kernel supports the mode 3 setting for the +"Address Translation Mode on Interrupt" aka "Alternate Interrupt Location" +resource that is controlled with the H_SET_MODE hypercall. + +This capability allows a guest kernel to use a better-performance mode for +handling interrupts and system calls. + +8.44 KVM_CAP_MEMORY_FAULT_INFO +------------------------------ + +:Architectures: x86 + +The presence of this capability indicates that KVM_RUN will fill +kvm_run.memory_fault if KVM cannot resolve a guest page fault VM-Exit, e.g. if +there is a valid memslot but no backing VMA for the corresponding host virtual +address. + +The information in kvm_run.memory_fault is valid if and only if KVM_RUN returns +an error with errno=EFAULT or errno=EHWPOISON *and* kvm_run.exit_reason is set +to KVM_EXIT_MEMORY_FAULT. + +Note: Userspaces which attempt to resolve memory faults so that they can retry +KVM_RUN are encouraged to guard against repeatedly receiving the same +error/annotated fault. + +See KVM_EXIT_MEMORY_FAULT for more information. + +8.45 KVM_CAP_X86_GUEST_MODE +--------------------------- + +:Architectures: x86 + +The presence of this capability indicates that KVM_RUN will update the +KVM_RUN_X86_GUEST_MODE bit in kvm_run.flags to indicate whether the +vCPU was executing nested guest code when it exited. + +KVM exits with the register state of either the L1 or L2 guest +depending on which executed at the time of an exit. Userspace must +take care to differentiate between these cases. + 9. Known KVM API problems ========================= @@ -9076,9 +9052,10 @@ the local APIC. The same is true for the ``KVM_FEATURE_PV_UNHALT`` paravirtualized feature. -CPU[EAX=1]:ECX[24] (TSC_DEADLINE) is not reported by ``KVM_GET_SUPPORTED_CPUID``. -It can be enabled if ``KVM_CAP_TSC_DEADLINE_TIMER`` is present and the kernel -has enabled in-kernel emulation of the local APIC. +On older versions of Linux, CPU[EAX=1]:ECX[24] (TSC_DEADLINE) is not reported by +``KVM_GET_SUPPORTED_CPUID``, but it can be enabled if ``KVM_CAP_TSC_DEADLINE_TIMER`` +is present and the kernel has enabled in-kernel emulation of the local APIC. +On newer versions, ``KVM_GET_SUPPORTED_CPUID`` does report the bit as available. CPU topology ~~~~~~~~~~~~ |