48 files changed, 1520 insertions, 1034 deletions
diff --git a/Documentation/gpu/nova/core/fsp.rst b/Documentation/gpu/nova/core/fsp.rst new file mode 100644 index 000000000000..52d618d22bb8 --- /dev/null +++ b/Documentation/gpu/nova/core/fsp.rst @@ -0,0 +1,142 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=================================================== +FSP (Foundation Security Processor) and Secure Boot +=================================================== +This document describes the role of the FSP in the GPU boot sequence on +Hopper and Blackwell GPUs, and how it differs from the earlier Ampere boot +flow. It also provides a brief overview of the PRC (Product Reconfiguration +Control) protocol used to query device configuration through FSP. As with +other documents in this directory, the information is subject to change and +is intended to help developers understand the corresponding kernel code. + +What is FSP? +============ +The Foundation Security Processor (FSP) is the GPU's Internal Root of Trust +(IROT). It is a dedicated security processor that boots from immutable ROM +(Boot ROM) inside the GPU and is responsible for establishing the Chain of +Trust before any other firmware is allowed to run. + +FSP runs independently of the host CPU and starts executing as soon as the +GPU is powered on. By the time the nova-core driver is loaded, FSP has +already completed its own secure boot and is ready to accept commands from +the driver. + +Simplified boot flow (Hopper/Blackwell) +======================================= +Starting with Hopper, the boot flow is significantly simplified compared to +earlier GPU generations like Ampere. + +On an **Ampere** GPU, the boot verification chain involves multiple Falcon +engines and multiple ucode stages (see falcon.rst for details):: + + Hardware BROM (SEC2) + -> HS Booter (SEC2) + -> LS GSP-RM (GSP) + +The driver must extract ucode from VBIOS, manage SEC2 and GSP, and +orchestrate the Booter to load GSP-RM. This involves FWSEC-FRTS, devinit, +and the Booter stages. 
+ +On **Hopper/Blackwell** GPUs, FSP replaces this multi-stage process with a +single message-driven interface:: + + FSP (hardware root of trust, boots from ROM) + -> FMC (Falcon Microcontroller, verified by FSP) + -> GSP-RM (verified and loaded by FMC) + +The driver only needs to: + +1. Wait for FSP to complete its own secure boot (polling a scratch register). +2. Send a Chain of Trust (COT) message to FSP with the FMC firmware location, + cryptographic signatures, and GSP boot parameters. +3. FSP authenticates the FMC firmware and boots it, FMC in turn loads GSP-RM. + +There is no SEC2 involvement, no Booter ucode, and no FWSEC-FRTS stage. The +entire secure boot is driven by a single FSP message exchange. + +Chain of Trust (COT) protocol +============================= +The Chain of Trust establishes a cryptographically enforced boot sequence, +ensuring the GPU reaches a known, trusted state. + +The driver communicates with FSP using a message queue (Falcon MSGQ +interface). Each message consists of an MCTP (Management Component Transport +Protocol) transport header and an NVDM (NVIDIA Vendor Defined Message) header, +followed by a protocol-specific payload. + +For Chain of Trust, the payload includes: + +- The system memory address of the FMC firmware image. +- Cryptographic material: a SHA-384 hash, RSA-3K public key, and RSA-3K + signature extracted from the FMC ELF firmware. +- FRTS (Firmware Runtime Services) region information (vidmem offset and size). +- The system memory address of the GSP boot arguments structure. + +FSP verifies the signature against the provided public key and hash, and if +verification succeeds, boots the FMC. The FMC then authenticates and launches +GSP-RM. + +The message flow is:: + + nova-core FSP + | | + | 1. Poll scratch register | + | (wait for FSP boot complete) | + | | + | 2. 
COT message ------------> | + | (FMC addr, signatures, | + | boot params) | + | | + | |--- Verify FMC signature + | |--- Boot FMC + | |--- FMC loads GSP-RM + | | + | 3. COT response <------------ | + | (success/error) | + | | + +FSP message format +================== +All FSP messages share a common header format consisting of two 32-bit words: + +**MCTP header** (Management Component Transport Protocol): + +- Bit 31: SOM (Start of Message) +- Bit 30: EOM (End of Message) +- Bits 29:28: Packet sequence number +- Bits 23:16: Source Endpoint ID + +**NVDM header** (NVIDIA Vendor Defined Message): + +- Bits 6:0: MCTP message type (0x7e = vendor-defined PCI) +- Bits 23:8: PCI vendor ID (0x10de = NVIDIA) +- Bits 31:24: NVDM type (0x14 = COT, 0x13 = PRC, 0x15 = FSP response) + +PRC (Product Reconfiguration Control) protocol +=============================================== +PRC is an API system exposed through FSP's Management Partition that allows +querying and modifying device configuration without firmware updates. + +Configuration parameters are called "knobs". Each knob has a unique object +ID and controls a specific device behavior. Examples include vGPU mode, ECC +enable, confidential computing mode, and NVLINK configuration. + +Each knob has two values: + +- **Active**: the currently effective value for this boot cycle. +- **Persistent**: the value stored in InfoROM, applied on subsequent boots. + +The nova-core driver uses PRC to read the vGPU mode knob (object ID 0x29) +during early boot, before firmware loading, to determine whether the GPU +should operate in vGPU mode. + +The PRC message format follows the same MCTP/NVDM header structure as COT, +with NVDM type 0x13. The payload contains: + +- A sub-command (e.g., 0x0c for read). +- Flags indicating which value to read (bit 0 = persistent, bit 1 = active). +- The knob object ID. + +The response includes the common FSP response header (with error status) +followed by the knob's 16-bit state value. 
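Note on the "FSP message format" section of fsp.rst above: the two header words can be sketched as plain bit packing. The following is a minimal illustration that assumes only the bit positions and constants listed in that section; the function and type names (`mctp_header`, `nvdm_header`, `NvdmType`, `nvdm_type_of`) are hypothetical and are not taken from nova-core, and any header bits the document does not list are simply left at zero here.

```rust
// Illustrative sketch only: every name below is hypothetical, not nova-core's.
// Bit positions and constants come from the "FSP message format" section.

/// NVDM message types listed in fsp.rst.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum NvdmType {
    Prc = 0x13,
    Cot = 0x14,
    FspResponse = 0x15,
}

/// MCTP message type for vendor-defined PCI messages (bits 6:0 of the NVDM word).
const MCTP_MSG_TYPE_VENDOR_PCI: u32 = 0x7e;
/// NVIDIA's PCI vendor ID (bits 23:8 of the NVDM word).
const PCI_VENDOR_ID_NVIDIA: u32 = 0x10de;

/// Packs the MCTP transport header word.
///
/// Bit 31 = SOM, bit 30 = EOM, bits 29:28 = packet sequence number,
/// bits 23:16 = source endpoint ID. Bits not listed in the document stay 0.
fn mctp_header(som: bool, eom: bool, seq: u8, seid: u8) -> u32 {
    (u32::from(som) << 31)
        | (u32::from(eom) << 30)
        | (u32::from(seq & 0x3) << 28) // only two bits are available
        | (u32::from(seid) << 16)
}

/// Packs the NVDM header word for the given message type.
fn nvdm_header(nvdm_type: NvdmType) -> u32 {
    MCTP_MSG_TYPE_VENDOR_PCI | (PCI_VENDOR_ID_NVIDIA << 8) | ((nvdm_type as u32) << 24)
}

/// Extracts the NVDM type field (bits 31:24) from a header word,
/// e.g. to check that a reply carries the FSP response type.
fn nvdm_type_of(header: u32) -> u8 {
    (header >> 24) as u8
}
```

Under these assumptions, a single-packet message (SOM and EOM both set, sequence 0, endpoint 0) has the MCTP word `0xc0000000`, and the NVDM word for a COT message is `0x1410de7e`. Whether the driver builds the words this way or through a register/bitfield macro is not shown in this part of the diff.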
diff --git a/Documentation/gpu/nova/index.rst b/Documentation/gpu/nova/index.rst index e39cb3163581..1783513cbd05 100644 --- a/Documentation/gpu/nova/index.rst +++ b/Documentation/gpu/nova/index.rst @@ -30,5 +30,6 @@ vGPU manager VFIO driver and the nova-drm driver. core/todo core/vbios core/devinit + core/fsp core/fwsec core/falcon diff --git a/drivers/gpu/Makefile b/drivers/gpu/Makefile index b4e5e338efa2..e372fc02139f 100644 --- a/drivers/gpu/Makefile +++ b/drivers/gpu/Makefile @@ -7,4 +7,60 @@ obj-$(CONFIG_GPU_BUDDY) += buddy.o obj-y += host1x/ drm/ vga/ tests/ obj-$(CONFIG_IMX_IPUV3_CORE) += ipu-v3/ obj-$(CONFIG_TRACE_GPU_MEM) += trace/ -obj-$(CONFIG_NOVA_CORE) += nova-core/ + +# nova-core and nova-drm are built from this Makefile so nova-drm's dependency +# on nova-core can be expressed as a plain Make prerequisite rather than a +# recursive sub-make. This is a temporary workaround until the Rust build +# system supports cross-crate dependencies natively. + +obj-$(CONFIG_NOVA_CORE) += nova-core.o +nova-core-y := nova-core/nova_core.o nova-core/nova_core_exports.o + +obj-$(CONFIG_DRM_NOVA) += nova-drm.o +nova-drm-y := drm/nova/nova.o + +# Export Rust symbols from nova-core only if nova-drm actually references them. 
+nova-core-export-deps := $(if $(CONFIG_DRM_NOVA),$(obj)/drm/nova/nova.o) + +rust_needed_exports = \ + { $(if $(strip $(2)),$(NM) -u $(2);,) echo "__DEFINED_RUST_SYMBOLS__"; \ + $(NM) -p --defined-only $(1); } | \ + awk -v fmt='$(3)' ' \ + /^__DEFINED_RUST_SYMBOLS__$$/ { defs = 1; next } \ + !defs { if ($$NF ~ /^_R/) needed[$$NF] = 1; next } \ + defs && $$2 ~ /(T|R|D|B)/ && $$3 ~ /^_R/ && \ + $$3 !~ /_(init|cleanup)_module$$/ && \ + $$3 !~ /__(pfx|cfi|odr_asan)/ && \ + $$3 in needed { printf fmt, $$3 } \ + ' + +quiet_cmd_exports = EXPORTS $@ + cmd_exports = \ + $(call rust_needed_exports,$<,$(nova-core-export-deps),EXPORT_SYMBOL_RUST_GPL(%s);\n) > $@ + +$(obj)/nova-core/exports_nova_core_generated.h: $(obj)/nova-core/nova_core.o $(nova-core-export-deps) FORCE + $(call if_changed,exports) + +targets += nova-core/exports_nova_core_generated.h + +$(obj)/nova-core/nova_core_exports.o: $(obj)/nova-core/exports_nova_core_generated.h +CFLAGS_nova-core/nova_core_exports.o := -I $(objtree)/$(obj)/nova-core + +ifdef CONFIG_MODVERSIONS +# The C export shim declares Rust symbols as `extern int`, so reuse its export +# list but generate symbol CRCs from the Rust object instead of the shim's DWARF. +$(obj)/nova-core/nova_core_exports.o: private cmd_gensymtypes_c = \ + $(call getexportsymbols,\1) | \ + $(objtree)/scripts/gendwarfksyms/gendwarfksyms \ + $(if $(KBUILD_GENDWARFKSYMS_STABLE), --stable) \ + $(if $(KBUILD_SYMTYPES), --symtypes $(@:.o=.symtypes),) \ + $(obj)/nova-core/nova_core.o +endif + +# Output nova-core's crate metadata for use by nova-drm at compile time. +RUSTFLAGS_nova-core/nova_core.o += \ + --emit=metadata=$(objtree)/$(obj)/nova-core/libnova_core.rmeta + +# Allow nova-drm to import nova-core's types. 
+$(obj)/drm/nova/nova.o: $(obj)/nova-core/nova_core.o +RUSTFLAGS_drm/nova/nova.o := -L $(objtree)/$(obj)/nova-core --extern nova_core diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index e97faabcd783..e635fcffd379 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -186,7 +186,7 @@ obj-$(CONFIG_DRM_VMWGFX)+= vmwgfx/ obj-$(CONFIG_DRM_VGEM) += vgem/ obj-$(CONFIG_DRM_VKMS) += vkms/ obj-$(CONFIG_DRM_NOUVEAU) +=nouveau/ -obj-$(CONFIG_DRM_NOVA) += nova/ +# nova-drm is built from drivers/gpu/Makefile together with nova-core. obj-$(CONFIG_DRM_EXYNOS) +=exynos/ obj-$(CONFIG_DRM_ROCKCHIP) +=rockchip/ obj-$(CONFIG_DRM_GMA500) += gma500/ diff --git a/drivers/gpu/drm/hyperv/Kconfig b/drivers/gpu/drm/hyperv/Kconfig index 86234f6a73f2..e48e35fb7f8b 100644 --- a/drivers/gpu/drm/hyperv/Kconfig +++ b/drivers/gpu/drm/hyperv/Kconfig @@ -8,7 +8,6 @@ config DRM_HYPERV help This is a KMS driver for Hyper-V synthetic video device. Choose this option if you would like to enable drm driver for Hyper-V virtual - machine. Unselect Hyper-V framebuffer driver (CONFIG_FB_HYPERV) so - that DRM driver is used by default. + machine. If M is selected the module will be called hyperv_drm. diff --git a/drivers/gpu/drm/nova/Makefile b/drivers/gpu/drm/nova/Makefile index f8527b2b7b4a..b9fad3956358 100644 --- a/drivers/gpu/drm/nova/Makefile +++ b/drivers/gpu/drm/nova/Makefile @@ -1,4 +1,2 @@ # SPDX-License-Identifier: GPL-2.0 - -obj-$(CONFIG_DRM_NOVA) += nova-drm.o -nova-drm-y := nova.o +# nova-drm is built from drivers/gpu/Makefile. diff --git a/drivers/gpu/drm/tyr/regs.rs b/drivers/gpu/drm/tyr/regs.rs index 562023e5df2f..831357a8ef87 100644 --- a/drivers/gpu/drm/tyr/regs.rs +++ b/drivers/gpu/drm/tyr/regs.rs @@ -48,17 +48,12 @@ pub(crate) fn read_u64_no_tearing(lo_read: impl Fn() -> u32, hi_read: impl Fn() /// These registers correspond to the GPU_CONTROL register page. /// They are involved in GPU configuration and control. 
pub(crate) mod gpu_control { - use core::convert::TryFrom; use kernel::{ - error::{ - code::EINVAL, - Error, // - }, num::Bounded, + prelude::*, register, uapi, // }; - use pin_init::Zeroable; register! { /// GPU identification register. @@ -964,14 +959,9 @@ pub(crate) mod mmu_control { /// /// This array contains 16 instances of the MMU_AS_CONTROL register page. pub(crate) mod mmu_as_control { - use core::convert::TryFrom; - use kernel::{ - error::{ - code::EINVAL, - Error, // - }, num::Bounded, + prelude::*, register, // }; diff --git a/drivers/gpu/nova-core/.gitignore b/drivers/gpu/nova-core/.gitignore new file mode 100644 index 000000000000..7cc8318c76b1 --- /dev/null +++ b/drivers/gpu/nova-core/.gitignore @@ -0,0 +1 @@ +exports_nova_core_generated.h diff --git a/drivers/gpu/nova-core/Makefile b/drivers/gpu/nova-core/Makefile index 4ae544f808f4..4c15729704a1 100644 --- a/drivers/gpu/nova-core/Makefile +++ b/drivers/gpu/nova-core/Makefile @@ -1,4 +1,2 @@ # SPDX-License-Identifier: GPL-2.0 - -obj-$(CONFIG_NOVA_CORE) += nova-core.o -nova-core-y := nova_core.o +# nova-core is built from drivers/gpu/Makefile. diff --git a/drivers/gpu/nova-core/bitfield.rs b/drivers/gpu/nova-core/bitfield.rs deleted file mode 100644 index 660c3911402d..000000000000 --- a/drivers/gpu/nova-core/bitfield.rs +++ /dev/null @@ -1,329 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 - -//! Bitfield library for Rust structures -//! -//! Support for defining bitfields in Rust structures. Also used by the [`register!`] macro. - -/// Defines a struct with accessors to access bits within an inner unsigned integer. 
-/// -/// # Syntax -/// -/// ```rust -/// use nova_core::bitfield; -/// -/// #[derive(Debug, Clone, Copy, Default)] -/// enum Mode { -/// #[default] -/// Low = 0, -/// High = 1, -/// Auto = 2, -/// } -/// -/// impl TryFrom<u8> for Mode { -/// type Error = u8; -/// fn try_from(value: u8) -> Result<Self, Self::Error> { -/// match value { -/// 0 => Ok(Mode::Low), -/// 1 => Ok(Mode::High), -/// 2 => Ok(Mode::Auto), -/// _ => Err(value), -/// } -/// } -/// } -/// -/// impl From<Mode> for u8 { -/// fn from(mode: Mode) -> u8 { -/// mode as u8 -/// } -/// } -/// -/// #[derive(Debug, Clone, Copy, Default)] -/// enum State { -/// #[default] -/// Inactive = 0, -/// Active = 1, -/// } -/// -/// impl From<bool> for State { -/// fn from(value: bool) -> Self { -/// if value { State::Active } else { State::Inactive } -/// } -/// } -/// -/// impl From<State> for bool { -/// fn from(state: State) -> bool { -/// match state { -/// State::Inactive => false, -/// State::Active => true, -/// } -/// } -/// } -/// -/// bitfield! { -/// pub struct ControlReg(u32) { -/// 7:7 state as bool => State; -/// 3:0 mode as u8 ?=> Mode; -/// } -/// } -/// ``` -/// -/// This generates a struct with: -/// - Field accessors: `mode()`, `state()`, etc. -/// - Field setters: `set_mode()`, `set_state()`, etc. (supports chaining with builder pattern). -/// Note that the compiler will error out if the size of the setter's arg exceeds the -/// struct's storage size. -/// - Debug and Default implementations. -/// -/// Note: Field accessors and setters inherit the same visibility as the struct itself. -/// In the example above, both `mode()` and `set_mode()` methods will be `pub`. -/// -/// Fields are defined as follows: -/// -/// - `as <type>` simply returns the field value casted to <type>, typically `u32`, `u16`, `u8` or -/// `bool`. Note that `bool` fields must have a range of 1 bit. -/// - `as <type> => <into_type>` calls `<into_type>`'s `From::<<type>>` implementation and returns -/// the result. 
-/// - `as <type> ?=> <try_into_type>` calls `<try_into_type>`'s `TryFrom::<<type>>` implementation -/// and returns the result. This is useful with fields for which not all values are valid. -macro_rules! bitfield { - // Main entry point - defines the bitfield struct with fields - ($vis:vis struct $name:ident($storage:ty) $(, $comment:literal)? { $($fields:tt)* }) => { - bitfield!(@core $vis $name $storage $(, $comment)? { $($fields)* }); - }; - - // All rules below are helpers. - - // Defines the wrapper `$name` type, as well as its relevant implementations (`Debug`, - // `Default`, and conversion to the value type) and field accessor methods. - (@core $vis:vis $name:ident $storage:ty $(, $comment:literal)? { $($fields:tt)* }) => { - $( - #[doc=$comment] - )? - #[repr(transparent)] - #[derive(Clone, Copy)] - $vis struct $name($storage); - - impl ::core::convert::From<$name> for $storage { - fn from(val: $name) -> $storage { - val.0 - } - } - - bitfield!(@fields_dispatcher $vis $name $storage { $($fields)* }); - }; - - // Captures the fields and passes them to all the implementers that require field information. - // - // Used to simplify the matching rules for implementers, so they don't need to match the entire - // complex fields rule even though they only make use of part of it. - (@fields_dispatcher $vis:vis $name:ident $storage:ty { - $($hi:tt:$lo:tt $field:ident as $type:tt - $(?=> $try_into_type:ty)? - $(=> $into_type:ty)? - $(, $comment:literal)? - ; - )* - } - ) => { - bitfield!(@field_accessors $vis $name $storage { - $( - $hi:$lo $field as $type - $(?=> $try_into_type)? - $(=> $into_type)? - $(, $comment)? - ; - )* - }); - bitfield!(@debug $name { $($field;)* }); - bitfield!(@default $name { $($field;)* }); - }; - - // Defines all the field getter/setter methods for `$name`. - ( - @field_accessors $vis:vis $name:ident $storage:ty { - $($hi:tt:$lo:tt $field:ident as $type:tt - $(?=> $try_into_type:ty)? - $(=> $into_type:ty)? - $(, $comment:literal)? 
- ; - )* - } - ) => { - $( - bitfield!(@check_field_bounds $hi:$lo $field as $type); - )* - - #[allow(dead_code)] - impl $name { - $( - bitfield!(@field_accessor $vis $name $storage, $hi:$lo $field as $type - $(?=> $try_into_type)? - $(=> $into_type)? - $(, $comment)? - ; - ); - )* - } - }; - - // Boolean fields must have `$hi == $lo`. - (@check_field_bounds $hi:tt:$lo:tt $field:ident as bool) => { - #[allow(clippy::eq_op)] - const _: () = { - ::kernel::build_assert::build_assert!( - $hi == $lo, - concat!("boolean field `", stringify!($field), "` covers more than one bit") - ); - }; - }; - - // Non-boolean fields must have `$hi >= $lo`. - (@check_field_bounds $hi:tt:$lo:tt $field:ident as $type:tt) => { - #[allow(clippy::eq_op)] - const _: () = { - ::kernel::build_assert::build_assert!( - $hi >= $lo, - concat!("field `", stringify!($field), "`'s MSB is smaller than its LSB") - ); - }; - }; - - // Catches fields defined as `bool` and convert them into a boolean value. - ( - @field_accessor $vis:vis $name:ident $storage:ty, $hi:tt:$lo:tt $field:ident as bool - => $into_type:ty $(, $comment:literal)?; - ) => { - bitfield!( - @leaf_accessor $vis $name $storage, $hi:$lo $field - { |f| <$into_type>::from(f != 0) } - bool $into_type => $into_type $(, $comment)?; - ); - }; - - // Shortcut for fields defined as `bool` without the `=>` syntax. - ( - @field_accessor $vis:vis $name:ident $storage:ty, $hi:tt:$lo:tt $field:ident as bool - $(, $comment:literal)?; - ) => { - bitfield!( - @field_accessor $vis $name $storage, $hi:$lo $field as bool => bool $(, $comment)?; - ); - }; - - // Catches the `?=>` syntax for non-boolean fields. 
- ( - @field_accessor $vis:vis $name:ident $storage:ty, $hi:tt:$lo:tt $field:ident as $type:tt - ?=> $try_into_type:ty $(, $comment:literal)?; - ) => { - bitfield!(@leaf_accessor $vis $name $storage, $hi:$lo $field - { |f| <$try_into_type>::try_from(f as $type) } $type $try_into_type => - ::core::result::Result< - $try_into_type, - <$try_into_type as ::core::convert::TryFrom<$type>>::Error - > - $(, $comment)?;); - }; - - // Catches the `=>` syntax for non-boolean fields. - ( - @field_accessor $vis:vis $name:ident $storage:ty, $hi:tt:$lo:tt $field:ident as $type:tt - => $into_type:ty $(, $comment:literal)?; - ) => { - bitfield!(@leaf_accessor $vis $name $storage, $hi:$lo $field - { |f| <$into_type>::from(f as $type) } $type $into_type => $into_type $(, $comment)?;); - }; - - // Shortcut for non-boolean fields defined without the `=>` or `?=>` syntax. - ( - @field_accessor $vis:vis $name:ident $storage:ty, $hi:tt:$lo:tt $field:ident as $type:tt - $(, $comment:literal)?; - ) => { - bitfield!( - @field_accessor $vis $name $storage, $hi:$lo $field as $type => $type $(, $comment)?; - ); - }; - - // Generates the accessor methods for a single field. - ( - @leaf_accessor $vis:vis $name:ident $storage:ty, $hi:tt:$lo:tt $field:ident - { $process:expr } $prim_type:tt $to_type:ty => $res_type:ty $(, $comment:literal)?; - ) => { - ::kernel::macros::paste!( - const [<$field:upper _RANGE>]: ::core::ops::RangeInclusive<u8> = $lo..=$hi; - const [<$field:upper _MASK>]: $storage = { - // Generate mask for shifting - match ::core::mem::size_of::<$storage>() { - 1 => ::kernel::bits::genmask_u8($lo..=$hi) as $storage, - 2 => ::kernel::bits::genmask_u16($lo..=$hi) as $storage, - 4 => ::kernel::bits::genmask_u32($lo..=$hi) as $storage, - 8 => ::kernel::bits::genmask_u64($lo..=$hi) as $storage, - _ => ::kernel::build_error!("Unsupported storage type size") - } - }; - const [<$field:upper _SHIFT>]: u32 = $lo; - ); - - $( - #[doc="Returns the value of this field:"] - #[doc=$comment] - )? 
- #[inline(always)] - $vis fn $field(self) -> $res_type { - ::kernel::macros::paste!( - const MASK: $storage = $name::[<$field:upper _MASK>]; - const SHIFT: u32 = $name::[<$field:upper _SHIFT>]; - ); - let field = ((self.0 & MASK) >> SHIFT); - - $process(field) - } - - ::kernel::macros::paste!( - $( - #[doc="Sets the value of this field:"] - #[doc=$comment] - )? - #[inline(always)] - $vis fn [<set_ $field>](mut self, value: $to_type) -> Self { - const MASK: $storage = $name::[<$field:upper _MASK>]; - const SHIFT: u32 = $name::[<$field:upper _SHIFT>]; - let value = ($storage::from($prim_type::from(value)) << SHIFT) & MASK; - self.0 = (self.0 & !MASK) | value; - - self - } - ); - }; - - // Generates the `Debug` implementation for `$name`. - (@debug $name:ident { $($field:ident;)* }) => { - impl ::kernel::fmt::Debug for $name { - fn fmt(&self, f: &mut ::kernel::fmt::Formatter<'_>) -> ::kernel::fmt::Result { - f.debug_struct(stringify!($name)) - .field("<raw>", &::kernel::prelude::fmt!("{:#x}", &self.0)) - $( - .field(stringify!($field), &self.$field()) - )* - .finish() - } - } - }; - - // Generates the `Default` implementation for `$name`. - (@default $name:ident { $($field:ident;)* }) => { - /// Returns a value for the bitfield where all fields are set to their default value. 
- impl ::core::default::Default for $name { - fn default() -> Self { - let value = Self(Default::default()); - - ::kernel::macros::paste!( - $( - let value = value.[<set_ $field>](Default::default()); - )* - ); - - value - } - } - }; -} diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs index 94c7696a6493..78948cc8bff3 100644 --- a/drivers/gpu/nova-core/falcon.rs +++ b/drivers/gpu/nova-core/falcon.rs @@ -5,10 +5,7 @@ use hal::FalconHal; use kernel::{ - device::{ - self, - Device, // - }, + device, dma::{ Coherent, CoherentBox, @@ -24,7 +21,6 @@ use kernel::{ Io, }, prelude::*, - sync::aref::ARef, time::Delta, }; @@ -358,41 +354,47 @@ pub(crate) trait FalconFirmware { } /// Contains the base parameters common to all Falcon instances. -pub(crate) struct Falcon<E: FalconEngine> { +pub(crate) struct Falcon<'a, E: FalconEngine> { hal: KBox<dyn FalconHal<E>>, - dev: ARef<device::Device>, + dev: &'a device::Device<device::Bound>, + bar: Bar0<'a>, } -impl<E: FalconEngine + 'static> Falcon<E> { +impl<'a, E: FalconEngine + 'static> Falcon<'a, E> { /// Create a new falcon instance. - pub(crate) fn new(dev: &device::Device, chipset: Chipset) -> Result<Self> { + pub(crate) fn new( + dev: &'a device::Device<device::Bound>, + chipset: Chipset, + bar: Bar0<'a>, + ) -> Result<Self> { Ok(Self { hal: hal::falcon_hal(chipset)?, - dev: dev.into(), + dev, + bar, }) } /// Resets DMA-related registers. - pub(crate) fn dma_reset(&self, bar: Bar0<'_>) { - bar.update(regs::NV_PFALCON_FBIF_CTL::of::<E>(), |v| { + pub(crate) fn dma_reset(&self) { + self.bar.update(regs::NV_PFALCON_FBIF_CTL::of::<E>(), |v| { v.with_allow_phys_no_ctx(true) }); - bar.write( + self.bar.write( WithBase::of::<E>(), regs::NV_PFALCON_FALCON_DMACTL::zeroed(), ); } /// Reset the controller, select the falcon core, and wait for memory scrubbing to complete. 
- pub(crate) fn reset(&self, bar: Bar0<'_>) -> Result { - self.hal.reset_eng(bar)?; - self.hal.select_core(self, bar)?; - self.hal.reset_wait_mem_scrubbing(bar)?; + pub(crate) fn reset(&self) -> Result { + self.hal.reset_eng(self)?; + self.hal.select_core(self)?; + self.hal.reset_wait_mem_scrubbing(self)?; - bar.write( + self.bar.write( WithBase::of::<E>(), - regs::NV_PFALCON_FALCON_RM::from(bar.read(regs::NV_PMC_BOOT_0).into_raw()), + regs::NV_PFALCON_FALCON_RM::from(self.bar.read(regs::NV_PMC_BOOT_0).into_raw()), ); Ok(()) @@ -404,18 +406,14 @@ impl<E: FalconEngine + 'static> Falcon<E> { /// Write a slice to Falcon IMEM memory using programmed I/O (PIO). /// /// Returns `EINVAL` if `img.len()` is not a multiple of 4. - fn pio_wr_imem_slice( - &self, - bar: Bar0<'_>, - load_offsets: FalconPioImemLoadTarget<'_>, - ) -> Result { + fn pio_wr_imem_slice(&self, load_offsets: FalconPioImemLoadTarget<'_>) -> Result { // Rejecting misaligned images here allows us to avoid checking // inside the loops. if load_offsets.data.len() % 4 != 0 { return Err(EINVAL); } - bar.write( + self.bar.write( WithBase::of::<E>().at(Self::PIO_PORT), regs::NV_PFALCON_FALCON_IMEMC::zeroed() .with_secure(load_offsets.secure) @@ -426,13 +424,13 @@ impl<E: FalconEngine + 'static> Falcon<E> { for (n, block) in load_offsets.data.chunks(MEM_BLOCK_ALIGNMENT).enumerate() { let n = u16::try_from(n)?; let tag: u16 = load_offsets.start_tag.checked_add(n).ok_or(ERANGE)?; - bar.write( + self.bar.write( WithBase::of::<E>().at(Self::PIO_PORT), regs::NV_PFALCON_FALCON_IMEMT::zeroed().with_tag(tag), ); for word in block.chunks_exact(4) { let w = [word[0], word[1], word[2], word[3]]; - bar.write( + self.bar.write( WithBase::of::<E>().at(Self::PIO_PORT), regs::NV_PFALCON_FALCON_IMEMD::zeroed().with_data(u32::from_le_bytes(w)), ); @@ -445,18 +443,14 @@ impl<E: FalconEngine + 'static> Falcon<E> { /// Write a slice to Falcon DMEM memory using programmed I/O (PIO). 
/// /// Returns `EINVAL` if `img.len()` is not a multiple of 4. - fn pio_wr_dmem_slice( - &self, - bar: Bar0<'_>, - load_offsets: FalconPioDmemLoadTarget<'_>, - ) -> Result { + fn pio_wr_dmem_slice(&self, load_offsets: FalconPioDmemLoadTarget<'_>) -> Result { // Rejecting misaligned images here allows us to avoid checking // inside the loops. if load_offsets.data.len() % 4 != 0 { return Err(EINVAL); } - bar.write( + self.bar.write( WithBase::of::<E>().at(Self::PIO_PORT), regs::NV_PFALCON_FALCON_DMEMC::zeroed() .with_aincw(true) @@ -465,7 +459,7 @@ impl<E: FalconEngine + 'static> Falcon<E> { for word in load_offsets.data.chunks_exact(4) { let w = [word[0], word[1], word[2], word[3]]; - bar.write( + self.bar.write( WithBase::of::<E>().at(Self::PIO_PORT), regs::NV_PFALCON_FALCON_DMEMD::zeroed().with_data(u32::from_le_bytes(w)), ); @@ -477,29 +471,28 @@ impl<E: FalconEngine + 'static> Falcon<E> { /// Perform a PIO copy into `IMEM` and `DMEM` of `fw`, and prepare the falcon to run it. pub(crate) fn pio_load<F: FalconFirmware<Target = E> + FalconPioLoadable>( &self, - bar: Bar0<'_>, fw: &F, ) -> Result { - bar.update(regs::NV_PFALCON_FBIF_CTL::of::<E>(), |v| { + self.bar.update(regs::NV_PFALCON_FBIF_CTL::of::<E>(), |v| { v.with_allow_phys_no_ctx(true) }); - bar.write( + self.bar.write( WithBase::of::<E>(), regs::NV_PFALCON_FALCON_DMACTL::zeroed(), ); if let Some(imem_ns) = fw.imem_ns_load_params() { - self.pio_wr_imem_slice(bar, imem_ns)?; + self.pio_wr_imem_slice(imem_ns)?; } if let Some(imem_sec) = fw.imem_sec_load_params() { - self.pio_wr_imem_slice(bar, imem_sec)?; + self.pio_wr_imem_slice(imem_sec)?; } - self.pio_wr_dmem_slice(bar, fw.dmem_load_params())?; + self.pio_wr_dmem_slice(fw.dmem_load_params())?; - self.hal.program_brom(self, bar, &fw.brom_params()); + self.hal.program_brom(self, &fw.brom_params()); - bar.write( + self.bar.write( WithBase::of::<E>(), regs::NV_PFALCON_FALCON_BOOTVEC::zeroed().with_value(fw.boot_addr()), ); @@ -513,7 +506,6 @@ impl<E: 
FalconEngine + 'static> Falcon<E> { /// `sec` is set if the loaded firmware is expected to run in secure mode. fn dma_wr( &self, - bar: Bar0<'_>, dma_obj: &Coherent<[u8]>, target_mem: FalconMem, load_offsets: FalconDmaLoadTarget, @@ -571,7 +563,7 @@ impl<E: FalconEngine + 'static> Falcon<E> { // Set up the base source DMA address. - bar.write( + self.bar.write( WithBase::of::<E>(), regs::NV_PFALCON_FALCON_DMATRFBASE::zeroed().with_base( // CAST: `as u32` is used on purpose since we do want to strip the upper bits, @@ -579,7 +571,7 @@ impl<E: FalconEngine + 'static> Falcon<E> { (dma_start >> 8) as u32, ), ); - bar.write( + self.bar.write( WithBase::of::<E>(), regs::NV_PFALCON_FALCON_DMATRFBASE1::zeroed().try_with_base(dma_start >> 40)?, ); @@ -590,23 +582,23 @@ impl<E: FalconEngine + 'static> Falcon<E> { for pos in (0..num_transfers).map(|i| i * DMA_LEN) { // Perform a transfer of size `DMA_LEN`. - bar.write( + self.bar.write( WithBase::of::<E>(), regs::NV_PFALCON_FALCON_DMATRFMOFFS::zeroed() .try_with_offs(load_offsets.dst_start + pos)?, ); - bar.write( + self.bar.write( WithBase::of::<E>(), regs::NV_PFALCON_FALCON_DMATRFFBOFFS::zeroed().with_offs(src_start + pos), ); - bar.write(WithBase::of::<E>(), cmd); + self.bar.write(WithBase::of::<E>(), cmd); // Wait for the transfer to complete. // TIMEOUT: arbitrarily large value, no DMA transfer to the falcon's small memories // should ever take that long. read_poll_timeout( - || Ok(bar.read(regs::NV_PFALCON_FALCON_DMATRFCMD::of::<E>())), + || Ok(self.bar.read(regs::NV_PFALCON_FALCON_DMATRFCMD::of::<E>())), |r| r.idle(), Delta::ZERO, Delta::from_secs(2), @@ -617,12 +609,7 @@ impl<E: FalconEngine + 'static> Falcon<E> { } /// Perform a DMA load into `IMEM` and `DMEM` of `fw`, and prepare the falcon to run it. 
- fn dma_load<F: FalconFirmware<Target = E> + FalconDmaLoadable>( - &self, - dev: &Device<device::Bound>, - bar: Bar0<'_>, - fw: &F, - ) -> Result { + fn dma_load<F: FalconFirmware<Target = E> + FalconDmaLoadable>(&self, fw: &F) -> Result { // DMA object with firmware content as the source of the DMA engine. let dma_obj = { let fw_slice = fw.as_slice(); @@ -630,7 +617,7 @@ impl<E: FalconEngine + 'static> Falcon<E> { // DMA copies are done in chunks of `MEM_BLOCK_ALIGNMENT`, so pad the length // accordingly and fill with `0`. let mut dma_obj = CoherentBox::zeroed_slice( - dev, + self.dev, fw_slice.len().next_multiple_of(MEM_BLOCK_ALIGNMENT), GFP_KERNEL, )?; @@ -642,24 +629,20 @@ impl<E: FalconEngine + 'static> Falcon<E> { dma_obj.into() }; - self.dma_reset(bar); - bar.update(regs::NV_PFALCON_FBIF_TRANSCFG::of::<E>().at(0), |v| { - v.with_target(FalconFbifTarget::CoherentSysmem) - .with_mem_type(FalconFbifMemType::Physical) - }); + self.dma_reset(); + self.bar + .update(regs::NV_PFALCON_FBIF_TRANSCFG::of::<E>().at(0), |v| { + v.with_target(FalconFbifTarget::CoherentSysmem) + .with_mem_type(FalconFbifMemType::Physical) + }); - self.dma_wr( - bar, - &dma_obj, - FalconMem::ImemSecure, - fw.imem_sec_load_params(), - )?; - self.dma_wr(bar, &dma_obj, FalconMem::Dmem, fw.dmem_load_params())?; + self.dma_wr(&dma_obj, FalconMem::ImemSecure, fw.imem_sec_load_params())?; + self.dma_wr(&dma_obj, FalconMem::Dmem, fw.dmem_load_params())?; - self.hal.program_brom(self, bar, &fw.brom_params()); + self.hal.program_brom(self, &fw.brom_params()); // Set `BootVec` to start of non-secure code. - bar.write( + self.bar.write( WithBase::of::<E>(), regs::NV_PFALCON_FALCON_BOOTVEC::zeroed().with_value(fw.boot_addr()), ); @@ -668,10 +651,10 @@ impl<E: FalconEngine + 'static> Falcon<E> { } /// Wait until the falcon CPU is halted. 
- pub(crate) fn wait_till_halted(&self, bar: Bar0<'_>) -> Result<()> { + pub(crate) fn wait_till_halted(&self) -> Result<()> { // TIMEOUT: arbitrarily large value, firmwares should complete in less than 2 seconds. read_poll_timeout( - || Ok(bar.read(regs::NV_PFALCON_FALCON_CPUCTL::of::<E>())), + || Ok(self.bar.read(regs::NV_PFALCON_FALCON_CPUCTL::of::<E>())), |r| r.halted(), Delta::ZERO, Delta::from_secs(2), @@ -681,16 +664,17 @@ impl<E: FalconEngine + 'static> Falcon<E> { } /// Start the falcon CPU. - pub(crate) fn start(&self, bar: Bar0<'_>) -> Result<()> { - match bar + pub(crate) fn start(&self) -> Result<()> { + match self + .bar .read(regs::NV_PFALCON_FALCON_CPUCTL::of::<E>()) .alias_en() { - true => bar.write( + true => self.bar.write( WithBase::of::<E>(), regs::NV_PFALCON_FALCON_CPUCTL_ALIAS::zeroed().with_startcpu(true), ), - false => bar.write( + false => self.bar.write( WithBase::of::<E>(), regs::NV_PFALCON_FALCON_CPUCTL::zeroed().with_startcpu(true), ), @@ -700,16 +684,16 @@ impl<E: FalconEngine + 'static> Falcon<E> { } /// Writes values to the mailbox registers if provided. - pub(crate) fn write_mailboxes(&self, bar: Bar0<'_>, mbox0: Option<u32>, mbox1: Option<u32>) { + pub(crate) fn write_mailboxes(&self, mbox0: Option<u32>, mbox1: Option<u32>) { if let Some(mbox0) = mbox0 { - bar.write( + self.bar.write( WithBase::of::<E>(), regs::NV_PFALCON_FALCON_MAILBOX0::zeroed().with_value(mbox0), ); } if let Some(mbox1) = mbox1 { - bar.write( + self.bar.write( WithBase::of::<E>(), regs::NV_PFALCON_FALCON_MAILBOX1::zeroed().with_value(mbox1), ); @@ -717,21 +701,23 @@ impl<E: FalconEngine + 'static> Falcon<E> { } /// Reads the value from `mbox0` register. - pub(crate) fn read_mailbox0(&self, bar: Bar0<'_>) -> u32 { - bar.read(regs::NV_PFALCON_FALCON_MAILBOX0::of::<E>()) + pub(crate) fn read_mailbox0(&self) -> u32 { + self.bar + .read(regs::NV_PFALCON_FALCON_MAILBOX0::of::<E>()) .value() } /// Reads the value from `mbox1` register. 
-    pub(crate) fn read_mailbox1(&self, bar: Bar0<'_>) -> u32 {
-        bar.read(regs::NV_PFALCON_FALCON_MAILBOX1::of::<E>())
+    pub(crate) fn read_mailbox1(&self) -> u32 {
+        self.bar
+            .read(regs::NV_PFALCON_FALCON_MAILBOX1::of::<E>())
             .value()
     }

     /// Reads values from both mailbox registers.
-    pub(crate) fn read_mailboxes(&self, bar: Bar0<'_>) -> (u32, u32) {
-        let mbox0 = self.read_mailbox0(bar);
-        let mbox1 = self.read_mailbox1(bar);
+    pub(crate) fn read_mailboxes(&self) -> (u32, u32) {
+        let mbox0 = self.read_mailbox0();
+        let mbox1 = self.read_mailbox1();

         (mbox0, mbox1)
     }
@@ -743,54 +729,43 @@ impl<E: FalconEngine + 'static> Falcon<E> {
     ///
     /// Wait up to two seconds for the firmware to complete, and return its exit status read from
     /// the `MBOX0` and `MBOX1` registers.
-    pub(crate) fn boot(
-        &self,
-        bar: Bar0<'_>,
-        mbox0: Option<u32>,
-        mbox1: Option<u32>,
-    ) -> Result<(u32, u32)> {
-        self.write_mailboxes(bar, mbox0, mbox1);
-        self.start(bar)?;
-        self.wait_till_halted(bar)?;
-        Ok(self.read_mailboxes(bar))
+    pub(crate) fn boot(&self, mbox0: Option<u32>, mbox1: Option<u32>) -> Result<(u32, u32)> {
+        self.write_mailboxes(mbox0, mbox1);
+        self.start()?;
+        self.wait_till_halted()?;
+        Ok(self.read_mailboxes())
     }

     /// Returns the fused version of the signature to use in order to run a HS firmware on this
     /// falcon instance. `engine_id_mask` and `ucode_id` are obtained from the firmware header.
     pub(crate) fn signature_reg_fuse_version(
         &self,
-        bar: Bar0<'_>,
         engine_id_mask: u16,
         ucode_id: u8,
     ) -> Result<u32> {
         self.hal
-            .signature_reg_fuse_version(self, bar, engine_id_mask, ucode_id)
+            .signature_reg_fuse_version(self, engine_id_mask, ucode_id)
     }

     /// Check if the RISC-V core is active.
     ///
     /// Returns `true` if the RISC-V core is active, `false` otherwise.
-    pub(crate) fn is_riscv_active(&self, bar: Bar0<'_>) -> bool {
-        self.hal.is_riscv_active(bar)
+    pub(crate) fn is_riscv_active(&self) -> bool {
+        self.hal.is_riscv_active(self)
     }

     /// Load a firmware image into Falcon memory, using the preferred method for the current
     /// chipset.
-    pub(crate) fn load<F: FalconFirmware<Target = E> + FalconDmaLoadable>(
-        &self,
-        dev: &Device<device::Bound>,
-        bar: Bar0<'_>,
-        fw: &F,
-    ) -> Result {
+    pub(crate) fn load<F: FalconFirmware<Target = E> + FalconDmaLoadable>(&self, fw: &F) -> Result {
         match self.hal.load_method() {
-            LoadMethod::Dma => self.dma_load(dev, bar, fw),
-            LoadMethod::Pio => self.pio_load(bar, &fw.try_as_pio_loadable()?),
+            LoadMethod::Dma => self.dma_load(fw),
+            LoadMethod::Pio => self.pio_load(&fw.try_as_pio_loadable()?),
         }
     }

     /// Write the application version to the OS register.
-    pub(crate) fn write_os_version(&self, bar: Bar0<'_>, app_version: u32) {
-        bar.write(
+    pub(crate) fn write_os_version(&self, app_version: u32) {
+        self.bar.write(
             WithBase::of::<E>(),
             regs::NV_PFALCON_FALCON_OS::zeroed().with_value(app_version),
         );
diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index 52cdb84ef0e8..53b1079843ae 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -21,7 +21,6 @@ use kernel::{
 };

 use crate::{
-    driver::Bar0,
     falcon::{
         Falcon,
         FalconEngine,
@@ -48,18 +47,18 @@ impl RegisterBase<PFalcon2Base> for Fsp {

 impl FalconEngine for Fsp {}

-impl Falcon<Fsp> {
+impl<'a> Falcon<'a, Fsp> {
     /// Writes `data` to FSP external memory at offset `0`.
     ///
     /// `data` is interpreted as little-endian 32-bit words. Returns `EINVAL`
     /// if the `data` length is not 4-byte aligned.
-    fn write_emem(&mut self, bar: Bar0<'_>, data: &[u8]) -> Result {
+    fn write_emem(&mut self, data: &[u8]) -> Result {
         if data.len() % 4 != 0 {
             return Err(EINVAL);
         }

         // Begin a write burst at offset `0`, auto-incrementing on each write.
-        bar.write(
+        self.bar.write(
             WithBase::of::<Fsp>(),
             regs::NV_PFALCON_FALCON_EMEMC::zeroed().with_aincw(true),
         );
@@ -68,7 +67,7 @@ impl Falcon<Fsp> {
             let value = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);

             // Write the next 32-bit `value`; hardware advances the offset.
-            bar.write(
+            self.bar.write(
                 WithBase::of::<Fsp>(),
                 regs::NV_PFALCON_FALCON_EMEMD::zeroed().with_data(value),
             );
@@ -81,20 +80,23 @@ impl Falcon<Fsp> {
     ///
     /// `data` is stored as little-endian 32-bit words. Returns `EINVAL` if
     /// the `data` length is not 4-byte aligned.
-    fn read_emem(&mut self, bar: Bar0<'_>, data: &mut [u8]) -> Result {
+    fn read_emem(&mut self, data: &mut [u8]) -> Result {
         if data.len() % 4 != 0 {
             return Err(EINVAL);
         }

         // Begin a read burst at offset `0`, auto-incrementing on each read.
-        bar.write(
+        self.bar.write(
             WithBase::of::<Fsp>(),
             regs::NV_PFALCON_FALCON_EMEMC::zeroed().with_aincr(true),
         );

         for chunk in data.chunks_exact_mut(4) {
             // Read the next 32-bit word; hardware advances the offset.
-            let value = bar.read(regs::NV_PFALCON_FALCON_EMEMD::of::<Fsp>()).data();
+            let value = self
+                .bar
+                .read(regs::NV_PFALCON_FALCON_EMEMD::of::<Fsp>())
+                .data();
             chunk.copy_from_slice(&value.to_le_bytes());
         }

@@ -107,9 +109,9 @@ impl Falcon<Fsp> {
     ///
     /// The FSP message queue is not circular. Pointers are reset to 0 after each
     /// message exchange, so `tail >= head` is always true when data is present.
-    fn poll_msgq(&self, bar: Bar0<'_>) -> u32 {
-        let head = bar.read(regs::NV_PFSP_MSGQ_HEAD::at(0)).val();
-        let tail = bar.read(regs::NV_PFSP_MSGQ_TAIL::at(0)).val();
+    fn poll_msgq(&self) -> u32 {
+        let head = self.bar.read(regs::NV_PFSP_MSGQ_HEAD::at(0)).val();
+        let tail = self.bar.read(regs::NV_PFSP_MSGQ_TAIL::at(0)).val();

         if head == tail {
             return 0;
@@ -122,20 +124,20 @@ impl Falcon<Fsp> {
     /// Writes `packet` to FSP EMEM and updates the queue pointers to notify FSP.
     ///
     /// Returns `EINVAL` if `packet` is empty or its length is not 4-byte aligned.
-    pub(crate) fn send_msg(&mut self, bar: Bar0<'_>, packet: &[u8]) -> Result {
+    pub(crate) fn send_msg(&mut self, packet: &[u8]) -> Result {
         if packet.is_empty() {
             return Err(EINVAL);
         }

-        self.write_emem(bar, packet)?;
+        self.write_emem(packet)?;

         // Update queue pointers. TAIL points at the last DWORD written.
         let tail_offset = u32::try_from(packet.len() - 4).map_err(|_| EINVAL)?;
-        bar.write(
+        self.bar.write(
             Array::at(0),
             regs::NV_PFSP_QUEUE_TAIL::zeroed().with_address(tail_offset),
         );
-        bar.write(
+        self.bar.write(
             Array::at(0),
             regs::NV_PFSP_QUEUE_HEAD::zeroed().with_address(0),
         );
@@ -148,9 +150,9 @@ impl Falcon<Fsp> {
     ///
     /// Returns `ETIMEDOUT` if no message was available until timeout, or a regular error code if a
     /// memory allocation error occurred.
-    pub(crate) fn recv_msg(&mut self, bar: Bar0<'_>) -> Result<KVec<u8>> {
+    pub(crate) fn recv_msg(&mut self) -> Result<KVec<u8>> {
         let msg_size = read_poll_timeout(
-            || Ok(self.poll_msgq(bar)),
+            || Ok(self.poll_msgq()),
             |&size| size > 0,
             Delta::from_millis(10),
             Delta::from_millis(FSP_MSG_TIMEOUT_MS),
@@ -160,11 +162,13 @@ impl Falcon<Fsp> {
         let mut buffer = KVec::<u8>::new();
         buffer.resize(msg_size, 0, GFP_KERNEL)?;

-        self.read_emem(bar, &mut buffer)?;
+        self.read_emem(&mut buffer)?;

         // Reset message queue pointers after reading.
-        bar.write(Array::at(0), regs::NV_PFSP_MSGQ_TAIL::zeroed().with_val(0));
-        bar.write(Array::at(0), regs::NV_PFSP_MSGQ_HEAD::zeroed().with_val(0));
+        self.bar
+            .write(Array::at(0), regs::NV_PFSP_MSGQ_TAIL::zeroed().with_val(0));
+        self.bar
+            .write(Array::at(0), regs::NV_PFSP_MSGQ_HEAD::zeroed().with_val(0));

         Ok(buffer)
     }
diff --git a/drivers/gpu/nova-core/falcon/gsp.rs b/drivers/gpu/nova-core/falcon/gsp.rs
index d1f6f7fcffff..ae32f401aeb0 100644
--- a/drivers/gpu/nova-core/falcon/gsp.rs
+++ b/drivers/gpu/nova-core/falcon/gsp.rs
@@ -14,7 +14,6 @@ use kernel::{
 };

 use crate::{
-    driver::Bar0,
     falcon::{
         Falcon,
         FalconEngine,
@@ -24,10 +23,6 @@ use crate::{
     regs,
 };

-/// Pattern returned by GSP register reads while the PRIV target mask still blocks CPU access.
-const GSP_TARGET_MASK_LOCKED_PATTERN: u32 = 0xbadf_4100;
-const GSP_TARGET_MASK_LOCKED_MASK: u32 = 0xffff_ff00;
-
 /// Type specifying the `Gsp` falcon engine. Cannot be instantiated.
 pub(crate) struct Gsp(());

@@ -41,20 +36,20 @@ impl RegisterBase<PFalcon2Base> for Gsp {

 impl FalconEngine for Gsp {}

-impl Falcon<Gsp> {
+impl<'a> Falcon<'a, Gsp> {
     /// Clears the SWGEN0 bit in the Falcon's IRQ status clear register to
     /// allow GSP to signal CPU for processing new messages in message queue.
-    pub(crate) fn clear_swgen0_intr(&self, bar: Bar0<'_>) {
-        bar.write(
+    pub(crate) fn clear_swgen0_intr(&self) {
+        self.bar.write(
             WithBase::of::<Gsp>(),
             regs::NV_PFALCON_FALCON_IRQSCLR::zeroed().with_swgen0(true),
         );
     }

     /// Checks if GSP reload/resume has completed during the boot process.
-    pub(crate) fn check_reload_completed(&self, bar: Bar0<'_>, timeout: Delta) -> Result<bool> {
+    pub(crate) fn check_reload_completed(&self, timeout: Delta) -> Result<bool> {
         read_poll_timeout(
-            || Ok(bar.read(regs::NV_PGC6_BSI_SECURE_SCRATCH_14)),
+            || Ok(self.bar.read(regs::NV_PGC6_BSI_SECURE_SCRATCH_14)),
             |val| val.boot_stage_3_handoff(),
             Delta::ZERO,
             timeout,
@@ -63,17 +58,24 @@ impl Falcon<Gsp> {
     }

     /// Returns whether the RISC-V branch privilege lockdown bit is set.
-    pub(crate) fn riscv_branch_privilege_lockdown(&self, bar: Bar0<'_>) -> bool {
-        bar.read(regs::NV_PFALCON_FALCON_HWCFG2::of::<Gsp>())
+    pub(crate) fn riscv_branch_privilege_lockdown(&self) -> bool {
+        self.bar
+            .read(regs::NV_PFALCON_FALCON_HWCFG2::of::<Gsp>())
             .riscv_br_priv_lockdown()
     }

     /// Returns whether GSP registers can be read by the CPU.
-    pub(crate) fn priv_target_mask_released(&self, bar: Bar0<'_>) -> bool {
-        let hwcfg2 = bar
+    pub(crate) fn priv_target_mask_released(&self) -> bool {
+        /// Pattern returned by GSP register reads while the PRIV target mask still blocks CPU
+        /// access. The low byte varies; the upper 24 bits are fixed.
+        const LOCKED_PATTERN: u32 = 0xbadf_4100;
+        const LOCKED_MASK: u32 = 0xffff_ff00;
+
+        let hwcfg2 = self
+            .bar
             .read(regs::NV_PFALCON_FALCON_HWCFG2::of::<Gsp>())
             .into_raw();

-        hwcfg2 != 0 && (hwcfg2 & GSP_TARGET_MASK_LOCKED_MASK) != GSP_TARGET_MASK_LOCKED_PATTERN
+        hwcfg2 != 0 && (hwcfg2 & LOCKED_MASK) != LOCKED_PATTERN
     }
 }
diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
index 89b56823906b..ee4a017f3a4c 100644
--- a/drivers/gpu/nova-core/falcon/hal.rs
+++ b/drivers/gpu/nova-core/falcon/hal.rs
@@ -3,7 +3,6 @@
 use kernel::prelude::*;

 use crate::{
-    driver::Bar0,
     falcon::{
         Falcon,
         FalconBromParams,
@@ -34,7 +33,7 @@ pub(crate) enum LoadMethod {
 /// registers.
 pub(crate) trait FalconHal<E: FalconEngine>: Send + Sync {
     /// Activates the Falcon core if the engine is a risvc/falcon dual engine.
-    fn select_core(&self, _falcon: &Falcon<E>, _bar: Bar0<'_>) -> Result {
+    fn select_core(&self, _falcon: &Falcon<'_, E>) -> Result {
         Ok(())
     }

@@ -42,24 +41,23 @@ pub(crate) trait FalconHal<E: FalconEngine>: Send + Sync {
     /// falcon instance. `engine_id_mask` and `ucode_id` are obtained from the firmware header.
     fn signature_reg_fuse_version(
         &self,
-        falcon: &Falcon<E>,
-        bar: Bar0<'_>,
+        falcon: &Falcon<'_, E>,
         engine_id_mask: u16,
         ucode_id: u8,
     ) -> Result<u32>;

     /// Program the boot ROM registers prior to starting a secure firmware.
-    fn program_brom(&self, falcon: &Falcon<E>, bar: Bar0<'_>, params: &FalconBromParams);
+    fn program_brom(&self, falcon: &Falcon<'_, E>, params: &FalconBromParams);

     /// Check if the RISC-V core is active.
     /// Returns `true` if the RISC-V core is active, `false` otherwise.
-    fn is_riscv_active(&self, bar: Bar0<'_>) -> bool;
+    fn is_riscv_active(&self, falcon: &Falcon<'_, E>) -> bool;

     /// Wait for memory scrubbing to complete.
-    fn reset_wait_mem_scrubbing(&self, bar: Bar0<'_>) -> Result;
+    fn reset_wait_mem_scrubbing(&self, falcon: &Falcon<'_, E>) -> Result;

     /// Reset the falcon engine.
-    fn reset_eng(&self, bar: Bar0<'_>) -> Result;
+    fn reset_eng(&self, falcon: &Falcon<'_, E>) -> Result;

     /// Returns the method used to load data into the falcon's memory.
     ///
diff --git a/drivers/gpu/nova-core/falcon/hal/ga102.rs b/drivers/gpu/nova-core/falcon/hal/ga102.rs
index cf6ce47e6b25..fe821ded5fa1 100644
--- a/drivers/gpu/nova-core/falcon/hal/ga102.rs
+++ b/drivers/gpu/nova-core/falcon/hal/ga102.rs
@@ -115,33 +115,34 @@ impl<E: FalconEngine> Ga102<E> {
 }

 impl<E: FalconEngine> FalconHal<E> for Ga102<E> {
-    fn select_core(&self, _falcon: &Falcon<E>, bar: Bar0<'_>) -> Result {
-        select_core_ga102::<E>(bar)
+    fn select_core(&self, falcon: &Falcon<'_, E>) -> Result {
+        select_core_ga102::<E>(falcon.bar)
     }

     fn signature_reg_fuse_version(
         &self,
-        falcon: &Falcon<E>,
-        bar: Bar0<'_>,
+        falcon: &Falcon<'_, E>,
         engine_id_mask: u16,
         ucode_id: u8,
     ) -> Result<u32> {
-        signature_reg_fuse_version_ga102(&falcon.dev, bar, engine_id_mask, ucode_id)
+        signature_reg_fuse_version_ga102(falcon.dev, falcon.bar, engine_id_mask, ucode_id)
     }

-    fn program_brom(&self, _falcon: &Falcon<E>, bar: Bar0<'_>, params: &FalconBromParams) {
-        program_brom_ga102::<E>(bar, params);
+    fn program_brom(&self, falcon: &Falcon<'_, E>, params: &FalconBromParams) {
+        program_brom_ga102::<E>(falcon.bar, params);
     }

-    fn is_riscv_active(&self, bar: Bar0<'_>) -> bool {
-        bar.read(regs::NV_PRISCV_RISCV_CPUCTL::of::<E>())
+    fn is_riscv_active(&self, falcon: &Falcon<'_, E>) -> bool {
+        falcon
+            .bar
+            .read(regs::NV_PRISCV_RISCV_CPUCTL::of::<E>())
             .active_stat()
     }

-    fn reset_wait_mem_scrubbing(&self, bar: Bar0<'_>) -> Result {
+    fn reset_wait_mem_scrubbing(&self, falcon: &Falcon<'_, E>) -> Result {
         // TIMEOUT: memory scrubbing should complete in less than 20ms.
         read_poll_timeout(
-            || Ok(bar.read(regs::NV_PFALCON_FALCON_HWCFG2::of::<E>())),
+            || Ok(falcon.bar.read(regs::NV_PFALCON_FALCON_HWCFG2::of::<E>())),
             |r| r.mem_scrubbing_done(),
             Delta::ZERO,
             Delta::from_millis(20),
@@ -149,7 +150,9 @@ impl<E: FalconEngine> FalconHal<E> for Ga102<E> {
         .map(|_| ())
     }

-    fn reset_eng(&self, bar: Bar0<'_>) -> Result {
+    fn reset_eng(&self, falcon: &Falcon<'_, E>) -> Result {
+        let bar = falcon.bar;
+
         let _ = bar.read(regs::NV_PFALCON_FALCON_HWCFG2::of::<E>());

         // According to OpenRM's `kflcnPreResetWait_GA102` documentation, HW sometimes does not set
@@ -162,7 +165,7 @@ impl<E: FalconEngine> FalconHal<E> for Ga102<E> {
         );

         regs::NV_PFALCON_FALCON_ENGINE::reset_engine::<E>(bar);
-        self.reset_wait_mem_scrubbing(bar)?;
+        self.reset_wait_mem_scrubbing(falcon)?;

         Ok(())
     }
diff --git a/drivers/gpu/nova-core/falcon/hal/tu102.rs b/drivers/gpu/nova-core/falcon/hal/tu102.rs
index 3aaee3869312..34bf9f3f44c7 100644
--- a/drivers/gpu/nova-core/falcon/hal/tu102.rs
+++ b/drivers/gpu/nova-core/falcon/hal/tu102.rs
@@ -13,7 +13,6 @@ use kernel::{
 };

 use crate::{
-    driver::Bar0,
     falcon::{
         hal::LoadMethod,
         Falcon,
@@ -34,31 +33,32 @@ impl<E: FalconEngine> Tu102<E> {
 }

 impl<E: FalconEngine> FalconHal<E> for Tu102<E> {
-    fn select_core(&self, _falcon: &Falcon<E>, _bar: Bar0<'_>) -> Result {
+    fn select_core(&self, _falcon: &Falcon<'_, E>) -> Result {
         Ok(())
     }

     fn signature_reg_fuse_version(
         &self,
-        _falcon: &Falcon<E>,
-        _bar: Bar0<'_>,
+        _falcon: &Falcon<'_, E>,
         _engine_id_mask: u16,
         _ucode_id: u8,
     ) -> Result<u32> {
         Ok(0)
     }

-    fn program_brom(&self, _falcon: &Falcon<E>, _bar: Bar0<'_>, _params: &FalconBromParams) {}
+    fn program_brom(&self, _falcon: &Falcon<'_, E>, _params: &FalconBromParams) {}

-    fn is_riscv_active(&self, bar: Bar0<'_>) -> bool {
-        bar.read(regs::NV_PRISCV_RISCV_CORE_SWITCH_RISCV_STATUS::of::<E>())
+    fn is_riscv_active(&self, falcon: &Falcon<'_, E>) -> bool {
+        falcon
+            .bar
+            .read(regs::NV_PRISCV_RISCV_CORE_SWITCH_RISCV_STATUS::of::<E>())
             .active_stat()
     }

-    fn reset_wait_mem_scrubbing(&self, bar: Bar0<'_>) -> Result {
+    fn reset_wait_mem_scrubbing(&self, falcon: &Falcon<'_, E>) -> Result {
         // TIMEOUT: memory scrubbing should complete in less than 10ms.
         read_poll_timeout(
-            || Ok(bar.read(regs::NV_PFALCON_FALCON_DMACTL::of::<E>())),
+            || Ok(falcon.bar.read(regs::NV_PFALCON_FALCON_DMACTL::of::<E>())),
             |r| r.mem_scrubbing_done(),
             Delta::ZERO,
             Delta::from_millis(10),
@@ -66,9 +66,9 @@ impl<E: FalconEngine> FalconHal<E> for Tu102<E> {
         .map(|_| ())
     }

-    fn reset_eng(&self, bar: Bar0<'_>) -> Result {
-        regs::NV_PFALCON_FALCON_ENGINE::reset_engine::<E>(bar);
-        self.reset_wait_mem_scrubbing(bar)?;
+    fn reset_eng(&self, falcon: &Falcon<'_, E>) -> Result {
+        regs::NV_PFALCON_FALCON_ENGINE::reset_engine::<E>(falcon.bar);
+        self.reset_wait_mem_scrubbing(falcon)?;

         Ok(())
     }
diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 725e428154cf..273cff752fae 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -23,11 +23,11 @@ use crate::{
     firmware::gsp::GspFirmware,
     gpu::Chipset,
     gsp,
-    num::FromSafeCast,
-    regs, //
+    num::FromSafeCast, //
 };

 mod hal;
+mod regs;

 /// Type holding the sysmem flush memory page, a page of memory to be written into the
 /// `NV_PFB_NISO_FLUSH_SYSMEM_ADDR*` registers and used to maintain memory coherency.
diff --git a/drivers/gpu/nova-core/fb/hal/gb202.rs b/drivers/gpu/nova-core/fb/hal/gb202.rs
index 038d1278c634..b78e0970f66d 100644
--- a/drivers/gpu/nova-core/fb/hal/gb202.rs
+++ b/drivers/gpu/nova-core/fb/hal/gb202.rs
@@ -4,13 +4,7 @@
 //! Blackwell GB20x framebuffer HAL.

 use kernel::{
-    io::{
-        register::{
-            RegisterBase,
-            WithBase, //
-        },
-        Io, //
-    },
+    io::Io,
     num::Bounded,
     prelude::*,
     sizes::SizeConstants, //
@@ -24,35 +18,29 @@ use crate::{

 struct Gb202;

-impl RegisterBase<regs::Fbhub0Base> for Gb202 {
-    const BASE: usize = 0x008a_0000;
-}
-
 fn read_sysmem_flush_page_gb202(bar: Bar0<'_>) -> u64 {
     let lo = u64::from(
-        bar.read(regs::NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_LO::of::<Gb202>())
+        bar.read(regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO)
             .adr(),
     );
     let hi = u64::from(
-        bar.read(regs::NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_HI::of::<Gb202>())
+        bar.read(regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI)
             .adr(),
     );

-    lo | (hi << 32)
+    (hi << 32) | lo
 }

 /// Write the sysmem flush page address through the GB20x FBHUB0 registers.
 fn write_sysmem_flush_page_gb202(bar: Bar0<'_>, addr: Bounded<u64, 52>) {
     // Write HI first. The hardware will trigger the flush on the LO write.
-    bar.write(
-        regs::NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_HI::of::<Gb202>(),
-        regs::NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_HI::zeroed()
+    bar.write_reg(
+        regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::zeroed()
             .with_adr(addr.shr::<32, 20>().cast::<u32>()),
     );
-    bar.write(
-        regs::NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_LO::of::<Gb202>(),
+    bar.write_reg(
         // CAST: lower 32 bits. Hardware ignores bits 7:0.
-        regs::NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_LO::zeroed().with_adr(*addr as u32),
+        regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::zeroed().with_adr(*addr as u32),
     );
 }

diff --git a/drivers/gpu/nova-core/fb/hal/gh100.rs b/drivers/gpu/nova-core/fb/hal/gh100.rs
index 5450c7254dad..d39fe99537ed 100644
--- a/drivers/gpu/nova-core/fb/hal/gh100.rs
+++ b/drivers/gpu/nova-core/fb/hal/gh100.rs
@@ -2,24 +2,49 @@
 // SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

 use kernel::{
+    io::Io,
+    num::Bounded,
     prelude::*,
     sizes::SizeConstants, //
 };

 use crate::{
     driver::Bar0,
-    fb::hal::FbHal, //
+    fb::hal::FbHal,
+    regs, //
 };

 struct Gh100;

+fn read_sysmem_flush_page_gh100(bar: Bar0<'_>) -> u64 {
+    let lo = u64::from(bar.read(regs::NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_LO).adr());
+    let hi = u64::from(bar.read(regs::NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_HI).adr());
+
+    (hi << 32) | lo
+}
+
+/// Write the sysmem flush page address through the Hopper FBHUB registers.
+fn write_sysmem_flush_page_gh100(bar: Bar0<'_>, addr: Bounded<u64, 52>) {
+    // Write HI first. The hardware will trigger the flush on the LO write.
+    bar.write_reg(
+        regs::NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_HI::zeroed()
+            .with_adr(addr.shr::<32, 20>().cast::<u32>()),
+    );
+    bar.write_reg(
+        // CAST: lower 32 bits. Hardware ignores bits 7:0.
+        regs::NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_LO::zeroed().with_adr(*addr as u32),
+    );
+}
+
 impl FbHal for Gh100 {
     fn read_sysmem_flush_page(&self, bar: Bar0<'_>) -> u64 {
-        super::ga100::read_sysmem_flush_page_ga100(bar)
+        read_sysmem_flush_page_gh100(bar)
     }

     fn write_sysmem_flush_page(&self, bar: Bar0<'_>, addr: u64) -> Result {
-        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+        let addr = Bounded::<u64, 52>::try_new(addr).ok_or(EINVAL)?;
+
+        write_sysmem_flush_page_gh100(bar, addr);

         Ok(())
     }
diff --git a/drivers/gpu/nova-core/fb/regs.rs b/drivers/gpu/nova-core/fb/regs.rs
new file mode 100644
index 000000000000..b2ec02f584be
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/regs.rs
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::io::register;
+
+// PDISP
+
+register! {
+    pub(super) NV_PDISP_VGA_WORKSPACE_BASE(u32) @ 0x00625f04 {
+        /// VGA workspace base address divided by 0x10000.
+        31:8 addr;
+        /// Set if the `addr` field is valid.
+        3:3 status_valid => bool;
+    }
+}
+
+impl NV_PDISP_VGA_WORKSPACE_BASE {
+    /// Returns the base address of the VGA workspace, or `None` if none exists.
+    pub(super) fn vga_workspace_addr(self) -> Option<u64> {
+        if self.status_valid() {
+            Some(u64::from(self.addr()) << 16)
+        } else {
+            None
+        }
+    }
+}
diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 1e89390209f5..a94820a3b335 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -88,7 +88,7 @@ pub(crate) struct FalconUCodeDescV2 {

 /// Structure used to describe some firmwares, notably FWSEC-FRTS.
 #[repr(C)]
-#[derive(Debug, Clone)]
+#[derive(Debug, Clone, FromBytes)]
 pub(crate) struct FalconUCodeDescV3 {
     /// Header defined by `NV_BIT_FALCON_UCODE_DESC_HEADER_VDESC*` in OpenRM.
     hdr: u32,
@@ -119,10 +119,6 @@ pub(crate) struct FalconUCodeDescV3 {
     _reserved: u16,
 }

-// SAFETY: all bit patterns are valid for this type, and it doesn't use
-// interior mutability.
-unsafe impl FromBytes for FalconUCodeDescV3 {}
-
 /// Enum wrapping the different versions of Falcon microcode descriptors.
 ///
 /// This allows handling both V2 and V3 descriptor formats through a
@@ -424,19 +420,20 @@ impl<const N: usize> ModInfoBuilder<N> {
         let name = chipset.name();

         let this = self
-            .make_entry_file(name, "booter_load")
-            .make_entry_file(name, "booter_unload")
             .make_entry_file(name, "bootloader")
             .make_entry_file(name, "gsp");

-        let this = if chipset.needs_fwsec_bootloader() {
-            this.make_entry_file(name, "gen_bootloader")
+        // FSP-based chipsets (Hopper, Blackwell and later) boot the GSP via the FMC image loaded by
+        // FSP. Older chipsets use the SEC2 booter instead.
+        let this = if chipset.uses_fsp() {
+            this.make_entry_file(name, "fmc")
         } else {
-            this
+            this.make_entry_file(name, "booter_load")
+                .make_entry_file(name, "booter_unload")
         };

-        if chipset.uses_fsp() {
-            this.make_entry_file(name, "fmc")
+        if chipset.needs_fwsec_bootloader() {
+            this.make_entry_file(name, "gen_bootloader")
         } else {
             this
         }
@@ -464,11 +461,9 @@ impl<const N: usize> ModInfoBuilder<N> {
 /// that scheme before nova-core becomes stable, which means this module will eventually be
 /// removed.
 mod elf {
-    use core::mem::size_of;
-
     use kernel::{
         bindings,
-        str::CStr,
+        prelude::*,
         transmute::FromBytes, //
     };

diff --git a/drivers/gpu/nova-core/firmware/booter.rs b/drivers/gpu/nova-core/firmware/booter.rs
index d9313ac361af..acb7f4d8a532 100644
--- a/drivers/gpu/nova-core/firmware/booter.rs
+++ b/drivers/gpu/nova-core/firmware/booter.rs
@@ -15,7 +15,6 @@ use kernel::{
 };

 use crate::{
-    driver::Bar0,
     falcon::{
         sec2::Sec2,
         Falcon,
@@ -293,8 +292,7 @@ impl BooterFirmware {
         kind: BooterKind,
         chipset: Chipset,
         ver: &str,
-        falcon: &Falcon<<Self as FalconFirmware>::Target>,
-        bar: Bar0<'_>,
+        falcon: &Falcon<'_, <Self as FalconFirmware>::Target>,
     ) -> Result<Self> {
         let fw_name = match kind {
             BooterKind::Loader => "booter_load",
@@ -339,11 +337,8 @@ impl BooterFirmware {
             } else {
                 // Obtain the version from the fuse register, and extract the corresponding
                 // signature.
-                let reg_fuse_version = falcon.signature_reg_fuse_version(
-                    bar,
-                    brom_params.engine_id_mask,
-                    brom_params.ucode_id,
-                )?;
+                let reg_fuse_version = falcon
+                    .signature_reg_fuse_version(brom_params.engine_id_mask, brom_params.ucode_id)?;

                 // `0` means the last signature should be used.
                 const FUSE_VERSION_USE_LAST_SIG: u32 = 0;
@@ -405,18 +400,14 @@ impl BooterFirmware {
     pub(crate) fn run<T>(
         &self,
         dev: &device::Device<device::Bound>,
-        bar: Bar0<'_>,
-        sec2_falcon: &Falcon<Sec2>,
+        sec2_falcon: &Falcon<'_, Sec2>,
         wpr_meta: &Coherent<T>,
     ) -> Result {
-        sec2_falcon.reset(bar)?;
-        sec2_falcon.load(dev, bar, self)?;
+        sec2_falcon.reset()?;
+        sec2_falcon.load(self)?;
         let wpr_handle = wpr_meta.dma_handle();
-        let (mbox0, mbox1) = sec2_falcon.boot(
-            bar,
-            Some(wpr_handle as u32),
-            Some((wpr_handle >> 32) as u32),
-        )?;
+        let (mbox0, mbox1) =
+            sec2_falcon.boot(Some(wpr_handle as u32), Some((wpr_handle >> 32) as u32))?;
         dev_dbg!(dev, "SEC2 MBOX0: {:#x}, MBOX1: {:#x}\n", mbox0, mbox1);

         if mbox0 != 0 {
diff --git a/drivers/gpu/nova-core/firmware/fwsec.rs b/drivers/gpu/nova-core/firmware/fwsec.rs
index 199ae2adb664..95e0dd77746b 100644
--- a/drivers/gpu/nova-core/firmware/fwsec.rs
+++ b/drivers/gpu/nova-core/firmware/fwsec.rs
@@ -27,7 +27,6 @@ use kernel::{
 };

 use crate::{
-    driver::Bar0,
     falcon::{
         gsp::Gsp,
         Falcon,
@@ -320,8 +319,7 @@ impl FwsecFirmware {
     /// command.
     pub(crate) fn new(
         dev: &Device<device::Bound>,
-        falcon: &Falcon<Gsp>,
-        bar: Bar0<'_>,
+        falcon: &Falcon<'_, Gsp>,
         bios: &Vbios,
         cmd: FwsecCommand,
     ) -> Result<Self> {
@@ -337,7 +335,7 @@ impl FwsecFirmware {
             .ok_or(EINVAL)?;
         let desc_sig_versions = u32::from(desc.signature_versions());
         let reg_fuse_version =
-            falcon.signature_reg_fuse_version(bar, desc.engine_id_mask(), desc.ucode_id())?;
+            falcon.signature_reg_fuse_version(desc.engine_id_mask(), desc.ucode_id())?;
         dev_dbg!(
             dev,
             "desc_sig_versions: {:#x}, reg_fuse_version: {}\n",
@@ -390,21 +388,16 @@ impl FwsecFirmware {
     /// This must only be called on chipsets that do not need the FWSEC bootloader (i.e., where
     /// [`Chipset::needs_fwsec_bootloader()`](crate::gpu::Chipset::needs_fwsec_bootloader) returns
     /// `false`). On chipsets that do, use [`bootloader::FwsecFirmwareWithBl`] instead.
-    pub(crate) fn run(
-        &self,
-        dev: &Device<device::Bound>,
-        falcon: &Falcon<Gsp>,
-        bar: Bar0<'_>,
-    ) -> Result<()> {
+    pub(crate) fn run(&self, dev: &Device<device::Bound>, falcon: &Falcon<'_, Gsp>) -> Result<()> {
         // Reset falcon, load the firmware, and run it.
         falcon
-            .reset(bar)
+            .reset()
             .inspect_err(|e| dev_err!(dev, "Failed to reset GSP falcon: {:?}\n", e))?;
         falcon
-            .load(dev, bar, self)
+            .load(self)
             .inspect_err(|e| dev_err!(dev, "Failed to load FWSEC firmware: {:?}\n", e))?;
         let (mbox0, _) = falcon
-            .boot(bar, Some(0), None)
+            .boot(Some(0), None)
             .inspect_err(|e| dev_err!(dev, "Failed to boot FWSEC firmware: {:?}\n", e))?;
         if mbox0 != 0 {
             dev_err!(dev, "FWSEC firmware returned error {}\n", mbox0);
diff --git a/drivers/gpu/nova-core/firmware/fwsec/bootloader.rs b/drivers/gpu/nova-core/firmware/fwsec/bootloader.rs
index 039920dc340b..d9fafd2eea5b 100644
--- a/drivers/gpu/nova-core/firmware/fwsec/bootloader.rs
+++ b/drivers/gpu/nova-core/firmware/fwsec/bootloader.rs
@@ -7,16 +7,12 @@
 //! be loaded using PIO.

 use kernel::{
-    alloc::KVec,
     device::{
         self,
         Device, //
     },
     dma::Coherent,
-    io::{
-        register::WithBase, //
-        Io,
-    },
+    io::{register::WithBase, Io},
     prelude::*,
     ptr::{
         Alignable,
@@ -51,7 +47,7 @@ use crate::{
         FIRMWARE_VERSION, //
     },
     gpu::Chipset,
-    num::FromSafeCast,
+    num::FromSafeCast, //
     regs,
 };

@@ -279,15 +275,15 @@ impl FwsecFirmwareWithBl {
     pub(crate) fn run(
         &self,
         dev: &Device<device::Bound>,
-        falcon: &Falcon<Gsp>,
+        falcon: &Falcon<'_, Gsp>,
         bar: Bar0<'_>,
     ) -> Result<()> {
         // Reset falcon, load the firmware, and run it.
         falcon
-            .reset(bar)
+            .reset()
             .inspect_err(|e| dev_err!(dev, "Failed to reset GSP falcon: {:?}\n", e))?;
         falcon
-            .pio_load(bar, self)
+            .pio_load(self)
             .inspect_err(|e| dev_err!(dev, "Failed to load FWSEC firmware: {:?}\n", e))?;

         // Configure DMA index for the bootloader to fetch the FWSEC firmware from system memory.
@@ -302,7 +298,7 @@ impl FwsecFirmwareWithBl {
         );

         let (mbox0, _) = falcon
-            .boot(bar, Some(0), None)
+            .boot(Some(0), None)
             .inspect_err(|e| dev_err!(dev, "Failed to boot FWSEC firmware: {:?}\n", e))?;
         if mbox0 != 0 {
             dev_err!(dev, "FWSEC firmware returned error {}\n", mbox0);
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 8fc243c66e35..f0c595175c9c 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -11,6 +11,7 @@ use kernel::{
     device,
     dma::Coherent,
     io::poll::read_poll_timeout,
+    num::TryIntoBounded,
     prelude::*,
     ptr::{
         Alignable,
@@ -31,9 +32,12 @@ use crate::{
         Falcon, //
     },
     fb::FbLayout,
-    firmware::fsp::{
-        FmcSignatures,
-        FspFirmware, //
+    firmware::{
+        fsp::{
+            FmcSignatures,
+            FspFirmware, //
+        },
+        FIRMWARE_VERSION, //
     },
     gpu::Chipset,
     gsp::GspFmcBootParams,
@@ -57,12 +61,35 @@ struct NvdmPayloadCommandResponse {
     error_code: u32,
 }

-/// Complete FSP response structure with MCTP and NVDM headers.
+/// Common MCTP and NVDM headers shared by all FSP messages.
 #[repr(C, packed)]
 #[derive(Clone, Copy)]
-struct FspResponse {
+struct FspMessageHeader {
     mctp_header: MctpHeader,
     nvdm_header: NvdmHeader,
+}
+
+// SAFETY: FspMessageHeader is a packed C struct with only integral fields.
+unsafe impl AsBytes for FspMessageHeader {}
+
+// SAFETY: FspMessageHeader is a packed C struct with only integral fields.
+unsafe impl FromBytes for FspMessageHeader {}
+
+impl FspMessageHeader {
+    /// Construct a standard FSP message header for the given NVDM type.
+    fn new(nvdm_type: NvdmType) -> Self {
+        Self {
+            mctp_header: MctpHeader::single_packet(),
+            nvdm_header: NvdmHeader::new(nvdm_type),
+        }
+    }
+}
+
+/// Complete FSP response structure with MCTP and NVDM headers.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct FspResponse {
+    header: FspMessageHeader,
     response: NvdmPayloadCommandResponse,
 }

@@ -94,23 +121,23 @@ struct NvdmPayloadCot {
     gsp_boot_args_sysmem_offset: u64,
 }

-/// Complete FSP message structure with MCTP and NVDM headers.
+/// Complete FSP COT (Chain of Trust) message structure.
 #[repr(C)]
 #[derive(Clone, Copy)]
-struct FspMessage {
-    mctp_header: MctpHeader,
-    nvdm_header: NvdmHeader,
+struct FspCotMessage {
+    header: FspMessageHeader,
     cot: NvdmPayloadCot,
 }

-impl FspMessage {
-    /// Returns an in-place initializer for [`FspMessage`].
+impl FspCotMessage {
+    /// Returns an in-place initializer for [`FspCotMessage`].
     fn new<'a>(
         fb_layout: &FbLayout,
         fsp_fw: &'a FspFirmware,
         args: &'a FmcBootArgs,
     ) -> Result<impl Init<Self> + 'a> {
-        // frts_offset is relative to FB end: FRTS_location = FB_END - frts_offset
+        // frts_vidmem_offset is measured from the end of FB, so FRTS sits at
+        // (end of FB) - frts_vidmem_offset.
         let frts_vidmem_offset = if !args.resume {
             let frts_reserved_size = fb_layout.heap.len() + u64::from(fb_layout.pmu_reserved_size);

@@ -131,8 +158,7 @@ impl FspMessage {
         let size = num::usize_into_u16::<{ core::mem::size_of::<NvdmPayloadCot>() }>();

         Ok(init!(Self {
-            mctp_header: MctpHeader::single_packet(),
-            nvdm_header: NvdmHeader::new(NvdmType::Cot),
+            header: FspMessageHeader::new(NvdmType::Cot),
             // The payload is packed, so we cannot use `init!`. Initialize it member-by-member using
             // `chain`.
             cot <- pin_init::init_zeroed(),
@@ -143,8 +169,8 @@ impl FspMessage {
             msg.cot.gsp_fmc_sysmem_offset = fsp_fw.fmc_image.dma_handle();
             msg.cot.frts_vidmem_offset = frts_vidmem_offset;
             msg.cot.frts_vidmem_size = frts_size;
-            // frts_sysmem_* intentionally left at zero for now, but will be needed for e.g.
-            // systems without VRAM.
+            // frts_sysmem_* are left at zero because this path places FRTS in vidmem. The sysmem
+            // fields point to an FRTS buffer in sysmem instead, for systems without VRAM.
             msg.cot.gsp_boot_args_sysmem_offset = args.fmc_boot_params.dma_handle();
             msg.cot.sigs = *fsp_fw.fmc_sigs;

@@ -153,11 +179,11 @@ impl FspMessage {
     }
 }

-// SAFETY: `FspMessage` is `#[repr(C)]` with no padding, so all of its
+// SAFETY: `FspCotMessage` is `#[repr(C)]` with no padding, so all of its
 // bytes are initialized.
-unsafe impl AsBytes for FspMessage {}
+unsafe impl AsBytes for FspCotMessage {}

-impl MessageToFsp for FspMessage {
+impl MessageToFsp for FspCotMessage {
     const NVDM_TYPE: NvdmType = NvdmType::Cot;
 }

@@ -199,28 +225,28 @@ impl FmcBootArgs {
 /// An `Fsp` is produced by [`Fsp::wait_secure_boot`], which only returns once FSP secure boot
 /// has completed. It owns the FSP falcon and the FMC firmware, which are used for the subsequent
 /// Chain of Trust boot.
-pub(crate) struct Fsp {
-    falcon: Falcon<FspEngine>,
+pub(crate) struct Fsp<'a> {
+    falcon: Falcon<'a, FspEngine>,
     fsp_fw: FspFirmware,
 }

-impl Fsp {
+impl<'a> Fsp<'a> {
     /// Waits for FSP secure boot completion, then returns the [`Fsp`] interface.
     ///
     /// Polls the thermal scratch register until FSP signals boot completion or the timeout
     /// elapses. Returning an [`Fsp`] only on success guarantees, at the API level, that the
     /// interface is not used before secure boot has completed.
     pub(crate) fn wait_secure_boot(
-        dev: &device::Device<device::Bound>,
-        bar: Bar0<'_>,
+        dev: &'a device::Device<device::Bound>,
+        bar: Bar0<'a>,
         chipset: Chipset,
-        fsp_fw: FspFirmware,
-    ) -> Result<Fsp> {
+    ) -> Result<Fsp<'a>> {
         /// FSP secure boot completion timeout in milliseconds.
         const FSP_SECURE_BOOT_TIMEOUT_MS: i64 = 5000;

         let hal = hal::fsp_hal(chipset).ok_or(ENOTSUPP)?;
-        let falcon = Falcon::<FspEngine>::new(dev, chipset)?;
+        let falcon = Falcon::<FspEngine>::new(dev, chipset, bar)?;
+        let fsp_fw = FspFirmware::new(dev, chipset, FIRMWARE_VERSION)?;

         read_poll_timeout(
             || Ok(hal.fsp_boot_status(bar)),
@@ -236,13 +262,14 @@ impl Fsp {
     }

     /// Sends a message to FSP and waits for the response.
-    fn send_sync_fsp<M>(&mut self, dev: &device::Device, bar: Bar0<'_>, msg: &M) -> Result
+    /// Returns the full response buffer on success.
+    fn send_sync_fsp<M>(&mut self, dev: &device::Device, msg: &M) -> Result<KVec<u8>>
     where
         M: MessageToFsp,
     {
-        self.falcon.send_msg(bar, msg.as_bytes())?;
+        self.falcon.send_msg(msg.as_bytes())?;
 
-        let response_buf = self.falcon.recv_msg(bar).inspect_err(|e| {
+        let response_buf = self.falcon.recv_msg().inspect_err(|e| {
             dev_err!(dev, "FSP response error: {:?}\n", e);
         })?;
 
@@ -251,8 +278,8 @@ impl Fsp {
             EIO
         })?;
 
-        let mctp_header = response.mctp_header;
-        let nvdm_header = response.nvdm_header;
+        let mctp_header = response.header.mctp_header;
+        let nvdm_header = response.header.nvdm_header;
         let command_nvdm_type = response.response.command_nvdm_type;
         let error_code = response.response.error_code;
 
@@ -274,7 +301,7 @@ impl Fsp {
             return Err(EIO);
         }
 
-        if command_nvdm_type != u8::from(M::NVDM_TYPE).into() {
+        if command_nvdm_type.try_into_bounded() != Some(M::NVDM_TYPE.into()) {
             dev_err!(
                 dev,
                 "Expected NVDM type {:?} in reply, got {:#x}\n",
@@ -294,7 +321,7 @@ impl Fsp {
             return Err(EIO);
         }
 
-        Ok(())
+        Ok(response_buf)
     }
 
     /// Boots GSP FMC via FSP Chain of Trust.
@@ -304,15 +331,17 @@ impl Fsp {
     pub(crate) fn boot_fmc(
         &mut self,
         dev: &device::Device<device::Bound>,
-        bar: Bar0<'_>,
         fb_layout: &FbLayout,
         args: &FmcBootArgs,
     ) -> Result {
         dev_dbg!(dev, "Starting FSP boot sequence for {}\n", args.chipset);
 
-        let msg = KBox::init(FspMessage::new(fb_layout, &self.fsp_fw, args)?, GFP_KERNEL)?;
+        let msg = KBox::init(
+            FspCotMessage::new(fb_layout, &self.fsp_fw, args)?,
+            GFP_KERNEL,
+        )?;
 
-        self.send_sync_fsp(dev, bar, &*msg)?;
+        let _response_buf = self.send_sync_fsp(dev, &*msg)?;
 
         dev_dbg!(dev, "FSP Chain of Trust completed successfully\n");
         Ok(())
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index b3c91731db45..43c3f4f8df71 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -9,7 +9,8 @@ use kernel::{
     io::Io,
     num::Bounded,
     pci,
-    prelude::*, //
+    prelude::*,
+    sizes::SizeConstants, //
 };
 
 use crate::{
@@ -23,7 +24,9 @@ use crate::{
     fb::SysmemFlush,
     gsp::{
         self,
-        Gsp, //
+        commands::GetGspStaticInfoReply,
+        Gsp,
+        GspBootContext, //
     },
     regs,
 };
@@ -262,35 +265,64 @@ impl fmt::Display for Spec {
     }
 }
 
-/// Structure holding the resources required to operate the GPU.
+/// Self-contained resources to operate and drop the GSP.
 #[pin_data(PinnedDrop)]
-pub(crate) struct Gpu<'gpu> {
+struct GspResources<'gpu> {
     /// Device owning the GPU.
     device: &'gpu device::Device<device::Bound>,
-    spec: Spec,
     /// MMIO mapping of PCI BAR 0.
     bar: Bar0<'gpu>,
-    /// System memory page required for flushing all pending GPU-side memory writes done through
-    /// PCIE into system memory, via sysmembar (A GPU-initiated HW memory-barrier operation).
-    sysmem_flush: SysmemFlush<'gpu>,
     /// GSP falcon instance, used for GSP boot up and cleanup.
-    gsp_falcon: Falcon<GspFalcon>,
+    gsp_falcon: Falcon<'gpu, GspFalcon>,
     /// SEC2 falcon instance, used for GSP boot up and cleanup.
-    sec2_falcon: Falcon<Sec2Falcon>,
-    /// GSP runtime data. Temporarily an empty placeholder.
+    sec2_falcon: Falcon<'gpu, Sec2Falcon>,
+    /// GSP runtime data.
     #[pin]
     gsp: Gsp,
     /// GSP unload firmware bundle, if any.
     unload_bundle: Option<gsp::UnloadBundle>,
 }
 
+/// Structure holding the resources required to operate the GPU.
+#[pin_data]
+pub(crate) struct Gpu<'gpu> {
+    spec: Spec,
+    /// Static GPU information as provided by the GSP.
+    gsp_static_info: GetGspStaticInfoReply,
+    /// GSP and its resources.
+    #[pin]
+    gsp_resources: GspResources<'gpu>,
+    /// System memory page required for flushing all pending GPU-side memory writes done through
+    /// PCIE into system memory, via sysmembar (A GPU-initiated HW memory-barrier operation).
+    ///
+    /// Must be kept declared *after* `gsp_resources`, as the latter's `PinnedDrop` implementation
+    /// requires the sysmem flush page to be in place.
+    sysmem_flush: SysmemFlush<'gpu>,
+}
+
+#[pinned_drop]
+impl PinnedDrop for GspResources<'_> {
+    fn drop(self: Pin<&mut Self>) {
+        let this = self.project();
+        let device = *this.device;
+        let bar = *this.bar;
+        let bundle = this.unload_bundle.take();
+
+        let _ = this
+            .gsp
+            .as_ref()
+            .get_ref()
+            .unload(device, bar, &*this.gsp_falcon, &*this.sec2_falcon, bundle)
+            .inspect_err(|e| dev_err!(device, "failed to unload GSP: {:?}\n", e));
+    }
+}
+
 impl<'gpu> Gpu<'gpu> {
     pub(crate) fn new(
         pdev: &'gpu pci::Device<device::Core<'_>>,
         bar: Bar0<'gpu>,
     ) -> impl PinInit<Self, Error> + 'gpu {
         try_pin_init!(Self {
-            device: pdev.as_ref(),
             spec: Spec::new(pdev.as_ref(), bar).inspect(|spec| {
                 dev_info!(pdev,"NVIDIA ({})\n", spec);
             })?,
@@ -308,40 +340,62 @@ impl<'gpu> Gpu<'gpu> {
                     .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
             },
 
+            // Initialize this early because `gsp_resources` depends on it.
             sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?,
 
-            gsp_falcon: Falcon::new(
-                pdev.as_ref(),
-                spec.chipset,
-            )
-            .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
-
-            sec2_falcon: Falcon::new(pdev.as_ref(), spec.chipset)?,
+            gsp_resources <- try_pin_init!(GspResources {
+                device: pdev.as_ref(),
+
+                bar,
+
+                gsp_falcon: Falcon::new(
+                    pdev.as_ref(),
+                    spec.chipset,
+                    bar
+                )
+                .inspect(|falcon| falcon.clear_swgen0_intr())?,
+
+                sec2_falcon: Falcon::new(pdev.as_ref(), spec.chipset, bar)?,
+
+                gsp <- Gsp::new(pdev),
+
+                // This member must be initialized last, so the `UnloadBundle` can never be dropped
+                // from outside of the constructed `GspResources`, ensuring that the unload sequence
+                // is properly run in case of failure.
+                unload_bundle: gsp.boot(GspBootContext {
+                    pdev,
+                    bar,
+                    chipset: spec.chipset,
+                    gsp_falcon,
+                    sec2_falcon,
+                })?,
+            }),
+
+            gsp_static_info: {
+                // Obtain and display basic GPU information.
+                let info = gsp_resources.gsp.get_static_info(bar)?;
+                match info.gpu_name() {
+                    Ok(name) => dev_info!(pdev, "GPU name: {}\n", name),
+                    Err(e) => dev_warn!(pdev, "GPU name unavailable: {:?}\n", e),
+                }
 
-            gsp <- Gsp::new(pdev),
+                if !info.usable_fb_regions.is_empty() {
+                    dev_dbg!(pdev, "Usable FB regions:\n");
+                    for region in &info.usable_fb_regions {
+                        dev_dbg!(pdev, " - {:#x?}\n", region);
+                    }
+
+                    dev_dbg!(
+                        pdev,
+                        "Total usable VRAM: {} MiB\n",
+                        info.usable_fb_regions.iter().fold(0u64, |res, region| res
+                            .saturating_add(region.end - region.start))
+                            / u64::SZ_1M
+                    );
+                }
 
-            // This member must be initialized last, so the `UnloadBundle` can never be dropped from
-            // outside of the constructed `Gpu`, ensuring that the unload sequence is properly run
-            // in case of failure.
-            unload_bundle: gsp.boot(pdev, bar, spec.chipset, gsp_falcon, sec2_falcon)?,
-            bar,
+                info
+            }
         })
     }
 }
-
-#[pinned_drop]
-impl PinnedDrop for Gpu<'_> {
-    fn drop(self: Pin<&mut Self>) {
-        let this = self.project();
-        let device = *this.device;
-        let bar = *this.bar;
-        let bundle = this.unload_bundle.take();
-
-        let _ = this
-            .gsp
-            .as_ref()
-            .get_ref()
-            .unload(device, bar, &*this.gsp_falcon, &*this.sec2_falcon, bundle)
-            .inspect_err(|e| dev_err!(device, "failed to unload GSP: {:?}\n", e));
-    }
-}
diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
index 69175ca3315c..b4ac4156056e 100644
--- a/drivers/gpu/nova-core/gsp.rs
+++ b/drivers/gpu/nova-core/gsp.rs
@@ -22,6 +22,7 @@ use kernel::{
 pub(crate) mod cmdq;
 pub(crate) mod commands;
 mod fw;
+mod regs;
 mod sequencer;
 
 pub(crate) use fw::{
@@ -31,10 +32,19 @@ pub(crate) use fw::{
 };
 
 use crate::{
-    gsp::cmdq::Cmdq,
-    gsp::fw::{
-        GspArgumentsPadded,
-        LibosMemoryRegionInitArgument, //
+    driver::Bar0,
+    falcon::{
+        gsp::Gsp as GspFalcon,
+        sec2::Sec2 as Sec2Falcon,
+        Falcon, //
+    },
+    gpu::Chipset,
+    gsp::{
+        cmdq::Cmdq,
+        fw::{
+            GspArgumentsPadded,
+            LibosMemoryRegionInitArgument, //
+        },
     },
     num,
 };
@@ -42,6 +52,21 @@ use crate::{
 pub(crate) const GSP_PAGE_SHIFT: usize = 12;
 pub(crate) const GSP_PAGE_SIZE: usize = 1 << GSP_PAGE_SHIFT;
 
+/// Common context for the GSP boot process.
+pub(crate) struct GspBootContext<'a> {
+    pub(crate) pdev: &'a pci::Device<device::Bound>,
+    pub(crate) bar: Bar0<'a>,
+    pub(crate) chipset: Chipset,
+    pub(crate) gsp_falcon: &'a Falcon<'a, GspFalcon>,
+    pub(crate) sec2_falcon: &'a Falcon<'a, Sec2Falcon>,
+}
+
+impl<'a> GspBootContext<'a> {
+    pub(crate) fn dev(&self) -> &'a device::Device<device::Bound> {
+        self.pdev.as_ref()
+    }
+}
+
 /// Number of GSP pages to use in a RM log buffer.
 const RM_LOG_BUFFER_NUM_PAGES: usize = 0x10;
 const LOG_BUFFER_SIZE: usize = RM_LOG_BUFFER_NUM_PAGES * GSP_PAGE_SIZE;
@@ -185,6 +210,11 @@ impl Gsp {
             }))
         })
     }
+
+    /// Query the GSP for the static GPU information.
+    pub(crate) fn get_static_info(&self, bar: Bar0<'_>) -> Result<commands::GetGspStaticInfoReply> {
+        self.cmdq.send_command(bar, commands::GetGspStaticInfo)
+    }
 }
 
 /// Opaque bundle required to unload the GSP. Created by [`Gsp::boot`], consumed by [`Gsp::unload`].
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 8afb62d689cb..ab0491b57944 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -6,7 +6,6 @@ use kernel::{
     device,
     dma::Coherent,
     io::poll::read_poll_timeout,
-    pci,
     prelude::*,
     time::Delta,
     types::ScopeGuard, //
@@ -24,7 +23,6 @@ use crate::{
         gsp::GspFirmware,
         FIRMWARE_VERSION, //
     },
-    gpu::Chipset,
     gsp::{
         cmdq::Cmdq,
         commands,
@@ -39,8 +37,8 @@ pub(super) struct BootUnloadArgs<'a> {
     gsp: &'a super::Gsp,
     dev: &'a device::Device<device::Bound>,
     bar: Bar0<'a>,
-    gsp_falcon: &'a Falcon<Gsp>,
-    sec2_falcon: &'a Falcon<Sec2>,
+    gsp_falcon: &'a Falcon<'a, Gsp>,
+    sec2_falcon: &'a Falcon<'a, Sec2>,
     unload_bundle: Option<super::UnloadBundle>,
 }
 
@@ -58,8 +56,8 @@ impl<'a> BootUnloadGuard<'a> {
         gsp: &'a super::Gsp,
         dev: &'a device::Device<device::Bound>,
         bar: Bar0<'a>,
-        gsp_falcon: &'a Falcon<Gsp>,
-        sec2_falcon: &'a Falcon<Sec2>,
+        gsp_falcon: &'a Falcon<'a, Gsp>,
+        sec2_falcon: &'a Falcon<'a, Sec2>,
         unload_bundle: Option<super::UnloadBundle>,
     ) -> Self {
         Self {
@@ -103,12 +101,12 @@ impl super::Gsp {
     /// [`Self::unload`]) returned.
     pub(crate) fn boot(
         self: Pin<&mut Self>,
-        pdev: &pci::Device<device::Bound>,
-        bar: Bar0<'_>,
-        chipset: Chipset,
-        gsp_falcon: &Falcon<Gsp>,
-        sec2_falcon: &Falcon<Sec2>,
+        ctx: super::GspBootContext<'_>,
     ) -> Result<Option<super::UnloadBundle>> {
+        let pdev = ctx.pdev;
+        let bar = ctx.bar;
+        let chipset = ctx.chipset;
+        let gsp_falcon = ctx.gsp_falcon;
         let dev = pdev.as_ref();
         let hal = super::hal::gsp_hal(chipset);
 
@@ -120,46 +118,30 @@ impl super::Gsp {
         let wpr_meta = Coherent::init(dev, GFP_KERNEL, GspFwWprMeta::new(&gsp_fw, &fb_layout))?;
 
         // Perform the chipset-specific boot sequence, and retrieve the unload bundle.
-        let unload_guard = hal.boot(
-            &self,
-            dev,
-            bar,
-            chipset,
-            &fb_layout,
-            &wpr_meta,
-            gsp_falcon,
-            sec2_falcon,
-        )?;
+        let unload_guard = hal.boot(&self, &ctx, &fb_layout, &wpr_meta)?;
 
-        gsp_falcon.write_os_version(bar, gsp_fw.bootloader.app_version);
+        gsp_falcon.write_os_version(gsp_fw.bootloader.app_version);
 
         // Poll for RISC-V to become active before continuing.
         read_poll_timeout(
-            || Ok(gsp_falcon.is_riscv_active(bar)),
+            || Ok(gsp_falcon.is_riscv_active()),
             |val: &bool| *val,
             Delta::from_millis(10),
             Delta::from_secs(5),
         )?;
 
-        dev_dbg!(pdev, "RISC-V active? {}\n", gsp_falcon.is_riscv_active(bar),);
+        dev_dbg!(pdev, "RISC-V active? {}\n", gsp_falcon.is_riscv_active(),);
 
         self.cmdq
             .send_command_no_wait(bar, commands::SetSystemInfo::new(pdev, chipset))?;
         self.cmdq
             .send_command_no_wait(bar, commands::SetRegistry::new())?;
 
-        hal.post_boot(&self, dev, bar, &gsp_fw, gsp_falcon, sec2_falcon)?;
+        hal.post_boot(&self, &ctx, &gsp_fw)?;
 
         // Wait until GSP is fully initialized.
         commands::wait_gsp_init_done(&self.cmdq)?;
 
-        // Obtain and display basic GPU information.
-        let info = self.cmdq.send_command(bar, commands::GetGspStaticInfo)?;
-        match info.gpu_name() {
-            Ok(name) => dev_info!(pdev, "GPU name: {}\n", name),
-            Err(e) => dev_warn!(pdev, "GPU name unavailable: {:?}\n", e),
-        }
-
         Ok(unload_guard.dismiss())
     }
 
@@ -167,7 +149,7 @@ impl super::Gsp {
     fn shutdown_gsp(
         cmdq: &Cmdq,
         bar: Bar0<'_>,
-        gsp_falcon: &Falcon<Gsp>,
+        gsp_falcon: &Falcon<'_, Gsp>,
         mode: commands::PowerStateLevel,
     ) -> Result {
         // Command to shut the GSP down.
@@ -176,7 +158,7 @@ impl super::Gsp {
         // Wait until GSP signals it is suspended.
         const LIBOS_INTERRUPT_PROCESSOR_SUSPENDED: u32 = bits::bit_u32(31);
         read_poll_timeout(
-            || Ok(gsp_falcon.read_mailbox0(bar)),
+            || Ok(gsp_falcon.read_mailbox0()),
             |&mb0| mb0 & LIBOS_INTERRUPT_PROCESSOR_SUSPENDED != 0,
             Delta::from_millis(10),
             Delta::from_secs(5),
@@ -191,8 +173,8 @@ impl super::Gsp {
         &self,
         dev: &device::Device<device::Bound>,
         bar: Bar0<'_>,
-        gsp_falcon: &Falcon<Gsp>,
-        sec2_falcon: &Falcon<Sec2>,
+        gsp_falcon: &Falcon<'_, Gsp>,
+        sec2_falcon: &Falcon<'_, Sec2>,
         unload_bundle: Option<super::UnloadBundle>,
     ) -> Result {
         // Shut down the GSP. Keep going even in case of error.
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 070de0731e95..0671ee8a9960 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -51,10 +51,11 @@ use crate::{
         GSP_PAGE_SIZE, //
     },
     num,
-    regs,
     sbuffer::SBufferIter, //
 };
 
+use super::regs;
+
 /// Marker type representing the absence of a reply for a command. Commands using this as their
 /// reply type are sent using [`Cmdq::send_command_no_wait`].
 pub(crate) struct NoReply;
diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs
index f84de9f4f045..86a3747cd31c 100644
--- a/drivers/gpu/nova-core/gsp/commands.rs
+++ b/drivers/gpu/nova-core/gsp/commands.rs
@@ -5,6 +5,7 @@ use core::{
     array,
     convert::Infallible,
     ffi::FromBytesUntilNulError,
+    ops::Range,
     str::Utf8Error, //
 };
 
@@ -191,22 +192,30 @@ impl CommandToGsp for GetGspStaticInfo {
     }
 }
 
-/// The reply from the GSP to the [`GetGspInfo`] command.
+/// The reply from the GSP to the [`GetGspStaticInfo`] command.
 pub(crate) struct GetGspStaticInfoReply {
     gpu_name: [u8; 64],
+    /// Usable FB (VRAM) regions for driver memory allocation.
+    pub(crate) usable_fb_regions: KVec<Range<u64>>,
 }
 
 impl MessageFromGsp for GetGspStaticInfoReply {
     const FUNCTION: MsgFunction = MsgFunction::GetGspStaticInfo;
     type Message = fw::commands::GspStaticConfigInfo;
-    type InitError = Infallible;
+    type InitError = Error;
 
     fn read(
         msg: &Self::Message,
         _sbuffer: &mut SBufferIter<array::IntoIter<&[u8], 2>>,
     ) -> Result<Self, Self::InitError> {
+        let mut usable_fb_regions = KVec::new();
+        for region in msg.usable_fb_regions() {
+            usable_fb_regions.push(region, GFP_KERNEL)?;
+        }
+
         Ok(GetGspStaticInfoReply {
             gpu_name: msg.gpu_name_str(),
+            usable_fb_regions,
         })
     }
 }
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 4db0cfa4dc4d..2590931262af 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -10,6 +10,7 @@ use r570_144 as bindings;
 use core::ops::Range;
 
 use kernel::{
+    bitfield,
     dma::Coherent,
     prelude::*,
     ptr::{
@@ -219,7 +220,6 @@ impl GspFwWprMeta {
         gsp_firmware: &'a GspFirmware,
         fb_layout: &'a FbLayout,
     ) -> impl Init<Self> + 'a {
-        #[allow(non_snake_case)]
         let init_inner = init!(bindings::GspFwWprMeta {
             // CAST: we want to store the bits of `GSP_FW_WPR_META_MAGIC` unmodified.
             magic: bindings::GSP_FW_WPR_META_MAGIC as u64,
@@ -674,7 +674,6 @@ impl LibosMemoryRegionInitArgument {
             u64::from_ne_bytes(bytes)
         }
 
-        #[allow(non_snake_case)]
         let init_inner = init!(bindings::LibosMemoryRegionInitArgument {
             id8: id8(name),
             pa: obj.dma_handle(),
@@ -742,8 +741,8 @@ unsafe impl AsBytes for MsgqRxHeader {}
 
 bitfield! {
     struct MsgHeaderVersion(u32) {
-        31:24 major as u8;
-        23:16 minor as u8;
+        31:24 major;
+        23:16 minor;
     }
 }
 
@@ -752,9 +751,9 @@ impl MsgHeaderVersion {
     const MINOR_TOT: u8 = 0;
 
     fn new() -> Self {
-        Self::default()
-            .set_major(Self::MAJOR_TOT)
-            .set_minor(Self::MINOR_TOT)
+        Self::zeroed()
+            .with_major(Self::MAJOR_TOT)
+            .with_minor(Self::MINOR_TOT)
     }
 }
 
@@ -793,7 +792,6 @@ impl GspMsgElement {
     /// * `sequence` - Sequence number of the message.
     /// * `cmd_size` - Size of the command (not including the message element), in bytes.
     /// * `function` - Function of the message.
-    #[allow(non_snake_case)]
     pub(crate) fn init(
         sequence: u32,
         cmd_size: usize,
@@ -876,7 +874,6 @@ pub(crate) struct GspArgumentsCached {
 impl GspArgumentsCached {
     /// Creates the arguments for starting the GSP up using `cmdq` as its command queue.
     pub(crate) fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
-        #[allow(non_snake_case)]
         let init_inner = init!(bindings::GSP_ARGUMENTS_CACHED {
             messageQueueInitArguments <- MessageQueueInitArguments::new(cmdq),
             bDmemStack: 1,
@@ -923,7 +920,6 @@ type MessageQueueInitArguments = bindings::MESSAGE_QUEUE_INIT_ARGUMENTS;
 
 impl MessageQueueInitArguments {
     /// Creates a new init arguments structure for `cmdq`.
-    #[allow(non_snake_case)]
     fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
         init!(MessageQueueInitArguments {
             sharedMemPhysAddr: cmdq.dma_handle,
@@ -947,7 +943,6 @@ type GspAcrBootGspRmParams = bindings::GSP_ACR_BOOT_GSP_RM_PARAMS;
 
 impl GspAcrBootGspRmParams {
     fn new(target: GspDmaTarget, wpr_meta_addr: u64) -> impl Init<Self> {
-        #[allow(non_snake_case)]
         let params = init!(Self {
             target: target as u32,
             gspRmDescSize: num::usize_into_u32::<{ size_of::<GspFwWprMeta>() }>(),
@@ -966,7 +961,6 @@ type GspRmParams = bindings::GSP_RM_PARAMS;
 
 impl GspRmParams {
     fn new(target: GspDmaTarget, libos_addr: u64) -> impl Init<Self> {
-        #[allow(non_snake_case)]
         let params = init!(Self {
             target: target as u32,
             bootArgsOffset: libos_addr,
@@ -986,7 +980,6 @@ unsafe impl FromBytes for GspFmcBootParams {}
 
 impl GspFmcBootParams {
     pub(crate) fn new(wpr_meta_addr: u64, libos_addr: u64) -> impl Init<Self> {
-        #[allow(non_snake_case)]
         let init = init!(Self {
             // Blackwell FSP obtains WPR info from other sources, so
             // wprCarveoutOffset and wprCarveoutSize are left zero.
diff --git a/drivers/gpu/nova-core/gsp/fw/commands.rs b/drivers/gpu/nova-core/gsp/fw/commands.rs
index 7bcc41fc7fa0..6dc31d1bf5ae 100644
--- a/drivers/gpu/nova-core/gsp/fw/commands.rs
+++ b/drivers/gpu/nova-core/gsp/fw/commands.rs
@@ -1,6 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 // SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 
+use core::ops::Range;
+
 use kernel::{
     device,
     pci,
@@ -13,7 +15,8 @@ use kernel::{
 
 use crate::{
     gpu::Chipset,
-    gsp::GSP_PAGE_SIZE, //
+    gsp::GSP_PAGE_SIZE,
+    num::IntoSafeCast, //
 };
 
 use super::bindings;
@@ -27,7 +30,6 @@ static_assert!(size_of::<GspSetSystemInfo>() < GSP_PAGE_SIZE);
 
 impl GspSetSystemInfo {
     /// Returns an in-place initializer for the `GspSetSystemInfo` command.
-    #[allow(non_snake_case)]
     pub(crate) fn init<'a>(
         dev: &'a pci::Device<device::Bound>,
         chipset: Chipset,
@@ -99,7 +101,6 @@ pub(crate) struct PackedRegistryTable {
 }
 
 impl PackedRegistryTable {
-    #[allow(non_snake_case)]
     pub(crate) fn init(num_entries: u32, size: u32) -> impl Init<Self> {
         type InnerPackedRegistryTable = bindings::PACKED_REGISTRY_TABLE;
         let init_inner = init!(InnerPackedRegistryTable {
@@ -129,6 +130,41 @@ impl GspStaticConfigInfo {
     pub(crate) fn gpu_name_str(&self) -> [u8; 64] {
         self.0.gpuNameString
     }
+
+    /// Returns an iterator over valid FB regions from GSP firmware data.
+    fn fb_regions(
+        &self,
+    ) -> impl Iterator<Item = &bindings::NV2080_CTRL_CMD_FB_GET_FB_REGION_FB_REGION_INFO> {
+        let fb_info = &self.0.fbRegionInfoParams;
+        fb_info
+            .fbRegion
+            .iter()
+            .take(fb_info.numFBRegions.into_safe_cast())
+            .filter(|reg| reg.limit >= reg.base)
+    }
+
+    /// Iterates over usable FB regions from GSP firmware data.
+    ///
+    /// Each yielded region is a [`Range<u64>`] suitable for driver memory allocation.
+    /// Usable regions are those that satisfy all the following properties:
+    /// - Are not reserved for firmware internal use.
+    /// - Are not protected (hardware-enforced access restrictions).
+    /// - Support compression (can use GPU memory compression for bandwidth).
+    /// - Support ISO (isochronous memory for display requiring guaranteed bandwidth).
+    pub(crate) fn usable_fb_regions(&self) -> impl Iterator<Item = Range<u64>> + '_ {
+        self.fb_regions().filter_map(|reg| {
+            // Filter: not reserved, not protected, supports compression and ISO.
+            if reg.reserved == 0
+                && reg.bProtected == 0
+                && reg.supportCompressed != 0
+                && reg.supportISO != 0
+            {
+                reg.limit.checked_add(1).map(|end| reg.base..end)
+            } else {
+                None
+            }
+        })
+    }
 }
 
 // SAFETY: Padding is explicit and will not contain uninitialized data.
diff --git a/drivers/gpu/nova-core/gsp/hal.rs b/drivers/gpu/nova-core/gsp/hal.rs
index 04f004856c60..d3e47ef206de 100644
--- a/drivers/gpu/nova-core/gsp/hal.rs
+++ b/drivers/gpu/nova-core/gsp/hal.rs
@@ -4,11 +4,10 @@
 mod gh100;
 mod tu102;
 
-use kernel::prelude::*;
-
 use kernel::{
     device,
-    dma::Coherent, //
+    dma::Coherent,
+    prelude::*, //
 };
 
 use crate::{
@@ -27,6 +26,7 @@ use crate::{
     gsp::{
         boot::BootUnloadGuard,
         Gsp,
+        GspBootContext,
         GspFwWprMeta, //
     },
 };
@@ -42,8 +42,8 @@ pub(super) trait UnloadBundle: Send {
         &self,
         dev: &device::Device<device::Bound>,
         bar: Bar0<'_>,
-        gsp_falcon: &Falcon<GspEngine>,
-        sec2_falcon: &Falcon<Sec2>,
+        gsp_falcon: &Falcon<'_, GspEngine>,
+        sec2_falcon: &Falcon<'_, Sec2>,
     ) -> Result;
 }
 
@@ -53,32 +53,19 @@ pub(super) trait GspHal: Send {
     ///
     /// Upon success, returns a guard that runs the GSP unload sequence if GSP boot does not
     /// complete.
-    #[allow(clippy::too_many_arguments)]
     fn boot<'a>(
         &self,
         gsp: &'a Gsp,
-        dev: &'a device::Device<device::Bound>,
-        bar: Bar0<'a>,
-        chipset: Chipset,
+        ctx: &GspBootContext<'a>,
         fb_layout: &FbLayout,
         wpr_meta: &Coherent<GspFwWprMeta>,
-        gsp_falcon: &'a Falcon<GspEngine>,
-        sec2_falcon: &'a Falcon<Sec2>,
     ) -> Result<BootUnloadGuard<'a>>;
 
     /// Performs HAL-specific post-GSP boot tasks.
     ///
     /// This method is called by the GSP boot code after the GSP is confirmed to be running, and
     /// after the initialization commands have been pushed onto its queue.
-    fn post_boot(
-        &self,
-        _gsp: &Gsp,
-        _dev: &device::Device<device::Bound>,
-        _bar: Bar0<'_>,
-        _gsp_fw: &GspFirmware,
-        _gsp_falcon: &Falcon<GspEngine>,
-        _sec2_falcon: &Falcon<Sec2>,
-    ) -> Result {
+    fn post_boot(&self, _gsp: &Gsp, _ctx: &GspBootContext<'_>, _gsp_fw: &GspFirmware) -> Result {
         Ok(())
     }
 }
diff --git a/drivers/gpu/nova-core/gsp/hal/gh100.rs b/drivers/gpu/nova-core/gsp/hal/gh100.rs
index 98f5ce197d13..1d06405a32f6 100644
--- a/drivers/gpu/nova-core/gsp/hal/gh100.rs
+++ b/drivers/gpu/nova-core/gsp/hal/gh100.rs
@@ -18,15 +18,10 @@ use crate::{
         Falcon, //
     },
     fb::FbLayout,
-    firmware::{
-        fsp::FspFirmware,
-        FIRMWARE_VERSION, //
-    },
     fsp::{
         FmcBootArgs,
         Fsp, //
     },
-    gpu::Chipset,
     gsp::{
         boot::BootUnloadGuard,
         hal::{
@@ -34,6 +29,7 @@ use crate::{
             UnloadBundle, //
         },
         Gsp,
+        GspBootContext,
         GspFwWprMeta, //
     },
 };
@@ -46,10 +42,10 @@ struct GspMbox {
 
 impl GspMbox {
     /// Reads both mailboxes from the GSP falcon.
-    fn read(gsp_falcon: &Falcon<GspEngine>, bar: Bar0<'_>) -> Self {
+    fn read(gsp_falcon: &Falcon<'_, GspEngine>) -> Self {
         Self {
-            mbox0: gsp_falcon.read_mailbox0(bar),
-            mbox1: gsp_falcon.read_mailbox1(bar),
+            mbox0: gsp_falcon.read_mailbox0(),
+            mbox1: gsp_falcon.read_mailbox1(),
         }
     }
 
@@ -64,8 +60,7 @@ impl GspMbox {
     /// either condition should stop the poll loop.
     fn lockdown_released_or_error(
         &self,
-        gsp_falcon: &Falcon<GspEngine>,
-        bar: Bar0<'_>,
+        gsp_falcon: &Falcon<'_, GspEngine>,
         fmc_boot_params_addr: u64,
     ) -> bool {
         // GSP-FMC normally clears the boot parameters address from the mailboxes early during
@@ -75,15 +70,14 @@ impl GspMbox {
             return self.combined_addr() != fmc_boot_params_addr;
         }
 
-        !gsp_falcon.riscv_branch_privilege_lockdown(bar)
+        !gsp_falcon.riscv_branch_privilege_lockdown()
     }
 }
 
 /// Waits for GSP lockdown to be released after FSP Chain of Trust.
 fn wait_for_gsp_lockdown_release(
     dev: &device::Device<device::Bound>,
-    bar: Bar0<'_>,
-    gsp_falcon: &Falcon<GspEngine>,
+    gsp_falcon: &Falcon<'_, GspEngine>,
     fmc_boot_params_addr: u64,
 ) -> Result {
     dev_dbg!(dev, "Waiting for GSP lockdown release\n");
@@ -92,14 +86,14 @@ fn wait_for_gsp_lockdown_release(
         || {
             // While the PRIV target mask is still locked to FSP, GSP register and mailbox reads
             // are not meaningful. Wait until HWCFG2 says the CPU can read them.
-            Ok(match gsp_falcon.priv_target_mask_released(bar) {
+            Ok(match gsp_falcon.priv_target_mask_released() {
                 false => None,
-                true => Some(GspMbox::read(gsp_falcon, bar)),
+                true => Some(GspMbox::read(gsp_falcon)),
             })
         },
         |mbox| match mbox {
             None => false,
-            Some(mbox) => mbox.lockdown_released_or_error(gsp_falcon, bar, fmc_boot_params_addr),
+            Some(mbox) => mbox.lockdown_released_or_error(gsp_falcon, fmc_boot_params_addr),
         },
         Delta::from_millis(10),
         Delta::from_secs(30),
@@ -126,13 +120,13 @@ impl UnloadBundle for FspUnloadBundle {
     fn run(
         &self,
         dev: &device::Device<device::Bound>,
-        bar: Bar0<'_>,
-        gsp_falcon: &Falcon<GspEngine>,
-        _sec2_falcon: &Falcon<Sec2>,
+        _bar: Bar0<'_>,
+        gsp_falcon: &Falcon<'_, GspEngine>,
+        _sec2_falcon: &Falcon<'_, Sec2>,
     ) -> Result {
         // GSP falcon does most of the work of resetting, so just wait for it to finish.
         read_poll_timeout(
-            || Ok(gsp_falcon.is_riscv_active(bar)),
+            || Ok(gsp_falcon.is_riscv_active()),
             |&active| !active,
             Delta::from_millis(10),
             Delta::from_secs(5),
@@ -152,15 +146,15 @@ impl GspHal for Gh100 {
     fn boot<'a>(
         &self,
         gsp: &'a Gsp,
-        dev: &'a device::Device<device::Bound>,
-        bar: Bar0<'a>,
-        chipset: Chipset,
+        ctx: &GspBootContext<'a>,
         fb_layout: &FbLayout,
         wpr_meta: &Coherent<GspFwWprMeta>,
-        gsp_falcon: &'a Falcon<GspEngine>,
-        sec2_falcon: &'a Falcon<Sec2>,
     ) -> Result<BootUnloadGuard<'a>> {
-        let fsp_fw = FspFirmware::new(dev, chipset, FIRMWARE_VERSION)?;
+        let dev = ctx.dev();
+        let bar = ctx.bar;
+        let chipset = ctx.chipset;
+        let gsp_falcon = ctx.gsp_falcon;
+        let sec2_falcon = ctx.sec2_falcon;
 
         let unload_bundle = crate::gsp::UnloadBundle(
             KBox::new(FspUnloadBundle, GFP_KERNEL)? as KBox<dyn UnloadBundle>
@@ -170,7 +164,7 @@ impl GspHal for Gh100 {
         let unload_guard =
             BootUnloadGuard::new(gsp, dev, bar, gsp_falcon, sec2_falcon, Some(unload_bundle));
 
-        let mut fsp = Fsp::wait_secure_boot(dev, bar, chipset, fsp_fw)?;
+        let mut fsp = Fsp::wait_secure_boot(dev, bar, chipset)?;
 
         let args = FmcBootArgs::new(
             dev,
@@ -180,9 +174,9 @@ impl GspHal for Gh100 {
             false,
         )?;
 
-        fsp.boot_fmc(dev, bar, fb_layout, &args)?;
+        fsp.boot_fmc(dev, fb_layout, &args)?;
 
-        wait_for_gsp_lockdown_release(dev, bar, gsp_falcon, args.boot_params_dma_handle())?;
+        wait_for_gsp_lockdown_release(dev, gsp_falcon, args.boot_params_dma_handle())?;
 
         Ok(unload_guard)
     }
diff --git a/drivers/gpu/nova-core/gsp/hal/tu102.rs b/drivers/gpu/nova-core/gsp/hal/tu102.rs
index 2f6301af7113..ff71b45b5432 100644
--- a/drivers/gpu/nova-core/gsp/hal/tu102.rs
+++ b/drivers/gpu/nova-core/gsp/hal/tu102.rs
@@ -42,6 +42,7 @@ use crate::{
             GspSequencerParams, //
         },
         Gsp,
+        GspBootContext,
         GspFwWprMeta, //
     },
     regs,
@@ -61,12 +62,11 @@ impl FwsecUnloadFirmware {
     /// Loads the FWSEC SB firmware, as well as its bootloader if `chipset` requires it.
     fn new(
         dev: &device::Device<device::Bound>,
-        bar: Bar0<'_>,
         chipset: Chipset,
         bios: &Vbios,
-        gsp_falcon: &Falcon<GspEngine>,
+        gsp_falcon: &Falcon<'_, GspEngine>,
     ) -> Result<Self> {
-        let fwsec_sb = FwsecFirmware::new(dev, gsp_falcon, bar, bios, FwsecCommand::Sb)?;
+        let fwsec_sb = FwsecFirmware::new(dev, gsp_falcon, bios, FwsecCommand::Sb)?;
 
         Ok(if chipset.needs_fwsec_bootloader() {
             Self::WithBl(FwsecFirmwareWithBl::new(fwsec_sb, dev, chipset)?)
@@ -80,10 +80,10 @@ impl FwsecUnloadFirmware {
         &self,
         dev: &device::Device<device::Bound>,
         bar: Bar0<'_>,
-        gsp_falcon: &Falcon<GspEngine>,
+        gsp_falcon: &Falcon<'_, GspEngine>,
     ) -> Result {
         match self {
-            Self::WithoutBl(fw) => fw.run(dev, gsp_falcon, bar),
+            Self::WithoutBl(fw) => fw.run(dev, gsp_falcon),
             Self::WithBl(fw) => fw.run(dev, gsp_falcon, bar),
         }
     }
@@ -100,22 +100,20 @@ impl Sec2UnloadBundle {
     /// Load and prepare the resources required to properly reset the GSP after it has been stopped.
     fn build(
         dev: &device::Device<device::Bound>,
-        bar: Bar0<'_>,
         chipset: Chipset,
         bios: &Vbios,
-        gsp_falcon: &Falcon<GspEngine>,
-        sec2_falcon: &Falcon<Sec2>,
+        gsp_falcon: &Falcon<'_, GspEngine>,
+        sec2_falcon: &Falcon<'_, Sec2>,
     ) -> Result<KBox<dyn UnloadBundle>> {
         KBox::new(
             Self {
-                fwsec_sb: FwsecUnloadFirmware::new(dev, bar, chipset, bios, gsp_falcon)?,
+                fwsec_sb: FwsecUnloadFirmware::new(dev, chipset, bios, gsp_falcon)?,
                 booter_unloader: BooterFirmware::new(
                     dev,
                     BooterKind::Unloader,
                     chipset,
                     FIRMWARE_VERSION,
                     sec2_falcon,
-                    bar,
                 )?,
             },
             GFP_KERNEL,
@@ -130,22 +128,29 @@ impl UnloadBundle for Sec2UnloadBundle {
         &self,
         dev: &device::Device<device::Bound>,
         bar: Bar0<'_>,
-        gsp_falcon: &Falcon<GspEngine>,
-        sec2_falcon: &Falcon<Sec2>,
+        gsp_falcon: &Falcon<'_, GspEngine>,
+        sec2_falcon: &Falcon<'_, Sec2>,
     ) -> Result {
         // Run FWSEC-SB to reset the GSP falcon to its pre-libos state.
-        self.fwsec_sb.run(dev, bar, gsp_falcon)?;
+        // Log errors but keep going if it fails.
+        let fwsec_sb_res = self
+            .fwsec_sb
+            .run(dev, bar, gsp_falcon)
+            .inspect_err(|e| dev_err!(dev, "FWSEC-SB failed to run: {:?}\n", e));
 
         // Remove WPR2 region if set.
         let wpr2_hi = bar.read(regs::NV_PFB_PRI_MMU_WPR2_ADDR_HI);
-        if wpr2_hi.is_wpr2_set() {
-            sec2_falcon.reset(bar)?;
-            sec2_falcon.load(dev, bar, &self.booter_unloader)?;
+        let booter_unloader_res = (|| {
+            if !wpr2_hi.is_wpr2_set() {
+                return Ok(());
+            }
+
+            sec2_falcon.reset()?;
+            sec2_falcon.load(&self.booter_unloader)?;
 
             // Sentinel value to confirm that Booter Unloader has run.
             const MAILBOX_SENTINEL: u32 = 0xff;
-            let (mbox0, _) =
-                sec2_falcon.boot(bar, Some(MAILBOX_SENTINEL), Some(MAILBOX_SENTINEL))?;
+            let (mbox0, _) = sec2_falcon.boot(Some(MAILBOX_SENTINEL), Some(MAILBOX_SENTINEL))?;
             if mbox0 != 0 {
                 dev_err!(dev, "Booter Unloader returned error 0x{:x}\n", mbox0);
                 return Err(EINVAL);
@@ -160,9 +165,12 @@ impl UnloadBundle for Sec2UnloadBundle {
                 );
                 return Err(EBUSY);
             }
-        }
 
-        Ok(())
+            Ok(())
+        })()
+        .inspect_err(|e| dev_err!(dev, "Booter Unloader failed to run: {:?}\n", e));
+
+        fwsec_sb_res.and(booter_unloader_res)
     }
 }
 
@@ -171,7 +179,7 @@ impl UnloadBundle for Sec2UnloadBundle {
 fn run_fwsec_frts(
     dev: &device::Device<device::Bound>,
     chipset: Chipset,
-    falcon: &Falcon<GspEngine>,
+    falcon: &Falcon<'_, GspEngine>,
     bar: Bar0<'_>,
     bios: &Vbios,
     fb_layout: &FbLayout,
@@ -190,7 +198,6 @@ fn run_fwsec_frts(
     let fwsec_frts = FwsecFirmware::new(
         dev,
         falcon,
-        bar,
         bios,
         FwsecCommand::Frts {
             frts_addr: fb_layout.frts.start,
@@ -204,7 +211,7 @@ fn run_fwsec_frts(
         fwsec_frts_bl.run(dev, falcon, bar)?;
     } else {
         // Load and run FWSEC-FRTS directly.
-        fwsec_frts.run(dev, falcon, bar)?;
+        fwsec_frts.run(dev, falcon)?;
     }
 
     // SCRATCH_E contains the error code for FWSEC-FRTS.
@@ -258,32 +265,33 @@ impl GspHal for Tu102 {
     fn boot<'a>(
         &self,
         gsp: &'a Gsp,
-        dev: &'a device::Device<device::Bound>,
-        bar: Bar0<'a>,
-        chipset: Chipset,
+        ctx: &GspBootContext<'a>,
         fb_layout: &FbLayout,
         wpr_meta: &Coherent<GspFwWprMeta>,
-        gsp_falcon: &'a Falcon<GspEngine>,
-        sec2_falcon: &'a Falcon<Sec2>,
     ) -> Result<BootUnloadGuard<'a>> {
+        let dev = ctx.dev();
+        let bar = ctx.bar;
+        let chipset = ctx.chipset;
+        let gsp_falcon = ctx.gsp_falcon;
+        let sec2_falcon = ctx.sec2_falcon;
+
         let bios = Vbios::new(dev, bar)?;
 
         // Try and prepare the unload bundle.
         //
         // If the unload bundle creation fails, the GPU will need to be reset before the driver can
         // be probed again.
-        let unload_bundle =
-            Sec2UnloadBundle::build(dev, bar, chipset, &bios, gsp_falcon, sec2_falcon)
-                .inspect_err(|e| {
-                    dev_warn!(dev, "Failed to prepare unload firmware: {:?}\n", e);
-                    dev_warn!(dev, "The GSP won't be able to unload properly on unbind.\n");
-                    dev_warn!(
-                        dev,
-                        "The GPU will need to be reset before the driver can bind again.\n"
-                    );
-                })
-                .ok()
-                .map(crate::gsp::UnloadBundle);
+        let unload_bundle = Sec2UnloadBundle::build(dev, chipset, &bios, gsp_falcon, sec2_falcon)
+            .inspect_err(|e| {
+                dev_warn!(dev, "Failed to prepare unload firmware: {:?}\n", e);
+                dev_warn!(dev, "The GSP won't be able to unload properly on unbind.\n");
+                dev_warn!(
+                    dev,
+                    "The GPU will need to be reset before the driver can bind again.\n"
+                );
+            })
+            .ok()
+            .map(crate::gsp::UnloadBundle);
 
         // Wrap the unload bundle into a drop guard so it is automatically run upon failure.
let unload_guard = @@ -294,13 +302,10 @@ impl GspHal for Tu102 { run_fwsec_frts(dev, chipset, gsp_falcon, bar, &bios, fb_layout)?; } - gsp_falcon.reset(bar)?; + gsp_falcon.reset()?; let libos_handle = gsp.libos.dma_handle(); - let (mbox0, mbox1) = gsp_falcon.boot( - bar, - Some(libos_handle as u32), - Some((libos_handle >> 32) as u32), - )?; + let (mbox0, mbox1) = + gsp_falcon.boot(Some(libos_handle as u32), Some((libos_handle >> 32) as u32))?; dev_dbg!(dev, "GSP MBOX0: {:#x}, MBOX1: {:#x}\n", mbox0, mbox1); dev_dbg!( @@ -314,30 +319,21 @@ impl GspHal for Tu102 { chipset, FIRMWARE_VERSION, sec2_falcon, - bar, )? - .run(dev, bar, sec2_falcon, wpr_meta)?; + .run(dev, sec2_falcon, wpr_meta)?; Ok(unload_guard) } - fn post_boot( - &self, - gsp: &Gsp, - dev: &device::Device<device::Bound>, - bar: Bar0<'_>, - gsp_fw: &GspFirmware, - gsp_falcon: &Falcon<GspEngine>, - sec2_falcon: &Falcon<Sec2>, - ) -> Result { + fn post_boot(&self, gsp: &Gsp, ctx: &GspBootContext<'_>, gsp_fw: &GspFirmware) -> Result { // Create and run the GSP sequencer. let seq_params = GspSequencerParams { bootloader_app_version: gsp_fw.bootloader.app_version, libos_dma_handle: gsp.libos.dma_handle(), - gsp_falcon, - sec2_falcon, - dev, - bar, + gsp_falcon: ctx.gsp_falcon, + sec2_falcon: ctx.sec2_falcon, + dev: ctx.dev(), + bar: ctx.bar, }; GspSequencer::run(&gsp.cmdq, seq_params)?; diff --git a/drivers/gpu/nova-core/gsp/regs.rs b/drivers/gpu/nova-core/gsp/regs.rs new file mode 100644 index 000000000000..a76dea3c3ab0 --- /dev/null +++ b/drivers/gpu/nova-core/gsp/regs.rs @@ -0,0 +1,11 @@ +// SPDX-License-Identifier: GPL-2.0 + +use kernel::io::register; + +// PGSP + +register! 
{ + pub(super) NV_PGSP_QUEUE_HEAD(u32) @ 0x00110c00 { + 31:0 address; + } +} diff --git a/drivers/gpu/nova-core/gsp/sequencer.rs b/drivers/gpu/nova-core/gsp/sequencer.rs index e0850d21adca..13983d42b12b 100644 --- a/drivers/gpu/nova-core/gsp/sequencer.rs +++ b/drivers/gpu/nova-core/gsp/sequencer.rs @@ -133,9 +133,9 @@ pub(crate) struct GspSequencer<'a> { /// `Bar0` for register access. bar: Bar0<'a>, /// SEC2 falcon for core operations. - sec2_falcon: &'a Falcon<Sec2>, + sec2_falcon: &'a Falcon<'a, Sec2>, /// GSP falcon for core operations. - gsp_falcon: &'a Falcon<Gsp>, + gsp_falcon: &'a Falcon<'a, Gsp>, /// LibOS DMA handle address. libos_dma_handle: u64, /// Bootloader application version. @@ -213,16 +213,16 @@ impl GspSeqCmd { GspSeqCmd::DelayUs(cmd) => cmd.run(seq), GspSeqCmd::RegStore(cmd) => cmd.run(seq), GspSeqCmd::CoreReset => { - seq.gsp_falcon.reset(seq.bar)?; - seq.gsp_falcon.dma_reset(seq.bar); + seq.gsp_falcon.reset()?; + seq.gsp_falcon.dma_reset(); Ok(()) } GspSeqCmd::CoreStart => { - seq.gsp_falcon.start(seq.bar)?; + seq.gsp_falcon.start()?; Ok(()) } GspSeqCmd::CoreWaitForHalt => { - seq.gsp_falcon.wait_till_halted(seq.bar)?; + seq.gsp_falcon.wait_till_halted()?; Ok(()) } GspSeqCmd::CoreResume => { @@ -231,35 +231,32 @@ impl GspSeqCmd { // sequencer will start both. // Reset the GSP to prepare it for resuming. - seq.gsp_falcon.reset(seq.bar)?; + seq.gsp_falcon.reset()?; // Write the libOS DMA handle to GSP mailboxes. seq.gsp_falcon.write_mailboxes( - seq.bar, Some(seq.libos_dma_handle as u32), Some((seq.libos_dma_handle >> 32) as u32), ); // Start the SEC2 falcon which will trigger GSP-RM to resume on the GSP. - seq.sec2_falcon.start(seq.bar)?; + seq.sec2_falcon.start()?; // Poll until GSP-RM reload/resume has completed (up to 2 seconds). - seq.gsp_falcon - .check_reload_completed(seq.bar, Delta::from_secs(2))?; + seq.gsp_falcon.check_reload_completed(Delta::from_secs(2))?; // Verify SEC2 completed successfully by checking its mailbox for errors. 
- let mbox0 = seq.sec2_falcon.read_mailbox0(seq.bar); + let mbox0 = seq.sec2_falcon.read_mailbox0(); if mbox0 != 0 { dev_err!(seq.dev, "Sequencer: sec2 errors: {:?}\n", mbox0); return Err(EIO); } // Configure GSP with the bootloader version. - seq.gsp_falcon - .write_os_version(seq.bar, seq.bootloader_app_version); + seq.gsp_falcon.write_os_version(seq.bootloader_app_version); // Verify the GSP's RISC-V core is active indicating successful GSP boot. - if !seq.gsp_falcon.is_riscv_active(seq.bar) { + if !seq.gsp_falcon.is_riscv_active() { dev_err!(seq.dev, "Sequencer: RISC-V core is not active\n"); return Err(EIO); } @@ -345,9 +342,9 @@ pub(crate) struct GspSequencerParams<'a> { /// LibOS DMA handle address. pub(crate) libos_dma_handle: u64, /// GSP falcon for core operations. - pub(crate) gsp_falcon: &'a Falcon<Gsp>, + pub(crate) gsp_falcon: &'a Falcon<'a, Gsp>, /// SEC2 falcon for core operations. - pub(crate) sec2_falcon: &'a Falcon<Sec2>, + pub(crate) sec2_falcon: &'a Falcon<'a, Sec2>, /// Device for logging. pub(crate) dev: &'a device::Device, /// BAR0 for register access. diff --git a/drivers/gpu/nova-core/mctp.rs b/drivers/gpu/nova-core/mctp.rs index 482786e07bc7..acc2abbd4b0c 100644 --- a/drivers/gpu/nova-core/mctp.rs +++ b/drivers/gpu/nova-core/mctp.rs @@ -7,55 +7,51 @@ //! Data Model) messages between the kernel driver and GPU firmware processors //! such as FSP and GSP. -use kernel::pci::Vendor; +use kernel::{ + bitfield, + pci::Vendor, + prelude::*, // +}; -/// NVDM message type identifiers carried over MCTP. -#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)] -#[repr(u8)] -pub(crate) enum NvdmType { - #[default] - /// Chain of Trust boot message. - Cot = 0x14, - /// FSP command response. 
- FspResponse = 0x15, -} - -impl TryFrom<u8> for NvdmType { - type Error = u8; - - fn try_from(value: u8) -> Result<Self, Self::Error> { - match value { - x if x == u8::from(Self::Cot) => Ok(Self::Cot), - x if x == u8::from(Self::FspResponse) => Ok(Self::FspResponse), - _ => Err(value), - } - } -} +use crate::{ + bounded_enum, + num, // +}; -impl From<NvdmType> for u8 { - fn from(value: NvdmType) -> Self { - value as u8 +bounded_enum! { + /// NVDM message type identifiers carried over MCTP. + #[derive(Debug, Clone, Copy, PartialEq, Eq)] + pub(crate) enum NvdmType with TryFrom<Bounded<u32, 8>> { + /// Chain of Trust boot message. + Cot = 0x14, + /// FSP command response. + FspResponse = 0x15, } } bitfield! { - pub(crate) struct MctpHeader(u32), "MCTP transport header for NVIDIA firmware messages." { - 31:31 som as bool, "Start-of-message bit."; - 30:30 eom as bool, "End-of-message bit."; - 29:28 seq as u8, "Packet sequence number."; - 23:16 seid as u8, "Source endpoint ID."; + /// MCTP transport header for NVIDIA firmware messages. + pub(crate) struct MctpHeader(u32) { + /// Start-of-message bit. + 31:31 som; + /// End-of-message bit. + 30:30 eom; + /// Packet sequence number. + 29:28 seq; + /// Source endpoint ID. + 23:16 seid; } } impl MctpHeader { /// Builds a single-packet MCTP header (`SOM=1`, `EOM=1`, `SEQ=0`, `SEID=0`). pub(crate) fn single_packet() -> Self { - Self::default().set_som(true).set_eom(true) + Self::zeroed().with_som(true).with_eom(true) } /// Returns whether this is a complete single-packet message (`SOM=1` and `EOM=1`). pub(crate) fn is_single_packet(self) -> bool { - self.som() && self.eom() + self.som().into_bool() && self.eom().into_bool() } } @@ -63,26 +59,30 @@ impl MctpHeader { const MSG_TYPE_VENDOR_PCI: u8 = 0x7e; bitfield! { - pub(crate) struct NvdmHeader(u32), "NVIDIA Vendor-Defined Message header over MCTP." 
{ - 31:24 nvdm_type as u8 ?=> NvdmType, "NVDM message type."; - 23:8 vendor_id as u16, "PCI vendor ID."; - 6:0 msg_type as u8, "MCTP vendor-defined message type."; + /// NVIDIA Vendor-Defined Message header over MCTP. + pub(crate) struct NvdmHeader(u32) { + /// NVDM message type. + 31:24 nvdm_type ?=> NvdmType; + /// PCI vendor ID. + 23:8 vendor_id; + /// MCTP vendor-defined message type. + 6:0 msg_type; } } impl NvdmHeader { /// Builds an NVDM header for the given message type. pub(crate) fn new(nvdm_type: NvdmType) -> Self { - Self::default() - .set_msg_type(MSG_TYPE_VENDOR_PCI) - .set_vendor_id(Vendor::NVIDIA.as_raw()) - .set_nvdm_type(nvdm_type) + Self::zeroed() + .with_const_msg_type::<{ num::u8_as_u32(MSG_TYPE_VENDOR_PCI) }>() + .with_vendor_id(Vendor::NVIDIA.as_raw()) + .with_nvdm_type(nvdm_type) } /// Validates this header against the expected NVIDIA NVDM format and type. pub(crate) fn validate(self, expected_type: NvdmType) -> bool { - self.msg_type() == MSG_TYPE_VENDOR_PCI - && self.vendor_id() == Vendor::NVIDIA.as_raw() + u8::from(self.msg_type()) == MSG_TYPE_VENDOR_PCI + && u16::from(self.vendor_id()) == Vendor::NVIDIA.as_raw() && matches!(self.nvdm_type(), Ok(nvdm_type) if nvdm_type == expected_type) } } diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index 9f0199f7b38c..a61406ba5c0b 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -10,9 +10,6 @@ use kernel::{ InPlaceModule, // }; -#[macro_use] -mod bitfield; - mod driver; mod falcon; mod fb; @@ -54,7 +51,7 @@ struct NovaCoreModule { impl InPlaceModule for NovaCoreModule { fn init(module: &'static kernel::ThisModule) -> impl PinInit<Self, Error> { - let dir = debugfs::Dir::new(kernel::c_str!("nova-core")); + let dir = debugfs::Dir::new(c"nova-core"); // SAFETY: We are the only driver code running during init, so there // cannot be any concurrent access to `DEBUGFS_ROOT`. 
diff --git a/drivers/gpu/nova-core/nova_core_exports.c b/drivers/gpu/nova-core/nova_core_exports.c new file mode 100644 index 000000000000..6e80ca9792ee --- /dev/null +++ b/drivers/gpu/nova-core/nova_core_exports.c @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: GPL-2.0 +// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. + +/* + * Exports Rust symbols from the `nova_core` crate for use by dependent modules. + * + * This is a workaround until the build system supports Rust cross-module + * dependencies natively. + */ + +#include <linux/export.h> + +#define EXPORT_SYMBOL_RUST_GPL(sym) extern int sym; EXPORT_SYMBOL_GPL(sym) + +#include "exports_nova_core_generated.h" diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index 0f49c1ab83ad..397124f245ee 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -126,7 +126,7 @@ register! { } /// High bits of the physical system memory address used by the GPU to perform sysmembar - /// operations (see [`crate::fb::SysmemFlush`]). + /// operations. pub(crate) NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x00100c40 { 23:0 adr_63_40; } @@ -153,11 +153,6 @@ register! { /// The base is provided by the GB10x framebuffer HAL. pub(crate) struct Hshub0Base(()); -/// Base of the GB20x FBHUB0 register window (`NV_FBHUB0_PRI_BASE` in Open RM). -/// -/// The base is provided by the GB20x framebuffer HAL. -pub(crate) struct Fbhub0Base(()); - register! { // GB10x sysmem flush registers, relative to the HSHUB0 base. GB10x routes sysmembar // through a primary and an EG (egress) pair that must both be programmed to the same @@ -178,16 +173,37 @@ register! { pub(crate) NV_PFB_HSHUB_EG_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ Hshub0Base + 0x000006c4 { 19:0 adr; } +} - // GB20x sysmem flush registers, relative to the FBHUB0 base. 
Unlike the older - // NV_PFB_NISO_FLUSH_SYSMEM_ADDR registers which encode the address with an 8-bit - // right-shift, these take the raw address split into lower and upper halves. Hardware - // ignores bits 7:0 of the LO register. - pub(crate) NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ Fbhub0Base + 0x00001d58 { +register! { + // GB20x FBHUB0 sysmem flush registers. Unlike the older + // NV_PFB_NISO_FLUSH_SYSMEM_ADDR registers, which encode the address with an + // 8-bit right-shift, these take the raw address split into lower and upper + // halves. Hardware ignores bits 7:0 of the LO register. + pub(crate) NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x008a1d58 { 31:0 adr => u32; } - pub(crate) NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ Fbhub0Base + 0x00001d5c { + pub(crate) NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x008a1d5c { + 19:0 adr; + } +} + +register! { + /// Low bits of the physical system memory address used by the GPU to perform + /// sysmembar operations on Hopper. + /// + /// Like the GB20x FBHUB0 registers, and unlike the Ampere + /// `NV_PFB_NISO_FLUSH_SYSMEM_ADDR` registers (which encode the address with an + /// 8-bit right-shift), these take the raw address split into lower and upper + /// halves. Hardware ignores bits 7:0 of the LO register. + pub(crate) NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x00100a34 { + 31:0 adr => u32; + } + + /// High bits of the physical system memory address used by the GPU to perform + /// sysmembar operations on Hopper. + pub(crate) NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x00100a38 { 19:0 adr; } } @@ -227,14 +243,6 @@ impl NV_PFB_PRI_MMU_WPR2_ADDR_HI { } } -// PGSP - -register! { - pub(crate) NV_PGSP_QUEUE_HEAD(u32) @ 0x00110c00 { - 31:0 address; - } -} - // PGC6 register space. // // `GC6` is a GPU low-power state where VRAM is in self-refresh and the GPU is powered down (except @@ -294,28 +302,6 @@ impl NV_USABLE_FB_SIZE_IN_MB { } } -// PDISP - -register! 
{ - pub(crate) NV_PDISP_VGA_WORKSPACE_BASE(u32) @ 0x00625f04 { - /// VGA workspace base address divided by 0x10000. - 31:8 addr; - /// Set if the `addr` field is valid. - 3:3 status_valid => bool; - } -} - -impl NV_PDISP_VGA_WORKSPACE_BASE { - /// Returns the base address of the VGA workspace, or `None` if none exists. - pub(crate) fn vga_workspace_addr(self) -> Option<u64> { - if self.status_valid() { - Some(u64::from(self.addr()) << 16) - } else { - None - } - } -} - // FUSE pub(crate) const NV_FUSE_OPT_FPF_SIZE: usize = 16; diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs index c6e6bfcd6a1f..c03650ee5226 100644 --- a/drivers/gpu/nova-core/vbios.rs +++ b/drivers/gpu/nova-core/vbios.rs @@ -13,11 +13,8 @@ use kernel::{ register, sizes::SZ_4K, sync::aref::ARef, - transmute::FromBytes, }; -use zerocopy::FromBytes as _; - use crate::{ driver::Bar0, firmware::{ @@ -359,7 +356,7 @@ impl Vbios { } /// PCI Data Structure as defined in PCI Firmware Specification -#[derive(Debug, Clone)] +#[derive(Debug, Clone, FromBytes)] #[repr(C)] struct PcirStruct { /// PCI Data Structure signature ("PCIR" or "NPDS") @@ -388,15 +385,12 @@ struct PcirStruct { max_runtime_image_len: u16, } -// SAFETY: all bit patterns are valid for `PcirStruct`. -unsafe impl FromBytes for PcirStruct {} - impl PcirStruct { /// The bit in `last_image` that indicates the last image. const LAST_IMAGE_BIT_MASK: u8 = 0x80; fn new(dev: &device::Device, data: &[u8]) -> Result<Self> { - let (pcir, _) = PcirStruct::from_bytes_copy_prefix(data).ok_or(EINVAL)?; + let (pcir, _) = PcirStruct::read_from_prefix(data).map_err(|_| EINVAL)?; // Signature should be "PCIR" (0x52494350) or "NPDS" (0x5344504e). if &pcir.signature != b"PCIR" && &pcir.signature != b"NPDS" { @@ -432,7 +426,7 @@ impl PcirStruct { /// This is the head of the BIT table, that is used to locate the Falcon data. 
The BIT table (with /// its header) is in the [`PciAtBiosImage`] and the falcon data it is pointing to is in the /// [`FwSecBiosImage`]. -#[derive(Debug, Clone, Copy)] +#[derive(Debug, Clone, Copy, FromBytes)] #[repr(C)] struct BitHeader { /// 0h: BIT Header Identifier (BMP=0x7FFF/BIT=0xB8FF) @@ -451,12 +445,9 @@ struct BitHeader { checksum: u8, } -// SAFETY: all bit patterns are valid for `BitHeader`. -unsafe impl FromBytes for BitHeader {} - impl BitHeader { fn new(data: &[u8]) -> Result<Self> { - let (header, _) = BitHeader::from_bytes_copy_prefix(data).ok_or(EINVAL)?; + let (header, _) = BitHeader::read_from_prefix(data).map_err(|_| EINVAL)?; // Check header ID and signature if header.id != 0xB8FF || &header.signature != b"BIT\0" { @@ -468,7 +459,7 @@ impl BitHeader { } /// BIT Token Entry: Records in the BIT table followed by the BIT header. -#[derive(Debug, Clone, Copy)] +#[derive(Debug, Clone, Copy, FromBytes)] #[repr(C)] struct BitToken { /// 00h: Token identifier @@ -481,9 +472,6 @@ struct BitToken { data_offset: u16, } -// SAFETY: all bit patterns are valid for `BitToken`. -unsafe impl FromBytes for BitToken {} - impl BitToken { /// BIT token ID for Falcon data. const ID_FALCON_DATA: u8 = 0x70; @@ -508,7 +496,7 @@ impl BitToken { .and_then(|data| data.get(..entry_size)) .ok_or(EINVAL)?; - let (token, _) = BitToken::from_bytes_copy_prefix(entry).ok_or(EINVAL)?; + let (token, _) = BitToken::read_from_prefix(entry).map_err(|_| EINVAL)?; // Check if this token has the requested ID if token.id == token_id { @@ -525,7 +513,7 @@ impl BitToken { /// /// This header is at the beginning of every image in the set of images in the ROM. It contains a /// pointer to the PCI Data Structure which describes the image. 
-#[derive(Debug, Clone, Copy)] +#[derive(Debug, Clone, Copy, FromBytes)] #[repr(C)] struct PciRomHeader { /// 00h: Signature (0xAA55) @@ -536,13 +524,10 @@ struct PciRomHeader { pci_data_struct_offset: u16, } -// SAFETY: all bit patterns are valid for `PciRomHeader`. -unsafe impl FromBytes for PciRomHeader {} - impl PciRomHeader { fn new(dev: &device::Device, data: &[u8]) -> Result<Self> { - let (rom_header, _) = PciRomHeader::from_bytes_copy_prefix(data) - .ok_or(EINVAL) + let (rom_header, _) = PciRomHeader::read_from_prefix(data) + .map_err(|_| EINVAL) .inspect_err(|_| dev_err!(dev, "Not enough data for ROM header\n"))?; // Check for valid ROM signatures. @@ -564,7 +549,7 @@ impl PciRomHeader { /// PCI Data Structure. It contains some fields that are redundant with the PCI Data Structure, but /// are needed for traversing the BIOS images. It is expected to be present in all BIOS images /// except for NBSI images. -#[derive(Debug, Clone)] +#[derive(Debug, Clone, FromBytes)] #[repr(C)] struct NpdeStruct { /// 00h: Signature ("NPDE") @@ -579,15 +564,12 @@ struct NpdeStruct { last_image: u8, } -// SAFETY: all bit patterns are valid for `NpdeStruct`. -unsafe impl FromBytes for NpdeStruct {} - impl NpdeStruct { /// The bit in `last_image` that indicates the last image. const LAST_IMAGE_BIT_MASK: u8 = 0x80; fn new(dev: &device::Device, data: &[u8]) -> Option<Self> { - let (npde, _) = NpdeStruct::from_bytes_copy_prefix(data)?; + let (npde, _) = NpdeStruct::read_from_prefix(data).ok()?; // Signature should be "NPDE" (0x4544504E). if &npde.signature != b"NPDE" { @@ -784,7 +766,7 @@ impl PciAtBiosImage { let data = &self.base.data; let (ptr, _) = data .get(offset..) - .and_then(u32::from_bytes_copy_prefix) + .and_then(|p| u32::read_from_prefix(p).ok()) .ok_or(EINVAL)?; usize::from_safe_cast(ptr) @@ -814,6 +796,7 @@ impl TryFrom<BiosImage> for PciAtBiosImage { /// The [`PmuLookupTableEntry`] structure is a single entry in the [`PmuLookupTable`]. 
/// /// See the [`PmuLookupTable`] description for more information. +#[derive(FromBytes)] #[repr(C, packed)] struct PmuLookupTableEntry { application_id: u8, @@ -821,9 +804,6 @@ struct PmuLookupTableEntry { data: u32, } -// SAFETY: all bit patterns are valid for `PmuLookupTableEntry`. -unsafe impl FromBytes for PmuLookupTableEntry {} - impl PmuLookupTableEntry { /// PMU lookup table application ID for firmware security license ucode. #[expect(dead_code)] @@ -836,6 +816,7 @@ impl PmuLookupTableEntry { } #[repr(C)] +#[derive(FromBytes)] struct PmuLookupTableHeader { version: u8, header_len: u8, @@ -843,9 +824,6 @@ struct PmuLookupTableHeader { entry_count: u8, } -// SAFETY: all bit patterns are valid for `PmuLookupTableHeader`. -unsafe impl FromBytes for PmuLookupTableHeader {} - /// The [`PmuLookupTableEntry`] structure is used to find the [`PmuLookupTableEntry`] for a given /// application ID. /// @@ -857,7 +835,7 @@ struct PmuLookupTable { impl PmuLookupTable { fn new(dev: &device::Device, data: &[u8]) -> Result<Self> { - let (header, _) = PmuLookupTableHeader::from_bytes_copy_prefix(data).ok_or(EINVAL)?; + let (header, _) = PmuLookupTableHeader::read_from_prefix(data).map_err(|_| EINVAL)?; let header_len = usize::from(header.header_len); let entry_len = usize::from(header.entry_len); @@ -872,8 +850,8 @@ impl PmuLookupTable { let mut entries = KVVec::with_capacity(entry_count, GFP_KERNEL)?; for i in 0..entry_count { - let (entry, _) = PmuLookupTableEntry::from_bytes_copy_prefix(&data[i * entry_len..]) - .ok_or(EINVAL)?; + let (entry, _) = PmuLookupTableEntry::read_from_prefix(&data[i * entry_len..]) + .map_err(|_| EINVAL)?; entries.push(entry, GFP_KERNEL)?; } @@ -929,15 +907,11 @@ impl FwSecBiosImage { let ver = data.get(1).copied().ok_or(EINVAL)?; match ver { 2 => { - let v2 = FalconUCodeDescV2::read_from_prefix(data) - .map_err(|_| EINVAL)? 
- .0; + let (v2, _) = FalconUCodeDescV2::read_from_prefix(data).map_err(|_| EINVAL)?; Ok(FalconUCodeDesc::V2(v2)) } 3 => { - let v3 = FalconUCodeDescV3::from_bytes_copy_prefix(data) - .ok_or(EINVAL)? - .0; + let (v3, _) = FalconUCodeDescV3::read_from_prefix(data).map_err(|_| EINVAL)?; Ok(FalconUCodeDesc::V3(v3)) } _ => { diff --git a/rust/kernel/drm/device.rs b/rust/kernel/drm/device.rs index 477cf771fb10..7ad124327a83 100644 --- a/rust/kernel/drm/device.rs +++ b/rust/kernel/drm/device.rs @@ -102,7 +102,7 @@ macro_rules! drm_legacy_fields { /// what stage of the process the [`Device`] is currently in. This means for instance that a /// `&Device<T, Uninit>` may actually be registered with userspace, it just wasn't known to be /// registered at the time the reference was taken. -pub trait DeviceContext: Sealed + Send + Sync {} +pub trait DeviceContext: Sealed + Send + Sync + 'static {} /// The [`DeviceContext`] of a [`Device`] that was registered with userspace at some point. /// diff --git a/rust/kernel/drm/gem/mod.rs b/rust/kernel/drm/gem/mod.rs index c8b66d816871..48fa6e96dfe7 100644 --- a/rust/kernel/drm/gem/mod.rs +++ b/rust/kernel/drm/gem/mod.rs @@ -85,7 +85,7 @@ pub type DriverAllocImpl<T, Ctx = Registered> = <<T as DriverObject>::Driver as drm::Driver>::Object<Ctx>; /// GEM object functions, which must be implemented by drivers. -pub trait DriverObject: Sync + Send + Sized { +pub trait DriverObject: Sync + Send + Sized + 'static { /// Parent `Driver` for this object. 
type Driver: drm::Driver; diff --git a/rust/kernel/drm/gem/shmem.rs b/rust/kernel/drm/gem/shmem.rs index 34af402899a0..3ee19ef6264e 100644 --- a/rust/kernel/drm/gem/shmem.rs +++ b/rust/kernel/drm/gem/shmem.rs @@ -11,6 +11,11 @@ use crate::{ container_of, + device::{ + self, + Bound, // + }, + devres::*, drm::{ driver, gem, @@ -19,20 +24,46 @@ use crate::{ DeviceContext, Registered, // }, - error::to_result, + error::{ + from_err_ptr, + to_result, // + }, + io::{ + Io, + IoCapable, + IoKnownSize, // + }, prelude::*, - sync::aref::ARef, - types::Opaque, // + scatterlist, + sync::{ + aref::ARef, + new_mutex, + Mutex, + SetOnce, // + }, + types::{ + NotThreadSafe, + Opaque, // + }, }; use core::{ + ffi::c_void, marker::PhantomData, + mem::{ + ManuallyDrop, + MaybeUninit, // + }, ops::{ Deref, DerefMut, // }, - ptr::NonNull, // + ptr::{ + self, + NonNull, // + }, }; use gem::{ + BaseObject, BaseObjectPrivate, DriverObject, IntoGEMObject, // @@ -42,7 +73,6 @@ use gem::{ /// /// This is used with [`Object::new()`] to control various properties that can only be set when /// initially creating a shmem-backed GEM object. -#[derive(Default)] pub struct ObjectConfig<'a, T: DriverObject, C: DeviceContext = Registered> { /// Whether to set the write-combine map flag. pub map_wc: bool, @@ -53,6 +83,16 @@ pub struct ObjectConfig<'a, T: DriverObject, C: DeviceContext = Registered> { pub parent_resv_obj: Option<&'a Object<T, C>>, } +impl<'a, T: DriverObject, C: DeviceContext> Default for ObjectConfig<'a, T, C> { + #[inline(always)] + fn default() -> Self { + Self { + map_wc: false, + parent_resv_obj: None, + } + } +} + /// A shmem-backed GEM object. /// /// # Invariants @@ -67,6 +107,11 @@ pub struct Object<T: DriverObject, C: DeviceContext = Registered> { obj: Opaque<bindings::drm_gem_shmem_object>, /// Parent object that owns this object's DMA reservation object. parent_resv_obj: Option<ARef<Object<T, C>>>, + /// Devres object for unmapping any SGTable on driver-unbind. 
+ sgt_res: ManuallyDrop<SetOnce<Devres<SGTableMap<T, C>>>>, + #[pin] + /// Lock for protecting initialization of `sgt_res`. + sgt_lock: Mutex<()>, #[pin] inner: T, _ctx: PhantomData<C>, @@ -125,6 +170,8 @@ impl<T: DriverObject, C: DeviceContext> Object<T, C> { try_pin_init!(Self { obj <- Opaque::init_zeroed(), parent_resv_obj: config.parent_resv_obj.map(|p| p.into()), + sgt_res: ManuallyDrop::new(SetOnce::new()), + sgt_lock <- new_mutex!(()), inner <- T::new(dev, size, args), _ctx: PhantomData::<C>, }), @@ -169,22 +216,143 @@ impl<T: DriverObject, C: DeviceContext> Object<T, C> { // - DRM always passes a valid gem object here // - We used drm_gem_shmem_create() in our create_gem_object callback, so we know that // `obj` is contained within a drm_gem_shmem_object - let this = unsafe { container_of!(obj, bindings::drm_gem_shmem_object, base) }; - - // SAFETY: - // - We're in free_callback - so this function is safe to call. - // - We won't be using the gem resources on `this` after this call. - unsafe { bindings::drm_gem_shmem_release(this) }; + let base = unsafe { container_of!(obj, bindings::drm_gem_shmem_object, base) }; // SAFETY: // - We verified above that `obj` is valid, which makes `this` valid // - This function is set in AllocOps, so we know that `this` is contained within a // `Object<T, C>` - let this = unsafe { container_of!(Opaque::cast_from(this), Self, obj) }.cast_mut(); + let this = unsafe { container_of!(Opaque::cast_from(base), Self, obj) }.cast_mut(); + + // We need to drop `sgt_res` first, since doing so requires that the GEM object is still + // alive. + // SAFETY: + // - We verified above that `this` is valid. + // - We are in free_callback, guaranteeing we have exclusive access to `this` and that + // `sgt_res` will not be used after dropping it here. + unsafe { ManuallyDrop::drop(&mut (*this).sgt_res) }; + + // SAFETY: + // - We're in free_callback - so this function is safe to call. 
+ // - We won't be using the gem resources on `this` after this call. + unsafe { bindings::drm_gem_shmem_release(base) }; // SAFETY: We're recovering the Kbox<> we created in gem_create_object() let _ = unsafe { KBox::from_raw(this) }; } + + /// Attempt to create a vmap from the gem object, and confirm the size of said vmap. + fn make_vmap<'a, R, const SIZE: usize>(&'a self) -> Result<VMap<T, R, C, SIZE>> + where + R: Deref<Target = Self> + From<&'a Self>, + { + // INVARIANT: We check here that the gem object is at least as large as `SIZE`. + if self.size() < SIZE { + return Err(ENOSPC); + } + + let mut map: MaybeUninit<bindings::iosys_map> = MaybeUninit::uninit(); + let guard = DmaResvGuard::new(self); + + // SAFETY: `drm_gem_shmem_vmap()` can be called with the DMA reservation lock held. + to_result(unsafe { + bindings::drm_gem_shmem_vmap_locked(self.as_raw_shmem(), map.as_mut_ptr()) + })?; + + // Drop the guard explicitly here, since we may need to call `raw_vunmap()` (which + // re-acquires the lock). + drop(guard); + + // SAFETY: The call to `drm_gem_shmem_vmap_locked()` succeeded above, so we are guaranteed + // that map is properly initialized. + let map = unsafe { map.assume_init() }; + + // XXX: We don't currently support iomem allocations + if map.is_iomem { + // SAFETY: The vmap operation above succeeded, guaranteeing that `map` points to a valid + // memory mapping. + unsafe { self.raw_vunmap(map) }; + + Err(ENOTSUPP) + } else { + Ok(VMap { + // INVARIANT: `addr` remains valid for as long as `owner` does, which extends to the + // lifetime of `VMap` itself. + // SAFETY: We checked that this is not an iomem allocation, making it safe to read + // vaddr. + addr: unsafe { map.__bindgen_anon_1.vaddr }, + owner: self.into(), + }) + } + } + + /// Unmap a vmap from the gem object. + /// + /// # Safety + /// + /// - The caller promises that `map` is a valid vmap on this gem object. 
+ /// - The caller promises that the memory pointed to by `map` will no longer be accessed through + /// this instance. + unsafe fn raw_vunmap(&self, mut map: bindings::iosys_map) { + let _guard = DmaResvGuard::new(self); + + // SAFETY: + // - This function is safe to call with the DMA reservation lock held. + // - The caller promises that `map` is a valid vmap on this gem object. + unsafe { bindings::drm_gem_shmem_vunmap_locked(self.as_raw_shmem(), &mut map) }; + } + + /// Creates and returns a virtual kernel memory mapping for this object. + #[inline] + pub fn vmap<const SIZE: usize>(&self) -> Result<VMapRef<'_, T, C, SIZE>> { + self.make_vmap() + } + + /// Creates and returns an owned reference to a virtual kernel memory mapping for this object. + #[inline] + pub fn owned_vmap<const SIZE: usize>(&self) -> Result<VMapOwned<T, C, SIZE>> { + self.make_vmap() + } + + /// Creates (if necessary) and returns an immutable reference to a scatter-gather table of DMA + /// pages for this object. + /// + /// This will pin the object in memory. `dev` must be the same [`device::Device`] that + /// `self` belongs to; otherwise this function returns + /// `Err(EINVAL)`. + pub fn sg_table<'a>( + &'a self, + dev: &'a device::Device<Bound>, + ) -> Result<&'a scatterlist::SGTable> { + if dev.as_raw() != self.dev().as_ref().as_raw() { + return Err(EINVAL); + } + + let sgt_res = 'out: { + // Fast path: sgt_res is already initialized + if let Some(sgt_res) = self.sgt_res.as_ref() { + break 'out sgt_res; + } + + // Slow path: Grab the lock and see if we need to initialize sgt_res. + let _guard = self.sgt_lock.lock(); + + // If someone initialized it while we were waiting, we can exit early. + if let Some(sgt_res) = self.sgt_res.as_ref() { + break 'out sgt_res; + } + + // If not, finish initializing and return. `populate()` cannot return false, as + // `sgt_res` must be unpopulated, and we must hold `sgt_lock` to reach this point. 
+ self.sgt_res + .populate(Devres::new(dev, SGTableMap::new(self))?); + + // SAFETY: We just populated sgt_res above. + unsafe { self.sgt_res.as_ref().unwrap_unchecked() } + }; + + Ok(sgt_res.access(dev)?) + } } impl<T: DriverObject, C: DeviceContext> Deref for Object<T, C> { @@ -235,3 +403,369 @@ impl<T: DriverObject, C: DeviceContext> driver::AllocImpl for Object<T, C> { dumb_map_offset: None, }; } + +/// Private helper-type for holding the `dma_resv` lock of a GEM shmem object. +/// +/// When this is dropped, the `dma_resv` lock is released as well. +/// +// TODO: This should be replaced with a WwMutex equivalent once we have such bindings in the kernel. +struct DmaResvGuard<'a, T: DriverObject, C: DeviceContext = Registered>( + &'a Object<T, C>, + NotThreadSafe, +); + +impl<'a, T: DriverObject, C: DeviceContext> DmaResvGuard<'a, T, C> { + #[inline] + fn new(obj: &'a Object<T, C>) -> Self { + // SAFETY: This lock is initialized throughout the lifetime of `obj`. + unsafe { bindings::dma_resv_lock(obj.raw_dma_resv(), ptr::null_mut()) }; + + Self(obj, NotThreadSafe) + } +} + +impl<'a, T: DriverObject, C: DeviceContext> Drop for DmaResvGuard<'a, T, C> { + #[inline] + fn drop(&mut self) { + // SAFETY: We are releasing the lock grabbed during the creation of this object. + unsafe { bindings::dma_resv_unlock(self.0.raw_dma_resv()) }; + } +} + +/// A reference to a virtual mapping for a shmem-based GEM object in kernel address space. +/// +/// # Invariants +/// +/// - The size of `owner` is >= SIZE. +/// - The memory pointed to by `addr` remains valid at least until this object is dropped. +pub struct VMap<D, R, C = Registered, const SIZE: usize = 0> +where + D: DriverObject, + C: DeviceContext, + R: Deref<Target = Object<D, C>>, +{ + addr: *mut c_void, + owner: R, +} + +/// An alias type for a reference to a shmem-based GEM object's VMap. 
+pub type VMapRef<'a, D, C, const SIZE: usize = 0> = VMap<D, &'a Object<D, C>, C, SIZE>; + +/// An alias type for an owned reference to a shmem-based GEM object's VMap. +pub type VMapOwned<D, C, const SIZE: usize = 0> = VMap<D, ARef<Object<D, C>>, C, SIZE>; + +impl<D, R, C, const SIZE: usize> VMap<D, R, C, SIZE> +where + D: DriverObject, + C: DeviceContext, + R: Deref<Target = Object<D, C>>, +{ + /// Borrows a reference to the object that owns this virtual mapping. + #[inline] + pub fn owner(&self) -> &Object<D, C> { + &self.owner + } +} + +impl<D, R, C, const SIZE: usize> Drop for VMap<D, R, C, SIZE> +where + D: DriverObject, + C: DeviceContext, + R: Deref<Target = Object<D, C>>, +{ + #[inline] + fn drop(&mut self) { + // SAFETY: + // - Our existence is proof that this map was previously created using self.owner. + // - Since we are in Drop, we are guaranteed that no one will access the memory + // through this mapping after calling this. + unsafe { + self.owner.raw_vunmap(bindings::iosys_map { + is_iomem: false, + __bindgen_anon_1: bindings::iosys_map__bindgen_ty_1 { vaddr: self.addr }, + }) + }; + } +} + +// SAFETY: `addr` points to a valid memory address for as long as `owner` exists, meaning that so +// long as `owner` is `Send` so is `VMap`. +unsafe impl<D, R, C, const SIZE: usize> Send for VMap<D, R, C, SIZE> +where + D: DriverObject, + C: DeviceContext, + R: Deref<Target = Object<D, C>> + Send, +{ +} + +// SAFETY: `addr` points to a valid memory address for as long as `owner` exists, meaning that so +// long as `owner` is `Sync` so is `VMap`. 
+unsafe impl<D, R, C, const SIZE: usize> Sync for VMap<D, R, C, SIZE> +where + D: DriverObject, + C: DeviceContext, + R: Deref<Target = Object<D, C>> + Sync, +{ +} + +impl<D, R, C, const SIZE: usize> Io for VMap<D, R, C, SIZE> +where + D: DriverObject, + C: DeviceContext, + R: Deref<Target = Object<D, C>>, +{ + #[inline] + fn addr(&self) -> usize { + self.addr as usize + } + + #[inline] + fn maxsize(&self) -> usize { + self.owner.size() + } +} + +impl<D, R, C, const SIZE: usize> IoKnownSize for VMap<D, R, C, SIZE> +where + D: DriverObject, + C: DeviceContext, + R: Deref<Target = Object<D, C>>, +{ + const MIN_SIZE: usize = SIZE; +} + +macro_rules! impl_vmap_io_capable { + ($ty:ty) => { + impl<D, R, C, const SIZE: usize> IoCapable<$ty> for VMap<D, R, C, SIZE> + where + D: DriverObject, + C: DeviceContext, + R: Deref<Target = Object<D, C>>, + { + #[inline] + unsafe fn io_read(&self, address: usize) -> $ty { + let ptr = address as *mut $ty; + + // SAFETY: The safety contract of `io_read` guarantees that address is a valid + // address within the bounds of `Self` of at least the size of $ty, and is properly + // aligned. + unsafe { ptr::read_volatile(ptr) } + } + + #[inline] + unsafe fn io_write(&self, value: $ty, address: usize) { + let ptr = address as *mut $ty; + + // SAFETY: The safety contract of `io_write` guarantees that address is a valid + // address within the bounds of `Self` of at least the size of $ty, and is properly + // aligned. + unsafe { ptr::write_volatile(ptr, value) } + } + } + }; +} + +impl_vmap_io_capable!(u8); +impl_vmap_io_capable!(u16); +impl_vmap_io_capable!(u32); +#[cfg(CONFIG_64BIT)] +impl_vmap_io_capable!(u64); + +/// A reference to a GEM object that is known to have a mapped [`SGTable`]. +/// +/// This is used by the Rust bindings with [`Devres`] in order to ensure that mappings for SGTables +/// on GEM shmem objects are revoked on driver-unbind. +/// +/// # Invariants +/// +/// - `self.obj` always points to a valid GEM object. 
+/// - This object is proof that `self.obj.owner.sgt_res` has an initialized and valid pointer to an +/// [`SGTable`]. +/// +/// [`SGTable`]: scatterlist::SGTable +pub struct SGTableMap<T: DriverObject, C: DeviceContext> { + obj: NonNull<Object<T, C>>, +} + +impl<T: DriverObject, C: DeviceContext> Deref for SGTableMap<T, C> { + type Target = scatterlist::SGTable; + + fn deref(&self) -> &Self::Target { + // SAFETY: + // - The NonNull is guaranteed to be valid via our type invariants. + // - The sgt field is guaranteed to be initialized and valid via our type invariants. + unsafe { scatterlist::SGTable::from_raw((*self.obj.as_ref().as_raw_shmem()).sgt) } + } +} + +impl<T: DriverObject, C: DeviceContext> Drop for SGTableMap<T, C> { + fn drop(&mut self) { + // SAFETY: `obj` is always valid via our type invariants + let obj = unsafe { self.obj.as_ref() }; + let _lock = DmaResvGuard::new(obj); + + // SAFETY: We acquired the lock needed for calling this function above + unsafe { bindings::__drm_gem_shmem_free_sgt_locked(obj.as_raw_shmem()) }; + } +} + +impl<T: DriverObject, C: DeviceContext> SGTableMap<T, C> { + fn new(obj: &Object<T, C>) -> impl Init<Self, Error> { + // INVARIANT: + // - We call drm_gem_shmem_get_pages_sgt below and check whether or not it succeeds, + // fulfilling the invariant of SGTableMap that the object's `sgt` field is initialized. + // SAFETY: + // - `obj` is fully initialized, making this function safe to call. + from_err_ptr(unsafe { bindings::drm_gem_shmem_get_pages_sgt(obj.as_raw_shmem()) })?; + + Ok(Self { obj: obj.into() }) + } +} + +// SAFETY: The NonNull in SGTableMap is guaranteed valid by our type invariants, and the GEM object +// it points to is guaranteed to be thread-safe. +unsafe impl<T: DriverObject, C: DeviceContext> Send for SGTableMap<T, C> {} +// SAFETY: The NonNull in SGTableMap is guaranteed valid by our type invariants, and the GEM object +// it points to is guaranteed to be thread-safe. 
+unsafe impl<T: DriverObject, C: DeviceContext> Sync for SGTableMap<T, C> {} + +#[kunit_tests(rust_drm_gem_shmem)] +mod tests { + use super::*; + use crate::{ + drm::{ + self, + UnregisteredDevice, // + }, + faux, + page::PAGE_SIZE, // + }; + + // The bare minimum needed to create a fake drm driver for kunit + + #[pin_data] + struct KunitData {} + struct KunitDriver; + struct KunitFile; + #[pin_data] + struct KunitObject {} + + const INFO: drm::DriverInfo = drm::DriverInfo { + major: 0, + minor: 0, + patchlevel: 0, + name: c"kunit", + desc: c"Kunit", + }; + + impl drm::file::DriverFile for KunitFile { + type Driver = KunitDriver; + + fn open(_dev: &drm::Device<KunitDriver>) -> Result<Pin<KBox<Self>>> { + Ok(KBox::new(Self, GFP_KERNEL)?.into()) + } + } + + impl gem::DriverObject for KunitObject { + type Driver = KunitDriver; + type Args = (); + + fn new<C: DeviceContext>( + _dev: &drm::Device<KunitDriver, C>, + _size: usize, + _args: Self::Args, + ) -> impl PinInit<Self, Error> { + try_pin_init!(KunitObject {}) + } + } + + #[vtable] + impl drm::Driver for KunitDriver { + type Data = KunitData; + type File = KunitFile; + type Object<Ctx: DeviceContext> = Object<KunitObject, Ctx>; + + const INFO: drm::DriverInfo = INFO; + const IOCTLS: &'static [drm::ioctl::DrmIoctlDescriptor] = &[]; + } + + fn create_drm_dev() -> Result<(faux::Registration, UnregisteredDevice<KunitDriver>)> { + // Create a faux DRM device so we can test gem object creation. 
+ let data = try_pin_init!(KunitData {}); + let dev = faux::Registration::new(c"Kunit", None)?; + let drm = UnregisteredDevice::new(dev.as_ref(), data)?; + + Ok((dev, drm)) + } + + #[test] + fn compile_time_vmap_sizes() -> Result { + let (_dev, drm) = create_drm_dev()?; + + let obj = Object::<KunitObject, _>::new(&drm, PAGE_SIZE, ObjectConfig::default(), ())?; + + // Try creating a normal vmap + obj.vmap::<PAGE_SIZE>()?; + + // Try creating a vmap that's smaller than the size we specified + let vmap = obj.vmap::<{ PAGE_SIZE - 100 }>()?; + + // Verify the owner matches + assert!(ptr::eq(vmap.owner(), obj.deref())); + + // Verify the max size matches the actual object size + assert_eq!(vmap.maxsize(), PAGE_SIZE); + + // Make sure creating a vmap that's too large fails + assert!(obj.vmap::<{ PAGE_SIZE + 200 }>().is_err()); + + Ok(()) + } + + #[test] + fn vmap_io() -> Result { + let (_dev, drm) = create_drm_dev()?; + + let obj = Object::<KunitObject, _>::new(&drm, PAGE_SIZE, ObjectConfig::default(), ())?; + + let vmap = obj.vmap::<PAGE_SIZE>()?; + + vmap.write8(0xDE, 0x0); + assert_eq!(vmap.read8(0x0), 0xDE); + vmap.write32(0xFEDCBA98, 0x20); + + assert_eq!(vmap.read32(0x20), 0xFEDCBA98); + + // Ensure the ordering in memory is correct + let expected = 0xFEDCBA98_u32.to_ne_bytes().into_iter(); + for (offset, expected) in (0x20..=0x23).zip(expected) { + assert_eq!(vmap.read8(offset), expected); + } + + Ok(()) + } + + // TODO: I would love to actually test the success paths of sg_table(), but that would require + // also implementing dummy dma_ops so that trying to create a mapping doesn't explode. So, leave + // that for someone else. + + // Ensures that passing the wrong device to sg_table() fails as we expect, and also ensure it + // skips initializing `sgt_res` since we could otherwise create `sgt_res` with the wrong device + // bound to it. 
+ #[test] + fn fail_sg_table_on_wrong_dev() -> Result { + let (_dev, drm) = create_drm_dev()?; + let wrong_dev = faux::Registration::new(c"EvilKunit", None)?; + + let obj = Object::<KunitObject, _>::new(&drm, PAGE_SIZE, ObjectConfig::default(), ())?; + + assert_eq!(obj.sg_table(wrong_dev.as_ref()).err().unwrap(), EINVAL); + + // If sgt_res was not initialized mistakenly with the wrong device, this should still fail. + assert_eq!(obj.sg_table(wrong_dev.as_ref()).err().unwrap(), EINVAL); + + // TODO: Someday, we should test that creating an sg_table here still succeeds. + + Ok(()) + } +} diff --git a/rust/kernel/drm/gpuvm/mod.rs b/rust/kernel/drm/gpuvm/mod.rs index ae58f6f667c1..20a08b3defeb 100644 --- a/rust/kernel/drm/gpuvm/mod.rs +++ b/rust/kernel/drm/gpuvm/mod.rs @@ -116,9 +116,9 @@ impl<T: DriverGpuVm> GpuVm<T> { /// Creates a GPUVM instance. #[expect(clippy::new_ret_no_self)] - pub fn new<E>( + pub fn new<E, Ctx: drm::DeviceContext>( name: &'static CStr, - dev: &drm::Device<T::Driver>, + dev: &drm::Device<T::Driver, Ctx>, r_obj: &T::Object, range: Range<u64>, reserve_range: Range<u64>, @@ -252,10 +252,10 @@ impl<T: DriverGpuVm> GpuVm<T> { /// The manager for a GPUVM. pub trait DriverGpuVm: Sized + Send { /// Parent `Driver` for this object. - type Driver: drm::Driver<Object = Self::Object>; + type Driver: drm::Driver; /// The kind of GEM object stored in this GPUVM. - type Object: IntoGEMObject; + type Object: drm::driver::AllocImpl<Driver = Self::Driver>; /// Data stored with each [`struct drm_gpuva`](struct@GpuVa). type VaData; @@ -264,7 +264,9 @@ pub trait DriverGpuVm: Sized + Send { type VmBoData; /// The private data passed to callbacks. - type SmContext<'ctx>; + type SmContext<'ctx> + where + Self: 'ctx; /// Indicates that a new mapping should be created. 
fn sm_step_map<'op, 'ctx>( diff --git a/rust/kernel/drm/gpuvm/sm_ops.rs b/rust/kernel/drm/gpuvm/sm_ops.rs index 69a8e5ab2821..742c151b2540 100644 --- a/rust/kernel/drm/gpuvm/sm_ops.rs +++ b/rust/kernel/drm/gpuvm/sm_ops.rs @@ -3,7 +3,7 @@ use super::*; /// The actual data that gets threaded through the callbacks. -struct SmData<'a, 'ctx, T: DriverGpuVm> { +struct SmData<'a, 'ctx, T: DriverGpuVm + 'ctx> { gpuvm: &'a mut UniqueRefGpuVm<T>, user_context: &'a mut T::SmContext<'ctx>, } @@ -20,7 +20,7 @@ struct SmMapData<'a, 'ctx, T: DriverGpuVm> { } /// The argument for [`UniqueRefGpuVm::sm_map`]. -pub struct OpMapRequest<'a, 'ctx, T: DriverGpuVm> { +pub struct OpMapRequest<'a, 'ctx, T: DriverGpuVm + 'ctx> { /// Address in GPU virtual address space. pub addr: u64, /// Length of mapping to create. diff --git a/rust/kernel/faux.rs b/rust/kernel/faux.rs index 43b4974f48cd..36c92ae2943c 100644 --- a/rust/kernel/faux.rs +++ b/rust/kernel/faux.rs @@ -25,7 +25,8 @@ use core::ptr::{ /// /// # Invariants /// -/// `self.0` always holds a valid pointer to an initialized and registered [`struct faux_device`]. +/// - `self.0` always holds a valid pointer to an initialized and registered [`struct faux_device`]. +/// - This object is proof that the object described by this `Registration` is bound to a device. /// /// [`struct faux_device`]: srctree/include/linux/device/faux.h pub struct Registration(NonNull<bindings::faux_device>); @@ -59,10 +60,17 @@ impl Registration { } } -impl AsRef<device::Device> for Registration { - fn as_ref(&self) -> &device::Device { - // SAFETY: The underlying `device` in `faux_device` is guaranteed by the C API to be - // a valid initialized `device`. +impl AsRef<device::Device<device::Bound>> for Registration { + fn as_ref(&self) -> &device::Device<device::Bound> { + // SAFETY: + // - The underlying `device` in `faux_device` is guaranteed by the C API to be a valid + // initialized `device`. 
+ // - `faux_match()` always returns 1, and probe runs synchronously + // (PROBE_FORCE_SYNCHRONOUS). + // - `suppress_bind_attrs = true` on faux_driver prevents userspace-triggered unbind via + // sysfs. + // - `mem::forget(Registration)` is not a problem; if the `Registration` is leaked, the faux + // device stays bound forever. unsafe { device::Device::from_raw(addr_of_mut!((*self.as_raw()).dev)) } } }
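The `vmap_io` KUnit test in the shmem hunk writes a `u32` through the mapping and then reads it back one byte at a time in native byte order. The following is a minimal userspace sketch of that same check, not the kernel implementation: `FakeMap` and its methods are hypothetical stand-ins for `VMap` and the `Io` accessors. It assumes a heap buffer in place of a vmapped GEM object, and it returns `Option` on bounds or alignment failure, which the patch does not do.

```rust
use std::ptr;

/// Hypothetical stand-in for a CPU mapping of a buffer (not the kernel `VMap`).
/// The storage is made of `u64` words so the base address is 8-byte aligned.
pub struct FakeMap {
    words: Vec<u64>,
}

impl FakeMap {
    /// Allocates a zeroed mapping of at least `size_bytes` bytes.
    pub fn new(size_bytes: usize) -> Self {
        Self { words: vec![0u64; (size_bytes + 7) / 8] }
    }

    /// Total size of the mapping in bytes.
    pub fn maxsize(&self) -> usize {
        self.words.len() * 8
    }

    /// Returns `Some(())` if `[offset, offset + len)` is in bounds and `offset`
    /// is a multiple of `len` (sufficient for alignment, as the base is 8-aligned).
    fn check(&self, offset: usize, len: usize) -> Option<()> {
        let end = offset.checked_add(len)?;
        if end <= self.maxsize() && offset % len == 0 {
            Some(())
        } else {
            None
        }
    }

    pub fn read8(&self, offset: usize) -> Option<u8> {
        self.check(offset, 1)?;
        // SAFETY: `check` verified that the byte lies inside the allocation.
        Some(unsafe { ptr::read_volatile(self.words.as_ptr().cast::<u8>().add(offset)) })
    }

    pub fn write8(&mut self, value: u8, offset: usize) -> Option<()> {
        self.check(offset, 1)?;
        // SAFETY: `check` verified that the byte lies inside the allocation.
        unsafe { ptr::write_volatile(self.words.as_mut_ptr().cast::<u8>().add(offset), value) };
        Some(())
    }

    pub fn read32(&self, offset: usize) -> Option<u32> {
        self.check(offset, 4)?;
        // SAFETY: `check` verified bounds and 4-byte alignment.
        Some(unsafe {
            ptr::read_volatile(self.words.as_ptr().cast::<u8>().add(offset).cast::<u32>())
        })
    }

    pub fn write32(&mut self, value: u32, offset: usize) -> Option<()> {
        self.check(offset, 4)?;
        // SAFETY: `check` verified bounds and 4-byte alignment.
        unsafe {
            ptr::write_volatile(
                self.words.as_mut_ptr().cast::<u8>().add(offset).cast::<u32>(),
                value,
            )
        };
        Some(())
    }
}

/// Mirrors the byte-order check in `vmap_io`: a 32-bit write must be visible
/// as the value's native-endian bytes at consecutive offsets.
pub fn byte_order_matches(value: u32, offset: usize) -> bool {
    let mut map = FakeMap::new(4096);
    if map.write32(value, offset).is_none() {
        return false;
    }
    value
        .to_ne_bytes()
        .iter()
        .enumerate()
        .all(|(i, b)| map.read8(offset + i) == Some(*b))
}
```

Comparing against `to_ne_bytes()` keeps the check endian-neutral, which is presumably why the test in the patch does the same rather than hard-coding a little-endian byte sequence.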
