summaryrefslogtreecommitdiff
path: root/Documentation/driver-api
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/driver-api')
-rw-r--r--Documentation/driver-api/dmaengine/client.rst87
-rw-r--r--Documentation/driver-api/dmaengine/provider.rst48
-rw-r--r--Documentation/driver-api/driver-model/devres.rst1
-rw-r--r--Documentation/driver-api/thermal/cpu-idle-cooling.rst189
-rw-r--r--Documentation/driver-api/thermal/exynos_thermal.rst8
5 files changed, 328 insertions, 5 deletions
diff --git a/Documentation/driver-api/dmaengine/client.rst b/Documentation/driver-api/dmaengine/client.rst
index 45953f171500..a9a7a3c84c63 100644
--- a/Documentation/driver-api/dmaengine/client.rst
+++ b/Documentation/driver-api/dmaengine/client.rst
@@ -151,6 +151,93 @@ The details of these operations are:
Note that callbacks will always be invoked from the DMA
engines tasklet, never from interrupt context.
+ Optional: per descriptor metadata
+ ---------------------------------
+ DMAengine provides two ways for metadata support.
+
+ DESC_METADATA_CLIENT
+
+ The metadata buffer is allocated/provided by the client driver and it is
+ attached to the descriptor.
+
+ .. code-block:: c
+
+ int dmaengine_desc_attach_metadata(struct dma_async_tx_descriptor *desc,
+ void *data, size_t len);
+
+ DESC_METADATA_ENGINE
+
+ The metadata buffer is allocated/managed by the DMA driver. The client
+ driver can ask for the pointer, maximum size and the currently used size of
+ the metadata and can directly update or read it.
+
+ Becasue the DMA driver manages the memory area containing the metadata,
+ clients must make sure that they do not try to access or get the pointer
+ after their transfer completion callback has run for the descriptor.
+ If no completion callback has been defined for the transfer, then the
+ metadata must not be accessed after issue_pending.
+ In other words: if the aim is to read back metadata after the transfer is
+ completed, then the client must use completion callback.
+
+ .. code-block:: c
+
+ void *dmaengine_desc_get_metadata_ptr(struct dma_async_tx_descriptor *desc,
+ size_t *payload_len, size_t *max_len);
+
+ int dmaengine_desc_set_metadata_len(struct dma_async_tx_descriptor *desc,
+ size_t payload_len);
+
+ Client drivers can query if a given mode is supported with:
+
+ .. code-block:: c
+
+ bool dmaengine_is_metadata_mode_supported(struct dma_chan *chan,
+ enum dma_desc_metadata_mode mode);
+
+ Depending on the used mode client drivers must follow different flow.
+
+ DESC_METADATA_CLIENT
+
+ - DMA_MEM_TO_DEV / DEV_MEM_TO_MEM:
+ 1. prepare the descriptor (dmaengine_prep_*)
+ construct the metadata in the client's buffer
+ 2. use dmaengine_desc_attach_metadata() to attach the buffer to the
+ descriptor
+ 3. submit the transfer
+ - DMA_DEV_TO_MEM:
+ 1. prepare the descriptor (dmaengine_prep_*)
+ 2. use dmaengine_desc_attach_metadata() to attach the buffer to the
+ descriptor
+ 3. submit the transfer
+ 4. when the transfer is completed, the metadata should be available in the
+ attached buffer
+
+ DESC_METADATA_ENGINE
+
+ - DMA_MEM_TO_DEV / DEV_MEM_TO_MEM:
+ 1. prepare the descriptor (dmaengine_prep_*)
+ 2. use dmaengine_desc_get_metadata_ptr() to get the pointer to the
+ engine's metadata area
+ 3. update the metadata at the pointer
+ 4. use dmaengine_desc_set_metadata_len() to tell the DMA engine the
+ amount of data the client has placed into the metadata buffer
+ 5. submit the transfer
+ - DMA_DEV_TO_MEM:
+ 1. prepare the descriptor (dmaengine_prep_*)
+ 2. submit the transfer
+ 3. on transfer completion, use dmaengine_desc_get_metadata_ptr() to get
+ the pointer to the engine's metadata area
+ 4. read out the metadata from the pointer
+
+ .. note::
+
+ When DESC_METADATA_ENGINE mode is used the metadata area for the descriptor
+ is no longer valid after the transfer has been completed (valid up to the
+ point when the completion callback returns if used).
+
+ Mixed use of DESC_METADATA_CLIENT / DESC_METADATA_ENGINE is not allowed,
+ client drivers must use either of the modes per descriptor.
+
4. Submit the transaction
Once the descriptor has been prepared and the callback information
diff --git a/Documentation/driver-api/dmaengine/provider.rst b/Documentation/driver-api/dmaengine/provider.rst
index dfc4486b5743..790a15089f1f 100644
--- a/Documentation/driver-api/dmaengine/provider.rst
+++ b/Documentation/driver-api/dmaengine/provider.rst
@@ -247,6 +247,54 @@ after each transfer. In case of a ring buffer, they may loop
(DMA_CYCLIC). Addresses pointing to a device's register (e.g. a FIFO)
are typically fixed.
+Per descriptor metadata support
+-------------------------------
+Some data movement architecture (DMA controller and peripherals) uses metadata
+associated with a transaction. The DMA controller role is to transfer the
+payload and the metadata alongside.
+The metadata itself is not used by the DMA engine itself, but it contains
+parameters, keys, vectors, etc for peripheral or from the peripheral.
+
+The DMAengine framework provides a generic ways to facilitate the metadata for
+descriptors. Depending on the architecture the DMA driver can implement either
+or both of the methods and it is up to the client driver to choose which one
+to use.
+
+- DESC_METADATA_CLIENT
+
+ The metadata buffer is allocated/provided by the client driver and it is
+ attached (via the dmaengine_desc_attach_metadata() helper to the descriptor.
+
+ From the DMA driver the following is expected for this mode:
+ - DMA_MEM_TO_DEV / DEV_MEM_TO_MEM
+ The data from the provided metadata buffer should be prepared for the DMA
+ controller to be sent alongside of the payload data. Either by copying to a
+ hardware descriptor, or highly coupled packet.
+ - DMA_DEV_TO_MEM
+ On transfer completion the DMA driver must copy the metadata to the client
+ provided metadata buffer before notifying the client about the completion.
+ After the transfer completion, DMA drivers must not touch the metadata
+ buffer provided by the client.
+
+- DESC_METADATA_ENGINE
+
+ The metadata buffer is allocated/managed by the DMA driver. The client driver
+ can ask for the pointer, maximum size and the currently used size of the
+ metadata and can directly update or read it. dmaengine_desc_get_metadata_ptr()
+ and dmaengine_desc_set_metadata_len() is provided as helper functions.
+
+ From the DMA driver the following is expected for this mode:
+ - get_metadata_ptr
+ Should return a pointer for the metadata buffer, the maximum size of the
+ metadata buffer and the currently used / valid (if any) bytes in the buffer.
+ - set_metadata_len
+ It is called by the clients after it have placed the metadata to the buffer
+ to let the DMA driver know the number of valid bytes provided.
+
+ Note: since the client will ask for the metadata pointer in the completion
+ callback (in DMA_DEV_TO_MEM case) the DMA driver must ensure that the
+ descriptor is not freed up prior the callback is called.
+
Device operations
-----------------
diff --git a/Documentation/driver-api/driver-model/devres.rst b/Documentation/driver-api/driver-model/devres.rst
index f37d9ded8ddc..46c13780994c 100644
--- a/Documentation/driver-api/driver-model/devres.rst
+++ b/Documentation/driver-api/driver-model/devres.rst
@@ -315,7 +315,6 @@ IOMAP
devm_ioport_map()
devm_ioport_unmap()
devm_ioremap()
- devm_ioremap_nocache()
devm_ioremap_uc()
devm_ioremap_wc()
devm_ioremap_resource() : checks resource, requests memory region, ioremaps
diff --git a/Documentation/driver-api/thermal/cpu-idle-cooling.rst b/Documentation/driver-api/thermal/cpu-idle-cooling.rst
new file mode 100644
index 000000000000..e4f0859486c7
--- /dev/null
+++ b/Documentation/driver-api/thermal/cpu-idle-cooling.rst
@@ -0,0 +1,189 @@
+
+Situation:
+----------
+
+Under certain circumstances a SoC can reach a critical temperature
+limit and is unable to stabilize the temperature around a temperature
+control. When the SoC has to stabilize the temperature, the kernel can
+act on a cooling device to mitigate the dissipated power. When the
+critical temperature is reached, a decision must be taken to reduce
+the temperature, that, in turn impacts performance.
+
+Another situation is when the silicon temperature continues to
+increase even after the dynamic leakage is reduced to its minimum by
+clock gating the component. This runaway phenomenon can continue due
+to the static leakage. The only solution is to power down the
+component, thus dropping the dynamic and static leakage that will
+allow the component to cool down.
+
+Last but not least, the system can ask for a specific power budget but
+because of the OPP density, we can only choose an OPP with a power
+budget lower than the requested one and under-utilize the CPU, thus
+losing performance. In other words, one OPP under-utilizes the CPU
+with a power less than the requested power budget and the next OPP
+exceeds the power budget. An intermediate OPP could have been used if
+it were present.
+
+Solutions:
+----------
+
+If we can remove the static and the dynamic leakage for a specific
+duration in a controlled period, the SoC temperature will
+decrease. Acting on the idle state duration or the idle cycle
+injection period, we can mitigate the temperature by modulating the
+power budget.
+
+The Operating Performance Point (OPP) density has a great influence on
+the control precision of cpufreq, however different vendors have a
+plethora of OPP density, and some have large power gap between OPPs,
+that will result in loss of performance during thermal control and
+loss of power in other scenarios.
+
+At a specific OPP, we can assume that injecting idle cycle on all CPUs
+belong to the same cluster, with a duration greater than the cluster
+idle state target residency, we lead to dropping the static and the
+dynamic leakage for this period (modulo the energy needed to enter
+this state). So the sustainable power with idle cycles has a linear
+relation with the OPP’s sustainable power and can be computed with a
+coefficient similar to:
+
+ Power(IdleCycle) = Coef x Power(OPP)
+
+Idle Injection:
+---------------
+
+The base concept of the idle injection is to force the CPU to go to an
+idle state for a specified time each control cycle, it provides
+another way to control CPU power and heat in addition to
+cpufreq. Ideally, if all CPUs belonging to the same cluster, inject
+their idle cycles synchronously, the cluster can reach its power down
+state with a minimum power consumption and reduce the static leakage
+to almost zero. However, these idle cycles injection will add extra
+latencies as the CPUs will have to wakeup from a deep sleep state.
+
+We use a fixed duration of idle injection that gives an acceptable
+performance penalty and a fixed latency. Mitigation can be increased
+or decreased by modulating the duty cycle of the idle injection.
+
+ ^
+ |
+ |
+ |------- -------
+ |_______|_______________________|_______|___________
+
+ <------>
+ idle <---------------------->
+ running
+
+ <----------------------------->
+ duty cycle 25%
+
+
+The implementation of the cooling device bases the number of states on
+the duty cycle percentage. When no mitigation is happening the cooling
+device state is zero, meaning the duty cycle is 0%.
+
+When the mitigation begins, depending on the governor's policy, a
+starting state is selected. With a fixed idle duration and the duty
+cycle (aka the cooling device state), the running duration can be
+computed.
+
+The governor will change the cooling device state thus the duty cycle
+and this variation will modulate the cooling effect.
+
+ ^
+ |
+ |
+ |------- -------
+ |_______|_______________|_______|___________
+
+ <------>
+ idle <-------------->
+ running
+
+ <----------------------------->
+ duty cycle 33%
+
+
+ ^
+ |
+ |
+ |------- -------
+ |_______|_______|_______|___________
+
+ <------>
+ idle <------>
+ running
+
+ <------------->
+ duty cycle 50%
+
+The idle injection duration value must comply with the constraints:
+
+- It is less than or equal to the latency we tolerate when the
+ mitigation begins. It is platform dependent and will depend on the
+ user experience, reactivity vs performance trade off we want. This
+ value should be specified.
+
+- It is greater than the idle state’s target residency we want to go
+ for thermal mitigation, otherwise we end up consuming more energy.
+
+Power considerations
+--------------------
+
+When we reach the thermal trip point, we have to sustain a specified
+power for a specific temperature but at this time we consume:
+
+ Power = Capacitance x Voltage^2 x Frequency x Utilisation
+
+... which is more than the sustainable power (or there is something
+wrong in the system setup). The ‘Capacitance’ and ‘Utilisation’ are a
+fixed value, ‘Voltage’ and the ‘Frequency’ are fixed artificially
+because we don’t want to change the OPP. We can group the
+‘Capacitance’ and the ‘Utilisation’ into a single term which is the
+‘Dynamic Power Coefficient (Cdyn)’ Simplifying the above, we have:
+
+ Pdyn = Cdyn x Voltage^2 x Frequency
+
+The power allocator governor will ask us somehow to reduce our power
+in order to target the sustainable power defined in the device
+tree. So with the idle injection mechanism, we want an average power
+(Ptarget) resulting in an amount of time running at full power on a
+specific OPP and idle another amount of time. That could be put in a
+equation:
+
+ P(opp)target = ((Trunning x (P(opp)running) + (Tidle x P(opp)idle)) /
+ (Trunning + Tidle)
+ ...
+
+ Tidle = Trunning x ((P(opp)running / P(opp)target) - 1)
+
+At this point if we know the running period for the CPU, that gives us
+the idle injection we need. Alternatively if we have the idle
+injection duration, we can compute the running duration with:
+
+ Trunning = Tidle / ((P(opp)running / P(opp)target) - 1)
+
+Practically, if the running power is less than the targeted power, we
+end up with a negative time value, so obviously the equation usage is
+bound to a power reduction, hence a higher OPP is needed to have the
+running power greater than the targeted power.
+
+However, in this demonstration we ignore three aspects:
+
+ * The static leakage is not defined here, we can introduce it in the
+ equation but assuming it will be zero most of the time as it is
+ difficult to get the values from the SoC vendors
+
+ * The idle state wake up latency (or entry + exit latency) is not
+ taken into account, it must be added in the equation in order to
+ rigorously compute the idle injection
+
+ * The injected idle duration must be greater than the idle state
+ target residency, otherwise we end up consuming more energy and
+ potentially invert the mitigation effect
+
+So the final equation is:
+
+ Trunning = (Tidle - Twakeup ) x
+ (((P(opp)dyn + P(opp)static ) - P(opp)target) / P(opp)target )
diff --git a/Documentation/driver-api/thermal/exynos_thermal.rst b/Documentation/driver-api/thermal/exynos_thermal.rst
index 5bd556566c70..764df4ab584d 100644
--- a/Documentation/driver-api/thermal/exynos_thermal.rst
+++ b/Documentation/driver-api/thermal/exynos_thermal.rst
@@ -4,7 +4,7 @@ Kernel driver exynos_tmu
Supported chips:
-* ARM SAMSUNG EXYNOS4, EXYNOS5 series of SoC
+* ARM Samsung Exynos4, Exynos5 series of SoC
Datasheet: Not publicly available
@@ -14,7 +14,7 @@ Authors: Amit Daniel <amit.daniel@samsung.com>
TMU controller Description:
---------------------------
-This driver allows to read temperature inside SAMSUNG EXYNOS4/5 series of SoC.
+This driver allows to read temperature inside Samsung Exynos4/5 series of SoC.
The chip only exposes the measured 8-bit temperature code value
through a register.
@@ -43,7 +43,7 @@ The three equations are:
Trimming info for 85 degree Celsius (stored at TRIMINFO register)
Temperature code measured at 85 degree Celsius which is unchanged
-TMU(Thermal Management Unit) in EXYNOS4/5 generates interrupt
+TMU(Thermal Management Unit) in Exynos4/5 generates interrupt
when temperature exceeds pre-defined levels.
The maximum number of configurable threshold is five.
The threshold levels are defined as follows::
@@ -67,7 +67,7 @@ TMU driver description:
The exynos thermal driver is structured as::
Kernel Core thermal framework
- (thermal_core.c, step_wise.c, cpu_cooling.c)
+ (thermal_core.c, step_wise.c, cpufreq_cooling.c)
^
|
|