summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2012-07-30 09:53:50 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2012-07-30 09:53:50 -0700
commit8da8533dfb0929c5ea5d9fdf60ea6d3ffa02127d (patch)
tree1f9fe13e150dae31cf48ac3a88d5003040c2ec98 /Documentation
parentf50f118c4974f7c2208a54f96452165ffb880471 (diff)
parentc2078e4c9120e7b38b1a02cd9fc6dd4f792110bf (diff)
downloadlwn-8da8533dfb0929c5ea5d9fdf60ea6d3ffa02127d.tar.gz
lwn-8da8533dfb0929c5ea5d9fdf60ea6d3ffa02127d.zip
Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac
Pull EDAC patches from Mauro Carvalho Chehab: - the second part of the EDAC rework: - Add the sysfs nodes that exports the real memory layout, instead of the fake one (needed to properly represent Intel memory controllers since 2002) - convert EDAC MC to use "struct device" instead of creating the sysfs nodes via the kobj API - adds a tracepoint to represent memory errors - some cleanup patches - some fixes at i5000, i5400 and EDAC core - a new EDAC driver for Caldera. * git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: (33 commits) edac i5000, i5400: fix pointer math in i5000_get_mc_regs() edac: allow specifying the error count with fake_inject edac: add support for Calxeda highbank L2 cache ecc edac: add support for Calxeda highbank memory controller edac: create top-level debugfs directory sb_edac: properly handle error count i7core_edac: properly handle error count edac: edac_mc_handle_error(): add an error_count parameter edac: remove arch-specific parameter for the error handler amd64_edac: Don't pass driver name as an error parameter edac_mc: check for allocation failure in edac_mc_alloc() edac: Increase version to 3.0.0 edac_mc: Cleanup per-dimm_info debug messages edac: Convert debugfX to edac_dbg(X, edac: Use more normal debugging macro style edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs Edac: Add ABI Documentation for the new device nodes edac: move documentation ABI to ABI/testing/sysfs-devices-edac i7core_edac: change the mem allocation scheme to make Documentation/kobject.txt happy edac: change the mem allocation scheme to make Documentation/kobject.txt happy ...
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-devices-edac140
-rw-r--r--Documentation/devicetree/bindings/arm/calxeda/l2ecc.txt15
-rw-r--r--Documentation/devicetree/bindings/arm/calxeda/mem-ctrlr.txt14
-rw-r--r--Documentation/edac.txt112
4 files changed, 177 insertions, 104 deletions
diff --git a/Documentation/ABI/testing/sysfs-devices-edac b/Documentation/ABI/testing/sysfs-devices-edac
new file mode 100644
index 000000000000..30ee78aaed75
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-devices-edac
@@ -0,0 +1,140 @@
+What: /sys/devices/system/edac/mc/mc*/reset_counters
+Date: January 2006
+Contact: linux-edac@vger.kernel.org
+Description: This write-only control file will zero all the statistical
+ counters for UE and CE errors on the given memory controller.
+ Zeroing the counters will also reset the timer indicating how
+ long since the last counter were reset. This is useful for
+ computing errors/time. Since the counters are always reset
+ at driver initialization time, no module/kernel parameter
+ is available.
+
+What: /sys/devices/system/edac/mc/mc*/seconds_since_reset
+Date: January 2006
+Contact: linux-edac@vger.kernel.org
+Description: This attribute file displays how many seconds have elapsed
+ since the last counter reset. This can be used with the error
+ counters to measure error rates.
+
+What: /sys/devices/system/edac/mc/mc*/mc_name
+Date: January 2006
+Contact: linux-edac@vger.kernel.org
+Description: This attribute file displays the type of memory controller
+ that is being utilized.
+
+What: /sys/devices/system/edac/mc/mc*/size_mb
+Date: January 2006
+Contact: linux-edac@vger.kernel.org
+Description: This attribute file displays, in count of megabytes, of memory
+ that this memory controller manages.
+
+What: /sys/devices/system/edac/mc/mc*/ue_count
+Date: January 2006
+Contact: linux-edac@vger.kernel.org
+Description: This attribute file displays the total count of uncorrectable
+ errors that have occurred on this memory controller. If
+ panic_on_ue is set, this counter will not have a chance to
+ increment, since EDAC will panic the system
+
+What: /sys/devices/system/edac/mc/mc*/ue_noinfo_count
+Date: January 2006
+Contact: linux-edac@vger.kernel.org
+Description: This attribute file displays the number of UEs that have
+ occurred on this memory controller with no information as to
+ which DIMM slot is having errors.
+
+What: /sys/devices/system/edac/mc/mc*/ce_count
+Date: January 2006
+Contact: linux-edac@vger.kernel.org
+Description: This attribute file displays the total count of correctable
+ errors that have occurred on this memory controller. This
+ count is very important to examine. CEs provide early
+ indications that a DIMM is beginning to fail. This count
+ field should be monitored for non-zero values and report
+ such information to the system administrator.
+
+What: /sys/devices/system/edac/mc/mc*/ce_noinfo_count
+Date: January 2006
+Contact: linux-edac@vger.kernel.org
+Description: This attribute file displays the number of CEs that
+ have occurred on this memory controller wherewith no
+ information as to which DIMM slot is having errors. Memory is
+ handicapped, but operational, yet no information is available
+ to indicate which slot the failing memory is in. This count
+ field should be also be monitored for non-zero values.
+
+What: /sys/devices/system/edac/mc/mc*/sdram_scrub_rate
+Date: February 2007
+Contact: linux-edac@vger.kernel.org
+Description: Read/Write attribute file that controls memory scrubbing.
+ The scrubbing rate used by the memory controller is set by
+ writing a minimum bandwidth in bytes/sec to the attribute file.
+ The rate will be translated to an internal value that gives at
+ least the specified rate.
+ Reading the file will return the actual scrubbing rate employed.
+ If configuration fails or memory scrubbing is not implemented,
+ the value of the attribute file will be -1.
+
+What: /sys/devices/system/edac/mc/mc*/max_location
+Date: April 2012
+Contact: Mauro Carvalho Chehab <mchehab@redhat.com>
+ linux-edac@vger.kernel.org
+Description: This attribute file displays the information about the last
+ available memory slot in this memory controller. It is used by
+ userspace tools in order to display the memory filling layout.
+
+What: /sys/devices/system/edac/mc/mc*/(dimm|rank)*/size
+Date: April 2012
+Contact: Mauro Carvalho Chehab <mchehab@redhat.com>
+ linux-edac@vger.kernel.org
+Description: This attribute file will display the size of dimm or rank.
+ For dimm*/size, this is the size, in MB of the DIMM memory
+ stick. For rank*/size, this is the size, in MB for one rank
+ of the DIMM memory stick. On single rank memories (1R), this
+ is also the total size of the dimm. On dual rank (2R) memories,
+ this is half the size of the total DIMM memories.
+
+What: /sys/devices/system/edac/mc/mc*/(dimm|rank)*/dimm_dev_type
+Date: April 2012
+Contact: Mauro Carvalho Chehab <mchehab@redhat.com>
+ linux-edac@vger.kernel.org
+Description: This attribute file will display what type of DRAM device is
+ being utilized on this DIMM (x1, x2, x4, x8, ...).
+
+What: /sys/devices/system/edac/mc/mc*/(dimm|rank)*/dimm_edac_mode
+Date: April 2012
+Contact: Mauro Carvalho Chehab <mchehab@redhat.com>
+ linux-edac@vger.kernel.org
+Description: This attribute file will display what type of Error detection
+ and correction is being utilized. For example: S4ECD4ED would
+ mean a Chipkill with x4 DRAM.
+
+What: /sys/devices/system/edac/mc/mc*/(dimm|rank)*/dimm_label
+Date: April 2012
+Contact: Mauro Carvalho Chehab <mchehab@redhat.com>
+ linux-edac@vger.kernel.org
+Description: This control file allows this DIMM to have a label assigned
+ to it. With this label in the module, when errors occur
+ the output can provide the DIMM label in the system log.
+ This becomes vital for panic events to isolate the
+ cause of the UE event.
+ DIMM Labels must be assigned after booting, with information
+ that correctly identifies the physical slot with its
+ silk screen label. This information is currently very
+ motherboard specific and determination of this information
+ must occur in userland at this time.
+
+What: /sys/devices/system/edac/mc/mc*/(dimm|rank)*/dimm_location
+Date: April 2012
+Contact: Mauro Carvalho Chehab <mchehab@redhat.com>
+ linux-edac@vger.kernel.org
+Description: This attribute file will display the location (csrow/channel,
+ branch/channel/slot or channel/slot) of the dimm or rank.
+
+What: /sys/devices/system/edac/mc/mc*/(dimm|rank)*/dimm_mem_type
+Date: April 2012
+Contact: Mauro Carvalho Chehab <mchehab@redhat.com>
+ linux-edac@vger.kernel.org
+Description: This attribute file will display what type of memory is
+ currently on this csrow. Normally, either buffered or
+ unbuffered memory (for example, Unbuffered-DDR3).
diff --git a/Documentation/devicetree/bindings/arm/calxeda/l2ecc.txt b/Documentation/devicetree/bindings/arm/calxeda/l2ecc.txt
new file mode 100644
index 000000000000..94e642a33db0
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/calxeda/l2ecc.txt
@@ -0,0 +1,15 @@
+Calxeda Highbank L2 cache ECC
+
+Properties:
+- compatible : Should be "calxeda,hb-sregs-l2-ecc"
+- reg : Address and size for ECC error interrupt clear registers.
+- interrupts : Should be single bit error interrupt, then double bit error
+ interrupt.
+
+Example:
+
+ sregs@fff3c200 {
+ compatible = "calxeda,hb-sregs-l2-ecc";
+ reg = <0xfff3c200 0x100>;
+ interrupts = <0 71 4 0 72 4>;
+ };
diff --git a/Documentation/devicetree/bindings/arm/calxeda/mem-ctrlr.txt b/Documentation/devicetree/bindings/arm/calxeda/mem-ctrlr.txt
new file mode 100644
index 000000000000..f770ac0893d4
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/calxeda/mem-ctrlr.txt
@@ -0,0 +1,14 @@
+Calxeda DDR memory controller
+
+Properties:
+- compatible : Should be "calxeda,hb-ddr-ctrl"
+- reg : Address and size for DDR controller registers.
+- interrupts : Interrupt for DDR controller.
+
+Example:
+
+ memory-controller@fff00000 {
+ compatible = "calxeda,hb-ddr-ctrl";
+ reg = <0xfff00000 0x1000>;
+ interrupts = <0 91 4>;
+ };
diff --git a/Documentation/edac.txt b/Documentation/edac.txt
index 03df2b020332..56c7e936430f 100644
--- a/Documentation/edac.txt
+++ b/Documentation/edac.txt
@@ -232,116 +232,20 @@ EDAC control and attribute files.
In 'mcX' directories are EDAC control and attribute files for
-this 'X' instance of the memory controllers:
-
-
-Counter reset control file:
-
- 'reset_counters'
-
- This write-only control file will zero all the statistical counters
- for UE and CE errors. Zeroing the counters will also reset the timer
- indicating how long since the last counter zero. This is useful
- for computing errors/time. Since the counters are always reset at
- driver initialization time, no module/kernel parameter is available.
-
- RUN TIME: echo "anything" >/sys/devices/system/edac/mc/mc0/counter_reset
-
- This resets the counters on memory controller 0
-
-
-Seconds since last counter reset control file:
-
- 'seconds_since_reset'
-
- This attribute file displays how many seconds have elapsed since the
- last counter reset. This can be used with the error counters to
- measure error rates.
-
-
-
-Memory Controller name attribute file:
-
- 'mc_name'
-
- This attribute file displays the type of memory controller
- that is being utilized.
-
-
-Total memory managed by this memory controller attribute file:
-
- 'size_mb'
-
- This attribute file displays, in count of megabytes, of memory
- that this instance of memory controller manages.
-
-
-Total Uncorrectable Errors count attribute file:
-
- 'ue_count'
-
- This attribute file displays the total count of uncorrectable
- errors that have occurred on this memory controller. If panic_on_ue
- is set this counter will not have a chance to increment,
- since EDAC will panic the system.
-
-
-Total UE count that had no information attribute fileY:
-
- 'ue_noinfo_count'
-
- This attribute file displays the number of UEs that have occurred
- with no information as to which DIMM slot is having errors.
-
-
-Total Correctable Errors count attribute file:
-
- 'ce_count'
-
- This attribute file displays the total count of correctable
- errors that have occurred on this memory controller. This
- count is very important to examine. CEs provide early
- indications that a DIMM is beginning to fail. This count
- field should be monitored for non-zero values and report
- such information to the system administrator.
-
-
-Total Correctable Errors count attribute file:
-
- 'ce_noinfo_count'
-
- This attribute file displays the number of CEs that
- have occurred wherewith no information as to which DIMM slot
- is having errors. Memory is handicapped, but operational,
- yet no information is available to indicate which slot
- the failing memory is in. This count field should be also
- be monitored for non-zero values.
-
-Device Symlink:
-
- 'device'
-
- Symlink to the memory controller device.
-
-Sdram memory scrubbing rate:
-
- 'sdram_scrub_rate'
-
- Read/Write attribute file that controls memory scrubbing. The scrubbing
- rate is set by writing a minimum bandwidth in bytes/sec to the attribute
- file. The rate will be translated to an internal value that gives at
- least the specified rate.
-
- Reading the file will return the actual scrubbing rate employed.
-
- If configuration fails or memory scrubbing is not implemented, accessing
- that attribute will fail.
+this 'X' instance of the memory controllers.
+For a description of the sysfs API, please see:
+ Documentation/ABI/testing/sysfs/devices-edac
============================================================================
'csrowX' DIRECTORIES
+When CONFIG_EDAC_LEGACY_SYSFS is enabled, the sysfs will contain the
+csrowX directories. As this API doesn't work properly for Rambus, FB-DIMMs
+and modern Intel Memory Controllers, this is being deprecated in favor
+of dimmX directories.
+
In the 'csrowX' directories are EDAC control and attribute files for
this 'X' instance of csrow: