<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/arch/x86/include/asm/mce.h, branch docs-5.10-3</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-5.10-3</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-5.10-3'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2020-10-07T09:19:11+00:00</updated>
<entry>
<title>x86/mce: Add _ASM_EXTABLE_CPY for copy user access</title>
<updated>2020-10-07T09:19:11+00:00</updated>
<author>
<name>Youquan Song</name>
<email>youquan.song@intel.com</email>
</author>
<published>2020-10-06T21:09:07+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=278b917f8cb9b02923c15249f9d1a5769d2c1976'/>
<id>urn:sha1:278b917f8cb9b02923c15249f9d1a5769d2c1976</id>
<content type='text'>
_ASM_EXTABLE_UA is a general exception entry to record the exception fixup
for all exception spots between kernel and user space access.

To enable recovery from machine checks while coping data from user
addresses it is necessary to be able to distinguish the places that are
looping copying data from those that copy a single byte/word/etc.

Add a new macro _ASM_EXTABLE_CPY and use it in place of _ASM_EXTABLE_UA
in the copy functions.

Record the exception reason number to regs-&gt;ax at
ex_handler_uaccess which is used to check MCE triggered.

The new fixup routine ex_handler_copy() is almost an exact copy of
ex_handler_uaccess() The difference is that it sets regs-&gt;ax to the trap
number. Following patches use this to avoid trying to copy remaining
bytes from the tail of the copy and possibly hitting the poison again.

New mce.kflags bit MCE_IN_KERNEL_COPYIN will be used by mce_severity()
calculation to indicate that a machine check is recoverable because the
kernel was copying from user space.

Signed-off-by: Youquan Song &lt;youquan.song@intel.com&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20201006210910.21062-4-tony.luck@intel.com
</content>
</entry>
<entry>
<title>x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()</title>
<updated>2020-10-06T09:18:04+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2020-10-06T03:40:16+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=ec6347bb43395cb92126788a1a5b25302543f815'/>
<id>urn:sha1:ec6347bb43395cb92126788a1a5b25302543f815</id>
<content type='text'>
In reaction to a proposal to introduce a memcpy_mcsafe_fast()
implementation Linus points out that memcpy_mcsafe() is poorly named
relative to communicating the scope of the interface. Specifically what
addresses are valid to pass as source, destination, and what faults /
exceptions are handled.

Of particular concern is that even though x86 might be able to handle
the semantics of copy_mc_to_user() with its common copy_user_generic()
implementation other archs likely need / want an explicit path for this
case:

  On Fri, May 1, 2020 at 11:28 AM Linus Torvalds &lt;torvalds@linux-foundation.org&gt; wrote:
  &gt;
  &gt; On Thu, Apr 30, 2020 at 6:21 PM Dan Williams &lt;dan.j.williams@intel.com&gt; wrote:
  &gt; &gt;
  &gt; &gt; However now I see that copy_user_generic() works for the wrong reason.
  &gt; &gt; It works because the exception on the source address due to poison
  &gt; &gt; looks no different than a write fault on the user address to the
  &gt; &gt; caller, it's still just a short copy. So it makes copy_to_user() work
  &gt; &gt; for the wrong reason relative to the name.
  &gt;
  &gt; Right.
  &gt;
  &gt; And it won't work that way on other architectures. On x86, we have a
  &gt; generic function that can take faults on either side, and we use it
  &gt; for both cases (and for the "in_user" case too), but that's an
  &gt; artifact of the architecture oddity.
  &gt;
  &gt; In fact, it's probably wrong even on x86 - because it can hide bugs -
  &gt; but writing those things is painful enough that everybody prefers
  &gt; having just one function.

Replace a single top-level memcpy_mcsafe() with either
copy_mc_to_user(), or copy_mc_to_kernel().

Introduce an x86 copy_mc_fragile() name as the rename for the
low-level x86 implementation formerly named memcpy_mcsafe(). It is used
as the slow / careful backend that is supplanted by a fast
copy_mc_generic() in a follow-on patch.

One side-effect of this reorganization is that separating copy_mc_64.S
to its own file means that perf no longer needs to track dependencies
for its memcpy_64.S benchmarks.

 [ bp: Massage a bit. ]

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
</content>
</entry>
<entry>
<title>x86/mce: Increase maximum number of banks to 64</title>
<updated>2020-09-04T15:17:27+00:00</updated>
<author>
<name>Akshay Gupta</name>
<email>Akshay.Gupta@amd.com</email>
</author>
<published>2020-08-28T19:24:12+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=a0bc32b3cacf194dc479b342f006203fd1e1941a'/>
<id>urn:sha1:a0bc32b3cacf194dc479b342f006203fd1e1941a</id>
<content type='text'>
...because future AMD systems will support up to 64 MCA banks per CPU.

MAX_NR_BANKS is used to allocate a number of data structures, and it is
used as a ceiling for values read from MCG_CAP[Count]. Therefore, this
change will have no functional effect on existing systems with 32 or
fewer MCA banks per CPU.

However, this will increase the size of the following structures:

Global bitmaps:
- core.c / mce_banks_ce_disabled
- core.c / all_banks
- core.c / valid_banks
- core.c / toclear
- Total: 32 new bits * 4 bitmaps = 16 new bytes

Per-CPU bitmaps:
- core.c / mce_poll_banks
- intel.c / mce_banks_owned
- Total: 32 new bits * 2 bitmaps = 8 new bytes

The bitmaps are arrays of longs. So this change will only affect 32-bit
execution, since there will be one additional long used. There will be
no additional memory use on 64-bit execution, because the size of long
is 64 bits.

Global structs:
- amd.c / struct smca_bank smca_banks[]: 16 bytes per bank
- core.c / struct mce_bank_dev mce_bank_devs[]: 56 bytes per bank
- Total: 32 new banks * (16 + 56) bytes = 2304 new bytes

Per-CPU structs:
- core.c / struct mce_bank mce_banks_array[]: 16 bytes per bank
- Total: 32 new banks * 16 bytes = 512 new bytes

32-bit
Total global size increase: 2320 bytes
Total per-CPU size increase: 520 bytes

64-bit
Total global size increase: 2304 bytes
Total per-CPU size increase: 512 bytes

This additional memory should still fit within the existing .data
section of the kernel binary. However, in the case where it doesn't
fit, an additional page (4kB) of memory will be added to the binary to
accommodate the extra data which will be the maximum size increase of
vmlinux.

Signed-off-by: Akshay Gupta &lt;Akshay.Gupta@amd.com&gt;
[ Adjust commit message and code comment. ]
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20200828192412.320052-1-Yazen.Ghannam@amd.com
</content>
</entry>
<entry>
<title>x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap</title>
<updated>2020-08-20T08:34:38+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2020-07-20T14:53:53+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=368d1887200d68075c064a62a9aa191168cf1eed'/>
<id>urn:sha1:368d1887200d68075c064a62a9aa191168cf1eed</id>
<content type='text'>
The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type
was intended to be used by the kernel to filter out invalid error codes
on a system. However, this is unnecessary after a few product releases
because the hardware will only report valid error codes. Thus, there's
no need for it with future systems.

Remove the xec_bitmap field and all references to it.

Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20200720145353.43924-1-Yazen.Ghannam@amd.com
</content>
</entry>
<entry>
<title>Merge branch 'x86/entry' into ras/core</title>
<updated>2020-06-11T13:17:57+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2020-06-11T13:17:57+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=f77d26a9fc525286bcef3d4f98b52e17482cf49c'/>
<id>urn:sha1:f77d26a9fc525286bcef3d4f98b52e17482cf49c</id>
<content type='text'>
to fixup conflicts in arch/x86/kernel/cpu/mce/core.c so MCE specific follow
up patches can be applied without creating a horrible merge conflict
afterwards.
</content>
</entry>
<entry>
<title>x86/entry: Convert Machine Check to IDTENTRY_IST</title>
<updated>2020-06-11T13:14:57+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2020-02-25T22:33:23+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=8cd501c1facc159dff6db63775151c9200a3ea1e'/>
<id>urn:sha1:8cd501c1facc159dff6db63775151c9200a3ea1e</id>
<content type='text'>
Convert #MC to IDTENTRY_MCE:
  - Implement the C entry points with DEFINE_IDTENTRY_MCE
  - Emit the ASM stub with DECLARE_IDTENTRY_MCE
  - Remove the ASM idtentry in 64bit
  - Remove the open coded ASM entry code in 32bit
  - Fixup the XEN/PV code
  - Remove the old prototypes
  - Remove the error code from *machine_check_vector() as
    it is always 0 and not used by any of the functions
    it can point to. Fixup all the functions as well.

No functional change.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Alexandre Chartre &lt;alexandre.chartre@oracle.com&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Acked-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Link: https://lkml.kernel.org/r/20200505135314.334980426@linutronix.de




</content>
</entry>
<entry>
<title>x86/mce: Fixup exception only for the correct MCEs</title>
<updated>2020-04-14T14:01:49+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2020-04-07T11:49:58+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=1df73b2131e3b33d518609769636b41ce00212de'/>
<id>urn:sha1:1df73b2131e3b33d518609769636b41ce00212de</id>
<content type='text'>
The severity grading code returns IN_KERNEL_RECOV error context for
errors which have happened in kernel space but from which the kernel can
recover. Whether the recovery can happen is determined by the exception
table entry having as handler ex_handler_fault() and which has been
declared at build time using _ASM_EXTABLE_FAULT().

IN_KERNEL_RECOV is used in mce_severity_intel() to lookup the
corresponding error severity in the severities table.

However, the mapping back from error severity to whether the error is
IN_KERNEL_RECOV is ambiguous and in the very paranoid case - which
might not be possible right now - but be better safe than sorry later,
an exception fixup could be attempted for another MCE whose address
is in the exception table and has the proper severity. Which would be
unfortunate, to say the least.

Therefore, mark such MCEs explicitly as MCE_IN_KERNEL_RECOV so that the
recovery attempt is done only for them.

Document the whole handling, while at it, as it is not trivial.

Reported-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Tested-by: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20200407163414.18058-10-bp@alien8.de
</content>
</entry>
<entry>
<title>x86/mce: Add a struct mce.kflags field</title>
<updated>2020-04-14T13:58:43+00:00</updated>
<author>
<name>Tony Luck</name>
<email>tony.luck@intel.com</email>
</author>
<published>2020-02-14T22:27:16+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=1de08dccd383482a3e88845d3554094d338f5ff9'/>
<id>urn:sha1:1de08dccd383482a3e88845d3554094d338f5ff9</id>
<content type='text'>
There can be many different subsystems register on the mce handler
chain. Add a new bitmask field and define values so that handlers can
indicate whether they took any action to log or otherwise handle an
error.

The default handler at the end of the chain can use this information to
decide whether to print to the console log.

Boris suggested a generic name and leaving plenty of spare bits for
possible future use.

 [ bp: Move flag bits to the internal mce.h header and use BIT_ULL(). ]

Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Tested-by: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20200214222720.13168-4-tony.luck@intel.com
</content>
</entry>
<entry>
<title>x86/mce: Rename "first" function as "early"</title>
<updated>2020-04-14T13:55:01+00:00</updated>
<author>
<name>Tony Luck</name>
<email>tony.luck@intel.com</email>
</author>
<published>2020-02-14T22:27:14+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=c9c6d216ed28be6e2c91e3651af169eca284813a'/>
<id>urn:sha1:c9c6d216ed28be6e2c91e3651af169eca284813a</id>
<content type='text'>
It isn't going to be first on the notifier chain when the CEC is moved
to be a normal user of the notifier chain.

Fix the enum for the MCE_PRIO symbols to list them in reverse order so
that the compiler can give them numbers from low to high priority. Add
an entry for MCE_PRIO_CEC as the highest priority.

 [ bp: Use passive voice, add comments. ]

Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Tested-by: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20200214222720.13168-2-tony.luck@intel.com
</content>
</entry>
<entry>
<title>x86/mce/amd, edac: Remove report_gart_errors</title>
<updated>2020-04-14T13:53:46+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2020-04-07T07:55:10+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=3e0fdec858d82c829774f271e88b5ceb17051551'/>
<id>urn:sha1:3e0fdec858d82c829774f271e88b5ceb17051551</id>
<content type='text'>
... because no one should be interested in spurious MCEs anyway. Make
the filtering unconditional and move it to amd_filter_mce().

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Tested-by: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20200407163414.18058-2-bp@alien8.de
</content>
</entry>
</feed>
