summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-07-04crypto: drbg - fix memory corruption for AES192Stephan Mueller
For the CTR DRBG, the drbg_state->scratchpad temp buffer (i.e. the memory location immediately before the drbg_state->tfm variable is the buffer that the BCC function operates on. BCC operates blockwise. Making the temp buffer drbg_statelen(drbg) in size is sufficient when the DRBG state length is a multiple of the block size. For AES192 this is not the case and the length for temp is insufficient (yes, that also means for such ciphers, the final output of all BCC rounds are truncated before used to update the state of the DRBG!!). The patch enlarges the temp buffer from drbg_statelen to drbg_statelen + drbg_blocklen to have sufficient space. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03crypto: ux500 - make interrupt mode plausibleArnd Bergmann
The interrupt handler in the ux500 crypto driver has an obviously incorrect way to access the data buffer, which for a while has caused this build warning: ../ux500/cryp/cryp_core.c: In function 'cryp_interrupt_handler': ../ux500/cryp/cryp_core.c:234:5: warning: passing argument 1 of '__fswab32' makes integer from pointer without a cast [enabled by default] writel_relaxed(ctx->indata, ^ In file included from ../include/linux/swab.h:4:0, from ../include/uapi/linux/byteorder/big_endian.h:12, from ../include/linux/byteorder/big_endian.h:4, from ../arch/arm/include/uapi/asm/byteorder.h:19, from ../include/asm-generic/bitops/le.h:5, from ../arch/arm/include/asm/bitops.h:340, from ../include/linux/bitops.h:33, from ../include/linux/kernel.h:10, from ../include/linux/clk.h:16, from ../drivers/crypto/ux500/cryp/cryp_core.c:12: ../include/uapi/linux/swab.h:57:119: note: expected '__u32' but argument is of type 'const u8 *' static inline __attribute_const__ __u32 __fswab32(__u32 val) There are at least two, possibly three problems here: a) when writing into the FIFO, we copy the pointer rather than the actual data we want to give to the hardware b) the data pointer is an array of 8-bit values, while the FIFO is 32-bit wide, so both the read and write access fail to do a proper type conversion c) This seems incorrect for big-endian kernels, on which we need to byte-swap any register access, but not normally FIFO accesses, at least the DMA case doesn't do it either. This converts the bogus loop to use the same readsl/writesl pair that we use for the two other modes (DMA and polling). This is more efficient and consistent, and probably correct for endianess. The bug has existed since the driver was first merged, and was probably never detected because nobody tried to use interrupt mode. It might make sense to backport this fix to stable kernels, depending on how the crypto maintainers feel about that. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: linux-crypto@vger.kernel.org Cc: Fabio Baltieri <fabio.baltieri@linaro.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03crypto: tcrypt - print cra driver name in tcrypt tests outputLuca Clementi
Print the driver name that is being tested. The driver name can be inferred parsing /proc/crypto but having it in the output is clearer Signed-off-by: Luca Clementi <luca.clementi@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03ARM: DT: qcom: Add Qualcomm crypto driver binding documentStanimir Varbanov
Here is Qualcomm crypto driver device tree binding documentation to used as a reference example. Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03crypto: qce - Build Qualcomm crypto driverStanimir Varbanov
Modify crypto Kconfig and Makefile in order to build the qce driver and adds qce Makefile as well. Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03crypto: qce - Qualcomm crypto engine driverStanimir Varbanov
The driver is separated by functional parts. The core part implements a platform driver probe and remove callbaks. The probe enables clocks, checks crypto version, initialize and request dma channels, create done tasklet and init crypto queue and finally register the algorithms into crypto core subsystem. - DMA and SG helper functions implement dmaengine and sg-list helper functions used by other parts of the crypto driver. - ablkcipher algorithms implementation of AES, DES and 3DES crypto API callbacks, the crypto register alg function, the async request handler and its dma done callback function. - SHA and HMAC transforms implementation and registration of ahash crypto type. It includes sha1, sha256, hmac(sha1) and hmac(sha256). - infrastructure to setup the crypto hw contains functions used to setup/prepare hardware registers for all algorithms supported by the crypto block. It also exports few helper functions needed by algorithms: - to check hardware status - to start crypto hardware - to translate data stream to big endian form Adds register addresses and bit/masks used by the driver as well. Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03crypto: fips - only panic on bad/missing crypto mod signaturesJarod Wilson
Per further discussion with NIST, the requirements for FIPS state that we only need to panic the system on failed kernel module signature checks for crypto subsystem modules. This moves the fips-mode-only module signature check out of the generic module loading code, into the crypto subsystem, at points where we can catch both algorithm module loads and mode module loads. At the same time, make CONFIG_CRYPTO_FIPS dependent on CONFIG_MODULE_SIG, as this is entirely necessary for FIPS mode. v2: remove extraneous blank line, perform checks in static inline function, drop no longer necessary fips.h include. CC: "David S. Miller" <davem@davemloft.net> CC: Rusty Russell <rusty@rustcorp.com.au> CC: Stephan Mueller <stephan.mueller@atsec.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-26crypto: qat - Fix error path crash when no firmware is presentTadeusz Struk
Firmware loader crashes when no firmware file is present. Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-26crypto: qat - Fixed new checkpatch warningsTadeusz Struk
After updates to checkpatch new warnings pops up this patch fixes them. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Acked-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-26crypto: qat - Updated Firmware Info MetadataTadeusz Struk
Updated Firmware Info Metadata Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-26crypto: qat - Fix random config build warningsTadeusz Struk
Fix random config build warnings: Implicit-function-declaration ‘__raw_writel’ Cast to pointer from integer of different size [-Wint-to-pointer-cast] Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-26crypto: drbg - simplify ordering of linked list in drbg_ctr_dfStephan Mueller
As reported by a static code analyzer, the code for the ordering of the linked list can be simplified. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25crypto: lzo - use kvfree() helperEric Dumazet
kvfree() helper is now available, use it instead of open code it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25crypto: des3_ede-x86_64 - fix parse warningJussi Kivilinna
Patch fixes following sparse warning: CHECK arch/x86/crypto/des3_ede_glue.c arch/x86/crypto/des3_ede_glue.c:308:52: warning: restricted __be64 degrades to integer arch/x86/crypto/des3_ede_glue.c:309:52: warning: restricted __be64 degrades to integer arch/x86/crypto/des3_ede_glue.c:310:52: warning: restricted __be64 degrades to integer arch/x86/crypto/des3_ede_glue.c:326:44: warning: restricted __be64 degrades to integer Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25crypto: caam - Correct the dma mapping for sg tableRuchika Gupta
At few places in caamhash and caamalg, after allocating a dmable buffer for sg table , the buffer was being modified. As per definition of DMA_FROM_DEVICE ,afer allocation the memory should be treated as read-only by the driver. This patch shifts the allocation of dmable buffer for sg table after it is populated by the driver, making it read-only as per the DMA API's requirement. Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25crypto: caam - Add definition of rd/wr_reg64 for little endian platformRuchika Gupta
CAAM IP has certain 64 bit registers . 32 bit architectures cannot force atomic-64 operations. This patch adds definition of these atomic-64 operations for little endian platforms. The definitions which existed previously were for big endian platforms. Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25crypto: caam - Configuration for platforms with virtualization enabled in CAAMRuchika Gupta
For platforms with virtualization enabled 1. The job ring registers can be written to only is the job ring has been started i.e STARTR bit in JRSTART register is 1 2. For DECO's under direct software control, with virtualization enabled PL, BMT, ICID and SDID values need to be provided. These are provided by selecting a Job ring in start mode whose parameters would be used for the DECO access programming. Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25crypto: caam - Correct definition of registers in memory mapRuchika Gupta
Some registers like SECVID, CHAVID, CHA Revision Number, CTPR were defined as 64 bit resgisters. The IP provides a DWT bit(Double word Transpose) to transpose the two words when a double word register is accessed. However setting this bit would also affect the operation of job descriptors as well as other registers which are truly double word in nature. So, for the IP to work correctly on big-endian as well as little-endian SoC's, change is required to access all 32 bit registers as 32 bit quantities. Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-23crypto: qat - Fix build problem with O=Herbert Xu
qat adds -I to the ccflags. Unfortunately it uses CURDIR which breaks when make is invoked with O=. This patch replaces CURDIR with $(src) which should work with/without O=. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-21crypto: testmgr - add 4 more test vectors for GHASHArd Biesheuvel
This adds 4 test vectors for GHASH (of which one for chunked mode), making a total of 5. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: aes - AES CTR x86_64 "by8" AVX optimizationchandramouli narayanan
This patch introduces "by8" AES CTR mode AVX optimization inspired by Intel Optimized IPSEC Cryptograhpic library. For additional information, please see: http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=22972 The functions aes_ctr_enc_128_avx_by8(), aes_ctr_enc_192_avx_by8() and aes_ctr_enc_256_avx_by8() are adapted from Intel Optimized IPSEC Cryptographic library. When both AES and AVX features are enabled in a platform, the glue code in AESNI module overrieds the existing "by4" CTR mode en/decryption with the "by8" AES CTR mode en/decryption. On a Haswell desktop, with turbo disabled and all cpus running at maximum frequency, the "by8" CTR mode optimization shows better performance results across data & key sizes as measured by tcrypt. The average performance improvement of the "by8" version over the "by4" version is as follows: For 128 bit key and data sizes >= 256 bytes, there is a 10-16% improvement. For 192 bit key and data sizes >= 256 bytes, there is a 20-22% improvement. For 256 bit key and data sizes >= 256 bytes, there is a 20-25% improvement. A typical run of tcrypt with AES CTR mode encryption of the "by4" and "by8" optimization shows the following results: tcrypt with "by4" AES CTR mode encryption optimization on a Haswell Desktop: --------------------------------------------------------------------------- testing speed of __ctr-aes-aesni encryption test 0 (128 bit key, 16 byte blocks): 1 operation in 343 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 336 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 491 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1130 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 7309 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 346 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 361 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 543 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1321 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 9649 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 369 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 366 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 595 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1531 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 10522 cycles (8192 bytes) testing speed of __ctr-aes-aesni decryption test 0 (128 bit key, 16 byte blocks): 1 operation in 336 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 350 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 487 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1129 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 7287 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 350 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 359 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 635 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1324 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 9595 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 364 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 377 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 604 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1527 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 10549 cycles (8192 bytes) tcrypt with "by8" AES CTR mode encryption optimization on a Haswell Desktop: --------------------------------------------------------------------------- testing speed of __ctr-aes-aesni encryption test 0 (128 bit key, 16 byte blocks): 1 operation in 340 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 330 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 450 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1043 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 6597 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 339 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 352 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 539 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1153 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 8458 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 353 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 360 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 512 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1277 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 8745 cycles (8192 bytes) testing speed of __ctr-aes-aesni decryption test 0 (128 bit key, 16 byte blocks): 1 operation in 348 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 335 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 451 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1030 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 6611 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 354 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 346 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 488 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1154 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 8390 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 357 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 362 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 515 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1284 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 8681 cycles (8192 bytes) crypto: Incorporate feed back to AES CTR mode optimization patch Specifically, the following: a) alignment around main loop in aes_ctrby8_avx_x86_64.S b) .rodata around data constants used in the assembely code. c) the use of CONFIG_AVX in the glue code. d) fix up white space. e) informational message for "by8" AES CTR mode optimization f) "by8" AES CTR mode optimization can be simply enabled if the platform supports both AES and AVX features. The optimization works superbly on Sandybridge as well. Testing on Haswell shows no performance change since the last. Testing on Sandybridge shows that the "by8" AES CTR mode optimization greatly improves performance. tcrypt log with "by4" AES CTR mode optimization on Sandybridge -------------------------------------------------------------- testing speed of __ctr-aes-aesni encryption test 0 (128 bit key, 16 byte blocks): 1 operation in 383 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 408 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 707 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1864 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 12813 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 395 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 432 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 780 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 2132 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 15765 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 416 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 438 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 842 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 2383 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 16945 cycles (8192 bytes) testing speed of __ctr-aes-aesni decryption test 0 (128 bit key, 16 byte blocks): 1 operation in 389 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 409 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 704 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1865 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 12783 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 409 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 434 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 792 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 2151 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 15804 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 421 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 444 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 840 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 2394 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 16928 cycles (8192 bytes) tcrypt log with "by8" AES CTR mode optimization on Sandybridge -------------------------------------------------------------- testing speed of __ctr-aes-aesni encryption test 0 (128 bit key, 16 byte blocks): 1 operation in 383 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 401 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 522 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1136 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 7046 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 394 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 418 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 559 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1263 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 9072 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 408 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 428 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 595 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1385 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 9224 cycles (8192 bytes) testing speed of __ctr-aes-aesni decryption test 0 (128 bit key, 16 byte blocks): 1 operation in 390 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 402 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 530 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1135 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 7079 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 414 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 417 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 572 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1312 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 9073 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 415 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 454 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 598 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1407 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 9288 cycles (8192 bytes) crypto: Fix redundant checks a) Fix the redundant check for cpu_has_aes b) Fix the key length check when invoking the CTR mode "by8" encryptor/decryptor. crypto: fix typo in AES ctr mode transform Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com> Reviewed-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: des_3des - add x86-64 assembly implementationJussi Kivilinna
Patch adds x86_64 assembly implementation of Triple DES EDE cipher algorithm. Two assembly implementations are provided. First is regular 'one-block at time' encrypt/decrypt function. Second is 'three-blocks at time' function that gains performance increase on out-of-order CPUs. tcrypt test results: Intel Core i5-4570: des3_ede-asm vs des3_ede-generic: size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec 16B 1.21x 1.22x 1.27x 1.36x 1.25x 1.25x 64B 1.98x 1.96x 1.23x 2.04x 2.01x 2.00x 256B 2.34x 2.37x 1.21x 2.40x 2.38x 2.39x 1024B 2.50x 2.47x 1.22x 2.51x 2.52x 2.51x 8192B 2.51x 2.53x 1.21x 2.56x 2.54x 2.55x Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: tcrypt - add ctr(des3_ede) sync speed testJussi Kivilinna
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: caam - remove duplicate FIFOST_CONT_MASK defineDan Carpenter
The FIFOST_CONT_MASK define is cut and pasted twice so we can delete the second instance. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Kim Phillips <kim.phillips@freescale.com> Acked-by: Marek Vasut <marex@denx.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: crc32c-pclmul - Shrink K_table to 32-bit wordsGeorge Spelvin
There's no need for the K_table to be made of 64-bit words. For some reason, the original authors didn't fully reduce the values modulo the CRC32C polynomial, and so had some 33-bit values in there. They can all be reduced to 32 bits. Doing that cuts the table size in half. Since the code depends on both pclmulq and crc32, SSE 4.1 is obviously present, so we can use pmovzxdq to fetch it in the correct format. This adds (measured on Ivy Bridge) 1 cycle per main loop iteration (CRC of up to 3K bytes), less than 0.2%. The hope is that the reduced D-cache footprint will make up the loss in other code. Two other related fixes: * K_table is read-only, so belongs in .rodata, and * There's no need for more than 8-byte alignment Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: George Spelvin <linux@horizon.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: qat - Update to makefilesTadeusz Struk
Update to makefiles etc. Don't update the firmware/Makefile yet since there is no FW binary in the crypto repo yet. This will be added later. v3 - removed change to ./firmware/Makefile Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: qat - Intel(R) QAT DH895xcc acceleratorTadeusz Struk
This patch adds DH895xCC hardware specific code. It hooks to the common infrastructure and provides acceleration for crypto algorithms. Acked-by: John Griffin <john.griffin@intel.com> Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: qat - Intel(R) QAT accelengine part of fw loaderTadeusz Struk
This patch adds acceleration engine handler part the firmware loader. Acked-by: Bo Cui <bo.cui@intel.com> Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com> Signed-off-by: Karen Xiang <karen.xiang@intel.com> Signed-off-by: Pingchaox Yang <pingchaox.yang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: qat - Intel(R) QAT ucode part of fw loaderTadeusz Struk
This patch adds microcode part of the firmware loader. v4 - splits FW loader part into two smaller patches. Acked-by: Bo Cui <bo.cui@intel.com> Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com> Signed-off-by: Karen Xiang <karen.xiang@intel.com> Signed-off-by: Pingchaox Yang <pingchaox.yang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: qat - Intel(R) QAT crypto interfaceTadeusz Struk
This patch adds qat crypto interface. Acked-by: John Griffin <john.griffin@intel.com> Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: qat - Intel(R) QAT FW interfaceTadeusz Struk
This patch adds FW interface structure definitions. Acked-by: John Griffin <john.griffin@intel.com> Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: qat - Intel(R) QAT transport codeTadeusz Struk
This patch adds a code that implements communication channel between the driver and the firmware. Acked-by: John Griffin <john.griffin@intel.com> Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: qat - Intel(R) QAT driver frameworkTadeusz Struk
This patch adds a common infractructure that will be used by all Intel(R) QuickAssist Technology (QAT) devices. v2 - added ./drivers/crypto/qat/Kconfig and ./drivers/crypto/qat/Makefile v4 - splits common part into more, smaller patches Acked-by: John Griffin <john.griffin@intel.com> Reviewed-by: Bruce W. Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: ccp - Add platform device support for arm64Tom Lendacky
Add support for the CCP on arm64 as a platform device. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: ccp - CCP device bindings documentationTom Lendacky
This patch provides the documentation of the device bindings for the AMD Cryptographic Coprocessor driver. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: ccp - Modify PCI support in prep for arm64 supportTom Lendacky
Modify the PCI device support in prep for supporting the CCP as a platform device for arm64. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: drbg - Add DRBG test code to testmgrStephan Mueller
The DRBG test code implements the CAVS test approach. As discussed for the test vectors, all DRBG types are covered with testing. However, not every backend cipher is covered with testing. To prevent the testmgr from logging missing testing, the NULL test is registered for all backend ciphers not covered with specific test cases. All currently implemented DRBG types and backend ciphers are defined in SP800-90A. Therefore, the fips_allowed flag is set for all. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: drbg - DRBG testmgr test vectorsStephan Mueller
All types of the DRBG (CTR, HMAC, Hash) are covered with test vectors. In addition, all permutations of use cases of the DRBG are covered: * with and without predition resistance * with and without additional information string * with and without personalization string As the DRBG implementation is agnositc of the specific backend cipher, only test vectors for one specific backend cipher is used. For example: the Hash DRBG uses the same code paths irrespectively of using SHA-256 or SHA-512. Thus, the test vectors for SHA-256 cover the testing of all DRBG code paths of SHA-512. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: drbg - compile the DRBG codeStephan Mueller
Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: drbg - DRBG kernel configuration optionsStephan Mueller
The different DRBG types of CTR, Hash, HMAC can be enabled or disabled at compile time. At least one DRBG type shall be selected. The default is the HMAC DRBG as its code base is smallest. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: drbg - header file for DRBGStephan Mueller
The header file includes the definition of: * DRBG data structures with - struct drbg_state as main structure - struct drbg_core referencing the backend ciphers - struct drbg_state_ops callbach handlers for specific code supporting the Hash, HMAC, CTR DRBG implementations - struct drbg_conc defining a linked list for input data - struct drbg_test_data holding the test "entropy" data for CAVS testing and testmgr.c - struct drbg_gen allowing test data, additional information string and personalization string data to be funneled through the kernel crypto API -- the DRBG requires additional parameters when invoking the reset and random number generation requests than intended by the kernel crypto API * wrapper function to the kernel crypto API functions using struct drbg_gen to pass through all data needed for DRBG * wrapper functions to kernel crypto API functions usable for testing code to inject test_data into the DRBG as needed by CAVS testing and testmgr.c. * DRBG flags required for the operation of the DRBG and for selecting the particular DRBG type and backend cipher * getter functions for data from struct drbg_core Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: drbg - SP800-90A Deterministic Random Bit GeneratorStephan Mueller
This is a clean-room implementation of the DRBG defined in SP800-90A. All three viable DRBGs defined in the standard are implemented: * HMAC: This is the leanest DRBG and compiled per default * Hash: The more complex DRBG can be enabled at compile time * CTR: The most complex DRBG can also be enabled at compile time The DRBG implementation offers the following: * All three DRBG types are implemented with a derivation function. * All DRBG types are available with and without prediction resistance. * All SHA types of SHA-1, SHA-256, SHA-384, SHA-512 are available for the HMAC and Hash DRBGs. * All AES types of AES-128, AES-192 and AES-256 are available for the CTR DRBG. * A self test is implemented with drbg_healthcheck(). * The FIPS 140-2 continuous self test is implemented. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: drivers - Add 2 missing __exit_pJean Delvare
References to __exit functions must be wrapped with __exit_p. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Cc: Marcelo Henrique Cerri <mhcerri@linux.vnet.ibm.com> Cc: Fionnuala Gunter <fin@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: caam - Introduce the use of the managed version of kzallocHimangi Saraogi
This patch moves data allocated using kzalloc to managed data allocated using devm_kzalloc and cleans now unnecessary kfrees in probe and remove functions. Also, linux/device.h is added to make sure the devm_*() routine declarations are unambiguously available. Earlier, in the probe function ctrlpriv was leaked on the failure of ctrl = of_iomap(nprop, 0); as well as on the failure of ctrlpriv->jrpdev = kzalloc(...); . These two bugs have been fixed by the patch. The following Coccinelle semantic patch was used for making the change: identifier p, probefn, removefn; @@ struct platform_driver p = { .probe = probefn, .remove = removefn, }; @prb@ identifier platform.probefn, pdev; expression e, e1, e2; @@ probefn(struct platform_device *pdev, ...) { <+... - e = kzalloc(e1, e2) + e = devm_kzalloc(&pdev->dev, e1, e2) ... ?-kfree(e); ...+> } @rem depends on prb@ identifier platform.removefn; expression e; @@ removefn(...) { <... - kfree(e); ...> } Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: lzo - try kmalloc() before vmalloc()Eric Dumazet
zswap allocates one LZO context per online cpu. Using vmalloc() for small (16KB) memory areas has drawback of slowing down /proc/vmallocinfo and /proc/meminfo reads, TLB pressure and poor NUMA locality, as default NUMA policy at boot time is to interleave pages : edumazet:~# grep lzo /proc/vmallocinfo | head -4 0xffffc90006062000-0xffffc90006067000 20480 lzo_init+0x1b/0x30 pages=4 vmalloc N0=2 N1=2 0xffffc90006067000-0xffffc9000606c000 20480 lzo_init+0x1b/0x30 pages=4 vmalloc N0=2 N1=2 0xffffc9000606c000-0xffffc90006071000 20480 lzo_init+0x1b/0x30 pages=4 vmalloc N0=2 N1=2 0xffffc90006071000-0xffffc90006076000 20480 lzo_init+0x1b/0x30 pages=4 vmalloc N0=2 N1=2 This patch tries a regular kmalloc() and fallback to vmalloc in case memory is too fragmented. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: skcipher - Don't use __crypto_dequeue_request()Marek Vasut
Use skcipher_givcrypt_cast(crypto_dequeue_request(queue)) instead, which does the same thing in much cleaner way. The skcipher_givcrypt_cast() actually uses container_of() instead of messing around with offsetof() too. Signed-off-by: Marek Vasut <marex@denx.de> Reported-by: Arnd Bergmann <arnd@arndb.de> Cc: Pantelis Antoniou <panto@antoniou-consulting.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20crypto: api - Move crypto_yield() to algapi.hMarek Vasut
It makes no sense for crypto_yield() to be defined in scatterwalk.h , move it into algapi.h as it's an internal function to crypto API. Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-15Linux 3.16-rc1v3.16-rc1Linus Torvalds
2014-06-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix checksumming regressions, from Tom Herbert. 2) Undo unintentional permissions changes for SCTP rto_alpha and rto_beta sysfs knobs, from Denial Borkmann. 3) VXLAN, like other IP tunnels, should advertize it's encapsulation size using dev->needed_headroom instead of dev->hard_header_len. From Cong Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net: sctp: fix permissions for rto_alpha and rto_beta knobs vxlan: Checksum fixes net: add skb_pop_rcv_encapsulation udp: call __skb_checksum_complete when doing full checksum net: Fix save software checksum complete net: Fix GSO constants to match NETIF flags udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup vxlan: use dev->needed_headroom instead of dev->hard_header_len MAINTAINERS: update cxgb4 maintainer
2014-06-15Merge tag 'clk-for-linus-3.16-part2' of ↵Linus Torvalds
git://git.linaro.org/people/mike.turquette/linux Pull more clock framework updates from Mike Turquette: "This contains the second half the of the clk changes for 3.16. They are simply fixes and code refactoring for the OMAP clock drivers. The sunxi clock driver changes include splitting out the one mega-driver into several smaller pieces and adding support for the A31 SoC clocks" * tag 'clk-for-linus-3.16-part2' of git://git.linaro.org/people/mike.turquette/linux: (25 commits) clk: sunxi: document PRCM clock compatible strings clk: sunxi: add PRCM (Power/Reset/Clock Management) clks support clk: sun6i: Protect SDRAM gating bit clk: sun6i: Protect CPU clock clk: sunxi: Rework clock protection code clk: sunxi: Move the GMAC clock to a file of its own clk: sunxi: Move the 24M oscillator to a file of its own clk: sunxi: Remove calls to clk_put clk: sunxi: document new A31 USB clock compatible clk: sunxi: Implement A31 USB clock ARM: dts: OMAP5/DRA7: use omap5-mpu-dpll-clock capable of dealing with higher frequencies CLK: TI: dpll: support OMAP5 MPU DPLL that need special handling for higher frequencies ARM: OMAP5+: dpll: support Duty Cycle Correction(DCC) CLK: TI: clk-54xx: Set the rate for dpll_abe_m2x2_ck CLK: TI: Driver for DRA7 ATL (Audio Tracking Logic) dt:/bindings: DRA7 ATL (Audio Tracking Logic) clock bindings ARM: dts: dra7xx-clocks: Correct name for atl clkin3 clock CLK: TI: gate: add composite interface clock to OMAP2 only build ARM: OMAP2: clock: add DT boot support for cpufreq_ck CLK: TI: OMAP2: add clock init support ...