Age | Commit message | Author
2021-12-30 | net: lantiq_etop: make alignment match open parenthesis | Aleksander Jan Bajkowski
checkpatch.pl complains as follows: "Alignment should match open parenthesis". Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30 | net: lantiq_etop: remove multiple assignments | Aleksander Jan Bajkowski
Documentation/process/coding-style.rst says (in line 88) "Don't put multiple assignments on a single line either." This patch fixes the coding style issue reported by checkpatch.pl. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30 | net: lantiq_etop: avoid precedence issues | Aleksander Jan Bajkowski
Add () around macro argument to avoid precedence issues Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
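The precedence problem that this kind of change guards against can be sketched with a toy macro. The names SCALE_BAD, SCALE_GOOD and the two helper functions are hypothetical and are not taken from the lantiq_etop driver.

```c
/* Hypothetical macros for illustration; not from lantiq_etop.c. */
#define SCALE_BAD(x)   (x * 4)     /* argument used without parentheses */
#define SCALE_GOOD(x)  ((x) * 4)   /* argument wrapped in parentheses */

/* Both helpers pass the expression "a + 1" as the macro argument. */
static int scale_bad(int a)
{
	return SCALE_BAD(a + 1);   /* expands to (a + 1 * 4) */
}

static int scale_good(int a)
{
	return SCALE_GOOD(a + 1);  /* expands to ((a + 1) * 4) */
}
```

With the unparenthesised form the multiplication binds to the literal 1 only, so the two helpers return different values for the same input.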
2021-12-30 | net: lantiq_etop: replace strlcpy with strscpy | Aleksander Jan Bajkowski
strlcpy is marked as deprecated in Documentation/process/deprecated.rst, and there is no functional difference when the caller expects truncation (when not checking the return value). strscpy is relatively better as it also avoids scanning the whole source string. This silences the related checkpatch warnings from: commit 5dbdb2d87c29 ("checkpatch: prefer strscpy to strlcpy") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
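The behaviour described above (always NUL-terminate, report truncation, never scan the whole source) can be sketched in user space. sketch_strscpy() is a hypothetical stand-in written for illustration; it is not the kernel implementation.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace sketch of strscpy()-like semantics; not the kernel code. */
static long sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;

	/* Copy at most size - 1 characters, stopping at the source NUL. */
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/*
	 * At most size bytes of src have been examined. If the byte at
	 * the stopping point is not NUL, the copy was truncated.
	 */
	return src[i] == '\0' ? (long)i : -E2BIG;
}
```

A caller that ignores the return value gets the same truncated, NUL-terminated result it would have got from strlcpy, which is the case the commit message refers to.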
2021-12-30 | ice: Add flow director support for channel mode | Kiran Patil
Add support to enable flow-director filter when multiple TCs are configured. Flow director filter can be configured using ethtool (--config-ntuple option). When multiple TCs are configured, each TC is mapped to a unique HW VSI. So the VSI corresponding to the queue used in the filter is identified, and the flow director context is updated with the correct VSI while configuring the ntuple filter in HW. Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30 | Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue | David S. Miller
Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2021-12-29

Ruud Bos says:

The igb driver provides support for PEROUT and EXTTS pin functions that allow adapter external use of timing signals. At Hottinger Bruel & Kjaer we are using the PEROUT function to feed a PTP corrected 1pps signal into an FPGA as cross system synchronized time source.

Support for the PEROUT and EXTTS SDP functions is currently limited to i210/i211 based adapters. This patch series enables these functions also for 82580/i354/i350 based ones. Because the time registers of these adapters do not have the nice split in second rollovers as the i210 has, the implementation is slightly more complex compared to the i210 implementation.

The PEROUT function has been successfully tested on an i350 based ethernet adapter. Using the following user space code excerpt, the driver outputs a PTP corrected 1pps signal on the SDP0 pin of an i350:

    struct ptp_pin_desc desc;

    memset(&desc, 0, sizeof(desc));
    desc.index = 0;
    desc.func = PTP_PF_PEROUT;
    desc.chan = 0;
    if (ioctl(fd, PTP_PIN_SETFUNC, &desc) == 0) {
        struct timespec ts;

        if (clock_gettime(clkid, &ts) == 0) {
            struct ptp_perout_request rq;

            memset(&rq, 0, sizeof(rq));
            rq.index = 0;
            rq.start.sec = ts.tv_sec + 1;
            rq.start.nsec = 500000000;
            rq.period.sec = 1;
            rq.period.nsec = 0;
            if (ioctl(fd, PTP_PEROUT_REQUEST, &rq) == 0) {
                /* 1pps signal is now available on SDP0 */
            }
        }
    }

The added EXTTS function has not been tested. However, looking at the data sheets, the layout of the registers involved matches the i210 exactly except for the time registers mentioned before. Hence the almost identical implementation.

---
Note: I made changes to fix RCT and checkpatch messages regarding unnecessary parentheses.
====================

Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30 | Merge branch 'prestera-router-driver' | David S. Miller
Yevhen Orlov says:

====================
prestera: add basic router driver support

Add initial router support for the Marvell Prestera driver. Subscribe to inetaddr notifications. Trap packets that have to be routed (i.e. packets whose destination MAC address is the router's).

Add features:
- Support adding an IP address on a port, e.g.: "ip address add 1.1.1.1/24 dev PORT"

Limitations:
- Only regular ports are supported. Vlan will be added soon.
- Routing goes through the CPU. Offloading will be added in the next patches.

Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>

Changes for v2:
* Remove useless assignment
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30 | net: marvell: prestera: Implement initial inetaddr notifiers | Yevhen Orlov
Add inetaddr notifiers to support adding and deleting an IPv4 address on a switchdev port. We create a TRAP when the first address is added on a port and delete the TRAP when the last address is removed. Currently, the driver only supports regular ports becoming routed. Support for other port types will be added later. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30 | net: marvell: prestera: Register inetaddr stub notifiers | Yevhen Orlov
Initial implementation of notification handlers. For now this is just a stub, so that we can move forward and add manipulation of prestera_router_hw's objects. We support several addresses on an interface. We have nothing to do for the second address, because the rif is already enabled on this interface after the first one. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30 | net: marvell: prestera: add hardware router objects accounting | Yevhen Orlov
Add prestera_router_hw.c. This file contains functions which track HW object relations and links. This includes implicit creation of objects needed by the requested one and implicit removal of objects whose reference counter has become zero. We need this layer because kernel callbacks do not always map to the creation of a single HW object. So let there be two different layers: one for subscribing and parsing kernel structures, and another (prestera_router_hw.c) for tracking HW object relations. There are two types of objects on the router_hw layer: - Explicit objects (rif_entry): created by the higher layer. - Implicit objects (vr): created on demand by explicit objects. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30 | net: marvell: prestera: Add prestera router infra | Yevhen Orlov
Add prestera_router.c, which contains code to subscribe to and unsubscribe from kernel notifiers for the router. It handles kernel notifications and parses structures to build the keys used to manipulate prestera_router_hw's objects. prestera_router is also the container for the router's objects database. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30 | net: marvell: prestera: Add router interface ABI | Yevhen Orlov
Add functions to enable routing on a port which is not in a vlan. Routing can also be enabled on a vlan. prestera_hw_rif_create() takes the index of an allocated virtual router. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30 | net: marvell: prestera: add virtual router ABI | Yevhen Orlov
Add functions and structures to allocate a virtual router. prestera_hw_vr_create() returns the index of the allocated VR so that we can move forward and add other objects (e.g. a router interface) which link to the VR. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-29 | Merge branch 'lighten uapi/bpf.h rebuilds' | Alexei Starovoitov
Jakub Kicinski says: ==================== Last change in the bpf headers - disentangling BPF uapi from netdevice.h. Both linux/bpf.h and uapi/bpf.h changes should now rebuild ~1k objects down from the original ~18k. There's probably more that can be done but it's diminishing returns. Split into two patches for ease of review. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-12-29 | bpf: Invert the dependency between bpf-netns.h and netns/bpf.h | Jakub Kicinski
netns/bpf.h gets included by netdevice.h (thru net_namespace.h) which in turn gets included in a lot of places. We should keep netns/bpf.h as light-weight as possible. bpf-netns.h seems to contain more implementation details than deserves to be included in a netns header. It needs to pull in uapi/bpf.h to get various enum types. Move enum netns_bpf_attach_type to netns/bpf.h and invert the dependency. This makes netns/bpf.h fit the mold of a struct definition header more clearly, and drops the number of objects rebuilt when uapi/bpf.h is touched from 7.7k to 1.1k. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211230012742.770642-3-kuba@kernel.org
2021-12-29 | net: Add includes masked by netdevice.h including uapi/bpf.h | Jakub Kicinski
Add missing includes unmasked by the subsequent change. Mostly network drivers missing an include for XDP_PACKET_HEADROOM. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211230012742.770642-2-kuba@kernel.org
2021-12-29 | Merge tag 'mlx5-fixes-2021-12-28' of ↵ | Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2021-12-28 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2021-12-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Fix wrong features assignment in case of error net/mlx5e: TC, Fix memory leak with rules with internal port ==================== Link: https://lore.kernel.org/r/20211229065352.30178-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29 | Merge branch 'Sleepable local storage' | Alexei Starovoitov
KP Singh says: ==================== Local storage is currently unusable in sleepable helpers. One of the important use cases of local_storage is to attach security (or performance) contextual information to kernel objects in LSM / tracing programs to be used later in the life-cycle of the object. Sometimes this context can only be gathered from sleepable programs (because it needs accessing __user pointers or helpers like bpf_ima_inode_hash). Allowing local storage to be used from sleepable programs allows such context to be managed with the benefits of local_storage. # v2 -> v3 * Fixed some RCU issues pointed out by Martin * Added Martin's ack # v1 -> v2 * Generalize RCU checks (will send a separate patch for updating non local storage code where this can be used). * Add missing RCU lock checks from v1 ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-12-29 | bpf/selftests: Update local storage selftest for sleepable programs | KP Singh
Remove the spin lock logic and update the selftests to use a mix of sleepable and non-sleepable programs. It's more useful to test the sleepable programs since the tests don't really need spinlocks. Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20211224152916.1550677-3-kpsingh@kernel.org
2021-12-29 | bpf: Allow bpf_local_storage to be used by sleepable programs | KP Singh
Other maps like hashmaps are already available to sleepable programs. Sleepable BPF programs run under trace RCU. Allow task, sk and inode storage to be used from sleepable programs. This allows sleepable and non-sleepable programs to provide shareable annotations on kernel objects. Sleepable programs run in trace RCU whereas non-sleepable programs run in a normal RCU critical section, i.e. between __bpf_prog_enter{_sleepable} and __bpf_prog_exit{_sleepable} (rcu_read_lock or rcu_read_lock_trace). In order to make the local storage maps accessible to both sleepable and non-sleepable programs, one needs to call both call_rcu_tasks_trace and call_rcu to wait for both trace and classical RCU grace periods to expire before freeing memory. Paul's work on call_rcu_tasks_trace allows us to have per CPU queueing for call_rcu_tasks_trace. This behaviour can be achieved by setting the rcupdate.rcu_task_enqueue_lim=<num_cpus> boot parameter. In light of these new performance changes and to keep the local storage code simple, avoid adding a new flag for sleepable maps / local storage to select the RCU synchronization (trace / classical). Also, update the dereferencing of the pointers to use rcu_dereference_check (with either the trace or normal RCU locks held) with a common bpf_rcu_lock_held helper method. Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20211224152916.1550677-2-kpsingh@kernel.org
2021-12-29 | net/ncsi: check for error return from call to nla_put_u32 | Jiasheng Jiang
As the comment on nla_put() notes, it can return -EMSGSIZE if the tailroom of the skb is insufficient. Therefore, it is better to check the return value of nla_put_u32() and return the error code if an error occurs. Many other functions have the same problem; if this patch is correct, I will send a new version to fix them all. Fixes: 955dc68cb9b2 ("net/ncsi: Add generic netlink family") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Link: https://lore.kernel.org/r/20211229032118.1706294-1-jiasheng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
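The pattern of checking each put and propagating the error can be sketched in user space. struct fake_msg, fake_put_u32() and fill_reply() are hypothetical stand-ins for an skb and the netlink helpers; only the -EMSGSIZE convention is taken from the message above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size message buffer standing in for an skb. */
struct fake_msg {
	unsigned char buf[8];
	size_t len;
};

/* Appends one u32, or fails with -EMSGSIZE when no room is left. */
static int fake_put_u32(struct fake_msg *msg, uint32_t value)
{
	if (sizeof(msg->buf) - msg->len < sizeof(value))
		return -EMSGSIZE;
	memcpy(msg->buf + msg->len, &value, sizeof(value));
	msg->len += sizeof(value);
	return 0;
}

/* Checks every put and returns the first error to the caller. */
static int fill_reply(struct fake_msg *msg, const uint32_t *vals, size_t n)
{
	size_t i;
	int rc;

	for (i = 0; i < n; i++) {
		rc = fake_put_u32(msg, vals[i]);
		if (rc)
			return rc;
	}
	return 0;
}
```

Without the check inside the loop, a caller would report success even though the reply is missing attributes.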
2021-12-29 | sun4i-emac.c: add dma support | Conley Lee
Thanks for your review. Here is the new version of this patch.

This patch adds support for the emac rx dma present on sun4i. The emac is able to move packets from the rx fifo to RAM by using dma.

Changes since v4:
- rename sbk field to skb
- rename alloc_emac_dma_req to emac_alloc_dma_req
- use kzalloc(..., GFP_ATOMIC) in interrupt context to avoid sleeping
- retry using emac_inblk_32bit when emac_dma_inblk_32bit fails
- fix some code style issues

Changes since v5:
- fix some code style issues

Signed-off-by: Conley Lee <conleylee@foxmail.com>
Link: https://lore.kernel.org/r/tencent_DE05ADA53D5B084D4605BE6CB11E49EF7408@qq.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29 | net: bridge: mcast: fix br_multicast_ctx_vlan_global_disabled helper | Nikolay Aleksandrov
We need to first check if the context is a vlan one, then we need to check the global bridge multicast vlan snooping flag, and finally the vlan's multicast flag, otherwise we will unnecessarily enable vlan mcast processing (e.g. querier timers). Fixes: 7b54aaaf53cb ("net: bridge: multicast: add vlan state initialization and control") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20211228153142.536969-1-nikolay@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29 | net: fix use-after-free in tw_timer_handler | Muchun Song
A real world panic issue was found as follows in Linux 5.4.

    BUG: unable to handle page fault for address: ffffde49a863de28
    PGD 7e6fe62067 P4D 7e6fe62067 PUD 7e6fe63067 PMD f51e064067 PTE 0
    RIP: 0010:tw_timer_handler+0x20/0x40
    Call Trace:
     <IRQ>
     call_timer_fn+0x2b/0x120
     run_timer_softirq+0x1ef/0x450
     __do_softirq+0x10d/0x2b8
     irq_exit+0xc7/0xd0
     smp_apic_timer_interrupt+0x68/0x120
     apic_timer_interrupt+0xf/0x20

This issue has also been reported since 2017 in the thread [1]; unfortunately, it can still be reproduced after fixing DCCP.

ipv4_mib_exit_net is called before tcp_sk_exit_batch when a net namespace is destroyed, since tcp_sk_ops is registered before ipv4_mib_ops, which means tcp_sk_ops is in front of ipv4_mib_ops in the list of pernet_list. There will be a use-after-free on net->mib.net_statistics in tw_timer_handler after ipv4_mib_exit_net if there are some inflight time-wait timers.

This bug was not introduced by commit f2bf415cfed7 ("mib: add net to NET_ADD_STATS_BH"), since net_statistics was a global variable at that point rather than being dynamically allocated and freed. Actually, commit 61a7e26028b9 ("mib: put net statistics on struct net") introduced the bug, since it put net statistics on struct net and frees them when the net namespace is destroyed.

Move init_ipv4_mibs() in front of tcp_init() to fix this bug, and replace pr_crit() with panic() since continuing is meaningless when init_ipv4_mibs() fails.

[1] https://groups.google.com/g/syzkaller/c/p1tn-_Kc6l4/m/smuL_FMAAgAJ?pli=1

Fixes: 61a7e26028b9 ("mib: put net statistics on struct net")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Cong Wang <cong.wang@bytedance.com>
Cc: Fam Zheng <fam.zheng@bytedance.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211228104145.9426-1-songmuchun@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29 | selftests: net: Fix a typo in udpgro_fwd.sh | Jianguo Wu
$rvs -> $rcv Fixes: a062260a9d5f ("selftests: net: add UDP GRO forwarding self-tests") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Link: https://lore.kernel.org/r/d247d7c8-a03a-0abf-3c71-4006a051d133@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29 | selftests/net: udpgso_bench_tx: fix dst ip argument | wujianguo
udpgso_bench_tx calls setup_sockaddr() for the destination address before parsing all arguments. If we specify "-p ${dst_port}" after "-D ${dst_ip}", then ${dst_port} is ignored and the default cfg_port 8000 is used. This causes the test case "multiple GRO socks" in udpgro.sh to fail. Set up the sockaddr after parsing all arguments. Fixes: 3a687bef148d ("selftests: udp gso benchmark") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/ff620d9f-5b52-06ab-5286-44b945453002@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
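The ordering issue can be sketched as "collect every option first, build the destination afterwards". struct bench_cfg, parse_args() and format_dest() are hypothetical names for this illustration and do not reproduce the selftest source; only the -D and -p options and the default port 8000 come from the message above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical configuration record for the sketch. */
struct bench_cfg {
	const char *dst_ip;  /* value of -D, NULL if not given */
	int port;            /* value of -p, default 8000 */
};

/* First pass: only record option values, in whatever order they appear. */
static void parse_args(int argc, char **argv, struct bench_cfg *cfg)
{
	int i;

	cfg->dst_ip = NULL;
	cfg->port = 8000;
	for (i = 1; i + 1 < argc; i += 2) {
		if (strcmp(argv[i], "-D") == 0)
			cfg->dst_ip = argv[i + 1];
		else if (strcmp(argv[i], "-p") == 0)
			cfg->port = atoi(argv[i + 1]);
	}
}

/* Second pass: build the destination once all options are known. */
static void format_dest(const struct bench_cfg *cfg, char *out, size_t len)
{
	snprintf(out, len, "%s:%d",
		 cfg->dst_ip ? cfg->dst_ip : "(unset)", cfg->port);
}
```

Because the destination is only built in the second step, "-p" takes effect regardless of whether it appears before or after "-D".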
2021-12-29 | Merge tag 'for-net-next-2021-12-29' of ↵ | Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - Add support for Foxconn MT7922A - Add support for Realtek RTL8852AE - Rework HCI event handling to use skb_pull_data * tag 'for-net-next-2021-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (62 commits) Bluetooth: MGMT: Fix spelling mistake "simultanous" -> "simultaneous" Bluetooth: vhci: Set HCI_QUIRK_VALID_LE_STATES Bluetooth: MGMT: Fix LE simultaneous roles UUID if not supported Bluetooth: hci_sync: Add check simultaneous roles support Bluetooth: hci_sync: Wait for proper events when connecting LE Bluetooth: hci_sync: Add support for waiting specific LE subevents Bluetooth: hci_sync: Add hci_le_create_conn_sync Bluetooth: hci_event: Use skb_pull_data when processing inquiry results Bluetooth: hci_sync: Push sync command cancellation to workqueue Bluetooth: hci_qca: Stop IBS timer during BT OFF Bluetooth: btusb: Add support for Foxconn MT7922A Bluetooth: btintel: Add missing quirks and msft ext for legacy bootloader Bluetooth: btusb: Add two more Bluetooth parts for WCN6855 Bluetooth: L2CAP: Fix using wrong mode Bluetooth: hci_sync: Fix not always pausing advertising when necessary Bluetooth: mgmt: Make use of mgmt_send_event_skb in MGMT_EV_DEVICE_CONNECTED Bluetooth: mgmt: Make use of mgmt_send_event_skb in MGMT_EV_DEVICE_FOUND Bluetooth: mgmt: Introduce mgmt_alloc_skb and mgmt_send_event_skb Bluetooth: btusb: Return error code when getting patch status failed Bluetooth: btusb: Handle download_firmware failure cases ... ==================== Link: https://lore.kernel.org/r/20211229211258.2290966-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29 | Merge branch 'net-bridge-mcast-add-and-enforce-query-interval-minimum' | Jakub Kicinski
Nikolay Aleksandrov says: ==================== net: bridge: mcast: add and enforce query interval minimum This set adds and enforces 1 second minimum value for bridge multicast query and startup query intervals in order to avoid rearming the timers too often which could lock and crash the host. I doubt anyone is using such low values or anything lower than 1 second, so it seems like a good minimum. In order to be compatible if the value is lower then it is overwritten and a log message is emitted, since we can't return an error at this point. Eric, I looked for the syzbot reports in its dashboard but couldn't find them so I've added you as the reporter. I've prepared a global bridge igmp rate limiting patch but wasn't sure if it's ok for -net. It adds a static limit of 32k packets per second, I plan to send it for net-next with added drop counters for each bridge so it can be easily debugged. Original report can be seen at: https://lore.kernel.org/netdev/e8b9ce41-57b9-b6e2-a46a-ff9c791cf0ba@gmail.com/ ==================== Link: https://lore.kernel.org/r/20211227172116.320768-1-nikolay@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29 | net: bridge: mcast: add and enforce startup query interval minimum | Nikolay Aleksandrov
As reported[1] if startup query interval is set too low in combination with large number of startup queries and we have multiple bridges or even a single bridge with multiple querier vlans configured we can crash the machine. Add a 1 second minimum which must be enforced by overwriting the value if set lower (i.e. without returning an error) to avoid breaking user-space. If that happens a log message is emitted to let the admin know that the startup interval has been set to the minimum. It doesn't make sense to make the startup interval lower than the normal query interval so use the same value of 1 second. The issue has been present since these intervals could be user-controlled. [1] https://lore.kernel.org/netdev/e8b9ce41-57b9-b6e2-a46a-ff9c791cf0ba@gmail.com/ Fixes: d902eee43f19 ("bridge: Add multicast count/interval sysfs entries") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29 | net: bridge: mcast: add and enforce query interval minimum | Nikolay Aleksandrov
As reported[1] if query interval is set too low and we have multiple bridges or even a single bridge with multiple querier vlans configured we can crash the machine. Add a 1 second minimum which must be enforced by overwriting the value if set lower (i.e. without returning an error) to avoid breaking user-space. If that happens a log message is emitted to let the administrator know that the interval has been set to the minimum. The issue has been present since these intervals could be user-controlled. [1] https://lore.kernel.org/netdev/e8b9ce41-57b9-b6e2-a46a-ff9c791cf0ba@gmail.com/ Fixes: d902eee43f19 ("bridge: Add multicast count/interval sysfs entries") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
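The "overwrite and log instead of returning an error" behaviour described in the two commits above can be sketched as follows. QUERY_INTVL_MIN_MS and clamp_query_interval() are hypothetical names, and milliseconds are an assumed unit chosen for the sketch; only the 1 second minimum comes from the commit messages.

```c
#include <stdio.h>

/* 1 second minimum, per the commit message; the name is hypothetical. */
#define QUERY_INTVL_MIN_MS 1000UL

/*
 * Returns the interval to use. Values below the minimum are replaced
 * rather than rejected, so existing user-space configuration keeps
 * working, and a message tells the administrator what happened.
 */
static unsigned long clamp_query_interval(unsigned long requested_ms)
{
	if (requested_ms < QUERY_INTVL_MIN_MS) {
		fprintf(stderr,
			"query interval %lu ms is too low, using %lu ms\n",
			requested_ms, QUERY_INTVL_MIN_MS);
		return QUERY_INTVL_MIN_MS;
	}
	return requested_ms;
}
```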
2021-12-29 | ipv6: raw: check passed optlen before reading | Tamir Duberstein
Add a check that the user-provided option is at least as long as the number of bytes we intend to read. Before this patch we would blindly read sizeof(int) bytes even in cases where the user passed optlen<sizeof(int), which would potentially read garbage or fault. Discovered by new tests in https://github.com/google/gvisor/pull/6957 . The original get_user call predates history in the git repo. Signed-off-by: Tamir Duberstein <tamird@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20211229200947.2862255-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
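A minimal user-space sketch of the length check described above. read_int_option() is a hypothetical helper, and returning -EINVAL for a short option is an assumption made for the sketch; the commit message does not state which error code the kernel returns.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copies an int-sized option value out of a caller-supplied buffer.
 * The length is validated first so that no more than optlen bytes
 * of optval are ever read.
 */
static int read_int_option(const void *optval, size_t optlen, int *out)
{
	if (optlen < sizeof(int))
		return -EINVAL;  /* assumed error code for the sketch */
	memcpy(out, optval, sizeof(int));
	return 0;
}
```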
2021-12-29 | Merge branch 'net-define-new-hwtstamp-flag-and-return-it-to-userspace' | Jakub Kicinski
Hangbin Liu says: ==================== net: define new hwtstamp flag and return it to userspace This patchset defines the new hwtstamp flag HWTSTAMP_FLAG_BONDED_PHC_INDEX so that userspace programs can build against old kernel headers by using an ifdef check. Let's also return the flag on SIOC[G/S]HWTSTAMP to let userspace know that it's necessary for a given netdev. ==================== Link: https://lore.kernel.org/r/20211229080938.231324-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29 | Bonding: return HWTSTAMP_FLAG_BONDED_PHC_INDEX to notify user space | Hangbin Liu
If the userspace program is distributed in binary form (distro package), there is no way to know on which kernel versions it will run. Let's only check whether the flag was set on SIOCSHWTSTAMP, and return hwtstamp_config with the flag HWTSTAMP_FLAG_BONDED_PHC_INDEX to notify userspace whether the new feature is supported or not. Suggested-by: Jakub Kicinski <kuba@kernel.org> Fixes: 085d61000845 ("Bonding: force user to add HWTSTAMP_FLAG_BONDED_PHC_INDEX when get/set HWTSTAMP") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29 | net_tstamp: define new flag HWTSTAMP_FLAG_BONDED_PHC_INDEX | Hangbin Liu
As we defined the new hwtstamp_config flag HWTSTAMP_FLAG_BONDED_PHC_INDEX as an enum, it's not easy for a userspace program to check at build time whether the flag is supported. Let's also define the flag as a macro so that userspace can build against old kernel headers with an ifdef check. Fixes: 9c9211a3fc7a ("net_tstamp: add new flag HWTSTAMP_FLAG_BONDED_PHC_INDEX") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
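Why an enum constant alone cannot be tested with an ifdef, and how a same-named macro makes it visible, can be sketched with a toy header. The demo_flags enum and its members are hypothetical and stand in for the real uAPI definitions.

```c
/* Hypothetical header content; the names are illustrative only. */
enum demo_flags {
	DEMO_FLAG_OLD = (1 << 0),  /* enum only: the preprocessor cannot see it */
	DEMO_FLAG_NEW = (1 << 1),
#define DEMO_FLAG_NEW DEMO_FLAG_NEW  /* same name, now visible to #ifdef */
};

/* Build-time probe for a flag that exists only as an enum constant. */
static int old_flag_detected(void)
{
#ifdef DEMO_FLAG_OLD
	return 1;
#else
	return 0;
#endif
}

/* Build-time probe for a flag that also has a macro of the same name. */
static int new_flag_detected(void)
{
#ifdef DEMO_FLAG_NEW
	return 1;
#else
	return 0;
#endif
}
```

The self-referential macro expands to the enum constant of the same name, so code using the flag's value is unaffected while the ifdef probe starts to work.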
2021-12-29 | Merge tag 's390-5.16-6' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fix from Heiko Carstens: - fix s390 mcount regex typo in recordmcount.pl * tag 's390-5.16-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: recordmcount.pl: fix typo in s390 mcount regex
2021-12-29 | igb: support EXTTS on 82580/i354/i350 | Ruud Bos
Support for the PTP pin function on 82580/i354/i350 based adapters. Because the time registers of these adapters do not have the nice split in second rollovers as the i210 has, the implementation is slightly more complex compared to the i210 implementation. Signed-off-by: Ruud Bos <kernel.hbk@gmail.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-29 | igb: support PEROUT on 82580/i354/i350 | Ruud Bos
Support for the PEROUT PTP pin function on 82580/i354/i350 based adapters. Because the time registers of these adapters do not have the nice split in second rollovers as the i210 has, the implementation is slightly more complex compared to the i210 implementation. Signed-off-by: Ruud Bos <kernel.hbk@gmail.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-29 | igb: move PEROUT and EXTTS isr logic to separate functions | Ruud Bos
Remove code duplication in the tsync interrupt handler function by moving this logic to separate functions. This keeps the interrupt handler readable and allows the new functions to be extended for adapter types other than i210. Signed-off-by: Ruud Bos <kernel.hbk@gmail.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-29 | igb: move SDP config initialization to separate function | Ruud Bos
Allow reuse of SDP config struct initialization by moving it to a separate function. Signed-off-by: Ruud Bos <kernel.hbk@gmail.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-29 | xsk: Initialise xskb free_list_node | Ciara Loftus
This commit initialises the xskb's free_list_node when the xskb is allocated. This prevents a potential false negative returned from a call to list_empty for that node, such as the one introduced in commit 199d983bc015 ("xsk: Fix crash on double free in buffer pool") In my environment this issue caused packets to not be received by the xdpsock application if the traffic was running prior to application launch. This happened when the first batch of packets failed the xskmap lookup and XDP_PASS was returned from the bpf program. This action is handled in the i40e zc driver (and others) by allocating an skbuff, freeing the xdp_buff and adding the associated xskb to the xsk_buff_pool's free_list if it hadn't been added already. Without this fix, the xskb is not added to the free_list because the check to determine if it was added already returns an invalid positive result. Later, this caused allocation errors in the driver and the failure to receive packets. Fixes: 199d983bc015 ("xsk: Fix crash on double free in buffer pool") Fixes: 2b43470add8c ("xsk: Introduce AF_XDP buffer allocation API") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20211220155250.2746-1-ciara.loftus@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
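The effect of an uninitialised list node on an "is it already queued?" check can be sketched with a minimal circular list. All names here (list_node, demo_buf, demo_buf_free) are hypothetical, and a zero-filled node is assumed as a stand-in for the uninitialised state; this is not the xsk buffer pool code.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled loosely on list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_node_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* A node counts as "not on any list" only when it points at itself. */
static int list_node_empty(const struct list_node *n)
{
	return n->next == n;
}

static void list_node_add(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Hypothetical buffer with an embedded free-list node. */
struct demo_buf {
	struct list_node free_list_node;
};

/* Returns 1 if the buffer was queued, 0 if it looked already queued. */
static int demo_buf_free(struct demo_buf *b, struct list_node *free_list)
{
	if (!list_node_empty(&b->free_list_node))
		return 0;
	list_node_add(&b->free_list_node, free_list);
	return 1;
}
```

A node whose pointers are NULL does not point at itself, so the emptiness test reports it as already queued and the buffer is never returned to the free list; initialising the node at allocation time avoids that.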
2021-12-29 | bpf: Add missing map_get_next_key method to bloom filter map. | Haimin Zhang
Without it, kernel crashes in map_get_next_key(). Fixes: 9330986c0300 ("bpf: Add bloom filter map implementation") Reported-by: TCS Robot <tcs_robot@tencent.com> Signed-off-by: Haimin Zhang <tcs_kernel@tencent.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Joanne Koong <joannekoong@fb.com> Link: https://lore.kernel.org/bpf/1640776802-22421-1-git-send-email-tcs.kernel@gmail.com
2021-12-29 | net: Don't include filter.h from net/sock.h | Jakub Kicinski
sock.h is pretty heavily used (5k objects rebuilt on x86 after it's touched). We can drop the include of filter.h from it and add a forward declaration of struct sk_filter instead. This decreases the number of rebuilt objects when bpf.h is touched from ~5k to ~1k. There's a lot of missing includes this was masking. Primarily in networking tho, this time. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/bpf/20211229004913.513372-1-kuba@kernel.org
2021-12-29of: net: support NVMEM cells with MAC in text formatRafał Miłecki
Some NVMEM devices have text based cells. In such cases MAC is stored in a XX:XX:XX:XX:XX:XX format. Use mac_pton() to parse such data and support those NVMEM cells. This is required to support e.g. a very popular U-Boot and its environment variables. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
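As a rough illustration of what parsing a text-format MAC involves, here is a userspace sketch. `demo_parse_mac()` is a hypothetical helper written for this example; it is not the kernel's mac_pton(), and unlike a kernel helper it insists on an exact 17-character string.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int demo_hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = (char)tolower((unsigned char)c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/* Parse "XX:XX:XX:XX:XX:XX" into six bytes; mac is untouched on failure. */
static bool demo_parse_mac(const char *s, uint8_t mac[DEMO_ETH_ALEN])
{
	uint8_t tmp[DEMO_ETH_ALEN];
	int i;

	/* Six two-digit groups plus five separators. */
	if (strlen(s) != 3 * DEMO_ETH_ALEN - 1)
		return false;

	for (i = 0; i < DEMO_ETH_ALEN; i++) {
		int hi = demo_hex_val(s[3 * i]);
		int lo = demo_hex_val(s[3 * i + 1]);

		if (hi < 0 || lo < 0)
			return false;
		/* Every group except the last must be followed by ':'. */
		if (i != DEMO_ETH_ALEN - 1 && s[3 * i + 2] != ':')
			return false;
		tmp[i] = (uint8_t)((hi << 4) | lo);
	}

	memcpy(mac, tmp, sizeof(tmp));
	return true;
}
```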
2021-12-28net/mlx5e: Fix wrong features assignment in case of errorGal Pressman
In case of an error in mlx5e_set_features(), 'netdev->features' must be updated with the correct state of the device to indicate which features were updated successfully. To do that we maintain a copy of 'netdev->features' and update it after successful feature changes, so we can assign it back to 'netdev->features' if needed. However, since not all netdev features are handled by the driver (e.g. GRO/TSO/etc), some features may not be updated correctly in case of an error updating another feature. For example, while requesting to disable TSO (feature which is not handled by the driver) and enable HW-GRO, if an error occurs during HW-GRO enable, 'oper_features' will be assigned with 'netdev->features' and HW-GRO turned off. TSO will remain enabled in such case, which is a bug. To solve that, instead of using 'netdev->features' as the baseline of 'oper_features' and changing it on set feature success, use 'features' instead and update it in case of errors. Fixes: 75b81ce719b7 ("net/mlx5e: Don't override netdev features field unless in error flow") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
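The change of baseline can be sketched as follows. This is a simplified userspace model with two made-up feature bits and a single driver-handled feature; the `demo_`/`DEMO_` names are hypothetical and the code is not the mlx5e implementation.

```c
#include <stdint.h>

#define DEMO_F_TSO    (1u << 0) /* not handled by the driver */
#define DEMO_F_HW_GRO (1u << 1) /* handled by the driver, may fail */

/* Driver callback: returns 0 on success, non-zero on error. */
typedef int (*demo_handler)(int enable);

static int demo_always_ok(int enable)   { (void)enable; return 0; }
static int demo_always_fail(int enable) { (void)enable; return -1; }

/*
 * Start from the requested set ("features") and revert only the bit
 * whose handler failed, so bits the driver does not handle (TSO here)
 * still end up in their requested state.
 */
static uint32_t demo_set_features(uint32_t current, uint32_t wanted,
				  demo_handler hw_gro_handler)
{
	uint32_t oper = wanted;

	if ((current ^ wanted) & DEMO_F_HW_GRO) {
		int enable = !!(wanted & DEMO_F_HW_GRO);

		if (hw_gro_handler(enable) != 0) {
			/* Flip the failed bit back to its previous state. */
			if (enable)
				oper &= ~DEMO_F_HW_GRO;
			else
				oper |= DEMO_F_HW_GRO;
		}
	}
	return oper;
}
```

In the scenario from the commit message (TSO on, request TSO off and HW-GRO on, HW-GRO enable fails), this returns a set with both bits clear, whereas a baseline taken from the current features would have left TSO set.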
2021-12-28net/mlx5e: TC, Fix memory leak with rules with internal portRoi Dayan
Fix a memory leak with decap rule with internal port as destination device. The driver allocates a modify hdr action but doesn't set the flow attr modify hdr action which results in skipping releasing the modify hdr action when releasing the flow.
backtrace:
[<000000005f8c651c>] krealloc+0x83/0xd0
[<000000009f59b143>] alloc_mod_hdr_actions+0x156/0x310 [mlx5_core]
[<000000002257f342>] mlx5e_tc_match_to_reg_set_and_get_id+0x12a/0x360 [mlx5_core]
[<00000000b44ea75a>] mlx5e_tc_add_fdb_flow+0x962/0x1470 [mlx5_core]
[<0000000003e384a0>] __mlx5e_add_fdb_flow+0x54c/0xb90 [mlx5_core]
[<00000000ed8b22b6>] mlx5e_configure_flower+0xe45/0x4af0 [mlx5_core]
[<00000000024f4ab5>] mlx5e_rep_indr_offload.isra.0+0xfe/0x1b0 [mlx5_core]
[<000000006c3bb494>] mlx5e_rep_indr_setup_tc_cb+0x90/0x130 [mlx5_core]
[<00000000d3dac2ea>] tc_setup_cb_add+0x1d2/0x420
Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device") Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-28libbpf: Improve LINUX_VERSION_CODE detectionAndrii Nakryiko
Ubuntu reports incorrect kernel version through uname(), which on older kernels leads to kprobe BPF programs failing to load due to the version check mismatch. Accommodate Ubuntu's quirks with LINUX_VERSION_CODE by using Ubuntu-specific /proc/version_code to fetch major/minor/patch versions to form LINUX_VERSION_CODE. While at it, consolidate libbpf's kernel version detection code between libbpf.c and libbpf_probes.c. Closes: https://github.com/libbpf/libbpf/issues/421 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211222231003.2334940-1-andrii@kernel.org
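A minimal sketch of forming a version code from major/minor/patch numbers is shown below. It assumes the input is a string of the form "Ubuntu <release> <major>.<minor>.<patch>", which is an assumption about the Ubuntu-specific file's layout; the `demo_` helpers are hypothetical and are not libbpf's code. The packing follows the shape of the kernel's KERNEL_VERSION() macro, which clamps the patch level to 255.

```c
#include <stdio.h>

/* Pack major/minor/patch the way KERNEL_VERSION() does (patch clamped). */
static unsigned int demo_kernel_version(unsigned int major,
					unsigned int minor,
					unsigned int patch)
{
	if (patch > 255)
		patch = 255;
	return (major << 16) + (minor << 8) + patch;
}

/*
 * Parse a signature string whose third whitespace-separated token is
 * "major.minor.patch" (assumed layout). Returns 0 if it does not match.
 */
static unsigned int demo_version_from_signature(const char *sig)
{
	unsigned int major, minor, patch;

	/* Skip the first two tokens, then read the dotted version. */
	if (sscanf(sig, "%*s %*s %u.%u.%u", &major, &minor, &patch) != 3)
		return 0;
	return demo_kernel_version(major, minor, patch);
}
```

The point of reading the third token rather than the release string is that the release string may carry a distribution-specific version that does not match the upstream kernel version.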
2021-12-28libbpf: Use 100-character limit to make bpf_tracing.h easier to readAndrii Nakryiko
Improve bpf_tracing.h's macro definition readability by keeping them single-line and better aligned. This makes it easier to follow all those variadic patterns. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211222213924.1869758-2-andrii@kernel.org
2021-12-28libbpf: Normalize PT_REGS_xxx() macro definitionsAndrii Nakryiko
Refactor PT_REGS macro definitions in bpf_tracing.h to avoid excessive duplication. We currently have classic PT_REGS_xxx() and CO-RE-enabled PT_REGS_xxx_CORE(). We are about to add also _SYSCALL variants, which would require excessive copying of all the per-architecture definitions. Instead, separate architecture-specific field/register names from the final macros that utilize them. That way for upcoming _SYSCALL variants we'll be able to just define the x86_64 exception and otherwise have one set of _SYSCALL macro definitions common for all architectures. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20211222213924.1869758-1-andrii@kernel.org
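The separation described above can be sketched in a few lines. The struct and all `DEMO_`/`demo_` names here are hypothetical and only illustrate the pattern; they are not the definitions in bpf_tracing.h.

```c
/* Hypothetical stand-in for one architecture's register struct. */
struct demo_pt_regs {
	unsigned long di;
	unsigned long si;
	unsigned long ax;
};

/* Architecture-specific part: only the field names live here. */
#define DEMO_PT_PARM1_REG di
#define DEMO_PT_PARM2_REG si
#define DEMO_PT_RC_REG    ax

/* Common part: one set of accessors shared by every architecture. */
#define DEMO_PT_REGS_PARM1(x) ((x)->DEMO_PT_PARM1_REG)
#define DEMO_PT_REGS_PARM2(x) ((x)->DEMO_PT_PARM2_REG)
#define DEMO_PT_REGS_RC(x)    ((x)->DEMO_PT_RC_REG)
```

Supporting another architecture, or another family of accessors, then means adding one block of names or one block of macros rather than a full copy of both.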
2021-12-28Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queueJakub Kicinski
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2021-12-28
This series contains updates to igc driver only.
Vinicius disables support for crosstimestamp on i225-V as lockups are being observed.
James McLaughlin fixes Tx timestamping support on non-MSI-X platforms.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  igc: Fix TX timestamp support for non-MSI-X platforms
  igc: Do not enable crosstimestamping for i225-V models
====================
Link: https://lore.kernel.org/r/20211228182421.340354-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-28Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queueJakub Kicinski
Tony Nguyen says:
====================
10GbE Intel Wired LAN Driver Updates 2021-12-28
Alexander Lobakin says:
napi_build_skb() I introduced earlier this year ([0]) aims to decrease MM pressure and the overhead from in-place kmem_cache_alloc() on each Rx entry processing by decaching skbuff_heads from NAPI per-cpu cache filled prior to that by napi_consume_skb() (so it is sort of a direct shortcut for free -> mm -> alloc cycle). Currently, no in-tree drivers use it. Switch all Intel Ethernet drivers to it to get slight-to-medium perf boosts depending on the frame size.
ice driver, 50 Gbps link, pktgen + XDP_PASS (local in) sample:
frame_size/nthreads  64/42  128/20  256/8  512/4  1024/2  1532/1
net-next (Kpps)      46062  34654   18248  9830   5343    2714
series               47438  34708   18330  9875   5435    2777
increase             2.9%   0.15%   0.45%  0.46%  1.72%   2.32%
Additionally, e1000's been switched to napi_consume_skb() as it's safe and works fine there, and there's no point in napi_build_skb() without paired NAPI cache feeding point.
[0] https://lore.kernel.org/all/20210213141021.87840-1-alobakin@pm.me
* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ixgbevf: switch to napi_build_skb()
  ixgbe: switch to napi_build_skb()
  igc: switch to napi_build_skb()
  igb: switch to napi_build_skb()
  ice: switch to napi_build_skb()
  iavf: switch to napi_build_skb()
  i40e: switch to napi_build_skb()
  e1000: switch to napi_build_skb()
  e1000: switch to napi_consume_skb()
====================
Link: https://lore.kernel.org/r/20211228175815.281449-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>