Age | Commit message (Collapse) | Author |
|
|
|
Taken from http://bugzilla.kernel.org/show_bug.cgi?id=8747
Problem Description:
It is related to the possibility to obtain MSG_ERRQUEUE messages from the udp
and raw sockets, both connected and unconnected.
There is a little typo in net/ipv6/icmp.c code, which prevents such messages
to be delivered to the errqueue of the correspond raw socket, when the socket
is CONNECTED. The typo is due to swap of local/remote addresses.
Consider __raw_v6_lookup() function from net/ipv6/raw.c. When a raw socket is
looked up usual way, it is something like:
sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr, IP6CB(skb)->iif);
where "daddr" is a destination address of the incoming packet (IOW our local
address), "saddr" is a source address of the incoming packet (the remote end).
But when the raw socket is looked up for some icmp error report, in
net/ipv6/icmp.c:icmpv6_notify() , daddr/saddr are obtained from the echoed
fragment of the "bad" packet, i.e. "daddr" is the original destination
address of that packet, "saddr" is our local address. Hence, for
icmpv6_notify() must use "saddr, daddr" in its arguments, not "daddr, saddr"
...
Steps to reproduce:
Create some raw socket, connect it to an address, and cause some error
situation: f.e. set ttl=1 where the remote address is more than 1 hop to reach.
Set IPV6_RECVERR .
Then send something and wait for the error (f.e. poll() with POLLERR|POLLIN).
You should receive "time exceeded" icmp message (because of "ttl=1"), but the
socket do not receive it.
If you do not connect your raw socket, you will receive MSG_ERRQUEUE
successfully. (The reason is that for unconnected socket there are no actual
checks for local/remote addresses).
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
As noticed by Jarek Poplawski <jarkao2@o2.pl>, the timer removal in
gen_kill_estimator races with the timer function rearming the timer.
Check whether the timer list is empty before rearming the timer
in the timer function to fix this.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
SCTP currently permits users to bind to link-local addresses,
but doesn't verify that the scope id specified at bind matches
the interface that the address is configured on. It was report
that this can hang a system.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
skb_clone_fraglist is static so it shouldn't be exported.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
The conntrack assigned to locally generated ICMP error is usually the one
assigned to the original packet which has caused the error. But if
the original packet is handled as invalid by nf_conntrack, no conntrack
is assigned to the original packet. Then nf_ct_attach() cannot assign
any conntrack to the ICMP error packet. In that case the current
nf_conntrack_icmp assigns appropriate conntrack to it. But the current
code mistakes the direction of the packet. As a result, NAT code mistakes
the address to be mangled.
To fix the bug, this changes nf_conntrack_icmp not to assign conntrack
to such ICMP error. Actually no address is necessary to be mangled
in this case.
Spotted by Jordan Russell.
Upstream commit ID: 130e7a83d7ec8c5c673225e0fa8ea37b1ed507a5
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
patch 1/2 (revised):
- Fix drive->waiting_for_dma to work with CDB-intr devices.
- Do the dma status clearing in ide_intr() and add a new
hwif->ide_dma_clear_irq for Intel ICHx controllers.
Revised per Alan, Sergei and Bart's advice.
Patch against 2.6.20-rc6. Tested ok on my ICH4 and pdc20275 adapters.
Please review/apply, thanks.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
fix deadlock in the 8139too driver: poll handlers should never forcibly
enable local interrupts, because they might be used by netpoll/printk
from IRQ context.
=================================
[ INFO: inconsistent lock state ]
2.6.19 #11
---------------------------------
inconsistent {softirq-on-W} -> {in-softirq-W} usage.
swapper/1 [HC0[0]:SC1[1]:HE1:SE0] takes:
(&npinfo->poll_lock){-+..}, at: [<c0350a41>] net_rx_action+0x64/0x1de
{softirq-on-W} state was registered at:
[<c0134c86>] mark_lock+0x5b/0x39c
[<c0135012>] mark_held_locks+0x4b/0x68
[<c01351e9>] trace_hardirqs_on+0x115/0x139
[<c02879e6>] rtl8139_poll+0x3d7/0x3f4
[<c035c85d>] netpoll_poll+0x82/0x32f
[<c035c775>] netpoll_send_skb+0xc9/0x12f
[<c035cdcc>] netpoll_send_udp+0x253/0x25b
[<c0288463>] write_msg+0x40/0x65
[<c011cead>] __call_console_drivers+0x45/0x51
[<c011cf16>] _call_console_drivers+0x5d/0x61
[<c011d4fb>] release_console_sem+0x11f/0x1d8
[<c011d7d7>] register_console+0x1ac/0x1b3
[<c02883f8>] init_netconsole+0x55/0x67
[<c010040c>] init+0x9a/0x24e
[<c01049cf>] kernel_thread_helper+0x7/0x10
[<ffffffff>] 0xffffffff
irq event stamp: 819992
hardirqs last enabled at (819992): [<c0350a16>] net_rx_action+0x39/0x1de
hardirqs last disabled at (819991): [<c0350b1e>] net_rx_action+0x141/0x1de
softirqs last enabled at (817552): [<c01214e4>] __do_softirq+0xa3/0xa8
softirqs last disabled at (819987): [<c0106051>] do_softirq+0x5b/0xc9
other info that might help us debug this:
no locks held by swapper/1.
stack backtrace:
[<c0104d88>] dump_trace+0x63/0x1e8
[<c0104f26>] show_trace_log_lvl+0x19/0x2e
[<c010532d>] show_trace+0x12/0x14
[<c0105343>] dump_stack+0x14/0x16
[<c0134980>] print_usage_bug+0x23c/0x246
[<c0134d33>] mark_lock+0x108/0x39c
[<c01356a7>] __lock_acquire+0x361/0x9ed
[<c0136018>] lock_acquire+0x56/0x72
[<c03aff1f>] _spin_lock+0x35/0x42
[<c0350a41>] net_rx_action+0x64/0x1de
[<c0121493>] __do_softirq+0x52/0xa8
[<c0106051>] do_softirq+0x5b/0xc9
[<c0121338>] irq_exit+0x3c/0x48
[<c0106163>] do_IRQ+0xa4/0xbd
[<c01047c6>] common_interrupt+0x2e/0x34
[<c011db92>] vprintk+0x2c0/0x309
[<c011dbf6>] printk+0x1b/0x1d
[<c01003f2>] init+0x80/0x24e
[<c01049cf>] kernel_thread_helper+0x7/0x10
=======================
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Going through the string and waiting for _pointer_ to become '\0'
is not what the authors meant...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Ben Collins <ben.collins@ubuntu.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Shows how many people are testing coda - the bug had been there for 5 years
and results of stepping on it are not subtle.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
This diff changes the default port range used for outgoing connections,
from "use 32768-61000 in most cases, but use N-4999 on small boxes
(where N is a multiple of 1024, depending on just *how* small the box
is)" to just "use 32768-61000 in all cases".
I don't believe there are any drawbacks to this change, and it keeps
outgoing connection ports farther away from the mess of
IANA-registered ports.
Signed-off-by: Mark Glines <mark@glines.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
sys_setsockopt() do not check properly timeout values for
SO_RCVTIMEO/SO_SNDTIMEO, for example it's possible to set negative timeout
values. POSIX do not defines behaviour for sys_setsockopt in case negative
timeouts, but requires that setsockopt() shall fail with -EDOM if the send and
receive timeout values are too big to fit into the timeout fields in the socket
structure.
In current implementation negative timeout can lead to error messages like
"schedule_timeout: wrong timeout value".
Proposed patch:
- checks tv_usec and returns -EDOM if it is wrong
- do not allows to set negative timeout values (sets 0 instead) and outputs
ratelimited information message about such attempts.
Signed-off-By: Vasily Averin <vvs@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
The Linux kernel ignored the PROM's serial settings (115200,n,8,1 in
my case). This was because mode_prop remained "ttyX-mode" (expected:
"ttya-mode") due to the constness of string literals when used with
"char *". Since there is no "ttyX-mode" property in the PROM, Linux
always used the default 9600.
[ Investigation of the suncore.s assembler reveals that gcc optimizied
away the stores, yet did not emit a warning, which is a pretty
anti-social thing to do and is the only reason this bug lived for
so long -DaveM ]
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
As mentioned in http://bugzilla.kernel.org/show_bug.cgi?id=5015
The helptext implies that this is on by default.
This may be true on some distros (Fedora/RHEL have it enabled
in /etc/sysctl.conf), but the kernel defaults to it off.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Noticed by Matvejchikov Ilya.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
dereference (CVE-2007-2876)
When creating a new connection by sending an unknown chunk type, we don't
transition to a valid state, causing a NULL pointer dereference in
sctp_packet when accessing sctp_timeouts[SCTP_CONNTRACK_NONE].
Fix by don't creating new conntrack entry if initial state is invalid.
Noticed by Vilmos Nebehaj <vilmos.nebehaj@ramsys.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Local variable `i' is a byte-counter. Don't use it as an index into an array
of le32's.
Reported-by: "young dave" <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
|
|
|
|
This fixes an out-of-boundary condition when the classified
band equals q->bands. Caught by Alexey
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Reverse the sense of the promiscuous-mode tests in ip6_mc_input().
Signed-off-by: Corey Mutter <crm-netdev@mutternet.com>
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
When an IPv6 router is forwarding a packet with a link-local scope source
address off-link, RFC 4007 requires it to send an ICMPv6 destination
unreachable with code 2 ("not neighbor"), but Linux doesn't. Fix below.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
When the server drops its connection, NFS client reconnects using the
same socket after disconnecting. If the new connection's SYN,ACK
doesn't contain the TCP timestamp option and the old connection's did,
tp->tcp_header_len is recomputed assuming no timestamp header but
tp->rx_opt.tstamp_ok remains set. Then tcp_build_and_update_options()
adds in a timestamp option past the end of the allocated TCP header,
overwriting TCP data, or when the data is in skb_shinfo(skb)->frags[],
overwriting skb_shinfo(skb) causing a crash soon after. (The issue was
debugged from such a crash.)
Similarly, wscale_ok and sack_ok also get set based on the SYN,ACK
packet but not reset on disconnect, since they are zeroed out at
initialization. The patch zeroes out the entire tp->rx_opt struct in
tcp_disconnect() to avoid this sort of problem.
Signed-off-by: Srinivas Aji <Aji_Srinivas@emc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Get rid of the CONFIG_NETPOLL_RX option completely since all the
dependencies have been removed long ago...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
CONFIG_NETPOLL_TRAP causes the TX queue controls to be completely bypassed in
the netpoll's "trapped" mode which easily causes overflows in the drivers with
short TX queues (most notably, in 8139too with its 4-deep queue). So, make
this option more sensible by making it only bypass the TX softirq wakeup.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Acked-by: Tom Rini <trini@kernel.crashing.org>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
When network device's are renamed, the IPV6 snmp6 code
gets confused. It doesn't track name changes so it will OOPS
when network device's are removed.
The fix is trivial, just unregister/re-register in notify handler.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Keith says
Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), wait_hpet_tick is
optimized away to a never ending loop and the kernel hangs on boot in timer
setup.
0000001a <wait_hpet_tick>:
1a: 55 push %ebp
1b: 89 e5 mov %esp,%ebp
1d: eb fe jmp 1d <wait_hpet_tick+0x3>
This is not a problem with gcc 3.3.5. Adding barrier() calls to
wait_hpet_tick does not help, making the variables volatile does.
And the consensus is that gcc-4.1.0 is busted.
Adrian Bunk:
Changed from a #warning to an #error for 2.6.16.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
|
|
|
|
__x25_find_socket does a sock_hold.
This adds a missing sock_put in x25_receive_data.
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
The clusterip_config_find_get() already increases entries reference
counter, so there is no reason to do it twice in checkentry() callback.
This causes the config to be freed before it is removed from the list,
resulting in a crash when adding the next rule.
Signed-off-by: Jaroslav Kysela <perex@suse.cz>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Fix a compile error reported by Michel Lespinasse.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
On ia64, kernel headers define REGION_OFFSET so we can't use that.
Reported by Andrew Morton.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
While porting some changes of the 2.6.21-rc7 pptp/proto_gre conntrack
and nat modules to a 2.4.32 kernel I noticed that the gre_key function
returns a wrong pointer to the GRE key of a version 0 packet thus
corrupting the packet payload.
The intended behaviour for GREv0 packets is to act like
ip_conntrack_proto_generic/ip_nat_proto_unknown so I have ripped the
offending functions (not used anymore) and modified the
ip_nat_proto_gre modules to not touch version 0 (non PPTP) packets.
Signed-off-by: Jorge Boncompte <jorge@dti2.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
sys_madvise has down_write of mmap_sem, then madvise_remove calls
vmtruncate_range which takes i_mutex and i_alloc_sem: no, we can
easily devise deadlocks from that ordering.
madvise_remove drop mmap_sem while calling vmtruncate_range: luckily,
since madvise_remove doesn't split or merge vmas, it's easy to handle
this case with a NULL prev, without restructuring sys_madvise. (Though
sad to retake mmap_sem when it's unlikely to be needed, and certainly
down_read is sufficient for MADV_REMOVE, unlike the other madvices.)
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
shmem_truncate_range has its own truncate_inode_pages_range, to free any
pages racily instantiated while it was in progress: a SHMEM_PAGEIN flag
is set when this might have happened. But holepunching gets no chance
to clear that flag at the start of vmtruncate_range, so it's always set
(unless a truncate came just before), so holepunch almost always does
this second truncate_inode_pages_range.
shmem holepunch has unlikely swap<->file races hereabouts whatever we do
(without a fuller rework than is fit for this release): I was going to
skip the second truncate in the punch_hole case, but Miklos points out
that would make holepunch correctness more vulnerable to swapoff. So
keep the second truncate, but follow it by an unmap_mapping_range to
eliminate the disconnected pages (freed from pagecache while still
mapped in userspace) that it might have left behind.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Miklos Szeredi observes that during truncation of shmem page directories,
info->lock is released to improve latency (after lowering i_size and
next_index to exclude races); but this is quite wrong for holepunching,
which receives no such protection from i_size or next_index, and is left
vulnerable to races with shmem_unuse, shmem_getpage and shmem_writepage.
Hold info->lock throughout when holepunching? No, any user could prevent
rescheduling for far too long. Instead take info->lock just when needed:
in shmem_free_swp when removing the swap entries, and whenever removing
a directory page from the level above. But so long as we remove before
scanning, we can safely skip taking the lock at the lower levels, except
at misaligned start and end of the hole.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Miklos Szeredi observes BUG_ON(!entry) in shmem_writepage() triggered
in rare circumstances, because shmem_truncate_range() erroneously
removes partially truncated directory pages at the end of the range:
later reclaim on pages pointing to these removed directories triggers
the BUG. Indeed, and it can also cause data loss beyond the hole.
Fix this as in the patch proposed by Miklos, but distinguish between
"limit" (how far we need to search: ignore truncation's next_index
optimization in the holepunch case - if there are races it's more
consistent to act on the whole range specified) and "upper_limit"
(how far we can free directory pages: generally we must be careful
to keep partially punched pages, but can relax at end of file -
i_size being held stable by i_mutex).
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
|
|
|
|
A security issue is emerging. Disallow Routing Header Type 0 by default
as we have been doing for IPv4.
This version already includes a fix for the original patch.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
which resulted in infinite recursion and stack overflow.
The bug is present in all kernel versions since the feature appeared.
The patch also makes some minimal cleanup:
1. Return something consistent (-ENOENT) when fib table is missing
2. Do not crash when queue is empty (does not happen, but yet)
3. Put result of lookup
Sergey Vlasov:
Oops fix
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Two functions are called from __devinit context, but they are marked as
__init. Fix this.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
The user can generate console output if they cause do_mmap() to fail
during sys_io_setup(). This was seen in a regression test that does
exactly that by spinning calling mmap() until it gets -ENOMEM before
calling io_setup().
We don't need this printk at all, just remove it.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
... and having it __init is a bad idea.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Without this initialization one gets
kernel BUG at kernel/rtmutex_common.h:80!
Signed-off-by: G. Liakhovetski <gl@dsa-ac.de>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
We must reserve SAR + MAX_HEADER bytes for IrLMP to fit in.
This fixes an oops reported (and fixed) by Jeet Chaudhuri, when max_sdu_size
is greater than 0.
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
IA32 manual says if micorcode update's size is 0, then the size is
default size (2048 bytes). But this doesn't suggest all microcode
update's size should be above 2048 bytes to me. We actually had a
microcode update whose size is 1024 bytes. The patch just removed the
check.
Backported by Daniel Drake.
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
|