aboutsummaryrefslogtreecommitdiffstats
path: root/net/netfilter
Commit message (Collapse)AuthorAgeFilesLines
* netfilter: nf_conntrack: Support expectations in different zonesJoe Stringer2015-08-121-1/+2
| | | | | | | | | | | | | | | commit 4b31814d20cbe5cd4ccf18089751e77a04afe4f2 upstream. When zones were originally introduced, the expectation functions were all extended to perform lookup using the zone. However, insertion was not modified to check the zone. This means that two expectations which are intended to apply for different connections that have the same tuple but exist in different zones cannot both be tracked. Fixes: 5d0aa2ccd4 (netfilter: nf_conntrack: add support for "conntrack zones") Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* ipvs: kernel oops - do_ip_vs_get_ctlHans Schillstrom2015-08-072-22/+39
| | | | | | | | | | | | | | | | | | commit 8537de8a7ab6681cc72fb0411ab1ba7fdba62dd0 upstream. Change order of init so netns init is ready when register ioctl and netlink. Ver2 Whitespace fixes and __init added. Reported-by: "Ryan O'Hara" <rohara@redhat.com> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Andreas Unterkircher <unki@netshadow.at>
* ipvs: fix memory leak in ip_vs_ctl.cTommi Rantala2015-08-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit f30bf2a5cac6c60ab366c4bc6db913597bf4d6ab upstream. Fix memory leak introduced in commit a0840e2e165a ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct."): unreferenced object 0xffff88005785b800 (size 2048): comm "(-localed)", pid 1434, jiffies 4294755650 (age 1421.089s) hex dump (first 32 bytes): bb 89 0b 83 ff ff ff ff b0 78 f0 4e 00 88 ff ff .........x.N.... 04 00 00 00 a4 01 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8262ea8e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811fba74>] __kmalloc_track_caller+0x244/0x430 [<ffffffff811b88a0>] kmemdup+0x20/0x50 [<ffffffff823276b7>] ip_vs_control_net_init+0x1f7/0x510 [<ffffffff8231d630>] __ip_vs_init+0x100/0x250 [<ffffffff822363a1>] ops_init+0x41/0x190 [<ffffffff82236583>] setup_net+0x93/0x150 [<ffffffff82236cc2>] copy_net_ns+0x82/0x140 [<ffffffff810ab13d>] create_new_namespaces+0xfd/0x190 [<ffffffff810ab49a>] unshare_nsproxy_namespaces+0x5a/0xc0 [<ffffffff810833e3>] SyS_unshare+0x173/0x310 [<ffffffff8265cbd7>] system_call_fastpath+0x12/0x6f [<ffffffffffffffff>] 0xffffffffffffffff Fixes: a0840e2e165a ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.") Signed-off-by: Tommi Rantala <tt.rantala@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* ipvs: uninitialized data with IP_VS_IPV6Dan Carpenter2015-05-091-5/+5
| | | | | | | | | | | | | | | | commit 3b05ac3824ed9648c0d9c02d51d9b54e4e7e874f upstream. The app_tcp_pkt_out() function expects "*diff" to be set and ends up using uninitialized data if CONFIG_IP_VS_IPV6 is turned on. The same issue is there in app_tcp_pkt_in(). Thanks to Julian Anastasov for noticing that. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
* ipvs: rerouting to local clients is not needed anymoreJulian Anastasov2015-05-091-11/+21
| | | | | | | | | | | | | | | | | | | | | | commit 579eb62ac35845686a7c4286c0a820b4eb1f96aa upstream. commit f5a41847acc5 ("ipvs: move ip_route_me_harder for ICMP") from 2.6.37 introduced ip_route_me_harder() call for responses to local clients, so that we can provide valid rt_src after SNAT. It was used by TCP to provide valid daddr for ip_send_reply(). After commit 0a5ebb8000c5 ("ipv4: Pass explicit daddr arg to ip_send_reply()." from 3.0 this rerouting is not needed anymore and should be avoided, especially in LOCAL_IN. Fixes 3.12.33 crash in xfrm reported by Florian Wiessner: "3.12.33 - BUG xfrm_selector_match+0x25/0x2f6" Reported-by: Smart Weblications GmbH - Florian Wiessner <f.wiessner@smart-weblications.de> Tested-by: Smart Weblications GmbH - Florian Wiessner <f.wiessner@smart-weblications.de> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
* net: make skb_gso_segment error handling more robustFlorian Westphal2015-05-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | commit 330966e501ffe282d7184fde4518d5e0c24bc7f8 upstream. skb_gso_segment has three possible return values: 1. a pointer to the first segmented skb 2. an errno value (IS_ERR()) 3. NULL. This can happen when GSO is used for header verification. However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL and would oops when NULL is returned. Note that these call sites should never actually see such a NULL return value; all callers mask out the GSO bits in the feature argument. However, there have been issues with some protocol handlers erronously not respecting the specified feature mask in some cases. It is preferable to get 'have to turn off hw offloading, else slow' reports rather than 'kernel crashes'. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> [Brad Spengler: backported to 3.2] Signed-off-by: Brad Spengler <spender@grsecurity.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* ipvs: add missing ip_vs_pe_put in sync codeJulian Anastasov2015-05-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | commit 528c943f3bb919aef75ab2fff4f00176f09a4019 upstream. ip_vs_conn_fill_param_sync() gets in param.pe a module reference for persistence engine from __ip_vs_pe_getbyname() but forgets to put it. Problem occurs in backup for sync protocol v1 (2.6.39). Also, pe_data usually comes in sync messages for connection templates and ip_vs_conn_new() copies the pointer only in this case. Make sure pe_data is not leaked if it comes unexpectedly for normal connections. Leak can happen only if bogus messages are sent to backup server. Fixes: fe5e7a1efb66 ("IPVS: Backup, Adding Version 1 receive capability") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* netfilter: xt_socket: fix a stack corruption bugEric Dumazet2015-05-091-9/+13
| | | | | | | | | | | | | | | | | | commit 78296c97ca1fd3b104f12e1f1fbc06c46635990b upstream. As soon as extract_icmp6_fields() returns, its local storage (automatic variables) is deallocated and can be overwritten. Lets add an additional parameter to make sure storage is valid long enough. While we are at it, adds some const qualifiers. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: b64c9256a9b76 ("tproxy: added IPv6 support to the socket match") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* netfilter: conntrack: disable generic tracking for known protocolsFlorian Westphal2015-02-201-1/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit db29a9508a9246e77087c5531e45b2c88ec6988b upstream. Given following iptables ruleset: -P FORWARD DROP -A FORWARD -m sctp --dport 9 -j ACCEPT -A FORWARD -p tcp --dport 80 -j ACCEPT -A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT One would assume that this allows SCTP on port 9 and TCP on port 80. Unfortunately, if the SCTP conntrack module is not loaded, this allows *all* SCTP communication, to pass though, i.e. -p sctp -j ACCEPT, which we think is a security issue. This is because on the first SCTP packet on port 9, we create a dummy "generic l4" conntrack entry without any port information (since conntrack doesn't know how to extract this information). All subsequent packets that are unknown will then be in established state since they will fallback to proto_generic and will match the 'generic' entry. Our originally proposed version [1] completely disabled generic protocol tracking, but Jozsef suggests to not track protocols for which a more suitable helper is available, hence we now mitigate the issue for in tree known ct protocol helpers only, so that at least NAT and direction information will still be preserved for others. [1] http://www.spinics.net/lists/netfilter-devel/msg33430.html Joint work with Daniel Borkmann. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* netfilter: ipset: small potential read beyond the end of bufferDan Carpenter2015-02-201-0/+6
| | | | | | | | | | | | | commit 2196937e12b1b4ba139806d132647e1651d655df upstream. We could be reading 8 bytes into a 4 byte buffer here. It seems harmless but adding a check is the right thing to do and it silences a static checker warning. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* ipvs: avoid netns exit crash on ip_vs_conn_drop_conntrackJulian Anastasov2014-11-051-1/+0
| | | | | | | | | | | | | | commit 2627b7e15c5064ddd5e578e4efd948d48d531a3f upstream. commit 8f4e0a18682d91 ("IPVS netns exit causes crash in conntrack") added second ip_vs_conn_drop_conntrack call instead of just adding the needed check. As result, the first call still can cause crash on netns exit. Remove it. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* inetpeer: get rid of ip_id_countEric Dumazet2014-09-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 73f156a6e8c1074ac6327e0abd1169e95eb66463 ] Ideally, we would need to generate IP ID using a per destination IP generator. linux kernels used inet_peer cache for this purpose, but this had a huge cost on servers disabling MTU discovery. 1) each inet_peer struct consumes 192 bytes 2) inetpeer cache uses a binary tree of inet_peer structs, with a nominal size of ~66000 elements under load. 3) lookups in this tree are hitting a lot of cache lines, as tree depth is about 20. 4) If server deals with many tcp flows, we have a high probability of not finding the inet_peer, allocating a fresh one, inserting it in the tree with same initial ip_id_count, (cf secure_ip_id()) 5) We garbage collect inet_peer aggressively. IP ID generation do not have to be 'perfect' Goal is trying to avoid duplicates in a short period of time, so that reassembly units have a chance to complete reassembly of fragments belonging to one message before receiving other fragments with a recycled ID. We simply use an array of generators, and a Jenkin hash using the dst IP as a key. ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it belongs (it is only used from this file) secure_ip_id() and secure_ipv6_id() no longer are needed. Rename ip_select_ident_more() to ip_select_ident_segs() to avoid unnecessary decrement/increment of the number of segments. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* ipvs: stop tot_stats estimator only under CONFIG_SYSCTLJulian Anastasov2014-08-061-1/+1
| | | | | | | | | | | | | | | | | | [ Upstream commit 9802d21e7a0b0d2167ef745edc1f4ea7a0fc6ea3 ] The tot_stats estimator is started only when CONFIG_SYSCTL is defined. But it is stopped without checking CONFIG_SYSCTL. Fix the crash by moving ip_vs_stop_estimator into ip_vs_control_net_cleanup_sysctl. The change is needed after commit 14e405461e664b ("IPVS: Add __ip_vs_control_{init,cleanup}_sysctl()") from 2.6.39. Reported-by: Jet Chen <jet.chen@intel.com> Tested-by: Jet Chen <jet.chen@intel.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* netfilter: nf_conntrack_dccp: fix skb_header_pointer API usagesDaniel Borkmann2014-04-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | commit b22f5126a24b3b2f15448c3f2a254fc10cbc2b92 upstream. Some occurences in the netfilter tree use skb_header_pointer() in the following way ... struct dccp_hdr _dh, *dh; ... skb_header_pointer(skb, dataoff, sizeof(_dh), &dh); ... where dh itself is a pointer that is being passed as the copy buffer. Instead, we need to use &_dh as the forth argument so that we're copying the data into an actual buffer that sits on the stack. Currently, we probably could overwrite memory on the stack (e.g. with a possibly mal-formed DCCP packet), but unintentionally, as we only want the buffer to be placed into _dh variable. Fixes: 2bc780499aa3 ("[NETFILTER]: nf_conntrack: add DCCP protocol support") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* netfilter: nf_ct_sip: don't drop packets with offsets pointing outside the ↵Patrick McHardy2013-11-281-1/+1
| | | | | | | | | | | | | | | | | packet commit 3a7b21eaf4fb3c971bdb47a98f570550ddfe4471 upstream. Some Cisco phones create huge messages that are spread over multiple packets. After calculating the offset of the SIP body, it is validated to be within the packet and the packet is dropped otherwise. This breaks operation of these phones. Since connection tracking is supposed to be passive, just let those packets pass unmodified and untracked. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> [bwh: Backported to 3.2: there is no log message to delete] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* ip: generate unique IP identificator if local fragmentation is allowedAnsis Atteka2013-10-261-1/+1
| | | | | | | | | | | | | | | | | | | | [ Upstream commit 703133de331a7a7df47f31fb9de51dc6f68a9de8 ] If local fragmentation is allowed, then ip_select_ident() and ip_select_ident_more() need to generate unique IDs to ensure correct defragmentation on the peer. For example, if IPsec (tunnel mode) has to encrypt large skbs that have local_df bit set, then all IP fragments that belonged to different ESP datagrams would have used the same identificator. If one of these IP fragments would get lost or reordered, then peer could possibly stitch together wrong IP fragments that did not belong to the same datagram. This would lead to a packet loss or data corruption. Signed-off-by: Ansis Atteka <aatteka@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* ipvs: ip_vs_sip_fill_param() BUG: bad check of return valueHans Schillstrom2013-05-301-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit f7a1dd6e3ad59f0cfd51da29dfdbfd54122c5916 upstream. The reason for this patch is crash in kmemdup caused by returning from get_callid with uniialized matchoff and matchlen. Removing Zero check of matchlen since it's done by ct_sip_get_header() BUG: unable to handle kernel paging request at ffff880457b5763f IP: [<ffffffff810df7fc>] kmemdup+0x2e/0x35 PGD 27f6067 PUD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: xt_state xt_helper nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle xt_connmark xt_conntrack ip6_tables nf_conntrack_ftp ip_vs_ftp nf_nat xt_tcpudp iptable_mangle xt_mark ip_tables x_tables ip_vs_rr ip_vs_lblcr ip_vs_pe_sip ip_vs nf_conntrack_sip nf_conntrack bonding igb i2c_algo_bit i2c_core CPU 5 Pid: 0, comm: swapper/5 Not tainted 3.9.0-rc5+ #5 /S1200KP RIP: 0010:[<ffffffff810df7fc>] [<ffffffff810df7fc>] kmemdup+0x2e/0x35 RSP: 0018:ffff8803fea03648 EFLAGS: 00010282 RAX: ffff8803d61063e0 RBX: 0000000000000003 RCX: 0000000000000003 RDX: 0000000000000003 RSI: ffff880457b5763f RDI: ffff8803d61063e0 RBP: ffff8803fea03658 R08: 0000000000000008 R09: 0000000000000011 R10: 0000000000000011 R11: 00ffffffff81a8a3 R12: ffff880457b5763f R13: ffff8803d67f786a R14: ffff8803fea03730 R15: ffffffffa0098e90 FS: 0000000000000000(0000) GS:ffff8803fea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff880457b5763f CR3: 0000000001a0c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper/5 (pid: 0, threadinfo ffff8803ee18c000, task ffff8803ee18a480) Stack: ffff8803d822a080 000000000000001c ffff8803fea036c8 ffffffffa000937a ffffffff81f0d8a0 000000038135fdd5 ffff880300000014 ffff880300110000 ffffffff150118ac ffff8803d7e8a000 ffff88031e0118ac 0000000000000000 Call Trace: <IRQ> [<ffffffffa000937a>] ip_vs_sip_fill_param+0x13a/0x187 [ip_vs_pe_sip] [<ffffffffa007b209>] ip_vs_sched_persist+0x2c6/0x9c3 [ip_vs] [<ffffffff8107dc53>] ? __lock_acquire+0x677/0x1697 [<ffffffff8100972e>] ? native_sched_clock+0x3c/0x7d [<ffffffff8100972e>] ? native_sched_clock+0x3c/0x7d [<ffffffff810649bc>] ? sched_clock_cpu+0x43/0xcf [<ffffffffa007bb1e>] ip_vs_schedule+0x181/0x4ba [ip_vs] ... Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* netfilter: Validate the sequence number of dataless ACK packets as wellJozsef Kadlecsik2012-12-061-8/+2
| | | | | | | | | | | | | | commit 4a70bbfaef0361d27272629d1a250a937edcafe4 upstream. We spare nothing by not validating the sequence number of dataless ACK packets and enabling it makes harder off-path attacks. See: "Reflection scan: an Off-Path Attack on TCP" by Jan Wrobel, http://arxiv.org/abs/1201.2074 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* netfilter: Mark SYN/ACK packets as invalid from original directionJozsef Kadlecsik2012-12-061-11/+8
| | | | | | | | | | | | | | | commit 64f509ce71b08d037998e93dd51180c19b2f464c upstream. Clients should not send such packets. By accepting them, we open up a hole by wich ephemeral ports can be discovered in an off-path attack. See: "Reflection scan: an Off-Path Attack on TCP" by Jan Wrobel, http://arxiv.org/abs/1201.2074 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* netfilter: nf_conntrack: fix racy timer handling with reliable eventsPablo Neira Ayuso2012-10-301-5/+11
| | | | | | | | | | | | | | | | | | | | | | commit 5b423f6a40a0327f9d40bc8b97ce9be266f74368 upstream. Existing code assumes that del_timer returns true for alive conntrack entries. However, this is not true if reliable events are enabled. In that case, del_timer may return true for entries that were just inserted in the dying list. Note that packets / ctnetlink may hold references to conntrack entries that were just inserted to such list. This patch fixes the issue by adding an independent timer for event delivery. This increases the size of the ecache extension. Still we can revisit this later and use variable size extensions to allocate this area on demand. Tested-by: Oliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* netfilter: xt_limit: have r->cost != 0 case workJan Engelhardt2012-10-171-4/+4
| | | | | | | | | | | | | | commit 82e6bfe2fbc4d48852114c4f979137cd5bf1d1a8 upstream. Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when a running state is saved to userspace and then reinstated from there. Make sure that private xt_limit area is initialized with correct values. Otherwise, random matchings due to use of uninitialized memory. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* netfilter: limit, hashlimit: avoid duplicated inlineFlorian Westphal2012-10-172-8/+5
| | | | | | | | | | | | | | | | | | commit 7a909ac70f6b0823d9f23a43f19598d4b57ac901 upstream. credit_cap can be set to credit, which avoids inlining user2credits twice. Also, remove inline keyword and let compiler decide. old: 684 192 0 876 36c net/netfilter/xt_limit.o 4927 344 32 5303 14b7 net/netfilter/xt_hashlimit.o now: 668 192 0 860 35c net/netfilter/xt_limit.o 4793 344 32 5169 1431 net/netfilter/xt_hashlimit.o Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* netfilter: nf_ct_expect: fix possible access to uninitialized timerPablo Neira Ayuso2012-10-171-23/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2614f86490122bf51eb7c12ec73927f1900f4e7d upstream. In __nf_ct_expect_check, the function refresh_timer returns 1 if a matching expectation is found and its timer is successfully refreshed. This results in nf_ct_expect_related returning 0. Note that at this point: - the passed expectation is not inserted in the expectation table and its timer was not initialized, since we have refreshed one matching/existing expectation. - nf_ct_expect_alloc uses kmem_cache_alloc, so the expectation timer is in some undefined state just after the allocation, until it is appropriately initialized. This can be a problem for the SIP helper during the expectation addition: ... if (nf_ct_expect_related(rtp_exp) == 0) { if (nf_ct_expect_related(rtcp_exp) != 0) nf_ct_unexpect_related(rtp_exp); ... Note that nf_ct_expect_related(rtp_exp) may return 0 for the timer refresh case that is detailed above. Then, if nf_ct_unexpect_related(rtcp_exp) returns != 0, nf_ct_unexpect_related(rtp_exp) is called, which does: spin_lock_bh(&nf_conntrack_lock); if (del_timer(&exp->timeout)) { nf_ct_unlink_expect(exp); nf_ct_expect_put(exp); } spin_unlock_bh(&nf_conntrack_lock); Note that del_timer always returns false if the timer has been initialized. However, the timer was not initialized since setup_timer was not called, therefore, the expectation timer remains in some undefined state. If I'm not missing anything, this may lead to the removal an unexistent expectation. To fix this, the optimization that allows refreshing an expectation is removed. Now nf_conntrack_expect_related looks more consistent to me since it always add the expectation in case that it returns success. Thanks to Patrick McHardy for participating in the discussion of this patch. I think this may be the source of the problem described by: http://marc.info/?l=netfilter-devel&m=134073514719421&w=2 Reported-by: Rafal Fitt <rafalf@aplusc.com.pl> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* ipvs: fix info leak in getsockopt(IP_VS_SO_GET_TIMEOUT)Mathias Krause2012-09-191-0/+1
| | | | | | | | | | | | | | | | | [ Upstream commit 2d8a041b7bfe1097af21441cb77d6af95f4f4680 ] If at least one of CONFIG_IP_VS_PROTO_TCP or CONFIG_IP_VS_PROTO_UDP is not set, __ip_vs_get_timeouts() does not fully initialize the structure that gets copied to userland and that for leaks up to 12 bytes of kernel stack. Add an explicit memset(0) before passing the structure to __ip_vs_get_timeouts() to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Wensong Zhang <wensong@linux-vs.org> Cc: Simon Horman <horms@verge.net.au> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* ipvs: fix matching of fwmark templates during schedulingSimon Horman2012-02-291-1/+1
| | | | | | | | | | | | | | | | | | commit e0aac52e17a3db68fe2ceae281780a70fc69957f upstream. Commit f11017ec2d1859c661f4e2b12c4a8d250e1f47cf (2.6.37) moved the fwmark variable in subcontext that is invalidated before reaching the ip_vs_ct_in_get call. As vaddr is provided as pointer in the param structure make sure the fwmark variable is in same context. As the fwmark templates can not be matched, more and more template connections are created and the controlled connections can not go to single real server. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* net: reintroduce missing rcu_assign_pointer() callsEric Dumazet2012-02-038-12/+12
| | | | | | | | | | | | | | | | | [ Upstream commit cf778b00e96df6d64f8e21b8395d1f8a859ecdc7 ] commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER) did a lot of incorrect changes, since it did a complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x, y). We miss needed barriers, even on x86, when y is not NULL. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* netfilter: ctnetlink: fix timeout calculationXi Wang2011-12-311-2/+2
| | | | | | | | | | | | | | The sanity check (timeout < 0) never works; the dividend is unsigned and so is the division, which should have been a signed division. long timeout = (ct->timeout.expires - jiffies) / HZ; if (timeout < 0) timeout = 0; This patch converts the time values to signed for the division. Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* ipvs: try also real server with port 0 in backup serverJulian Anastasov2011-12-313-4/+10
| | | | | | | | | | | We should not forget to try for real server with port 0 in the backup server when processing the sync message. We should do it in all cases because the backup server can use different forwarding method. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Merge branch 'nf' of git://1984.lsi.us.es/netDavid S. Miller2011-12-241-5/+13
|\
| * netfilter: ctnetlink: fix scheduling while atomic if helper is autoloadedPablo Neira Ayuso2011-12-241-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | This patch fixes one scheduling while atomic error: [ 385.565186] ctnetlink v0.93: registering with nfnetlink. [ 385.565349] BUG: scheduling while atomic: lt-expect_creat/16163/0x00000200 It can be triggered with utils/expect_create included in libnetfilter_conntrack if the FTP helper is not loaded. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: ctnetlink: fix return value of ctnetlink_get_expect()Pablo Neira Ayuso2011-12-241-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | This fixes one bogus error that is returned to user-space: libnetfilter_conntrack/utils# ./expect_get TEST: get expectation (-1)(Unknown error 18446744073709551504) This patch includes the correct handling for EAGAIN (nfnetlink uses this error value to restart the operation after module auto-loading). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | netfilter: xt_connbytes: handle negation correctlyFlorian Westphal2011-12-231-3/+3
|/ | | | | | | | | | | | | | | | | | | | "! --connbytes 23:42" should match if the packet/byte count is not in range. As there is no explict "invert match" toggle in the match structure, userspace swaps the from and to arguments (i.e., as if "--connbytes 42:23" were given). However, "what <= 23 && what >= 42" will always be false. Change things so we use "||" in case "from" is larger than "to". This change may look like it breaks backwards compatibility when "to" is 0. However, older iptables binaries will refuse "connbytes 42:0", and current releases treat it to mean "! --connbytes 0:42", so we should be fine. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NSDavid S. Miller2011-12-011-1/+0
| | | | | | firewalld in Fedora 16 needs this. Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'nf' of git://1984.lsi.us.es/netDavid S. Miller2011-11-295-43/+73
|\
| * netfilter: nf_conntrack: make event callback registration per-netnsPablo Neira Ayuso2011-11-222-40/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes an oops that can be triggered following this recipe: 0) make sure nf_conntrack_netlink and nf_conntrack_ipv4 are loaded. 1) container is started. 2) connect to it via lxc-console. 3) generate some traffic with the container to create some conntrack entries in its table. 4) stop the container: you hit one oops because the conntrack table cleanup tries to report the destroy event to user-space but the per-netns nfnetlink socket has already gone (as the nfnetlink socket is per-netns but event callback registration is global). To fix this situation, we make the ctnl_notifier per-netns so the callback is registered/unregistered if the container is created/destroyed. Alex Bligh and Alexey Dobriyan originally proposed one small patch to check if the nfnetlink socket is gone in nfnetlink_has_listeners, but this is a very visited path for events, thus, it may reduce performance and it looks a bit hackish to check for the nfnetlink socket only to workaround this situation. As a result, I decided to follow the bigger path choice, which seems to look nicer to me. Cc: Alexey Dobriyan <adobriyan@gmail.com> Reported-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: ipset: suppress compile-time warnings in ip_set_hash_ipport*.cJozsef Kadlecsik2011-11-213-3/+3
| | | | | | | | | | | | | | warning: 'ip_to' may be used uninitialized in this function Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | netfilter: Remove NOTRACK/RAW dependency on NETFILTER_ADVANCED.David S. Miller2011-11-231-1/+0
|/ | | | | | | | Distributions are using this in their default scripts, so don't hide them behind the advanced setting. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'modsplit-Oct31_2011' of ↵Linus Torvalds2011-11-068-0/+9
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h
| * net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modulesPaul Gortmaker2011-10-315-0/+5
| | | | | | | | | | | | | | | | | | These files are non modular, but need to export symbols using the macros now living in export.h -- call out the include so that things won't break when we remove the implicit presence of module.h from everywhere. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * net: add moduleparam.h for users of module_param/MODULE_PARM_DESCPaul Gortmaker2011-10-311-0/+1
| | | | | | | | | | | | | | | | These files were getting access to these two via the implicit presence of module.h everywhere. They aren't modules, so they don't need the full module.h inclusion though. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * net: Fix files explicitly needing to include module.hPaul Gortmaker2011-10-313-0/+3
| | | | | | | | | | | | | | | | | | With calls to modular infrastructure, these files really needs the full module.h header. Call it out so some of the cleanups of implicit and unrequired includes elsewhere can be cleaned up. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* | netfilter: do not propagate nf_queue errors in nf_hook_slowFlorian Westphal2011-11-011-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit f15850861860636c905b33a9a5be3dcbc2b0d56a (netfilter: nfnetlink_queue: return error number to caller) erronously assigns the return value of nf_queue() to the "ret" value. This can cause bogus return values if we encounter QUEUE verdict when bypassing is enabled, the listener does not exist and the next hook returns NF_STOLEN. In this case nf_hook_slow returned -ESRCH instead of 0. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | ipvs: Remove unused variable "cs" from ip_vs_leave function.Krzysztof Wilczynski2011-11-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | This is to address the following warning during compilation time: net/netfilter/ipvs/ip_vs_core.c: In function ‘ip_vs_leave’: net/netfilter/ipvs/ip_vs_core.c:532: warning: unused variable ‘cs’ This variable is indeed no longer in use. Signed-off-by: Krzysztof Wilczynski <krzysztof.wilczynski@linux.com> Signed-off-by: Simon Horman <horms@verge.net.au>
* | netfilter: Remove unnecessary OOM logging messagesJoe Perches2011-11-0113-60/+28
| | | | | | | | | | | | | | | | | | | | Site specific OOM messages are duplications of a generic MM out of memory message and aren't really useful, so just delete them. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | ipvs: Removed unused variablesSimon Horman2011-11-011-4/+0
| | | | | | | | | | | | | | | | | | ipvs is not used in ip_vs_genl_set_cmd() or ip_vs_genl_get_cmd() Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | ipvs: Remove unused return value of protocol state transitionsSimon Horman2011-11-014-24/+14
| | | | | | | | | | | | | | Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | ipvs: Remove unused parameter from ip_vs_confirm_conntrack()Simon Horman2011-11-012-2/+2
| | | | | | | | | | | | | | Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | ipvs: Expose ip_vs_ftp module parameters via sysfs.Krzysztof Wilczynski2011-11-011-2/+3
|/ | | | | | | | | | This is to expose "ports" parameter via sysfs so it can be read at any time in order to determine what port or ports were passed to the module at the point when it was loaded. Signed-off-by: Krzysztof Wilczynski <krzysztof.wilczynski@linux.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2011-10-259-31/+31
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1745 commits) dp83640: free packet queues on remove dp83640: use proper function to free transmit time stamping packets ipv6: Do not use routes from locally generated RAs |PATCH net-next] tg3: add tx_dropped counter be2net: don't create multiple RX/TX rings in multi channel mode be2net: don't create multiple TXQs in BE2 be2net: refactor VF setup/teardown code into be_vf_setup/clear() be2net: add vlan/rx-mode/flow-control config to be_setup() net_sched: cls_flow: use skb_header_pointer() ipv4: avoid useless call of the function check_peer_pmtu TCP: remove TCP_DEBUG net: Fix driver name for mdio-gpio.c ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT rtnetlink: Add missing manual netlink notification in dev_change_net_namespaces ipv4: fix ipsec forward performance regression jme: fix irq storm after suspend/resume route: fix ICMP redirect validation net: hold sock reference while processing tx timestamps tcp: md5: add more const attributes Add ethtool -g support to virtio_net ... Fix up conflicts in: - drivers/net/Kconfig: The split-up generated a trivial conflict with removal of a stale reference to Documentation/networking/net-modules.txt. Remove it from the new location instead. - fs/sysfs/dir.c: Fairly nasty conflicts with the sysfs rb-tree usage, conflicting with Eric Biederman's changes for tagged directories.
| * Merge branch 'master' of ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller2011-10-243-53/+88
| |\