aboutsummaryrefslogtreecommitdiffstats
path: root/net/ipv4/syncookies.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge remote-tracking branch 'korg/linux-3.0.y' into cm-13.0rogersb112015-11-101-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: crypto/algapi.c drivers/gpu/drm/i915/i915_debugfs.c drivers/gpu/drm/i915/intel_display.c drivers/video/fbmem.c include/linux/nls.h kernel/cgroup.c kernel/signal.c kernel/timeconst.pl net/ipv4/ping.c Change-Id: I1f532925d1743df74d66bcdd6fc92f05c72ee0dd
| * tcp: incoming connections might use wrong route under synfloodDmitry Popov2013-05-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit d66954a066158781ccf9c13c91d0316970fe57b6 ] There is a bug in cookie_v4_check (net/ipv4/syncookies.c): flowi4_init_output(&fl4, 0, sk->sk_mark, RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE, IPPROTO_TCP, inet_sk_flowi_flags(sk), (opt && opt->srr) ? opt->faddr : ireq->rmt_addr, ireq->loc_addr, th->source, th->dest); Here we do not respect sk->sk_bound_dev_if, therefore wrong dst_entry may be taken. This dst_entry is used by new socket (get_cookie_sock -> tcp_v4_syn_recv_sock), so its packets may take the wrong path. Signed-off-by: Dmitry Popov <dp@highloadlab.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | smdk4412: network: squashed commitsDerTeufel2014-12-091-1/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 9792f37daba788506559f99832c62b240402296c Author: Sreeram Ramachandran <sreeram@google.com> Date: Tue Jul 8 11:37:03 2014 -0700 Handle 'sk' being NULL in UID-based routing. Bug: 15413527 Change-Id: If33bebb7b52c0ebfa8dac2452607bce0c2b0faa0 Signed-off-by: Sreeram Ramachandran <sreeram@google.com> commit 7ab80d7fd3f1e3faebb14313119700fd7416ad54 Author: Lorenzo Colitti <lorenzo@google.com> Date: Mon Mar 31 16:23:51 2014 +0900 net: core: Support UID-based routing. This contains the following commits: 1. 0149763 net: core: Add a UID range to fib rules. 2. 1650474 net: core: Use the socket UID in routing lookups. 3. 0b16771 net: ipv4: Add the UID to the route cache. 4. ee058f1 net: core: Add a RTA_UID attribute to routes. This is so that userspace can do per-UID route lookups. Bug: 15413527 Change-Id: I1285474c6734614d3bda6f61d88dfe89a4af7892 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> commit a769ab7f07dcbbf29f2a8658aa5486bb6a2a66c3 Author: Hannes Frederic Sowa <hannes@stressinduktion.org> Date: Fri Mar 8 02:07:16 2013 +0000 ipv6: introdcue __ipv6_addr_needs_scope_id and ipv6_iface_scope_id helper functions [net-next commit b7ef213ef65256168df83ddfbb8131ed9adc10f9] __ipv6_addr_needs_scope_id checks if an ipv6 address needs to supply a 'sin6_scope_id != 0'. 'sin6_scope_id != 0' was enforced in case of link-local addresses. To support interface-local multicast these checks had to be enhanced and are now consolidated into these new helper functions. v2: a) migrated to struct ipv6_addr_props v3: a) reverted changes for ipv6_addr_props b) test for address type instead of comparing scope v4: a) unchanged Change-Id: Id6fc54cec61f967928e08a9eba4f857157d973a3 Suggested-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> commit af9b98af02a072c3eb0f3dd7d3df7242d8294e5c Author: Hannes Frederic Sowa <hannes@stressinduktion.org> Date: Mon Nov 18 07:07:45 2013 +0100 ping: prevent NULL pointer dereference on write to msg_name A plain read() on a socket does set msg->msg_name to NULL. So check for NULL pointer first. [Backport of net-next cf970c002d270c36202bd5b9c2804d3097a52da0] Bug: 12780426 Change-Id: I29d9cb95ef05ec76d37517e01317f4a29e60931c Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Lorenzo Colitti <lorenzo@google.com> commit d66ae9bbbf35cd6e7a3d04f6946d506b3148f06b Author: Cong Wang <amwang@redhat.com> Date: Sun Jun 2 22:43:52 2013 +0000 ping: always initialize ->sin6_scope_id and ->sin6_flowinfo [net-next commit c26d6b46da3ee86fa8a864347331e5513ca84c2b] If we don't need scope id, we should initialize it to zero. Same for ->sin6_flowinfo. Change-Id: I28e4bc9593e76fc3434052182466fab4bb8ccf3a Cc: Lorenzo Colitti <lorenzo@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit 22d188e621c143108e1207831e5817f24d0cccc0 Author: Lorenzo Colitti <lorenzo@google.com> Date: Thu Jul 4 00:12:40 2013 +0900 net: ipv6: fix wrong ping_v6_sendmsg return value [net-next commit fbfe80c890a1dc521d0b629b870e32fcffff0da5] ping_v6_sendmsg currently returns 0 on success. It should return the number of bytes written instead. Bug: 9469865 Change-Id: I82b7d3a37ba91ad24e6dbd97a4880745ce16ad31 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit b691b1c9931f86c3fc7a10208030752f205d1adf Author: Lorenzo Colitti <lorenzo@google.com> Date: Thu Jul 4 00:52:49 2013 +0900 net: ipv6: add missing lock in ping_v6_sendmsg [net-next commit a1bdc45580fc19e968b32ad27cd7e476a4aa58f6] Bug: 9469865 Change-Id: I480f8ce95956dd8f17fbbb26dc60cc162f8ec933 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit 515b76147e907579254cd5997a4ab9e64da32268 Author: Lorenzo Colitti <lorenzo@google.com> Date: Wed Jan 16 22:09:49 2013 +0000 net: ipv6: Add IPv6 support to the ping socket. [backport of net-next 6d0bfe22611602f36617bc7aa2ffa1bbb2f54c67] This adds the ability to send ICMPv6 echo requests without a raw socket. The equivalent ability for ICMPv4 was added in 2011. Instead of having separate code paths for IPv4 and IPv6, make most of the code in net/ipv4/ping.c dual-stack and only add a few IPv6-specific bits (like the protocol definition) to a new net/ipv6/ping.c. Hopefully this will reduce divergence and/or duplication of bugs in the future. Caveats: - Setting options via ancillary data (e.g., using IPV6_PKTINFO to specify the outgoing interface) is not yet supported. - There are no separate security settings for IPv4 and IPv6; everything is controlled by /proc/net/ipv4/ping_group_range. - The proc interface does not yet display IPv6 ping sockets properly. Tested with a patched copy of ping6 and using raw socket calls. Compiles and works with all of CONFIG_IPV6={n,m,y}. Change-Id: Ia359af556021344fc7f890c21383aadf950b6498 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> [lorenzo@google.com: backported to 3.0] Signed-off-by: Lorenzo Colitti <lorenzo@google.com> commit d72b1c37bab1bbdebb096421b5ef88ceec6eae8e Author: Li Wei <lw@cn.fujitsu.com> Date: Thu Feb 21 00:09:54 2013 +0000 ipv4: fix a bug in ping_err(). [ Upstream commit b531ed61a2a2a77eeb2f7c88b49aa5ec7d9880d8 ] We should get 'type' and 'code' from the outer ICMP header. Change-Id: I9a467b4aa794127f22dbc5f802d17ae618aa0c74 Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit ead1926fc318a4c97e735a885db40e77135c0531 Author: Eric Dumazet <eric.dumazet@gmail.com> Date: Mon Oct 24 03:06:21 2011 -0400 ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT There is a long standing bug in linux tcp stack, about ACK messages sent on behalf of TIME_WAIT sockets. In the IP header of the ACK message, we choose to reflect TOS field of incoming message, and this might break some setups. Example of things that were broken : - Routing using TOS as a selector - Firewalls - Trafic classification / shaping We now remember in timewait structure the inet tos field and use it in ACK generation, and route lookup. Notes : - We still reflect incoming TOS in RST messages. - We could extend MuraliRaja Muniraju patch to report TOS value in netlink messages for TIME_WAIT sockets. - A patch is needed for IPv6 Change-Id: Ic7ad8a7b858de181bfe2a789c472f84955397d4c Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit 47ef68bdd0ceb7113496f3325068202e5d1f3eba Author: Eric Dumazet <eric.dumazet@gmail.com> Date: Wed Nov 30 19:00:53 2011 +0000 ipv4: use a 64bit load/store in output path gcc compiler is smart enough to use a single load/store if we memcpy(dptr, sptr, 8) on x86_64, regardless of CONFIG_CC_OPTIMIZE_FOR_SIZE In IP header, daddr immediately follows saddr, this wont change in the future. We only need to make sure our flowi4 (saddr,daddr) fields wont break the rule. Change-Id: Iad9c8fd9121ec84c2599b013badaebba92db7c39 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit 5b7251328273e10d0d768a24f7b555d1e1f671e6 Author: Julian Anastasov <ja@ssi.bg> Date: Sun Aug 7 09:16:09 2011 +0000 ipv4: route non-local sources for raw socket The raw sockets can provide source address for routing but their privileges are not considered. We can provide non-local source address, make sure the FLOWI_FLAG_ANYSRC flag is set if socket has privileges for this, i.e. based on hdrincl (IP_HDRINCL) and transparent flags. Change-Id: I136b161c584deac3885efbf217e959e1a829fc1d Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: I0022e9536ee1861bf163e5bba4a86a3e94669960
* tcp: fix syncookie regressionEric Dumazet2012-03-231-14/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit dfd25ffffc132c00070eed64200e8950da5d7e9d ] commit ea4fc0d619 (ipv4: Don't use rt->rt_{src,dst} in ip_queue_xmit()) added a serious regression on synflood handling. Simon Kirby discovered a successful connection was delayed by 20 seconds before being responsive. In my tests, I discovered that xmit frames were lost, and needed ~4 retransmits and a socket dst rebuild before being really sent. In case of syncookie initiated connection, we use a different path to initialize the socket dst, and inet->cork.fl.u.ip4 is left cleared. As ip_queue_xmit() now depends on inet flow being setup, fix this by copying the temp flowi4 we use in cookie_v4_check(). Reported-by: Simon Kirby <sim@netnation.com> Bisected-by: Simon Kirby <sim@netnation.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* tcp: initialize variable ecn_ok in syncookies pathMike Waychison2011-10-031-1/+1
| | | | | | | | | | | | | | | [ Upstream commit f0e3d0689da401f7d1981c2777a714ba295ea5ff ] Using a gcc 4.4.3, warnings are emitted for a possibly uninitialized use of ecn_ok. This can happen if cookie_check_timestamp() returns due to not having seen a timestamp. Defaulting to ecn off seems like a reasonable thing to do in this case, so initialized ecn_ok to false. Signed-off-by: Mike Waychison <mikew@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* inet: add RCU protection to inet->optEric Dumazet2011-04-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | We lack proper synchronization to manipulate inet->opt ip_options Problem is ip_make_skb() calls ip_setup_cork() and ip_setup_cork() possibly makes a copy of ipc->opt (struct ip_options), without any protection against another thread manipulating inet->opt. Another thread can change inet->opt pointer and free old one under us. Use RCU to protect inet->opt (changed to inet->inet_opt). Instead of handling atomic refcounts, just copy ip_options when necessary, to avoid cache line dirtying. We cant insert an rcu_head in struct ip_options since its included in skb->cb[], so this patch is large because I had to introduce a new ip_options_rcu structure. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Use flowi4_init_output() in cookie_v4_check()David S. Miller2011-03-311-11/+7
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Put fl4_* macros to struct flowi4 and use them again.David S. Miller2011-03-121-2/+2
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Use flowi4 in public route lookup interfaces.David S. Miller2011-03-121-12/+12
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Make flowi ports AF dependent.David S. Miller2011-03-121-2/+2
| | | | | | | | | | | | Create two sets of port member accessors, one set prefixed by fl4_* and the other prefixed by fl6_* This will let us to create AF optimal flow instances. It will work because every context in which we access the ports, we have to be fully aware of which AF the flowi is anyways. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Put flowi_* prefix on AF independent members of struct flowiDavid S. Miller2011-03-121-9/+11
| | | | | | | | | | I intend to turn struct flowi into a union of AF specific flowi structs. There will be a common structure that each variant includes first, much like struct sock_common. This is the first step to move in that direction. Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Make output route lookup return rtable directly.David S. Miller2011-03-021-1/+2
| | | | | | Instead of on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: use the macros defined for the members of flowiChangli Gao2010-11-171-9/+6
| | | | | | | Use the macros defined for the members of flowi to clean the code up. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* syncookies: add support for ECNFlorian Westphal2010-06-261-5/+10
| | | | | | | | | | | | Allows use of ECN when syncookies are in effect by encoding ecn_ok into the syn-ack tcp timestamp. While at it, remove a uneeded #ifdef CONFIG_SYN_COOKIES. With CONFIG_SYN_COOKIES=nm want_cookie is ifdef'd to 0 and gcc removes the "if (0)". Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* syncookies: do not store rcv_wscale in tcp timestampFlorian Westphal2010-06-261-21/+14
| | | | | | | | | | | | | | | As pointed out by Fernando Gont there is no need to encode rcv_wscale into the cookie. We did not use the restored rcv_wscale anyway; it is recomputed via tcp_select_initial_window(). Thus we can save 4 bits in the ts option space by removing rcv_wscale. In case window scaling was not supported, we set the (invalid) wscale value 0xf. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* syncookies: check decoded options against sysctl settingsFlorian Westphal2010-06-161-6/+19
| | | | | | | | | | | | | | | | | | | | | Discard the ACK if we find options that do not match current sysctl settings. Previously it was possible to create a connection with sack, wscale, etc. enabled even if the feature was disabled via sysctl. Also remove an unneeded call to tcp_sack_reset() in cookie_check_timestamp: Both call sites (cookie_v4_check, cookie_v6_check) zero "struct tcp_options_received", hand it to tcp_parse_options() (which does not change tcp_opt->num_sacks/dsack) and then call cookie_check_timestamp(). Even if num_sacks/dsacks were changed, the structure is allocated on the stack and after cookie_check_timestamp returns only a few selected members are copied to the inet_request_sock. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net-next: remove useless union keywordChangli Gao2010-06-101-3/+3
| | | | | | | | | | remove useless union keyword in rtable, rt6_info and dn_route. Since there is only one member in a union, the union keyword isn't useful. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2010-06-061-1/+1
|\ | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/sfc/net_driver.h drivers/net/sfc/siena.c
| * tcp: use correct net ns in cookie_v4_check()Eric Dumazet2010-06-041-1/+1
| | | | | | | | | | | | | | Its better to make a route lookup in appropriate namespace. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | syncookies: update mss tablesFlorian Westphal2010-06-051-19/+19
| | | | | | | | | | | | | | | | - ipv6 msstab: account for ipv6 header size - ipv4 msstab: add mss for Jumbograms. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | syncookies: avoid unneeded tcp header flag double checkFlorian Westphal2010-06-051-1/+1
|/ | | | | | | | | | | caller: if (!th->rst && !th->syn && th->ack) callee: if (!th->ack) make the caller only check for !syn (common for 3whs), and move the !rst / ack test to the callee. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add rtnetlink init_rcvwnd to set the TCP initial receive windowlaurent chavey2009-12-231-1/+2
| | | | | | | | | | | | | | | | | | | Add rtnetlink init_rcvwnd to set the TCP initial receive window size advertised by passive and active TCP connections. The current Linux TCP implementation limits the advertised TCP initial receive window to the one prescribed by slow start. For short lived TCP connections used for transaction type of traffic (i.e. http requests), bounding the advertised TCP initial receive window results in increased latency to complete the transaction. Support for setting initial congestion window is already supported using rtnetlink init_cwnd, but the feature is useless without the ability to set a larger TCP initial receive window. The rtnetlink init_rcvwnd allows increasing the TCP initial receive window, allowing TCP connection to advertise larger TCP receive window than the ones bounded by slow start. Signed-off-by: Laurent Chavey <chavey@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Revert per-route SACK/DSACK/TIMESTAMP changes.David S. Miller2009-12-151-14/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It creates a regression, triggering badness for SYN_RECV sockets, for example: [19148.022102] Badness at net/ipv4/inet_connection_sock.c:293 [19148.022570] NIP: c02a0914 LR: c02a0904 CTR: 00000000 [19148.023035] REGS: eeecbd30 TRAP: 0700 Not tainted (2.6.32) [19148.023496] MSR: 00029032 <EE,ME,CE,IR,DR> CR: 24002442 XER: 00000000 [19148.024012] TASK = eee9a820[1756] 'privoxy' THREAD: eeeca000 This is likely caused by the change in the 'estab' parameter passed to tcp_parse_options() when invoked by the functions in net/ipv4/tcp_minisocks.c But even if that is fixed, the ->conn_request() changes made in this patch series is fundamentally wrong. They try to use the listening socket's 'dst' to probe the route settings. The listening socket doesn't even have a route, and you can't get the right route (the child request one) until much later after we setup all of the state, and it must be done by hand. This stuff really isn't ready, so the best thing to do is a full revert. This reverts the following commits: f55017a93f1a74d50244b1254b9a2bd7ac9bbf7d 022c3f7d82f0f1c68018696f2f027b87b9bb45c2 1aba721eba1d84a2defce45b950272cee1e6c72a cda42ebd67ee5fdf09d7057b5a4584d36fe8a335 345cda2fd695534be5a4494f1b59da9daed33663 dc343475ed062e13fc260acccaab91d7d80fd5b2 05eaade2782fb0c90d3034fd7a7d5a16266182bb 6a2a2d6bf8581216e08be15fcb563cfd6c430e1e Signed-off-by: David S. Miller <davem@davemloft.net>
* TCPCT part 1g: Responder Cookie => InitiatorWilliam Allen Simpson2009-12-021-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | Parse incoming TCP_COOKIE option(s). Calculate <SYN,ACK> TCP_COOKIE option. Send optional <SYN,ACK> data. This is a significantly revised implementation of an earlier (year-old) patch that no longer applies cleanly, with permission of the original author (Adam Langley): http://thread.gmane.org/gmane.linux.network/102586 Requires: TCPCT part 1a: add request_values parameter for sending SYNACK TCPCT part 1b: generate Responder Cookie secret TCPCT part 1c: sysctl_tcp_cookie_size, socket option TCP_COOKIE_TRANSACTIONS TCPCT part 1d: define TCP cookie option, extend existing struct's TCPCT part 1e: implement socket option TCP_COOKIE_TRANSACTIONS TCPCT part 1f: Initiator Cookie => Responder Signed-off-by: William.Allen.Simpson@gmail.com Signed-off-by: David S. Miller <davem@davemloft.net>
* Allow tcp_parse_options to consult dst entryGilad Ben-Yossef2009-10-291-13/+14
| | | | | | | | | | We need tcp_parse_options to be aware of dst_entry to take into account per dst_entry TCP options settings Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: Ori Finkelman <ori@comsleep.com> Sigend-off-by: Yony Amit <yony@comsleep.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add sk_mark route lookup support for IPv4 listening socketsAtis Elsts2009-10-071-1/+2
| | | | | | | Add support for route lookup using sk_mark on IPv4 listening sockets. Signed-off-by: Atis Elsts <atis@mikrotik.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* percpu: clean up percpu variable definitionsTejun Heo2009-06-241-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Percpu variable definition is about to be updated such that all percpu symbols including the static ones must be unique. Update percpu variable definitions accordingly. * as,cfq: rename ioc_count uniquely * cpufreq: rename cpu_dbs_info uniquely * xen: move nesting_count out of xen_evtchn_do_upcall() and rename it * mm: move ratelimits out of balance_dirty_pages_ratelimited_nr() and rename it * ipv4,6: rename cookie_scratch uniquely * x86 perf_counter: rename prev_left to pmc_prev_left, irq_entry to pmc_irq_entry and nmi_entry to pmc_nmi_entry * perf_counter: rename disable_count to perf_disable_count * ftrace: rename test_event_disable to ftrace_test_event_disable * kmemleak: rename test_pointer to kmemleak_test_pointer * mce: rename next_interval to mce_next_interval [ Impact: percpu usage cleanups, no duplicate static percpu var names ] Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Dave Jones <davej@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: linux-mm <linux-mm@kvack.org> Cc: David S. Miller <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andi Kleen <andi@firstfloor.org>
* percpu: cleanup percpu array definitionsTejun Heo2009-06-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the following three different ways to define percpu arrays are in use. 1. DEFINE_PER_CPU(elem_type[array_len], array_name); 2. DEFINE_PER_CPU(elem_type, array_name[array_len]); 3. DEFINE_PER_CPU(elem_type, array_name)[array_len]; Unify to #1 which correctly separates the roles of the two parameters and thus allows more flexibility in the way percpu variables are defined. [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tony Luck <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: linux-mm@kvack.org Cc: Christoph Lameter <cl@linux-foundation.org> Cc: David S. Miller <davem@davemloft.net>
* syncookies: remove last_synq_overflow from struct tcp_sockFlorian Westphal2009-04-201-3/+2
| | | | | | | | | | | | | | | last_synq_overflow eats 4 or 8 bytes in struct tcp_sock, even though it is only used when a listening sockets syn queue is full. We can (ab)use rx_opt.ts_recent_stamp to store the same information; it is not used otherwise as long as a socket is in listen state. Move linger2 around to avoid splitting struct mtu_probe across cacheline boundary on 32 bit arches. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* lsm: Relocate the IPv4 security_inet_conn_request() hooksPaul Moore2009-03-281-4/+5
| | | | | | | | | | | | | | | | | | | | The current placement of the security_inet_conn_request() hooks do not allow individual LSMs to override the IP options of the connection's request_sock. This is a problem as both SELinux and Smack have the ability to use labeled networking protocols which make use of IP options to carry security attributes and the inability to set the IP options at the start of the TCP handshake is problematic. This patch moves the IPv4 security_inet_conn_request() hooks past the code where the request_sock's IP options are set/reset so that the LSM can safely manipulate the IP options as needed. This patch intentionally does not change the related IPv6 hooks as IPv6 based labeling protocols which use IPv6 options are not currently implemented, once they are we will have a better idea of the correct placement for the IPv6 hooks. Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: James Morris <jmorris@namei.org>
* tcp: Port redirection support for TCPKOVACS Krisztian2008-10-011-0/+1
| | | | | | | | | | | | | | | Current TCP code relies on the local port of the listening socket being the same as the destination address of the incoming connection. Port redirection used by many transparent proxying techniques obviously breaks this, so we have to store the original destination port address. This patch extends struct inet_request_sock and stores the incoming destination port value there. It also modifies the handshake code to use that value as the source port when sending reply packets. Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Make Netfilter's ip_route_me_harder() non-local address compatibleKOVACS Krisztian2008-10-011-0/+2
| | | | | | | | | | | Netfilter's ip_route_me_harder() tries to re-route packets either generated or re-routed by Netfilter. This patch changes ip_route_me_harder() to handle packets from non-locally-bound sockets with IP_TRANSPARENT set as local and to set the appropriate flowi flags when re-doing the routing lookup. Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
* syncookies: Make sure ECN is disabledFlorian Westphal2008-07-261-0/+1
| | | | | | | | | | ecn_ok is not initialized when a connection is established by cookies. The cookie syn-ack never sets ECN, so ecn_ok must be set to 0. Spotted using ns-3/network simulation cradle simulator and valgrind. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* mib: add net to NET_INC_STATS_BHPavel Emelyanov2008-07-161-3/+3
| | | | | Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2008-06-131-2/+1
|\ | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/smc911x.c
| * inet{6}_request_sock: Init ->opt and ->pktopts in the constructorArnaldo Carvalho de Melo2008-06-101-2/+1
| | | | | | | | | | | | | | | | | | | | Wei Yongjun noticed that we may call reqsk_free on request sock objects where the opt fields may not be initialized, fix it by introducing inet_reqsk_alloc where we initialize ->opt to NULL and set ->pktopts to NULL in inet6_reqsk_alloc. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: remove CVS keywordsAdrian Bunk2008-06-111-2/+0
|/ | | | | | | | This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [Syncookies]: Add support for TCP options via timestamps.Florian Westphal2008-04-101-5/+84
| | | | | | | | | | | | | | | | | | | | | | | | | Allow the use of SACK and window scaling when syncookies are used and the client supports tcp timestamps. Options are encoded into the timestamp sent in the syn-ack and restored from the timestamp echo when the ack is received. Based on earlier work by Glenn Griffin. This patch avoids increasing the size of structs by encoding TCP options into the least significant bits of the timestamp and by not using any 'timestamp offset'. The downside is that the timestamp sent in the packet after the synack will increase by several seconds. changes since v1: don't duplicate timestamp echo decoding function, put it into ipv4/syncookie.c and have ipv6/syncookies.c use it. Feedback from Glenn Griffin: fix line indented with spaces, kill redundant if () Reviewed-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP]: Shrink syncookie_secret by 8 byte.Florian Westphal2008-03-231-2/+2
| | | | | | | | | | | | | | | | the first u32 copied from syncookie_secret is overwritten by the minute-counter four lines below. After adjusting the destination address, the size of syncookie_secret can be reduced accordingly. AFAICS, the only other user of syncookie_secret[] is the ipv6 syncookie support. Because ipv6 syncookies only grab 44 bytes from syncookie_secret[], this shouldn't affect them in any way. With fixes from Glenn Griffin. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Glenn Griffin <ggriffin.kernel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP]: Add IPv6 support to TCP SYN cookiesGlenn Griffin2008-03-041-4/+3
| | | | | | | | | Updated to incorporate Eric's suggestion of using a per cpu buffer rather than allocating on the stack. Just a two line change, but will resend in it's entirety. Signed-off-by: Glenn Griffin <ggriffin.kernel@gmail.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [TCP]: lower stack usage in cookie_hash() functionEric Dumazet2008-03-041-1/+3
| | | | | | | | 400 bytes allocated on stack might be a litle bit too much. Using a per_cpu var is more friendly. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [NETNS]: Add namespace parameter to ip_route_output_key.Denis V. Lunev2008-01-281-1/+1
| | | | | | | Needed to propagate it down to the ip_route_output_flow. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [SK_BUFF]: Introduce tcp_hdr(), remove skb->h.thArnaldo Carvalho de Melo2007-04-251-18/+18
| | | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iphArnaldo Carvalho de Melo2007-04-251-4/+4
| | | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET] IPV4: Fix whitespace errors.YOSHIFUJI Hideaki2007-02-101-29/+29
| | | | | Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV6]: Assorted trivial endianness annotations.Al Viro2006-12-021-9/+9
| | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* [MLSXFRM]: Auto-labeling of child socketsVenkat Yekkirala2006-09-221-1/+5
| | | | | | | | | | | | | This automatically labels the TCP, Unix stream, and dccp child sockets as well as openreqs to be at the same MLS level as the peer. This will result in the selection of appropriately labeled IPSec Security Associations. This also uses the sock's sid (as opposed to the isec sid) in SELinux enforcement of secmark in rcv_skb and postroute_last hooks. Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [MLSXFRM]: Add flow labelingVenkat Yekkirala2006-09-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | This labels the flows that could utilize IPSec xfrms at the points the flows are defined so that IPSec policy and SAs at the right label can be used. The following protos are currently not handled, but they should continue to be able to use single-labeled IPSec like they currently do. ipmr ip_gre ipip igmp sit sctp ip6_tunnel (IPv6 over IPv6 tunnel device) decnet Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_opsArnaldo Carvalho de Melo2006-01-031-2/+2
| | | | | | | | And move it to struct inet_connection_sock. DCCP will use it in the upcoming changesets. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Fix sparse warningsArnaldo Carvalho de Melo2005-08-291-2/+0
| | | | | | | | | | | Of this type, mostly: CHECK net/ipv6/netfilter.c net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static? net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static? Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>