TCP Socket Options SO_LINGER and TCP_LINGER2

Overview

This article analyzes two LINGER-related socket options at the source-code level, to clarify what each one does and how they differ.

man page

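SO_LINGER is a socket-level option configured through struct linger. When it is enabled, closing a socket with close() or shutdown() waits until the data in the send queue has been sent or the linger timeout expires (note: although the man page says so, the kernel's shutdown code path does not actually consult this option). When it is disabled, the call returns immediately and the close completes in the background. Note also that when the socket is closed as part of exit(), it always lingers in the background, whether or not SO_LINGER is enabled.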

SO_LINGER
       Sets or gets the SO_LINGER option.  The argument is a linger
       structure.

           struct linger {
               int l_onoff;    /* linger active */
               int l_linger;   /* how many seconds to linger for */
           };

       When enabled, a close(2) or shutdown(2) will not return until
       all queued messages for the socket have been successfully sent
       or the linger timeout has been reached.  Otherwise, the call
       returns immediately and the closing is done in the background.
       When the socket is closed as part of exit(2), it always
       lingers in the background.
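As a minimal sketch of how an application might set and read back this option, the helpers below wrap setsockopt() and getsockopt() around struct linger. The function names enable_linger, linger_seconds and linger_roundtrip are hypothetical and exist only for illustration; the system calls and the SO_LINGER constant are the standard ones.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: enable SO_LINGER on fd with a timeout in seconds.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int enable_linger(int fd, int seconds)
{
    struct linger lg = { .l_onoff = 1, .l_linger = seconds };

    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/* Hypothetical helper: read the linger timeout back.
 * Returns -1 if the option is disabled or the call fails. */
int linger_seconds(int fd)
{
    struct linger lg;
    socklen_t len = sizeof(lg);

    if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, &len) < 0 || !lg.l_onoff)
        return -1;
    return lg.l_linger;
}

/* Hypothetical helper: set the option on a fresh, unconnected TCP socket
 * and return what the kernel reports, or -2 if setup fails. */
int linger_roundtrip(int seconds)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int got;

    if (fd < 0)
        return -2;
    got = (enable_linger(fd, seconds) == 0) ? linger_seconds(fd) : -2;
    close(fd); /* nothing is queued on an unconnected socket, so no wait */
    return got;
}
```

With these helpers, linger_roundtrip(5) should report 5 on Linux, since l_linger is expressed in whole seconds.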

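TCP_LINGER2 is a TCP-level option that sets how long an orphaned socket may stay in the FIN_WAIT2 state. It overrides the system-wide tcp_fin_timeout setting for that one socket. It should not be used in code meant to be portable, and it must not be confused with the socket-level SO_LINGER option.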

TCP_LINGER2 (since Linux 2.4)
       The lifetime of orphaned FIN_WAIT2 state sockets.  This option
       can be used to override the system-wide setting in the file
       /proc/sys/net/ipv4/tcp_fin_timeout for this socket.  This is
       not to be confused with the socket(7) level option SO_LINGER.
       This option should not be used in code intended to be
       portable.
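A short, Linux-specific sketch of setting this option follows. Unlike SO_LINGER, the argument is a plain int (seconds) and the level is IPPROTO_TCP. The helper names set_fin_wait2_lifetime and linger2_roundtrip are hypothetical; the example assumes the system-wide tcp_fin_timeout is at its usual default, so a small value such as 5 is accepted unchanged.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: set the per-socket FIN_WAIT2 lifetime in seconds.
 * A negative value asks the kernel not to linger in FIN_WAIT2 at all. */
int set_fin_wait2_lifetime(int fd, int seconds)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_LINGER2,
                      &seconds, sizeof(seconds));
}

/* Hypothetical helper: set TCP_LINGER2 on a fresh TCP socket and return
 * the value the kernel reports back, or -2 if any call fails. */
int linger2_roundtrip(int seconds)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int got = -2;
    socklen_t len = sizeof(got);

    if (fd < 0)
        return -2;
    if (set_fin_wait2_lifetime(fd, seconds) < 0 ||
        getsockopt(fd, IPPROTO_TCP, TCP_LINGER2, &got, &len) < 0)
        got = -2;
    close(fd);
    return got;
}
```

Because the option is not portable, code that must build on other systems would normally guard the call with an #ifdef TCP_LINGER2 check.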

Source code analysis

SO_LINGER

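When close() is called and the reference count has already dropped to 0, the socket is actually closed. The analysis here starts from inet_release; for the steps before it, see the article "The close system call for sockets". If SO_LINGER is enabled, the lingertime is passed down to the transport layer's close function, which for TCP is tcp_close.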

/*
 *    The peer socket should always be NULL (or else). When we call this
 *    function we are destroying the object and from then on nobody
 *    should refer to it.
 */
int inet_release(struct socket *sock)
{
    struct sock *sk = sock->sk;

    if (sk) {
        long timeout;

        /* Applications forget to leave groups before exiting */
        /* Leave multicast groups */
        ip_mc_drop_socket(sk);

        /* If linger is set, we don't return until the close
         * is complete.  Otherwise we return immediately. The
         * actually closing is done the same either way.
         *
         * If the close is due to the process exiting, we never
         * linger..
         */
        timeout = 0;

        /*
         * The linger flag is set and the process is not exiting:
         * use sk_lingertime as the close timeout.
         */
        if (sock_flag(sk, SOCK_LINGER) &&
            !(current->flags & PF_EXITING))
            timeout = sk->sk_lingertime;
        sock->sk = NULL;

        /* Call the transport layer's close function */
        sk->sk_prot->close(sk, timeout);
    }
    return 0;
}

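Before tcp_close closes the socket and destroys its resources, it calls sk_stream_wait_close to wait until the data has been sent or the lingertime timeout is reached. Only then does it continue with the teardown.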

void tcp_close(struct sock *sk, long timeout)
{
    /* ... */

    /* If socket has been already reset (e.g. in tcp_reset()) - kill it. */
    /* Already in the CLOSE state */
    if (sk->sk_state == TCP_CLOSE)
        goto adjudge_to_death;

    /* As outlined in RFC 2525, section 2.17, we send a RST here because
     * data was lost. To witness the awful effects of the old behavior of
     * always doing a FIN, run an older 2.1.x kernel or 2.0.x, start a bulk
     * GET in an FTP client, suspend the process, wait for the client to
     * advertise a zero window, then kill -9 the FTP client, wheee...
     * Note: timeout is always zero in such a case.
     */
    if (unlikely(tcp_sk(sk)->repair)) {
        /* Repair mode: disconnect */
        sk->sk_prot->disconnect(sk, 0);
    } else if (data_was_unread) {
        /* The user process left data unread */
        /* Unread data was tossed, zap the connection. */
        NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONCLOSE);

        /* Move to CLOSE */
        tcp_set_state(sk, TCP_CLOSE);

        /* Send RST */
        tcp_send_active_reset(sk, sk->sk_allocation);
    } else if (sock_flag(sk, SOCK_LINGER) && !sk->sk_lingertime) {
        /* lingertime == 0: disconnect */
        /* Check zero linger _after_ checking for unread data. */
        sk->sk_prot->disconnect(sk, 0);
        NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
    } else if (tcp_close_state(sk)) {
        /* Perform the close state transition */
        /* We FIN if the application ate all the data before
         * zapping the connection.
         */

        /* RED-PEN. Formally speaking, we have broken TCP state
         * machine. State transitions:
         *
         * TCP_ESTABLISHED -> TCP_FIN_WAIT1
         * TCP_SYN_RECV    -> TCP_FIN_WAIT1 (forget it, it's impossible)
         * TCP_CLOSE_WAIT -> TCP_LAST_ACK
         *
         * are legal only when FIN has been sent (i.e. in window),
         * rather than queued out of window. Purists blame.
         *
         * F.e. "RFC state" is ESTABLISHED,
         * if Linux state is FIN-WAIT-1, but FIN is still not sent.
         *
         * The visible declinations are that sometimes
         * we enter time-wait state, when it is not required really
         * (harmless), do not send active resets, when they are
         * required by specs (TCP_ESTABLISHED, TCP_CLOSE_WAIT, when
         * they look as CLOSING or LAST_ACK for Linux)
         * Probably, I missed some more holelets.
         *                         --ANK
         * XXX (TFO) - To start off we don't support SYN+ACK+FIN
         * in a single packet! (May consider it later but will
         * probably need API support or TCP_CORK SYN-ACK until
         * data is written and socket is closed.)
         */
        /* Send FIN */
        tcp_send_fin(sk);
    }

    /* Wait for the close: until nothing is left to send or
     * sk_lingertime expires */
    sk_stream_wait_close(sk, timeout);

adjudge_to_death:
    /* Close the socket and release resources */
    /* ... */
}
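The zero-linger branch of tcp_close can be observed from user space. The sketch below is an illustration under stated assumptions rather than a definitive test: it connects two sockets over the loopback interface, enables SO_LINGER with l_linger set to 0 on one side, closes it, and reports what the peer's recv() sees. The function name abortive_close_peer_errno is hypothetical. On Linux the peer would be expected to fail with ECONNRESET, because the connection is reset instead of being closed with a FIN.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper. Returns the errno seen by the peer's recv() after
 * an abortive close, 0 if the peer saw a clean EOF, -1 if setup failed. */
int abortive_close_peer_errno(void)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    struct linger lg = { .l_onoff = 1, .l_linger = 0 };
    int lfd = -1, cfd = -1, afd = -1;
    int result = -1;
    char byte;
    ssize_t n;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel choose a free port */

    /* Listening socket on 127.0.0.1; getsockname() fills in the port. */
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 ||
        bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0)
        goto out;

    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0 ||
        connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    afd = accept(lfd, NULL, NULL);
    if (afd < 0)
        goto out;

    /* l_onoff = 1, l_linger = 0: SOCK_LINGER set with a zero lingertime. */
    if (setsockopt(cfd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg)) < 0)
        goto out;
    close(cfd);
    cfd = -1;

    /* Blocks until the peer's FIN or RST has been processed. */
    n = recv(afd, &byte, 1, 0);
    result = (n < 0) ? errno : 0;

out:
    if (afd >= 0)
        close(afd);
    if (cfd >= 0)
        close(cfd);
    if (lfd >= 0)
        close(lfd);
    return result;
}
```

Removing the setsockopt() call turns this into an ordinary close, in which case recv() would be expected to return 0 (end of stream) and the helper returns 0.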

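The sk_stream_closing function below checks the connection state. If the state is one of TCPF_FIN_WAIT1 | TCPF_CLOSING | TCPF_LAST_ACK, there is still data to send, so it returns a nonzero value and the wait continues. sk_stream_wait_close returns once the connection has left those states, a signal is pending, or the lingertime has expired.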

/**
 * sk_stream_closing - Return 1 if we still have things to send in our buffers.
 * @sk: socket to verify
 */
static inline int sk_stream_closing(struct sock *sk)
{
    return (1 << sk->sk_state) &
           (TCPF_FIN_WAIT1 | TCPF_CLOSING | TCPF_LAST_ACK);
}

void sk_stream_wait_close(struct sock *sk, long timeout)
{
    if (timeout) {
        DEFINE_WAIT_FUNC(wait, woken_wake_function);

        add_wait_queue(sk_sleep(sk), &wait);

        do {
            if (sk_wait_event(sk, &timeout, !sk_stream_closing(sk), &wait))
                break;
        } while (!signal_pending(current) && timeout);

        remove_wait_queue(sk_sleep(sk), &wait);
    }
}
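The bitmask test in sk_stream_closing can be reproduced in user space. The following is a small model, not kernel code: the ST_* names and the STF() macro are hypothetical stand-ins for the kernel's TCP_* state numbers and TCPF_* flags, with the state values assumed to match the kernel's enumeration (ESTABLISHED is 1, CLOSING is 11).

```c
/* State numbers, assumed to mirror the kernel's TCP state enumeration. */
enum {
    ST_ESTABLISHED = 1,
    ST_SYN_SENT,
    ST_SYN_RECV,
    ST_FIN_WAIT1,
    ST_FIN_WAIT2,
    ST_TIME_WAIT,
    ST_CLOSE,
    ST_CLOSE_WAIT,
    ST_LAST_ACK,
    ST_LISTEN,
    ST_CLOSING
};

/* One bit per state, like the kernel's TCPF_* flags. */
#define STF(state) (1u << (state))

/* Model of sk_stream_closing(): 1 while our FIN is still unacknowledged
 * (FIN_WAIT1, CLOSING, LAST_ACK), 0 in every other state. */
int stream_closing(int state)
{
    unsigned int waiting = STF(ST_FIN_WAIT1) | STF(ST_CLOSING) |
                           STF(ST_LAST_ACK);

    return (STF(state) & waiting) != 0;
}
```

In this model stream_closing(ST_FIN_WAIT1) is 1 while stream_closing(ST_FIN_WAIT2) is 0, which is why the linger wait ends as soon as the peer has acknowledged the FIN.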

TCP_LINGER2

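The FIN_WAIT_2 timer is started in two places whose logic is nearly identical, so only one of them is described here. In tcp_close, if the state is FIN_WAIT2, the linger2 setting is examined further. As shown below, when linger2 < 0 the connection is moved straight to CLOSE and an RST is sent. When linger2 >= 0, the FIN_WAIT_2 timeout returned by tcp_fin_time is compared with the TIME_WAIT duration TCP_TIMEWAIT_LEN. If the timeout is greater than TCP_TIMEWAIT_LEN, the FIN_WAIT_2 timer (implemented on top of the keepalive timer) is started with the difference between the two as its timeout. Otherwise, when the timeout is less than or equal to TCP_TIMEWAIT_LEN, the socket goes directly into TIME_WAIT with FIN_WAIT2 as its substate; in effect the TIME_WAIT control block takes over and handles the rest. The detailed processing will be covered in a later article.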

void tcp_close(struct sock *sk, long timeout)
{
    /* ... */
    if (sk->sk_state == TCP_FIN_WAIT2) {
        struct tcp_sock *tp = tcp_sk(sk);

        if (tp->linger2 < 0) {
            /* linger2 is negative: do not wait at all */

            /* Move to CLOSE */
            tcp_set_state(sk, TCP_CLOSE);
            /* Send RST */
            tcp_send_active_reset(sk, GFP_ATOMIC);
            __NET_INC_STATS(sock_net(sk),
                    LINUX_MIB_TCPABORTONLINGER);
        } else {
            /* Get the FIN_WAIT_2 timeout */
            const int tmo = tcp_fin_time(sk);

            if (tmo > TCP_TIMEWAIT_LEN) {
                /* FIN_WAIT_2 timeout > TIME_WAIT duration:
                 * arm the FIN_WAIT_2 timer */
                inet_csk_reset_keepalive_timer(sk,
                        tmo - TCP_TIMEWAIT_LEN);
            } else {
                /* Not greater than the TIME_WAIT duration:
                 * enter TIME_WAIT */
                tcp_time_wait(sk, TCP_FIN_WAIT2, tmo);
                goto out;
            }
        }
    }

    /* ... */
}
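The three outcomes of the FIN_WAIT2 branch can be summarized as a small pure function. This is a user-space model with hypothetical names (fw2_action, fw2_decision, fin_wait2_decision); it takes the already computed timeout and the TIME_WAIT length as plain integers in the same unit, whereas the kernel works in jiffies and TCP_TIMEWAIT_LEN corresponds to 60 seconds.

```c
/* Hypothetical model of the FIN_WAIT2 branch in tcp_close(). */
enum fw2_action {
    FW2_RESET,     /* linger2 < 0: go to CLOSE and send RST          */
    FW2_TIMER,     /* tmo > TIME_WAIT length: arm the FIN_WAIT_2 timer */
    FW2_TIME_WAIT  /* otherwise: hand over to a TIME_WAIT socket      */
};

struct fw2_decision {
    enum fw2_action action;
    int timeout;   /* timer value or TIME_WAIT timeout, 0 for a reset */
};

struct fw2_decision fin_wait2_decision(int linger2, int tmo,
                                       int timewait_len)
{
    struct fw2_decision d = { FW2_RESET, 0 };

    if (linger2 < 0)
        return d;                       /* abort immediately */

    if (tmo > timewait_len) {
        d.action = FW2_TIMER;
        d.timeout = tmo - timewait_len; /* the remainder is spent in TIME_WAIT */
    } else {
        d.action = FW2_TIME_WAIT;
        d.timeout = tmo;
    }
    return d;
}
```

For example, with a TIME_WAIT length of 60, a timeout of 90 arms the timer for 30, while a timeout of 60 or less goes straight to TIME_WAIT with that timeout.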

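tcp_fin_time returns the linger2 time configured through the option, falling back to the system-wide tcp_fin_timeout when linger2 is 0 (not configured). The result is never smaller than 3.5 times the current RTO, which is what (rto << 2) - (rto >> 1) computes.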

static inline int tcp_fin_time(const struct sock *sk)
{
    int fin_timeout = tcp_sk(sk)->linger2 ? : sock_net(sk)->ipv4.sysctl_tcp_fin_timeout;
    const int rto = inet_csk(sk)->icsk_rto;

    if (fin_timeout < (rto << 2) - (rto >> 1))
        fin_timeout = (rto << 2) - (rto >> 1);

    return fin_timeout;
}
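The arithmetic in tcp_fin_time can be checked with a user-space model. The function name fin_time and its plain long parameters are hypothetical; all three values are assumed to be in the same time unit (the kernel uses jiffies). The model also spells out the GNU "?:" shorthand from the kernel code as an ordinary conditional.

```c
/* Hypothetical model of tcp_fin_time(): prefer the per-socket linger2,
 * fall back to the system-wide value when linger2 is 0, and never go
 * below 3.5 * rto. */
long fin_time(long linger2, long sysctl_fin_timeout, long rto)
{
    long fin_timeout = linger2 ? linger2 : sysctl_fin_timeout;
    long min_timeout = (rto << 2) - (rto >> 1); /* 4*rto - rto/2 */

    if (fin_timeout < min_timeout)
        fin_timeout = min_timeout;
    return fin_timeout;
}
```

With values in milliseconds, fin_time(0, 60000, 200) yields 60000 (the system default), fin_time(10000, 60000, 200) yields 10000 (the per-socket setting), and fin_time(1000, 60000, 1000) yields 3500 because the RTO-based lower bound applies.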

