
      YUBO.ORG: ipvs at Runtime

       mrjbydd 2011-01-11


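      How the ipvs rules work

      How do ipvs rules take effect? Let's start with the principle behind the implementation.

      Put simply, ipvs rewrites packet header information to dispatch traffic along the path client -> virtual server -> real server. The goal of this scheduling is to keep the load across the real servers close to balanced. That raises two questions: how the packets are modified, and which scheduling policy is applied.

      Let's look first at how the packets are modified. The implementation in the 2.6 kernel differs somewhat from the earlier ones. To quote Wensong Zhang, the author of ipvs: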

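      We modified the TCP/IP stack in Linux kernels 2.0 and 2.2 to intercept and rewrite/forward IP packets
      at the IP layer, implemented three IP load-balancing techniques, and provided the ipvsadm program for
      configuring and managing virtual servers. In Linux kernels 2.4 and 2.6 we implemented it as a NetFilter
      module; much of the code was rewritten and further optimized. The current version has been released
      online and, according to feedback, is already fairly stable.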

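      That settles it: ipvs relies on netfilter to modify packets. A brief look at how netfilter works is therefore worthwhile.

      [Figure: packet flow through the five netfilter chains]

      netfilter has five rule chains. Each chain can hold a number of rules, and the rules are ordered (that is, prioritized); once a rule matches and its action has been carried out, processing leaves that chain. The five chains are PREROUTING, INPUT, FORWARD, OUTPUT and POSTROUTING. Connections on a machine fall into three cases: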

      • Connections entering the host from outside pass through PREROUTING -> INPUT
      • Connections leaving the host pass through OUTPUT -> POSTROUTING
      • Connections forwarded by the host pass through PREROUTING -> FORWARD -> POSTROUTING
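The three cases above can be written down as a small lookup. This is only a user-space sketch; the enum and function names are made up for illustration and are not kernel identifiers (the kernel's hook constants are the NF_INET_* values that appear later in this article).

```c
#include <stddef.h>

/* Hypothetical names for this sketch; the kernel's own constants are NF_INET_*. */
enum chain {
    CHAIN_PREROUTING, CHAIN_INPUT, CHAIN_FORWARD, CHAIN_OUTPUT, CHAIN_POSTROUTING
};
enum flow { FLOW_INBOUND, FLOW_OUTBOUND, FLOW_FORWARDED };

/* Writes the chains a packet of the given flow traverses into path[]
 * (room for 3 entries) and returns how many entries were written. */
size_t chain_path(enum flow f, enum chain path[3])
{
    switch (f) {
    case FLOW_INBOUND:                      /* from outside to this host */
        path[0] = CHAIN_PREROUTING;
        path[1] = CHAIN_INPUT;
        return 2;
    case FLOW_OUTBOUND:                     /* generated by this host */
        path[0] = CHAIN_OUTPUT;
        path[1] = CHAIN_POSTROUTING;
        return 2;
    case FLOW_FORWARDED:                    /* routed through this host */
        path[0] = CHAIN_PREROUTING;
        path[1] = CHAIN_FORWARD;
        path[2] = CHAIN_POSTROUTING;
        return 3;
    }
    return 0;
}
```

Note that only forwarded traffic touches FORWARD, which is why the reply-side ipvs hooks discussed later sit there.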

      The rules in each chain take effect when data passes through that chain (that is, the corresponding functions are called to process it). It looks simple enough: ipvs, as a netfilter module, only has to write its rules into these chains.

      But wait. If netfilter has many modules and they all write rules into the same chain, won't things get messy? How is priority controlled? For that reason the rules in a chain are grouped by purpose, and each group carries an integer priority: the smaller the number, the higher the priority. Rules of the same type are ordered by their position (the structure is a linked list; the closer to the head, the higher the priority).
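A minimal sketch of that ordering rule, assuming a singly linked list kept in ascending priority order where a newcomer with an equal priority is placed behind the entries already present. The struct and function are hypothetical stand-ins, not the kernel's nf_hook_ops or its registration code.

```c
#include <stddef.h>

/* Hypothetical miniature of a hook entry; not the kernel's struct nf_hook_ops. */
struct hook_entry {
    const char *name;
    int priority;               /* smaller value = runs earlier */
    struct hook_entry *next;
};

/* Inserts e into the list at *head, keeping ascending priority order.
 * An entry with an equal priority goes after the ones already present,
 * so earlier registrations keep precedence. */
void hook_insert(struct hook_entry **head, struct hook_entry *e)
{
    struct hook_entry **pos = head;

    /* Skip every entry that must run before (or tie with) the new one. */
    while (*pos != NULL && (*pos)->priority <= e->priority)
        pos = &(*pos)->next;
    e->next = *pos;
    *pos = e;
}
```

Registering entries with priorities 100, 99 and 0 in that order leaves the list as 0, 99, 100, regardless of the registration order.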

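      netfilter itself serves three purposes, so its rules come in three types, represented by three tables: the filter table (filtering), the nat table (address translation, i.e. rewriting packet headers) and the mangle table (modifying packets). The ipvs module behaves as if it had added a new ipvs table to netfilter. For more on netfilter, see reference 1.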


      How the ipvs rules are carried out

      Whenever a packet (for example the first packet of a new connection) passes through a netfilter chain, NF_HOOK() is called. It accesses a global variable, nf_hooks, which stores all of netfilter's tables (filter, nat, mangle, the additional ipvs table and so on), the chains of each table, and the functions to call in each chain. The entries for the relevant chain in nf_hooks are then walked and the functions are called in priority order; each function matches and processes the packet according to the rules of that chain.
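The walk over one chain can be modelled in a few lines. The sketch below is a user-space simplification under stated assumptions: the verdict names mirror netfilter's NF_ACCEPT, NF_DROP, NF_STOLEN and NF_STOP, but the enum values, the toy packet and both toy hooks are invented for illustration.

```c
#include <stddef.h>

/* Verdict names mirror netfilter's, but the numeric values here are arbitrary. */
enum verdict { V_ACCEPT, V_DROP, V_STOLEN, V_STOP };

typedef enum verdict (*hook_fn)(void *pkt);

/* Calls the hooks of one chain in order (assumed already sorted by priority).
 * V_ACCEPT moves on to the next hook; any other verdict ends the walk.
 * Returns the final verdict; *called reports how many hooks ran. */
enum verdict run_chain(const hook_fn *hooks, size_t n, void *pkt, size_t *called)
{
    enum verdict v = V_ACCEPT;
    size_t i;

    for (i = 0; i < n; i++) {
        v = hooks[i](pkt);
        if (v != V_ACCEPT) {
            i++;                /* count the hook that ended the walk */
            break;
        }
    }
    *called = i;
    return v;
}

/* Two toy hooks: one accepts everything, the other stops the walk when the
 * packet carries a mark (compare the ipvs_property flag used later on). */
struct toy_pkt { int marked; };

enum verdict accept_all(void *pkt)
{
    (void)pkt;
    return V_ACCEPT;
}

enum verdict stop_if_marked(void *pkt)
{
    return ((struct toy_pkt *)pkt)->marked ? V_STOP : V_ACCEPT;
}
```

With the chain { accept_all, stop_if_marked, accept_all }, an unmarked packet visits all three hooks, while a marked one leaves after the second.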

      Remember nf_register_hook() from the end of the previous part? It is precisely because ipvs calls ret = nf_register_hooks(ip_vs_ops, ARRAY_SIZE(ip_vs_ops)); to add entries to nf_hooks that netfilter ends up executing the ipvs rules. Let's see what exactly gets added.

      The contents of ip_vs_ops are:

      
                  

      net/ipv4/ipvs/ip_vs_core.c

      static struct nf_hook_ops ip_vs_ops[] __read_mostly = {
          /* After packet filtering, forward packet through VS/DR, VS/TUN,
           * or VS/NAT(change destination), so that filtering rules can be
           * applied to IPVS. */
          {
              .hook = ip_vs_in, // handler: every packet traversing the INPUT chain is passed to ip_vs_in() for matching and processing
              .owner = THIS_MODULE, // owning module
              .pf = PF_INET, // protocol family; normally IPv4 (PF_INET)
              .hooknum = NF_INET_LOCAL_IN, // hook number: the INPUT chain
              .priority = 100, // priority
          },
          /* After packet filtering, change source only for VS/NAT */
          {
              .hook = ip_vs_out, // packets traversing FORWARD are handled by ip_vs_out()
              .owner = THIS_MODULE,
              .pf = PF_INET,
              .hooknum = NF_INET_FORWARD,
              .priority = 100,
          },
          /* After packet filtering (but before ip_vs_out_icmp), catch icmp
           * destined for 0.0.0.0/0, which is for incoming IPVS connections */
          {
              .hook = ip_vs_forward_icmp, // packets traversing FORWARD are handled by ip_vs_forward_icmp()
              .owner = THIS_MODULE,
              .pf = PF_INET,
              .hooknum = NF_INET_FORWARD,
              .priority = 99,
          },
          /* Before the netfilter connection tracking, exit from POST_ROUTING */
          {
              .hook = ip_vs_post_routing, // packets traversing POSTROUTING are handled by ip_vs_post_routing()
              .owner = THIS_MODULE,
              .pf = PF_INET,
              .hooknum = NF_INET_POST_ROUTING,
              .priority = NF_IP_PRI_NAT_SRC-1,
          },
      };

      As you can see, ipvs adds four handler functions across three chains: INPUT, FORWARD and POSTROUTING. Let's go through them one by one.


      ip_vs_in()

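      ip_vs_in() sits on the INPUT chain and inspects every packet entering this host. Its job is to hand connections addressed to the vs (virtual server) over to an rs (real server), which is how load balancing is achieved. Which rs is picked depends on the scheduling algorithm chosen at configuration time; how the packet is modified depends on the VS type, of which there are three: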

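      • VS/NAT rewrites d_addr and possibly d_port of the incoming packet; s_addr is rewritten on the replies, which is what ip_vs_out() does
      • VS/DR leaves the IP header untouched and rewrites only the destination MAC address of the frame, so that it reaches the chosen rs
      • VS/TUN prepends a new IP header to the original packet, also known as encapsulation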
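To make the difference between the three types concrete, here is a toy user-space model of which fields each one touches on the request path. Everything in it is an assumption made for illustration: the flat toy_packet struct, the toy_real_server struct and the fwd_* helpers do not exist in ipvs, which works on sk_buffs through its ip_vs_*_xmit functions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flat view of the relevant fields; real packets live in an
 * sk_buff and carry addresses in network byte order. */
struct toy_packet {
    uint8_t  dst_mac[6];
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    int      encapsulated;              /* 1 once an outer IP header was added */
    uint32_t outer_saddr, outer_daddr;
};

struct toy_real_server {
    uint8_t  mac[6];
    uint32_t addr;
    uint16_t port;
};

/* VS/NAT, request direction: retarget the packet at the real server. */
void fwd_nat(struct toy_packet *p, const struct toy_real_server *rs)
{
    p->daddr = rs->addr;
    p->dport = rs->port;
}

/* VS/DR: only the link-layer destination changes; the IP header is kept. */
void fwd_dr(struct toy_packet *p, const struct toy_real_server *rs)
{
    memcpy(p->dst_mac, rs->mac, sizeof p->dst_mac);
}

/* VS/TUN: keep the original packet and wrap it in a new outer header. */
void fwd_tun(struct toy_packet *p, const struct toy_real_server *rs,
             uint32_t director_addr)
{
    p->encapsulated = 1;
    p->outer_saddr = director_addr;
    p->outer_daddr = rs->addr;
}
```

In the DR and TUN cases the inner destination address is still the virtual address, which is why the real servers in those setups must also accept traffic for it.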

      This function processes all data whose destination is this host (the director). From the skb (sk_buff) it derives the protocol structure pp (ip_vs_protocol), determines which skbs match a virtual service rule svc (ip_vs_service), and looks up the corresponding cp (ip_vs_conn); if none is found, a new cp is created and added to the ip_vs_conn_tab table. Finally the data is sent on through cp->packet_xmit(). Along the way a number of things have to be updated, such as the connection state and the counters of pp, cp and skb.

      
                  

      net/ipv4/ipvs/ip_vs_core.c

      /*
       * Check if it's for virtual services, look it up,
       * and send it on its way...
       */
      // hooknum: hook (chain) number; skb: the packet's socket buffer;
      // in: device the packet arrived on; out: device it will leave on (if known);
      // okfn: function pointer taking the sk_buff, essentially unused here
      static unsigned int
      ip_vs_in(unsigned int hooknum, struct sk_buff *skb,
               const struct net_device *in, const struct net_device *out,
               int (*okfn)(struct sk_buff *))
      {
          struct iphdr *iph;
          struct ip_vs_protocol *pp;
          struct ip_vs_conn *cp;
          int ret, restart;
          int ihl;

          /*
           * Big tappo: only PACKET_HOST (neither loopback nor mcasts)
           * ... don't know why 1st test DOES NOT include 2nd (?)
           */
          if (unlikely(skb->pkt_type != PACKET_HOST // packet is not addressed to this host (PACKET_HOST)
                       || skb->dev->flags & IFF_LOOPBACK || skb->sk)) { // or it is on the lo device, or it already belongs to a local socket
              IP_VS_DBG(12, "packet type=%d proto=%d daddr=%d.%d.%d.%d ignored\n",
                        skb->pkt_type,
                        ip_hdr(skb)->protocol,
                        NIPQUAD(ip_hdr(skb)->daddr)); // log it via IP_VS_DBG
              return NF_ACCEPT; // return NF_ACCEPT at once (the next hook function takes over)
          } // on a director these cases are rare, hence unlikely(), a macro around gcc's __builtin_expect() used as a branch hint

          iph = ip_hdr(skb); // IP header of the packet
          if (unlikely(iph->protocol == IPPROTO_ICMP)) { // the packet is ICMP
              int related, verdict = ip_vs_in_icmp(skb, &related, hooknum); // let ip_vs_in_icmp() handle it

              if (related) // it relates to an ipvs connection
                  return verdict; // leave with the verdict of ip_vs_in_icmp()
              iph = ip_hdr(skb); // otherwise re-read the header pointer (ip_hdr() derives it from an offset, and the call above may have changed the skb data)
          }

          /* Protocol supported? */
          pp = ip_vs_proto_get(iph->protocol); // a protocol ipvs does not know is let through
          if (unlikely(!pp))
              return NF_ACCEPT;

          ihl = iph->ihl << 2; // iph->ihl counts 4-byte words, so convert it to bytes

          /*
           * Check if the packet belongs to an existing connection entry
           */
          cp = pp->conn_in_get(skb, pp, iph, ihl, 0); // does the connection already exist? cp is its connection entry

          if (unlikely(!cp)) { // not found in ip_vs_conn_tab, i.e. this is the first packet of a connection to the vs
              int v;

              // The protocol's conn_schedule() picks a suitable rs for the skb,
              // builds a new cp from skb and pp and adds it to ip_vs_conn_tab.
              // For how the rs is chosen see the protocol's conn_schedule(),
              // e.g. tcp_conn_schedule().
              if (!pp->conn_schedule(skb, pp, &v, &cp))
                  return v; // scheduling already decided the packet's fate: return that verdict
          }

          if (unlikely(!cp)) { // still no connection entry: log a debug message and let the packet continue
              /* sorry, all this trouble for a no-hit :) */
              IP_VS_DBG_PKT(12, pp, skb, 0,
                            "packet continues traversal as normal");
              return NF_ACCEPT;
          }

          IP_VS_DBG_PKT(11, pp, skb, 0, "Incoming packet");

          /* Check the server status */
          if (cp->dest && !(cp->dest->flags & IP_VS_DEST_F_AVAILABLE)) { // the destination server is unavailable
              /* the destination server is not available */

              if (sysctl_ip_vs_expire_nodest_conn) { // make cp expire right away
                  /* try to expire the connection immediately */
                  ip_vs_conn_expire_now(cp);
              }
              /* don't restart its timer, and silently
                 drop the packet. */
              __ip_vs_conn_put(cp); // drop the reference on cp
              return NF_DROP;
          }

          ip_vs_in_stats(cp, skb); // update the packet and byte counters
          restart = ip_vs_set_state(cp, IP_VS_DIR_INPUT, skb, pp); // update the connection state for the IP_VS_DIR_INPUT direction
          // packet_xmit() sends the data on. It was chosen by ip_vs_bind_xmit(cp)
          // when cp was created, according to the forwarding method of the rs; one of
          // ip_vs_nat_xmit, ip_vs_tunnel_xmit, ip_vs_dr_xmit, ip_vs_null_xmit, ip_vs_bypass_xmit.
          if (cp->packet_xmit)
              ret = cp->packet_xmit(skb, cp, pp);
              /* do not touch skb anymore */
          else {
              IP_VS_DBG_RL("warning: packet_xmit is null");
              ret = NF_ACCEPT;
          }

          /* Increase its packet counter and check if it is needed
           * to be synchronized
           *
           * Sync connection if it is about to close to
           * encorage the standby servers to update the connections timeout
           */
          atomic_inc(&cp->in_pkts); // packet counter
          if ((ip_vs_sync_state & IP_VS_STATE_MASTER) &&
              (((cp->protocol != IPPROTO_TCP ||
                 cp->state == IP_VS_TCP_S_ESTABLISHED) &&
                (atomic_read(&cp->in_pkts) % sysctl_ip_vs_sync_threshold[1]
                 == sysctl_ip_vs_sync_threshold[0])) ||
               ((cp->protocol == IPPROTO_TCP) && (cp->old_state != cp->state) &&
                ((cp->state == IP_VS_TCP_S_FIN_WAIT) ||
                 (cp->state == IP_VS_TCP_S_CLOSE)))))
              ip_vs_sync_conn(cp); // append the ip_vs_conn data to sync_buff, used to synchronize state between directors
          cp->old_state = cp->state;

          ip_vs_conn_put(cp); // release cp
          return ret;
      }
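The long condition guarding ip_vs_sync_conn() at the end of the listing is easier to read when split into its two reasons for syncing. The following user-space sketch restates it with stand-in types; conn_view, the ST_* states and should_sync() are hypothetical names, and the two threshold arguments play the roles of sysctl_ip_vs_sync_threshold[0] and [1].

```c
#include <stdbool.h>

/* Stand-in constants; the kernel uses IPPROTO_TCP and IP_VS_TCP_S_*. */
enum { PROTO_TCP = 6, PROTO_UDP = 17 };
enum tcp_state { ST_ESTABLISHED, ST_FIN_WAIT, ST_CLOSE, ST_OTHER };

struct conn_view {
    int proto;
    enum tcp_state state, old_state;
    unsigned in_pkts;           /* value after the increment for this packet */
};

/* sync_at corresponds to threshold[0], period to threshold[1].
 * period must be non-zero. */
bool should_sync(bool is_master, const struct conn_view *c,
                 unsigned sync_at, unsigned period)
{
    bool periodic, closing;

    if (!is_master)             /* only the master director sends sync data */
        return false;

    /* Reason 1: non-TCP, or established TCP, once per 'period' packets. */
    periodic = (c->proto != PROTO_TCP || c->state == ST_ESTABLISHED) &&
               (c->in_pkts % period == sync_at);

    /* Reason 2: a TCP connection that just moved to FIN_WAIT or CLOSE. */
    closing = c->proto == PROTO_TCP && c->old_state != c->state &&
              (c->state == ST_FIN_WAIT || c->state == ST_CLOSE);

    return periodic || closing;
}
```

With example values sync_at = 3 and period = 50, an established TCP connection is synced on packets 3, 53, 103 and so on, plus once more when it starts closing.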

      ip_vs_out()

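      This function sits on the FORWARD chain, so every skb forwarded by this host is handled by it. In VS/NAT mode, the data that an rs on the internal network returns to the client is forwarded through the gateway (this host). At that point the source address of the packet has to be rewritten to the gateway's public IP address (the virtual service address) for the connection to carry on; otherwise the client would receive replies from an internal rs address that it never contacted and cannot reach.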

      
                  

      net/ipv4/ipvs/ip_vs_core.c

      /*
       * It is hooked at the NF_INET_FORWARD chain, used only for VS/NAT.
       * Check if outgoing packet belongs to the established ip_vs_conn,
       * rewrite addresses of the packet and send it on its way...
       */
      static unsigned int
      ip_vs_out(unsigned int hooknum, struct sk_buff *skb,
                const struct net_device *in, const struct net_device *out,
                int (*okfn)(struct sk_buff *))
      {
          struct iphdr *iph;
          struct ip_vs_protocol *pp;
          struct ip_vs_conn *cp;
          int ihl;

          EnterFunction(11); // debug

          if (skb->ipvs_property) // already modified by ipvs: let it pass
              return NF_ACCEPT;

          iph = ip_hdr(skb); // pointer to the start of the skb's network-layer header
          if (unlikely(iph->protocol == IPPROTO_ICMP)) { // ICMP data
              int related, verdict = ip_vs_out_icmp(skb, &related); // let ip_vs_out_icmp() handle it

              if (related) // it relates to an ipvs connection
                  return verdict; // return the verdict
              iph = ip_hdr(skb); // otherwise fetch iph again: the call above may have changed the skb data, which would leave the old pointer stale
          }

          pp = ip_vs_proto_get(iph->protocol); // the ipvs protocol structure pp
          if (unlikely(!pp)) // a protocol ipvs does not support is let through
              return NF_ACCEPT;

          /* reassemble IP fragments */
          if (unlikely(iph->frag_off & htons(IP_MF|IP_OFFSET) && // the skb is a fragment
                       !pp->dont_defrag)) {
              if (ip_vs_gather_frags(skb, IP_DEFRAG_VS_OUT)) // fragment was queued and the datagram is not complete yet: return NF_STOLEN so netfilter leaves the skb alone
                  return NF_STOLEN;
              iph = ip_hdr(skb); // reassembly is complete: fetch iph again from the reassembled skb
          }

          ihl = iph->ihl << 2; // header length in bytes (ihl counts 4-byte words)

          /*
           * Check if the packet belongs to an existing entry
           */
          cp = pp->conn_out_get(skb, pp, iph, ihl, 0); // is this skb the reply direction (rs -> client) of a connection (client -> rs) in ip_vs_conn_tab? If so cp (ip_vs_conn) is returned, otherwise NULL

          if (unlikely(!cp)) { // no such cp
              if (sysctl_ip_vs_nat_icmp_send && // sysctl_ip_vs_nat_icmp_send is 0 in the author's setup, so the code below only runs once that sysctl is enabled
                  (pp->protocol == IPPROTO_TCP || // the skb is TCP or UDP
                   pp->protocol == IPPROTO_UDP)) {
                  __be16 _ports[2], *pptr;

                  pptr = skb_header_pointer(skb, ihl, // fetch the skb's port information
                                            sizeof(_ports), _ports);
                  if (pptr == NULL) // no ports: let it pass
                      return NF_ACCEPT; /* Not for me */
                  if (ip_vs_lookup_real_service(iph->protocol, // use protocol/source address/source port to check whether an internal rs sent this TCP/UDP packet
                                                iph->saddr, pptr[0])) {
                      /*
                       * Notify the real server: there is no
                       * existing entry if it is not RST
                       * packet or not TCP packet.
                       */
                      // An rs sent data through this host for which no connection entry exists.
                      // Unless it is a TCP RST, answer with an ICMP port-unreachable error
                      // and drop the skb.
                      if (iph->protocol != IPPROTO_TCP
                          || !is_tcp_reset(skb)) {
                          icmp_send(skb,ICMP_DEST_UNREACH,
                                    ICMP_PORT_UNREACH, 0);
                          return NF_DROP;
                      }
                  }
              }
              IP_VS_DBG_PKT(12, pp, skb, 0,
                            "packet continues traversal as normal");
              return NF_ACCEPT; // let through traffic that does not belong to a connection in ip_vs_conn_tab, e.g. new connections opened from the internal network (real server) to the outside
          }

          IP_VS_DBG_PKT(11, pp, skb, 0, "Outgoing packet"); // debug

          if (!skb_make_writable(skb, ihl)) // the skb header cannot be made writable: jump to drop
              goto drop;

          /* mangle the packet */
          if (pp->snat_handler && !pp->snat_handler(skb, pp, cp)) // data reaching this point is rs -> client traffic from the internal to the external network whose source must be rewritten
              goto drop; // a snat_handler is defined but it failed: jump to drop
          ip_hdr(skb)->saddr = cp->vaddr; // set the source address to the virtual server's address, so the packet looks as if the vs itself had sent it
          ip_send_check(ip_hdr(skb)); // the source address changed, so the checksum must be recomputed

          /* For policy routing, packets originating from this
           * machine itself may be routed differently to packets
           * passing through. We want this packet to be routed as
           * if it came from this machine itself. So re-compute
           * the routing information.
           */
          if (ip_route_me_harder(skb, RTN_LOCAL) != 0) // refresh the routing information so the skb is routed as if sent by this host
              goto drop;

          IP_VS_DBG_PKT(10, pp, skb, 0, "After SNAT"); // debug

          ip_vs_out_stats(cp, skb); // update the packet and byte counters
          ip_vs_set_state(cp, IP_VS_DIR_OUTPUT, skb, pp); // update the connection state, flags and so on
          ip_vs_conn_put(cp); // release the reference on cp

          skb->ipvs_property = 1; // mark the skb so ipvs does not modify it again

          LeaveFunction(11); // debug
          return NF_ACCEPT; // let it pass

      drop:
          ip_vs_conn_put(cp); // release the reference on cp
          kfree_skb(skb); // free the skb
          return NF_STOLEN; // the skb was freed here, so return NF_STOLEN to keep netfilter from touching it
      }
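The pair "assign saddr, then ip_send_check()" in the listing exists because the IPv4 header checksum covers the source address. Below is a user-space sketch of that step on a raw header buffer, using the standard Internet checksum (RFC 1071). The function names are hypothetical; the kernel works on struct iphdr and uses its own optimized checksum routines.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071) over an IPv4 header given as raw bytes.
 * For computing a fresh checksum, bytes 10 and 11 must be zero on entry.
 * Returns the checksum in host order. */
uint16_t ipv4_header_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)        /* sum big-endian 16-bit words */
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    while (sum >> 16)                       /* fold carries back in */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Overwrites the source address (bytes 12..15) and refreshes the checksum,
 * mirroring the saddr assignment followed by ip_send_check() in the listing. */
void rewrite_saddr(uint8_t *hdr, size_t len, const uint8_t new_saddr[4])
{
    uint16_t csum;
    size_t i;

    for (i = 0; i < 4; i++)
        hdr[12 + i] = new_saddr[i];
    hdr[10] = hdr[11] = 0;                  /* clear the old checksum first */
    csum = ipv4_header_checksum(hdr, len);
    hdr[10] = (uint8_t)(csum >> 8);         /* store in network byte order */
    hdr[11] = (uint8_t)(csum & 0xFF);
}
```

A handy property for checking the result: running ipv4_header_checksum() over a header that already contains a correct checksum yields 0.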

      ip_vs_forward_icmp()

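      This function is on the same FORWARD chain as ip_vs_out() described above, but its priority is 99, numerically smaller (and therefore higher) than ip_vs_out()'s 100, so it runs first.

      The function is very simple: it hands every ICMP packet traversing the FORWARD chain to ip_vs_in_icmp(). Why would data meant for this host show up on the FORWARD chain? When the director is set up as a transparent device (a fwmark-based virtual service), TCP/UDP packets that would otherwise be forwarded can be marked and steered to INPUT fairly easily, but ICMP is not that simple, so some ICMP messages meant for this host end up on the FORWARD chain (the exact reason is unclear to me). ip_vs_forward_icmp() is there to catch these ICMP packets for the vs that would otherwise be missed.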

      
                  

      net/ipv4/ipvs/ip_vs_core.c

      /*
       * It is hooked at the NF_INET_FORWARD chain, in order to catch ICMP
       * related packets destined for 0.0.0.0/0.
       * When fwmark-based virtual service is used, such as transparent
       * cache cluster, TCP packets can be marked and routed to ip_vs_in,
       * but ICMP destined for 0.0.0.0/0 cannot not be easily marked and
       * sent to ip_vs_in_icmp. So, catch them at the NF_INET_FORWARD chain
       * and send them to ip_vs_in_icmp.
       */
      static unsigned int
      ip_vs_forward_icmp(unsigned int hooknum, struct sk_buff *skb,
                         const struct net_device *in, const struct net_device *out,
                         int (*okfn)(struct sk_buff *))
      {
          int r;

          if (ip_hdr(skb)->protocol != IPPROTO_ICMP) // not ICMP: let it pass
              return NF_ACCEPT;

          return ip_vs_in_icmp(skb, &r, hooknum); // ICMP: process it
      }

      ip_vs_post_routing()

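      This function's priority is NF_IP_PRI_NAT_SRC-1, so on POSTROUTING it runs just before netfilter's source NAT (mangle, whose priority value is lower still, has already run by then). The purpose is to keep packets that ipvs has already rewritten from being modified again by netfilter. It works like this: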

      
                  

      net/ipv4/ipvs/ip_vs_core.c

      /*
       * It is hooked before NF_IP_PRI_NAT_SRC at the NF_INET_POST_ROUTING
       * chain, and is used for VS/NAT.
       * It detects packets for VS/NAT connections and sends the packets
       * immediately. This can avoid that iptable_nat mangles the packets
       * for VS/NAT.
       */
      static unsigned int ip_vs_post_routing(unsigned int hooknum,
                                             struct sk_buff *skb,
                                             const struct net_device *in,
                                             const struct net_device *out,
                                             int (*okfn)(struct sk_buff *))
      {
          if (!skb->ipvs_property) // no ipvs mark on the skb: let it pass so netfilter carries on as usual
              return NF_ACCEPT;
          /* The packet was sent from IPVS, exit this chain */
          return NF_STOP; // otherwise return NF_STOP: netfilter accepts the packet and leaves this chain without running the remaining hooks
      }
