| 
 | 1 | +---  | 
 | 2 | +layout: home  | 
 | 3 | +title: Deployment behind a load balancer  | 
 | 4 | +nav_order: 12  | 
 | 5 | +nav_titles: true  | 
 | 6 | +titles_max_depth: 2  | 
 | 7 | +---  | 
 | 8 | + | 
 | 9 | +## Overview  | 
 | 10 | + | 
 | 11 | +Supporting MPTCP on the server side is easy when services are directly exposed  | 
 | 12 | +to the Internet: it's generally just a matter of enabling MPTCP support in the  | 
 | 13 | +[applications](apps.html), or  | 
 | 14 | +[forcing them to use it](setup.html#force-applications-to-use-mptcp).  | 
 | 15 | + | 
 | 16 | +When services are exposed behind an L4 load balancer, it is important to make  | 
 | 17 | +sure the additional subflows reach the same end-server, and not another one  | 
 | 18 | +sharing the same public IP (ECMP or Anycast servers).  | 
 | 19 | + | 
 | 20 | +### First path: no change  | 
 | 21 | + | 
 | 22 | +Creating the first subflow (or *path*) is easy: on the wire, this MPTCP subflow  | 
 | 23 | +is seen as a TCP connection with extra TCP options. This means nothing needs to  | 
 | 24 | +be modified.  | 
 | 25 | + | 
 | 26 | +```mermaid  | 
 | 27 | +flowchart LR  | 
 | 28 | +    C("fa:fa-mobile<br />Client") == Initial subflow ==> LB{"fa:fa-cloud<br />Load Balancer"}  | 
 | 29 | +    LB -.-> S1["fa:fa-server<br />Server 1"]  | 
 | 30 | +    LB == Initial subflow ==> S2["fa:fa-server<br />Server 2"]  | 
 | 31 | +    LB -.-> S3["fa:fa-server<br />Server 3"]  | 
 | 32 | + | 
 | 33 | +    linkStyle 0 stroke:green,fill:none  | 
 | 34 | +    linkStyle 2 stroke:green,fill:none  | 
 | 35 | +```  | 
 | 36 | + | 
 | 37 | +### Extra paths: static redirection  | 
 | 38 | + | 
 | 39 | +The extra subflows need to reach the same end-server. Such subflows will have  | 
 | 40 | +different source IP addresses and/or ports. A stateless L4 load-balancer needs  | 
 | 41 | +extra information to pick the same end-server as the one which accepted the  | 
 | 42 | +initial subflow.  | 
 | 43 | + | 
 | 44 | +```mermaid  | 
 | 45 | +flowchart LR  | 
 | 46 | +    C("fa:fa-mobile<br />Client") -- Initial subflow --> LB{"fa:fa-cloud<br />Load Balancer"}  | 
 | 47 | +    C == Second subflow ==> LB  | 
 | 48 | +    LB -.-> S1["fa:fa-server<br />Server 1"]  | 
 | 49 | +    LB -- Initial subflow --> S2["fa:fa-server<br />Server 2"]  | 
 | 50 | +    LB == Second subflow ==> S2  | 
 | 51 | +    LB -.-> S3["fa:fa-server<br />Server 3"]  | 
 | 52 | + | 
 | 53 | +    linkStyle 0 stroke:green,fill:none  | 
 | 54 | +    linkStyle 1 stroke:orange,fill:none  | 
 | 55 | +    linkStyle 3 stroke:green,fill:none  | 
 | 56 | +    linkStyle 4 stroke:orange,fill:none  | 
 | 57 | +```  | 
 | 58 | + | 
 | 59 | +If the extra subflows connect to the same destination IP address and port as the  | 
 | 60 | +initial subflow, a stateless L4 load-balancer cannot pick the right server.  | 
 | 61 | + | 
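To illustrate the problem, here is a minimal Python sketch of a stateless load balancer that picks a backend by hashing the TCP 4-tuple. The function name `pick_backend`, the server names, and the documentation-range IP addresses are hypothetical, and SHA-256 is used here only to get a deterministic hash; real load balancers typically use other hash functions.

```python
import hashlib


def pick_backend(src_ip, src_port, dst_ip, dst_port, backends):
    """Stateless selection: hash the TCP 4-tuple and index into the backend list."""
    flow = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(flow).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


# Hypothetical backends sharing the same public IP address.
backends = ["server-1", "server-2", "server-3"]

# Initial subflow, e.g. from the client's Wi-Fi interface.
initial = pick_backend("198.51.100.7", 40000, "203.0.113.10", 443, backends)

# Extra subflow, e.g. from the client's cellular interface: the source address
# and port differ, so the hash input differs too.
extra = pick_backend("192.0.2.55", 51000, "203.0.113.10", 443, backends)
```

The two results are computed independently: nothing in the 4-tuple ties the extra subflow to the server that accepted the initial one, so `initial` and `extra` may differ.
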
 | 62 | +### Solution  | 
 | 63 | + | 
 | 64 | +The [MPTCP protocol](https://www.rfc-editor.org/rfc/rfc8684.html) suggests  | 
 | 65 | +handling this case as follows:  | 
 | 66 | +- A server behind an L4 load-balancer should mention in its replies to MPTCP  | 
 | 67 | +  connection requests (`MP_CAPABLE`) that *it will not accept additional MPTCP  | 
 | 68 | +  subflows to the same IP address and port* (via the `C-flag`).  | 
 | 69 | +- Additionally, such a server should announce an extra address (`ADD_ADDR`) with  | 
 | 70 | +  an IPv4/IPv6 address and/or port that are specific to this server.  | 
 | 71 | +- An L4 load-balancer should route traffic to this specific IP and/or port to  | 
 | 72 | +  the right server.  | 
 | 73 | + | 
 | 74 | +On Linux, this means that each server should:  | 
 | 75 | +- set the [`net.mptcp.allow_join_initial_addr_port`](https://docs.kernel.org/networking/mptcp-sysctl.html)  | 
 | 76 | +  sysctl knob to `0`  | 
 | 77 | +- add a `signal` MPTCP endpoint with a dedicated IP address and/or port:  | 
 | 78 | +  ```  | 
 | 79 | +  ip mptcp endpoint add <public IP address> dev <interface> [ port NR ] signal  | 
 | 80 | +  ```  | 
 | 81 | + | 
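The load-balancer side of this setup can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `DEDICATED`, `BACKENDS`, and `route`, the addresses, and the hash-based fallback are all hypothetical, and an actual load balancer would express the same idea with its own rule syntax.

```python
import hashlib

# Hypothetical backends sharing the public address 203.0.113.10.
BACKENDS = ["server-1", "server-2", "server-3"]

# Hypothetical per-server addresses, i.e. the ones each server would announce
# with ADD_ADDR through its `signal` endpoint.
DEDICATED = {
    ("203.0.113.11", 443): "server-1",
    ("203.0.113.12", 443): "server-2",
    ("203.0.113.13", 443): "server-3",
}


def route(src_ip, src_port, dst_ip, dst_port):
    """Send traffic for a dedicated address to its server, hash everything else."""
    server = DEDICATED.get((dst_ip, dst_port))
    if server is not None:
        # Static rule: the destination alone identifies the end-server.
        return server
    # Shared public address: fall back to stateless 4-tuple hashing.
    flow = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(flow).digest()
    return BACKENDS[int.from_bytes(digest[:8], "big") % len(BACKENDS)]
```

With such rules, an extra subflow sent to `203.0.113.12:443` reaches `server-2` whatever its source address and port are, while initial subflows to the shared address are still spread across all servers.
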
 | 82 | +{: .note}  | 
 | 83 | +A stateful load-balancer could compute the MPTCP receiver's token from its key  | 
 | 84 | +exchanged in the connection request (`MP_CAPABLE`), and route additional  | 
 | 85 | +subflows to the same server by identifying the receiver's token from the join  | 
 | 86 | +request (`MP_JOIN`). Note that there is a risk of token collision: such a  | 
 | 87 | +load-balancer should handle the case where multiple end-servers are using  | 
 | 88 | +the same token for active MPTCP connections.  | 
 | 89 | + | 
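The token computation mentioned in the note above can be sketched in Python. For MPTCP v1 (RFC 8684), the token is the most significant 32 bits of the SHA-256 hash of the 64-bit key. The `TokenTable` class, its method names, and the example key are hypothetical; this only shows one way a stateful load balancer might keep track of tokens, including collisions.

```python
import hashlib
from collections import defaultdict


def mptcp_v1_token(key: bytes) -> int:
    """Return the MPTCP v1 token: the most significant 32 bits of SHA-256(key)."""
    if len(key) != 8:
        raise ValueError("an MPTCP key is 64 bits (8 bytes) long")
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")


class TokenTable:
    """Hypothetical token -> end-servers map kept by a stateful load balancer."""

    def __init__(self):
        self._servers = defaultdict(set)

    def learn(self, server_key: bytes, server: str) -> int:
        # Called when the server's key is seen in the MP_CAPABLE exchange.
        token = mptcp_v1_token(server_key)
        self._servers[token].add(server)
        return token

    def candidates(self, token: int) -> set:
        # Called with the token carried by an MP_JOIN request. More than one
        # candidate means a token collision that still has to be resolved.
        return set(self._servers.get(token, ()))


table = TokenTable()
token = table.learn(bytes.fromhex("0102030405060708"), "server-2")
servers = table.candidates(token)
```

Returning a set rather than a single server keeps the collision case explicit: the caller has to decide what to do when several end-servers use the same token.
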
 | 90 | +## CDNs  | 
 | 91 | + | 
 | 92 | +Supporting MPTCP on CDNs would let users easily benefit from its features:  | 
 | 93 | +seamless handovers, best network selection, and network aggregation.  | 
 | 94 | + | 
 | 95 | +Here is a checklist for CDN owners implementing MPTCP support:  | 
 | 96 | +- [ ] Frontend:  | 
 | 97 | +  - [ ] Application: [enable MPTCP support](apps.html),  | 
 | 98 | +        [modify it to create an MPTCP socket](implementation.html), or  | 
 | 99 | +        [force it to use MPTCP](setup.html#force-applications-to-use-mptcp).  | 
 | 100 | +  - [ ] System: set [`sysctl net.mptcp.allow_join_initial_addr_port=0`](https://docs.kernel.org/networking/mptcp-sysctl.html)  | 
 | 101 | +  - [ ] System: add a `signal` MPTCP endpoint with a dedicated IPv4/IPv6 address  | 
 | 102 | +        and/or port per end-server:  | 
 | 103 | +  ```  | 
 | 104 | +  ip mptcp endpoint add <public IP address> dev <interface> [ port NR ] signal  | 
 | 105 | +  ```  | 
 | 106 | +- [ ] Stateless L4 Load-Balancer:  | 
 | 107 | +  - [ ] Add rules to route TCP flows destined for a specific IP and/or port to  | 
 | 108 | +        the corresponding server.  | 
 | 109 | +  - [ ] Optionally block all non-MPTCP connections, and rate-limit connection  | 
 | 110 | +        requests.  | 