@ssrlive ssrlive commented Mar 22, 2025

I think this time it works.

@ssrlive
Author

ssrlive commented Mar 22, 2025

How do I test the local-tun app? Please write a brief tutorial so I can try it.

@zonyitoo
Collaborator

zonyitoo commented Mar 24, 2025

Client ------> Tun Device (sslocal) ------> ssserver -------> Server
iperf3                                       a.b.c.1           a.b.c.2  iperf3

It requires at least 2 different IPs.

Start an sslocal instance with local-tun enabled on device A, with its server set to a.b.c.1, and an ssserver instance on device B, listening on IP a.b.c.1.

Start an iperf3 server on device C, listening on IP a.b.c.2.

Run the iperf3 client on device A, targeting a.b.c.2. Set the route table on device A so that a.b.c.2 is routed to the tun device that sslocal listens on.
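The routing requirement can be illustrated with a small longest-prefix-match sketch. This is not how any operating system or sslocal implements routing; the `Route` type, the function names, and the interface labels are all hypothetical. It only shows why a host route for the iperf3 server sends that traffic into the tun device while the ssserver address keeps following the default route.

```rust
use std::net::Ipv4Addr;

/// One routing-table entry (hypothetical type, for illustration only).
struct Route {
    network: Ipv4Addr,
    prefix_len: u8, // assumed to be in 0..=32
    interface: &'static str,
}

/// Returns true if `addr` falls inside `network/prefix_len`.
fn matches(addr: Ipv4Addr, network: Ipv4Addr, prefix_len: u8) -> bool {
    // A /0 prefix matches everything; shifting by 32 would overflow, so handle it separately.
    let mask = if prefix_len == 0 {
        0
    } else {
        u32::MAX << (32 - u32::from(prefix_len))
    };
    (u32::from(addr) & mask) == (u32::from(network) & mask)
}

/// Longest-prefix match: among all matching routes, the most specific one wins.
fn select_interface(table: &[Route], dst: Ipv4Addr) -> Option<&'static str> {
    table
        .iter()
        .filter(|r| matches(dst, r.network, r.prefix_len))
        .max_by_key(|r| r.prefix_len)
        .map(|r| r.interface)
}
```

With a default route on the physical interface and a /32 route for the iperf3 server on the tun interface, `select_interface` picks the tun interface for the iperf3 server only. The ssserver address must not match the tun route, otherwise sslocal's own outbound traffic would loop back into the tun device.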

Some previous test results: #756

@zonyitoo
Collaborator

zonyitoo commented Mar 25, 2025

I just ran a test in my local environment:

image
$ iperf -c 192.168.11.235 -p 15001
------------------------------------------------------------
Client connecting to 192.168.11.235, TCP port 15001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  1] local 10.0.0.1 port 51486 connected with 192.168.11.235 port 15001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.03 sec   341 MBytes   285 Mbits/sec

Tested with a debug build of sslocal. The result was not ideal.

It couldn't run with a release build because it would crash: smoltcp-rs/smoltcp#1048 .

As we can see in the image, the smoltcp-poll thread was using 100% CPU, but the bandwidth was still very low.

Updates:

I just looked deeper into the smoltcp-poll thread and found that the most CPU-intensive routine was verifying the TCP checksum. Checksum has to be disabled. But when I disabled smoltcp's checksum, it may send wrong iperf packets. Hmm..
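To illustrate why checksum verification can dominate CPU time, here is a minimal sketch of the Internet checksum (RFC 1071) that TCP uses. It is not smoltcp's implementation, and it omits the TCP pseudo-header; the function name is chosen for illustration. The point is that every byte of every packet has to be read and summed, so the cost grows linearly with throughput.

```rust
/// RFC 1071 Internet checksum over `data` (illustrative sketch, not smoltcp's code).
/// Bytes are summed as big-endian 16-bit words using one's-complement arithmetic.
fn internet_checksum(data: &[u8]) -> u16 {
    // A 64-bit accumulator cannot overflow for any realistic packet size.
    let mut sum: u64 = 0;
    let mut chunks = data.chunks_exact(2);
    for pair in &mut chunks {
        sum += u64::from(u16::from_be_bytes([pair[0], pair[1]]));
    }
    // An odd trailing byte is padded with a zero byte on the right.
    if let [last] = chunks.remainder() {
        sum += u64::from(*last) << 8;
    }
    // Fold the carries back into the low 16 bits (end-around carry).
    while (sum >> 16) != 0 {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    // The checksum is the one's complement of the folded sum.
    !(sum as u16)
}
```

A receiver verifies a packet by running the same sum over the data including the transmitted checksum field; a result of zero means the data passed the check.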

zonyitoo added a commit that referenced this pull request Mar 25, 2025
- ref #1923

Checksum running on receiving packets are the most significant cost of
CPU time in local-tun.
@zonyitoo
Collaborator

Everything is now working pretty well.

$ iperf -c 192.168.11.235 -p 15001
------------------------------------------------------------
Client connecting to 192.168.11.235, TCP port 15001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  1] local 10.0.0.1 port 54707 connected with 192.168.11.235 port 15001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.12 sec   998 MBytes   827 Mbits/sec
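
As a sanity check on the two iperf reports in this thread, the bandwidth column can be recomputed from the transfer size and the interval. The sketch below assumes that iperf counts MBytes as 2^20 bytes and Mbits as 10^6 bits, which is consistent with the reported figures; the function name is made up for illustration.

```rust
/// Converts an iperf-style transfer (in MBytes, assumed to mean 2^20 bytes)
/// over `seconds` into Mbits/sec (assumed to mean 10^6 bits per second).
fn mbits_per_sec(mbytes: f64, seconds: f64) -> f64 {
    let bits = mbytes * 1024.0 * 1024.0 * 8.0;
    bits / seconds / 1_000_000.0
}
```

Under those assumptions, 341 MBytes in 10.03 s rounds to 285 Mbits/sec and 998 MBytes in 10.12 s rounds to 827 Mbits/sec, so the second run is roughly 2.9 times faster than the first.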

@ssrlive
Author

ssrlive commented Mar 26, 2025

It doesn't work for me:

iperf3 server (56.78.9.10)

iperf3 -s

server (192.168.24.100)

ssserver -c /home/me/ss-cfg.json
{
    "server":"0.0.0.0",
    "server_port":8388,
    "local_port":1080,
    "password":"barfoo!",
    "method":"chacha20-ietf-poly1305"
}

client (127.0.0.1)

sudo sslocal.exe -U --protocol tun -s "192.168.24.100:8388" -m "chacha20-ietf-poly1305" -k "barfoo!" --outbound-bind-interface "Wi-Fi" --tun-interface-name "shadowsocks"

iperf3 client (127.0.0.1)

iperf3 -c 56.78.9.10

iperf3 can receive data, but I can't see it going through the TUN device. Even if I stop ssserver, iperf3 still gets data normally.

@ssrlive
Author

ssrlive commented Mar 26, 2025

Please test my PR yourself. If it passes your tests, please merge it yourself.

@zonyitoo
Collaborator

I think the key change in your PR is that the smoltcp poll loop runs in a tokio task. Because the loop is going to be very busy, it may occupy a worker thread in tokio's runtime.

On the other hand, the SpinMutex (spinlock) in the poll loop would be problematic: the other tasks would have no chance to run, because there is no yield point.

I think it would be nicer to keep the current implementation, which runs the loop in a separate thread.
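
The separate-thread approach can be sketched with the standard library alone. This is not the actual sslocal code: the function name, the thread name, the 50 ms timeout, and the packet type are all assumptions. The dedicated OS thread blocks on a channel instead of spinning, so it neither occupies an async runtime worker nor needs a yield point.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Spawns a dedicated OS thread that stands in for a busy poll loop.
/// Returns the packet sender and a handle that yields the number of packets handled.
fn spawn_poll_thread() -> (Sender<Vec<u8>>, JoinHandle<usize>) {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();
    let handle = thread::Builder::new()
        .name("poll-loop-sketch".into())
        .spawn(move || {
            let mut handled = 0usize;
            loop {
                // Block until a packet arrives or the timeout elapses,
                // instead of spinning at 100% CPU.
                match rx.recv_timeout(Duration::from_millis(50)) {
                    Ok(_packet) => handled += 1, // placeholder for real packet processing
                    Err(RecvTimeoutError::Timeout) => {} // placeholder for timer-driven work
                    Err(RecvTimeoutError::Disconnected) => break, // all senders dropped
                }
            }
            handled
        })
        .expect("failed to spawn poll thread");
    (tx, handle)
}
```

Dropping every `Sender` ends the loop after any queued packets have been drained, so joining the handle is a clean way to stop the thread.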

@ssrlive
Author

ssrlive commented Mar 26, 2025

I want to change it to run a separate task per TCP session, so having all sessions handled by a single thread would not be appropriate. That is why I submitted this PR.

@ssrlive
Author

ssrlive commented Mar 26, 2025

I tested it just now and it works, but the 100% CPU issue is not resolved. I think that problem exists whether or not my PR is applied.

log::error!("TcpTun smoltcp-poll error: {:?}", e);
}
log::debug!("TcpTun::drop, waiting for manager thread to exit");
std::thread::sleep(std::time::Duration::from_millis(100));
Collaborator

There must be another way to make it work gracefully.

Author

Yes. I'm researching it.
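
One possible alternative to sleeping for a fixed 100 ms in `drop` is to signal the worker thread and then join it. The sketch below is only an assumption about how that could look, not the approach the PR or the project settled on; `PollWorker` and its fields are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical owner of a background poll thread.
struct PollWorker {
    stop: Arc<AtomicBool>,
    /// Set by the thread just before it returns; only here to make the exit observable.
    exited: Arc<AtomicBool>,
    handle: Option<JoinHandle<()>>,
}

impl PollWorker {
    fn start() -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let exited = Arc::new(AtomicBool::new(false));
        let (stop_t, exited_t) = (Arc::clone(&stop), Arc::clone(&exited));
        let handle = thread::spawn(move || {
            while !stop_t.load(Ordering::Acquire) {
                // Placeholder for one poll iteration, then wait until woken or timed out.
                thread::park_timeout(Duration::from_millis(50));
            }
            exited_t.store(true, Ordering::Release);
        });
        Self { stop, exited, handle: Some(handle) }
    }
}

impl Drop for PollWorker {
    fn drop(&mut self) {
        // Ask the thread to stop, wake it if it is parked, then wait for it to finish.
        self.stop.store(true, Ordering::Release);
        if let Some(handle) = self.handle.take() {
            handle.thread().unpark();
            let _ = handle.join();
        }
    }
}
```

Because `drop` returns only after `join`, the thread is guaranteed to have exited by then, with no guess about how long shutdown takes.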

@zonyitoo
Collaborator

> I want to change it to run a separate task per TCP session

It would be nice to have.

@zonyitoo
Collaborator

image

Actually, my smoltcp-poll thread doesn't need 100% CPU to reach 800 Mbps.

@ssrlive ssrlive closed this by deleting the head repository Mar 26, 2025