Commit Graph

96 Commits

Author SHA1 Message Date
Devin Carr
3bf9217de5 TUN-9319: Add dynamic loading of features to connections via ConnectionOptionsSnapshot
Make sure to enforce snapshots of features and client information for each connection
so that the feature information can change in the background. This allows for new features
to only be applied to a connection if it completely disconnects and attempts a reconnect.

Updates the feature refresh time to 1 hour from previous cloudflared versions which
refreshed every 6 hours.

Closes TUN-9319
2025-05-14 20:11:05 +00:00
Luis Neto
6496322bee TUN-9007: modify logic to resolve region when the tunnel token has an endpoint field
## Summary

Within the work of FEDRamp it is necessary to change the HA SD lookup to use as srv `fed-v2-origintunneld`

This work assumes that the tunnel token has an optional endpoint field which will be used to modify the behaviour of the HA SD lookup.

Finally, the presence of the endpoint will override region to _fed_ and fail if any value is passed for the flag region.

Closes TUN-9007
2025-02-25 19:03:41 +00:00
Luis Neto
90176a79b4 TUN-8894: report FIPS+PQ error to Sentry when dialling to the edge
## Summary

Since we will enable PQ + FIPS it is necessary to add observability so that we can understand if issues are happening.

 Closes TUN-8894
2025-01-30 06:26:53 -08:00
Luis Neto
31a870b291 TUN-8855: Update PQ curve preferences
## Summary

Nowadays, Cloudflared only supports X25519Kyber768Draft00 (0x6399,25497) but older versions may use different preferences.

For FIPS compliance we are required to use P256Kyber768Draft00 (0xfe32,65074) which is supported in our internal fork of [Go-Boring-1.22.10](https://bitbucket.cfdata.org/projects/PLAT/repos/goboring/browse?at=refs/heads/go-boring/1.22.10 "Follow link").

In the near future, Go will support by default the X25519MLKEM768 (0x11ec,4588) given this we may drop the usage of our public fork of GO.

To summarise:

* Cloudflared FIPS: QUIC_CURVE_PREFERENCES=65074
* Cloudflared non-FIPS: QUIC_CURVE_PREFERENCES=4588

Closes TUN-8855
2025-01-30 05:02:47 -08:00
João "Pisco" Fernandes
4eb0f8ce5f TUN-8861: Rename Session Limiter to Flow Limiter
## Summary
Session is the concept used for UDP flows. Therefore, to make
the session limiter ambiguous for both TCP and UDP, this commit
renames it to flow limiter.

Closes TUN-8861
2025-01-20 06:33:40 -08:00
João "Pisco" Fernandes
bf4954e96a TUN-8861: Add session limiter to UDP session manager
## Summary
In order to make cloudflared behavior more predictable and
prevent an exhaustion of resources, we have decided to add
session limits that can be configured by the user. This first
commit introduces the session limiter and adds it to the UDP
handling path. For now the limiter is set to run only in
unlimited mode.
2025-01-20 02:52:32 -08:00
Devin Carr
3b522a27cf TUN-8807: Add support_datagram_v3 to remote feature rollout
Support rolling out the `support_datagram_v3` feature via remote feature rollout (DNS TXT record) with `dv3` key.

Consolidated some of the feature evaluation code into the features module to simplify the lookup of available features at runtime.

Reduced complexity for management logs feature lookup since it's a default feature.

Closes TUN-8807
2025-01-06 09:15:18 -08:00
Devin Carr
588ab7ebaa TUN-8640: Add ICMP support for datagram V3
Closes TUN-8640
2024-12-09 07:23:11 -08:00
Devin Carr
9da15b5d96 TUN-8640: Refactor ICMPRouter to support new ICMPResponders
A new ICMPResponder interface is introduced to provide different
implementations of how the ICMP flows should return to the QUIC
connection muxer.

Improves usages of netip.AddrPort to leverage the embedded zone
field for IPv6 addresses.

Closes TUN-8640
2024-11-27 12:46:08 -08:00
Devin Carr
1f3e3045ad TUN-8701: Add metrics and adjust logs for datagram v3
Closes TUN-8701
2024-11-07 11:02:55 -08:00
Devin Carr
952622a965 TUN-8709: Add session migration for datagram v3
When a registration response from cloudflared gets lost on it's way back to the edge, the edge service will retry and send another registration request. Since cloudflared already has bound the local UDP socket for the provided request id, we want to re-send the registration response.

There are three types of retries that the edge will send:

1. A retry from the same QUIC connection index; cloudflared will just respond back with a registration response and reset the idle timer for the session.
2. A retry from a different QUIC connection index; cloudflared will need to migrate the current session connection to this new QUIC connection and reset the idle timer for the session.
3. A retry to a different cloudflared connector; cloudflared will eventually time the session out since no further packets will arrive to the session at the original connector.

Closes TUN-8709
2024-11-06 12:06:07 -08:00
Devin Carr
589c198d2d TUN-8646: Allow experimental feature support for datagram v3
Closes TUN-8646
2024-11-04 13:59:32 -08:00
Devin Carr
16ecf60800 TUN-8661: Refactor connection methods to support future different datagram muxing methods
Some checks are pending
Check / check (1.22.x, macos-latest) (push) Waiting to run
Check / check (1.22.x, ubuntu-latest) (push) Waiting to run
Check / check (1.22.x, windows-latest) (push) Waiting to run
Semgrep config / semgrep/ci (push) Waiting to run
The current supervisor serves the quic connection by performing all of the following in one method:
1. Dial QUIC edge connection
2. Initialize datagram muxer for UDP sessions and ICMP
3. Wrap all together in a single struct to serve the process loops

In an effort to better support modularity, each of these steps were broken out into their own separate methods that the supervisor will compose together to create the TunnelConnection and run its `Serve` method.

This also provides us with the capability to better interchange the functionality supported by the datagram session manager in the future with a new mechanism.

Closes TUN-8661
2024-10-24 11:42:02 -07:00
Devin Carr
92e0f5fcf9 TUN-8688: Correct UDP bind for IPv6 edge connectivity on macOS
Some checks failed
Check / check (1.22.x, macos-latest) (push) Has been cancelled
Check / check (1.22.x, ubuntu-latest) (push) Has been cancelled
Check / check (1.22.x, windows-latest) (push) Has been cancelled
Semgrep config / semgrep/ci (push) Has been cancelled
For macOS, we want to set the DF bit for the UDP packets used by the QUIC
connection; to achieve this, you need to explicitly set the network
to either "udp4" or "udp6". When determining which network type to pick
we need to use the edge IP address chosen to align with what the local
IP family interface we will use. This means we want cloudflared to bind
to local interfaces for a random port, so we provide a zero IP and 0 port
number (ex. 0.0.0.0:0). However, instead of providing the zero IP, we
can leave the value as nil and let the kernel decide which interface and
pick a random port as defined by the target edge IP family.

This was previously broken for IPv6-only edge connectivity on macOS and
all other operating systems should be unaffected because the network type
was left as default "udp" which will rely on the provided local or remote
IP for selection.

Closes TUN-8688
2024-10-18 14:38:05 -07:00
Devin Carr
a3ee49d8a9 chore: Remove h2mux code
Some checks failed
Check / check (1.22.x, macos-latest) (push) Has been cancelled
Check / check (1.22.x, ubuntu-latest) (push) Has been cancelled
Check / check (1.22.x, windows-latest) (push) Has been cancelled
Semgrep config / semgrep/ci (push) Has been cancelled
Some more legacy h2mux code to be cleaned up and moved out of the way.
The h2mux.Header used in the serialization for http2 proxied headers is moved to connection module. Additionally, the booleanfuse structure is also moved to supervisor as it is also needed. Both of these structures could be evaluated later for removal/updates, however, the intent of the proposed changes here is to remove the dependencies on the h2mux code and removal.

Approved-by: Chung-Ting Huang <chungting@cloudflare.com>
Approved-by: Luis Neto <lneto@cloudflare.com>
Approved-by: Gonçalo Garcia <ggarcia@cloudflare.com>

MR: https://gitlab.cfdata.org/cloudflare/tun/cloudflared/-/merge_requests/1576
2024-10-15 13:10:30 -07:00
GoncaloGarcia
e251a21810 TUN-8621: Prevent QUIC connection from closing before grace period after unregistering
Whenever cloudflared receives a SIGTERM or SIGINT it goes into graceful shutdown mode, which unregisters the connection and closes the control stream. Unregistering makes it so we no longer receive any new requests and makes the edge close the connection, allowing in-flight requests to finish (within a 3 minute period).
 This was working fine for http2 connections, but the quic proxy was cancelling the context as soon as the controls stream ended, forcing the process to stop immediately.

 This commit changes the behavior so that we wait the full grace period before cancelling the request
2024-10-07 10:51:21 -05:00
GoncaloGarcia
2437675c04 Reverts the following:
Revert "TUN-8621: Fix cloudflared version in change notes."
Revert "PPIP-2310: Update quick tunnel disclaimer"
Revert "TUN-8621: Prevent QUIC connection from closing before grace period after unregistering"
Revert "TUN-8484: Print response when QuickTunnel can't be unmarshalled"
Revert "TUN-8592: Use metadata from the edge to determine if request body is empty for QUIC transport"
2024-09-10 16:50:32 +01:00
GoncaloGarcia
e05939f1c9 TUN-8621: Prevent QUIC connection from closing before grace period after unregistering
Some checks failed
Check / check (1.22.x, macos-latest) (push) Has been cancelled
Check / check (1.22.x, ubuntu-latest) (push) Has been cancelled
Check / check (1.22.x, windows-latest) (push) Has been cancelled
Whenever cloudflared receives a SIGTERM or SIGINT it goes into graceful shutdown mode, which unregisters the connection and closes the control stream. Unregistering makes it so we no longer receive any new requests and makes the edge close the connection, allowing in-flight requests to finish (within a 3 minute period).
 This was working fine for http2 connections, but the quic proxy was cancelling the context as soon as the controls stream ended, forcing the process to stop immediately.

 This commit changes the behavior so that we wait the full grace period before cancelling the request
2024-09-05 13:15:00 +00:00
João Oliveirinha
75752b681b TUN-8057: cloudflared uses new PQ curve ID 2024-07-09 11:19:10 -07:00
chungthuang
0b62d45738 TUN-8456: Update quic-go to 0.45 and collect mtu and congestion control metrics 2024-06-17 15:28:56 +00:00
chungthuang
354a5bb8af TUN-8452: Add flag to control QUIC stream-level flow control limit 2024-06-06 11:50:46 -05:00
chungthuang
e0b1899e97 TUN-8449: Add flag to control QUIC connection-level flow control limit and increase default to 30MB 2024-06-05 17:34:41 -05:00
Devin Carr
654a326098 TUN-8424: Refactor capnp registration server
Move RegistrationServer and RegistrationClient into tunnelrpc module
to properly abstract out the capnp aspects internal to the module only.
2024-05-24 11:40:10 -07:00
Devin Carr
43446bc692 TUN-8423: Deprecate older legacy tunnel capnp interfaces
Since legacy tunnels have been removed for a while now, we can remove
many of the capnp rpc interfaces that are no longer leveraged by the
legacy tunnel registration and authentication mechanisms.
2024-05-23 11:17:49 -07:00
Devin Carr
8184bc457d TUN-8427: Fix BackoffHandler's internally shared clock structure
A clock structure was used to help support unit testing timetravel
but it is a globally shared object and is likely unsafe to share
across tests. Reordering of the tests seemed to have intermittent
failures for the TestWaitForBackoffFallback specifically on windows
builds.

Adjusting this to be a shim inside the BackoffHandler struct should
resolve shared object overrides in unit testing.

Additionally, added the reset retries functionality to be inline with
the ResetNow function of the BackoffHandler to align better with
expected functionality of the method.

Removes unused reconnectCredentialManager.
2024-05-23 09:48:34 -07:00
Devin Carr
eb2e4349e8 TUN-8415: Refactor capnp rpc into a single module
Combines the tunnelrpc and quic/schema capnp files into the same module.

To help reduce future issues with capnp id generation, capnpids are
provided in the capnp files from the existing capnp struct ids generated
in the go files.

Reduces the overall interface of the Capnp methods to the rest of
the code by providing an interface that will handle the quic protocol
selection.

Introduces a new `rpc-timeout` config that will allow all of the
SessionManager and ConfigurationManager RPC requests to have a timeout.
The timeout for these values is set to 5 seconds as non of these operations
for the managers should take a long time to complete.

Removed the RPC-specific logger as it never provided good debugging value
as the RPC method names were not visible in the logs.
2024-05-17 11:22:07 -07:00
João "Pisco" Fernandes
76badfa01b TUN-8236: Add write timeout to quic and tcp connections
## Summary
To prevent bad eyeballs and severs to be able to exhaust the quic
control flows we are adding the possibility of having a timeout
for a write operation to be acknowledged. This will prevent hanging
connections from exhausting the quic control flows, creating a DDoS.
2024-02-15 17:54:52 +00:00
Chung-Ting
12dd91ada1 TUN-8052: Update go to 1.21.5
Also update golang.org/x/net and google.golang.org/grpc to fix vulnerabilities,
although cloudflared is using them in a way that is not exposed to those risks
2023-12-15 12:17:21 +00:00
Chung-Ting
4ddc8d758b TUN-7970: Default to enable post quantum encryption for quic transport 2023-12-07 11:37:46 +00:00
Chung-Ting
8068cdebb6 TUN-8006: Update quic-go to latest upstream 2023-12-04 17:09:40 +00:00
Devin Carr
e0a55f9c0e TUN-7965: Remove legacy incident status page check 2023-11-13 17:10:59 -08:00
João Oliveirinha
349586007c TUN-7756: Clarify that QUIC is mandatory to support ICMP proxying 2023-09-05 15:58:19 +01:00
Chung-Ting Huang
bec683b67d TUN-7700: Implement feature selector to determine if connections will prefer post quantum cryptography 2023-08-29 09:05:33 +01:00
Chung-Ting Huang
38d3c3cae5 TUN-7707: Use X25519Kyber768Draft00 curve when post-quantum feature is enabled 2023-08-28 14:18:05 +00:00
Devin Carr
b500e556bf TUN-7590: Remove usages of ioutil 2023-07-17 19:08:38 +00:00
João Oliveirinha
0c8bc56930 TUN-7575: Add option to disable PTMU discovery over QUIC
This commit implements the option to disable PTMU discovery for QUIC
connections.
QUIC finds the PMTU during startup by increasing Ping packet frames
until Ping responses are not received anymore, and it seems to stick
with that PMTU forever.

This is no problem if the PTMU doesn't change over time, but if it does
it may case packet drops.
We add this hidden flag for debugging purposes in such situations as a
quick way to validate if problems that are being seen can be solved by
reducing the packet size to the edge.

Note however, that this option may impact UDP proxying since we expect
being able to send UDP packets of 1280 bytes over QUIC.
So, this option should not be used when tunnel is being used for UDP
proxying.
2023-07-13 10:24:24 +01:00
Sudarsan Reddy
1abd22ef0a TUN-7480: Added a timeout for unregisterUDP.
I deliberately kept this as an unregistertimeout because that was the
intent. In the future we could change this to a UDPConnConfig if we want
to pass multiple values here.

The idea of this PR is simply to add a configurable unregister UDP
timeout.
2023-06-20 06:20:09 +00:00
João Oliveirinha
20e36c5bf3 TUN-7468: Increase the limit of incoming streams 2023-06-19 10:41:56 +00:00
Devin Carr
9426b60308 TUN-7227: Migrate to devincarr/quic-go
The lucas-clemente/quic-go package moved namespaces and our branch
went stale, this new fork provides support for the new quic-go repo
and applies the max datagram frame size change.

Until the max datagram frame size support gets upstreamed into quic-go,
this can be used to unblock go 1.20 support as the old
lucas-clemente/quic-go will not get go 1.20 support.
2023-05-10 19:44:15 +00:00
Sudarsan Reddy
e8841c0fb3 TUN-7394: Retry StartFirstTunnel on quic.ApplicationErrors
This PR adds ApplicationError as one of the "try_again" error types for
startfirstTunnel. This ensures that these kind of errors (which we've
seen occur when a tunnel gets rate-limited) are retried.
2023-04-26 12:58:01 +01:00
Devin Carr
991f01fe34 TUN-7131: Add cloudflared log event to connection messages and enable streaming logs 2023-04-12 14:41:11 -07:00
Devin Carr
bbc8d9431b TUN-7333: Default features checkable at runtime across all packages 2023-03-30 17:42:54 +00:00
Devin Carr
be64362fdb TUN-7124: Add intercept ingress rule for management requests 2023-03-21 11:42:25 -07:00
Devin Carr
27f88ae209 TUN-7252: Remove h2mux connection 2023-03-07 13:51:37 -08:00
iBug
fed60ae4c3
GH-352: Add Tunnel CLI option "edge-bind-address" (#870)
* Add Tunnel CLI option "edge-bind-address"
2023-02-28 16:11:42 +00:00
Devin Carr
0f95f8bae5 TUN-6938: Force h2mux protocol to http2 for named tunnels
Going forward, the only protocols supported will be QUIC and HTTP2,
defaulting to QUIC for "auto". Selecting h2mux protocol will be forcibly
upgraded to http2 internally.
2023-02-06 11:06:02 -08:00
Devin Carr
ae46af9236 TUN-7065: Remove classic tunnel creation 2023-02-06 18:19:22 +00:00
João Oliveirinha
62dcb8a1d1 Revert "TUN-7065: Remove classic tunnel creation"
This reverts commit c24f275981.
2023-02-01 14:01:59 +00:00
Devin Carr
c24f275981 TUN-7065: Remove classic tunnel creation 2023-01-31 22:35:28 +00:00
Sudarsan Reddy
99b3736cc7 TUN-6999: cloudflared should attempt other edge addresses before falling back on protocol
This PR does two things:
It changes how we fallback to a lower protocol: The current state
is to try connecting with a protocol. If it fails, fall back to a
lower protocol. And try connecting with that and so on. With this PR,
if we fail to connect with a protocol, we will try to connect to other
edge addresses first. Only if we fail to connect to those will we
fall back to a lower protocol.
It fixes a behaviour where if we fail to connect to an edge addr,
we keep re-trying the same address over and over again.
This PR now switches between edge addresses on subsequent connecton attempts.
Note that through these switches, it still respects the backoff time.
(We are connecting to a different edge, but this helps to not bombard an edge
address with connect requests if a particular edge addresses stops working).
2022-12-14 13:17:21 +00:00