Squid Proxy Is Very Slow After User Login (NTLM / Worker Bottleneck)

Problem

In a UCS environment using Squid as an explicit proxy with NTLM authentication, users can log in to their systems quickly, but internet access is extremely slow during the first 5–10 minutes after login. After this initial phase, browsing performance becomes normal without further changes.

The issue mainly affects environments with many parallel clients (≈250) and NTLM-based authentication.

Typical symptoms:

  • Login to the operating system is fast
  • First web requests take a very long time or appear to hang
  • After several minutes, browsing speed suddenly becomes normal
  • CPU usage on the proxy host appears moderate (25–50%)

Investigation

Squid Configuration Changes Observed

The following configuration changes were applied (on UCS the helper count maps to the UCR variable squid/ntlmauth/children rather than a raw squid.conf directive):

cache deny all
ntlmauth_children 250

Disabling the cache is a common practice in environments where Squid is used mainly as an authentication and filtering proxy (e.g. with NTLM and squidGuard). However, this also means Squid cannot mitigate slow authentication or connection setup by serving cached objects.

Increasing ntlmauth_children alone did not resolve the issue.


Log Analysis

Relevant entries from access.log:

2025-12-04 11:31:09.891 371403 10.198.2.137 TCP_TUNNEL/200 50255 CONNECT augloop.office.com:443 user1 HIER_DIRECT/52.111.243.4 -

Important detail:

  • The field after the timestamp (371403) is the request duration in milliseconds, not seconds
  • 371403 ms ≈ 371 seconds, i.e. more than six minutes for a single CONNECT request
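Such outliers can be filtered out of access.log directly. The field numbering below assumes the timestamp format shown above, and the 10-second threshold is an arbitrary example:

```shell
# Print requests slower than 10 s: $3 = duration in ms, $8 = requested host.
# The two log lines are inline sample data; in production, feed
# /var/log/squid/access.log to awk instead of the here-doc.
awk '$3 > 10000 { printf "%d ms  %s\n", $3, $8 }' <<'EOF'
2025-12-04 11:31:09.891 371403 10.198.2.137 TCP_TUNNEL/200 50255 CONNECT augloop.office.com:443 user1 HIER_DIRECT/52.111.243.4 -
2025-12-05 11:11:13.037 120137 10.198.0.240 TCP_TUNNEL/200 23904 CONNECT www.bing.com:443 user2 HIER_DIRECT/2.16.204.157 -
EOF
```

Piping the result through `sort -rn | head` gives a quick top-N list of the slowest requests.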

This clearly indicates that Squid is blocking or waiting internally before the connection is fully established.

Later log entries show lower values (≈120 s instead of ≈371 s):

2025-12-05 11:11:13.037 120137 10.198.0.240 TCP_TUNNEL/200 23904 CONNECT www.bing.com:443 user2 HIER_DIRECT/2.16.204.157 -

This confirms the “slow at first, fast later” behavior.


Process and Resource Inspection

Process list (excerpt):

/usr/lib/squid/squid_ldap_ntlm_auth -c 60
(squidGuard) -c /etc/squidguard/squidGuard.conf

Observations:

  • Many NTLM helper processes were running
  • CPU usage stayed below critical levels

This can be misleading: NTLM authentication is synchronous per Squid worker. Even with many helper processes, Squid can still block if too few workers are available to handle concurrent NTLM handshakes.


Root Cause

The root cause is a worker bottleneck during NTLM authentication bursts:

  • After login, many clients open parallel HTTPS connections
  • Each connection requires an NTLM handshake
  • If Squid is running with too few workers, requests queue up
  • This results in very long request times (hundreds of seconds)
  • Once authentication caches are populated, performance improves automatically

Disabling the Squid cache (cache deny all) amplifies the effect, because Squid must process every request fully.
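The impact of the worker count can be estimated with a simple queueing sketch. Apart from the ~250 clients from the scenario, all numbers are illustrative assumptions (in particular the 1.5 s per handshake is assumed, not measured):

```shell
# Worst-case wait for the last of N serialized handshakes spread over W
# workers is roughly N * H / W. With W=1 this lands in the same order of
# magnitude as the ~371 s observed in access.log.
awk 'BEGIN {
  N = 250    # simultaneous connections after login (from the scenario)
  H = 1500   # assumed NTLM handshake time in ms (illustrative only)
  for (W = 1; W <= 6; W++)
    printf "workers=%d  worst-case wait ~= %d s\n", W, N * H / W / 1000
}'
```

This is only an order-of-magnitude model; it ignores helper parallelism and connection reuse, but it shows why going from one worker to four cuts the worst-case delay roughly fourfold.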


Solution

1. Configure Squid Workers (critical)

On UCS, Squid workers must be configured via UCR, not directly in squid.conf.

ucr set squid/workers=4
systemctl restart squid

Recommended values:

  • ~100 clients: 2–3 workers
  • ~250 clients: 4–6 workers

Verify (with SMP enabled, the worker kids appear in the process list as (squid-1), (squid-2), …):

ucr get squid/workers
ps -ef | grep '(squid-'

2. NTLM helper configuration

In this setup, the NTLM helper count was not the primary bottleneck.

The observed value was 50–60 running helpers, which is sufficient for ~250 clients once the Squid workers are sized correctly.

Recommended configuration:

ucr set squid/ntlmauth/children=60
systemctl restart squid

Verify running helpers:

ps aux | grep squid_ldap_ntlm_auth | grep -v grep | wc -l

Important: Increasing squid/ntlmauth/children to very high values (e.g. 200–300) does not improve performance and may even increase load. The main bottleneck in this case was the number of Squid workers, not the NTLM helpers.


3. Squid cache configuration

In many enterprise environments (Office 365, HTTPS-heavy traffic), Squid is used mainly as an authentication and policy proxy.

Disabling the disk cache avoids unnecessary I/O and startup overhead:

ucr set squid/cache=no
systemctl restart squid

This is recommended when:

  • Most traffic is HTTPS (TCP_TUNNEL)
  • Caching provides no benefit
  • Authentication latency is the main concern
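After changing UCR variables, the regenerated configuration can be sanity-checked before the restart. The following commands assume the standard UCS/Debian layout with the generated config at /etc/squid/squid.conf:

```shell
squid -k parse                                   # syntax-check the generated squid.conf
grep -n 'cache deny all' /etc/squid/squid.conf   # confirm the cache is disabled
```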

4. Validation via logs

After applying the changes, connection times in access.log should drop significantly:

TCP_TUNNEL/200 47882 CONNECT www.bing.com:443 user HIER_DIRECT/2.16.204.157 -

(Shortened line: in the full format, the duration field comes directly after the timestamp, before the client IP.) Durations are in milliseconds; initial delays above 300000 ms indicate authentication or worker bottlenecks.


5. Optional monitoring

tail -f /var/log/squid/cache.log

Look for repeated messages like:

Starting new ntlmauthenticator helpers...
helperOpenServers: Starting 1/60 'squid_ldap_ntlm_auth' processes

Frequent spawning during peak usage is a sign of undersized worker or helper configuration.
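How frequent the spawning actually is can be quantified by counting the spawn messages:

```shell
# Count helper spawn events; the log excerpt is hypothetical sample data.
# In production, point grep at /var/log/squid/cache.log instead.
grep -c "helperOpenServers: Starting" <<'EOF'
2025/12/04 11:30:01| Starting new ntlmauthenticator helpers...
2025/12/04 11:30:01| helperOpenServers: Starting 1/60 'squid_ldap_ntlm_auth' processes
2025/12/04 11:31:15| helperOpenServers: Starting 1/60 'squid_ldap_ntlm_auth' processes
EOF
```

A count that climbs steadily during the morning login window is a strong hint that helpers are being respawned under load.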


Result

With sufficient Squid workers and properly sized NTLM helpers, the initial slow internet access after login disappears, even with several hundred concurrent clients.