Problem
In a UCS environment using Squid as an explicit proxy with NTLM authentication, users can log in to their systems quickly, but internet access is extremely slow during the first 5–10 minutes after login. After this initial phase, browsing performance becomes normal without further changes.
The issue mainly affects environments with many parallel clients (≈250) and NTLM-based authentication.
Typical symptoms:
- Login to the operating system is fast
- First web requests take a very long time or appear to hang
- After several minutes, browsing speed suddenly becomes normal
- CPU usage on the proxy host appears moderate (25–50%)
Investigation
Squid Configuration Changes Observed
The following configuration changes were applied:
cache deny all
ntlmauth_children 250
Disabling the cache is a common practice in environments where Squid is used mainly as an authentication and filtering proxy (e.g. with NTLM and squidGuard). However, this also means Squid cannot mitigate slow authentication or connection setup by serving cached objects.
Increasing ntlmauth_children alone did not resolve the issue.
Log Analysis
Relevant entries from access.log:
2025-12-04 11:31:09.891 371403 10.198.2.137 TCP_TUNNEL/200 50255 CONNECT augloop.office.com:443 user1 HIER_DIRECT/52.111.243.4 -
Important detail:
- The second field (371403) is the request duration in milliseconds, not seconds: 371403 ms ≈ 371 seconds
This clearly indicates that Squid is blocking or waiting internally before the connection is fully established.
Later log entries show lower values (here 120137 ms ≈ 120 seconds, roughly a third of the earlier duration):
2025-12-05 11:11:13.037 120137 10.198.0.240 TCP_TUNNEL/200 23904 CONNECT www.bing.com:443 user2 HIER_DIRECT/2.16.204.157 -
This confirms the “slow at first, fast later” behavior.
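To find such entries without reading the log by hand, the duration column can be converted to seconds and sorted. The following is a minimal sketch, assuming the log format shown above, where date and time are two separate whitespace-delimited columns, so the duration is awk field 3, the client field 4, the URL field 8 and the user field 9. The function name slowest_requests is hypothetical and not part of Squid or UCS.

```shell
# List the slowest requests from access.log lines read on stdin.
# Assumed column layout (as in the entries above):
#   $1 date, $2 time, $3 duration in ms, $4 client IP, $8 URL, $9 user
slowest_requests() {
    # $1 = number of entries to print (default: 10)
    awk '{ print int($3 / 1000), $4, $8, $9 }' \
        | sort -rn \
        | head -n "${1:-10}"
}
```

Example call: slowest_requests 20 < /var/log/squid/access.log. Each output line starts with the duration in whole seconds, followed by client, target and user. If the local logformat differs, the field numbers need to be adjusted.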
Process and Resource Inspection
Process list (excerpt):
/usr/lib/squid/squid_ldap_ntlm_auth -c 60
(squidGuard) -c /etc/squidguard/squidGuard.conf
Observations:
- Many NTLM helper processes were running
- CPU usage stayed below critical levels
This can be misleading: NTLM authentication is synchronous per Squid worker. Even with many helper processes, Squid can still block if too few workers are available to handle concurrent NTLM handshakes.
Root Cause
The root cause is a worker bottleneck during NTLM authentication bursts:
- After login, many clients open parallel HTTPS connections
- Each connection requires an NTLM handshake
- If Squid is running with too few workers, requests queue up
- This results in very long request times (hundreds of seconds)
- Once authentication caches are populated, performance improves automatically
Disabling the Squid cache (cache deny all) amplifies the effect, because Squid must process every request fully.
Solution
1. Configure Squid Workers (critical)
On UCS, Squid workers must be configured via UCR, not directly in squid.conf.
ucr set squid/workers=4
systemctl restart squid
Recommended values:
- ~100 clients: 2–3 workers
- ~250 clients: 4–6 workers
Verify:
ucr get squid/workers
ps -ef | grep '[s]quid-'
Worker processes appear in the process list as (squid-1), (squid-2), and so on.
2. NTLM helper configuration
In this setup, the NTLM helper count was not the primary bottleneck.
The effective and observed value was 50–60 helpers, which is sufficient for ~250 clients when Squid workers are sized correctly.
Recommended configuration:
ucr set squid/ntlmauth/children=60
systemctl restart squid
Verify running helpers:
ps aux | grep squid_ldap_ntlm_auth | grep -v grep | wc -l
Important: Increasing squid/ntlmauth/children to very high values (e.g. 200–300) does not improve performance and may even increase load. The main bottleneck in this case was the number of Squid workers, not the NTLM helpers.
3. Squid cache configuration
In many enterprise environments (Office 365, HTTPS-heavy traffic), Squid is used mainly as an authentication and policy proxy.
Disabling the disk cache avoids unnecessary I/O and startup overhead:
ucr set squid/cache=no
systemctl restart squid
This is recommended when:
- Most traffic is HTTPS (TCP_TUNNEL)
- Caching provides no benefit
- Authentication latency is the main concern
4. Validation via logs
After applying the changes, connection times in access.log should drop significantly:
TCP_TUNNEL/200 47882 CONNECT www.bing.com:443 user HIER_DIRECT/2.16.204.157 -
Values are in milliseconds. Initial delays above 300000 ms (5 minutes) indicate authentication or worker bottlenecks.
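To compare the situation before and after the change, the number of requests above this threshold can be counted per hour. This is a rough sketch, assuming complete access.log lines with separate date and time columns (duration in awk field 3, time in field 2), as in the entries in the Log Analysis section. The function name slow_by_hour is hypothetical.

```shell
# Count requests slower than a threshold, grouped by hour of day.
# Reads access.log lines on stdin; assumes $2 = time (HH:MM:SS.mmm)
# and $3 = duration in milliseconds.
slow_by_hour() {
    # $1 = threshold in ms (default: 300000, the value named in the text)
    awk -v limit="${1:-300000}" '
        $3 + 0 > limit + 0 { count[substr($2, 1, 2)]++ }
        END { for (h in count) printf "%s:00 %d\n", h, count[h] }
    ' | sort
}
```

Example call: slow_by_hour 300000 < /var/log/squid/access.log. If the slow entries cluster in the hour when users log in and are absent afterwards, that matches the behavior described in this article.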
5. Optional monitoring
tail -f /var/log/squid/cache.log
Look for repeated messages like:
Starting new ntlmauthenticator helpers...
helperOpenServers: Starting 1/60 'squid_ldap_ntlm_auth' processes
Frequent spawning during peak usage is a sign of undersized worker or helper configuration.
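To judge how frequent the spawning is, the helper start messages can be counted per minute. This is a sketch under one assumption not shown in the excerpt above: each cache.log line starts with a date column and a time column (for example 2025/12/04 11:31:09), which is the usual Squid layout. The function name helper_spawns_per_minute is hypothetical.

```shell
# Count "helperOpenServers: Starting" messages per minute.
# Reads cache.log lines on stdin; assumes $1 = date and $2 = time (HH:MM:SS...).
helper_spawns_per_minute() {
    awk '/helperOpenServers: Starting/ { n[$1 " " substr($2, 1, 5)]++ }
         END { for (m in n) print m, n[m] }' \
        | sort
}
```

Example call: helper_spawns_per_minute < /var/log/squid/cache.log. Each output line contains date, minute and the number of spawn messages in that minute.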
Result
With sufficient Squid workers and properly sized NTLM helpers, the initial slow internet access after login disappears, even with several hundred concurrent clients.