Varnish cache - cannot handle 4000 concurrent users
I'm experiencing an issue when load testing a WordPress site with ~4000 concurrent users. Here is the configuration I have:
F5 load balancer ---> Varnish 4 (8 cores, 32 GB RAM) ---> 9 backends (4 cores, 16 GB RAM each) running the WP site.
While the load is around 2500-3000 users everything goes fine, without errors, but when it reaches 4k users Varnish stops responding until it has worked through the queued requests, and I also see many 502 errors.
I have 2 thread pools with 5000 threads each, and malloc storage set to 30G.
Additionally I raised the somaxconn and tcp_max_syn_backlog sysctls.
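For reference, the daemon-side settings described above would look roughly like this on the varnishd command line (a sketch: the paths, listen address, min-thread value, and sysctl values are placeholders, not taken from the question):

```shell
# varnishd launched with 2 pools x 5000 threads and 30G of malloc storage
varnishd -a :80 \
         -f /etc/varnish/default.vcl \
         -s malloc,30G \
         -p thread_pools=2 \
         -p thread_pool_min=500 \
         -p thread_pool_max=5000

# listen-backlog sysctls mentioned above (example values)
sysctl -w net.core.somaxconn=16384
sysctl -w net.ipv4.tcp_max_syn_backlog=16384
```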
Here is the VCL:
vcl 4.0;

import directors;
import std;

backend qa2  { .host = "xxx"; .port = "80"; }
backend qa3  { .host = "xxx"; .port = "80"; }
backend qa4  { .host = "xxx"; .port = "80"; }
backend qa5  { .host = "xxx"; .port = "80"; }
backend qa6  { .host = "xxx"; .port = "80"; }
backend qa7  { .host = "xxx"; .port = "80"; }
backend qa8  { .host = "xxx"; .port = "80"; }
backend qa9  { .host = "xxx"; .port = "80"; }
backend qa10 { .host = "xxx"; .port = "80"; }

# .connect_timeout = 2s; .first_byte_timeout = 10m; .between_bytes_timeout = 10m;

acl purge_list {
    "xxx"; "xxx"; "xxx"; "xxx"; "xxx";
    "xxx"; "xxx"; "xxx"; "xxx"; "xxx";
}

sub vcl_init {
    new rr = directors.round_robin();
    rr.add_backend(qa2);
    rr.add_backend(qa3);
    rr.add_backend(qa4);
    rr.add_backend(qa5);
    rr.add_backend(qa6);
    rr.add_backend(qa7);
    rr.add_backend(qa8);
    rr.add_backend(qa9);
    rr.add_backend(qa10);
}

sub vcl_recv {
    set req.backend_hint = rr.backend();

    if (req.method == "PURGE") {
        if (!client.ip ~ purge_list) {
            return (synth(405, "Not allowed."));
        }
        ban("req.url ~ .css");
        return (synth(200, "CSS files cleared from cache!"));
    }

    # don't look in the cache for POSTs and various other HTTP request types
    if (req.method != "GET" && req.method != "HEAD") {
        #ban("req.http.host == " + req.http.host);
        return (pass);
    }

    # don't cache requests with session cookies, authorization, or login/admin URLs
    if (req.http.cookie ~ "sess[a-f|0-9]+" ||
        req.http.authorization ||
        req.url ~ "login" ||
        req.method == "POST" ||
        req.http.cookie ||
        req.url ~ "/wp-(login|admin)") {
        return (pass);
    }

    # normalize Accept-Encoding to improve cache efficiency
    if (req.http.accept-encoding) {
        if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg)$") {
            # already compressed formats: no point re-encoding
            unset req.http.accept-encoding;
        } elsif (req.http.accept-encoding ~ "gzip") {
            set req.http.accept-encoding = "gzip";
        } elsif (req.http.accept-encoding ~ "deflate") {
            set req.http.accept-encoding = "deflate";
        } else {
            unset req.http.accept-encoding;
        }
    }

    # static assets: strip cookies so they are cacheable
    if (req.url ~ "\.(aif|aiff|au|avi|bin|bmp|cab|carb|cct|cdf|class|css)$" ||
        req.url ~ "\.(dcr|doc|dtd|eps|exe|flv|gcf|gff|gif|grv|hdml|hqx)$" ||
        req.url ~ "\.(ico|ini|jpeg|jpg|js|mov|mp3|nc|pct|pdf|png|ppc|pws)$" ||
        req.url ~ "\.(swa|swf|tif|txt|vbs|w32|wav|wbmp|wml|wmlc|wmls|wmlsc)$" ||
        req.url ~ "\.(xml|xsd|xsl|zip|woff)($|\?)") {
        unset req.http.cookie;
        #unset req.http.authorization;
        #unset req.http.authenticate;
        return (hash);
    }

    return (hash);
}

# cache hit: object found in cache
sub vcl_hit {
    if (req.method == "PURGE") {
        return (synth(200, "Purged!"));
    }
}

# cache miss: request sent to the backend
sub vcl_miss {
    if (req.method == "PURGE") {
        return (synth(200, "Purged (not in cache)"));
    }
}

sub vcl_backend_response {
    if (bereq.url ~ "\.(aif|aiff|au|avi|bin|bmp|cab|carb|cct|cdf|class|css)$" ||
        bereq.url ~ "\.(dcr|doc|dtd|eps|exe|flv|gcf|gff|gif|grv|hdml|hqx)$" ||
        bereq.url ~ "\.(ico|ini|jpeg|jpg|js|mov|mp3|nc|pct|pdf|png|ppc|pws)$" ||
        bereq.url ~ "\.(swa|swf|tif|txt|vbs|w32|wav|wbmp|wml|wmlc|wmls|wmlsc)$" ||
        bereq.url ~ "\.(xml|xsd|xsl|zip|woff)($|\?)") {
        set beresp.grace = 30s;
        set beresp.ttl = 1d;
        set beresp.http.cache-control = "public, max-age=600";
        set beresp.http.expires = beresp.ttl;
        return (deliver);
    }
}

# deliver the response to the client
sub vcl_deliver {
    # add an X-Cache diagnostic header
    if (obj.hits > 0) {
        set resp.http.x-cache = "hit";
        set resp.http.x-cache-hits = obj.hits;
        # don't echo cached Set-Cookie headers
        unset resp.http.set-cookie;
    } else {
        set resp.http.x-cache = "miss";
    }

    # remove headers not needed on production systems
    # unset resp.http.via;
    # unset resp.http.x-generator;

    # return (deliver);
}
And here are the results of the last test: the response time is actually good, but throughput is poor and, as I wrote, Varnish freezes until it finishes resolving the previous requests.
So my questions are: is there a theoretical limit on the number of concurrent users Varnish can handle? And how can I tune it to handle more than 4k concurrent connections?
PS: I have already raised MaxClients on each of the Apache servers.
Varnish itself never produces a 502 return code, so those 502s are coming from your backends, which means the responses aren't being cached.
You may be benchmarking your backends rather than Varnish.
There is no built-in limit on the number of concurrent users, and your thread count looks fine. 4000 sessions shouldn't need any kernel/OS tuning; the defaults should be fine. If you are hitting somaxconn, that is an artifact of the benchmarking tools and won't be the case with real traffic.
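If you want to confirm whether the benchmark really overflows the listen queue, the kernel keeps counters for exactly that. A quick check on Linux (standard `netstat`/`ss` invocations, not something from the question):

```shell
# cumulative counters of listen-queue overflows and dropped SYNs
netstat -s | grep -i -E 'overflow|SYNs to LISTEN'

# live view: on a LISTEN socket, Recv-Q is the current accept-queue depth
# and Send-Q is the configured backlog
ss -ltn
```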
Summing up: check your hit rate and look at varnishlog to figure out why things aren't being cached.
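A minimal way to do both with the stock Varnish 4 tools (counter and log-tag names as shipped with Varnish 4; the URL is a placeholder):

```shell
# overall hit/miss counters -- a low hit rate means most traffic reaches the backends
varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss

# per-request detail: which vcl_recv decision was taken and what TTL each
# backend response received (uncacheable responses show up here)
varnishlog -g request -i VCL_call -i VCL_return -i TTL

# spot-check one URL against the X-Cache header set in vcl_deliver above
curl -sI http://your-site.example/ | grep -i x-cache
```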