Varnish cache cannot handle 4,000 concurrent users


I'm experiencing an issue when load-testing a WordPress site with ~4,000 concurrent users. Here is the configuration I have:

F5 load balancer ---> Varnish 4 (8 cores, 32 GB RAM) ---> 9 backends (4 cores, 16 GB RAM each) running the WP site.

Tests with ~2,500-3,000 users go fine, without errors, but when the user count reaches 4k, Varnish stops responding until it works through the queued requests, and I also see many 502 errors.

I have 2 thread pools with 5,000 threads each, and storage is malloc,30G.
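For reference, a setup like the one described above roughly corresponds to varnishd startup flags along these lines (the listen address, VCL path, and thread_pool_min value are placeholders, not taken from the post):

```shell
# Hypothetical varnishd invocation matching the parameters described in the
# post: 2 thread pools, up to 5,000 threads per pool, 30 GB of malloc storage.
# Listen address, VCL path, and thread_pool_min are illustrative assumptions.
varnishd \
    -a :6081 \
    -f /etc/varnish/default.vcl \
    -s malloc,30G \
    -p thread_pools=2 \
    -p thread_pool_min=500 \
    -p thread_pool_max=5000
```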

Additionally, I raised the somaxconn and tcp_max_syn_backlog sysctls.
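The sysctl changes mentioned above typically look something like this (the exact values here are illustrative; the post does not say which values were used):

```shell
# Illustrative values only: raise the accept-queue and SYN-backlog limits.
sysctl -w net.core.somaxconn=4096
sysctl -w net.ipv4.tcp_max_syn_backlog=8192
```

Note that the listen backlog Varnish actually requests is also capped by its own `listen_depth` parameter, so raising somaxconn alone may not change the effective queue length.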

Here is the VCL:

    vcl 4.0;

    import directors;
    import std;

    backend qa2  { .host = "xxx"; .port = "80"; }
    backend qa3  { .host = "xxx"; .port = "80"; }
    backend qa4  { .host = "xxx"; .port = "80"; }
    backend qa5  { .host = "xxx"; .port = "80"; }
    backend qa6  { .host = "xxx"; .port = "80"; }
    backend qa7  { .host = "xxx"; .port = "80"; }
    backend qa8  { .host = "xxx"; .port = "80"; }
    backend qa9  { .host = "xxx"; .port = "80"; }
    backend qa10 { .host = "xxx"; .port = "80"; }

    # .connect_timeout = 2s;
    # .first_byte_timeout = 10m;
    # .between_bytes_timeout = 10m;

    acl purge_list {
        "xxx";
        "xxx";
        "xxx";
        "xxx";
        "xxx";
        "xxx";
        "xxx";
        "xxx";
        "xxx";
        "xxx";
    }

    sub vcl_init {
        new rr = directors.round_robin();
        rr.add_backend(qa2);
        rr.add_backend(qa3);
        rr.add_backend(qa4);
        rr.add_backend(qa5);
        rr.add_backend(qa6);
        rr.add_backend(qa7);
        rr.add_backend(qa8);
        rr.add_backend(qa9);
        rr.add_backend(qa10);
    }

    sub vcl_recv {
        set req.backend_hint = rr.backend();

        if (req.method == "PURGE") {
            if (!client.ip ~ purge_list) {
                return (synth(405, "Not allowed."));
            }
            ban("req.url ~ .css");
            return (synth(200, "CSS files cleared from cache!"));
        }

        # Don't cache POSTs and various other HTTP request types
        if (req.method != "GET" && req.method != "HEAD") {
            #ban("req.http.host == " + req.http.host);
            return (pass);
        }

        # Don't cache requests with sessions, authorization, or cookies
        if (req.http.Cookie ~ "sess[a-f0-9]+" ||
            req.http.Authorization ||
            req.url ~ "login" ||
            req.method == "POST" ||
            req.http.Cookie ||
            req.url ~ "/wp-(login|admin)") {
            return (pass);
        }

        if (req.http.Accept-Encoding) {
            if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg)$") {
                unset req.http.Accept-Encoding;
            } elsif (req.http.Accept-Encoding ~ "gzip") {
                set req.http.Accept-Encoding = "gzip";
            } elsif (req.http.Accept-Encoding ~ "deflate") {
                set req.http.Accept-Encoding = "deflate";
            } else {
                unset req.http.Accept-Encoding;
            }
        }

        if (req.url ~ "\.(aif|aiff|au|avi|bin|bmp|cab|carb|cct|cdf|class|css)$"  ||
            req.url ~ "\.(dcr|doc|dtd|eps|exe|flv|gcf|gff|gif|grv|hdml|hqx)$"    ||
            req.url ~ "\.(ico|ini|jpeg|jpg|js|mov|mp3|nc|pct|pdf|png|ppc|pws)$"  ||
            req.url ~ "\.(swa|swf|tif|txt|vbs|w32|wav|wbmp|wml|wmlc|wmls|wmlsc)$"||
            req.url ~ "\.(xml|xsd|xsl|zip|woff)($|\?)") {
            unset req.http.Cookie;
            #unset req.http.Authorization;
            #unset req.http.Authenticate;
            return (hash);
        }

        return (hash);
    }

    # Cache hit: object found in cache
    sub vcl_hit {
        if (req.method == "PURGE") {
            return (synth(200, "Purged!"));
        }
    }

    # Cache miss: the request is sent to a backend
    sub vcl_miss {
        if (req.method == "PURGE") {
            return (synth(200, "Purged (not in cache)"));
        }
    }

    sub vcl_backend_response {
        if (bereq.url ~ "\.(aif|aiff|au|avi|bin|bmp|cab|carb|cct|cdf|class|css)$"  ||
            bereq.url ~ "\.(dcr|doc|dtd|eps|exe|flv|gcf|gff|gif|grv|hdml|hqx)$"    ||
            bereq.url ~ "\.(ico|ini|jpeg|jpg|js|mov|mp3|nc|pct|pdf|png|ppc|pws)$"  ||
            bereq.url ~ "\.(swa|swf|tif|txt|vbs|w32|wav|wbmp|wml|wmlc|wmls|wmlsc)$"||
            bereq.url ~ "\.(xml|xsd|xsl|zip|woff)($|\?)") {
            set beresp.grace = 30s;
            set beresp.ttl = 1d;
            set beresp.http.Cache-Control = "public, max-age=600";
            set beresp.http.Expires = beresp.ttl;
            return (deliver);
        }
    }

    # Deliver the response to the client
    sub vcl_deliver {
        # Add an X-Cache diagnostic header
        if (obj.hits > 0) {
            set resp.http.X-Cache = "HIT";
            set resp.http.X-Cache-Hits = obj.hits;
            # Don't echo cached Set-Cookie headers
            unset resp.http.Set-Cookie;
        } else {
            set resp.http.X-Cache = "MISS";
        }
        # Remove headers not needed on production systems
        # unset resp.http.Via;
        # unset resp.http.X-Generator;
        # return (deliver);
    }

And here are the results of the last test:

[Screenshots: load-test results]

The response times are actually good, but throughput is poor, and, as I wrote, Varnish freezes until it finishes resolving the previous requests.

So my questions are: is there a theoretical limit on the number of concurrent users Varnish can handle? And how can I tune it to handle more than 4k concurrent connections?

PS. I have already extended MaxClients on each of the Apache servers.

Varnish never produces a 502 return code, which means you're not caching the responses.

You may be benchmarking the backends instead.

There is no built-in limit on the number of concurrent users. Your thread count looks fine. 4,000 sessions shouldn't need any kernel/OS tuning; the defaults should be fine. If you are hitting somaxconn, that is an artifact of the benchmarking tools and won't be the case with real traffic.

Summing up: check your hit rates and look at varnishlog to figure out why things aren't being cached.
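The hit rate and the reason for the misses can be inspected with the standard Varnish 4 tools, along these lines (a sketch, assuming Varnish 4's varnishstat/varnishlog):

```shell
# Overall hit/miss counters; a low hit rate confirms most traffic is
# going straight to the backends.
varnishstat -1 | grep -E "MAIN.cache_(hit|miss)"

# Watch requests that miss the cache and inspect their URL and headers.
# For example, a Cookie header will trigger the return(pass) in the VCL above.
varnishlog -g request -q 'VCL_call eq "MISS"' -i ReqURL,ReqHeader
```

One likely culprit given the VCL above: any request carrying a Cookie header is passed, and WordPress plus its plugins set cookies liberally, so under a realistic benchmark the hit rate can collapse to near zero.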

