google bigquery - ARRAY_AGG leading to OOM -


i'm trying run pretty simple query it's failing resources exceeded error.
read in post heuristic used allocate number of mixers fail time time.

select   response.auctionid,   response.scenarioid,   array_agg(response) responses   rtb_response_logs.2016080515 group   response.auctionid,   response.scenarioid 

is there way fix query knowing that:

  • a response composed of 38 fields (most of them being short strings)
  • the max(count()) of single response kind of low (165)

query failed
error: resources exceeded during query execution.
job id: teads-1307:bquijob_257ce97b_1566a6a3f27

it's current limitation arrays (produced array_agg or other means) must fit in memory of single machine. we've made couple of recent improvements should reduce resources required queries such this, however. confirm whether issue, try query such as:

select   sum(length(format("%t", response))) total_response_size   rtb_response_logs.2016080515 group   response.auctionid,   response.scenarioid order total_response_size desc limit 1; 

this formats structs strings rough heuristic of how memory take represent. if result large, perhaps can restructure query use less memory. if result not large, other issue @ play, , we'll getting fixed :) thanks!


Comments

Popular posts from this blog

Spring Boot + JPA + Hibernate: Unable to locate persister -

go - Golang: panic: runtime error: invalid memory address or nil pointer dereference using bufio.Scanner -

c - double free or corruption (fasttop) -