c - Non-blocking synchronization of streams in CUDA?


Is it possible to synchronize two CUDA streams without blocking the host? I know there's cudaStreamWaitEvent, which is non-blocking. But what about the creation and destruction of the events using cudaEventCreate and cudaEventDestroy?

The documentation for cudaEventDestroy says:

In case event has been recorded but has not yet been completed when cudaEventDestroy() is called, the function will return immediately and the resources associated with event will be released automatically once the device has completed event.

What I don't understand here is the difference between a recorded event and a completed event. It also seems to imply that the call is blocking if the event has not yet been recorded.

Can anyone shed some light on this?

You're on the right track by using cudaStreamWaitEvent. Creating events does carry some cost, but they can be created during your application start-up so that the creation time is not paid during your GPU routines.
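As a rough illustration of the start-up idea, here is a minimal sketch that creates a small set of events once and destroys them at shutdown. The pool size kNumEvents and the array name are made up for this example. Using cudaEventCreateWithFlags with cudaEventDisableTiming is my own suggestion rather than part of the answer above; it is intended for events that are only used for ordering, not for timing.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical pool size, chosen only for illustration.
constexpr int kNumEvents = 4;

int main() {
    cudaEvent_t events[kNumEvents];

    // Start-up: create every event once, outside the performance-critical path.
    // cudaEventDisableTiming is an assumption of this sketch: the events are
    // used purely for ordering, so no timing data is needed.
    for (int i = 0; i < kNumEvents; ++i) {
        cudaError_t err = cudaEventCreateWithFlags(&events[i], cudaEventDisableTiming);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaEventCreateWithFlags failed: %s\n", cudaGetErrorString(err));
            return 1;
        }
    }

    // ... GPU routines would reuse events[i] here with cudaEventRecord and
    //     cudaStreamWaitEvent; an event can be recorded again once it is no
    //     longer needed for an earlier synchronization point ...

    // Shutdown: release the events.
    for (int i = 0; i < kNumEvents; ++i) {
        cudaEventDestroy(events[i]);
    }
    return 0;
}
```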

An event is recorded when you put the event into a stream. It is completed after all activity that was put into the stream before the event has completed. Recording the event basically puts a marker into your stream, and that marker is what enables cudaStreamWaitEvent to stop forward progress on another stream until the event has completed.
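To make the recorded/completed distinction concrete, here is a minimal sketch with two streams. The kernels produce and consume, the stream names and the buffer size are all invented for this example. The event is recorded as soon as cudaEventRecord returns on the host; it completes only once produce has finished on the device. None of the calls before cudaStreamSynchronize should make the host wait for the GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernels, used only to give the two streams some work.
__global__ void produce(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = i;
}

__global__ void consume(const int *in, int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 2;
}

int main() {
    const int n = 1 << 20;
    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));

    cudaStream_t producer, consumer;
    cudaStreamCreate(&producer);
    cudaStreamCreate(&consumer);

    cudaEvent_t ready;
    cudaEventCreateWithFlags(&ready, cudaEventDisableTiming);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    produce<<<blocks, threads, 0, producer>>>(d_in, n);

    // "Recorded": the marker is placed in the producer stream behind produce.
    // "Completed": happens later, once produce has finished on the device.
    cudaEventRecord(ready, producer);

    // The consumer stream stalls on the device until the event completes;
    // the host thread does not wait here.
    cudaStreamWaitEvent(consumer, ready, 0);
    consume<<<blocks, threads, 0, consumer>>>(d_in, d_out, n);

    // Per the cudaEventDestroy documentation quoted in the question, this
    // returns without waiting even though the event may not have completed.
    cudaEventDestroy(ready);

    // The only point where the host deliberately waits, to check the result.
    int last = 0;
    cudaMemcpyAsync(&last, d_out + (n - 1), sizeof(int),
                    cudaMemcpyDeviceToHost, consumer);
    cudaStreamSynchronize(consumer);
    printf("%s\n", last == 2 * (n - 1) ? "ok" : "mismatch");

    cudaStreamDestroy(producer);
    cudaStreamDestroy(consumer);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Error checking is left out to keep the sketch short; real code would check the return value of each runtime call.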

