Closed Bug 1336170 Opened 8 years ago Closed 7 years ago

Intermittent LeakSanitizer | leak at Create, HostDB_InitEntry, PLDHashTable::Add, nsHostResolver::ResolveHost

Categories

(Core :: Networking: DNS, defect, P2)

RESOLVED WORKSFORME
Tracking Status
firefox53 --- affected

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, memory-leak, Whiteboard: [necko-next])

Blocks: LSan
Keywords: mlk
Nick, can you take a look? The predictor is involved.
Flags: needinfo?(hurley)
Whiteboard: [necko-next]
I'm 99% certain this isn't predictor-related: the predictor doesn't do anything strange with the DNS service (and anything it does, it's been doing for at least a year without issue now). Neither piece of code has changed recently, either. Looks like there's a lot of manual addref/release going on around there, so this is probably a latent issue that's being exposed now for whatever reason. Patrick, I remember you doing some automatic memory management stuff with channels a couple of years back, but I can't remember if you ever tried it on the DNS stuff. Did you try, and it just never landed because it was too hairy, or is this something that hasn't been attempted yet?
Flags: needinfo?(hurley) → needinfo?(mcmanus)
This is probably a shutdown thing. We want to fix it for the sake of CI, but it won't have a product impact.

The predictor does this:

    PREDICTOR_LOG((" doing preresolve %s", hostname.get()));
    nsCOMPtr<nsICancelable> tmpCancelable;
    mDnsService->AsyncResolve(hostname,
                              (nsIDNSService::RESOLVE_PRIORITY_MEDIUM |
                               nsIDNSService::RESOLVE_SPECULATE),
                              mDNSListener, nullptr,
                              getter_AddRefs(tmpCancelable));

and then forgets about tmpCancelable (which is what is being leaked). Actually Cancel()ing it on shutdown would most likely fix the issue.

OTOH, nsHostResolver tries to deal with this in Shutdown(), calling ClearPendingQueue, which should trigger a callback on that and make the references work out.

Perhaps more likely is that we only block for 25 seconds on shutdown for any OS threads that are blocked in getaddrinfo(); there is no way to cancel or interrupt that. (And we don't block at all if we aren't a leak-checking build, but this obviously is one :)). I guess it's possible that that could be caused by a genuine network glitch, since the OS will retry for a long time. If that were the case, it would probably be worth figuring out the logging so we can WONTFIX it in the future more easily. Probably an NS_WARN_IF when the thread count isn't 0.

Valentin, thoughts?
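To illustrate the "keep the cancelable and Cancel() it on shutdown" idea from the comment above, here is a minimal standalone sketch. It is a toy model, not Gecko code: ToyResolver, ToyPredictor, PendingRequest and PendingAfterShutdown are hypothetical names invented for this example, std::shared_ptr stands in for XPCOM reference counting, and the resolver-side cleanup that nsHostResolver::Shutdown() is said to perform is deliberately left out so the effect of the caller-side cancel is visible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the cancelable handle returned by a resolve call.
struct PendingRequest {
  std::string host;
  std::function<void(bool)> onDone;  // true = resolved, false = cancelled
  bool finished = false;

  void Finish(bool resolved) {
    if (finished) return;  // fire the callback at most once
    finished = true;
    auto cb = std::move(onDone);
    onDone = nullptr;  // release anything the callback captured
    if (cb) cb(resolved);
  }
  void Cancel() { Finish(false); }
};

// Toy resolver (assumption): owns a queue of outstanding requests.
class ToyResolver {
 public:
  std::shared_ptr<PendingRequest> AsyncResolve(std::string host,
                                               std::function<void(bool)> cb) {
    auto req = std::make_shared<PendingRequest>();
    req->host = std::move(host);
    req->onDone = std::move(cb);
    mQueue.push_back(req);
    return req;
  }

  // Drop every request that has already completed or been cancelled.
  void PruneFinished() {
    mQueue.erase(std::remove_if(mQueue.begin(), mQueue.end(),
                                [](const std::shared_ptr<PendingRequest>& r) {
                                  return r->finished;
                                }),
                 mQueue.end());
  }

  std::size_t PendingCount() const { return mQueue.size(); }

 private:
  std::vector<std::shared_ptr<PendingRequest>> mQueue;
};

// Toy caller (assumption) playing the predictor's role.
class ToyPredictor {
 public:
  explicit ToyPredictor(ToyResolver& resolver) : mResolver(resolver) {}

  void Preresolve(const std::string& host, bool keepHandle) {
    auto handle = mResolver.AsyncResolve(host, [this](bool resolved) {
      if (!resolved) ++mCancelled;
    });
    if (keepHandle) mInFlight.push_back(std::move(handle));
    // Otherwise the handle is dropped here, as with tmpCancelable.
  }

  void Shutdown() {
    for (auto& req : mInFlight) req->Cancel();
    mInFlight.clear();
    mResolver.PruneFinished();
  }

  int Cancelled() const { return mCancelled; }

 private:
  ToyResolver& mResolver;
  std::vector<std::shared_ptr<PendingRequest>> mInFlight;
  int mCancelled = 0;
};

// Returns how many requests are still queued after the caller shuts down.
std::size_t PendingAfterShutdown(bool keepHandles) {
  ToyResolver resolver;
  ToyPredictor predictor(resolver);
  predictor.Preresolve("example.com", keepHandles);
  predictor.Preresolve("example.org", keepHandles);
  predictor.Shutdown();
  assert(predictor.Cancelled() == (keepHandles ? 2 : 0));
  return resolver.PendingCount();
}
```

In this model, PendingAfterShutdown(true) leaves nothing queued, while PendingAfterShutdown(false) leaves both requests outstanding because the caller has no handle left to cancel. Whether the real leak follows this pattern is exactly what the comment leaves open.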
Flags: needinfo?(mcmanus) → needinfo?(valentin.gosu)
This seems to have the same signature as bug 1183781. I tried to fix that several times by using smart pointers, but it got backed out because of crashes, and I never got to the bottom of it. I would appreciate another pair of eyes on this issue.
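For readers unfamiliar with the manual addref/release versus smart pointer distinction discussed in this bug, the following is a generic sketch of the difference. It is not the code from this bug or from bug 1183781, and it says nothing about the crashes that led to the backouts: ToyRecord, ToyRefPtr, LookupManual and LookupSmart are hypothetical names, and the early-return leak is just one common way manual reference counting goes wrong.

```cpp
#include <cassert>
#include <utility>

// Live-object counter so the example can observe leaks.
static int gLiveRecords = 0;

// Hypothetical intrusively ref-counted record (not a real Gecko class).
class ToyRecord {
 public:
  ToyRecord() { ++gLiveRecords; }
  void AddRef() { ++mRefCnt; }
  void Release() {
    assert(mRefCnt > 0);
    if (--mRefCnt == 0) delete this;
  }

 private:
  ~ToyRecord() { --gLiveRecords; }  // only Release() may destroy the object
  int mRefCnt = 0;
};

// Minimal RAII holder (assumption): AddRef on acquire, Release on destruction.
template <typename T>
class ToyRefPtr {
 public:
  explicit ToyRefPtr(T* p) : mPtr(p) {
    if (mPtr) mPtr->AddRef();
  }
  ToyRefPtr(const ToyRefPtr& other) : mPtr(other.mPtr) {
    if (mPtr) mPtr->AddRef();
  }
  ToyRefPtr& operator=(ToyRefPtr other) {
    std::swap(mPtr, other.mPtr);  // old pointer is released by 'other'
    return *this;
  }
  ~ToyRefPtr() {
    if (mPtr) mPtr->Release();
  }
  T* get() const { return mPtr; }

 private:
  T* mPtr;
};

// Manual reference counting: the early return skips Release().
bool LookupManual(bool failEarly) {
  ToyRecord* rec = new ToyRecord();
  rec->AddRef();
  if (failEarly) return false;  // bug: the reference is never released
  rec->Release();
  return true;
}

// Same logic with the RAII holder: every exit path releases the reference.
bool LookupSmart(bool failEarly) {
  ToyRefPtr<ToyRecord> rec(new ToyRecord());
  if (failEarly) return false;  // destructor releases the reference
  return true;
}
```

Calling LookupSmart(true) leaves gLiveRecords at 0, whereas LookupManual(true) leaves one record alive, which is the kind of thing a leak checker would report at shutdown.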
Flags: needinfo?(valentin.gosu)
See Also: → 1183781
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME