Garbage collect event memory #4832
Conversation
The problem with anything like this is that certain workloads will have a ping-pong effect of allocating, deallocating and reallocating. Have we checked what the total amount of capacity being taken up by this is in a reasonable workflow? For example, moving the mouse will stop for at least a few frames, then as soon as the mouse starts moving again, we'd have to allocate again (potentially reallocating too).
The primary motivator here is the RFC linked above, which suggests we move hierarchy alteration detection to an event instead of incurring heavier archetype moves. The actual shrink should be pretty cheap given the Vec is emptied before the reallocation. Given that events tend to be either streamed in at a consistent pace (roughly the same number of events per frame) or sent in bursts, potentially many frames apart, I think this strategy accounts for both fairly well. In the consistent case, capacity is retained at roughly the current usage; otherwise, if it isn't being used, it exponentially decays every two frames. In the aforementioned mouse example, the movements per frame will trigger a consistent flow of events frame to frame, requiring reallocations only on the ramp-up. As it falls off, the capacity decays, but provides a window for it to more easily ramp up again if need be.

If we don't do this automatically, we should at least provide the tools/APIs for allowing users to do it manually themselves. IMO it's irresponsible of us to permanently allocate memory on behalf of the user and provide them no way to rein it in.
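The decay behavior described above can be sketched as a small helper (hypothetical, not the PR's actual code): each update halves the empty buffer's capacity, but steady traffic keeps capacity pinned to current usage.

```rust
/// Hypothetical sketch of the exponential-decay shrink described above:
/// after each update, the (now empty) back buffer's capacity is halved,
/// but never dropped below what was actually used this frame.
fn decayed_capacity(current_capacity: usize, used_this_frame: usize) -> usize {
    (current_capacity / 2).max(used_this_frame)
}

fn main() {
    // Idle stream: capacity decays 1024 -> 512 -> 256 ... toward zero usage.
    assert_eq!(decayed_capacity(1024, 10), 512);
    // Steady traffic: capacity is retained near current usage.
    assert_eq!(decayed_capacity(64, 60), 60);
}
```

This is why the consistent-stream case never shrinks below its working set, while an idle buffer decays exponentially.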
Yeah, I think this is reasonable. I think something like this is good as a starting point to ensure memory is reclaimed (which does feel like table stakes). That being said, providing a user-configurable "off button" for garbage collection would make me more comfortable merging this. I'm relatively certain that there will be cases that would benefit from that, and it seems pretty easy to add in. Something like:

```rust
enum EventGarbageCollection {
    ExponentialFallOff {
        /// Each internal vec never goes below this value.
        /// There are two internal vecs, so the total minimum capacity is 2 * min_capacity.
        min_capacity: usize,
    },
    None,
}
```

Note that I added `None`. We can provide additional configuration and tweaks to optimize as needed.
Should we have this on every
I think it should be for each. I'm thinking there are two paths we could take:

As a riff on (2), we could consider replacing it. I think (2) is more "idiomatic bevy", so that has my preference. But I'm open to suggestions.
This 100% needs to be controlled separately for each event type due to differing usage patterns. User-controllable defaults might be good though.

There is also a third approach: encode the configuration at the type level. The downside to that style of design, of course, is that it's fixed at the type level and can't be configured by users or dynamically modified. I think those are serious limitations, and they will interfere with the ability to automatically explore micro-optimizations down the line in user code.

Of all these options, I think I prefer approach 2, so we're not fetching the config data every single time. And moving this to a more standard plugin model is good, but we should keep the convenience method (or do autoregistration).
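The per-event-type configuration favored here (approach 2) can be sketched with a generic settings type parameterized by the event type; all names below are illustrative, not Bevy's actual API.

```rust
use std::marker::PhantomData;

// Hypothetical sketch of per-event-type settings: parameterizing the
// settings struct by the event type lets each event type configure
// garbage collection independently.
struct EventSettings<E> {
    min_capacity: usize,
    _marker: PhantomData<E>,
}

impl<E> Default for EventSettings<E> {
    fn default() -> Self {
        Self { min_capacity: 0, _marker: PhantomData }
    }
}

struct MouseMoved;
struct WindowResized;

fn main() {
    // Each event type gets its own, independently tunable settings value.
    let mouse = EventSettings::<MouseMoved> { min_capacity: 128, ..Default::default() };
    let resize = EventSettings::<WindowResized>::default();
    assert_eq!(mouse.min_capacity, 128);
    assert_eq!(resize.min_capacity, 0);
}
```

Because the settings live in a value rather than in the type, they remain user-configurable and dynamically modifiable, unlike the type-level approach 3.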
```rust
/// Settings for controlling the garbage collection behavior of [`Events<T>`].
pub enum EventGarbageCollection {
```
I really like this flavor of design; there are a number of other settings that I'd like to be able to configure for events (e.g. smallvec usage and cleanup strategy).
As a result, I think we should refactor `add_event` to use a generic plugin under the hood.
```rust
/// Settings for controlling the garbage collection behavior of [`Events<T>`].
pub enum EventGarbageCollection {
    /// Exponentially decays the amount of allocated memory for the backing event buffer
```
```diff
-    /// Exponentially decays the amount of allocated memory for the backing event buffer
+    /// Halves the amount of allocated memory for the backing event buffer each tick
```
```diff
@@ -448,8 +494,12 @@ impl<E: Event> Events<E> {
     }
 
     /// A system that calls [`Events::update`] once per frame.
-    pub fn update_system(mut events: ResMut<Self>) {
-        events.update();
+    pub fn update_system(mut events: ResMut<Self>, settings: Option<Res<EventSettings<E>>>) {
```
This is a nice use of an optional resource; I like the way it avoids polluting the resource list with default values.
Co-authored-by: Alice Cecile <[email protected]>
my 2c: This strategy, although interesting for some scenarios, is probably too aggressive to be the default one, and we should consider a smoother version kicking in only after a few frames. And as always, if the caller knows for a fact there's an atypical pattern on one frame, we should offer an escape hatch to manually trigger a GC, as the caller always has more context to decide what's best.
One thing also: the proposed scheme will fall flat if recurring bursts occur alternately on frame A and frame B (e.g. every 3 frames), as we only look at the size of the current buffer to decide if it needs to shrink. We should probably have at least a slightly more complex heuristic that looks at both buffers to determine an expected capacity for the next frame.
Also, I'm not sure what order of magnitude of memory allocations we're talking about here, but if this is a strong concern, we should maybe consider using a single circular buffer instead of 2 swapping vecs. The advantage is that there's a single allocation covering 2 frames together, so it will more easily absorb bursts and other ping-pong effects, and it naturally solves the issue of the 2-buffer heuristic mentioned above. The ideal data structure is probably similar to a deque, with blocks of N items (to reduce the number of allocations), and the possibility to add more blocks to handle bursts (here, only at one end, unlike deque).
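The two-buffer heuristic suggested above can be sketched as follows (hypothetical helper, not part of the PR): sizing from the peak of both frame buffers prevents bursts that alternate between frames from triggering shrink/grow ping-pong.

```rust
// Sketch of a heuristic that consults both event buffers before shrinking:
// decay toward the recent peak across both frames, never below it (or a
// configured floor), so alternating bursts don't cause repeated reallocation.
fn target_capacity(len_a: usize, len_b: usize, current: usize, min: usize) -> usize {
    let peak = len_a.max(len_b);
    (current / 2).max(peak).max(min)
}

fn main() {
    // A burst of 900 events in the other buffer keeps capacity from collapsing.
    assert_eq!(target_capacity(0, 900, 1024, 4), 900);
    // Fully idle: halve toward the configured floor.
    assert_eq!(target_capacity(0, 0, 1024, 4), 512);
    // Floor is respected once decay bottoms out.
    assert_eq!(target_capacity(0, 0, 4, 4), 4);
}
```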
```rust
if let Some(settings) = settings {
    events.update(&settings);
} else {
    events.update(&Default::default());
}
```
Does `default()` work here (and below)? Or do we need the extra `Default::` qualification?
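For context on the question above: on stable Rust a bare `default()` call is not in scope by default; the trait method has to be reached as `Default::default()` (Bevy re-exports a free `default()` helper in `bevy::utils`, if memory serves, but that needs an explicit import). A minimal plain-Rust illustration:

```rust
// The trait-qualified form works on stable Rust with type inference;
// a bare `default()` would need a separate free-function import.
fn min_capacity_or_default(explicit: Option<usize>) -> usize {
    explicit.unwrap_or_else(|| Default::default())
}

fn main() {
    assert_eq!(min_capacity_or_default(Some(128)), 128);
    // `Default::default()` for usize is 0.
    assert_eq!(min_capacity_or_default(None), 0);
}
```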
```rust
let new_capacity = match settings.garbage_collection {
    EventGarbageCollection::ExponentialFalloff { min_capacity } => self
        .events_b
        .len()
```
I don't know how `Vec::shrink_to()` is implemented, but if the length of the vec decreases slowly frame-to-frame, this will cause a (non no-op) call to `Vec::shrink_to()` every frame, which potentially will lead to a reallocation every frame depending on the implementation of `Vec::shrink_to()` (which I imagine has some kind of threshold). This should probably at least be tested/validated with various sizes to understand the behavior, because if the implementation is a bit naive and it ends up doing one allocation per frame, then that defeats the purpose of having reusable buffers in the first place.
More generally, the exponential decay here is very aggressive. Given the cost of allocations compared to the relative benefit of recovering memory, it's probably worth having a smoother heuristic which observes the trend over several frames before kicking in a GC operation. Or at least make that an option, make that option the default, and leave the aggressive instant-trigger decay variant for those few `Events<T>` which we know for a fact are prone to large bursts, for which we do not want to pay the cost of a large unused allocation and instead prefer paying the cost of a reallocation. But I don't think reallocating as soon as possible should be the default.
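The smoother heuristic proposed above could look something like this sketch (hypothetical, not the PR's code): track peak usage over a sliding window of frames and only shrink once the capacity is clearly oversized relative to that observed peak.

```rust
use std::collections::VecDeque;

// Sketch of a trend-observing shrink heuristic: record per-frame usage in a
// fixed-size window and only recommend a shrink once a full window has been
// seen and capacity exceeds twice the windowed peak.
struct ShrinkTracker {
    window: VecDeque<usize>,
    window_len: usize,
}

impl ShrinkTracker {
    fn new(window_len: usize) -> Self {
        Self { window: VecDeque::new(), window_len }
    }

    /// Record this frame's usage; returns a new target capacity if a shrink is due.
    fn observe(&mut self, len: usize, capacity: usize) -> Option<usize> {
        self.window.push_back(len);
        if self.window.len() > self.window_len {
            self.window.pop_front();
        }
        let peak = self.window.iter().copied().max().unwrap_or(0);
        // Shrink only after a full window, and only when clearly oversized.
        (self.window.len() == self.window_len && capacity > peak * 2)
            .then(|| peak.max(1) * 2)
    }
}

fn main() {
    let mut tracker = ShrinkTracker::new(3);
    assert_eq!(tracker.observe(800, 1024), None); // window not full yet
    assert_eq!(tracker.observe(10, 1024), None);
    assert_eq!(tracker.observe(10, 1024), None); // burst of 800 still in window
    assert_eq!(tracker.observe(10, 1024), Some(20)); // burst aged out: shrink
}
```

The one-frame burst keeps the buffer large until it ages out of the window, which is the "observes the trend over several frames" behavior the comment asks for.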
Closing due to inactivity as part of backlog cleanup, and treating the linked RFC bevyengine/rfcs#53 as the tracking issue. cc: @alice-i-cecile in case I got this one wrong, bit above my pay grade 🙂
Objective
As mentioned briefly in discussions in bevyengine/rfcs#53, the memory used for events is never garbage collected despite the data being stored transiently. Bevy should release that memory when possible to avoid holding it for too long.
Solution
Use `Vec::shrink_to` to halve the capacity of the backing `Vec` for events after clearing during updates.
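The core mechanism can be illustrated with a standalone sketch (assumptions: `Vec::shrink_to` only guarantees capacity stays at least as large as both the length and the requested lower bound; the exact resulting capacity is allocator-dependent):

```rust
// Minimal sketch of the PR's approach: after clearing the buffer during an
// update, request that its capacity be halved with `Vec::shrink_to`.
fn update(buffer: &mut Vec<u32>) {
    let target = buffer.capacity() / 2;
    buffer.clear();
    // `shrink_to` keeps capacity >= max(len, target); since the buffer was
    // just cleared, len is 0 and the target dominates.
    buffer.shrink_to(target);
}

fn main() {
    let mut events: Vec<u32> = Vec::with_capacity(1024);
    events.extend(0..100);
    update(&mut events);
    assert!(events.is_empty());
    // Guaranteed lower bound; with the default allocator the capacity
    // typically lands at exactly the requested 512.
    assert!(events.capacity() >= 512);
}
```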