Patrick Steinhardt authored
For different reasons, we need to perform regular full repacks both in
normal repositories and in object pools. These full repacks are guided
by a cooldown period so that we perform them only when the last full
repack happened longer ago than the cooldown period.
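
As a rough illustration of the cooldown check described above, the
following Python sketch decides whether a full repack is due. The
function name `needs_full_repack` and its parameters are hypothetical
and are not taken from the actual implementation.

```python
from datetime import datetime, timedelta


def needs_full_repack(last_full_repack: datetime, now: datetime,
                      cooldown: timedelta) -> bool:
    """Return True if the last full repack is older than the cooldown.

    Hypothetical helper, for illustration only.
    """
    return now - last_full_repack > cooldown
```

With a cooldown of five days, a repository whose last full repack was
six days ago would qualify, while one repacked three days ago would not.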

For object pools, the reason we do full repacks is to refresh deltas so
that they again honor our delta islands. This is not all that important
to users and should not be noticeable in general when we do this less
frequently. Consequently, we only perform a full repack once every
week.

For normal repositories full repacks are mostly done in order to
guarantee that objects will get evicted into cruft packs so that they
can be expired and thus deleted. This _is_ something that both we and
our customers care about because it translates directly into the disk
space that is required. It is thus prudent to perform these repacks on
a more regular basis so that objects get deleted quickly.

That being said, there is interplay between the stale object grace
period (which is 14 days) and the cooldown period (which is 1 day).
Effectively, assuming that a repository gets daily optimization jobs,
and keeping in mind that we need to perform two full repacks in order
to delete an unreachable object, objects will get deleted after 14 to
15 days:

    1. The first full repack on days 0 to 1 will evict the unreachable
       object into a cruft pack.

    2. We wait 14 days and will thus land either on day 14 or 15.

    3. We perform a second full repack to expire the object that is now
       part of the cruft pack.

This interplay between both periods is important, because it means that
we can make trade-offs between tuning the cooldown period and the stale
object grace period without actually impacting the median time to
deletion:

    - Increasing the cooldown period means we need to perform full
      repacks less often, thus saving on resources. Conversely,
      decreasing the cooldown period means more frequent full repacks
      and thus using more resources.

    - Increasing the grace period means we'll have a longer window in
      which racy access to Git objects is avoided, with the downside of
      more disk space use. Decreasing the grace period means we are more
      likely to hit racy access to Git objects, but we expire objects
      and thus free up disk space sooner.

Now, optimizing the cooldown period is something we're very keen to do
because it directly impacts how many resources we and our customers
need to provision for machines. On the other hand, the grace period is
mostly there to avoid racy access to Git objects, and two weeks feels
excessive for that.

So long story short, this commit changes our strategy to increase the
full repack cooldown period to 5 days instead of 1 day while decreasing
the stale object grace period from 14 days to 7 days to counteract the
longer time-to-deletion for stale objects. This means objects will get
deleted 12 to 17 days after becoming unreachable, with a median value of
14.5 days. This is the exact same median value as previously, so the
time-to-deletion should not change in practice. But on the other hand,
it does allow us to greatly save on compute resources by reducing the
frequency at which we perform full repacks to one fifth.
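
The arithmetic behind these figures can be checked with a small Python
sketch. It only uses the deletion windows and cooldown periods stated
above. Treating the median as the midpoint of each window is an
assumption that holds if deletion times are spread evenly across the
window. All names are hypothetical.

```python
def midpoint(earliest_days: float, latest_days: float) -> float:
    """Midpoint of a deletion window, in days."""
    return (earliest_days + latest_days) / 2


# Deletion windows as stated in this commit message, in days.
old_window = (14, 15)  # 1-day cooldown, 14-day grace period
new_window = (12, 17)  # 5-day cooldown, 7-day grace period

old_median = midpoint(*old_window)
new_median = midpoint(*new_window)

# Full repacks run at most once per cooldown period, so the relative
# frequency is the ratio of the old cooldown to the new one.
old_cooldown_days = 1
new_cooldown_days = 5
repack_frequency_ratio = old_cooldown_days / new_cooldown_days

print(old_median, new_median, repack_frequency_ratio)
```

Both medians come out at 14.5 days, and the ratio of 0.2 corresponds to
the reduction to one fifth.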

Furthermore, as the repack cooldown periods for normal repositories and
object pools are almost the same now, let's merge them so that we have
one less special case to think about.

Changelog: changed
21d61f7d