  Feb 02, 2022
    • Optimize link repository ID migration · f7d0bd08
      Stan Hu authored
      The previous migration was slow at times because the update would cause
      PostgreSQL to do a merge join and then filter out rows matching
      `repository_id IS NULL`. As more migrated rows gained a `repository_id`,
      the query time for each batch increased significantly.
      
      The batching had been added to keep the payload size of a trigger
      update within PostgreSQL's limits.
      
      We can make this migration go faster by disabling the triggers inside
      the transaction, rolling back to 2bbec66c, and re-enabling the triggers
      afterwards, as sketched below.
      
      Relates to https://gitlab.com/gitlab-org/gitaly/-/issues/3973
      
      Changelog: fixed
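
      A minimal sketch of that pattern, assuming a hypothetical trigger name
      `notify_on_change` and simplified table and column names (the real
      migration lives in Praefect's sql-migrate files):

      ```go
      package migrate

      import (
          "context"
          "database/sql"
      )

      // migrateRepositoryIDs sketches the approach from the commit message:
      // disable the notification triggers inside the transaction, run the
      // update as a single statement instead of in small batches, and
      // re-enable the triggers before committing.
      func migrateRepositoryIDs(ctx context.Context, db *sql.DB) error {
          tx, err := db.BeginTx(ctx, nil)
          if err != nil {
              return err
          }
          defer tx.Rollback() // no-op once Commit has succeeded

          // With the trigger disabled, the update produces no NOTIFY
          // payloads, so the payload size limit that forced the batching
          // no longer applies.
          if _, err := tx.ExecContext(ctx,
              `ALTER TABLE storage_repositories DISABLE TRIGGER notify_on_change`); err != nil {
              return err
          }

          if _, err := tx.ExecContext(ctx, `
              UPDATE storage_repositories AS sr
              SET repository_id = r.repository_id
              FROM repositories AS r
              WHERE sr.virtual_storage = r.virtual_storage
                AND sr.relative_path = r.relative_path
                AND sr.repository_id IS NULL`); err != nil {
              return err
          }

          if _, err := tx.ExecContext(ctx,
              `ALTER TABLE storage_repositories ENABLE TRIGGER notify_on_change`); err != nil {
              return err
          }

          return tx.Commit()
      }
      ```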
    • Merge branch 'jc-fix-cache-test' into 'master' · b1859840
      Patrick Steinhardt authored
      Fail Read if objectReader is closed
      
      Closes #3823
      
      See merge request gitlab-org/gitaly!3944
    • Merge branch 'pks-supervisor-timeout-fixes' into 'master' · 46b9f457
      Sami Hiltunen authored
      supervisor: Fix bugs related to timeouts
      
      See merge request gitlab-org/gitaly!4029
    • gitaly-lfs-smudge: Fix missing close for HTTP body · 08973448
      Patrick Steinhardt authored
      The gitaly-lfs-smudge command is a smudge filter for Git which replaces
      the contents of LFS pointers with the actual LFS object's contents. To
      do so, we need to request the object's contents from Rails via an HTTP
      request. The tests exercising this code all of a sudden started failing
      due to leaked goroutines, where the leak happens in the HTTP handling
      code. And sure enough: we never close the `http.Response` body, which
      is likely the root cause here.
      
      Fix this by always closing the body, as sketched below. While I have no
      idea why the leaks only started to happen now, chances are high that
      this fixes the new flake.
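
      A minimal sketch of the fix, assuming an illustrative URL rather than
      the real Rails-internal API:

      ```go
      package smudge

      import (
          "fmt"
          "io"
          "net/http"
      )

      // fetchLFSObject shows the pattern: the response body must be closed
      // on every return path, or the HTTP client keeps the underlying
      // connection (and its goroutine) alive, which is exactly what the
      // leak-detecting tests flagged.
      func fetchLFSObject(url string) ([]byte, error) {
          resp, err := http.Get(url)
          if err != nil {
              return nil, err
          }
          defer resp.Body.Close()

          if resp.StatusCode != http.StatusOK {
              return nil, fmt.Errorf("unexpected status: %s", resp.Status)
          }
          return io.ReadAll(resp.Body)
      }
      ```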
  Nov 22, 2021
    • datastore: Revert use of materialized views · 38b5c332
      Patrick Steinhardt authored
      Revert the introduction of materialized views for `valid_primaries`. As
      it turns out, the changes cause incompatibilities with Postgres 11,
      which is still actively in use (likely because the `MATERIALIZED`
      keyword for common table expressions is only supported by Postgres 12
      and later). Furthermore, the performance issues we have seen have not
      been fully fixed by this change, and we do not yet fully understand
      their root cause.
      
      Changelog: fixed
  Nov 20, 2021
    • list-untracked-repositories: Praefect sub-command to show untracked repositories · b6fb5c33
      Pavlo Strokov authored
      The change is a backport of the functionality implemented to resolve
      https://gitlab.com/gitlab-org/gitaly/-/issues/3792. It adds a new
      sub-command to the praefect binary. When run, it connects to all
      Gitaly storages set in the configuration file and receives from each a
      list of the repositories existing on disk. Each repository is then
      checked against the Praefect database; if it is not tracked there, the
      location of the repository is printed in JSON format to the stdout of
      the process (see the sketch after this entry).
      
      Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3792
      
      Changelog: added
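
      A rough sketch of that loop, with `walkStorage` and `isTracked`
      standing in for the real Gitaly RPC and Praefect database lookup (both
      hypothetical here), and illustrative JSON field names:

      ```go
      package praefect

      import (
          "context"
          "encoding/json"
          "fmt"
          "os"
      )

      // untrackedRepository is the JSON shape printed for every repository
      // found on disk but missing from the Praefect database.
      type untrackedRepository struct {
          Storage      string `json:"storage"`
          RelativePath string `json:"relative_path"`
      }

      func listUntracked(
          ctx context.Context,
          storages []string,
          walkStorage func(ctx context.Context, storage string) ([]string, error),
          isTracked func(ctx context.Context, storage, relativePath string) (bool, error),
      ) error {
          enc := json.NewEncoder(os.Stdout)
          for _, storage := range storages {
              paths, err := walkStorage(ctx, storage)
              if err != nil {
                  return fmt.Errorf("walk storage %q: %w", storage, err)
              }
              for _, relativePath := range paths {
                  tracked, err := isTracked(ctx, storage, relativePath)
                  if err != nil {
                      return err
                  }
                  if !tracked {
                      // Repositories unknown to Praefect are reported on stdout.
                      if err := enc.Encode(untrackedRepository{
                          Storage:      storage,
                          RelativePath: relativePath,
                      }); err != nil {
                          return err
                      }
                  }
              }
          }
          return nil
      }
      ```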
    • sql-migrate: Update storage_repositories table · 0e6a5d2a
      Pavlo Strokov authored
      The batch update query, introduced to mitigate PostgreSQL's limit on
      the payload size that can be sent by the NOTIFY function, was missing
      a condition in the update statements. Because of that, the payload
      contained changes not for N storage-repository entries, but for
      (N * num_of_storages) entries, so an initial batch size of 150 became
      450 with 3 storages in use.
      
      The change also significantly reduces the batch size. The calculation
      was done on test data similar to production data. The approximate
      payload size for a single row is about 470 bytes. As the maximum
      payload size is 8k bytes, we can use no more than 16-17 entries per
      batch; to be safe we reduce it to 14 (see the arithmetic sketched
      below).
      
      Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3806
      
      Changelog: fixed
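
      The batch size arithmetic as a small sketch; the 8000 byte figure is
      PostgreSQL's default NOTIFY payload limit, and the 470 bytes per row
      is the estimate quoted above:

      ```go
      package main

      import "fmt"

      const (
          maxNotifyPayload = 8000 // bytes; PostgreSQL's default NOTIFY limit
          bytesPerRow      = 470  // approximate payload per storage-repository row
      )

      func main() {
          // 8000 / 470 = 17, so 16-17 entries fit at most; the migration
          // settles on 14 to leave headroom for larger-than-average rows.
          fmt.Println(maxNotifyPayload / bytesPerRow) // 17
      }
      ```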
  Nov 19, 2021
    • Merge branch 'pks-praefect-datastore-collector-metrics-endpoint-v14.4' into '14-4-stable' · f60e6616
      Patrick Steinhardt authored
      praefect: Backport separate endpoint for datastore collector (v14.4)
      
      See merge request gitlab-org/gitaly!4094
    • praefect: Do not collect repository store metrics on startup · 3cde9b5e
      John Cai authored
      Our current code path triggers the RepositoryStoreCollector to query
      the database on startup, even if the Prometheus listener is not
      listening. This is because we call DescribeByCollect in the Describe
      method: the Prometheus client calls Describe on Register, which ends
      up triggering the Collect method and hence runs the queries. Instead,
      we can provide the descriptions separately from the Collect method,
      as sketched below.
      
      Changelog: fixed
      (cherry picked from commit 90cb7fb7)
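
      A sketch of the fix, with an illustrative collector type standing in
      for Praefect's RepositoryStoreCollector:

      ```go
      package praefect

      import "github.com/prometheus/client_golang/prometheus"

      type collector struct {
          readOnlyDesc *prometheus.Desc
      }

      func newCollector() *collector {
          return &collector{
              readOnlyDesc: prometheus.NewDesc(
                  "gitaly_praefect_read_only_repositories",
                  "Number of repositories in read-only mode.",
                  []string{"virtual_storage"}, nil,
              ),
          }
      }

      // Describe used to be implemented via prometheus.DescribeByCollect(c, ch),
      // which calls Collect and therefore hits the database as soon as the
      // collector is registered. Sending the static description keeps
      // registration free of side effects.
      func (c *collector) Describe(ch chan<- *prometheus.Desc) {
          ch <- c.readOnlyDesc
      }

      // Collect still runs the expensive database query, but now only when
      // metrics are actually scraped.
      func (c *collector) Collect(ch chan<- prometheus.Metric) {
          // ... query the database and emit metrics via
          // prometheus.MustNewConstMetric(c.readOnlyDesc, ...) ...
      }
      ```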
    • praefect: Add ability to have separate database metrics endpoint · ebaade4a
      John Cai authored
      By default, when metrics are enabled, each Praefect exposes
      information about how many read-only repositories there are, which
      requires Praefect to query the database. First, this results in the
      same metrics being exposed by every Praefect, given that the database
      is shared between all of them. And second, this causes one query per
      Praefect per scraping run. This cost adds up and generates quite some
      load on the database, especially so if there are a lot of repositories
      in that database, up to a point where it may overload the database
      completely.
      
      Fix this issue by splitting metrics which hit the database into a
      separate endpoint "/db_metrics" (sketched below). This allows admins
      to set up a separate scraper with a different scraping interval for
      this metric; furthermore, it gives the ability to only scrape this
      metric on one of the Praefect instances so the work isn't
      unnecessarily duplicated.
      
      Given that this is a breaking change which will get backported, we
      must make this behaviour opt-in for now. We thus include a new
      configuration key "prometheus_use_database_endpoint" which enables the
      new behaviour, so that existing installations' metrics won't break on
      a simple point release. The intent, though, is to eventually remove
      this configuration and enable the behaviour for all setups in a major
      release.
      
      Changelog: added
      (cherry picked from commit 7e74b733)
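
      A sketch of the wiring: the database-backed collector goes onto its
      own registry served at "/db_metrics", while everything else stays on
      the default "/metrics" handler (function name and setup are
      illustrative):

      ```go
      package praefect

      import (
          "net/http"

          "github.com/prometheus/client_golang/prometheus"
          "github.com/prometheus/client_golang/prometheus/promhttp"
      )

      func newMetricsMux(dbCollector prometheus.Collector) *http.ServeMux {
          // A dedicated registry, so that scraping "/metrics" never touches
          // the database.
          dbRegistry := prometheus.NewRegistry()
          dbRegistry.MustRegister(dbCollector)

          mux := http.NewServeMux()
          mux.Handle("/metrics", promhttp.Handler())
          mux.Handle("/db_metrics", promhttp.HandlerFor(dbRegistry, promhttp.HandlerOpts{}))
          return mux
      }
      ```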
    • prometheus: Avoid duplicated metrics registration · aac5d5e5
      Pavlo Strokov authored
      Praefect uses Prometheus to export metrics about its internals. It
      relies on the defaults of the Prometheus library to gather the set of
      metrics and to register new ones. Because of that, new metrics get
      registered on the DefaultRegisterer, a global pre-configured
      registerer, and we can't call the 'run' function multiple times (for
      testing purposes), as doing so results in a metrics registration
      error. To avoid that problem, the 'run' function is extended with a
      prometheus.Registerer parameter that is used to register Praefect's
      custom metrics (see the sketch below). The production code still uses
      the same DefaultRegisterer as before, and the test code creates a new
      instance of the registerer for each 'run' invocation, so there are no
      more duplicates.
      
      (cherry picked from commit 81368d46)
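
      A minimal sketch of the injection, with an illustrative metric:

      ```go
      package main

      import "github.com/prometheus/client_golang/prometheus"

      // run receives the Registerer from its caller instead of implicitly
      // using the global one.
      func run(reg prometheus.Registerer) error {
          requests := prometheus.NewCounter(prometheus.CounterOpts{
              Name: "praefect_example_requests_total",
              Help: "Illustrative counter registered on the injected registerer.",
          })
          if err := reg.Register(requests); err != nil {
              return err // duplicate registration would surface here
          }
          // ... start the service ...
          return nil
      }

      func main() {
          // Production keeps the global default registerer; tests instead
          // pass prometheus.NewRegistry() on every invocation so repeated
          // runs don't collide.
          _ = run(prometheus.DefaultRegisterer)
      }
      ```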
    • bootstrap: Abstract bootstrapper for testing · 43031e2d
      Pavlo Strokov authored
      The old implementation of the bootstrapper initialization did not
      allow calling the 'run' function to start a service, because the
      tableflip library doesn't support creating multiple instances in one
      process. Starting the Praefect service is required in tests to verify
      sub-command execution. The bootstrapper initialization is therefore
      extracted out of the 'run' function, which allows using a new Noop
      bootstrapper to run the service without tableflip support (a rough
      sketch follows).
      
      (cherry picked from commit 18ff3676)
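
      A rough sketch of the seam, assuming 'run' only needs listeners and a
      wait hook; the real interface in Gitaly may differ:

      ```go
      package bootstrap

      import "net"

      // bootstrapper is what 'run' now depends on instead of constructing
      // tableflip directly.
      type bootstrapper interface {
          listen(network, addr string) (net.Listener, error)
          wait() error
      }

      // noopBootstrapper hands out plain listeners and skips the tableflip
      // zero-downtime upgrade machinery entirely, which is all tests need.
      type noopBootstrapper struct{}

      func (noopBootstrapper) listen(network, addr string) (net.Listener, error) {
          return net.Listen(network, addr)
      }

      func (noopBootstrapper) wait() error {
          return nil // nothing to supervise without tableflip
      }
      ```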
  Nov 18, 2021
    • Merge branch 'smh-optimize-dataloss-query-14-4' into '14-4-stable' · c1cf3752
      Toon Claes authored
      Materialize valid_primaries view (14.4)
      
      See merge request gitlab-org/gitaly!4090
    • Materialize valid_primaries view in RepositoryStoreCollector · df6b165f
      Sami Hiltunen authored
      RepositoryStoreCollector gathers metrics on repositories which don't
      have a valid primary candidate available. This indicates the
      repository is unavailable, as the current primary is not valid and
      there are no valid candidates to fail over to. The query is currently
      extremely inefficient on some versions of Postgres, as it ends up
      computing the full valid_primaries view for each of the rows it
      checks. This doesn't seem to occur on all versions of Postgres; 12.6,
      at least, manages to push the search criteria down into the view.
      This commit fixes the situation by materializing the valid_primaries
      view prior to querying it, as sketched below. This ensures the full
      view isn't computed for all of the rows; instead, Postgres just uses
      the pre-computed result.
      
      Changelog: performance
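
      A sketch of the technique as the SQL would appear embedded in Go: the
      MATERIALIZED keyword (Postgres 12+) stops the planner from inlining
      the CTE, so the view is evaluated once and the outer query joins
      against that result. The query is a simplified stand-in for the
      collector's real one:

      ```go
      package datastore

      // The CTE shadows the valid_primaries view: inside the CTE body the
      // name still refers to the view, while the outer query sees only the
      // materialized, pre-computed rows.
      const unavailableRepositoriesQuery = `
      WITH valid_primaries AS MATERIALIZED (
          SELECT virtual_storage, relative_path, storage
          FROM valid_primaries
      )
      SELECT r.virtual_storage, COUNT(*)
      FROM repositories AS r
      WHERE NOT EXISTS (
          SELECT 1
          FROM valid_primaries AS vp
          WHERE vp.virtual_storage = r.virtual_storage
            AND vp.relative_path = r.relative_path
      )
      GROUP BY r.virtual_storage
      `
      ```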
    • Get the latest generation from repositories instead of a view · 57bef779
      Sami Hiltunen authored
      The dataloss query currently gets the latest generation of a
      repository from a view that takes the max generation from
      storage_repositories. This is unnecessary, as the repositories table
      already contains the latest generation; this commit reads it from
      there instead.
      
      Changelog: performance
  Nov 17, 2021
    • Materialize valid_primaries view in dataloss query · 6d569bb6
      Sami Hiltunen authored
      The dataloss query is extremely slow for bigger datasets. The problem
      is that for each row the dataloss query returns, Postgres computes
      the full result of the valid_primaries view only to filter it down
      to the correct record. This results in O(n²) complexity, which kills
      performance as soon as the dataset size increases. It's not clear why
      the join parameters are not pushed down into the view in this query.
      
      This commit optimizes the query by materializing the valid_primaries
      view. This ensures Postgres computes the full view only once and
      joins with the pre-computed result.
      
      Changelog: performance