This project is mirrored from https://gitlab.com/gitlab-org/gitaly.git.
  1. Jan 11, 2023
    • nodes: Set connection backoff MaxDelay to 1 second · aa235855
      Will Chandler authored
      gRPC clients use an exponential backoff strategy[0] for re-establishing
      connections, meaning that the longer a connection has been in a bad
      state the greater the delay before the client will make its next
      connection attempt. This is useful in scenarios where a very large
      number of clients could trigger a thundering herd effect on a server as
      it returns to service.
      
      In a Gitaly Cluster, this means that in cases where a Gitaly node is
      down for some time and a large connection backoff has been set,
      Praefect may wait to try to connect for up to 120 seconds. This causes
      Gitaly nodes to remain unavailable longer than necessary.
      
      The issues addressed by gRPC's default exponential backoff behavior do not
      apply in this scenario as we will always have a small number of clients
      (Praefect nodes), and the volume of traffic from healthchecks is dwarfed
      by normal production load.
      
      To resolve this, set the maximum backoff delay to one second.
      
      [0] https://github.com/grpc/grpc/blob/v1.51.0/doc/connection-backoff.md
      
      Changelog: fixed
      aa235855
  2. Oct 14, 2022
  3. Jul 18, 2022
    • test: Disable all test with tag sha256 · 33ff00ca
      Toon Claes authored
      We're about to add the ability to test with SHA256 hashes. We assume
      none of the tests work with this object format.
      
      With this change we add the build constraint to not run any test when
      the tag 'gitaly_test_sha256' is set.
      33ff00ca
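As a rough illustration of the mechanism, a Go file can be excluded from the build whenever a tag is set by placing a negated build constraint at the top of the file. The sketch below assumes the constraint takes the form shown; `objectFormat` is a hypothetical helper, and in the real repository the constraint would sit at the top of `_test.go` files rather than a `main` package.

```go
//go:build !gitaly_test_sha256

package main

import "fmt"

// objectFormat is a hypothetical helper standing in for code that is only
// known to work with SHA1 repositories.
func objectFormat() string {
	return "sha1"
}

func main() {
	// Built without tags, this file is compiled as usual. When built with
	// `-tags gitaly_test_sha256`, the constraint above excludes the file.
	fmt.Println(objectFormat())
}
```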
  4. Jun 21, 2022
    • Replace deprecated WithInsecure with WithTransportCredentials · 3dfb0d51
      Sami Hiltunen authored
      gRPC has deprecated the WithInsecure option in favor of the more
      general WithTransportCredentials combined with insecure.NewCredentials.
      This commit removes our usage of the deprecated option so we don't get
      lint failures for using deprecated functionality when upgrading our
      gRPC dependency.
      3dfb0d51
  5. May 20, 2022
    • Update go package name from v14 to v15 · cd77c046
      John Cai authored
      This commit changes the major version in the package name from v14 to
      v15
      
      Updating go.mod & go.sum with new module name v15
      
      Update Makefile to bump major version to v15
      
      Update the gitaly package name in the Makefile. Also update
      gitaly-git2go-v14 -> gitaly-git2go-v15. We need to keep
      gitaly-git2go-v14 for a release however, for zero downtime upgrades.
      This pulls directly from a sha that is v14.
      
      Update package name from v14->v15 for auth, client, cmd, internal packages
      
      This commit changes the package name from v14 to v15 in go and proto
      files in the internal, auth, client, cmd packages.
      
      proto: Update major package number in package name
      
      tools: Change major version number in package name from v14 to v15
      
      gitaly-git2go: Change the package name from v14 to v15
      
      update module updater for v15
      
      Update the documentation for the module updater to reflect v15
      cd77c046
  6. Jan 27, 2022
    • Automatically clean up testhelper.Context · 6c91c8e8
      Sami Hiltunen authored
      testhelper.Context() currently returns a cancellation function as a
      second return value. The majority of the tests do not need to explicitly
      cancel the context but they simply defer its cancellation to clean up
      after the test. Given this, we can reduce the test verbosity and make
      testhelper.Context easier to compose by removing the unnecessary second
      return value. This adds a t.Cleanup function to automatically cancel the
      context at the end of the tests and omits the returned cancellation
      function.
      
      Tests which simply `defer cancel()` have had the extra call removed. Some
      tests explicitly call the cancellation, and these tests have been modified
      to add context.WithCancel around the testhelper.Context call. There are a
      few locations where testing.TB was passed down to test helpers that create the
      context.
      6c91c8e8
  7. Dec 01, 2021
    • testhelper: Improve creation of loggers · 445d80ef
      Patrick Steinhardt authored
      The testhelper has functions to create test loggers for us. These are
      awkwardly named though, and furthermore we have a global variable which
      contains a reference to one of the functions which creates the logger
      for us, which is bad design. Furthermore, this global variable isn't
      modified by anything anymore, making it essentially the same as the
      function it's aliasing.
      
      Rename functions to match Go best practices and remove the global
      variable.
      445d80ef
  8. Oct 21, 2021
    • Return replica path from GetConsistentStorages · e5891071
      Sami Hiltunen authored
      Praefect will start generating unique relative paths for repositories in
      14.6 to prevent stale state conflicting with recreated repositories. To
      facilitate this, the 'repositories' table has the 'replica_path' column
      to store where the replicas of a repository are stored. While in 14.5
      Praefect still doesn't generate the paths, it needs to route the requests
      to the paths stored in the database. This allows 14.6 then to start generating
      unique paths and be backwards compatible with 14.5 as it is already using
      the paths from the database.
      
      This commit returns the replica path from GetConsistentStorages. The method
      was chosen mostly as its result is cached for accessors, so integrating the
      replica path into the cache is effortless. This commit only includes the changes
      required to fetch the replica path and cache it. A later commit will update the
      routing logic to actually route the requests to the stored path.
      e5891071
  9. Oct 15, 2021
    • Return the virtual storage parameter to GetPrimary · a07caee5
      Sami Hiltunen authored
      Praefect is now getting the primary of a repository by the repository
      ID. Given that, the virtual storage parameter was no longer needed and
      was removed. There are some tests for RepositoryReplicas in Rails' repo
      which use the legacy election strategies. These legacy election strategies
      still need the virtual storage parameter to determine the primary node.
      In order to keep compatibility with the tests, this commit returns the
      parameter to GetPrimary. While this works for now, we should really configure
      Praefect properly in the tests to avoid being stuck with legacy implementations.
      a07caee5
  10. Oct 07, 2021
    • global: Close gRPC connections · 5981dc52
      Patrick Steinhardt authored
      When creating gRPC connections, we spawn a set of Goroutines which
      listen on these connections. As a result, if they are never closed,
      those Goroutines are leaked.
      
      Fix this by closing connections.
      5981dc52
    • nodes: Fix leaking connections and Goroutines due to missing cleanup · f501a33d
      Patrick Steinhardt authored
      The node manager creates connections to all of its known nodes and
      starts monitoring routines which check the respective nodes' health
      status. We never clean up either of them, which thus leads to lots of
      Goroutine leakages in our tests.
      
      Fix this by providing a new `Stop()` function for the manager which both
      stops the electors' monitoring Goroutines and closes the node
      connections, and by calling this function where required.
      f501a33d
  11. Oct 06, 2021
    • Praefect: proxy sidechannels · 9afed4db
      Jacob Vosmaer authored
      This commit adds backchannel support to the main gRPC listener of
      Praefect. And if clients make gRPC calls with sidechannels, Praefect
      will now proxy these to the Gitaly backend.
      
      Changelog: added
      9afed4db
  12. Jun 11, 2021
  13. May 27, 2021
    • Create module v14 gitaly version · 12e0bf3a
      Pavlo Strokov authored
      The new "v14" version of the Gitaly module is named to match
      the next GitLab release. The module versioning is needed in
      order to pull gitaly as a dependency in other projects. The
      change updates all imports to include v14 version. The go.mod
      file was modified as well after go mod tidy execution. And
      the changes in dependency licenses are reflected in the NOTICE
      file.
      
      Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3177
      12e0bf3a
  14. Apr 29, 2021
    • Refactoring of the NewHealthServerWithListener · 1af6eed9
      Pavlo Strokov authored
      Sometimes a test requires a gRPC server with only
      a health service registered. It doesn't matter whether
      there are any interceptors on it or not.
      It is only required to check that the flow is working
      and the request reaches the service.
      Now it is not required to explicitly stop the server
      as it is terminated automatically at the end
      of the test execution. And as there is no other
      use of the returned server, the function doesn't return
      it anymore.
      1af6eed9
  15. Apr 23, 2021
    • golangci: Enable wastedassign linter · d6b658f9
      Patrick Steinhardt authored
      The wastedassign linter will check whether a newly assigned variable is
      used in any code path after its assignment. This seems like a useful
      check to have, e.g. to not forget checking a reassigned error value.
      
      Enable the linter and fix the single violation it surfaces.
      d6b658f9
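The kind of mistake mentioned above, forgetting to check a reassigned error, can be shown with a short example. `sum` is a hypothetical function written for illustration; the comment marks where a wasted assignment would arise if the second check were omitted. Exactly which patterns the linter reports is not stated in the commit message, so treat this as an approximation.

```go
package main

import (
	"fmt"
	"strconv"
)

// sum parses two decimal strings and adds them.
func sum(a, b string) (int, error) {
	x, err := strconv.Atoi(a)
	if err != nil {
		return 0, err
	}

	y, err := strconv.Atoi(b)
	// Without the check below, the value just assigned to err would never
	// be read again, which is the kind of assignment such a linter targets.
	if err != nil {
		return 0, err
	}

	return x + y, nil
}

func main() {
	fmt.Println(sum("2", "3"))
	fmt.Println(sum("2", "x"))
}
```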
  16. Apr 07, 2021
    • inject backchannel ClientHandshaker from main · d1fca4de
      Sami Hiltunen authored
      This commit injects the multiplexing handshaker from Praefect's main
      to the dialing locations. This allows us to later plug in a backchannel
      server easily. This commit has no changes to the functionality itself.
      d1fca4de
  17. Apr 01, 2021
    • Add GetPrimary to NodeManager · 6f68e2cb
      Sami Hiltunen authored
      This commit implements the praefect.PrimaryGetter interface on NodeManager.
      This allows for NodeManager to be used as a PrimaryGetter in code which has
      been written to support repository specific primaries.
      6f68e2cb
  18. Mar 04, 2021
  19. Jan 12, 2021
  20. Dec 17, 2020
    • Retrieve all consistent storages · 7af9c950
      Pavlo Strokov authored
      On each read/write operation praefect needs to know which
      gitaly node is the primary. For mutator operations it is used to
      define from which node the response will be returned to
      the client. For read operations it is used as the node to redirect
      requests to, or as a fallback option for reads distribution in case it
      is enabled. The default strategy for defining the primary is
      'sql', which means the primary is tracked inside of the Postgres
      database and praefect issues a select statement against it each time
      it needs to determine the current primary. This creates a high load
      on the database when there are too many read operations (the
      outcome of the performance testing).
      
      To resolve this problem we change the logic of retrieval of
      the set of up to date storages to return all storages including
      the primary. With it in place we don't need to know the current
      primary and can use any storage that has the latest generation of the
      repository to serve the requests. As this information is cached
      by the in-memory cache praefect won't create a high load on the
      database anymore.
      
      This change also makes check IsLatestGeneration for the primary
      node redundant as it won't be present in the set of consistent
      storages if its generation is not the latest one.
      
      Fix linting issues
      
      Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3337
      7af9c950
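The selection rule described in this commit, serving from any storage that holds the latest generation, can be approximated with an in-memory sketch. The real lookup is a database query whose result is cached; `consistentStorages` and the storage names below are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// consistentStorages returns the names of all storages whose generation
// equals the highest generation recorded for the repository. The primary is
// not treated specially: it is included only if it is up to date.
func consistentStorages(generations map[string]int) []string {
	first := true
	latest := 0
	for _, g := range generations {
		if first || g > latest {
			latest = g
			first = false
		}
	}

	var storages []string
	for name, g := range generations {
		if g == latest {
			storages = append(storages, name)
		}
	}
	sort.Strings(storages) // stable output for the example
	return storages
}

func main() {
	// Hypothetical state: gitaly-3 is one generation behind.
	generations := map[string]int{"gitaly-1": 3, "gitaly-2": 3, "gitaly-3": 2}
	fmt.Println(consistentStorages(generations))
}
```

Because an outdated primary simply does not appear in the result, a separate latest-generation check for the primary becomes unnecessary, as the commit message notes.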
  21. Dec 14, 2020
    • Removal of unused RepositoryStore dependency · ad18a102
      Pavlo Strokov authored
      As the strategy of interacting with the repository
      changes there is no more need to provide RepositoryStore
      into Mgr as a dependency as it has no usage.
      
      This commit removes RepositoryStore from the list of the
      input parameters of the constructor function and from the
      list of fields of the Mgr struct.
      ad18a102
  22. Dec 10, 2020
    • Pavlo Strokov's avatar
      971d49dc
    • On each read/write operation praefect requires to know which · 09c6d25d
      Pavlo Strokov authored
      gitaly node is the primary. For mutator operations it is used to
      define from which node the response will be returned to
      the client. For read operations it is used as the node to redirect
      requests to, or as a fallback option for reads distribution in case it
      is enabled. The default strategy for defining the primary is
      'sql', which means the primary is tracked inside the Postgres
      database and praefect issues a select statement against it each time
      it needs to determine the current primary. This creates a high load
      on the database when there are too many read operations (the
      outcome of the performance testing).
      
      To resolve this problem we change the logic of retrieving the set of
      up to date storages to return all storages including the primary.
      So now we don't need to know the current primary and can use any
      storage that has the latest generation of the repository to serve
      the requests. As this information is cached by the in-memory
      cache praefect won't create a high load on the database anymore.
      
      This change also makes check IsLatestGeneration for the primary
      node redundant as it won't be present in the set of consistent
      storages if its generation is not the latest one.
      
      Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3337
      09c6d25d
  23. Dec 02, 2020
    • remove usage of MemoryRepositoryStore in tests · 346bda13
      Sami Hiltunen authored
      To prepare for removing the MemoryRepositoryStore, this commit
      removes its uses in tests. Mostly the tests need something that works,
      which is when DisableRepositoryStore is used. When a test is testing
      a particular scenario with the RepositoryStore, a mock is provided
      instead. Ideally we'd use the Postgres implementation in these cases
      but hooking it in requires some additional work as the test setup
      overwrites home directory which breaks the discovery of GDK's Postgres.
      346bda13
  24. Nov 24, 2020
    • Introduction of in-memory cache for reads distribution · 1052c313
      Pavlo Strokov authored
      With the distributed_reads feature enabled, each read operation leads to
      a database query to get the state of the storages for a particular
      repository. More read calls lead to more database access operations,
      so the pressure on the database increases linearly (or even worse).
      To mitigate this problem it was decided to introduce an in-memory cache
      that is consulted before accessing the database. Invalidation happens on
      receiving notification events from the database. The events are sent by the
      triggers attached to the repositories (delete) and storage_repositories
      (insert, delete, update) tables.
      
      To monitor the cache a new counter was added: gitaly_praefect_uptodate_storages_cache_access_total.
      It tracks the number of cache hits, misses, populates and evicts per virtual
      repository. And to track the error rate of notification processing,
      gitaly_praefect_uptodate_storages_errors_total was added with type set to one of:
      retrieve, notification_decode.
      
      Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3053
      1052c313
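A cache of this kind can be sketched with the standard library alone. This is an illustrative approximation, not Praefect's implementation: `cache`, `newCache` and the key format are hypothetical, the database is replaced by a `lookup` callback, notifications are modelled as direct `Invalidate` calls, and a plain map stands in for the Prometheus counter.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a hypothetical read-through cache of up to date storages.
type cache struct {
	mu      sync.Mutex
	entries map[string][]string
	lookup  func(repo string) ([]string, error) // stands in for the database query
	counts  map[string]int                      // hit, miss, populate, evict
}

func newCache(lookup func(repo string) ([]string, error)) *cache {
	return &cache{
		entries: map[string][]string{},
		lookup:  lookup,
		counts:  map[string]int{},
	}
}

// Get serves from memory when possible and falls back to the lookup.
// For simplicity the lock is held during the lookup.
func (c *cache) Get(repo string) ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if storages, ok := c.entries[repo]; ok {
		c.counts["hit"]++
		return storages, nil
	}
	c.counts["miss"]++
	storages, err := c.lookup(repo)
	if err != nil {
		return nil, err
	}
	c.entries[repo] = storages
	c.counts["populate"]++
	return storages, nil
}

// Invalidate drops the entry for a repository, as a change notification
// from the database would.
func (c *cache) Invalidate(repo string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.entries[repo]; ok {
		delete(c.entries, repo)
		c.counts["evict"]++
	}
}

func main() {
	queries := 0
	c := newCache(func(repo string) ([]string, error) {
		queries++
		return []string{"gitaly-1", "gitaly-2"}, nil
	})
	c.Get("default/repo.git")
	c.Get("default/repo.git") // served from memory
	c.Invalidate("default/repo.git")
	c.Get("default/repo.git") // queries the lookup again
	fmt.Println(queries, c.counts)
}
```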
  25. Nov 18, 2020
    • nodes: Add context parameter to `GetShard()` · 0c9a7cf0
      Patrick Steinhardt authored
      The `nodes.Manager` interface's method `GetShard()` doesn't currently
      receive a context as parameter, even though some of its implementations
      perform non-trivial and potentially blocking work like querying the
      database. This commit thus adds this parameter and adjusts all current
      callsites.
      0c9a7cf0
  26. Nov 17, 2020
  27. Nov 16, 2020
    • Introduction of storages provider to get up to date storages · 6da7ee5a
      Pavlo Strokov authored
      It is the next step in including a cached storages provider in order to
      support reads distribution across gitalies. On each invocation it
      queries the passed-in dependency and combines the result with the existing
      primary. The resulting list is used by the manager to decide where a request
      should be routed for processing. In a follow-up MR it will be extended
      with an expiration cache to reduce load on the database, as accessing it on
      each read operation is not efficient.
      
      Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3053
      6da7ee5a
  28. Nov 05, 2020
    • read-only metrics for repository specific primaries · 9770fbc0
      Sami Hiltunen authored
      Adds support for collecting read-only repository metrics taking
      per repository primaries into account. The new functionality is
      behind a boolean flag in order to keep metric collection working for
      virtual storage primaries as well. In-memory implementations have been
      removed as there is no support for per repository primaries in the
      local elector.
      9770fbc0
  29. Aug 31, 2020
  30. Aug 10, 2020
    • Local elector that is used in case failover is disabled · a64e7599
      Pavlo Strokov authored
      Local elector doesn't handle retrieval of the primary node properly,
      as it is not started by the Mgr and there are no health checks
      executed for each node. In such a case each node is considered
      unhealthy as it has 0 successful health checks.
      
      Disabled elector is used in case failover is disabled.
      It returns the first passed-in node as the primary and all the others
      as secondaries. Like the other electors, it also returns `ErrPrimaryNotHealthy`
      in case the primary is not healthy and can't serve the requests.
      
      Health checks are started regardless of the failover setting for any type of
      elector. Each node reports its actual health status via the `IsHealthy` method.
      
      Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3011
      a64e7599
  31. Aug 06, 2020
  32. Aug 04, 2020
    • metric for the number of read-only repositories · f5702245
      Sami Hiltunen authored
      Adds a Prometheus collector for RepositoryStore to expose the number
      of read-only repositories within a virtual storage.
      
      Since the state of the repositories is in the database, the current
      approach leads to redundant work as each Praefect will query the
      database for the number of read-only repositories. This could be
      fixed later by extracting an exporter from the metrics that use the
      database as their source of truth.
      
      Alternatively, synchronization between Praefect nodes could be added
      in order to coordinate which Praefect should perform the potentially
      expensive queries for metrics.
      f5702245
  33. Jul 31, 2020
  34. Jul 30, 2020
  35. Jul 27, 2020
  36. Jul 22, 2020
  37. Jun 09, 2020