This project is mirrored from https://gitlab.com/gitlab-org/gitaly.git.
- Jan 11, 2023
-
-
Will Chandler authored
gRPC clients use an exponential backoff strategy[0] for re-establishing connections, meaning that the longer a connection has been in a bad state the greater the delay before the client will make its next connection attempt. This is useful in scenarios where a very large number of clients could trigger a thundering herd effect on a server as it returns to service. In a Gitaly Cluster, this means that in cases where a Gitaly node is down for some time and a large connection backoff has been set, Praefect may wait to try to connect for up to 120 seconds. This causes Gitaly nodes to remain unavailable longer than necessary. The issues addressed by gRPC's default exponential backoff behavior do not apply in this scenario as we will always have a small number of clients (Praefect nodes), and the volume of traffic from healthchecks is dwarfed by normal production load. To resolve this, set the maximum backoff delay to one second. [0] https://github.com/grpc/grpc/blob/v1.51.0/doc/connection-backoff.md Changelog: fixed
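To illustrate the effect of the cap, here is a minimal sketch of capped exponential backoff using only the standard library. It is not the gRPC implementation: the function names are hypothetical, jitter is left out so the output is deterministic, and the 1 second base delay and 1.6 multiplier are assumed from the connection backoff document linked above.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// backoffDelay returns the delay before the given retry attempt (0-based).
// The delay grows exponentially from base and is capped at maxDelay.
// Jitter is deliberately omitted in this sketch.
func backoffDelay(attempt int, base, maxDelay time.Duration, multiplier float64) time.Duration {
	d := float64(base) * math.Pow(multiplier, float64(attempt))
	if d > float64(maxDelay) {
		return maxDelay
	}
	return time.Duration(d)
}

// delaySeconds wraps backoffDelay with an assumed 1s base delay and
// 1.6 multiplier, and reports the result in seconds.
func delaySeconds(attempt int, maxSeconds float64) float64 {
	maxDelay := time.Duration(maxSeconds * float64(time.Second))
	return backoffDelay(attempt, time.Second, maxDelay, 1.6).Seconds()
}

func main() {
	// Compare a 120 second cap with the one second cap chosen in this commit.
	for _, attempt := range []int{0, 3, 6, 12} {
		fmt.Printf("attempt %2d: 120s cap -> %6.1fs, 1s cap -> %.1fs\n",
			attempt, delaySeconds(attempt, 120), delaySeconds(attempt, 1))
	}
}
```

With the one second cap, the delay no longer grows with the length of the outage, so a recovered node is retried within about a second.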
-
- Oct 14, 2022
-
-
Patrick Steinhardt authored
A few weeks ago we decided that `testing.TB` arguments should be first, `context.Context` second. This is a matter of personal taste rather than correctness, but we should strive for consistency. Reorder all arguments to match the agreed-upon style.
-
- Jul 18, 2022
-
-
Toon Claes authored
We're about to add the ability to test with SHA256 hashes. We assume none of the tests work with this object format. With this change we add the build constraint to not run any test when the tag 'gitaly_test_sha256' is set.
-
- Jun 21, 2022
-
-
Sami Hiltunen authored
gRPC has deprecated the WithInsecure option in favor of the more general WithTransportCredentials combined with insecure.NewCredentials. This commit removes our usage of the deprecated option so we don't get lint failures for using deprecated functionality when upgrading our gRPC dependency.
-
- May 20, 2022
-
-
John Cai authored
This commit changes the major version in the package name from v14 to v15. Updating go.mod & go.sum with new module name v15. Update Makefile to bump major version to v15: Update the gitaly package name in the Makefile. Also update gitaly-git2go-v14 -> gitaly-git2go-v15. We need to keep gitaly-git2go-v14 for a release however, for zero downtime upgrades. This pulls directly from a sha that is v14. Update package name from v14->v15 for auth, client, cmd, internal packages: This commit changes the package name from v14 to v15 in go and proto files in the internal, auth, client, cmd packages. proto: Update major package number in package name. tools: Change major version number in package name from v14 to v15. gitaly-git2go: Change the package name from v14 to v15. update module updater for v15: Update the documentation for the module updater to reflect v15.
-
- Jan 27, 2022
-
-
Sami Hiltunen authored
testhelper.Context() currently returns a cancellation function as a second return value. The majority of the tests do not need to explicitly cancel the context but simply defer its cancellation to clean up after the test. Given this, we can reduce the test verbosity and make testhelper.Context easier to compose by removing the unnecessary second return value. This adds a t.Cleanup function to automatically cancel the context at the end of the tests and omits the returned cancellation function. Tests which simply `defer cancel()` have had the extra call removed. Some tests explicitly call the cancellation, and these tests have been modified to add context.WithCancel around the testhelper.Context call. There are a few locations where testing.TB was passed down to test helpers that create the context.
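The pattern described here can be sketched as follows. This is an illustration under assumptions, not the actual testhelper code: `cleanupRegistrar` is a hypothetical interface covering the one `testing.TB` method the helper needs (both `*testing.T` and `*testing.B` satisfy it), and `fakeTB` exists only so the example runs outside `go test`.

```go
package main

import (
	"context"
	"fmt"
)

// cleanupRegistrar is the subset of testing.TB this sketch relies on.
type cleanupRegistrar interface {
	Cleanup(func())
}

// Context returns a context that is cancelled when the test finishes,
// so callers no longer receive or defer a cancellation function.
func Context(tb cleanupRegistrar) context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	tb.Cleanup(cancel)
	return ctx
}

// fakeTB records cleanup functions so the sketch can run in main.
type fakeTB struct{ cleanups []func() }

func (f *fakeTB) Cleanup(fn func()) { f.cleanups = append(f.cleanups, fn) }

// finish runs the cleanups in last-in, first-out order, mirroring the
// order the testing package uses at the end of a test.
func (f *fakeTB) finish() {
	for i := len(f.cleanups) - 1; i >= 0; i-- {
		f.cleanups[i]()
	}
}

func main() {
	tb := &fakeTB{}
	ctx := Context(tb)
	fmt.Println("before cleanup:", ctx.Err())
	tb.finish()
	fmt.Println("after cleanup:", ctx.Err())
}
```

A test that needs to cancel early can still wrap the result, for example `ctx, cancel := context.WithCancel(Context(t))`.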
-
- Dec 01, 2021
-
-
Patrick Steinhardt authored
The testhelper has functions to create test loggers for us. These are awkwardly named though, and furthermore we have a global variable which contains a reference to one of the functions which creates the logger for us, which is bad design. Furthermore, this global variable isn't modified by anything anymore, making it essentially the same as the function it's aliasing. Rename functions to match Go best practices and remove the global variable.
-
- Oct 21, 2021
-
-
Sami Hiltunen authored
Praefect will start generating unique relative paths for repositories in 14.6 to prevent stale state conflicting with recreated repositories. To facilitate this, the 'repositories' table has the 'replica_path' column to store where the replicas of a repository are stored. While in 14.5 Praefect still doesn't generate the paths, it needs to route the requests to the paths stored in the database. This allows 14.6 then to start generating unique paths and be backwards compatible with 14.5 as it is already using the paths from the database. This commit returns the replica path from GetConsistentStorages. The method was chosen mostly as its result is cached for accessors, so integrating the replica path into the cache is effortless. This commit only includes the changes required to fetch the replica path and cache it. A later commit will update the routing logic to actually route the requests to the stored path.
-
- Oct 15, 2021
-
-
Sami Hiltunen authored
Praefect is now getting the primary of a repository by the repository ID. Given that, the virtual storage parameter was no longer needed and was removed. There are some tests for RepositoryReplicas in Rails' repo which use the legacy election strategies. These legacy election strategies still need the virtual storage parameter to determine the primary node. In order to keep compatibility with the tests, this commit returns the parameter to GetPrimary. While this works for now, we should really configure Praefect properly in the tests to avoid being stuck with legacy implementations.
-
- Oct 07, 2021
-
-
Patrick Steinhardt authored
When creating gRPC connections, we spawn a set of Goroutines which listen on these connections. As a result, if the connections are never closed, those Goroutines are leaked. Fix this by closing connections.
-
Patrick Steinhardt authored
The node manager creates connections to all of its known nodes and starts monitoring routines which check the respective nodes' health status. We never clean up either of them, which thus leads to lots of Goroutine leakages in our tests. Fix this by providing a new `Stop()` function for the manager which both stops the electors' monitoring Goroutines and closes the node connections, and by calling this function as required.
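A shutdown routine of this kind might look roughly like the sketch below. All types here are hypothetical stand-ins, not the real node manager: `fakeConn` replaces a gRPC client connection, and the monitoring loop only waits on a ticker where a real one would issue health checks.

```go
package main

import (
	"fmt"
	"io"
	"sync"
	"time"
)

// fakeConn stands in for a client connection; only Close matters here.
type fakeConn struct {
	mu     sync.Mutex
	closed bool
}

func (c *fakeConn) Close() error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.closed = true
	return nil
}

func (c *fakeConn) isClosed() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.closed
}

// manager owns one monitoring goroutine per connection.
type manager struct {
	conns    []io.Closer
	done     chan struct{}
	wg       sync.WaitGroup
	stopOnce sync.Once
}

func newManager(conns []io.Closer, interval time.Duration) *manager {
	m := &manager{conns: conns, done: make(chan struct{})}
	for range conns {
		m.wg.Add(1)
		go m.monitor(interval)
	}
	return m
}

// monitor loops until Stop closes the done channel.
func (m *manager) monitor(interval time.Duration) {
	defer m.wg.Done()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			// A real implementation would run a health check here.
		case <-m.done:
			return
		}
	}
}

// Stop signals the monitoring goroutines to exit, waits for them, and
// then closes every connection. Calling it more than once is safe.
func (m *manager) Stop() {
	m.stopOnce.Do(func() {
		close(m.done)
		m.wg.Wait()
		for _, c := range m.conns {
			_ = c.Close()
		}
	})
}

// newDemoManager builds a manager over two fake connections.
func newDemoManager() (*manager, []*fakeConn) {
	conns := []*fakeConn{{}, {}}
	closers := make([]io.Closer, len(conns))
	for i, c := range conns {
		closers[i] = c
	}
	return newManager(closers, time.Millisecond), conns
}

func main() {
	m, conns := newDemoManager()
	m.Stop()
	for i, c := range conns {
		fmt.Printf("conn %d closed: %v\n", i, c.isClosed())
	}
}
```

In a test, registering the call with `t.Cleanup(m.Stop)` is one way to make sure neither the goroutines nor the connections outlive the test.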
-
- Oct 06, 2021
-
-
Jacob Vosmaer authored
This commit adds backchannel support to the main gRPC listener of Praefect. And if clients make gRPC calls with sidechannels, Praefect will now proxy these to the Gitaly backend. Changelog: added
-
- Jun 11, 2021
-
-
Mikhail Mazurskiy authored
Changelog: changed
-
- May 27, 2021
-
-
Pavlo Strokov authored
The new "v14" version of the Gitaly module is named to match the next GitLab release. The module versioning is needed in order to pull gitaly as a dependency in other projects. The change updates all imports to include v14 version. The go.mod file was modified as well after go mod tidy execution. And the changes in dependency licenses are reflected in the NOTICE file. Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3177
-
- Apr 29, 2021
-
-
Pavlo Strokov authored
Sometimes a test requires a gRPC server with only a health service registered. It doesn't matter whether there are any interceptors on it or not. It is only required to check that the flow is working and the request reaches the service. Now it is not required to explicitly stop the server as it is terminated automatically at the end of the test execution. And as there is no other use of the returned server, the function doesn't return it anymore.
-
- Apr 23, 2021
-
-
Patrick Steinhardt authored
The wastedassign linter will check whether a newly assigned variable is used in any code path after its assignment. This seems like a useful check to have, e.g. to not forget checking a reassigned error value. Enable the linter and fix the single violation it surfaces.
-
- Apr 07, 2021
-
-
Sami Hiltunen authored
This commit injects the multiplexing handshaker from Praefect's main to the dialing locations. This allows us to later plug in a backchannel server easily. This commit has no changes to the functionality itself.
-
- Apr 01, 2021
-
-
Sami Hiltunen authored
This commit implements the praefect.PrimaryGetter interface on NodeManager. This allows for NodeManager to be used as a PrimaryGetter in code which has been written to support repository specific primaries.
-
- Mar 04, 2021
-
-
Sami Hiltunen authored
This commit removes DirectStorageProvider as it is just a proxy to call GetConsistentStorages on the RepositoryStore.
-
Sami Hiltunen authored
This commit renames StoragesProvider to ConsistentStoragesGetter to better align with Go's single method interface naming conventions and the original name of the wrapped method in RepositoryStore.
-
- Jan 12, 2021
-
-
Patrick Steinhardt authored
The `testhelper.GetTemporaryGitalySocketFileName()` function currently returns an error. Given that all callers use `require.NoError()` on that error, let's convert it to instead receive a `testing.TB` and not return an error.
-
- Dec 17, 2020
-
-
Pavlo Strokov authored
On each read/write operation praefect needs to know which gitaly node is the primary. For mutator operations it is used to determine which node's response is returned to the client. For read operations it is the node the request is redirected to, or the fallback option for reads distribution when that is enabled. The default strategy for determining the primary is 'sql', which means the primary is tracked in the Postgres database and praefect issues a select statement against it each time it needs to determine the current primary. This creates a high load on the database when there are too many read operations (the outcome of the performance testing). To resolve this problem we change the logic that retrieves the set of up-to-date storages so that it returns all storages, including the primary. With that in place we no longer need to know the current primary and can use any storage that has the latest generation of the repository to serve the requests. As this information is cached by the in-memory cache, praefect won't create a high load on the database anymore. This change also makes the IsLatestGeneration check for the primary node redundant, as the primary won't be present in the set of consistent storages if its generation is not the latest one. Fix linting issues. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3337
-
- Dec 14, 2020
-
-
Pavlo Strokov authored
As the strategy of interacting with the repository changes there is no more need to provide RepositoryStore into Mgr as a dependency as it has no usage. This commit removes RepositoryStore from the list of the input parameters of the constructor function and from the list of fields of the Mgr struct.
-
- Dec 10, 2020
-
-
Pavlo Strokov authored
This reverts commit 09c6d25d
-
Pavlo Strokov authored
On each read/write operation praefect needs to know which gitaly node is the primary. For mutator operations it is used to determine which node's response is returned to the client. For read operations it is the node the request is redirected to, or the fallback option for reads distribution when that is enabled. The default strategy for determining the primary is 'sql', which means the primary is tracked in the Postgres database and praefect issues a select statement against it each time it needs to determine the current primary. This creates a high load on the database when there are too many read operations (the outcome of the performance testing). To resolve this problem we change the logic that retrieves the set of up-to-date storages so that it returns all storages, including the primary. So now we no longer need to know the current primary and can use any storage that has the latest generation of the repository to serve the requests. As this information is cached by the in-memory cache, praefect won't create a high load on the database anymore. This change also makes the IsLatestGeneration check for the primary node redundant, as the primary won't be present in the set of consistent storages if its generation is not the latest one. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3337
-
- Dec 02, 2020
-
-
Sami Hiltunen authored
To prepare for removing the MemoryRepositoryStore, this commit removes its uses in tests. Mostly the tests need something that works, which is when DisableRepositoryStore is used. When a test is testing a particular scenario with the RepositoryStore, a mock is provided instead. Ideally we'd use the Postgres implementation in these cases but hooking it in requires some additional work as the test setup overwrites home directory which breaks the discovery of GDK's Postgres.
-
- Nov 24, 2020
-
-
Pavlo Strokov authored
With the distributed_reads feature enabled, each read operation leads to a database query to get the state of the storages for a particular repository. More read calls lead to more database access operations, so the pressure on the database increases linearly (or even worse). To mitigate this problem it was decided to introduce an in-memory cache that is consulted before accessing the database. Invalidation happens on receiving notification events from the database. The events are sent by the triggers attached to the repositories (delete) and storage_repositories (insert, delete, update) tables. To monitor the cache a new counter was added: gitaly_praefect_uptodate_storages_cache_access_total. It tracks the number of cache hits, misses, populates and evicts per virtual repository. And to track the error rate of notification processing, gitaly_praefect_uptodate_storages_errors_total was added with type set to one of: retrieve, notification_decode. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3053
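The caching scheme can be sketched as a read-through cache that is invalidated by notifications. This is a simplified illustration with hypothetical names: the `lookup` callback stands in for the database query, `Invalidate` stands in for the handler of a database notification, and the plain `stats` map stands in for the Prometheus counter (it is not broken down by label here).

```go
package main

import (
	"fmt"
	"sync"
)

// storagesCache caches the up-to-date storages of each repository.
type storagesCache struct {
	mu      sync.Mutex
	entries map[string][]string
	stats   map[string]int // keys: "hit", "miss", "populate", "evict"
	lookup  func(repo string) []string
}

func newStoragesCache(lookup func(repo string) []string) *storagesCache {
	return &storagesCache{
		entries: map[string][]string{},
		stats:   map[string]int{},
		lookup:  lookup,
	}
}

// Get returns the cached storages, falling back to lookup on a miss.
// The lock is held across the lookup to keep the sketch simple.
func (c *storagesCache) Get(repo string) []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if storages, ok := c.entries[repo]; ok {
		c.stats["hit"]++
		return storages
	}
	c.stats["miss"]++
	storages := c.lookup(repo)
	c.entries[repo] = storages
	c.stats["populate"]++
	return storages
}

// Invalidate drops the entry for a repository, as a notification
// handler would when a row for that repository changes.
func (c *storagesCache) Invalidate(repo string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.entries[repo]; ok {
		delete(c.entries, repo)
		c.stats["evict"]++
	}
}

// Stat reports how often the named event has occurred.
func (c *storagesCache) Stat(name string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.stats[name]
}

func main() {
	queries := 0
	cache := newStoragesCache(func(repo string) []string {
		queries++ // each call here represents one database query
		return []string{"gitaly-1", "gitaly-2"}
	})
	cache.Get("repo-a")
	cache.Get("repo-a")
	cache.Invalidate("repo-a")
	cache.Get("repo-a")
	fmt.Println("database queries:", queries)
	fmt.Println("hits:", cache.Stat("hit"), "misses:", cache.Stat("miss"),
		"evicts:", cache.Stat("evict"))
}
```

Repeated reads of the same repository reach the database only once per invalidation, which is the load reduction the commit is after.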
-
- Nov 18, 2020
-
-
Patrick Steinhardt authored
The `nodes.Manager` interface's method `GetShard()` doesn't currently receive a context as parameter, even though some of its implementations perform non-trivial and potentially blocking work like querying the database. This commit thus adds this parameter and adjusts all current callsites.
-
- Nov 17, 2020
-
-
Sami Hiltunen authored
Praefect currently does not validate election strategy configuration. This commit adds validation for the config option.
-
- Nov 16, 2020
-
-
Pavlo Strokov authored
This is the next step in including the cached storages provider in order to support reads distribution across gitalies. On each invocation it queries the passed-in dependency and combines the result with the existing primary. The resulting list is used by the manager to decide where a request should be routed for processing. In a follow-up MR it will be extended with an expiration cache to reduce load on the database, as accessing it on each read operation is not efficient. Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3053
-
- Nov 05, 2020
-
-
Sami Hiltunen authored
Adds support for collecting read-only repository metrics taking per repository primaries into account. The new functionality is behind a boolean flag in order to keep metric collection working for virtual storage primaries as well. In-memory implementations have been removed as there is no support for per repository primaries in the local elector.
-
- Aug 31, 2020
-
-
John Cai authored
Hooks up the error tracker in the node manager so it checks if a certain backend node has reached a threshold of errors. If it has, then it will be deemed unhealthy.
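One way such a threshold check could work is sketched below. The commit message does not say how errors are counted, so the sliding time window, the type names, and the injected clock are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fakeClock lets the sketch control time without sleeping.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time { return c.t }

func (c *fakeClock) AdvanceSeconds(n int) {
	c.t = c.t.Add(time.Duration(n) * time.Second)
}

// errorTracker counts recent errors per node within a sliding window.
type errorTracker struct {
	mu        sync.Mutex
	window    time.Duration
	threshold int
	errors    map[string][]time.Time
	now       func() time.Time
}

func newErrorTracker(windowSeconds, threshold int, now func() time.Time) *errorTracker {
	return &errorTracker{
		window:    time.Duration(windowSeconds) * time.Second,
		threshold: threshold,
		errors:    map[string][]time.Time{},
		now:       now,
	}
}

// prune drops errors older than the window. The caller must hold mu.
func (t *errorTracker) prune(node string) []time.Time {
	cutoff := t.now().Add(-t.window)
	kept := t.errors[node][:0]
	for _, ts := range t.errors[node] {
		if ts.After(cutoff) {
			kept = append(kept, ts)
		}
	}
	t.errors[node] = kept
	return kept
}

// RecordError notes that a request to the node failed just now.
func (t *errorTracker) RecordError(node string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.errors[node] = append(t.prune(node), t.now())
}

// ThresholdReached reports whether the node has accumulated enough
// recent errors to be deemed unhealthy.
func (t *errorTracker) ThresholdReached(node string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return len(t.prune(node)) >= t.threshold
}

func main() {
	clock := &fakeClock{}
	tracker := newErrorTracker(60, 2, clock.Now)
	tracker.RecordError("gitaly-1")
	tracker.RecordError("gitaly-1")
	fmt.Println("unhealthy now:", tracker.ThresholdReached("gitaly-1"))
	clock.AdvanceSeconds(61)
	fmt.Println("unhealthy later:", tracker.ThresholdReached("gitaly-1"))
}
```

A node manager could consult `ThresholdReached` alongside its regular health checks when deciding whether a node is healthy.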
-
- Aug 10, 2020
-
-
Pavlo Strokov authored
Local elector doesn't handle retrieval of the primary node properly, as it is not started by the Mgr and there are no health checks executed for each node. In such a case each node is considered unhealthy as it has 0 successful health checks. Disabled elector is used in case failover is disabled. It returns the first passed-in node as the primary and all the others as secondaries. Like the other electors, it also returns `ErrPrimaryNotHealthy` in case the primary is not healthy and can't serve the requests. The health check requests start regardless of failover for any type of elector. Each node shows its actual health status with the `IsHealthy` method. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3011
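The behaviour of the disabled elector can be sketched as below. Only `ErrPrimaryNotHealthy` and `IsHealthy` are names taken from the commit message; the `node`, `shard`, and `disabledElector` types and the `GetShard` signature are hypothetical, and health is a plain boolean here rather than the result of real health checks.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPrimaryNotHealthy is returned when the primary cannot serve requests.
var ErrPrimaryNotHealthy = errors.New("primary is not healthy")

// node is a stand-in for a Gitaly node with a known health status.
type node struct {
	name    string
	healthy bool
}

func (n *node) IsHealthy() bool { return n.healthy }

// shard groups the primary with its secondaries.
type shard struct {
	Primary     *node
	Secondaries []*node
}

// disabledElector performs no election: the first configured node is
// always the primary and the remaining nodes are secondaries.
type disabledElector struct {
	nodes []*node
}

func (e *disabledElector) GetShard() (shard, error) {
	if len(e.nodes) == 0 {
		return shard{}, errors.New("no nodes configured")
	}
	primary := e.nodes[0]
	if !primary.IsHealthy() {
		// No failover happens; the caller is told the primary is down.
		return shard{}, ErrPrimaryNotHealthy
	}
	return shard{Primary: primary, Secondaries: e.nodes[1:]}, nil
}

func main() {
	elector := &disabledElector{nodes: []*node{
		{name: "gitaly-1", healthy: true},
		{name: "gitaly-2", healthy: true},
	}}
	s, err := elector.GetShard()
	fmt.Println("primary:", s.Primary.name, "secondaries:", len(s.Secondaries), "err:", err)

	elector.nodes[0].healthy = false
	_, err = elector.GetShard()
	fmt.Println("primary down:", errors.Is(err, ErrPrimaryNotHealthy))
}
```

Even when the second node is healthy, an unhealthy first node yields the error instead of a promotion, which is what distinguishes this elector from the failover-capable ones.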
-
- Aug 06, 2020
-
-
Pavlo Strokov authored
It is decided to go with all features enabled by default behaviour for tests. This should help us identify potential problems with feature flag combinations enabled in production. To verify implementation without feature flag enabled it should be disabled explicitly in the test. Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/2702
-
- Aug 04, 2020
-
-
Sami Hiltunen authored
Adds a Prometheus collector for RepositoryStore to expose the number of read-only repositories within a virtual storage. Since the state of the repositories is in the database, the current approach leads to redundant work as each Praefect will query the database for the number of read-only repositories. This could be fixed later by extracting an exporter from the metrics that use the database as their source of truth. Alternatively, synchronization between Praefect nodes could be added in order to coordinate which Praefect should perform the potentially expensive queries for metrics.
-
- Jul 31, 2020
-
-
Pavlo Strokov authored
Retrieve up-to-date storages that can serve read operations for the repository in order to distribute reads across all healthy storages of the virtual storage. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/2944
-
- Jul 30, 2020
-
-
Sami Hiltunen authored
With the introduction of per repository read-only mode, this commit removes the previous virtual storage wide read-only mode implementation.
-
- Jul 27, 2020
-
-
Sami Hiltunen authored
Fixes Praefect not starting successfully when dialing any of the configured Gitaly nodes fails.
-
- Jul 22, 2020
-
-
Paul Okstad authored
-
- Jun 09, 2020
-
-
Pavlo Strokov authored
Primary node should also be considered as a candidate for reads distribution as with introduction of transactions all the nodes will have the same load. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/2834
-