feat(replica): support querying replica status via RESTful API #2377
Merged
empiredan merged 7 commits into apache:master on Mar 18, 2026
acelyc111-bot left a comment
Review: RESTful API for querying replica status
Summary: Adds a new replica/status?app_id=X&partition_index=Y HTTP endpoint that returns the current status of a replica. Also includes significant code modernization across replica_stub.
What's good:
- Clean HTTP handler with proper input validation (app_id >= 0, partition_index >= 0, valid int parsing)
- Nice http_response helper methods (as_bad_request, as_missing_query_arg, as_ok_json), which make handlers much cleaner
- Solid modernization: replica_life_cycle → enum class, structured bindings in loops, const correctness, _is_running → std::atomic_bool, [[nodiscard]]
- Good separation: unlocked internal method (get_replica_life_cycle_unlocked) + public locked wrapper
- Destructor properly deregisters the new endpoint
Minor notes:
- get_replica_status returns a std::string_view pointing to a static array, which is safe and efficient; good pattern
- One include, <nlohmann/detail/json_ref.hpp> in replica_http_service.cpp, seems unnecessary (only json.hpp and json_fwd.hpp are needed); it might be an IDE auto-include
Verdict: ✅ Approve — Clean feature addition with bonus modernization. Ready to merge.
foreverneverer approved these changes on Mar 17, 2026
GehaFearless approved these changes on Mar 18, 2026
Sometimes we need to know the current status of a replica. For example,
during offline partition split, after new partitions are generated locally,
we need to start the replica server to load the new partitions. Only after
confirming that all partition data has been successfully loaded can we
rebuild the metadata and recover the Pegasus cluster. However, there is
currently no reliable way to verify that all partition data has finished loading.
There are two possible approaches:
1. Check the replica server logs.
   For example, if we find "load replica successfully", we assume the partition
   has been loaded successfully; if we find "load replica failed", we assume the
   loading failed.
   However, the problem is that log files are automatically cleaned up once their
   size or count exceeds certain thresholds. When there are a large number of
   partitions, the relevant logs might already be removed before we even start
   checking whether the partitions were loaded successfully.
2. Wait for a fixed period of time.
   This approach is also impractical because we do not know when a partition
   starts loading or how long it will take to load. At the same time, we cannot wait
   indefinitely.
If we could directly know the current status of a replica — such as whether it is
still loading or already serving — this problem would be much easier to solve.
Therefore, this PR introduces a RESTful API to query the current status of a
replica.
Since the HTTP service is started before partition data loading begins, it is
possible to query the replica status from the replica server while partitions are
being loaded.
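Because the endpoint is reachable while partitions are loading, a client can poll it instead of scanning logs or sleeping for a fixed time. The following is a minimal C++ sketch of such a polling loop, not code from this PR. It rests on several assumptions: fetch_status is a hypothetical callback standing in for the HTTP GET and the JSON parsing (neither is shown), wait_until_loaded and wait_result are illustrative names, and treating any status other than LOADING or SERVING as unexpected is a choice made for this example rather than behavior defined by the PR.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Hypothetical callback: performs the HTTP GET against the replica server and
// returns the status name extracted from the JSON response.
using status_fetcher = std::function<std::string()>;

// Illustrative outcome of a polling run (not part of the PR).
enum class wait_result { kLoaded, kUnexpected, kTimedOut };

// Polls until the replica reports SERVING, reports a status other than
// LOADING, or the attempt budget is exhausted.
inline wait_result wait_until_loaded(const status_fetcher &fetch_status,
                                     int max_attempts,
                                     std::chrono::milliseconds interval)
{
    assert(max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        const std::string status = fetch_status();
        if (status == "SERVING") {
            return wait_result::kLoaded;
        }
        if (status != "LOADING") {
            // Assumption of this sketch: anything else needs manual inspection.
            return wait_result::kUnexpected;
        }
        std::this_thread::sleep_for(interval);
    }
    return wait_result::kTimedOut;
}
```

Bounding the number of attempts keeps the caller from waiting indefinitely, which is the weakness of the fixed-wait approach described above.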
An example usage of the RESTful API:
If the partition is currently loading, the replica server will return the following
response in JSON format:
The currently supported statuses include:
- LOADING: the replica is being loaded;
- NOT_FOUND: the replica does not exist;
- CREATING: the replica is being created;
- SERVING: the replica is serving;
- CLOSING: the replica is being closed;
- CLOSED: the replica has been closed;
- UNKNOWN: the replica is in an unknown status.
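The review comment mentions that the status name is returned as a std::string_view pointing into a static array. A minimal sketch of that general pattern is shown below. It is an illustration only: the enum life_cycle, its enumerator names and order, the array kStatusNames, and the function status_name are all hypothetical and are not taken from the PR's code; only the seven status strings come from the list above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical enum for illustration; the real enumerators and their order
// in the PR may differ.
enum class life_cycle {
    kLoading,
    kNotFound,
    kCreating,
    kServing,
    kClosing,
    kClosed,
    kUnknown, // kept last so it can be used to check the table size
};

// Static table of names, indexed by the enum's underlying value. String
// literals have static storage duration, so the views never dangle.
constexpr std::array<std::string_view, 7> kStatusNames = {
    "LOADING", "NOT_FOUND", "CREATING", "SERVING", "CLOSING", "CLOSED", "UNKNOWN"};

static_assert(kStatusNames.size() == static_cast<std::size_t>(life_cycle::kUnknown) + 1,
              "every enumerator needs exactly one name");

// Returns a view into the static table; no allocation or copy is made.
inline std::string_view status_name(life_cycle state)
{
    const auto index = static_cast<std::size_t>(state);
    assert(index < kStatusNames.size());
    return kStatusNames[index];
}
```

The static_assert ties the table size to the enum, so adding an enumerator before kUnknown without adding its name fails at compile time.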