HBASE-30142 Resolve NPE while running recoverUnknown command because null RegionLocation #8192

Open
Umeshkumar9414 wants to merge 2 commits into apache:master from Umeshkumar9414:HBASE-30142
@Umeshkumar9414 (Contributor) commented May 5, 2026

MasterRpcServices.scheduleSCPsForUnknownServers() has no null guard before calling isServerUnknown(). With the old code, regions with null serverName (OFFLINE/FAILED_OPEN in-between state) would spuriously trigger a ServerCrashProcedure. The fix prevents that.
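To make the null-guard point concrete, here is a minimal sketch of the predicate before and after the change. It is a simplified stand-in, not the real `ServerManager`: the class `UnknownServerCheck`, the use of plain strings for server names, and the two in-memory sets are assumptions for illustration, and the "before" variant is only an assumed reading of the old behavior described above.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-in for the master's server bookkeeping; NOT the real ServerManager.
final class UnknownServerCheck {
    private final Set<String> onlineServers = new HashSet<>();
    private final Set<String> deadServers = new HashSet<>();

    void addOnline(String serverName) { onlineServers.add(serverName); }
    void addDead(String serverName) { deadServers.add(serverName); }

    // Assumed pre-fix behavior: a null name is classified as "unknown".
    boolean isServerUnknownBeforeFix(String serverName) {
        return serverName == null
            || (!onlineServers.contains(serverName) && !deadServers.contains(serverName));
    }

    // Post-fix behavior as described in this PR: a null name is never "unknown".
    boolean isServerUnknown(String serverName) {
        return serverName != null
            && !onlineServers.contains(serverName)
            && !deadServers.contains(serverName);
    }

    public static void main(String[] args) {
        UnknownServerCheck servers = new UnknownServerCheck();
        servers.addOnline("rs1.example.org,16020,1700000000000");
        System.out.println("before fix, null -> " + servers.isServerUnknownBeforeFix(null));
        System.out.println("after fix,  null -> " + servers.isServerUnknown(null));
    }
}
```

Putting the guard inside the predicate rather than at each call site means every caller gets the same treatment of null region locations.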

@wchevreuil (Contributor) left a comment

Makes sense. Should we add UT for this?

@Umeshkumar9414 (Contributor, Author) commented

> Makes sense. Should we add UT for this?

@wchevreuil Added a test. Let me know if it looks good.

Copilot AI left a comment

Pull request overview

Fixes HBASE-30142 by ensuring recoverUnknown/scheduleSCPsForUnknownServers does not treat regions with a null ServerName (i.e., null region location during transitions/failed-open handling) as “unknown servers,” avoiding spurious SCP scheduling and potential NPEs.

Changes:

  • Update ServerManager#isServerUnknown to return false for null server names.
  • Add a new master-level mini-cluster test that drives a region into ABNORMALLY_CLOSED with a null location and verifies scheduleSCPsForUnknownServers schedules no SCPs.
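The scenario the new test covers can be sketched without a mini-cluster. The model below is hypothetical: `RecoverUnknownSketch`, `RegionNode`, and `serversNeedingScp` are illustrative names, not HBase APIs, and server names are plain strings. It only shows the shape of the check: a region in ABNORMALLY_CLOSED with a null location contributes no server to the SCP list, while a region pointing at a server the master does not know still does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the recoverUnknown scan; names here are not HBase APIs.
final class RecoverUnknownSketch {
    // Stand-in for a region state node: region name, state, and a nullable location.
    static final class RegionNode {
        final String regionName;
        final String state;
        final String location; // null while the region has no assigned server

        RegionNode(String regionName, String state, String location) {
            this.regionName = regionName;
            this.state = state;
            this.location = location;
        }
    }

    // Null-safe classification: a missing location is not an unknown server.
    static boolean isServerUnknown(String serverName, Set<String> knownServers) {
        return serverName != null && !knownServers.contains(serverName);
    }

    // Collects each distinct unknown server once, in first-seen order.
    static List<String> serversNeedingScp(List<RegionNode> regions, Set<String> knownServers) {
        Set<String> unknown = new LinkedHashSet<>();
        for (RegionNode region : regions) {
            if (isServerUnknown(region.location, knownServers)) {
                unknown.add(region.location);
            }
        }
        return new ArrayList<>(unknown);
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("rs1"));
        List<RegionNode> regions = Arrays.asList(
            new RegionNode("r1", "OPEN", "rs1"),
            new RegionNode("r2", "ABNORMALLY_CLOSED", null),
            new RegionNode("r3", "OPEN", "rs-stale"));
        System.out.println(serversNeedingScp(regions, known));
    }
}
```

In this model only `rs-stale` is reported; the region with the null location is skipped rather than passed on as a server to recover.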

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java | Adjusts “unknown server” classification to ignore null server names and updates Javadoc. |
| hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestRecoverUnknownWithNullRegionLocation.java | Adds regression coverage for recoverUnknown when a region’s location is null but state is non-OFFLINE. |

Comment on lines +983 to +986
* any more, for example, a very old previous instance). A null serverName should not be
* considered unknown. We set regionLocation null before finding the assign candidate (in-between
* region transition) or while marking it OFFLINE/FAILED_OPEN For more info regarding null-check
* in this refer HBASE-30142
Comment on lines +101 to +107
assertTrue(node.isInState(RegionState.State.ABNORMALLY_CLOSED),
"regionClosedAbnormally must move state to ABNORMALLY_CLOSED");
assertNull(node.getRegionLocation(), "regionClosedAbnormally must null out the location");

MasterProtos.ScheduleSCPsForUnknownServersResponse response =
master.getMasterRpcServices().scheduleSCPsForUnknownServers(null,
MasterProtos.ScheduleSCPsForUnknownServersRequest.newBuilder().build());
Contributor

This is a good suggestion.

5 participants