Skip to content

New Vector_sort function, replacing insertion and quick sort#1798

Open
Explorer09 wants to merge 2 commits into
htop-dev:mainfrom
Explorer09:vector-sort
Open

New Vector_sort function, replacing insertion and quick sort#1798
Explorer09 wants to merge 2 commits into
htop-dev:mainfrom
Explorer09:vector-sort

Conversation

@Explorer09

@Explorer09 Explorer09 commented Oct 31, 2025

Copy link
Copy Markdown
Contributor

Introduce the new Vector_sort() function and obsolete the old Vector_quickSortCustomCompare() and Vector_insertionSort() APIs.

This new sort function is a natural, in-place merge sort. I.e. it takes advantage of partially sorted data, and it's stable.

Space complexity: O(log(n)) worst case
Time complexity: O(n) best case, O(n*log(n)*log(n)) average & worst case

@BenBE , I have copied some of the changes from your d3cd557 commit but I didn't base on it. Because there are some data type naming that I didn't like, and so I name my "context" structure in my own way. Also to keep changes small I didn't upgrade the Object_Compare function type to take a new "context" parameter.

@Explorer09 Explorer09 force-pushed the vector-sort branch 2 times, most recently from 0ec958a to 74c2a2d Compare November 1, 2025 09:29
@Explorer09 Explorer09 force-pushed the vector-sort branch 2 times, most recently from bbd8ecd to aba90b7 Compare November 1, 2025 09:44
@Explorer09 Explorer09 marked this pull request as draft November 1, 2025 12:10
Comment thread Vector.c Outdated
@Explorer09 Explorer09 force-pushed the vector-sort branch 2 times, most recently from 9b974af to f29067d Compare November 1, 2025 17:48
@Explorer09 Explorer09 marked this pull request as ready for review November 1, 2025 18:03
Comment thread Vector.c Outdated
@natoscott

Copy link
Copy Markdown
Member

Has any performance comparison been done that is sufficiently compelling that we'd want to keep all this extra code, instead of just using qsort? Thanks.

@Explorer09

Copy link
Copy Markdown
Contributor Author

@natoscott The qsort will be out for two reasons: (1) It's not stable sort (important when the user would sort the table under multiple keys). (2) It does not take the third argument that can make the sort function reentrant. With qsort we would need global variables to pass in extra states that would affect sorting (thus unsafe in potential multithread use).

Note my current implementation is not fully async-signal-safe though. The swap operation is not atomic, and can ruin the data when two sort function calls contest on the same data. It's expected that the sort function, when called in asynchronous contexts, operates on different data or the caller must guard the data with a mutex.

@natoscott

Copy link
Copy Markdown
Member

Sort stability is about what happens when two items compare equally (which can happen with/without multiple sort keys, so I don't follow your argument there). In htop, we always have the PID we can use as a tie-breaker to guarantee stability. Can you explain how this is insufficient?

Regarding threads and the extra parameter, htop is single threaded and is very likely to remain so. So again this is not a valid technical argument against the use of qsort here.

qsort is highly optimised and widely reviewed. Any custom home-grown sorting code we have in htop simply isn't, to anywhere near the same degree, and appears to be unnecessary code bloat. So unless there is a valid technical reason otherwise (and none have been put forward so far, AFAICT), I think we should switch to qsort and delete all this code.

@Explorer09

Copy link
Copy Markdown
Contributor Author

@natoscott I assume you have read #1784 and #1785 for the discussion I had with @BenBE before.

Basically I would vote for using qsort() (#1784) if there isn't a stability requirement. I suggested that the old insertionSort API is there because there would be some data that need to be stable sorted. It's also for future usefulness (or utility?) for the API.

And no, I don't think tiebreaking by PID is enough. htop 's process table can be sorted with multiple keys, and there will be a need to sort the table multiple times by the user without explicitly specifying the second sort key. There can be also tables where there isn't an ID field for tiebreaking.

And there's a technical thing: The Vector object in htop stores an array of pointers rather than an array of object data themselves. This mean we cannot pass the Object_compare function pointers directly to qsort(), but need to make a wrapper for compare function. That's why the third argument of the compare function is needed (and qsort() will not be the best API for the task). There's qsort_r(), but checking the availability of it in configure is also a pain.

Therefore the most future-proof way is to implement a stable sorting function by ourselves. Note that I've developed this algorithm for about half a month, and it would be sad for me if it's ditched finally.

@natoscott

Copy link
Copy Markdown
Member

I have read those, yes - there's little/no justification there for custom sort implementations. It seemed like there was consensus that qsort would be fine based on Hishams reasons for the original implementation, but then it somehow became an exercise in writing new code with insufficient justification IMO.

| There can be also tables where there isn't an ID field for tiebreaking.

What tables are you referring to here? I don't see them in the code today (something else I've missed?) - if they exist today, do these things require sort stability? Thanks.

The main (only?) data we need to sort is the process section (lower part) of the htop display, of course, where we always have a unique identifier (PID) for every entry. The PID provides stability with any number of secondary sort keys, its not clear what problem you are talking about there. And the user never needs to be involved or know it is being used, again that's a tangent that just doesn't make any sense to me. Just do all the current sort comparisons, including with multiple keys, and only use the PID as a final, stable, unique differentiator when everything else compares as equal. Sure, some code refactoring may be required to ensure the PID is available within comparison routines, but its a solvable problem.

I understand the limitations of the qsort API (these are solvable for htop though). I also understand the problems of creating custom sort implementations. Its relatively straight-forward to sort things, its more difficult to perform well in general and particularly with very large process counts.

| Therefore the most future-proof way [...]

It sounds like you are designing for problems htop does not currently have. Can you explain specifically what future features you foresee that cannot use qsort with a stable identifier tie-breaker? Thanks.

Either way, it will be interesting to see performance numbers comparing performance at both very large and small numbers of processes. If a different algorithm significantly outperforms qsort for sorting processes in htop, we should use it.

@Explorer09

Copy link
Copy Markdown
Contributor Author

The main (only?) data we need to sort is the process section (lower part) of the htop display, of course, where we always have a unique identifier (PID) for every entry. The PID provides stability with any number of secondary sort keys, its not clear what problem you are talking about there. And the user never needs to be involved or know it is being used, again that's a tangent that just doesn't make any sense to me. Just do all the current sort comparisons, including with multiple keys, and only use the PID as a final, stable, unique differentiator when everything else compares as equal. Sure, some code refactoring may be required to ensure the PID is available within comparison routines, but its a solvable problem.

The current process compare functions always sort PID in the ascending order for tiebreaking. No way to make it into descending order. Sorry I just don't like this. (If we were sorting a table in a SQL database, a fallback key for tiebreaking is reasonable. But htop is not a DB.)

I understand the limitations of the qsort API (these are solvable for htop though). I also understand the problems of creating custom sort implementations. Its relatively straight-forward to sort things, its more difficult to perform well in general and particularly with very large process counts.

Global variables, I know. When I proposed that @BenBE doesn't seem to like it. So it's not my decision on this part.

qsort() uses introsort in glibc, and smoothsort in musl.
I'm sure introsort requires O(log(n)) space in the worst case. I didn't understand smoothsort yet and so I can't judge it. For me, the memory space requirement means even the qsort() algorithm can crash htop if the system is very low on memory and there's a very large process list to sort. The space complexity is not better than the "natural, in place merge sort" that I propose here.

So, on the question of how the algorithm performs well, I think we're judging quite subjectively. And it would be better if we have some benchmarks.

(By the way, the old PR is there if you want to test the use of qsort() in htop.)

@Explorer09

Explorer09 commented Nov 6, 2025

Copy link
Copy Markdown
Contributor Author

If a different algorithm significantly outperforms qsort for sorting processes in htop, we should use it.

@natoscott @BenBE

Just a minor note. Sorting algorithms' performance can be compared with multiple kinds of data (use cases), and so it won't be a single metric that could decide which algorithm would win.

What I am saying is there can be uniformly random data as well as partially sorted data that need to be sorted. And for an algorithm like heapsort (as an example) it would work best with random data but relatively poor with data that's half sorted. Quicksort with a naïve pivot selection can easily make it the worst case (O(n²)) for data that's already sorted or reverse-sorted, you know. I'm not criticising either algorithm or qsort() API, just a note that the qsort() default algorithm is not adaptive in most libc implementations, thus not as "universally useful" as you think.

And there is no "perfect" sorting algorithm. Each algorithm comes with a trade-off. Even my algorithm does. If you are looking for block sort family that looks "perfect" enough in the stats, I did realise there is a tradeoff (adaptive sort i.e. O(n) best case time complexity or O(1) space complexity, pick one). So really no algorithm is perfect.

@natoscott

Copy link
Copy Markdown
Member

@Explorer09 I did a little more digging and have found all supported htop platforms have a variant of qsort_r - IIRC you mentioned earlier that its missing somewhere? (even modern Solaris has it, and Windows has qsort_s). Which platforms were you thinking do not have it?

There are differences in the calling conventions between platforms, but they are trivially resolved via wrapper.

| And there is no "perfect" sorting algorithm.

From a htop POV, when we can delete substantial code it's a good win - so qsort[_r] would be perfect in that sense IMO. I honestly cannot fathom why anyone would be implementing custom sorting code for a system performance tool, that just blows my mind.

If we did need qsort_r for some unusual / new platform, there are plenty of existing, tested, robust implementations to pick from that are license-compatible with htop - choose one, put it below generic/ and add some configure magic. Done.

@Explorer09

Copy link
Copy Markdown
Contributor Author

There are differences in the calling conventions between platforms, but they are trivially resolved via wrapper.

The catch is how to detect the calling conventions.

// Older BSD
void qsort_r(void*, size_t, size_t, void*, int(*)(void*, const void*, const void*));
// POSIX, GNU
void qsort_r(void*, size_t, size_t, int(*)(const void*, const void*, void*), void*);

Arguments 4 and 5 are both pointers, so only the const qualifier can tell which argument does what. At this point I am no longer interested in writing a configure code for this.

@natoscott

Copy link
Copy Markdown
Member

| The catch is how to detect the calling conventions.

Each platform will know the conventions of its own qsort_r, and a xxx/Platform.h wrapper can map that to whichever convention is chosen for use throughout htop. The only requirement for configure is to detect whether there is a qsort_r at all ... which AIUI you said some platforms are missing? If none are missing it, then no configure changes are needed. If it is missing somewhere, configure would need to enable a generic/qsort.c implementation (with contents from somewhere like https://github.com/freebsd/freebsd-src/blob/main/lib/libc/stdlib/qsort.c for example).

@Explorer09

Explorer09 commented Nov 7, 2025

Copy link
Copy Markdown
Contributor Author

If we did need qsort_r for some unusual / new platform, there are plenty of existing, tested, robust implementations to pick from that are license-compatible with htop - choose one, put it below generic/ and add some configure magic. Done.

I just heard on DistroWatch that htop is being ported to Redox OS! And no, I don't know them. It's just something I heard unintentionally.

And FYI, Redox OS's libc didn't have qsort_r at the moment, although I believe it's trivial for them to implement one.

The catch is how to detect the calling conventions.

Each platform will know the conventions of its own qsort_r, and a xxx/Platform.h wrapper can map that to whichever convention is chosen for use throughout htop. The only requirement for configure is to detect whether there is a qsort_r at all ...

FreeBSD provides both versions of the qsort_r API, with a feature macro (it's in the code you cited). And to test the availability of qsort_r, AC_SEARCH_LIBS won't work (the macro is limited to checking unmangled symbol only). A custom test is needed in configure, really.

@Explorer09 Explorer09 marked this pull request as draft November 10, 2025 12:33
@Explorer09 Explorer09 force-pushed the vector-sort branch 2 times, most recently from c068947 to 0c6d188 Compare November 30, 2025 18:41
@Explorer09 Explorer09 force-pushed the vector-sort branch 3 times, most recently from d72dff0 to 81e7498 Compare December 8, 2025 02:04
@Explorer09 Explorer09 force-pushed the vector-sort branch 2 times, most recently from ccb91f0 to f54f7f9 Compare April 1, 2026 16:28
@Explorer09 Explorer09 force-pushed the vector-sort branch 3 times, most recently from d2d599d to 9a9d5ab Compare April 10, 2026 05:45
@Explorer09 Explorer09 force-pushed the vector-sort branch 4 times, most recently from 759627f to ca1dc8c Compare April 19, 2026 17:52
@Explorer09 Explorer09 force-pushed the vector-sort branch 3 times, most recently from 037a402 to eabb44b Compare May 1, 2026 10:01
@Explorer09 Explorer09 force-pushed the vector-sort branch 3 times, most recently from 6762fa4 to 9b452f1 Compare May 11, 2026 04:08
@coderabbitai

coderabbitai Bot commented May 17, 2026

Copy link
Copy Markdown

Review Change Stack

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 00ef9052-7a56-4619-8f73-8fdaacddfc3e

📥 Commits

Reviewing files that changed from the base of the PR and between b7a6417 and 129add3.

📒 Files selected for processing (18)
  • Action.c
  • EnvScreen.c
  • ListItem.c
  • ListItem.h
  • Makefile.am
  • Object.h
  • OpenFilesScreen.c
  • Process.c
  • Process.h
  • ProcessLocksScreen.c
  • Row.c
  • Row.h
  • Table.c
  • Vector.c
  • Vector.h
  • generic/Sort.c
  • generic/Sort.h
  • pcp/Instance.c

📝 Walkthrough

Summary

This PR introduces a new stable, in-place merge sort API (Vector_sort) to replace deprecated Vector_quickSortCustomCompare() and Vector_insertionSort(), targeting improved performance on partially-sorted data (O(n) best case, O(n·log²n) average/worst case, O(log n) stack space).

Core Changes

New sorting subsystem: 184 lines added (generic/Sort.c + generic/Sort.h) implementing a merge-sort-based algorithm with run-tracking to leverage partially-sorted input. Merging is performed through rotation-based in-place operations; algorithm is declared stable.

API redesign: Extends Object_Compare typedef from 2-argument to 3-argument function pointers (adding void* context), enabling comparison functions to access external state without global variables. This change cascades through Object.h, Row.h, Process.h, and ListItem.h.

Systematic refactor: ~20 comparison function signatures updated consistently—Process_compare, Row_compare, ListItem_compare, Instance_compare, and related helpers now accept and forward the context parameter. Most suppress unused warnings via (void)context.

Call-site updates: Vector.c, Table.c, and five screen implementations (Action, EnvScreen, OpenFilesScreen, ProcessLocksScreen) replaced with Vector_sort(..., NULL, this) calls; all invocations pass context as third argument.

Implementation Assessment

Strengths:

  • Clean separation: sorting logic isolated in dedicated files with minimal footprint
  • Consistent signature pattern across codebase
  • Merge sort rationale (stability, reentrant context parameter) addresses stated limitations of qsort

Concerns:

  • Not fully async-signal-safe (swap operation not atomic); requires caller synchronization
  • Diverges from baseline benchmark (commit d3cd557), which uses separate Vector_SortFn typedef; this PR modifies Object_Compare itself, creating broader cascading changes across ~15 files rather than isolating to sorting subsystem
  • No performance benchmarks provided despite reviewer requests; algorithm speed gains unvalidated on realistic htop workloads
  • Unresolved design tension: alternative approach (bundling or wrapping qsort_r with platform-specific headers) dismissed without thorough evaluation

Commit Structure

Changes are logically split by file with focused modifications. However, the approach of extending Object_Compare rather than introducing a sort-specific typedef means changes spread across the codebase rather than being isolated to the sorting subsystem, increasing review surface area and future maintenance burden.

Walkthrough

This PR refactors htop's sorting infrastructure to use a unified, context-aware architecture. The core change replaces scattered sorting implementations (Vector_insertionSort, Vector_quickSortCustomCompare) with a new generic stable merge sort (Sort_sort) and updates all comparison callbacks to accept an optional void* context parameter. The Object_Compare typedef is extended to include context, and all comparators (Row, Process, ListItem, Instance) are updated to match the new signature. Vector_sort becomes the single entry point delegating to Sort_sort. All call sites—table rebuild, display list updates, and screen scans—are updated to use Vector_sort with the appropriate context parameter.

Poem

In scattered sorts, we found a way,
A merger rose to light the day—
One stable dance, no more divide,
Context flows, a unified tide,
All comparators, now aligned,
One sorting heart, perfectly designed.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Vector.c (1)

292-300: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Thread comparator context through Vector_indexOf() as well.

Line 300 still hard-codes NULL, so a comparator that works with Vector_sort() cannot be reused for lookups once it needs the new third argument. That leaves the Object_Compare contract inconsistent inside Vector.

Proposed fix
-int Vector_indexOf(const Vector* this, const void* search_, Object_Compare compare) {
+int Vector_indexOf(const Vector* this, const void* search_, Object_Compare compare, void* context) {
   const Object* search = search_;
   assert(Object_isA(search, this->type));
   assert(compare);
   assert(Vector_isConsistent(this));
   for (int i = 0; i < this->items; i++) {
      const Object* o = this->array[i];
      assert(o);
-      if (compare(search, o, NULL) == 0) {
+      if (compare(search, o, context) == 0) {
         return i;
      }
   }
   return -1;
}

Update the declaration in Vector.h to match.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 66593feb-2e96-493e-9807-a14875be6fec

📥 Commits

Reviewing files that changed from the base of the PR and between b7f9df9 and 7be9d0c.

📒 Files selected for processing (18)
  • Action.c
  • EnvScreen.c
  • ListItem.c
  • ListItem.h
  • Makefile.am
  • Object.h
  • OpenFilesScreen.c
  • Process.c
  • Process.h
  • ProcessLocksScreen.c
  • Row.c
  • Row.h
  • Table.c
  • Vector.c
  • Vector.h
  • generic/Sort.c
  • generic/Sort.h
  • pcp/Instance.c

Comment thread generic/Sort.c
void* array;
} MergeSortContext;

static void swapByte(char* p1, char* p2) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# List static function definitions in Sort.c to verify naming style.
rg -nP '^\s*static\s+[^()]+\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(' generic/Sort.c

Repository: htop-dev/htop

Length of output: 419


🏁 Script executed:

#!/bin/bash
# Check if Sort.h exists and examine naming of public functions
ls -la generic/Sort.h 2>/dev/null || echo "No Sort.h found"

# Look at a few other .c files to see how static functions are named
echo "=== Static functions in other .c files ==="
rg -n '^\s*static\s+[^()]+\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(' --max-count=3 | head -20

Repository: htop-dev/htop

Length of output: 1791


🏁 Script executed:

#!/bin/bash
# Examine Sort.h to see the public API naming pattern
cat -n generic/Sort.h

# Also check if there are any Sort_ prefix functions in Sort.c
echo "=== Functions with Sort_ prefix in Sort.c ==="
rg -n 'Sort_' generic/Sort.c | head -20

Repository: htop-dev/htop

Length of output: 654


🏁 Script executed:

#!/bin/bash
# Check if any of these functions are referenced outside Sort.c
echo "=== External references to swapByte ==="
rg -n 'swapByte' --type c --type h | grep -v 'generic/Sort.c'

echo "=== External references to mergeRuns ==="
rg -n 'mergeRuns' --type c --type h | grep -v 'generic/Sort.c'

echo "=== External references to mergeSortSubarray ==="
rg -n 'mergeSortSubarray' --type c --type h | grep -v 'generic/Sort.c'

echo "=== External references to mergeSort ==="
rg -n 'mergeSort' --type c --type h | grep -v 'generic/Sort.c'

Repository: htop-dev/htop

Length of output: 228


Rename static helper functions to follow Sort_functionName() convention for consistency.

The static functions swapByte, mergeRuns, mergeSortSubarray, and mergeSort do not align with the module naming convention used in the public API (Sort_sort). These should be renamed to Sort_swapByte, Sort_mergeRuns, Sort_mergeSortSubarray, and Sort_mergeSort respectively. As these are module-private functions with no external references, the refactoring is low-effort and improves internal consistency per the coding guidelines.

Also applies to: lines 31, 90, 150

Comment thread generic/Sort.c
Comment on lines +150 to +162
static void mergeSort(void* array, size_t len, size_t size, Object_Compare compare, void* context) {
if (!size)
return;

assert(len <= SIZE_MAX / size);

MergeSortContext mctx = {
.elementSize = size,
.compare = compare,
.compareContext = context,
.array = array
};
(void)mergeSortSubarray(&mctx, len * size, array, (char*)array + len * size);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect mergeSort guards and arithmetic site.
rg -nP 'static void mergeSort\(' generic/Sort.c -A35
rg -nP 'if\s*\(\s*len\s*<\s*2' generic/Sort.c || true
rg -nP 'if\s*\(\s*len\s*>\s*SIZE_MAX\s*/\s*size\s*\)' generic/Sort.c || true
rg -nP 'len\s*\*\s*size' generic/Sort.c

Repository: htop-dev/htop

Length of output: 752


Add runtime guards for overflow and trivial lengths in mergeSort.

Line 154 relies on assert() for overflow protection, which is stripped in release builds (-DNDEBUG). This leaves the len * size multiplication on line 162 unguarded. Additionally, the function should short-circuit for len < 2 to avoid unnecessary pointer arithmetic and recursion.

Proposed fix
 static void mergeSort(void* array, size_t len, size_t size, Object_Compare compare, void* context) {
-   if (!size)
+   if (!size || len < 2)
       return;
 
-   assert(len <= SIZE_MAX / size);
+   if (len > SIZE_MAX / size) {
+      assert(len <= SIZE_MAX / size);
+      return;
+   }
 
    MergeSortContext mctx = {
       .elementSize = size,
       .compare = compare,
       .compareContext = context,
       .array = array
    };
    (void)mergeSortSubarray(&mctx, len * size, array, (char*)array + len * size);
 }

Comment thread Object.h Outdated
Comment thread Vector.c

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 00622385-c1ba-421e-9a74-f4083a5281b2

📥 Commits

Reviewing files that changed from the base of the PR and between 7be9d0c and b7a6417.

📒 Files selected for processing (18)
  • Action.c
  • EnvScreen.c
  • ListItem.c
  • ListItem.h
  • Makefile.am
  • Object.h
  • OpenFilesScreen.c
  • Process.c
  • Process.h
  • ProcessLocksScreen.c
  • Row.c
  • Row.h
  • Table.c
  • Vector.c
  • Vector.h
  • generic/Sort.c
  • generic/Sort.h
  • pcp/Instance.c

Comment thread Table.c
Comment on lines +154 to +155
static int compareRowByKnownParentThenNatural(const void* v1, const void* v2, void* context) {
return Row_compareByParent((const Row*) v1, (const Row*) v2, context);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Rename helper to match module naming convention.

compareRowByKnownParentThenNatural should use the Table_ module prefix to align with project function naming rules.

Proposed refactor
-static int compareRowByKnownParentThenNatural(const void* v1, const void* v2, void* context) {
+static int Table_compareRowByKnownParentThenNatural(const void* v1, const void* v2, void* context) {
    return Row_compareByParent((const Row*) v1, (const Row*) v2, context);
 }
@@
-   Vector_sort(this->rows, compareRowByKnownParentThenNatural, this);
+   Vector_sort(this->rows, Table_compareRowByKnownParentThenNatural, this);

As per coding guidelines, "Use ModuleName_functionName() naming convention for C functions (e.g. Process_compare())."

Also applies to: 185-185

@BenBE

BenBE commented May 17, 2026

Copy link
Copy Markdown
Member

@coderabbitat Please note the context and requirements given in #1785.

@coderabbitai

coderabbitai Bot commented May 17, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Fetch PR `#1785` details
gh pr view 1785 --repo htop-dev/htop --json title,body,state,author,comments | jq '.'

Length of output: 94


🏁 Script executed:

#!/bin/bash
# Try as an issue
gh issue view 1785 --repo htop-dev/htop --json title,body,state,author,comments 2>&1 | jq '.' || echo "Not an issue either"

# Search for related PRs/issues mentioning sort or vector sort
gh search prs --repo htop-dev/htop --state all "vector sort" --json number,title,state 2>&1 | head -100
gh search issues --repo htop-dev/htop --state all "vector sort" --json number,title,state 2>&1 | head -100

Length of output: 17159


@BenBE, thanks for the pointer. I've reviewed issue #1785 and will incorporate that context going forward.

To summarize how #1798 maps to the requirements stated in #1785:

Requirement Status in #1798
Stable sort ✅ Natural in-place merge sort is stable
Exploits partially sorted runs ✅ Natural merge sort detects existing runs
No dynamic memory allocation ✅ O(log n) stack only, no malloc
Unified Vector_sort API with context Vector_sort(Vector*, Object_Compare, void*)
Follows interface from d3cd557e ⚠️ Partial — uses Object_Compare typedef rather than the proposed Object_SortFn, and a different context-struct name

The primary deviation from the interface you outlined in d3cd557e is the type naming: Explorer09 kept Object_Compare (extended with a void* context param) rather than introducing a distinct Object_SortFn typedef, and used a different internal context-structure name. Explorer09's PR description acknowledges this explicitly, noting the implementation was not based on your commit.

Given that #1785 frames this as a competition between implementations before a final choice is made, it would be worth clarifying:

  1. Is the Object_SortFn typedef a hard requirement, or is extending Object_Compare acceptable if the signature is equivalent?
  2. What is the benchmark baseline — is BenBE's Powersort WIP (branch vector-sort-replace) ready for comparison, or is that still in progress?

This would let Explorer09 (and others) know whether type-naming alignment is needed before further review is productive.

@BenBE

BenBE commented May 17, 2026

Copy link
Copy Markdown
Member

@coderabbitai The benchmark branch up to d3cd557 is ready as a baseline (basically uses the existing code) but with the new interface. To minimize effects caused by function signatures, using Object_SortFn is highly encouraged. This makes comparing the results much easier to compare. If not using that signature, a good reason for diverting from it should be given though.

@coderabbitai

coderabbitai Bot commented May 17, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Look up commit d3cd557e in the htop-dev/htop repo (likely on a branch)
gh api repos/htop-dev/htop/commits/d3cd557e --jq '.sha, .commit.message' 2>&1 || echo "Commit not found on main"

# Search for branches that might be the benchmark branch
gh api repos/htop-dev/htop/branches --jq '.[].name' 2>&1 | head -30

# Look at PRs by BenBE related to sort
gh search prs --repo htop-dev/htop --author BenBE --state all --json number,title,headRefName 2>&1

Length of output: 3393


🏁 Script executed:

#!/bin/bash
# Get the full diff of commit d3cd557e
gh api repos/htop-dev/htop/commits/d3cd557e0534a45ab971397d4c407116547d3f6d --jq '.files[] | "\(.filename)\n\(.patch)"' 2>&1 | head -300

Length of output: 6790


@BenBE, thanks — that clarifies the benchmark situation precisely.

Having reviewed d3cd557e, here is the concrete difference:

d3cd557e interface (benchmark baseline)

// Vector.h
typedef int (*Vector_SortFn)(const void *a, const void *b, void *ctx);
void Vector_sort(Vector* this, Vector_SortFn cmp, void* ctx);

// Object_Compare stays as the original 2-arg form

Object_Compare is not changed. Vector_SortFn is a separate, context-aware typedef scoped to the sort API.

PR #1798 interface

// Object.h — Object_Compare itself extended to 3 args
typedef int (*Object_Compare)(const void*, const void*, void*);
void Vector_sort(Vector* this, Object_Compare compare, void* context);

The 3-arg signature is the same, but it is expressed by mutating Object_Compare rather than introducing a distinct typedef, which cascades changes into Object.h, Row.h, Process.h, ListItem.h, Process.c, Row.c, etc.

What this means for #1798:

To align with the benchmark baseline, Explorer09 needs to decide:

  1. Adopt Vector_SortFn / Object_SortFn — Keep Object_Compare as a 2-arg typedef and introduce a separate 3-arg sort-specific typedef. The Vector_sort signature becomes Vector_sort(Vector*, Vector_SortFn, void*), matching d3cd557e exactly. This avoids the wide-blast-radius changes to all compare-function signatures across the codebase.

  2. Or justify merging — If unifying Object_Compare to 3 args is intentional design (e.g., for future context-aware comparisons beyond sorting), that reasoning needs to be stated explicitly in the PR description, since it is a materially different architectural decision with broader interface impact.

From a benchmark-comparison standpoint, option 1 is cleaner: the only variable being tested is the sorting algorithm, not the comparator-calling convention throughout the type hierarchy.


✏️ Learnings added
Learnt from: BenBE
URL: https://github.com/htop-dev/htop/pull/1798

Timestamp: 2026-05-17T08:31:52.349Z
Learning: In htop-dev/htop, commit d3cd557e0534a45ab971397d4c407116547d3f6d ("Consolidate Vector_*Sort APIs") by BenBE defines the benchmark baseline interface: `typedef int (*Vector_SortFn)(const void *a, const void *b, void *ctx)` in Vector.h, with `Vector_sort(Vector* this, Vector_SortFn cmp, void* ctx)`. Object_Compare remains a 2-arg typedef unchanged. PR `#1798` by Explorer09 diverges by extending Object_Compare itself to 3 args (adding void* context), cascading changes across Object.h, Row.h, Process.h, ListItem.h etc. BenBE requires using Object_SortFn (equivalent to Vector_SortFn) in any competing implementation to enable fair benchmark comparison, unless a clear justification for diverging is given.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Update the prototypes of all Object_Compare functions and Vector_*Sort()
functions to accept the third argument.

The third argument allows passing in extra information or states that
could be useful in sorting. The definition of Object_Compare now matches
the prototype of the compare function in libc qsort_r().

(Previously many programmers would use global variables to read or store
extra information needed in a compare operation. This made the sort
function not reentrant. It was the primary motivation for the qsort_r()
function.)

Currently all Object_Compare functions in htop would simply the third
argument. No functions are using it yet, but this may be changed in the
future.

The callers of Vector_*Sort() will now by default pass in the reference
of the container object (usually named "this") as the third argument to
the sort compare functions.

Signed-off-by: Kang-Che Sung <explorer09@gmail.com>
Introduce the new Vector_sort() function and obsolete the old
Vector_quickSort() and Vector_insertionSort() APIs.

This new sort function is a natural, in-place merge sort. I.e. it takes
advantage of partially sorted data, and it's stable.

Space complexity: O(log(n)) worst case (on stack, no malloc())
Time complexity: O(n) best case, O(n*(log(n))^2) average & worst case

Signed-off-by: Kang-Che Sung <explorer09@gmail.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
generic/Sort.c (2)

25-25: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Rename static helper functions to follow Sort_functionName() convention.

The static functions swapByte, mergeRuns, mergeSortSubarray, and mergeSort do not align with the ModuleName_functionName() naming convention used for the public API (Sort_sort). Rename them to Sort_swapByte, Sort_mergeRuns, Sort_mergeSortSubarray, and Sort_mergeSort respectively for consistency with coding guidelines.

Also applies to: 31-31, 90-90, 150-150

Source: Coding guidelines


150-163: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add runtime guards for overflow and trivial lengths in mergeSort.

Line 154 relies on assert() for overflow protection, which is stripped in release builds (-DNDEBUG), leaving the len * size multiplication on line 162 unguarded. Additionally, the function should short-circuit for len < 2 to avoid unnecessary pointer arithmetic and recursion.

Proposed fix
 static void mergeSort(void* array, size_t len, size_t size, Object_Compare compare, void* context) {
-   if (!size)
+   if (!size || len < 2)
       return;
 
-   assert(len <= SIZE_MAX / size);
+   if (len > SIZE_MAX / size) {
+      assert(len <= SIZE_MAX / size);
+      return;
+   }
 
    MergeSortContext mctx = {
       .elementSize = size,
       .compare = compare,
       .compareContext = context,
       .array = array
    };
    (void)mergeSortSubarray(&mctx, len * size, array, (char*)array + len * size);
 }
Vector.c (1)

112-115: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve the caller's NULL comparator context.

Line 114 silently replaces NULL with this, preventing callers from passing an explicit null context to comparators that may distinguish between "no context" and "Vector as context." This forces this as the context even when the caller intends NULL.

Proposed fix
   VectorSortContext vc = {
      .compare = compare ? compare : this->type->compare,
-      .compareContext = context ? context : this,
+      .compareContext = context,
   };

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 00ef9052-7a56-4619-8f73-8fdaacddfc3e

📥 Commits

Reviewing files that changed from the base of the PR and between b7a6417 and 129add3.

📒 Files selected for processing (18)
  • Action.c
  • EnvScreen.c
  • ListItem.c
  • ListItem.h
  • Makefile.am
  • Object.h
  • OpenFilesScreen.c
  • Process.c
  • Process.h
  • ProcessLocksScreen.c
  • Row.c
  • Row.h
  • Table.c
  • Vector.c
  • Vector.h
  • generic/Sort.c
  • generic/Sort.h
  • pcp/Instance.c

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

♻️ Duplicate comments (3)
generic/Sort.c (2)

25-25: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Rename static helper functions to follow Sort_functionName() convention.

The static functions swapByte, mergeRuns, mergeSortSubarray, and mergeSort do not align with the ModuleName_functionName() naming convention used for the public API (Sort_sort). Rename them to Sort_swapByte, Sort_mergeRuns, Sort_mergeSortSubarray, and Sort_mergeSort respectively for consistency with coding guidelines.

Also applies to: 31-31, 90-90, 150-150

Source: Coding guidelines


150-163: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add runtime guards for overflow and trivial lengths in mergeSort.

Line 154 relies on assert() for overflow protection, which is stripped in release builds (-DNDEBUG), leaving the len * size multiplication on line 162 unguarded. Additionally, the function should short-circuit for len < 2 to avoid unnecessary pointer arithmetic and recursion.

Proposed fix
 static void mergeSort(void* array, size_t len, size_t size, Object_Compare compare, void* context) {
-   if (!size)
+   if (!size || len < 2)
       return;
 
-   assert(len <= SIZE_MAX / size);
+   if (len > SIZE_MAX / size) {
+      assert(len <= SIZE_MAX / size);
+      return;
+   }
 
    MergeSortContext mctx = {
       .elementSize = size,
       .compare = compare,
       .compareContext = context,
       .array = array
    };
    (void)mergeSortSubarray(&mctx, len * size, array, (char*)array + len * size);
 }
Vector.c (1)

112-115: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve the caller's NULL comparator context.

Line 114 silently replaces NULL with this, preventing callers from passing an explicit null context to comparators that may distinguish between "no context" and "Vector as context." This forces this as the context even when the caller intends NULL.

Proposed fix
   VectorSortContext vc = {
      .compare = compare ? compare : this->type->compare,
-      .compareContext = context ? context : this,
+      .compareContext = context,
   };

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 00ef9052-7a56-4619-8f73-8fdaacddfc3e

📥 Commits

Reviewing files that changed from the base of the PR and between b7a6417 and 129add3.

📒 Files selected for processing (18)
  • Action.c
  • EnvScreen.c
  • ListItem.c
  • ListItem.h
  • Makefile.am
  • Object.h
  • OpenFilesScreen.c
  • Process.c
  • Process.h
  • ProcessLocksScreen.c
  • Row.c
  • Row.h
  • Table.c
  • Vector.c
  • Vector.h
  • generic/Sort.c
  • generic/Sort.h
  • pcp/Instance.c
🛑 Comments failed to post (1)
Process.c (1)

349-372: 🧹 Nitpick | 🔵 Trivial | ⚖️ Poor tradeoff

Unrelated change mixed into sorting refactor.

This /proc pseudo-path stripping logic modifies command-display behavior (Process_makeCommandStr), not comparator signatures. It does not belong in a PR focused on updating sorting infrastructure. Extract this to a separate commit or PR to maintain clear change boundaries and simplify review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants