feat(api): add /v1/detokenize endpoint #9620
    return grpc::Status::OK;
}

grpc::Status Detokenize(ServerContext* context, const backend::DetokenizeRequest* request, backend::DetokenizeResponse* response) override {
This requires a new test in our e2e-backend test suite, where we exercise a mocked backend via the API.
Done — added a Detokenize method to the mock gRPC backend and two e2e tests in the MockBackend suite (0024a9c): one that POSTs known token IDs and asserts a non-empty content response, and a round-trip that tokenizes first then detokenizes the returned IDs.
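For readers who want to see the shape of those two checks, here is a minimal, self-contained sketch. It is not the code from the MockBackend suite: `newMockServer`, `postJSON`, `roundTrip`, the byte-per-token toy tokenizer, and the `/v1/tokenize` request/response field names are all assumptions made for illustration. Only the `/v1/detokenize` shapes come from the PR description.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Wire shapes. The detokenize pair follows the PR description; the tokenize
// pair is an assumption made for this sketch.
type tokenizeRequest struct {
	Model   string `json:"model"`
	Content string `json:"content"`
}
type tokenizeResponse struct {
	Tokens []int32 `json:"tokens"`
}
type detokenizeRequest struct {
	Model  string  `json:"model"`
	Tokens []int32 `json:"tokens"`
}
type detokenizeResponse struct {
	Content string `json:"content"`
}

// newMockServer is a hypothetical stand-in for the API in front of a mocked
// backend. Its toy tokenizer maps every input byte to one token ID.
func newMockServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/tokenize", func(w http.ResponseWriter, r *http.Request) {
		var req tokenizeRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		resp := tokenizeResponse{Tokens: []int32{}}
		for _, b := range []byte(req.Content) {
			resp.Tokens = append(resp.Tokens, int32(b))
		}
		json.NewEncoder(w).Encode(resp)
	})
	mux.HandleFunc("/v1/detokenize", func(w http.ResponseWriter, r *http.Request) {
		var req detokenizeRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		out := make([]byte, 0, len(req.Tokens))
		for _, t := range req.Tokens {
			out = append(out, byte(t))
		}
		json.NewEncoder(w).Encode(detokenizeResponse{Content: string(out)})
	})
	return httptest.NewServer(mux)
}

// postJSON posts in as JSON and decodes the JSON response body into out.
func postJSON(url string, in, out interface{}) error {
	body, err := json.Marshal(in)
	if err != nil {
		return err
	}
	resp, err := http.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	return json.NewDecoder(resp.Body).Decode(out)
}

// roundTrip tokenizes text, then detokenizes the returned IDs.
func roundTrip(baseURL, text string) (string, error) {
	var t tokenizeResponse
	if err := postJSON(baseURL+"/v1/tokenize", tokenizeRequest{Model: "mock-model", Content: text}, &t); err != nil {
		return "", err
	}
	var d detokenizeResponse
	err := postJSON(baseURL+"/v1/detokenize", detokenizeRequest{Model: "mock-model", Tokens: t.Tokens}, &d)
	return d.Content, err
}

func main() {
	srv := newMockServer()
	defer srv.Close()

	// Check 1: known token IDs produce a non-empty content response.
	var d detokenizeResponse
	err := postJSON(srv.URL+"/v1/detokenize", detokenizeRequest{Model: "mock-model", Tokens: []int32{104, 105}}, &d)
	if err != nil || d.Content == "" {
		panic(fmt.Sprintf("detokenize failed: content=%q err=%v", d.Content, err))
	}

	// Check 2: tokenize followed by detokenize returns the original text.
	got, err := roundTrip(srv.URL, "hello world")
	if err != nil || got != "hello world" {
		panic(fmt.Sprintf("round trip failed: got=%q err=%v", got, err))
	}
	fmt.Println("both checks passed")
}
```

The round-trip check is the stronger of the two: it fails if either direction drifts, whereas the non-empty check only proves the route is wired through to the backend.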
Add Detokenize to the mock gRPC backend and wire up two e2e tests in the MockBackend suite: one that posts known token IDs and asserts a non-empty content response, and a round-trip that tokenizes first then detokenizes the returned IDs.

Addresses reviewer feedback on mudler#9620.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Adira Denis Muhando <dennisadira@gmail.com>
Force-pushed from 0024a9c to 5ea7612.
Force-pushed from 5ea7612 to 349c9d2.
Rebased onto current master (05e8e1e). The only conflicts were in the generated swagger files: upstream had added Diarization types, and I merged both sets in alphabetical order. All other files applied cleanly. Ready for another look when you have a moment.
Closes mudler#1649. Mirror of the existing /v1/tokenize path, requested by @benniekiss in the issue thread for "complete API workflow" use cases that need to turn token IDs back into text without local processing.

- Add Detokenize gRPC RPC with DetokenizeRequest{tokens} / DetokenizeResponse{content} messages.
- Implement in the llama.cpp backend using common_token_to_piece, the same primitive TokenizeString already uses internally.
- Other backends inherit the default Unimplemented from base.Base, in line with how Detect, Rerank, etc. are gated per-backend.
- Wire up the Go gRPC interface, server, client, and in-process embed wrapper alongside their TokenizeString counterparts.
- Add the schema types, ModelDetokenize wrapper, HTTP handler, route registration, RouteFeatureRegistry entry (gated by FeatureTokenize so no new feature flag is needed), and the discovery map entry under ai_functions.
- Regenerated swagger reflects the new endpoint and types.
- Update authentication.md to list /v1/detokenize alongside /v1/tokenize.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Adira Denis Muhando <dennisadira@gmail.com>
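The "default Unimplemented, opt in per backend" arrangement described in this commit can be sketched in Go roughly as follows. This is an illustration of the pattern only, not the code in pkg/grpc: `Backend`, `Base`, `ToyBackend`, `OtherBackend`, `ErrUnimplemented`, and the three-entry vocabulary are hypothetical names invented here, and the map lookup merely stands in for the per-token piece conversion that the llama.cpp backend does with common_token_to_piece.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrUnimplemented is a hypothetical stand-in for a gRPC Unimplemented status.
var ErrUnimplemented = errors.New("unimplemented")

// Backend is a hypothetical, trimmed-down backend interface.
type Backend interface {
	Detokenize(tokens []int32) (string, error)
}

// Base supplies the default: any backend that embeds it reports Detokenize
// as unimplemented until it chooses to override the method.
type Base struct{}

func (Base) Detokenize([]int32) (string, error) { return "", ErrUnimplemented }

// ToyBackend opts in by overriding the embedded default.
// Its vocabulary is invented for this sketch.
type ToyBackend struct {
	Base
	vocab map[int32]string
}

// Detokenize converts each token ID to its text piece and concatenates the
// pieces in order.
func (b ToyBackend) Detokenize(tokens []int32) (string, error) {
	var sb strings.Builder
	for _, t := range tokens {
		piece, ok := b.vocab[t]
		if !ok {
			return "", fmt.Errorf("unknown token id %d", t)
		}
		sb.WriteString(piece)
	}
	return sb.String(), nil
}

// OtherBackend embeds Base and does not override, so it keeps the default.
type OtherBackend struct{ Base }

func main() {
	toy := ToyBackend{vocab: map[int32]string{1: "Hello", 2: ",", 3: " world"}}
	for _, b := range []Backend{toy, OtherBackend{}} {
		text, err := b.Detokenize([]int32{1, 2, 3})
		fmt.Printf("%T: %q, err=%v\n", b, text, err)
	}
}
```

Embedding keeps the interface change cheap: adding a method to the shared interface does not break backends that have not implemented it yet, and callers get an explicit error rather than a silent empty string.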
Force-pushed from 349c9d2 to c168bcb.
Rebased onto current master; the branch now has 2 commits (the endpoint + e2e tests) cleanly on top of master with no unrelated changes. Also checked: no other PR has landed a
Hi @mudler, just flagging that the e2e tests requested in the review have been added in the second commit ( Happy to make any further changes if needed.
Summary
Closes #1649. Mirror of /v1/tokenize for the inverse direction: take a list of token IDs and return the detokenized text, requested by @benniekiss in the issue thread for "complete API workflow" use cases that need to turn token IDs back into text without local processing.

The proto/handler shape was discussed in #1649 (comment). @benniekiss reacted positively; landing this with the strict-mirror-of-tokenize precedent in mind. Happy to adjust if the proto naming or response shape should differ.
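To make the response-shape question concrete, here is a small sketch of the JSON bodies on the wire. The field names (model, tokens, content) are the ones given in this description; the Go type names, the helper functions, the model name, and the token IDs are placeholders chosen for the example and are not taken from the PR's schema package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DetokenizeRequest and DetokenizeResponse use the JSON field names from the
// PR description. The Go type names are assumptions for this sketch.
type DetokenizeRequest struct {
	Model  string  `json:"model"`
	Tokens []int32 `json:"tokens"`
}

type DetokenizeResponse struct {
	Content string `json:"content"`
}

// EncodeRequest builds the body a client would POST to /v1/detokenize.
func EncodeRequest(model string, tokens []int32) ([]byte, error) {
	return json.Marshal(DetokenizeRequest{Model: model, Tokens: tokens})
}

// DecodeResponse extracts the detokenized text from a response body.
func DecodeResponse(body []byte) (string, error) {
	var resp DetokenizeResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	return resp.Content, nil
}

func main() {
	// Placeholder model name and token IDs.
	body, err := EncodeRequest("my-model", []int32{15496, 995})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body)) // {"model":"my-model","tokens":[15496,995]}

	content, err := DecodeResponse([]byte(`{"content":"Hello world"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(content) // Hello world
}
```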
What's added
- backend/backend.proto: new Detokenize(DetokenizeRequest) returns (DetokenizeResponse) RPC, with DetokenizeRequest{repeated int32 tokens} and DetokenizeResponse{string content}. The Go bindings are regenerated by make protogen-go (gitignored as usual).
- backend/cpp/llama-cpp/grpc-server.cpp: handler that calls common_token_to_piece per token and concatenates, the same primitive TokenizeString already uses internally in the same file.
- Other backends inherit the default Unimplemented from pkg/grpc/base.Base, the same pattern as Detect, Rerank, etc. Backends can opt in later.
- pkg/grpc/{interface,server,backend,client,embed}.go + pkg/grpc/base/base.go updated alongside their TokenizeString counterparts.
- POST /v1/detokenize in core/http/endpoints/localai/detokenize.go and core/http/routes/localai.go. Request {"model": "...", "tokens": [...]}, response {"content": "..."}.
- RouteFeatureRegistry entry gated by the existing FeatureTokenize; no new feature flag.
- Discovery map entry under ai_functions in the routes index.
- authentication.md updated to list the new endpoint.

Test plan

- make protogen-go regenerates clean
- go build ./core/... ./pkg/grpc/... clean
- go vet ./core/... ./pkg/grpc/... clean
- go test -c -o /dev/null ./core/services/nodes/... clean (the existing testcontainers-based suite needs Docker; only updated the two interface mocks so the test package still compiles)
- make swagger regenerates with the new endpoint visible
- POST /v1/tokenize → POST /v1/detokenize returns the original text on a llama.cpp model

Assisted-by: Claude:claude-opus-4-7