
8. RagSearcher Triplets Update (Interface and Runtime Specification)

Scope

This specification describes triplet-aware retrieval behavior across:

  • app/Mcp/Tools/HawkiRagSearchTool.php
  • app/Services/RagSearch/RagSearcher.php
  • python_rag/pipeline/query_logic.py

System-level behavior

The query path is:

  1. MCP tool validates user input and sends normalized parameters to RagSearcher.
  2. RagSearcher sends a /query request to Python RAG.
  3. Python retrieval performs semantic and structural retrieval, then reranks the combined hits.
  4. Relation hits (component_type = relation) are preserved through filtering.
  5. Laravel returns a normalized response object with results, kg, and rewrite_terms.

Interface specification

MCP input validation (HawkiRagSearchTool)

  • query: required string.
  • top_k: optional integer in [1, 50].
  • Missing top_k defaults to 5.
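
The validation rules above can be sketched in Python for illustration. This is not the PHP code in HawkiRagSearchTool: the function name `validate_search_input` is hypothetical, and two behaviors are assumptions the spec does not state, namely that an empty query is rejected and that an out-of-range top_k raises an error rather than being clamped.

```python
from typing import Any, Mapping

DEFAULT_TOP_K = 5                 # default from the spec
MIN_TOP_K, MAX_TOP_K = 1, 50      # inclusive range from the spec


def validate_search_input(params: Mapping[str, Any]) -> dict:
    """Return normalized {query, top_k} or raise ValueError (hypothetical helper)."""
    query = params.get("query")
    # Assumption: a blank string does not count as a valid query.
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query is required and must be a non-empty string")

    top_k = params.get("top_k", DEFAULT_TOP_K)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(top_k, bool) or not isinstance(top_k, int):
        raise ValueError("top_k must be an integer")
    # Assumption: out-of-range values are rejected, not clamped.
    if not MIN_TOP_K <= top_k <= MAX_TOP_K:
        raise ValueError(f"top_k must be in [{MIN_TOP_K}, {MAX_TOP_K}]")

    return {"query": query, "top_k": top_k}
```

Calling `validate_search_input({"query": "neo4j"})` yields a top_k of 5, matching the documented default.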

Output contract

Response shape

RagSearcher::filterResponse() always returns this top-level structure:

| Key | Type | Purpose |
|---|---|---|
| results | array | Ranked result items (semantic chunks and/or relation hits). |
| kg | array | Normalized relation triples for graph-aware consumers. |
| rewrite_terms | array | Query rewrite/entity terms derived by backend logic. |

{
  "results": [],
  "kg": [],
  "rewrite_terms": []
}
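
To illustrate the "always returns this top-level structure" guarantee, here is a minimal Python sketch of the same idea. It is not the PHP implementation of RagSearcher::filterResponse(); the name `normalize_response` is hypothetical, and treating any non-list value as an empty list is an assumption.

```python
from typing import Any

TOP_LEVEL_KEYS = ("results", "kg", "rewrite_terms")


def normalize_response(raw: Any) -> dict:
    """Return a dict that always contains the three top-level keys as lists."""
    payload = raw if isinstance(raw, dict) else {}
    normalized = {}
    for key in TOP_LEVEL_KEYS:
        value = payload.get(key)
        # Assumption: missing or malformed values collapse to an empty list.
        normalized[key] = list(value) if isinstance(value, list) else []
    return normalized
```

With this shape, a consumer can iterate over all three keys without checking for their presence first.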

results[]

Each item can represent a semantic chunk or a graph relation.

| Semantic-oriented fields | Relation-oriented fields | Shared/typing fields |
|---|---|---|
| metadata.language | subject | component_type |
| metadata.title | relation | content |
| metadata.url | object | metadata.collection |
| metadata.timestamp | | metadata.tags |

!!! note "Interpretation rule"
    component_type = relation indicates a relation/triplet-oriented hit.
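
A consumer might apply the interpretation rule as in the following sketch. The helper names are hypothetical. Treating items without component_type as semantic chunks is an assumption, based on the reference response in this section, where the chunk item carries no component_type.

```python
from typing import Iterable, List, Tuple


def is_relation_hit(item: dict) -> bool:
    """True when the item is typed as a relation/triplet-oriented hit."""
    return item.get("component_type") == "relation"


def split_hits(results: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Partition results[] into (chunks, relations), preserving rank order."""
    chunks: List[dict] = []
    relations: List[dict] = []
    for item in results:
        # Assumption: anything not explicitly typed as a relation is a chunk.
        (relations if is_relation_hit(item) else chunks).append(item)
    return chunks, relations
```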

kg[] and rewrite_terms[]

Both keys are always present alongside results[]. Their contents are optional metadata, so either array may be empty.

| Field | Structure | Purpose | Empty state |
|---|---|---|---|
| kg[] | triple items: subject, relation, object | Graph relation facts | kg: [] when no valid relation facts exist |
| rewrite_terms[] | string items | Query expansion/debug metadata | rewrite_terms: [] when no terms are produced |
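
The empty-state rule for kg[] can be sketched as a small filter. The spec does not define what makes a relation fact "valid"; this example assumes a triple is valid only when subject, relation, and object are all non-empty strings. The function name `normalize_kg` is hypothetical.

```python
from typing import Any, Iterable, List, Optional

TRIPLE_FIELDS = ("subject", "relation", "object")


def normalize_kg(items: Optional[Iterable[Any]]) -> List[dict]:
    """Keep only well-formed triples; return [] when none qualify."""
    triples: List[dict] = []
    for item in items or []:
        if not isinstance(item, dict):
            continue
        values = [item.get(field) for field in TRIPLE_FIELDS]
        # Assumption: "valid" means all three fields are non-empty strings.
        if all(isinstance(v, str) and v.strip() for v in values):
            triples.append({f: v.strip() for f, v in zip(TRIPLE_FIELDS, values)})
    return triples
```

Extra keys on an input item are dropped, so each output triple contains exactly the three documented fields.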

Contract delta summary

| Dimension | Only Qdrant | Qdrant + Neo4j |
|---|---|---|
| Retrieval mode default | fast_mode=true | fast_mode=false |
| Graph lookup default | smart_lookup=false | smart_lookup=true |
| Structural depth from client | forced to 0 | omitted unless explicitly set |
| Allowed hit types | chunk-only | chunk + relation |
| Top-level response keys | results | results, kg, rewrite_terms |
| Relation fields in results | not exposed | exposed (component_type, subject, relation, object) |
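
The request-side rows of this table could translate into a /query payload roughly as follows. This is a sketch under assumptions: fast_mode and smart_lookup come from the table, but the payload key `structural_depth`, the `graph_enabled` switch, and the function name are hypothetical, since the spec does not name them.

```python
from typing import Optional


def build_query_payload(
    query: str,
    top_k: int = 5,
    graph_enabled: bool = True,              # hypothetical switch between the two columns
    structural_depth: Optional[int] = None,  # hypothetical parameter name
) -> dict:
    """Build request parameters for the two modes in the delta table."""
    payload = {"query": query, "top_k": top_k}
    if graph_enabled:
        # Qdrant + Neo4j: full retrieval with graph lookup.
        payload["fast_mode"] = False
        payload["smart_lookup"] = True
        # Depth is omitted unless the caller sets it explicitly.
        if structural_depth is not None:
            payload["structural_depth"] = structural_depth
    else:
        # Only Qdrant: fast path, no graph lookup, depth forced to 0.
        payload["fast_mode"] = True
        payload["smart_lookup"] = False
        payload["structural_depth"] = 0
    return payload
```

In the Qdrant-only branch, any depth supplied by the caller is ignored, which mirrors the "forced to 0" row.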

Reference response example

{
  "results": [
    {
      "metadata": {
        "language": "en",
        "title": "Example Page",
        "url": "https://example.org",
        "timestamp": "2026-03-11T12:00:00Z",
        "tags": "graph,neo4j",
        "collection": "hawki"
      },
      "content": "Some chunk text..."
    },
    {
      "metadata": {
        "collection": "hawki"
      },
      "component_type": "relation",
      "subject": "Entity A",
      "relation": "connected_to",
      "object": "Entity B"
    }
  ],
  "kg": [
    {
      "subject": "Entity A",
      "relation": "connected_to",
      "object": "Entity B"
    }
  ],
  "rewrite_terms": [
    "entity a",
    "entity b"
  ]
}