Reflect Memory

Updated 2026-05-21

Hybrid Infrastructure

How Reflect Memory manages hybrid cloud + on-prem deployments, model hosting, egress controls, and internal model endpoints.

Doc-specific AI prompt

Hybrid Infrastructure

Questions this doc answers

  • How do you mix hosted connectors with private on-prem models?
  • What happens when one team needs cloud AI (ChatGPT/Claude) and another uses internal models?
  • How do you keep AI egress under control across those boundaries?

Shared memory across hybrid stacks

Reflect Memory abstracts the infrastructure tier from the AI tier. A hosted deployment can still connect to your on-prem models via RM_ALLOWED_MODEL_HOSTS, while self-host deployments can still serve ChatGPT/Claude via MCP if agent keys are available.

  • disableModelEgress (default true in self-host) prevents outbound LLM calls unless explicit.
  • requireInternalModelBaseUrl ensures every hosted model call resolves to allowedModelHosts.
  • allowedModelHosts also gates scheduled webhooks and automation triggers that otherwise would reach public endpoints.

Model connectivity patterns

  1. Cloud-first + self-host complement: Use hosted API for general reads/writes, and self-hosted connectors for highly regulated data. Both sides read/write the same memory store, so context replicates naturally.
  2. On-prem model bridging: Install Reflect Memory inside your VPC. Set RM_ALLOWED_MODEL_HOSTS=llama.local,llama.next, RM_REQUIRE_INTERNAL_MODEL_BASE_URL=true, and RM_DISABLE_MODEL_EGRESS=true. Your LLMs call the local REST API; the same environment also accepts MCP keys from corporate-approved tools.
  3. Hybrid autop-run: Use the MCP server to let GPUs in your datacenter talk to Reflect Memory. Agents like Claude or Grok can reach internal memory via the same RM_PUBLIC_URL referencing your reverse proxy.

Connectivity guardrails

  • enforceModelHostPolicy(baseUrl) compares URLs to RM_ALLOWED_MODEL_HOSTS.
  • Private deployments still support optional RM_ALLOW_PUBLIC_WEBHOOKS for integration with approved partners.
  • Telemetry and billing events track cross-boundary usage so you can see if a self-hosted team is hitting cloud connectors or vice versa.
Diligence document | Reflect Memory | Reflect Memory