The Vision, Debugged

Why Zep’s Memory System Is a Game-Changer for Enterprises

PLUS: Karpathy, Zuck & Altman’s next big bet: Vibe coding 👨‍💻

Tezan Sahu & Sandra Anil
February 18th, 2025

Howdy Vision Debuggers!🕵️

Psst—Spark and Trouble smuggled a time machine into this edition. Not the clunky kind, but one that quietly stitches past chats, future plans, and… wait, is that a graph-shaped engine humming beneath?

Gif by konczakowski on Giphy

Here’s a sneak peek into today’s edition 👀

AI memory just evolved with “Zep”. Here’s why it’s a game-changer.
Unmask hidden insights from user feedback with this prompt
5 Cutting-Edge AI Tools You Won't Want to Miss
Andrej Karpathy’s “Vibe Coding” Explained

Time to jump in!😄

PS: Got thoughts on our content? Share 'em through a quick survey at the end of every edition. It helps us see how our product labs, insights & resources are landing, so we can make them even better.

Hot off the Wires 🔥

We're eavesdropping on the smartest minds in research. 🤫 Don't miss out on what they're cooking up! In this section, we dissect some of the juiciest tech research that holds the key to what's next in tech.⚡

Remember that friend who remembers everything – every inside joke, every detail from years ago? Now imagine if AI agents could actually do something similar - recalling not just what you discussed, but when you discussed it, how topics evolved, and even connecting dots you didn’t realize existed.

That’s the promise of Zep, a new memory architecture for AI agents that’s shaking up how machines understand and retain information. It lets agents recall, reason, and learn from weeks of conversations, business data, and real-time interactions.

Zep - The Foundational Memory Layer for AI

Equip your agents with the knowledge to complete tasks, from the mundane to monumental.

vimeo.com/1021963693

So, what’s new?

Most AI chatbots today have the memory span of a goldfish. While innovations like MemGPT improved this by using techniques like "virtual context windows," they still struggle with dynamic, real-world data. For instance:

Current systems treat conversations as static documents, missing how relationships and facts change over time.
When you ask, "What’s the latest update on Project X?" today’s AI might retrieve outdated info or fail to connect last week’s budget discussion with yesterday’s deadline change.
Enterprises lose hours daily to AI hallucinations in tasks like customer support logs or medical record synthesis.

Innovators at Zep AI saw this gap and asked: What if AI memory worked more like human memory—layered, adaptive, and temporally aware?

They created Zep by merging episodic memory (raw chat logs) and semantic memory (structured facts) into a knowledge graph that evolves in real time. It’s like upgrading a notebook to a self-updating encyclopedia.

Forging the fundamentals

Before proceeding, let’s break down some jargon:

Knowledge Graph: A network of interconnected entities (people, places, concepts, etc.) and their relationships.

Episodic Memory: This is the part of the memory that stores specific events (e.g., “Alice mentioned Project X on June 5”). Like a diary.

Semantic Memory: This is the part of the memory that captures general knowledge (e.g., “Project X is an AI safety initiative”). Like an encyclopedia.

Bi-temporal Model: This is a model that tracks (1) when events happened and (2) when they were recorded. Imagine a timeline that shows both real-world events and when the AI learned about them.

Temporal Reasoning: Understanding questions like, “What changed between Q2 and Q3?”—This is a nightmare for most AI today.

Deep Memory Retrieval (DMR): This is a test that measures how well AI systems can dig up specific details from long, multi-session conversations, like finding a needle in a haystack of chat history. For example, if you ask, “What’s Alice’s favorite coffee order from last month’s meeting?”, DMR evaluates whether the AI can accurately recall that tiny detail buried in weeks of chats.
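
To make the bi-temporal idea concrete, here is a minimal Python sketch. The `Fact` class, its field names (`valid_at`, `recorded_at`), the year 2024 and the budget example are all our own illustrative assumptions, not Zep's actual schema or data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Fact:
    text: str
    valid_at: datetime     # timeline 1: when it happened in the real world
    recorded_at: datetime  # timeline 2: when the system learned about it


facts = [
    Fact("Alice mentioned Project X", datetime(2024, 6, 5), datetime(2024, 6, 5)),
    # Happened on June 1, but only reached the system on June 10 (hypothetical).
    Fact("Budget for Project X approved", datetime(2024, 6, 1), datetime(2024, 6, 10)),
]


def known_as_of(facts, when):
    """Facts already recorded by `when`, ordered by when they really happened."""
    recorded = [f for f in facts if f.recorded_at <= when]
    return sorted(recorded, key=lambda f: f.valid_at)


early = [f.text for f in known_as_of(facts, datetime(2024, 6, 7))]
later = [f.text for f in known_as_of(facts, datetime(2024, 6, 11))]
print(early)  # ['Alice mentioned Project X']
print(later)  # ['Budget for Project X approved', 'Alice mentioned Project X']
```

Keeping both timestamps lets the system answer "what did we know on June 7?" separately from "what had actually happened by June 7?".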

Under the hood…

Zep is a temporal knowledge graph architecture designed to power AI agents with human-like memory. Its secret weapon? Graphiti, a dynamic knowledge graph engine that ingests both unstructured conversations and structured business data while tracking how relationships evolve over time.

Zep’s architecture mimics human memory through three layered subgraphs:

Episode Subgraph
- This stores raw conversation snippets (e.g., “Alice: Let’s delay Project X to Q3”), in the form of messages, text & JSON.
- It uses bi-temporal timestamps to track both real-world events and data ingestion order.
Semantic Entity Subgraph
- Here, various entities (people, projects) and facts (“Project X → delayed → Q3”) are extracted from the episodes and embedded into vectors.
- It automatically resolves duplicates (e.g., merging “Project X” and “Proj. X” into one entity) using an entity resolution prompt.
- Another super-interesting feature here is temporal edge invalidation - it is like setting expiration dates on facts: when new info contradicts old data (e.g., "Project X’s deadline moved to Q3"), Zep marks the outdated edge as invalid while preserving its history, ensuring the AI always references the current truth.
Community Subgraph
- Related entities (e.g., all teams working on AI safety) are then grouped into clusters.
- These clusters are updated automatically as new info comes in, keeping the AI’s understanding fresh without manual resets.
- High-level summaries of each cluster are also created, helping the AI grasp big-picture relationships quickly.
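
Temporal edge invalidation is easier to see in code. The sketch below is a toy, not Graphiti's implementation: the `Edge` and `FactGraph` names, the "one current value per subject and relation" rule, and the dates are all assumptions we made for illustration. A real system would need a smarter way to decide when two facts contradict each other.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Edge:
    subject: str
    relation: str
    obj: str
    valid_at: datetime                     # when the fact became true
    invalid_at: Optional[datetime] = None  # None means "still true"


class FactGraph:
    def __init__(self):
        self.edges = []  # every edge ever added; nothing is deleted

    def add_fact(self, subject, relation, obj, valid_at):
        # Simplifying assumption: a (subject, relation) pair has one current
        # value, so a different value contradicts the old one.
        for edge in self.edges:
            if (edge.subject, edge.relation) != (subject, relation):
                continue
            if edge.invalid_at is not None:
                continue                 # already expired
            if edge.obj == obj:
                return edge              # already known, nothing to do
            edge.invalid_at = valid_at   # expire the old fact, keep its history
        new_edge = Edge(subject, relation, obj, valid_at)
        self.edges.append(new_edge)
        return new_edge

    def current(self, subject, relation):
        return [e.obj for e in self.edges
                if (e.subject, e.relation) == (subject, relation)
                and e.invalid_at is None]


graph = FactGraph()
graph.add_fact("Project X", "deadline", "Q2", datetime(2024, 4, 1))
graph.add_fact("Project X", "deadline", "Q3", datetime(2024, 6, 5))

print(graph.current("Project X", "deadline"))  # ['Q3']
print([(e.obj, e.invalid_at) for e in graph.edges])
```

The old "Q2" edge is still in the graph with an `invalid_at` date, so the agent can answer both "what is the deadline now?" and "what was it before?".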

Zep’s retrieval system returns structured memory components, including facts, entities, and summaries, using hybrid search methods:

Cosine Similarity: Finds semantically related information.
Okapi BM25: Performs full-text searches.
Breadth-First Search: Retrieves contextual information from related nodes.

It then reranks results to prioritize what’s recent, frequently mentioned, or closest to the query’s heart. Finally, it compiles the top findings into a crisp summary (dates, facts, key players) for the AI to use.
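
Here is a rough, self-contained sketch of how three such searches could be combined. Everything in it is a stand-in: the three facts, their tiny hand-made "embeddings", and the neighbor graph are invented, and BM25 is written out from the standard Okapi formula rather than taken from Zep. For the final merge we use reciprocal rank fusion, a common generic way to combine ranked lists; treat that choice as our assumption. Recency and mention frequency are not modelled here.

```python
import math
from collections import Counter, deque

# Toy memory: fact id -> (text, hand-made 3-d "embedding"). All values invented.
FACTS = {
    "f1": ("project x deadline moved to q3", [0.9, 0.1, 0.0]),
    "f2": ("alice leads project x", [0.6, 0.4, 0.0]),
    "f3": ("bob ordered coffee for the meeting", [0.0, 0.2, 0.9]),
}
# Toy graph: facts that share an entity are neighbors.
NEIGHBORS = {"f1": ["f2"], "f2": ["f1"], "f3": []}


def cosine_rank(query_vec):
    """Rank facts by cosine similarity between embeddings."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norms
    return sorted(FACTS, key=lambda f: cos(query_vec, FACTS[f][1]), reverse=True)


def bm25_rank(query, k1=1.5, b=0.75):
    """Rank facts by Okapi BM25 over whitespace tokens; drop zero-score facts."""
    docs = {f: text.split() for f, (text, _) in FACTS.items()}
    avg_len = sum(len(d) for d in docs.values()) / len(docs)

    def score(tokens):
        tf, total = Counter(tokens), 0.0
        for term in query.split():
            n = sum(term in d for d in docs.values())  # docs containing the term
            idf = math.log((len(docs) - n + 0.5) / (n + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            total += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        return total

    scores = {f: score(d) for f, d in docs.items()}
    return sorted((f for f in scores if scores[f] > 0), key=scores.get, reverse=True)


def bfs_rank(seeds, max_depth=1):
    """Visit the seeds first, then their graph neighbors, breadth-first."""
    order, seen = list(seeds), set(seeds)
    queue = deque((s, 0) for s in seeds)
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for nxt in NEIGHBORS[node]:
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append((nxt, depth + 1))
    return order


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each list adds 1 / (k + rank) to a fact's score."""
    fused = Counter()
    for ranking in rankings:
        for rank, fact_id in enumerate(ranking, start=1):
            fused[fact_id] += 1 / (k + rank)
    return [fact_id for fact_id, _ in fused.most_common()]


semantic = cosine_rank([1.0, 0.0, 0.0])   # pretend query embedding
keyword = bm25_rank("project x deadline")
graph = bfs_rank(seeds=keyword[:1])       # expand from the best keyword hit
results = reciprocal_rank_fusion([semantic, keyword, graph])
print(results)  # ['f1', 'f2', 'f3']
```

The deadline fact comes out on top because all three searches agree on it, while the coffee fact only shows up (last) in the semantic list.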

Results speak louder than words

Zep was tested against two brutal benchmarks. In the Deep Memory Retrieval (DMR) benchmark, Zep outscored MemGPT by 1.4 percentage points (94.8% vs. 93.4%). But the real magic happened in the LongMemEval test, which mimics real-world enterprise scenarios:

Accuracy: Zep boosted scores by up to 18.5% for complex tasks like cross-session info synthesis.
Latency: Response times dropped by 90% compared to baseline systems.
Token Efficiency: Used 100x fewer tokens than full-context methods, saving costs.

For example, when asked, “What was the client’s feedback on the proposal we discussed three weeks ago?” Zep retrieved the exact conversation snippet and linked it to related emails, all in seconds.
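
A quick sanity check on the numbers above: the DMR gap is a difference in percentage points, which is not the same thing as a relative gain. The short sketch below works through the arithmetic. The 100,000-token prompt is a made-up figure, used only to show what "100x fewer tokens" would mean in practice.

```python
zep, memgpt = 94.8, 93.4  # DMR accuracy in percent, as quoted above

point_gap = round(zep - memgpt, 1)                       # absolute gap, in points
relative_gain = round((zep - memgpt) / memgpt * 100, 1)  # relative gain, in percent

full_context_tokens = 100_000             # hypothetical full-context prompt size
zep_tokens = full_context_tokens // 100   # "100x fewer tokens"

print(point_gap)      # 1.4
print(relative_gain)  # 1.5
print(zep_tokens)     # 1000
```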

How does this matter?

Zep sets a new standard for dynamic memory in AI agents by seamlessly integrating structured and unstructured knowledge with superior temporal reasoning. The architecture’s hierarchical design and advanced retrieval methods make it a robust solution for real-world applications:

Customer Service: Agents that remember past interactions, reducing repetitive questions.
Healthcare: AI assistants tracking patient histories, drug interactions, and appointment changes.
Legal Tools: Systems that recall case details, contract clauses, and deadlines without manual input.
Enterprise Assistants: Assistants that synthesize quarterly reports by connecting Slack debates, email threads, and CRM updates.

Zep isn’t perfect—yet:

Scalability: Struggles with ultra-rare or never-before-seen attack patterns.
Compute Costs: Dynamic graph updates require heavy resources (though 90% latency cuts help).
Niche Languages: Works best in English; performance dips in low-resource languages like Swahili.

But here’s the kicker: Zep is already deployed in production systems, proving its enterprise-ready chops.

Zep signals a shift from static RAG to adaptive memory ecosystems. Just as humans blend facts, events, and context, AI agents now can too. The future isn’t just about answering questions—it’s about understanding history to predict what’s next.

Would you trust an AI with a memory like Zep to handle your team’s projects, patients’ records, or customer relationships? Share your thoughts with Spark & Trouble…

Wish to dive deeper & try Zep for yourself?

➤ Check out the full research paper (appendix includes prompts for entity extraction and temporal validation)
➤ Play around with Zep’s SDKs for Python, TypeScript or Go

10x Your Workflow with AI 📈

Work smarter, not harder! In this section, you’ll find prompt templates 📜 & bleeding-edge AI tools ⚙️ to free up your time.

Fresh Prompt Alert!🚨

Ever felt like you’re drowning in user feedback, trying to decode what your customers really want? Yeah, us too.

That’s why this week’s Fresh Prompt Alert is your life raft!

Dive into the latest user chatter across multiple platforms to uncover hidden gems behind user behavior. Consider this your backstage pass to understanding your audience like never before.

Ready to play detective? Let’s go! 👇

Search & analyze the most recent user discussions & reviews about [product] from G2, Capterra, Reddit, ProductHunt & TrustRadius, and think deeply about:

What are the primary questions customers might have?
What are the trust signals that might be missing?
Where might users get confused?
What are the deeper motivations behind user behaviors?

* Replace the content in brackets with your details & use LLMs with Reasoning + Web Search capabilities (like ChatGPT, DeepSeek, Kimi, Copilot or Gemini)
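
If you plan to reuse this prompt for several products, a tiny helper saves the copy-pasting. This is just a convenience sketch: the `build_prompt` function and `PROMPT_TEMPLATE` name are ours, and the code only builds the text, so you still paste the result into your LLM of choice.

```python
PROMPT_TEMPLATE = (
    "Search & analyze the most recent user discussions & reviews about {product} "
    "from G2, Capterra, Reddit, ProductHunt & TrustRadius, and think deeply about:\n"
    "- What are the primary questions customers might have?\n"
    "- What are the trust signals that might be missing?\n"
    "- Where might users get confused?\n"
    "- What are the deeper motivations behind user behaviors?"
)


def build_prompt(product: str) -> str:
    """Fill the bracketed slot of the prompt with a product name."""
    return PROMPT_TEMPLATE.format(product=product)


prompt = build_prompt("Microsoft Copilot Agents")
print(prompt)
```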

We tried the same prompt for “Microsoft Copilot Agents”, and here are the results:

5 AI Tools You JUST Can't Miss 🤩

🔍 Remy: Ask questions & get answers from the world’s videos
🎼 Beatoven AI: Create unique background music that you can call your own
🖥️ Reweb: The AI visual builder for Next.js & Tailwind
🗣️ Talo: Real-time AI translator for video calls
🖼️ Readdy: Transform your idea into beautiful design with code in seconds

Spark 'n' Trouble Shenanigans 😜

Ever wish coding felt more like vibing and less like debugging? 🤔 Well, buckle up—Andrej Karpathy just introduced us to “vibe coding,” and Spark is already hyped... while Trouble is side-eyeing the error logs. 😆

There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper… x.com/i/web/status/1…
— Andrej Karpathy (@karpathy)
11:17 PM • Feb 2, 2025

Imagine building projects by just describing your ideas—no tedious syntax, no endless loops of trial and error. Tools like Cursor, Replit, and Bolt are turning your vibes into fully functional code!

But here’s the spicy part: Not everyone’s a fan.
Traditional devs say it’s messy.
But hey, when Zuck, Nadella, and Sam Altman are all betting on AI-led innovation, you know the shift is real.

So, what do you think—Team Vibes or Team Old-School?

This meme humorously captures how the complexity of technical interviews contrasts with the simplicity of real-world tasks over time, especially in the AI and coding space

Well, that’s a wrap!
Thanks for reading 😊

See you next week with more mind-blowing tech insights 💻

Until then,
Stay Curious🧠 Stay Awesome🤩

PS: Do catch us on LinkedIn - Sandra & Tezan
