We built a GTM agent that connects to 79 sales and marketing providers. Shipped it to Slack. Spent 80% of our time on Slack formatting bugs.
The agent logic worked fine. The chat layer took four weeks of edge cases.
This post covers the architecture, what worked, what broke, and what we'd do differently if we rebuilt it today.
Architecture
We ran two implementations in parallel:
- LangGraph agent - Deep Agents framework with LangChain tools. Calls the Deepline HTTP API directly.
- Managed agent - Anthropic's sandbox with CLI tool execution. Runs `deepline tools execute <tool_id> --payload '{...}'` via subprocess.
Both use the same Deepline tool catalog. The managed agent approach meant zero custom tool definitions - every tool in Deepline works automatically.
Slack Events API → FastAPI (Railway) → Claude Agent → Deepline API (79 providers)
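The front of that pipeline can be sketched as a pure routing function (the FastAPI route just awaits `request.json()` and calls it). The `url_verification` handshake and the ~3-second ack deadline are standard Slack Events API behavior; `run_agent` is a hypothetical dispatch:

```python
def handle_slack_envelope(body: dict) -> dict:
    """Route one Slack Events API envelope.

    Kept as a pure function so it is testable without a server;
    the FastAPI route just awaits request.json() and calls this.
    """
    # One-time URL verification handshake when the endpoint is registered.
    if body.get("type") == "url_verification":
        return {"challenge": body["challenge"]}
    # Normal event callback: ack fast (Slack retries if it gets no
    # response in ~3s) and hand the inner event to a background task.
    event = body.get("event", {})
    if event.get("type") == "app_mention":
        pass  # e.g. asyncio.create_task(run_agent(event))  -- hypothetical
    return {"ok": True}
```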
What worked
Claude Code patterns transferred
Waterfall enrichment, tool catalogs, provider playbooks - all worked out of the box. The skill docs we'd built for Claude Code gave the agent exactly the context it needed.
Key patterns that carried over:
- Full tool catalog embedded in the `deepline_call` tool description
- Waterfall functions that exhaust multiple providers before failing
- Structured output formats (person cards, company cards)
The sandbox model
Real filesystem, CLI access, persistent state. We upload the Deepline binary at session start. Every tool works automatically with zero custom definitions.
The agent just runs CLI commands and parses JSON output. No complex tool schemas to maintain.
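A sketch of that loop, using the command shape from above; the JSON result envelope and error handling are illustrative assumptions:

```python
import json
import subprocess

def build_command(tool_id: str, payload: dict) -> list[str]:
    # Command shape from the managed-agent setup above.
    return ["deepline", "tools", "execute", tool_id,
            "--payload", json.dumps(payload)]

def deepline_execute(tool_id: str, payload: dict) -> dict:
    # Run the CLI tool and parse whatever JSON it prints.
    proc = subprocess.run(build_command(tool_id, payload),
                          capture_output=True, text=True, timeout=120)
    if proc.returncode != 0:
        raise RuntimeError(f"{tool_id} failed: {proc.stderr[:500]}")
    return json.loads(proc.stdout)
```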
Skill docs eliminate hallucination
300KB of context about provider patterns and exact tool IDs, fetched from CDN at startup:
```python
SKILLS_BASE = "https://code.deepline.com/.well-known/skills/gtm-meta-skill"
CORE_SKILL_DOCS = [
    "SKILL.md",
    "finding-companies-and-contacts.md",
    "provider-playbooks/apollo.md",
    "provider-playbooks/hubspot.md",
]
```
With this context, the agent picks the right tools and constructs valid payloads without hallucinating field names.
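The startup fetch itself is a few lines. This sketch uses stdlib `urllib` and leaves out caching and error handling; the URL-joining helper is split out only to keep it testable:

```python
from urllib.request import urlopen

def skill_doc_urls(base: str, names: list[str]) -> list[str]:
    return [f"{base}/{name}" for name in names]

def load_skill_docs(base: str, names: list[str]) -> str:
    # Fetch each doc from the CDN and join them into one context blob.
    parts = []
    for url in skill_doc_urls(base, names):
        with urlopen(url) as resp:
            parts.append(resp.read().decode("utf-8"))
    return "\n\n---\n\n".join(parts)
```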
What broke
Slack formatting (80% of bugs)
We spent more time debugging mrkdwn than agent logic.
Two markdown converters with different behavior. The LangGraph version and managed agent version produced inconsistent output.
Headers glued to text. The agent emits multiple text blocks. When they concatenate: `Done.## Next Steps` instead of proper separation.
Verbose narration. Despite prompting for concise responses: "I'll search for..." before every action. Then the actual search. Then "I found..." Then the results.
Markdown tables don't render in Slack. We built a converter, but it was brittle.
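For reference, the conversions that bit us most, as a deliberately tiny sketch. Slack's mrkdwn uses single asterisks for bold, `<url|text>` links, and has no heading or table syntax; a real converter also has to handle nesting and code fences, which is exactly why ours was brittle:

```python
import re

def md_to_mrkdwn(text: str) -> str:
    """Convert a few common Markdown constructs to Slack mrkdwn."""
    # **bold** -> *bold*  (Slack uses single asterisks for bold)
    text = re.sub(r"\*\*(.+?)\*\*", r"*\1*", text)
    # [text](url) -> <url|text>
    text = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r"<\2|\1>", text)
    # "## Heading" -> bold line (Slack has no heading syntax)
    text = re.sub(r"^#{1,6}\s*(.+)$", r"*\1*", text, flags=re.MULTILINE)
    return text
```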
Memory leak in dedup cache
We cleared the whole cache at 10K entries:
```python
if len(_seen_event_ids) > 10_000:
    _seen_event_ids.clear()  # Wrong: every in-flight event looks "new" again
```

The brief window after a clear meant duplicate processing whenever Slack retried an event. Fixed with LRU eviction:
```python
from collections import OrderedDict

_seen_event_ids: OrderedDict[str, None] = OrderedDict()
_MAX_SEEN_EVENTS = 5000

def _mark_event_seen(event_id: str) -> bool:
    """Return True if this event was already processed."""
    if event_id in _seen_event_ids:
        _seen_event_ids.move_to_end(event_id)  # refresh recency
        return True
    _seen_event_ids[event_id] = None
    while len(_seen_event_ids) > _MAX_SEEN_EVENTS:
        _seen_event_ids.popitem(last=False)  # evict least-recently-seen
    return False
```
Blocking I/O
Sync tool functions were called from async FastAPI handlers. Every `deepline_execute()` call blocked the event loop, so concurrent requests queued behind each other.
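The fix was mechanical: push each blocking call onto a worker thread with `asyncio.to_thread` so the event loop stays free. The sync function here is a stand-in for the real CLI/HTTP call:

```python
import asyncio
import time

def deepline_execute(tool_id: str, payload: dict) -> dict:
    # Stand-in for the real blocking call (HTTP request / subprocess).
    time.sleep(0.01)
    return {"tool": tool_id, "ok": True}

async def deepline_execute_async(tool_id: str, payload: dict) -> dict:
    # Run the blocking call on a worker thread so the event loop
    # keeps serving other Slack events concurrently.
    return await asyncio.to_thread(deepline_execute, tool_id, payload)
```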
No timeouts
A slow provider meant a hung session. We added a 120s timeout and retry logic for read-only operations.
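A sketch of that policy, assuming a CLI-style call: a hard timeout on every attempt, and retries only for read-only operations, since writes are not safely retryable:

```python
import subprocess

def run_with_timeout(cmd: list[str], attempts: int = 3,
                     timeout_s: int = 120, read_only: bool = True) -> str:
    """Run a tool call with a hard timeout; retry only if read-only."""
    last_err = None
    for _ in range(attempts if read_only else 1):
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True,
                                  timeout=timeout_s, check=True)
            return proc.stdout
        except (subprocess.TimeoutExpired, subprocess.CalledProcessError) as e:
            last_err = e
    raise RuntimeError(f"tool call failed after retries: {last_err}")
```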
What we'd do differently
Use Vercel AI SDK for chat
The Slack bot was 40% of the codebase and 80% of the bugs. Streaming, reconnection, formatting - all solved problems we rebuilt from scratch.
If we rebuilt today:
- Keep the managed agent core (it works)
- Replace custom Slack bot with AI SDK + thin adapter
Agents should call tools and reason. Chat layers should present output. We mixed them, and the coupling made both harder to maintain.
Opus for planning
Sonnet handles simple enrichments fine. Complex workflows (build TAM, research 20 companies, add to Instantly) need Opus - it plans better and makes fewer tool-call mistakes.
Structured output for data
Free-form markdown for GTM data is fragile. JSON for data, markdown for summaries.
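One way to split it, with illustrative field names: the agent returns a JSON card, and only the chat layer decides how it looks:

```python
import json

def render_person_card(card_json: str) -> str:
    """Agent emits data as JSON; the chat layer owns presentation.

    Field names here are illustrative, not Deepline's actual schema.
    """
    card = json.loads(card_json)
    lines = [f"*{card['name']}* - {card.get('title', 'Unknown title')}"]
    if card.get("company"):
        lines.append(f"Company: {card['company']}")
    if card.get("email"):
        lines.append(f"Email: {card['email']}")
    return "\n".join(lines)
```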
Cost tracking
We had no visibility into credit consumption per request. Users burned quota on expensive waterfalls without knowing until their balance hit zero.
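A minimal version of what we wish we'd had: accumulate cost per request as tool results come back. The `credits` field on tool results is hypothetical; Deepline's actual response shape may differ:

```python
from collections import defaultdict

class CreditTracker:
    """Per-request credit accounting (sketch)."""

    def __init__(self):
        self.by_request = defaultdict(float)

    def record(self, request_id: str, tool_result: dict) -> None:
        # Assumes each tool result reports its own "credits" cost.
        self.by_request[request_id] += float(tool_result.get("credits", 0))

    def total(self, request_id: str) -> float:
        return self.by_request[request_id]
```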
Cost patterns from Claude Code
We borrowed patterns from Claude Code's open source:
Truncate large results (presentation layer only):
```python
MAX_TOOL_RESULT_CHARS = 8000

def truncate_tool_result(result):
    # Truncate only what goes back to the model; the full result is
    # kept elsewhere. Over-limit results keep just their first 2000 chars.
    if isinstance(result, str) and len(result) > MAX_TOOL_RESULT_CHARS:
        return f"{result[:2000]}\n\n... ({len(result)} chars total)"
    return result
```
Short error stacks:
```python
import traceback

def short_error_stack(e: BaseException, max_frames: int = 5) -> str:
    # 5 frames is enough to locate the failure. Save tokens.
    frames = traceback.format_exception(type(e), e, e.__traceback__)
    return "".join(frames[-max_frames:])
```
Skill budget. Claude Code caps skill descriptions at 1% of context. We were embedding 300KB - way over budget.
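A budget check is simple to bolt on. This sketch counts characters rather than tokens, assumes docs are ordered most-important first, and the window size is an assumed default:

```python
def fit_skill_budget(docs: list[str], context_chars: int = 800_000,
                     budget_pct: float = 0.01) -> list[str]:
    """Keep skill docs within a fraction of the context window
    (the 1%-of-context rule mentioned above). A real version
    would count tokens, not characters."""
    budget = int(context_chars * budget_pct)
    kept, used = [], 0
    for doc in docs:  # docs assumed ordered by priority
        if used + len(doc) > budget:
            break
        kept.append(doc)
        used += len(doc)
    return kept
```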
Results after fixes
| Metric | Before | After |
|--------|--------|-------|
| Formatting bugs | 4-5/week | 0 |
| Duplicate events | ~1% | 0 |
| Timeout errors | ~2% | 0 |
| Response time | 12s | 8s |
The lesson
Agent logic was easy. Claude with good context and well-designed tools makes a capable agent.
Chat infrastructure was hard. Streaming, formatting, reconnection, error recovery - none of this is agent-specific. It's plumbing that every chat app needs.
Build the agent as a pure API. Use existing chat infrastructure for the UI.
Get started
Install Deepline to give your agent access to 79 GTM providers:
```bash
bash <(curl -sS https://code.deepline.com/api/v2/cli/install)
```
Then build your Slack bot using Vercel AI SDK and call Deepline tools via the CLI or HTTP API.
Run GTM workflows from Claude Code
Deepline connects to 79 providers. Enrich, validate, and sequence - all through natural language.