The job search tools that exist today have a fundamental architectural problem. They use a single AI model that tries to be a resume expert, an ATS (applicant tracking system) specialist, a job market analyst, and a cover letter writer simultaneously. The result is output that's mediocre at all of these things — a resume rewrite that uses the right keywords but sounds robotic, a cover letter that's generic despite being "personalized," and job match scores that don't reflect how a real human recruiter would assess the fit.

Multi-agent architecture solves this by assigning each of these responsibilities to a specialized agent with its own prompt, memory, and tool access. The agents share context and collaborate, but each is optimized for its specific task. The result is noticeably better than what a single model produces. This post walks through the architecture we built for MyJob and the technical decisions that shaped it.

What Multi-Agent Means in Practice

In a multi-agent system, an "agent" is an AI model with a specific role, a set of tools it can use, a defined output format, and optionally a memory system that persists relevant information across the conversation. Agents can pass information to each other — the output of one agent becomes the input of another — and an orchestrator (in our case, a CrewAI Crew object) manages the flow of information between agents.

This is different from chaining prompts, which is what most single-model job tools actually do. Prompt chaining is sequential: call the model with prompt A, get output, use that output in prompt B, get output, and so on. Multi-agent architecture allows for more sophisticated coordination patterns — parallel execution, conditional routing, memory sharing across agents, and agent-to-agent communication without going through a central orchestrator for every interaction.
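The distinction can be sketched in a few lines of plain Python. This is a toy illustration with a stubbed model call, not MyJob's code: a chained pipeline sees only the previous output, while orchestrated agents read and write a shared context.

```python
# Stub standing in for an LLM call; in the real system this would be a
# Gemini request. All names here are illustrative.
def fake_model(prompt: str) -> str:
    return f"output({prompt})"

def prompt_chain(prompts: list[str]) -> str:
    """Prompt chaining: each call sees only the previous call's output."""
    result = ""
    for p in prompts:
        result = fake_model(f"{p}: {result}")
    return result

class Agent:
    """A minimal agent: a role plus read/write access to shared context."""
    def __init__(self, role: str):
        self.role = role

    def run(self, context: dict) -> None:
        # Every agent can see everything produced so far, not just the
        # immediately previous output.
        prompt = f"[{self.role}] context={sorted(context)}"
        context[self.role] = fake_model(prompt)

def orchestrate(agents: list[Agent], job_posting: str) -> dict:
    """A trivial orchestrator: run agents in order over shared context."""
    context = {"job_posting": job_posting}
    for agent in agents:
        agent.run(context)
    return context

chained = prompt_chain(["analyze the posting", "audit the resume"])
ctx = orchestrate([Agent("analyzer"), Agent("auditor")], "Senior Backend Engineer")
```

In the chained version, the auditor prompt carries only the analyzer's text; in the orchestrated version, `ctx` accumulates one entry per agent role, and later agents can reference any earlier output by name.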

MyJob's Agent Architecture

MyJob uses four specialized agents orchestrated by CrewAI:

1. JobAnalyzer

The JobAnalyzer reads the raw job description and produces a structured analysis. Its output is a JSON object containing: required skills (explicitly stated as required), preferred skills (stated as nice-to-have), years of experience requirements, seniority indicators, cultural signals extracted from the language and tone of the posting, compensation information if available, and ATS-relevant keywords that the recruiter's system is likely scanning for. The JobAnalyzer is explicitly instructed not to infer or assume — if something isn't clearly stated in the job description, it marks the field as null rather than guessing. This prevents downstream agents from optimizing for requirements that may not actually exist.
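The shape of that output looks roughly like the following. This is an illustrative sketch, not MyJob's exact schema — the field names are assumptions — but it shows the key rule: anything the posting does not clearly state is null, never guessed.

```python
import json

# Illustrative JobAnalyzer output for a hypothetical backend role.
analysis = {
    "required_skills": ["Python", "PostgreSQL"],
    "preferred_skills": ["Kubernetes"],
    "years_experience": 5,
    "seniority_indicators": ["owns roadmap", "mentors engineers"],
    "cultural_signals": ["fast-paced", "collaborative"],
    "compensation": None,  # not stated in the posting, so null — never inferred
    "ats_keywords": ["Python", "PostgreSQL", "REST", "CI/CD"],
}

# Downstream agents consume this as JSON.
payload = json.dumps(analysis, indent=2)
```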

2. ResumeAuditor

The ResumeAuditor receives both the applicant's resume and the JobAnalyzer's structured output. It produces a gap analysis: which required skills are clearly evidenced in the resume, which are absent, which are partially evidenced (the applicant has related but not identical experience), and how the resume's seniority signals compare to what the job description implies. The ResumeAuditor also scores ATS compatibility — estimating whether the resume's language and formatting are likely to pass through an ATS before reaching a human reviewer. This audit is the foundation the optimization agent builds on.

3. ResumeOptimizer

The ResumeOptimizer receives the original resume and the ResumeAuditor's gap analysis. Its task is to rewrite specific sections of the resume to better target the identified gaps, integrate the relevant keywords naturally (not as a keyword dump), and strengthen the evidence for skills that are present but undersold. Crucially, the ResumeOptimizer has a hard constraint: it cannot fabricate experience. If a required skill is genuinely absent from the applicant's background, it must mark that gap as unaddressable rather than inventing something. This constraint is implemented both in the prompt and as a validation step in the crew workflow.
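A minimal sketch of what that validation step might look like, under assumed names (the function, statuses, and data shapes are illustrative, not MyJob's code): after the optimizer runs, a deterministic check rejects any output that claims a skill the audit marked absent, and surfaces the absent skills as the "unaddressable gaps" list.

```python
def validate_no_fabrication(optimized_claims, gap_analysis):
    """Reject an optimized resume that claims skills the audit marked absent.

    optimized_claims: skills the rewritten resume now evidences.
    gap_analysis: maps each required skill to "evidenced", "partial",
    or "absent".
    """
    absent = {s for s, status in gap_analysis.items() if status == "absent"}
    fabricated = set(optimized_claims) & absent
    if fabricated:
        raise ValueError(f"optimizer fabricated experience for: {sorted(fabricated)}")
    # Absent skills survive as the "unaddressable gaps" list passed downstream.
    return sorted(absent)

gaps = {"Python": "evidenced", "Terraform": "absent", "SQL": "partial"}
unaddressable = validate_no_fabrication({"Python", "SQL"}, gaps)
# unaddressable == ["Terraform"]
```

Running the check in code rather than relying on the prompt alone means a fabrication slips through only if both layers fail.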

4. CoverLetterWriter

The CoverLetterWriter has access to all three previous agents' outputs — the job analysis, the gap audit, and the optimized resume — plus the applicant's preferences for tone (formal vs. conversational) and any specific points they want to emphasize. This information density is what produces a genuinely personalized cover letter. The agent knows exactly what the job requires, exactly how the applicant's background matches and doesn't match, and exactly how the resume has been tailored. The resulting letter references specific overlaps between the applicant's experience and the role's requirements, acknowledges any significant gaps honestly, and makes a case for why the applicant is worth interviewing despite incomplete matches.

CrewAI vs. LangGraph: When to Use Each

We evaluated both frameworks seriously before choosing CrewAI for MyJob. The key differentiator is the mental model each framework imposes on your system design.

CrewAI is optimized for role-based agent teams. You define agents as roles — "you are a resume auditing specialist with 10 years of experience in ATS systems" — and the framework handles the orchestration of how these roles interact. The API is high-level and readable. The trade-off is that CrewAI is less flexible for complex conditional workflows — if you need agents to make decisions about which other agents to call, CrewAI becomes awkward.
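To make the role-based style concrete, a single auditor agent might be declared like this — a minimal sketch assuming the `crewai` package, with placeholder role text and task descriptions rather than MyJob's actual prompts:

```python
from crewai import Agent, Task, Crew, Process

# Roles are declared as natural-language descriptions; CrewAI handles
# the orchestration between them. All text below is illustrative.
auditor = Agent(
    role="Resume auditing specialist",
    goal="Produce a skill-by-skill gap analysis of a resume against a job analysis",
    backstory="You have a decade of experience with ATS systems and recruiting.",
)

audit_task = Task(
    description="Compare the resume to the job analysis and classify each "
                "required skill as evidenced, partial, or absent.",
    expected_output="A JSON gap analysis keyed by required skill.",
    agent=auditor,
)

crew = Crew(agents=[auditor], tasks=[audit_task], process=Process.sequential)
# Running the crew makes live model calls:
# result = crew.kickoff(inputs={"resume": resume_text, "job_analysis": analysis_json})
```

Note how little orchestration code there is — the framework's value is that agent definitions read like job descriptions, which is exactly why it fits a role-team system and fights you on conditional routing.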

LangGraph is optimized for stateful workflows with complex branching. You define your agent system as a graph where nodes are functions (or agents) and edges represent transitions between states. It gives you precise control over what information flows where and when, and it handles conditional routing naturally. The trade-off is that LangGraph is more verbose and requires more upfront design work. For MyJob's sequential-with-handoff pattern, CrewAI was the right choice. For our AI Insights Generator, which requires more dynamic routing based on the type of research request, we used LangGraph.

Technical Decisions

We chose Google Gemini over GPT-4 for three reasons: cost (Gemini is significantly cheaper at scale), speed (critical for the 5-second resume tailoring target), and Vertex AI integration (which simplifies deployment on Google Cloud Run). The agents are all running on the same model — agent specialization comes from the prompts and constraints, not from using different model architectures.

Agent memory in MyJob is session-scoped. The agents share context within a single job application session, but we don't persist memory across sessions for privacy reasons. This means each job application starts fresh — the agents don't remember previous resumes or applications. For a future version, an opt-in memory system that retains the applicant's verified skills and experience could significantly improve performance, but we chose not to build it in v1 due to the privacy implications.

Production Challenges

Rate limiting was the most significant operational challenge. When multiple users run the full four-agent workflow simultaneously, the Gemini API calls can stack up quickly. We implemented a token bucket rate limiter with queue management so that requests back up gracefully rather than failing noisily when the rate limit is hit. This added about two weeks to the development cycle and is the kind of production concern that's easy to ignore during prototyping but critical at any non-trivial scale.
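The core of that pattern can be sketched in pure Python. This is a simplified, single-threaded illustration of a token bucket with a waiting queue — not MyJob's production limiter, which also has to be safe under concurrency:

```python
import collections
import time

class TokenBucket:
    """Token bucket: refills at `rate` tokens/sec up to `capacity`.

    Requests that arrive when the bucket is empty wait in a queue
    instead of failing, so bursts degrade into latency, not errors.
    """
    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.clock = clock
        self.last = clock()
        self.queue = collections.deque()  # requests waiting for tokens

    def _refill(self):
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def submit(self, request):
        """Queue a request, then dispatch as many waiting requests as tokens allow."""
        self.queue.append(request)
        return self.drain()

    def drain(self):
        """Dispatch queued requests while tokens remain; the rest keep waiting."""
        self._refill()
        dispatched = []
        while self.queue and self.tokens >= 1:
            self.tokens -= 1
            dispatched.append(self.queue.popleft())
        return dispatched

# With capacity 2, the third burst request queues instead of erroring.
bucket = TokenBucket(rate=1.0, capacity=2)
sent = [bucket.submit(i) for i in (1, 2, 3)]
```

In production the `drain` step would run on a timer or worker loop so queued requests are retried as tokens refill.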

Agent disagreement is a subtler problem. Occasionally the ResumeAuditor would identify a gap that the ResumeOptimizer couldn't address (because the experience genuinely doesn't exist), but the CoverLetterWriter would still write confidently about that skill because the original resume mentioned something adjacent to it. We solved this by having the CoverLetterWriter explicitly receive the "unaddressable gaps" list from the optimization step and include a constraint that it cannot make claims about those skills. Proper information passing between agents — not just output, but structured metadata about the output — is underappreciated in most multi-agent tutorials.
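The fix amounts to making the metadata part of the downstream prompt. A hedged sketch of that handoff, with assumed field names and a drastically shortened prompt relative to the real system:

```python
def cover_letter_prompt(job_analysis, audit, unaddressable_gaps, tone="formal"):
    """Build the CoverLetterWriter prompt from upstream structured outputs.

    The key move: pass `unaddressable_gaps` as an explicit hard constraint
    rather than hoping the model infers it from the resume text.
    """
    forbidden = ", ".join(sorted(unaddressable_gaps)) or "none"
    return (
        f"Write a {tone} cover letter.\n"
        f"Job requirements: {job_analysis['required_skills']}\n"
        f"Evidenced skills: {audit['evidenced']}\n"
        f"HARD CONSTRAINT: make no claims of experience with: {forbidden}. "
        f"Acknowledge these gaps honestly instead."
    )

prompt = cover_letter_prompt(
    {"required_skills": ["Python", "Terraform"]},
    {"evidenced": ["Python"]},
    unaddressable_gaps=["Terraform"],
)
```

Because the gap list comes from the validation step rather than from the model's own reading of the resume, the writer can no longer "rediscover" an adjacent skill and claim it.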

Lessons Learned

The most important lesson: agent specialization consistently produces better output than one super-agent, but it adds orchestration complexity that you have to be prepared to manage. Start with two agents and add more only when you've clearly hit a limitation that a third agent would solve. We added the ResumeAuditor as a distinct step after noticing that the ResumeOptimizer was making poor decisions when it had to simultaneously analyze and rewrite — separating the audit from the optimization improved output quality measurably.

The second lesson: information schema design between agents matters as much as the agents themselves. The quality of the ResumeOptimizer's output is entirely dependent on the quality of the ResumeAuditor's gap analysis. Define these schemas explicitly, validate them, and iterate on them before you iterate on the agent prompts.
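One lightweight way to make such a schema explicit is a typed container with a validation hook. A sketch, with assumed field names (MyJob's actual handoff schema is not shown in this post):

```python
from dataclasses import dataclass, field

VALID_STATUSES = {"evidenced", "partial", "absent"}

@dataclass
class GapAnalysis:
    """Explicit schema for the ResumeAuditor -> ResumeOptimizer handoff."""
    skills: dict[str, str]          # required skill -> status
    ats_score: float                # 0.0-1.0 estimated ATS pass likelihood
    notes: list[str] = field(default_factory=list)

    def validate(self) -> "GapAnalysis":
        bad = {s: st for s, st in self.skills.items() if st not in VALID_STATUSES}
        if bad:
            raise ValueError(f"invalid skill statuses: {bad}")
        if not 0.0 <= self.ats_score <= 1.0:
            raise ValueError(f"ats_score out of range: {self.ats_score}")
        return self

audit = GapAnalysis(
    skills={"Python": "evidenced", "Go": "absent"},
    ats_score=0.72,
).validate()
```

Validating at the boundary means a malformed auditor output fails loudly at the handoff instead of silently degrading the optimizer's rewrite.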

MyJob is live at jsong.ai-biz.app/tools/myjob/. The AI Insights Generator at jsong.ai-biz.app/tools/ai-insights-generator/ is another production CrewAI implementation worth examining for a different pattern — research-oriented multi-agent systems with confidence scoring.