Mechanics Sidekick — Mark Bohaychuk

Overview

It started as a command-line RAG tool. I had a pile of PDF service manuals for an old car, none of them organized into one clean book, and finding a torque spec or a wiring diagram meant opening five PDFs and Ctrl-F'ing each one. I wanted to ask the manuals questions instead. That core still exists — but the project has since grown into a web app that plugs into the car itself. Mechanics Sidekick now reads live data straight off the OBD-II port, and the chat is an agent: when I ask a question, an LLM decides which tools to call — search the manuals, read live sensor values and trouble codes from the scanner, look up recalls on the web — and chains them together. Answers come back with citations to the exact document and page so I can go verify them.

Demo

A full diagnostic on a 2015 Ford F-150 5.0L. Live telemetry streams off the OBD-II reader (RPM, coolant temp, engine load), then I ask the agent to pull the trouble codes. It reads P0706/P0707 from the scanner, grounds them in the truck's service manual (Transmission Range Sensor circuit, with page citations), and surfaces the matching NHTSA recall, 26V237000, where a lost signal between the range sensor and the PCM can drop the transmission into second gear. That's one question pulling in three different tools, with every claim cited back to its source.

Technology Stack

Frontend

Vue 3 + TypeScript
Vite
Tailwind CSS v4
Pinia + vue-router

Backend & Agent

Python · FastAPI
Uvicorn (SSE streaming)
OpenAI tool-calling loop
MCP Python SDK (client)

AI & Data

gpt-5.4
text-embedding-3-small
SQLite + SQLAlchemy 2.0
PyMuPDF · Tavily

Architecture & Design Choices

The chat used to be a single RAG call. Now it's a tool-calling loop. Each turn the model picks from a set of tools and can chain several before it answers: search_manuals (the RAG pipeline), the live OBD tools, a web search for recalls and bulletins, and a lookup of past diagnostic reports. The OBD tools come from a separate project, the OBD-II MCP Server, which Mechanics Sidekick connects to as a standard MCP client over stdio. Anything the server marks destructive (like clearing codes) is dropped automatically, and a short denylist removes a few more, so the assistant stays read-only against the car.

The original app was the RAG pipeline, and that pipeline is now just one tool the agent reaches for. The retrieval is still the part I'm most proud of. Naive chunking on a service manual destroys context — a torque value lifted from the middle of a procedure loses which engine and which system it belonged to. So chunking is structure-aware: the PDF extractor reads font metadata (bold, ALL CAPS, size jumps) to detect section headings, and before each chunk is embedded a short LLM-generated summary situates it in the manual. The embedding represents the chunk plus its place, not just the raw words. A cross-encoder reranker can re-score the top candidates against that same signal, but it's off by default and kicks in only for procedure-style questions; exact spec and code lookups skip it, since reranking tends to hurt those.

v1 ran fully offline against Ollama on my own machine. I rebuilt it cloud-first on OpenAI (gpt-5.4 for chat, text-embedding-3-small for embeddings) because the tool-calling and answer quality were a clear step up for this kind of work. The model layer is kept behind a seam, with the Ollama path still in the code, so a local-only mode can come back later. But today it talks to the cloud.

What Works Today

Add a vehicle, upload its service manuals, and chat. With an OBD-II adapter plugged in, the live dashboard streams sensor values over SSE and the agent can read trouble codes and freeze-frame data straight from the car. Responses stream token by token, with tool activity shown as it happens and manual citations inline. A separate guided-diagnostic mode walks the car through a short protocol and saves a health report I can pull up later. I've run the full loop on a real truck, a 2015 F-150, from live telemetry to a cited answer to a matching recall. If no scanner is connected, the assistant still works against the manuals and the web; the live views just prompt you to connect one.