Frequently asked questions

How DevGhost estimates effort, what Ghost% means, and how to use it responsibly.

Are you monitoring developers? Where do the hours come from?

No. There is no time tracking, no screen capture, and no keystroke logging. We analyze only the code changes themselves and estimate their cognitive difficulty in hours of work for a reference developer. It's a yardstick, not a timesheet.

What does the "estimate in hours" mean?

The time the change would take a mid-level developer (3–4 years of experience) who knows the codebase and works without AI. It measures the difficulty of the work, not the number of lines and not the actual time spent at the desk. It covers writing code, manual testing, and review fixes; it excludes meetings, planning, and waiting for review.

How exactly do you estimate effort?

It's not "one call to a neural network" but a multi-stage pipeline in which AI is only one layer. First, a model reads the code changes themselves (what actually changed) and judges their cognitive difficulty for a reference developer, rather than counting lines or commits. On top of the model runs a deterministic algorithmic layer: the system classifies the nature of each change, separately recognizes high-stakes work (for example infrastructure, data migrations, security), filters out mechanical and generated changes (mass find-replace, generated and moved code, formatting), and applies correction rules and guardrails so that a single model guess can't swing the result. Large and combined commits are analyzed in more detail. The same standard is applied to everyone automatically, and each commit is evaluated once and its result is then fixed, which is what makes the numbers comparable and reproducible.
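To make the layering concrete, here is a minimal Python sketch of the idea, not DevGhost's actual implementation. Everything in it is an assumption made for illustration: the `Commit` and `Estimate` types, the category names, the 40-hour cap, and the choice to count mechanical changes as zero hours. The model layer is represented only by a raw `model_hours` number.

```python
from dataclasses import dataclass

# Hypothetical categories and limits, chosen only for this illustration.
MECHANICAL_KINDS = {"formatting", "generated", "moved", "find-replace"}
HIGH_STAKES_KINDS = {"infrastructure", "data-migration", "security"}
MAX_HOURS_PER_COMMIT = 40.0  # assumed guardrail against a runaway model guess


@dataclass(frozen=True)
class Commit:
    sha: str
    kind: str           # result of the (assumed) classification step
    model_hours: float  # raw difficulty guess from the model layer


@dataclass(frozen=True)
class Estimate:
    hours: float
    high_stakes: bool


# Each commit is evaluated once; later calls return the stored result.
_fixed_results: dict[str, Estimate] = {}


def estimate(commit: Commit) -> Estimate:
    if commit.sha in _fixed_results:
        return _fixed_results[commit.sha]
    if commit.kind in MECHANICAL_KINDS:
        hours = 0.0  # assumption: filtered changes contribute nothing
    else:
        # Guardrail: clamp the model's guess into a sane range.
        hours = min(max(commit.model_hours, 0.0), MAX_HOURS_PER_COMMIT)
    result = Estimate(hours=hours, high_stakes=commit.kind in HIGH_STAKES_KINDS)
    _fixed_results[commit.sha] = result
    return result


print(estimate(Commit("a1", "formatting", 6.0)).hours)  # 0.0
print(estimate(Commit("b2", "feature", 90.0)).hours)    # 40.0
```

Because results are stored by commit hash, re-running the estimate for the same commit returns the same number, which is the property the answer above describes as reproducibility.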

What experience and data are the methodology built on?

It grew out of real-world enterprise development: the algorithmic layer encodes empirical patterns gathered on real projects — which changes are usually more expensive than they look, and which are cheap despite their size. These rules are checked against real reference estimates (calibration). So the system behaves more like an experienced technical lead assessing work than a simple line counter.

My team uses AI. Does that break the metric?

On the contrary — that's the whole point. We compare your team against a reference developer who works without AI; if AI lets you deliver more per day, Ghost% goes up, and that gap from the "pre-AI norm" is exactly what the product shows. It's not a distortion — it's the result.

What is Ghost% and how do I read it?

The ratio of your daily output to the output of the reference developer. 100% is on par with the reference, higher means you deliver more per day, lower means less. It's not hours and not overtime: a high number doesn't mean "burning out," and a low one by itself doesn't mean "weak."
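As a rough illustration, the ratio can be written out in a few lines of Python. The 8-hour reference day and the function name `ghost_percent` are assumptions made for this example; the exact formula is not spelled out here.

```python
REFERENCE_HOURS_PER_DAY = 8.0  # assumed daily output of the reference developer


def ghost_percent(estimated_hours: float, working_days: float) -> float:
    """Daily output relative to the reference developer, as a percentage."""
    if working_days <= 0:
        raise ValueError("working_days must be positive")
    daily_output = estimated_hours / working_days
    return 100.0 * daily_output / REFERENCE_HOURS_PER_DAY


# 120 estimated hours delivered over 10 working days:
print(ghost_percent(estimated_hours=120, working_days=10))  # 150.0
```

Under these assumptions, 80 estimated hours over the same 10 days would give exactly 100%, meaning on par with the reference.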

How much can I trust it?

It's a model, not a measurement. No one can reconstruct the time actually spent, so the value lies in applying one set of rules to everyone: the metric is strong for trends and comparisons, but not accurate to the hour for a single person. It's a tool for asking better questions, not for passing verdicts.

Can the metric be gamed by splitting or combining commits?

Splitting and combining commits don't move it meaningfully — what's evaluated is the substance and difficulty of the changes, not the number of commits or lines. More importantly: any metric people are targeted on directly eventually gets optimized instead of the work. So use it as a team signal and trend, not as a personal KPI — then there's nothing to game.

The numbers for a person don't match my impression. Why?

The system sees code, not the whole role: design, reviews, mentoring, planning and meetings aren't in the estimate. A discrepancy often means much of a person's value lives outside commits — which is itself worth noticing.

Does the system account for a person not being busy with code alone?

Not on its own: it sees only code and doesn't know a person's real role and workload (reviews, mentoring, meetings, support). Only the manager knows the full workload. That's what the Share parameter is for — the share of time an employee actually spends writing code (0–100%). By default it's 100% (we assume the person is fully on code); the manager lowers it manually to reflect non-coding work — this is where context the code doesn't contain enters the system. Then the comparison against the reference becomes fair for those who don't code all day too.
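One plausible way Share could enter the comparison is sketched below. This is an assumption made for illustration, not the documented formula: the sketch scales the days available for coding by Share and assumes an 8-hour reference day. The function and parameter names are hypothetical.

```python
from typing import Optional

REFERENCE_HOURS_PER_DAY = 8.0  # assumed daily output of the reference developer


def ghost_percent_with_share(
    estimated_hours: float,
    working_days: float,
    share_percent: float = 100.0,  # default: the person is fully on code
) -> Optional[float]:
    """Ghost% measured against the time a person actually has for coding."""
    if working_days <= 0:
        raise ValueError("working_days must be positive")
    if not 0.0 <= share_percent <= 100.0:
        raise ValueError("share_percent must be between 0 and 100")
    if share_percent == 0.0:
        return None  # no coding time expected, so the ratio is undefined
    coding_days = working_days * share_percent / 100.0
    return 100.0 * estimated_hours / (coding_days * REFERENCE_HOURS_PER_DAY)


# A lead who delivers 40 estimated hours in 10 days:
print(ghost_percent_with_share(40, 10))                    # 50.0
print(ghost_percent_with_share(40, 10, share_percent=50))  # 100.0
```

In this sketch the same output reads as 50% at the default Share and as 100% once the manager records that only half of the person's time goes to code.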

Can I use it for reviews, pay, or layoffs?

Not on its own. It's a team signal and a trend to start a conversation, not an individual verdict: one metric doesn't capture quality, impact, or context.

What do "cost" and "value" in money mean?

Cost is roughly what the delivered work cost at a standard rate; value is roughly what it would cost to reproduce that volume by hand, without AI. The gap between them is an approximate indicator of leverage (tooling/AI), not a profit-and-loss statement.
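A small worked example may help. The sketch below is one reading of these definitions, with assumptions labeled: cost is taken as paid hours times a standard hourly rate, value as the estimated reference hours times the same rate, and leverage as their ratio. The names `MoneyView` and `money_view` and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoneyView:
    cost: float   # assumed: paid hours x standard rate
    value: float  # assumed: estimated reference hours x standard rate

    @property
    def leverage(self) -> float:
        """How many times the value exceeds the cost."""
        return self.value / self.cost


def money_view(paid_hours: float, estimated_hours: float, hourly_rate: float) -> MoneyView:
    if paid_hours <= 0 or hourly_rate <= 0:
        raise ValueError("paid_hours and hourly_rate must be positive")
    return MoneyView(cost=paid_hours * hourly_rate, value=estimated_hours * hourly_rate)


# 80 paid hours that delivered 120 estimated reference hours, at a rate of 50 per hour:
view = money_view(paid_hours=80, estimated_hours=120, hourly_rate=50)
print(view.cost, view.value, view.leverage)  # 4000 6000 1.5
```

Read this way, a leverage of 1.5 says the team delivered about one and a half times what the same paid time would produce by hand; it says nothing about revenue or profit.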