Auditing smart contracts is the disciplined process of examining blockchain application code and its architecture to find and fix vulnerabilities before they can be exploited. In plain terms, you are validating assumptions, proving critical properties, and reducing the likelihood that bugs can move real value. Because smart contracts can hold funds and enforce irreversible logic, security is not optional—it’s foundational to product viability and user trust. This article lays out a complete, human-first approach to auditing: why it matters, how to scope it, and how to run an 11-step process that blends automated tools, manual review, testing, and verification. Disclaimer: this guide is educational, not legal or investment advice; for high-stakes deployments, consult qualified security professionals.
Short definition: A smart contract audit is a structured, independent review of code and design to uncover vulnerabilities and verify that the implementation matches intended behavior.
At-a-glance process (the 11 steps): scope & goals; threat modeling; architecture & invariants; codebase mapping; dependency & supply chain review; static analysis; manual line-by-line review; fuzzing & property-based testing; unit/integration & differential tests; formal specs & targeted verification; remediation, retest & release governance. Follow these and you’ll elevate both security and developer velocity through clearer assumptions, faster feedback, and fewer regressions.
1. Define scope, goals, and acceptance criteria
The fastest way to waste an audit is to start without a crisp scope. In the first step, you establish what contracts, components, and configurations are in scope, what risks matter most, and what “good” looks like at the end. State the audit’s objectives in measurable terms, such as “no Critical/High issues outstanding,” “all privileged roles gated and monitored,” and “key invariants documented and tested.” Capture non-functional goals too: readability, gas efficiency trade-offs, and upgrade safety if you use proxies. Decide what’s explicitly out of scope—frontends, indexers, or third-party services—so nobody assumes all risks were covered. Define artifacts the team will provide (specs, diagrams, test suite, deployment plan) and what the audit will deliver (findings with severity, reproduction steps, proofs of concept, and remediation advice).
How to frame scope effectively
- List contract addresses or paths (e.g., /contracts/core and /contracts/governance) and exclude folders not ready.
- State chain(s), compiler versions, optimization flags, and proxy patterns (UUPS, Transparent, Beacon).
- Identify risk priorities: loss of funds, stuck funds, broken governance, MEV exposure, griefing vectors.
- Confirm acceptance criteria: severities, re-audit threshold, and sign-off process.
- Pin timelines only to milestones (code freeze, release gates), not to arbitrary dates.
Numbers & guardrails
- Target 100% coverage of in-scope contracts by manual review and ≥ 80% statement coverage in tests; don’t chase 100% if it adds trivial lines.
- Require a single code freeze commit hash for primary review; allow a small ≤ 5% diff window for critical fixes.
- Define severities and SLAs: Critical (block release), High (block unless mitigated), Medium (fix before roadmap), Low/Informational (as capacity allows).
End this step with a signed scope document. That signature avoids scope creep and ensures every later finding maps back to agreed objectives.
2. Run threat modeling to surface what can go wrong
Threat modeling is where you enumerate assets, actors, trust boundaries, and attack surfaces; it gives the audit a map. Start by naming what’s valuable: user funds, protocol treasury, governance power, price oracle integrity, and reputation. Then list actors and their capabilities: admins, users, relayers, keepers, arbitrageurs, validators, and potential attackers with MEV. Trace data and authority flows: who can call which functions, under which conditions, with which assumptions about ordering, liveness, and fees. Identify boundaries between on-chain code and off-chain components (keepers, moderators, multisigs, oracles), plus any cross-chain bridges.
Common threat classes to consider
- Authorization and role drift: mis-scoped onlyOwner/AccessControl roles, forgotten emergency stops.
- Economic attacks: reentrancy, price oracle manipulation, sandwiching/front-running, griefing, fee miscalculations.
- Liveness hazards: stuck upgrades, paused systems with no unpause quorum, timelock misconfiguration.
- Supply chain: library bugs, malicious dependencies, compiler/version quirks.
- Governance risks: proposal hijacking, partial quorum capture, parameter griefing.
Mini case
A staking contract pays rewards from a pool based on totalShares. A threat model reveals that a malicious actor can front-run deposit math with a crafted token that changes totalSupply() during a callback, skewing share price. The mitigation becomes part of scope: block reentrancy on accounting paths and snapshot balances before external calls.
Numbers & guardrails
- Catalog ≥ 10 concrete threats across distinct classes (auth, reentrancy, economic, liveness, supply chain).
- For each, record impact × likelihood on a 1–5 scale to prioritize review time.
- Ensure every Critical threat has at least one invariant tied to tests or verification.
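The impact × likelihood scoring in the guardrails above can be kept as a small, sortable catalog so review hours go to the riskiest items first. The threats and scores below are illustrative, not drawn from any real audit:

```python
from dataclasses import dataclass

@dataclass
class Threat:
    name: str
    category: str    # auth, reentrancy, economic, liveness, supply chain
    impact: int      # 1-5 scale
    likelihood: int  # 1-5 scale

    @property
    def risk(self) -> int:
        # Simple impact x likelihood product, as described above.
        return self.impact * self.likelihood

# Hypothetical catalog entries for demonstration.
threats = [
    Threat("role drift on fee setter", "auth", 4, 3),
    Threat("reentrancy on reward claim", "reentrancy", 5, 2),
    Threat("oracle manipulation on thin liquidity", "economic", 5, 3),
    Threat("timelock misconfiguration blocks unpause", "liveness", 3, 2),
]

# Spend review hours on the highest-scoring threats first.
for t in sorted(threats, key=lambda t: t.risk, reverse=True):
    print(f"{t.risk:2d}  {t.name} ({t.category})")
```

A plain product is crude on purpose: the goal is a defensible ordering of review effort, not a precise risk model.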
Threat modeling front-loads insight: it sharpens where you look and turns “audit as code reading” into “audit as risk reduction.”
3. Review architecture, data flows, and invariants
Before reading code line by line, clarify how the system is supposed to work. That means diagrams for components, control flow, and state transitions, plus a plain-language specification. The payoff is twofold: auditors reason about intended behavior, and developers catch inconsistency between the idea and the implementation. Translate business rules into invariants—statements that must always be true—such as “the sum of user balances equals total assets under management,” “only governance can change fee parameters,” and “a vault share price never goes down except on realized losses.”
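Invariants become most useful when they are executable. A minimal sketch of the first invariant above, written against a deliberately simplified accounting model (ToyVault is illustrative, not a real contract interface):

```python
# A deliberately simplified accounting model for demonstrating invariants.
class ToyVault:
    def __init__(self):
        self.balances = {}     # user -> balance
        self.total_assets = 0  # assets under management

    def deposit(self, user, amount):
        self.balances[user] = self.balances.get(user, 0) + amount
        self.total_assets += amount

    def withdraw(self, user, amount):
        assert self.balances.get(user, 0) >= amount, "insufficient balance"
        self.balances[user] -= amount
        self.total_assets -= amount

def check_invariants(v: ToyVault):
    # Invariant 1: sum of user balances equals total assets under management.
    assert sum(v.balances.values()) == v.total_assets
    # Invariant 2: no balance ever goes negative.
    assert all(b >= 0 for b in v.balances.values())

v = ToyVault()
v.deposit("alice", 100)
v.deposit("bob", 40)
v.withdraw("alice", 25)
check_invariants(v)  # passes: 75 + 40 == 115
```

The same statements later become fuzzing properties and, for the highest-risk flows, formal specifications, so writing them down early pays off three times.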
Create a minimal but sufficient spec
- Sequence diagrams for lifecycle: deploy → initialize → operate → upgrade/retire.
- State machine for contracts that change modes (Active, Paused, EmergencyWithdraw).
- Enumerate every privileged action and its preconditions, delays, and caps.
- List numeric limits: max mint/burn per block, fee bounds, slippage thresholds.
Numbers & guardrails
- Write 5–15 invariants that matter to assets, roles, and math; too few and you miss intent, too many and noise drowns signal.
- For each invariant, name the test or property that enforces it.
- Keep diagrams current with the reviewed commit; regenerate if diff exceeds 5–10% of lines changed.
Close this step by agreeing that “the contracts must do exactly this and nothing else.” It becomes the truth source for later evidence.
4. Map the codebase and build it deterministically
A repeatable build is the prerequisite for a dependable audit. Start by pinning toolchain versions (Solidity/Vyper compiler, package managers) and reproducing the build from a clean environment. Generate an index: contract names, LOC per file, inheritance graphs, modifiers, and external calls. This gives you a tour and highlights hotspots: long files, complex inheritance, and functions with many branching paths. Note the initialization flows for upgradeable proxies—wrong init is a frequent source of latent issues.
Checklist for deterministic setup
- solc/vyper, optimizer, EVM version pinned; test suite runs headlessly.
- One-command bootstrap (e.g., make ci) that installs deps and compiles.
- Produce artifacts: ABI, bytecode, AST, inheritance and call graphs.
- Record addresses/roles for known deployments or test fixtures.
Mini case
During mapping, you discover a deep inheritance chain: Vault ← ERC4626 ← ERC20 ← Context. A custom override of _transfer introduces a fee-on-transfer token quirk not accounted for in deposit() math. The map points to where that override interacts with share accounting, creating a review waypoint that leads to a bug fix.
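The share-accounting mismatch in this mini case comes down to crediting shares for the nominal deposit amount while a fee-on-transfer token delivers less. A minimal sketch, with a made-up 1% fee:

```python
# Illustrative sketch of the fee-on-transfer mismatch; the 1% fee and
# amounts are made up for demonstration.
FEE_BPS = 100  # token skims 1% (100 basis points) on every transfer

def received_after_fee(amount: int) -> int:
    return amount - amount * FEE_BPS // 10_000

# Buggy deposit: credits shares for the nominal amount...
nominal = 10_000
shares_credited = nominal
# ...but the vault actually holds less after the transfer fee.
assets_held = received_after_fee(nominal)

deficit = shares_credited - assets_held
print(deficit)  # 100: shares now claim more assets than exist

# Fix: credit shares for the measured balance delta, not the call argument.
shares_credited_fixed = received_after_fee(nominal)
assert shares_credited_fixed == assets_held
```

The general lesson from the map-driven review: wherever an override touches token movement, check that downstream accounting measures what actually arrived.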
Numbers & guardrails
- Flag files > 500 LOC or functions with cyclomatic complexity > 10 for focused review.
- Ensure 0 nondeterministic steps in CI; a clean git clone && make ci must pass.
- Generate a list of all external calls; every call, delegatecall, and staticcall should be intentional.
With a deterministic build and a code map, you move from wandering to directed exploration.
5. Review dependencies, libraries, and the supply chain
Modern contracts lean heavily on libraries (e.g., OpenZeppelin) and external tooling. Your audit must treat dependencies as part of the attack surface. Verify versions, changelogs, and whether imports are pinned by commit hash. Inspect custom forks: they often diverge from upstream fixes. If you’re using tokens or oracles from third parties, document assumptions and failure modes; for example, stablecoins that can pause transfers or price feeds that may lag.
Practical steps
- Pin every dependency; avoid broad version ranges like ^ unless you have reasoned upgrade controls.
- Compare your fork against upstream; document deltas.
- Check for known weaknesses in dependency classes (reentrancy, unsafe calls, arithmetic assumptions).
- Confirm license compatibility if you plan to redistribute modified code.
- Note deploy-time addresses for external tokens/oracles and how they’re selected.
Mini case
A project imports SafeERC20 but interacts with a token that returns false on failure instead of reverting. The audit enforces safeTransfer/safeApprove wrappers and catches one raw transfer that could silently fail, leaving funds stuck.
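The failure mode in this mini case is easy to model: some ERC-20 implementations revert on failure while others return false, and a raw call silently ignores the latter. The toy classes below are illustrative, not models of any specific token:

```python
# Two illustrative token behaviors; neither models a specific real token.
class RevertingToken:
    def __init__(self, balance): self.balance = balance
    def transfer(self, amount):
        if amount > self.balance:
            raise RuntimeError("insufficient balance")  # hard failure
        self.balance -= amount
        return True

class FalseReturningToken:
    def __init__(self, balance): self.balance = balance
    def transfer(self, amount):
        if amount > self.balance:
            return False  # silently signals failure instead of reverting
        self.balance -= amount
        return True

def safe_transfer(token, amount):
    # SafeERC20-style wrapper: treat a False return like a revert.
    if token.transfer(amount) is False:
        raise RuntimeError("transfer returned false")

# A raw call ignores the False; nothing moved, but the caller proceeds.
t = FalseReturningToken(balance=50)
t.transfer(100)
assert t.balance == 50

# The wrapper converts the silent failure into a hard error.
try:
    safe_transfer(FalseReturningToken(50), 100)
    wrapped_failed = False
except RuntimeError:
    wrapped_failed = True
assert wrapped_failed
```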
Numbers & guardrails
- Keep one source of truth for dependency versions; avoid mixed versions of the same library.
- Require explicit allowlists for external tokens/contracts and limit per-call gas stipends to reduce griefing risk.
- For oracles, document update frequency, deviation thresholds, and fallback paths; don’t assume “always fresh.”
This step reduces unknown unknowns—your code is only as safe as the code it touches.
6. Run static analysis and linting to catch low-hanging bugs
Static analysis tools scan code without executing it, surfacing patterns linked to common vulnerabilities. They won’t replace human review, but they will quickly spotlight risky constructs and style violations that impede readability. Use established analyzers for Solidity/Vyper to detect issues such as reentrancy, unchecked call returns, uninitialized storage pointers, shadowed variables, or incorrect arithmetic. Pair that with linters and formatters to normalize code style and make diffs meaningful.
How to do it well
- Run analyzers with a pinned toolchain; export machine-readable reports.
- Triage findings by severity and deduplicate across tools.
- Wire tools into CI so regressions fail early.
- Convert relevant checks into unit tests or properties to avoid recurring mistakes.
Numbers & guardrails
- Expect static analysis to surface dozens of informational items; focus on the top 10–20 by severity and relevance.
- Aim for zero unresolved High/Critical findings from static tools before manual review wraps.
- Keep lints clean: a fail-on-warning policy after an initial cleanup sprint avoids future drift.
Tools/Examples
- Pattern detectors and call graph printers for quick hotspots.
- Slither for Solidity/Vyper static analysis with customizable detectors.
- Linters (e.g., Solidity-specific) and formatters to standardize code style.
Use static analysis as a speed boost for humans, not a substitute; the goal is to eliminate noise so manual review targets the hard problems.
7. Perform manual line-by-line review focused on risk
Manual review is where auditors earn their keep. Humans reason about the intent behind code and catch issues automated tools miss: logic errors, broken invariants, unsafe assumptions, and subtle state-machine bugs. Read every line of in-scope contracts, prioritizing areas identified by threat modeling and the code map. Review the upgrade and initialization paths carefully; many incidents stem from mishandled proxies, storage collisions, or misordered initializers.
What to look for
- Authorization checks for administrative functions and role transitions.
- External calls before state updates (reentrancy risks).
- Arithmetic that can underflow/overflow when assumptions change.
- Inconsistent event emissions that break off-chain accounting.
- Economic logic reliant on manipulable inputs (oracles, AMM pricing).
Mini case
A vault uses totalAssets() in the previewRedeem() path, which depends on a price feed that can be temporarily stale. Under volatility, a user can redeem at an outdated, favorable rate. The fix introduces a freshness check and caps on price deviation before allowing large redemptions.
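The mitigation from this mini case, a freshness window plus a deviation cap, can be sketched as a small guard function. The thresholds and the reference-price comparison below are assumptions for illustration:

```python
# Illustrative guard: reject a price that is stale or too far from a
# trusted reference. Both thresholds are assumed values, not standards.
MAX_AGE_SECONDS = 3600    # freshness window (assumption)
MAX_DEVIATION_BPS = 200   # 2% cap vs. a trusted reference (assumption)

def price_is_usable(price: int, updated_at: int, now: int,
                    reference_price: int) -> bool:
    if now - updated_at > MAX_AGE_SECONDS:
        return False  # stale: block large redemptions until refreshed
    deviation_bps = abs(price - reference_price) * 10_000 // reference_price
    return deviation_bps <= MAX_DEVIATION_BPS

# Fresh and within 2% of reference: accepted.
assert price_is_usable(1010, updated_at=990, now=1000, reference_price=1000)
# Stale: rejected even though the value looks reasonable.
assert not price_is_usable(1000, updated_at=0, now=90_000, reference_price=1000)
# Fresh but 5% off: rejected.
assert not price_is_usable(1050, updated_at=990, now=1000, reference_price=1000)
```

In practice the guard gates only the size-sensitive paths (large redemptions, liquidations) so routine operations keep working when the feed lags briefly.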
Numbers & guardrails
- Budget 60–70% of total audit hours to manual review; it’s the highest-ROI activity.
- Track a checklist of ≥ 15 recurring bug classes and confirm each is considered in relevant files.
- Every finding must include repro steps or a proof-of-concept; avoid “hand-wavy” concerns.
The synthesis of manual review and earlier steps turns scattered insights into actionable risk reduction.
8. Fuzzing and property-based testing to break assumptions
Fuzzing bombards your code with large volumes of randomized inputs, while property-based testing encodes behavioral rules that must hold across all inputs. Together, they’re excellent at surfacing edge cases you didn’t think to write as unit tests. Start by translating invariants and pre/post-conditions into properties. Then run a fuzzer that mutates transactions, calldata, and state to search for failures, reverts, or invariant violations. Instrument tests to snapshot the minimal failing case and reduce noise.
Practical setup
- Write stateful properties describing multi-call sequences (e.g., deposit → withdraw → claim).
- Combine on-chain fuzzers with symbolic execution for deeper path coverage when feasible.
- Seed fuzzers with realistic distributions: small, large, boundary values; adversarial sequences.
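A production audit would typically use a dedicated tool such as Echidna or Foundry's invariant tests, but the core stateful loop can be sketched with nothing beyond the standard library. The model, operations, and invariant below are illustrative stand-ins:

```python
import random

# Minimal stateful fuzzing loop over a toy share-accounting model.
class Model:
    def __init__(self):
        self.shares = {}
        self.assets = 0

    def deposit(self, user, amount):
        self.shares[user] = self.shares.get(user, 0) + amount
        self.assets += amount

    def withdraw(self, user, amount):
        held = self.shares.get(user, 0)
        amount = min(amount, held)  # clamp, like a bounded handler would
        self.shares[user] = held - amount
        self.assets -= amount

def invariant_holds(m: Model) -> bool:
    # Conservation: shares outstanding always match assets held.
    return sum(m.shares.values()) == m.assets and m.assets >= 0

rng = random.Random(42)  # seeded so failures reproduce deterministically
m = Model()
for i in range(10_000):
    user = rng.choice(["alice", "bob", "carol"])
    # Boundary-heavy distribution: zeros, tiny, huge, and random values.
    amount = rng.choice([0, 1, 2, 10**18, rng.randrange(10**6)])
    op = rng.choice([m.deposit, m.withdraw])
    op(user, amount)
    if not invariant_holds(m):
        print(f"invariant violated at iteration {i}")  # shrink from here
        break
else:
    print("invariant held for 10,000 iterations")
```

Dedicated fuzzers add what this sketch lacks: coverage guidance, automatic shrinking of failing sequences, and EVM-accurate execution.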
Mini case
A property states: “Total shares times price per share equals or exceeds total assets minus fees.” After 100,000 fuzz iterations per property, the fuzzer finds that a rounding edge case lets price drift by 0.02% in a specific order of operations with fee-on-transfer tokens. The team fixes math and adds a rounding guard.
Numbers & guardrails
- Run at least 50,000–100,000 iterations per critical property; stop only when coverage plateaus.
- Prefer stateful fuzzing for protocols where sequence matters (lending, AMMs, vaults).
- Treat any invariant failure as a release blocker until you explain and, if necessary, constrain behavior.
Fuzzing turns “unknown unknowns” into testable facts, giving you confidence beyond hand-crafted examples.
9. Unit, integration, and differential testing for behavior confidence
While fuzzing hunts for surprises, structured tests prove expected behavior. Expand unit tests to cover all public/external functions, then add integration tests that simulate realistic flows with multiple actors and assets. For upgrades or refactors, use differential testing—run the old and new implementations side by side on identical inputs and assert equivalence for shared semantics. Include failure-path tests: permissions, paused modes, and limit checks.
Testing blueprint
- Aim for meaningful statement/branch coverage; avoid chasing trivial lines.
- Test role changes, timelock flows, and emergency exits.
- Model economic scenarios with parameter sweeps for fees, slippage, and liquidity.
- Build fixtures for adversarial tokens (reverting, fee-on-transfer, callback-capable).
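The differential testing described in this step can be sketched as a loop that feeds identical random inputs to two implementations and records divergences. Both functions below are illustrative: the "refactor" reorders a multiply and a divide, a classic source of silent precision loss in integer share math:

```python
import random

# Differential harness sketch: old vs. refactored share-mint math.
def mint_shares_v1(assets: int, total_shares: int, total_assets: int) -> int:
    # Multiply first, then floor-divide: standard pro-rata share math.
    return assets * total_shares // total_assets

def mint_shares_v2(assets: int, total_shares: int, total_assets: int) -> int:
    # "Refactor" that looks equivalent but divides first, losing precision.
    return assets * (total_shares // total_assets)

rng = random.Random(7)
divergences = []
for _ in range(1_000):
    assets = rng.randrange(1, 10**6)
    total_shares = rng.randrange(1, 10**6)
    total_assets = rng.randrange(1, 10**6)
    a = mint_shares_v1(assets, total_shares, total_assets)
    b = mint_shares_v2(assets, total_shares, total_assets)
    if a != b:
        divergences.append((assets, total_shares, total_assets, a, b))

print(f"{len(divergences)} divergent inputs out of 1,000")
```

For an intentional behavior change (like a fee update), assert equivalence only on the semantics both versions share, such as conservation of assets, and assert the expected delta on the changed path.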
Mini case
A protocol introduces a fee change from 10 bps to 30 bps. Differential tests comparing v1 and v2 show a divergence only when the asset has 18 decimals and the user redeems to a recipient contract with a fallback that performs a token transfer. The divergence leads to a reentrancy risk now caught by the test harness; the mitigation adds a reentrancy guard and rearranges state updates.
Numbers & guardrails
- Strive for ≥ 80% coverage on core contracts and ≥ 70% across the suite; coverage is a proxy, not a guarantee.
- Include at least 3 adversarial token behaviors in fixtures.
- Keep tests deterministic: identical inputs should yield identical results on repeated runs.
Strong tests create a safety net that accelerates refactoring and future audits.
10. Formal specifications and targeted verification where it counts
Formal methods let you mathematically prove that code satisfies a precise specification. You don’t need to verify everything; focus on high-value properties: no loss of funds except by design, only authorized roles can change critical parameters, or state machines cannot enter illegal states. Write specifications in clear, structured language, then encode them using tools that can check properties automatically. Even partial verification has two benefits: you catch logic errors early, and you generate machine-checked evidence for stakeholders.
When and how to apply
- Choose small, critical modules (math libraries, escrow logic, token accounting).
- Express properties as invariants, pre/post-conditions, and temporal logic (safety/liveness).
- Link specs to tests and documentation, so they remain living artifacts.
Mini case
A lending module ensures collateralization stays above a threshold. A safety property states: “Liquidation is only possible when a user’s health factor < 1.” Verification finds a path where upstream rounding understates collateral value, so a position that sits exactly at the threshold computes a health factor of 0.9999996 and becomes liquidatable, outside the intended tolerance. The fix aligns rounding direction across the deposit and liquidation paths and tightens the check.
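The rounding drift in this mini case can be reproduced with a few lines of integer math. The sketch below is illustrative: 1e18 ("WAD") fixed point is assumed, and the values are chosen to force a one-wei floor:

```python
# Worked example of rounding drift in a health-factor check.
# WAD-style 1e18 fixed point is assumed; values are illustrative.
WAD = 10**18

def health_factor(collateral_value: int, debt_value: int) -> int:
    return collateral_value * WAD // debt_value  # floor division rounds down

# True position sits exactly at hf == 1.0, but an upstream conversion
# floored the collateral value by one wei before it reached the check.
debt = 2_500_000_000_000_000_002
true_collateral = debt
collateral_as_seen = true_collateral - 1  # upstream floor rounding

hf = health_factor(collateral_as_seen, debt)
print(hf)        # 999999999999999999, just under 1e18
assert hf < WAD  # a healthy position is now marked liquidatable

# Mitigation: round in the user's favor on the liquidation check,
# e.g. ceiling division when computing the health factor.
def health_factor_ceil(c: int, d: int) -> int:
    return (c * WAD + d - 1) // d

assert health_factor_ceil(collateral_as_seen, debt) >= WAD
```

This is exactly the kind of property where a formal tool shines: the counterexample above is one specific path among millions, and a prover finds it exhaustively rather than by luck.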
Numbers & guardrails
- Select 2–5 properties for formal treatment on the most critical flows.
- Require zero counterexamples for proved properties; if the tool finds one, triage immediately.
- Keep specs versioned with code, and fail CI if they drift.
Targeted verification turns the riskiest assumptions into provable guarantees.
11. Remediate, retest, document, and govern the release
An audit is only as valuable as the follow-through. After you receive findings, triage them by severity and fix effort. Implement patches in small, reviewable commits. Retest: rerun static analysis, fuzzing, and the full test suite. If the design changes materially, schedule a focused re-review. Produce a clear report that explains the issue, impact, likelihood, proof of concept, and recommended fix, along with evidence of remediation. Prepare end-user artifacts: changelogs, risk disclosures, and operator runbooks. Finally, pair release with governance controls—timelocks on upgrades, multisig thresholds, emergency procedures, and monitoring to detect anomalies.
Operationalizing the outcome
- Tag the audited commit and publish a hash-mapped report.
- For upgradeable systems, schedule a staged rollout (small caps → broader exposure).
- Set up on-chain monitors for critical metrics (reserves, share price, oracle freshness).
- Institute a responsible disclosure policy and a bounty program scaled to potential impact.
Numbers & guardrails
- Do not ship with any Critical issues open; High issues require a compensating control or a deferral with a hard deadline.
- Keep upgrade timelocks long enough for community review; 24–72 hours is a common range, depending on risk tolerance and governance.
- Require dual control for administrative keys and runbooks for pauses/unpauses.
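The severity gate in these guardrails can be encoded so CI enforces it mechanically. The data model and policy mapping below are an illustrative sketch of this article's rules, not a standard tool:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    title: str
    severity: str        # "critical" | "high" | "medium" | "low"
    mitigated: bool = False  # compensating control in place?

def release_blockers(findings) -> list:
    blockers = []
    for f in findings:
        if f.severity == "critical":
            blockers.append(f.title)   # Critical always blocks release
        elif f.severity == "high" and not f.mitigated:
            blockers.append(f.title)   # High blocks unless mitigated
    return blockers

# Hypothetical findings for demonstration.
findings = [
    Finding("reentrancy in claim()", "critical"),
    Finding("oracle staleness window too wide", "high", mitigated=True),
    Finding("missing event on fee change", "low"),
]
print(release_blockers(findings))  # ['reentrancy in claim()']
```

Wiring this into the release pipeline turns the policy from a document into a gate: a build with any blocker simply cannot ship.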
Close the loop by proving fixes, documenting them, and setting guardrails that keep the system safe after launch.
Conclusion
Auditing smart contracts is not a box to check; it is a structured way to convert risk into knowledge and then into safer code. By preparing a crisp scope, modeling threats, and documenting invariants, you give reviewers context and targets. By mapping the codebase, pinning the supply chain, and running static analysis, you remove distractions and surface quick wins. Manual review, fuzzing, and property-based testing expose deeper logic errors and edge cases; formal methods turn the highest-risk assumptions into provable guarantees. Finally, disciplined remediation, retesting, and release governance make the improvements durable, with monitors and processes that continue to guard value over time. Follow the 11 steps in this guide, and you’ll ship software that is more predictable, auditable, and resilient—exactly what users expect when real assets are at stake. Ready to improve your security posture? Start by writing your scope and invariants today, then schedule an independent review.
FAQs
How is an audit different from a bug bounty?
An audit is proactive, structured, and comprehensive: it reviews code and design before (or alongside) deployment, with clear scope and acceptance criteria. A bounty invites the public to probe live or test deployments for rewards. Both are useful, but they serve different phases and risk appetites. Many teams use an audit to build a hardened baseline, then layer bounties to catch what slipped through and to monitor as the system evolves.
Do I need an audit if I used a standard library like OpenZeppelin?
Yes. Libraries reduce risk, but integrating them introduces your own logic and configuration choices. Real incidents often stem from how contracts compose, override, or initialize standard modules, not from the libraries themselves. An audit focuses on your glue code, role assignments, upgrade patterns, and assumptions about external tokens and oracles, which are unique to your project.
What’s a reasonable test coverage target for audited contracts?
Coverage is a proxy, not a guarantee, but targets help discipline the codebase. A practical goal is at least 80% statement coverage on core contracts and 70% overall, with explicit tests for privileged flows, failure paths, and emergency procedures. Pair these with fuzzing and property-based tests to explore edge cases that line coverage alone can’t reveal.
How do audits handle upgradeable proxy patterns?
Auditors examine storage layouts, initializer ordering, and upgrade governance. Typical pitfalls include storage collisions, missing initializer modifiers, and privileged upgrades without delays or multisig approvals. A strong audit checks the upgrade script, verifies that new implementations respect storage layout, and recommends timelocks, caps, and staged rollouts to contain blast radius.
What is an invariant, and why should I care?
An invariant is a statement that must always hold, such as “sum of balances equals total assets under management.” Encoding key invariants helps auditors and developers reason about correctness and gives you properties to test and verify. When an invariant fails in tests or fuzzing, you’ve found a bug—or a mis-specified rule that needs clarifying.
Can static analysis replace manual review?
No. Static analyzers find many patterns quickly—unchecked call returns, uninitialized storage, suspicious external calls—but they don’t understand business intent. Manual reviewers connect code to design, spot economic vulnerabilities, and reason about multi-transaction behavior. Use static tools to reduce noise and free humans to focus on the hard problems.
How do you prioritize findings during remediation?
Start with Critical issues that can directly lead to loss of funds or irreversible state corruption. Then handle High issues that require specific conditions to exploit. Medium and Low issues follow, especially if they affect maintainability or observability. Define severity criteria up front, and treat any invariant violation as a block until it’s either fixed or bounded by a compensating control.
What should be in a good audit report?
A useful report explains the methodology, lists the in-scope artifacts, and provides findings with severity, impact, and reproduction steps. It should include code references, suggested fixes, and notes on any changes made during remediation and retesting. Ideally, it also summarizes positive observations, verified invariants, and a checklist of residual risks with recommended monitors.
How often should I audit?
Audit when the code changes materially or when risk exposure grows—new features, new assets, or new governance structures. For upgradeable systems, consider smaller, targeted audits for each release rather than waiting for a large, infrequent review. Complement audits with ongoing monitoring, alerting, and a bounty program to sustain security between formal reviews.
Are audits only about security bugs?
Security is the priority, but audits often improve clarity, documentation, and operational safety. Reviewers may suggest better event emission, safer math patterns, clearer error messages, or more robust upgrade runbooks. These improvements reduce future bugs, accelerate onboarding, and make the system easier to reason about—benefits that compound over time.
References
- Smart contract security overview — Ethereum.org — https://ethereum.org/developers/docs/smart-contracts/security/
- Testing smart contracts — Ethereum.org — https://ethereum.org/developers/docs/smart-contracts/testing/
- Formal verification of smart contracts — Ethereum.org — https://ethereum.org/developers/docs/smart-contracts/formal-verification/
- Smart Contract Weakness Classification (SWC) Registry — ConsenSys Diligence — https://swcregistry.io/
- Smart Contract Security Best Practices — ConsenSys Diligence — https://consensysdiligence.github.io/smart-contract-best-practices/
- OpenZeppelin Contracts docs: developing secure smart contracts — OpenZeppelin — https://docs.openzeppelin.com/contracts/5.x/learn/developing-smart-contracts
- Slither static analysis framework — Crytic/Trail of Bits — https://crytic.github.io/slither/slither.html
- Secure development roadmap (readiness guide) — OpenZeppelin — https://www.openzeppelin.com/readiness-guide
- Strategies for safer governance systems — OpenZeppelin — https://www.openzeppelin.com/news/smart-contract-security-guidelines-4-strategies-for-safer-governance-systems
- Becoming a smart contract auditor — Trail of Bits — https://blog.trailofbits.com/2025/07/23/inside-ethcc8-becoming-a-smart-contract-auditor/
