
The pentest proposal that fails your audit looks identical to the one that passes it. Both quote OSCP-certified testers, gray box testing, and OWASP methodology.
One delivers a 200-page Nessus export with a cover page. The other finds the chained authentication bypass the scanner missed, shows you full reproduction steps, and gives your engineers enough information to fix it without calling the vendor back.
You find out which one you hired when the report lands - or when the auditor rejects it. The five stages below will help you filter one from the other before you sign.
Stage 1 - Get the scope right (yourself or with a vendor)
Scope is the list of things (hosts, applications, buildings…) a tester is allowed to attack. Get it wrong and the budget is gone before testing starts. Build the asset inventory in-house, or hand it to one or two vendors early and let them push back. Do not accept a price quote from someone who has not asked what they are testing.
Build the asset inventory first:
- External web applications and the APIs they call (list separately)
- Internal network and Active Directory infrastructure
- Cloud tenants (AWS, Azure, GCP) including IAM policies, storage permissions, compute metadata
- Mobile apps (iOS and Android count as two, not one)
- Thick clients and desktop applications
- Third-party integrations (Salesforce, Okta, Shopify) - testing your integration is fine, testing the platform needs separate authorization
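The inventory above can be kept as structured data rather than a slide. A minimal sketch, assuming a hypothetical `Asset` record and made-up asset names; the category labels are illustrative and not taken from any standard:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    category: str  # illustrative labels, e.g. "web_app", "api", "mobile_ios"
    # True when the third-party platform itself (not your integration) is the target.
    third_party_platform: bool = False


# Hypothetical inventory for illustration only.
inventory = [
    Asset("customer-portal", "web_app"),
    Asset("customer-portal-api", "api"),        # APIs are listed separately
    Asset("shop-app-ios", "mobile_ios"),        # iOS and Android count as two
    Asset("shop-app-android", "mobile_android"),
    Asset("okta-sso-integration", "third_party"),
    Asset("okta-platform", "third_party", third_party_platform=True),
]


def summarize(assets):
    """Count targets per category and list items needing separate authorization."""
    counts = {}
    for asset in assets:
        counts[asset.category] = counts.get(asset.category, 0) + 1
    needs_authorization = [a.name for a in assets if a.third_party_platform]
    return counts, needs_authorization


counts, needs_authorization = summarize(inventory)
print(counts)
print("Separate authorization needed:", needs_authorization)
```

Handing a vendor a list like this makes it obvious when a quote silently merges two mobile apps into one line item or includes a platform nobody has authorized.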
Then pick the test type. Three options, not interchangeable:
- Black-box - tester gets nothing but a target. Right for simulating an external attacker with no insider knowledge. Default for DORA TLPT and red team work. Not recommended for compliance pentesting - testers spend most of the budget on reconnaissance instead of testing the system.
- Gray-box - tester gets architecture diagrams, test credentials, partial documentation. The default for web apps, APIs, and internal network work. What the OWASP WSTG recommends as a minimum for application testing.
- White-box - tester gets full source code, build artifacts, infrastructure access. Pick this when the goal is to find every issue in a system, not only to prove an attacker could break in. The white-box approach speeds up testing (no guessing or brute-forcing) and gives the best results, but it demands more experience from the pentester. Recommendations are also more precise, because the tester can point to a complete code fix.
Pick wrong and you pay the same price for a worse test. If unsure, ask the vendor to recommend one and explain why - their answer tells you whether they have run this kind of engagement before.
Map the compliance driver before finalizing scope. Each framework has specific requirements that affect how the test needs to be designed:
- PCI DSS v4.0 Requirement 11.4 - requires an annual pentest of the cardholder data environment with exploitation attempts. A scan-only engagement does not satisfy it. The tester must be independent from the team managing the CDE.
- SOC 2 CC4.1 - pentesting is accepted as evidence under the AICPA Trust Services Criteria. Not technically required, but Type II audits without it are rare enough that auditors notice the gap.
- ISO 27001:2022 Control A.8.8 - technical vulnerability management. No annual testing frequency is mandated. Watch out for vendors citing A.8.29 as the production pentest requirement - it covers development and acceptance testing, not production.
- NIS2 Article 21(2)(f) - requires policies to assess the effectiveness of cybersecurity measures. Regulators interpret this as independent external testing. Worth knowing: the directive text does not use the words "penetration testing" or "independent organisation" - that is regulatory interpretation, not statute.
- DORA Articles 26-27 - threat-led penetration testing on live production systems, at least every three years for designated financial entities. This is not a standard pentest - different methodology, different vendor requirements, different oversight. Read our TLPT under DORA guide before going further.
- FCA PS21/3 - firms must demonstrate they can remain within impact tolerances. Pentesting is common evidence for this, but there is no specific FCA mandate requiring it.
Sign the rules of engagement (RoE) before testing starts. The RoE lists IPs and URLs in scope, prohibited techniques, escalation contacts, the halt procedure, and what the vendor does if they find an active breach mid-test. No signed RoE means the testing is unauthorized. The commercial contract does not substitute.
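An RoE scope list is only useful if someone can check a target against it. A minimal sketch of that check, assuming a hypothetical scope made of documentation-reserved IP ranges and `example.com` hostnames; a real RoE would also carry prohibited techniques, contacts, and the halt procedure, which this does not model:

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical RoE scope; the ranges are reserved documentation addresses.
IN_SCOPE_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]
IN_SCOPE_HOSTS = {"app.example.com", "api.example.com"}


def ip_in_scope(address: str) -> bool:
    """True if the address falls inside any network listed in the RoE."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in IN_SCOPE_NETWORKS)


def url_in_scope(url: str) -> bool:
    """True if the URL's hostname is listed in the RoE (exact match only)."""
    host = urlsplit(url).hostname
    return host is not None and host in IN_SCOPE_HOSTS


print(ip_in_scope("203.0.113.7"))
print(ip_in_scope("198.51.100.32"))
print(url_in_scope("https://admin.example.com/"))
```

Exact hostname matching is a deliberate choice here: a wildcard such as "everything under example.com" is how out-of-scope systems end up being tested.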
Stage 2 - Qualify the penetration testing company before you ask for a proposal

Qualification filters out the vendors who cannot demonstrate the capability you need.
Check certifications at the individual tester level
OSCP is the floor. The exam is 24 hours of hands-on exploitation in a live lab - it proves the tester can break a system, not just run a tool. Other valuable certifications:
- OSWE - advanced web exploitation
- OSEP - advanced evasion and post-exploitation in Windows
- OSED - exploit development
- GPEN, GWAPT, GXPN - GIAC equivalents, common in US enterprise procurement
- CRTO - Certified Red Team Operator, for red team specifically
- eWPTX - recognised practical web exploitation cert
Verify the company-level controls
Does the company handle your data to a defined standard? ISO 27001 answers this. Ask for the certificate, the certification body, and the issue date.
Are testers covered if something goes wrong? Professional indemnity insurance for offensive security work, not generic E&O. Ask for the policy and the named coverage limits.
Use published CVE research as a quality signal
Finding bugs in known applications is harder than finding bugs in a customer's web app. A team that ships CVEs against hardened vendor code can do the easier job too.
Verify CVE IDs on NVD or CVE.org. Look at vendor diversity, CVSS severity, and recency.
Zero CVEs does not disqualify a vendor - plenty of strong commercial teams do not publish. But 100+ CVEs against named enterprise products is a research investment that is hard to fake.
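Before looking anything up, a vendor's CVE list can be sanity-checked and summarized offline. A minimal sketch, assuming the claims have been copied by hand into (ID, affected vendor) pairs; the IDs and vendor names below are placeholders, not real CVEs. A well-formed ID proves nothing about attribution, so each one still has to be looked up on NVD or CVE.org, and the year in the ID is only a rough proxy for recency.

```python
import re
from collections import Counter

# "CVE-", a four-digit year, then a sequence number of four or more digits.
CVE_ID = re.compile(r"CVE-(\d{4})-(\d{4,})")


def is_well_formed(cve_id: str) -> bool:
    return CVE_ID.fullmatch(cve_id) is not None


def summarize(claims):
    """claims: list of (cve_id, affected_vendor) pairs from the vendor's list."""
    malformed = [cve_id for cve_id, _ in claims if not is_well_formed(cve_id)]
    valid = [(cve_id, vendor) for cve_id, vendor in claims if is_well_formed(cve_id)]
    by_year = Counter(int(CVE_ID.fullmatch(cve_id).group(1)) for cve_id, _ in valid)
    by_vendor = Counter(vendor for _, vendor in valid)
    return malformed, by_year, by_vendor


# Placeholder data for illustration only.
claims = [
    ("CVE-2099-0001", "VendorA"),
    ("CVE-2099-0002", "VendorA"),
    ("CVE-2098-12345", "VendorB"),
    ("CVE-98-1", "VendorC"),  # malformed on purpose
]
malformed, by_year, by_vendor = summarize(claims)
print("Malformed:", malformed)
print("Per year:", dict(by_year))
print("Per vendor:", dict(by_vendor))
```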
Confirm staffing reality
Vendors name a senior tester in the proposal and assign juniors to the engagement. Fix it contractually: names and cert IDs of the assigned testers in writing before signature, and confirm whether professional indemnity insurance covers contractors as well as employees.
Stage 3 - Ask the questions that separate a process from a sales pitch
The proposal call is where you find out whether the vendor has run engagements like yours, or whether they will improvise on yours. Vendors who improvise get defensive when you ask hard questions.
Methodology
A methodology-competent vendor names the framework per test type:
- Web applications - OWASP Web Security Testing Guide, plus OWASP ASVS for verification depth. ASVS Level 2 is the practical minimum for apps handling sensitive data.
- Network and infrastructure - NIST SP 800-115 plus MITRE ATT&CK for adversary technique mapping
- Red team and threat-led testing - MITRE ATT&CK for scenario planning
Escalation procedure
A tester who finds an active breach mid-engagement has to decide whether to keep testing or stop and escalate. Ask for the written procedure - it should name the client contact, the decision point (continue, pause, or stop), and the notification timeline for Critical findings. "We would call you right away" is an intention, not a procedure. The wrong move on a live intrusion tips off the attacker.
Stage 4 - Evaluate the proposal from each penetration testing company

Capability is established in Stages 2 and 3. The proposal answers one question: did this vendor actually think about your scope, or did they paste a template? Three things tell you.
Ask for a redacted sample report
The single highest-value step before signing. A sample from a comparable engagement tells you more than any sales call. Look for:
- Executive summary under two pages, written for a non-technical reader
- Findings with CVSS vector strings (v3.1 or v4.0) and working PoC - request/response, screenshots, or terminal output, not "tester confirmed exploitable"
- Remediation guidance engineering can act on without calling the vendor back
- Consistent finding structure across severities - if Highs are detailed and Lows are one-line boilerplate, the team is inconsistent
NDA before sharing is reasonable. Blanket refusal ends the evaluation.
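A vector string is what lets you check the score yourself. A minimal sketch that recomputes the base score from a CVSS v3.1 vector, using the metric weights and rounding rule from the FIRST v3.1 specification. It handles base metrics only and does not cover v4.0, which is scored differently; the function names are illustrative.

```python
import math

# CVSS v3.1 base metric weights.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when Scope is Changed.
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Recompute the base score from a CVSS:3.1 vector string."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("this sketch handles CVSS:3.1 vectors only")
    try:
        metrics = dict(part.split(":", 1) for part in parts)
        scope = metrics["S"]
        pr = PR_WEIGHTS[scope][metrics["PR"]]
        w = {name: table[metrics[name]] for name, table in WEIGHTS.items()}
    except KeyError as exc:
        raise ValueError(f"missing or invalid base metric: {exc}") from None

    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * w["AV"] * w["AC"] * pr * w["UI"]

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 9.8
```

If the score printed in a sample report does not match the score its own vector produces, the severity ratings were assigned by feel or copied from a scanner.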
Effort, not a Gantt chart
A proposal that pre-allocates "two days on auth, one day on business logic" is selling you procurement fiction. Real engagements pivot on what the testers find - if the auth flow falls over on day one, the rest of the time goes there. Ask for total effort, the seniority mix on those days, and what the vendor does mid-test if scope changes. A proposal with "team TBD" in the staffing section is a price quote with a capability statement attached.
Threat model before scope lock
Good coverage means testing the attack paths that are most probable and most damaging to your business - not just running through a standard checklist. Before scope is finalized, ask the vendor to walk you through their initial threat model: which entry points are highest priority given your architecture, and what is the business impact if each one is exploited? The answers should drive how testing time is allocated. A vendor who cannot articulate this before work starts will default to generic OWASP category coverage and miss the risks specific to your system. OWASP provides the floor; the threat model sets the ceiling.
Stage 5 - Verify the report after delivery
The report is the product. The gap between a real penetration testing company and a scanner reseller is visible the day the PDF lands.
Finding anatomy
Every finding should have:
- CVSS score with the full vector string - v3.1 or v4.0
- PoC evidence: screenshots plus request/response or terminal output that proves exploitation, not "tester confirmed exploitable"
- Remediation guidance the engineering team can act on without calling the vendor: exact version, configuration line, or architectural change. Where a code fix applies, point at the component, not "patch the bug"
- A retest evidence section to fill in after remediation
A finding without PoC is a hypothesis. Developers cannot fix "the application is vulnerable to SQL injection." They need the endpoint, the parameter, the payload that worked, and enough evidence to scope what data was exposed.
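The same test can be applied mechanically when a report arrives. A minimal sketch, assuming findings have been transcribed into dictionaries; the field names are hypothetical, and the endpoint/parameter/payload fields fit web findings such as injection better than, say, a network misconfiguration:

```python
# Hypothetical field names for a web application finding.
REQUIRED_FIELDS = (
    "cvss_vector",
    "endpoint",
    "parameter",
    "payload",
    "evidence",
    "remediation",
)


def missing_fields(finding: dict) -> list:
    """Return the required fields that are absent or blank in a finding."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(finding.get(field) or "").strip()
    ]


# Illustrative findings, not taken from any real report.
hypothesis = {
    "title": "SQL injection",
    "evidence": "The application is vulnerable to SQL injection.",
}
actionable = {
    "title": "SQL injection in order search",
    "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N",
    "endpoint": "/orders/search",
    "parameter": "q",
    "payload": "' OR '1'='1",
    "evidence": "request/response pair and screenshot",
    "remediation": "use parameterized queries in the order search component",
}

print(missing_fields(hypothesis))
print(missing_fields(actionable))
```

A check like this only confirms the fields are filled in. Whether the evidence actually proves exploitation still takes a human read.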
Executive summary
The executive summary has one job: tell a decision-maker, in business terms, whether anything needs action in the next 72 hours, what the overall posture is, and which remediation items come first.
Retest and remediation tracking
Retest verifies that reported findings are fixed. It does not cover new features or new vulnerability classes added since. A retest produces a formal addendum with remediation status per finding.
Some vendors include retest in the price; others price it separately for red team or TLPT work. Both are fine if stated up front.
Benchmark vendors against each other
The next time you commission a pentest, hand a different vendor the same scope your current one tested last cycle. Read both reports side by side. We have come in after other vendors often enough to know the gap on identical scope is usually bigger than buyers think.
Data handling and insurance
The vendor accumulates sensitive data during the engagement - captured credentials, screenshots of accessed records, custom tools, network captures. Get the storage, retention (30-90 days post-report is standard), and deletion procedure in writing. Confirm contractor testers are covered by the same NDA and insurance as employees.
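Once the retention period is in writing, the deletion date is simple arithmetic you can put in a calendar. A minimal sketch, assuming retention is counted in calendar days from report delivery; the function name and the dates are illustrative:

```python
from datetime import date, timedelta


def deletion_deadline(report_delivered: date, retention_days: int):
    """Return the deletion date and whether retention sits in the usual 30-90 day range."""
    if retention_days < 0:
        raise ValueError("retention_days must not be negative")
    deadline = report_delivered + timedelta(days=retention_days)
    within_typical_range = 30 <= retention_days <= 90
    return deadline, within_typical_range


deadline, typical = deletion_deadline(date(2025, 1, 15), 60)
print(deadline, typical)  # 2025-03-16 True
```

On that date, ask the vendor for written confirmation that the data was deleted.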
DORA Article 27(1)(e) requires TLPT testers to be "fully covered by relevant professional indemnity insurances, including against risks of misconduct and negligence." For non-TLPT work it is expected by regulated-sector buyers but not legally mandated. A vendor offering unlimited liability for test-caused disruption is misrepresenting their policy - the correct framing is mutual liability limits, defined halt procedures, and professional indemnity for negligence cases.
Red flags that should pause or end the evaluation across penetration testing companies
Any single one of these is enough to walk away:
- A price quote without any scoping conversation.
- No sample report under any circumstances.
- No named testers before contract signature.
- Methodology answered in marketing language. "We follow all major frameworks" is not a methodology.
- 60+ certification logos on the website with no individual attribution.
- Refusal to sign any NDA, or rubber-stamping any NDA without legal review.
- A price significantly below comparable quotes. Manual exploitation takes time. A price that makes that time impossible usually means the engagement is automated scanning with a report wrapper.
FAQ - Choosing a penetration testing company
What is the difference between a penetration test and a vulnerability scan?
A vulnerability scan runs automated checks against a database of known CVEs and misconfigurations. A penetration test uses a human tester to attempt exploitation, chain vulnerabilities together, and find business logic flaws that scanners cannot detect.
What certifications should a penetration testing company's testers hold?
OSCP is the floor. OSWE, OSEP, OSED, GPEN, GWAPT, and GXPN signal specific advanced skills. CRTO is for red team work specifically. CISSP and CISA are governance certs - they do not prove hands-on testing ability. Ask for the certifications held by the named testers on your engagement, then verify the cert numbers.
Is CREST accreditation required for a penetration testing company?
No. CREST is a UK-originated trade body membership, not a control standard. It is the right filter only for two contexts: UK public sector work (NCSC CHECK is the named government scheme) and DORA TLPT in EU jurisdictions whose regulators have referenced CREST STAR-FS. For SOC 2, PCI DSS, ISO 27001, and most NIS2 scope, the meaningful signals are vendor ISO 27001, named pentest-specific professional indemnity, individual tester certs (OSCP, OSWE, OSEP), and published CVE research.
What does PCI DSS v4.0 require from a penetration testing company?
Annual penetration testing of the cardholder data environment (CDE), covering external and internal network layers and the application layer. Requirement 11.4 mandates exploitation attempts. The standard does not require manual exploitation specifically - the word "manual" appears only in guidance notes. Requirement 11.4.1 requires "organisational independence": the tester must be independent from the team managing the CDE.
Does NIS2 require penetration testing?
Article 21(2)(f) of the NIS2 Directive requires policies and procedures to assess the effectiveness of cybersecurity risk-management measures. Regulators broadly interpret this as requiring periodic independent external testing. The directive text itself does not contain the words "penetration testing" or "independent organisation" - those are regulatory interpretation, not directive text.
How do I know if a proposal from a penetration testing company covers enough effort?
Pentesting is iterative - testers pivot on what they find. A vendor who hands you a pre-allocated "two days on auth, one on business logic" Gantt chart is selling procurement fiction. The right asks are total effort in days, the seniority mix on those days, and what the vendor does mid-test if scope changes.
What should the vendor do if they find a critical issue during the test?
The vendor should reference a written escalation procedure with immediate notification to a named client contact, a documented decision point (continue, pause, or stop), mutual agreement on evidence preservation, and a notification timeline (within hours for Critical findings).
Apply this checklist to AFINE
Run the same checklist on us.




