Research Strategy & Sources
Research Strategy & Sources — Methodology, Intel, Pipeline
How we're approaching the IT/web-security thematic bet — separated from the candidate universe so the methodology can evolve independently of the data.
Two-node split — why
- 01 · Initial Research List — the what. The universe of names, with current data, refreshed periodically.
- 02 · Research Strategy & Sources (this node) — the how and where. Methodology, intel sources, pipeline plan, known gaps.
How the candidate universe was built (in layers)
Listed in the order the layers are worked; edge contribution rises roughly, not strictly, from top to bottom:
| Layer | Done? | Edge | Notes |
|---|---|---|---|
| 1. Training-data recall | ✅ | None | Got the obvious 18 from memory. Common knowledge, no advantage. |
| 2. Targeted web search | ✅ | Some | Surfaced NTSK (Sept-2025 IPO) and SAIL (Feb-2025 re-IPO) — names ETFs hadn't fully absorbed yet. Plus NTCT, CVLT, CLBT. |
| 3. ETF holdings cross-ref | ⏸ | Low | CIBR/HACK/BUG holdings would catch 80% of $200M+ pure-plays. Confirms what we have, may add 1-2 we missed. |
| 4. FMP screener pass | ⏸ | Medium | Filter Technology → Application/Systems Software → grep description for security keywords. Catches names we don't know exist. |
| 5. Specialist substack reading | ⏸ | High (ongoing) | The single best ongoing source — practitioners + investors writing weekly on the AI-cyber thesis. See Intel Sources below. |
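The FMP screener pass (layer 4 in the table above) is essentially a sector/industry filter followed by a keyword match on company descriptions, minus the names already in the universe. A minimal sketch of that logic, assuming screener results have already been downloaded into plain dicts. The field names (`sector`, `industry`, `description`), the industry labels, the keyword list, and the sample tickers `XSEC` and `XPAY` are all illustrative assumptions, not FMP's actual schema:

```python
import re

# Assumed field values, taken from the filter path described in the table.
TARGET_SECTOR = "Technology"
TARGET_INDUSTRIES = {"Application Software", "Systems Software"}

# Hypothetical keyword list; extend after reviewing real descriptions.
SECURITY_PATTERN = re.compile(
    r"cybersecurity|cyber security|zero trust|endpoint protection|"
    r"threat detection|identity security",
    re.IGNORECASE,
)


def screen_new_names(records, known_tickers):
    """Return tickers that pass the filter and are not already tracked."""
    hits = []
    for rec in records:
        if rec.get("sector") != TARGET_SECTOR:
            continue
        if rec.get("industry") not in TARGET_INDUSTRIES:
            continue
        if not SECURITY_PATTERN.search(rec.get("description", "")):
            continue
        if rec["ticker"] in known_tickers:
            continue  # already in the candidate universe
        hits.append(rec["ticker"])
    return sorted(hits)


# Illustrative sample records; XSEC and XPAY are not real tickers.
SAMPLE = [
    {"ticker": "CRWD", "sector": "Technology", "industry": "Systems Software",
     "description": "Cloud-delivered endpoint protection platform."},
    {"ticker": "XSEC", "sector": "Technology", "industry": "Application Software",
     "description": "Zero trust access software for mid-market firms."},
    {"ticker": "XPAY", "sector": "Technology", "industry": "Application Software",
     "description": "Payroll software for small businesses."},
]

print(screen_new_names(SAMPLE, known_tickers={"CRWD"}))  # ['XSEC']
```

A keyword match of this kind will produce false positives, so its output is a list to review by hand rather than a list to add directly.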
Where actual edge could come from
The thesis itself ("AI accelerates cyber attack → cyber demand structurally up") has been crowded for 2+ years. ETFs already encode it. Sell-side covers every $1B+ name. So the edge can't be the thesis — it has to be in how we operationalize it.
Layer 1 — Underfollowed names ETFs miss
CIBR has a $200M market cap floor. ETFs index by passive flow weight; small caps get missed. Pricing inefficiency lives here.
Status: partially captured (NTSK, SAIL added).
Layer 2 — AI-defense leadership signals (the X-factor)
The thesis demands AI-driven defense, but no screen captures who's actually leading. Specific signals:
- Threat research output — Falcon OverWatch, Unit 42, Mandiant. Frequency × depth × originality.
- AI-product velocity — count + depth of AI-feature launches per quarter. CRWD shipped Charlotte AI; PANW shipped Precision AI; many others are slideware.
- CVE response time — public CVE → patch ship date.
- Patent activity — USPTO API is free; ML/anomaly-detection filings.
- Conference presence — Black Hat / DEF CON / RSA accepted-talk count per vendor.
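Of the signals above, CVE response time is the easiest to make concrete: for each vendor, take the gap between public disclosure and patch ship date and summarize it. A minimal sketch, assuming the (disclosed, patched) date pairs have already been collected. The sample dates are made up, and using the median as the summary statistic is an assumption:

```python
from datetime import date
from statistics import median


def median_patch_days(cve_events):
    """Median days from public CVE disclosure to patch ship date.

    cve_events: iterable of (disclosed, patched) date pairs, where
    patched is None for a CVE that has no patch yet. Unpatched entries
    are skipped here; a fuller version might penalize them instead.
    """
    gaps = [
        (patched - disclosed).days
        for disclosed, patched in cve_events
        if patched is not None
    ]
    return median(gaps) if gaps else None


# Made-up dates for illustration only.
events = [
    (date(2025, 1, 10), date(2025, 1, 17)),  # 7 days
    (date(2025, 2, 1), date(2025, 2, 4)),    # 3 days
    (date(2025, 3, 5), None),                # not yet patched
]

print(median_patch_days(events))  # 5.0
```

The median is used instead of the mean so that a single slow patch does not dominate a vendor's score.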
Layer 3 — Customer telemetry
Forward-looking signal that beats earnings by 90+ days. Three sources at no incremental cost to start:
1. Earnings call transcripts (FMP, already paid) — grep top-200 customer-side companies' transcripts for vendor mentions. When 3 Fortune 500 banks mention CRWD in the same quarter, that's a leading indicator.
2. CISA KEV + NVD CVE data (free public APIs) — vendor's own CVE count, patch time, threat-research-output volume.
3. USAspending federal contracts (free public API) — federal cyber spend by vendor; large lumpy awards land before revenue. DoD spends ~$15B/yr.
Status: planned, not built. ~2.5 days of work for all three.
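The transcript signal reduces to counting how many distinct customers mention a vendor in a given quarter and flagging the vendor when that count crosses a threshold. A minimal sketch under stated assumptions: transcripts are already loaded as dicts with `customer`, `quarter`, and `text` keys (hypothetical field names), the alias table is illustrative, and the threshold of 3 comes from the bank example in the first source above:

```python
import re
from collections import defaultdict

# Illustrative alias table; real coverage would need product names too.
VENDOR_ALIASES = {
    "CRWD": ["CrowdStrike"],
    "PANW": ["Palo Alto Networks"],
}


def vendor_mentions(transcripts, aliases=VENDOR_ALIASES):
    """Map (vendor, quarter) -> set of customers mentioning the vendor."""
    patterns = {
        vendor: re.compile("|".join(re.escape(n) for n in names), re.IGNORECASE)
        for vendor, names in aliases.items()
    }
    seen = defaultdict(set)
    for t in transcripts:
        for vendor, pattern in patterns.items():
            if pattern.search(t["text"]):
                seen[(vendor, t["quarter"])].add(t["customer"])
    return seen


def flagged(seen, min_customers=3):
    """Return (vendor, quarter) keys with at least min_customers mentions."""
    return sorted(k for k, customers in seen.items() if len(customers) >= min_customers)


# Made-up transcripts; BANK_A etc. are placeholders, not real companies.
transcripts = [
    {"customer": "BANK_A", "quarter": "2025Q3", "text": "We expanded our CrowdStrike deployment."},
    {"customer": "BANK_B", "quarter": "2025Q3", "text": "Endpoint moved to CrowdStrike this year."},
    {"customer": "BANK_C", "quarter": "2025Q3", "text": "We consolidated on crowdstrike and Palo Alto Networks."},
    {"customer": "BANK_A", "quarter": "2025Q2", "text": "No vendor changes."},
]

print(flagged(vendor_mentions(transcripts)))  # [('CRWD', '2025Q3')]
```

Counting distinct customers, not raw mentions, keeps one talkative management team from triggering the flag on its own.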
Layer 4 — Quantitative edges screens miss
- NRR (Net Revenue Retention) — only metric that actually matters for SaaS-shaped businesses; companies disclose it, but no screener captures it.
- SBC drag — stock-based comp as % of revenue. Many cyber names look profitable until adjusted.
- FCF / Revenue ratio — separates real businesses from accounting illusions.
- Customer concentration — risk most screens hide.
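The four metrics above are simple ratios once the inputs are pulled from filings. A minimal sketch with made-up numbers. The function names are hypothetical, and the NRR formula shown is one common definition (companies define and disclose NRR differently, so reported figures may not be directly comparable):

```python
def sbc_pct_of_revenue(sbc, revenue):
    """Stock-based compensation as a fraction of revenue."""
    return sbc / revenue


def fcf_margin(operating_cash_flow, capex, revenue):
    """Free cash flow (operating cash flow minus capex) over revenue."""
    return (operating_cash_flow - capex) / revenue


def sbc_adjusted_fcf_margin(operating_cash_flow, capex, sbc, revenue):
    """FCF margin after treating SBC as if it were a cash expense."""
    return (operating_cash_flow - capex - sbc) / revenue


def net_revenue_retention(start_arr, expansion, contraction, churn):
    """One common NRR definition for a fixed customer cohort (assumption)."""
    return (start_arr + expansion - contraction - churn) / start_arr


def top_customer_share(customer_revenues, n=1):
    """Share of total revenue contributed by the n largest customers."""
    ranked = sorted(customer_revenues, reverse=True)
    return sum(ranked[:n]) / sum(ranked)


# Illustrative figures only (say, $M), not taken from any real company.
revenue, sbc, ocf, capex = 1000.0, 250.0, 300.0, 50.0

print(sbc_pct_of_revenue(sbc, revenue))                 # 0.25
print(fcf_margin(ocf, capex, revenue))                  # 0.25
print(sbc_adjusted_fcf_margin(ocf, capex, sbc, revenue))  # 0.0
print(net_revenue_retention(100.0, 25.0, 5.0, 5.0))     # 1.15
print(top_customer_share([400.0, 300.0, 200.0, 100.0], n=1))  # 0.4
```

In this made-up case a 25% FCF margin drops to zero once SBC is treated as a cost, which is the "look profitable until adjusted" pattern the SBC bullet describes.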
Down-selection process
Universe is the starting point, not the deliverable. Apply value-investing rigor (Section B criteria from the strategy) to surface 2-3 stand-outs:
1. Consistent market-share gains
2. Fair-to-affordable valuation
3. Uniquely positioned for the AI-attack era (this is where Layer 2 above feeds in)
4. Nimble specialist vs. platform incumbent — decided per bucket
5. Real fundamentals (FCF positive or near-term path)
The combination that's hardest to find: high AI-leadership × reasonable valuation × already drawn-down. That's what we're hunting for.
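The target combination can be expressed as a three-way filter. A minimal sketch, assuming each candidate already carries an AI-leadership score, a valuation multiple, and a drawdown from its high. The field names, the thresholds, and the sample tickers are all hypothetical placeholders, and the real criteria would be set per bucket:

```python
def shortlist(candidates, min_ai_score=0.7, max_ev_sales=8.0,
              min_drawdown=0.30, top_n=3):
    """Keep names that clear all three bars, ranked by AI-leadership score."""
    picks = [
        c for c in candidates
        if c["ai_score"] >= min_ai_score      # high AI leadership
        and c["ev_sales"] <= max_ev_sales     # reasonable valuation
        and c["drawdown"] >= min_drawdown     # already drawn down
    ]
    picks.sort(key=lambda c: c["ai_score"], reverse=True)
    return [c["ticker"] for c in picks[:top_n]]


# Made-up candidates; tickers and values are placeholders.
candidates = [
    {"ticker": "AAA", "ai_score": 0.90, "ev_sales": 6.0, "drawdown": 0.40},
    {"ticker": "BBB", "ai_score": 0.95, "ev_sales": 15.0, "drawdown": 0.10},
    {"ticker": "CCC", "ai_score": 0.50, "ev_sales": 4.0, "drawdown": 0.50},
    {"ticker": "DDD", "ai_score": 0.75, "ev_sales": 7.5, "drawdown": 0.35},
]

print(shortlist(candidates))  # ['AAA', 'DDD']
```

Here `BBB` has the strongest AI score but fails on valuation and drawdown, which illustrates why the full combination is rarer than any single criterion.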
Pipeline workflow — using existing infrastructure
We have the 18-step deep-analysis pipeline already (drives company-detail pages). For thematic candidates:
1. Triage: each candidate gets a fast deep / watch / skip verdict before we commit to a full analysis.
2. Deep dive: names that clear triage run the full 18-step pipeline (foundation → valuation → signals → final synthesis).
3. Surface: results auto-appear on company-detail pages. We can link from this tree's per-ticker nodes (when we add them) directly to the analysis.
Cost: triage is cheap (sub-$1/ticker). Full pipeline is expensive (~$2-5/ticker at full tier). Triage first, then deep-dive only the qualifying names.
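The saving from triaging first is simple arithmetic. A minimal sketch using the upper ends of the per-ticker costs quoted above ($1 for triage, $5 for the full pipeline). The candidate count of 25 and the 5 names clearing triage are assumed numbers for illustration:

```python
def pipeline_cost(n_candidates, n_deep, triage_cost, full_cost):
    """Compare triage-first spend against running the full pipeline on everything.

    Returns (triage_first_total, full_on_all_total) in dollars.
    """
    triage_first = n_candidates * triage_cost + n_deep * full_cost
    full_on_all = n_candidates * full_cost
    return triage_first, full_on_all


# Assumed counts: 25 candidates, 5 of which earn a deep verdict.
triage_first, full_on_all = pipeline_cost(25, 5, triage_cost=1.0, full_cost=5.0)
print(triage_first, full_on_all)  # 50.0 125.0
```

Under these assumptions triage-first costs $50 against $125. The gap narrows as a larger share of candidates clears triage, and triage stops paying for itself once that share exceeds 1 − triage_cost / full_cost (80% with these figures).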
Intel sources (substacks + sites we monitor)
The cybersecurity-investing community has converged on a small set of high-signal publications. These are where new names + thesis-aligned thinking surface before sell-side picks them up:
| Source | Author | Why it matters |
|---|---|---|
| Software Analyst Cyber Research (SACR) | Francis Odum | Single best deep-research source on cyber investing. Tracks AI/cyber intersection, VC, M&A. |
| Venture in Security | Ross Haleliuk | Cyber business models, key players (public + private). Runs a syndicate. |
| Resilient Cyber | Chris Hughes | Practitioner-side perspective on the actual security stack. |
| Strategy of Security | — | Deep dives on specific cyber companies (e.g. SailPoint S-1 breakdown). |
| Mostly Metrics | — | S-1 / financial breakdowns of newly public companies. |
| Help Net Security | — | Daily cyber news + M&A flow. |
Known gaps (with-budget upgrades)
Things we'd add if data budget grew:
| Gap | Tool | Cost | Edge added |
|---|---|---|---|
| Job-posting / vendor-skill tracking at scale | Coresignal or Revelio Labs | $1-2k/mo | Real-time deployment-growth proxy |
| Expert call transcripts | Tegus or AlphaSense | $20-50k/yr | Closest thing to ground truth on customer behavior |
| LinkedIn workforce data | Same as above (LinkedIn locked) | Same | Hiring trends as growth signal |
| Real-time CVE / threat feed | Recorded Future / Dragos | $$$$ | Catches breach narratives before headlines |
What changes when (update cadence)
- Universe (sibling node 01) — refresh the data roughly weekly. Add new names as they surface.
- This node — update when methodology shifts (new layer added, signal proven, paid data feed turned on, etc.). Don't update for every data refresh.
- Per-name nodes (future) — created under this tree when a candidate makes it to the deep verdict. Will hold full thesis writeup + pipeline output link.