Research Strategy & Sources

Updated May 26, 2026




How we're approaching the IT/web-security thematic bet — separated from the candidate universe so the methodology can evolve independently of the data.

Two-node split — why

  • 01 · Initial Research List — the what. The universe of names, with current data, refreshed periodically.
  • 02 · Research Strategy & Sources (this node) — the how and where. Methodology, intel sources, pipeline plan, known gaps.
When a new name surfaces (e.g. via a SACR post or a fresh IPO), the universe gets updated. When the process changes (we add a new screening layer, swap in a paid data feed, etc.), this node updates. Different change cadences.


How the candidate universe was built (in layers)

Ranked by edge contribution, lowest to highest:

Layer | Done? | Edge | Notes
1. Training-data recall | Yes | None | Got the obvious 18 from memory. Common knowledge, no advantage.
2. Targeted web search | Yes | Some | Surfaced NTSK (Sept-2025 IPO) and SAIL (Feb-2025 re-IPO), names ETFs hadn't fully absorbed yet. Plus NTCT, CVLT, CLBT.
3. ETF holdings cross-ref | No | Low | CIBR/HACK/BUG holdings would catch 80% of $200M+ pure-plays. Confirms what we have, may add 1-2 we missed.
4. FMP screener pass | No | Medium | Filter Technology → Application/Systems Software → grep description for security keywords. Catches names we don't know exist.
5. Specialist substack reading | No | High (ongoing) | The single best ongoing source: practitioners + investors writing weekly on the AI-cyber thesis. See Intel Sources below.
Honest assessment: layers 1-2 are done; the universe today is common knowledge plus 5 names from targeted search, two of them recent IPOs. That's the floor, not the ceiling. Real edge is in layers 3-5.
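
The layer-4 screener pass could look roughly like the sketch below. It assumes company profiles have already been pulled into memory; the `Profile` fields, the industry labels, and the keyword list are placeholders for illustration, not the actual FMP schema, so they would need mapping to whatever the feed really returns.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword list; tune it against the known universe before trusting it.
SECURITY_KEYWORDS = ("cybersecurity", "security", "threat", "endpoint",
                     "identity", "firewall", "zero trust")
# Placeholder industry labels; the real feed may name these differently.
TARGET_INDUSTRIES = {"Application Software", "Systems Software"}

_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in SECURITY_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


@dataclass
class Profile:
    symbol: str
    sector: str
    industry: str
    description: str


def screen(profiles, known):
    """Return symbols that pass sector/industry filters, mention a security
    keyword in their description, and are not already in the universe."""
    hits = []
    for p in profiles:
        if p.sector != "Technology" or p.industry not in TARGET_INDUSTRIES:
            continue
        if p.symbol in known:
            continue  # already in the candidate universe
        if _PATTERN.search(p.description):
            hits.append(p.symbol)
    return hits
```

Passing the current universe as `known` means the output is only the names we don't already track, which is the point of this layer.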


Where actual edge could come from

The thesis itself ("AI accelerates cyber attack → cyber demand structurally up") has been crowded for 2+ years. ETFs already encode it. Sell-side covers every $1B+ name. So the edge can't be the thesis — it has to be in how we operationalize it.

Layer 1 — Underfollowed names ETFs miss

CIBR has a $200M market cap floor. ETFs weight by index rules and passive flows, so small caps get missed. Pricing inefficiency lives here. Status: partially captured (NTSK, SAIL added).

Layer 2 — AI-defense leadership signals (the X-factor)

The thesis demands AI-driven defense, but no screen captures who's actually leading. Specific signals:
  • Threat research output — Falcon OverWatch, Unit 42, Mandiant. Frequency × depth × originality.
  • AI-product velocity — count + depth of AI-feature launches per quarter. CRWD shipped Charlotte AI; PANW shipped Precision AI; many others are slideware.
  • CVE response time — public CVE → patch ship date.
  • Patent activity — USPTO API is free; ML/anomaly-detection filings.
  • Conference presence — Black Hat / DEF CON / RSA accepted-talk count per vendor.
Status: not yet built. Highest leverage / least cost. Recommended first feature.
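
One way the signals above might be combined is a simple rank average, sketched below. This is an assumption about how the feature could work, not a settled design: the signal names, the equal weighting, and the tie handling are all hypothetical, and the inputs are expected to be collected elsewhere.

```python
from datetime import date
from statistics import median


def median_patch_lag_days(events):
    """events: iterable of (cve_published, patch_shipped) date pairs."""
    return median((shipped - published).days for published, shipped in events)


def composite_rank(signals, lower_is_better=frozenset()):
    """signals: {signal_name: {vendor: value}}, same vendors in every signal.

    Returns {vendor: mean rank across signals}, where 1.0 is best.
    Ties are broken alphabetically by vendor name (a simplification).
    """
    vendors = sorted(next(iter(signals.values())))
    totals = {v: 0.0 for v in vendors}
    for name, values in signals.items():
        # Higher is better unless the signal is listed in lower_is_better.
        descending = name not in lower_is_better
        ordered = sorted(vendors, key=lambda v: values[v], reverse=descending)
        for rank, vendor in enumerate(ordered, start=1):
            totals[vendor] += rank
    return {v: totals[v] / len(signals) for v in vendors}
```

Rank averaging avoids having to normalize signals with very different units (talk counts, patent filings, days to patch), at the cost of ignoring how large the gaps between vendors are.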

Layer 3 — Customer telemetry

Forward-looking signal that beats earnings by 90+ days. Three free sources to start:

  1. Earnings call transcripts (FMP, already paid): grep top-200 customer-side companies' transcripts for vendor mentions. When 3 Fortune 500 banks mention CRWD in the same quarter, that's a leading indicator.
  2. CISA KEV + NVD CVE data (free public APIs): vendor's own CVE count, patch time, threat-research-output volume.
  3. USAspending federal contracts (free public API): federal cyber spend by vendor; large lumpy awards land before revenue. DoD spends ~$15B/yr.

Status: planned, not built. ~2.5 days of work for all three.
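
The transcript source could be prototyped along the lines below. It is a sketch that assumes transcripts are already downloaded as `(customer, quarter, text)` tuples; the alias map and the three-customer threshold are illustrative choices, and plain substring-style matching like this will miss indirect references.

```python
import re
from collections import defaultdict

# Illustrative alias map; real coverage would need product names and variants.
VENDOR_ALIASES = {
    "CRWD": ("CrowdStrike",),
    "PANW": ("Palo Alto Networks",),
}


def vendor_mentions(transcripts, aliases=VENDOR_ALIASES):
    """transcripts: iterable of (customer, quarter, text).

    Returns {(vendor_ticker, quarter): set of customers mentioning the vendor}.
    """
    patterns = {
        ticker: re.compile(
            r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b",
            re.IGNORECASE,
        )
        for ticker, names in aliases.items()
    }
    seen = defaultdict(set)
    for customer, quarter, text in transcripts:
        for ticker, pattern in patterns.items():
            if pattern.search(text):
                seen[(ticker, quarter)].add(customer)
    return dict(seen)


def flagged(seen, min_customers=3):
    """Vendor/quarter pairs mentioned by at least min_customers customers."""
    return sorted(key for key, customers in seen.items()
                  if len(customers) >= min_customers)
```

Counting distinct customers rather than raw mentions keeps one talkative management team from triggering the flag on its own.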

Layer 4 — Quantitative edges screens miss

  • NRR (Net Revenue Retention) — only metric that actually matters for SaaS-shaped businesses; companies disclose it, but no screener captures it.
  • SBC drag — stock-based comp as % of revenue. Many cyber names look profitable until adjusted.
  • FCF / Revenue ratio — separates real businesses from accounting illusions.
  • Customer concentration — risk most screens hide.
Status: most are derivable from existing FMP fundamentals — modest extension to the candidates table.
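
As a rough illustration of the extension, the ratios above reduce to a few lines of arithmetic. The function and field names here are hypothetical, capex is assumed to be passed as a positive number, and NRR uses the common definition (starting ARR plus expansion, minus contraction and churn, over starting ARR), which individual companies may compute differently.

```python
def quality_metrics(revenue, operating_cash_flow, capex, sbc):
    """Return SBC drag and FCF ratios for one period.

    capex is a positive outflow; all inputs share the same currency units.
    """
    fcf = operating_cash_flow - capex
    return {
        "sbc_pct_revenue": sbc / revenue,
        "fcf_margin": fcf / revenue,
        # FCF after treating stock-based comp as a real cost.
        "sbc_adjusted_fcf_margin": (fcf - sbc) / revenue,
    }


def net_revenue_retention(start_arr, expansion, contraction, churn):
    """NRR for a cohort: ending ARR from existing customers / starting ARR."""
    return (start_arr + expansion - contraction - churn) / start_arr
```

For example, a company with 1,000 of revenue, 300 of operating cash flow, 50 of capex, and 200 of SBC shows a 25% FCF margin that drops to 5% once SBC is subtracted.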


Down-selection process

Universe is the starting point, not the deliverable. Apply value-investing rigor (Section B criteria from the strategy) to surface 2-3 stand-outs:

  1. Consistent market-share gains
  2. Fair-to-affordable valuation
  3. Uniquely positioned for the AI-attack era (this is where Layer 2 above feeds in)
  4. Nimble specialist vs. platform incumbent, decided per bucket
  5. Real fundamentals (FCF positive or near-term path)

The combination that's hardest to find: high AI-leadership × reasonable valuation × already drawn-down. That's what we're hunting for.
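
The three-way combination can be expressed as a plain filter, sketched below. Every threshold is a hypothetical placeholder rather than a calibrated cutoff, the AI-leadership score is assumed to be pre-scaled to 0..1, and EV/sales stands in for whatever valuation measure the strategy actually uses.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    symbol: str
    ai_leadership: float  # assumed 0..1 score, higher is better
    ev_to_sales: float    # stand-in valuation multiple
    drawdown: float       # fraction below recent high, e.g. 0.30 = 30% off


def standouts(candidates, min_ai=0.7, max_ev_sales=8.0,
              min_drawdown=0.25, top_n=3):
    """Keep names that clear all three bars, best AI-leadership first."""
    passing = [
        c for c in candidates
        if c.ai_leadership >= min_ai
        and c.ev_to_sales <= max_ev_sales
        and c.drawdown >= min_drawdown
    ]
    passing.sort(key=lambda c: c.ai_leadership, reverse=True)
    return [c.symbol for c in passing[:top_n]]
```

Requiring all three conditions at once is deliberately strict; an empty result is informative and argues for waiting rather than loosening the bars.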


Pipeline workflow — using existing infrastructure

We have the 18-step deep-analysis pipeline already (drives company-detail pages). For thematic candidates:

  1. Triage: each candidate gets a fast deep / watch / skip verdict before we commit to a full analysis.
  2. Deep dive: names that clear triage run the full 18-step pipeline (foundation → valuation → signals → final synthesis).
  3. Surface: results auto-appear on company-detail pages. We can link from this tree's per-ticker nodes (when we add them) directly to the analysis.

Cost: triage is cheap (sub-$1/ticker). Full pipeline is expensive (~$2-5/ticker at full tier). Triage first, then deep-dive only the qualifying names.
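
A back-of-envelope budget for the triage-then-deep-dive flow, using the per-ticker figures above. The pass rate is a pure assumption, and $1 is used as the upper bound for "sub-$1" triage.

```python
def pipeline_cost(n_candidates, pass_rate, triage_cost=1.0,
                  deep_cost=(2.0, 5.0)):
    """Return (n_deep, low_total, high_total) in dollars.

    pass_rate is the assumed share of candidates receiving a "deep" verdict.
    """
    n_deep = round(n_candidates * pass_rate)
    triage_total = n_candidates * triage_cost
    low = triage_total + n_deep * deep_cost[0]
    high = triage_total + n_deep * deep_cost[1]
    return n_deep, low, high
```

With the current 23 names (18 from recall plus 5 from search) and a hypothetical 30% pass rate, `pipeline_cost(23, 0.3)` gives 7 deep dives and a total between $37 and $58, against $69 to $138 for running the full pipeline on every name.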


Intel sources (substacks + sites we monitor)

The cybersecurity-investing community has converged on a small set of high-signal publications. These are where new names + thesis-aligned thinking surface before sell-side picks them up:

Source | Author | Why it matters
Software Analyst Cyber Research (SACR) | Francis Odum | Single best deep-research source on cyber investing. Tracks AI/cyber intersection, VC, M&A.
Venture in Security | Ross Haleliuk | Cyber business models, key players (public + private). Runs a syndicate.
Resilient Cyber | Chris Hughes | Practitioner-side perspective on the actual security stack.
Strategy of Security | (not listed) | Deep dives on specific cyber companies (e.g. SailPoint S-1 breakdown).
Mostly Metrics | (not listed) | S-1 / financial breakdowns of newly public companies.
Help Net Security | (not listed) | Daily cyber news + M&A flow.
These are the people doing the same thesis work we're doing — often with better access (sources, S-1 prep, VC dealflow). Reading them weekly is the cheapest edge available.


Known gaps (with-budget upgrades)

Things we'd add if data budget grew:

Gap | Tool | Cost | Edge added
Job-posting / vendor-skill tracking at scale | Coresignal or Revelio Labs | $1-2k/mo | Real-time deployment-growth proxy
Expert call transcripts | Tegus or AlphaSense | $20-50k/yr | Closest thing to ground truth on customer behavior
LinkedIn workforce data | Same as above (LinkedIn locked) | Same | Hiring trends as growth signal
Real-time CVE / threat feed | Recorded Future / Dragos | $$$$ | Catches breach narratives before headlines
None of these are needed to start. The plan is to ship Layer 2 + Layer 3 telemetry from free sources first, see if signal exists, then upgrade.


What changes when (update cadence)

  • Universe (sibling node 01) — refresh the data roughly weekly. Add new names as they surface.
  • This node — update when methodology shifts (new layer added, signal proven, paid data feed turned on, etc.). Don't update for every data refresh.
  • Per-name nodes (future) — created under this tree when a candidate makes it to the deep verdict. Will hold full thesis writeup + pipeline output link.