
If you work with investors, you already know the pain: every fund has a different website, every portfolio page a different layout. Manually copying those companies into Google Sheets turns one simple research task into an afternoon of tab-hopping. Learning how to route portfolio data straight into Sheets with Scaper gives you a live source of truth for sales, BD, and deal sourcing. Once that flow exists, delegating it to an AI agent means it runs quietly in the background while you focus on conversations, not copy-paste.
There are a few ways to get portfolio companies from investor sites into Google Sheets. The best approach depends on whether you are cleaning up a one-off list or building a repeatable, always-fresh pipeline.
Manual copy-paste is the classic approach. Open a fund’s portfolio page, scroll, copy each company name, URL, and description, and paste them into Google Sheets.
Pros: Zero setup, works everywhere, great for a handful of funds.
Cons: Painfully slow at scale, error-prone, and impossible to keep current as portfolios change.
If the portfolio page has a structured table or consistent HTML, you can use built-in functions:
- IMPORTHTML to pull tables or lists.
- IMPORTXML with XPath to grab specific elements like company names or links.
Pros: Still no code, updates when the page changes, good for static sites.
Cons: Breaks when layouts change, struggles with JavaScript-heavy pages, and can be fiddly to maintain across dozens of funds.
General-purpose web scrapers and browser extensions let you point and click to select portfolio rows, then export them to Google Sheets.
Pros: Faster to configure than pure formulas, more flexible for nested pages.
Cons: Still requires you to run them manually, and you become the bottleneck when teams need fresh data.
Here is where Simular-style agents shine. You teach an AI agent once: open your list of funds, launch Scaper or navigate with the browser, visit each portfolio page, extract key fields, standardize tags, and write everything into Google Sheets.
Pros: Runs on a schedule, adapts to small layout changes, scales from a few funds to hundreds, and frees humans for outreach and analysis.
Cons: Needs an initial setup and a short training loop so the agent understands your exact schema, naming rules, and edge cases.
The sweet spot for most agencies, sales teams, and founders is a hybrid: use Scaper plus an AI agent to handle the bulk of the grunt work, then do a quick human pass in Sheets for high-stakes accounts. That way your portfolio intelligence is always a few clicks away, never a weekend project.
At minimum, capture company name, website URL, short description, industry or category, HQ location, and the fund or firm that lists them. For sales or partnerships, add optional fields like funding stage, last funding year, and any tags related to your ideal customer profile. Design these columns first in Google Sheets so Scaper and your AI agent know exactly where each data point should land.
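As a rough illustration of designing the columns first, the schema above could be written down once and reused for both the header row and every data row. This is only a sketch: the field names (`company_name`, `icp_tags`, and so on) and the `PortfolioCompany` class are hypothetical choices, not something Scaper or Google Sheets prescribes.

```python
from dataclasses import astuple, dataclass, fields
from typing import List, Optional


@dataclass
class PortfolioCompany:
    # Required fields from the minimum schema.
    company_name: str
    website_url: str
    description: str
    category: str
    hq_location: str
    source_fund: str
    # Optional fields for sales or partnerships work.
    funding_stage: Optional[str] = None
    last_funding_year: Optional[int] = None
    icp_tags: str = ""


def header_row() -> List[str]:
    """Column names in the order they should appear in the sheet."""
    return [f.name for f in fields(PortfolioCompany)]


def to_row(company: PortfolioCompany) -> List[str]:
    """Flatten one company into a list of cell values; missing values become empty cells."""
    return ["" if value is None else str(value) for value in astuple(company)]
```

Keeping the column order in one place means the header and the rows cannot drift apart when you add a field later.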
For multi-page portfolios, define the navigation pattern once for your AI agent: open the portfolio, click Next or Load more until the end, and only then move to the next fund. In Google Sheets, keep a column for source URL so you can trace each company back. If the site offers filters, run each filter separately (for example, SaaS or Europe-only) and store those filter values as tags in a dedicated column for later segmentation.
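The navigation pattern can be sketched as a simple loop. The example below assumes a hypothetical `fetch_page` callable that returns the rows found on one page plus the URL of the next page (or `None` at the end); how that callable actually loads a page, whether through Scaper, a browser agent, or something else, is left open.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (rows found on the page, URL of the next page or None when finished)
Page = Tuple[List[Dict[str, str]], Optional[str]]


def collect_portfolio(
    start_url: str,
    fetch_page: Callable[[str], Page],
    filter_tags: str = "",
    max_pages: int = 200,
) -> List[Dict[str, str]]:
    """Follow next-page links until the end, tagging every row with its source."""
    rows: List[Dict[str, str]] = []
    seen = set()
    url: Optional[str] = start_url
    # Stop on the last page, on a repeated URL (loop), or at the page cap.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page_rows, next_url = fetch_page(url)
        for row in page_rows:
            rows.append({**row, "source_url": url, "filter_tags": filter_tags})
        url = next_url
    return rows
```

Calling it once per filter, for example with `filter_tags="SaaS"` and again with `filter_tags="Europe"`, keeps each run traceable in the sheet.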
Google Sheets formulas like IMPORTHTML and IMPORTXML work well for simple, static portfolio tables. Start by pasting the portfolio URL in a cell such as A1, then use IMPORTHTML for tables or lists, for example =IMPORTHTML(A1, "table", 1), or IMPORTXML with the correct XPath to grab names and links, for example =IMPORTXML(A1, "//a/@href"). However, these formulas break on JavaScript-loaded content and after layout changes, and they cannot click through pagination or detail pages. That is where browser tools and AI agents offer more resilience and control.
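For static pages, the same idea as an IMPORTXML link query can be sketched outside Sheets with Python's standard-library HTML parser. The markup in the usage note is made up for illustration; a real portfolio page will need its own selection logic, and like the formulas, this only sees the HTML as delivered, not content added later by JavaScript.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class LinkExtractor(HTMLParser):
    """Collect (link text, href) pairs for every <a> tag that has an href."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html: str) -> List[Tuple[str, str]]:
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links
```

For instance, `extract_links('<li><a href="https://acme.example">Acme</a></li>')` yields one name and URL pair that can be written to a row.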
For active outreach, weekly or biweekly refreshes are ideal, since new investments and exits can shift priorities. Use your AI agent to schedule recurring runs that re-scrape target portfolios and append or update rows in Google Sheets. You can add a last_seen date column and have the agent update it each run, so you can easily filter for newly added or no-longer-listed companies without rebuilding the whole sheet.
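The last_seen bookkeeping can be sketched as a small merge step. The function name and the dictionary layout (rows keyed by URL) are assumptions for illustration; an agent writing directly to Sheets would apply the same logic to rows in the sheet.

```python
from datetime import date
from typing import Dict, Iterable, List, Tuple

Rows = Dict[str, Dict[str, str]]  # keyed by company URL


def refresh_last_seen(
    existing: Rows, scraped_urls: Iterable[str], run_date: date
) -> Tuple[Rows, List[str], List[str]]:
    """Return updated rows, newly added URLs, and URLs not seen in this run."""
    today = run_date.isoformat()
    # Copy so the caller's data is not modified in place.
    updated: Rows = {url: dict(row) for url, row in existing.items()}
    added: List[str] = []
    for url in scraped_urls:
        if url not in updated:
            updated[url] = {"website_url": url}
            added.append(url)
        updated[url]["last_seen"] = today
    # Anything whose last_seen was not bumped to today is no longer listed.
    missing = [url for url, row in updated.items() if row.get("last_seen") != today]
    return updated, added, missing
```

Filtering the sheet on last_seen equal to the latest run date then gives the current portfolio, and older dates show companies that dropped off.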
Use the standardized company URL as your unique key. In Google Sheets, use UNIQUE to list distinct domains, or COUNTIF or a custom formula to flag duplicates when the same domain appears from multiple funds. Ask your AI agent to normalize domains by stripping tracking parameters and www prefixes before writing rows. Keep a master Companies sheet, then let agents write new data to a Staging sheet where you or a simple script can merge, dedupe, and approve changes.
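A minimal sketch of the normalize-then-dedupe step might look like the following. It reduces each URL to its bare host, which drops tracking parameters and paths along with the www prefix; that assumes one company per domain, which will not hold for companies hosted on a shared platform. The function names and row keys are hypothetical.

```python
from typing import Dict, List
from urllib.parse import urlsplit


def normalize_domain(url: str) -> str:
    """Reduce a URL to a lowercase host without scheme, path, query, or www prefix."""
    url = url.strip()
    if "://" not in url:
        url = "https://" + url  # urlsplit needs a scheme to find the host
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host


def merge_by_domain(rows: List[Dict[str, str]]) -> List[dict]:
    """Collapse rows that share a domain, keeping every fund that lists the company."""
    merged: Dict[str, dict] = {}
    for row in rows:
        key = normalize_domain(row["website_url"])
        if not key:
            continue  # skip rows without a usable URL
        if key not in merged:
            merged[key] = {
                "domain": key,
                "company_name": row["company_name"],
                "source_funds": [row["source_fund"]],
            }
        elif row["source_fund"] not in merged[key]["source_funds"]:
            merged[key]["source_funds"].append(row["source_fund"])
    return list(merged.values())
```

Running this over the Staging rows before they reach the master Companies sheet leaves one row per domain, with the first name seen kept and all listing funds collected.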