Free company search database
Use free_simple_company_search when you need a free, SQL-readable company corpus for exact company resolution, bounded list building, and quick segment checks.
It is a Deepline Native tool backed by a shared Snowflake companies table. It is not a live provider API call. The current corpus is loaded from a raw People Data Labs company CSV snapshot, transformed into Deepline’s normalized schema, quality-checked, and then exposed as the read-only companies table.
Deepline does not charge credits for free_simple_company_search. Broad
queries can still time out, so treat it as a free database with guardrails,
not an unlimited scanner.
When to use it
Use it for:
- Exact domain lookup when you have websites or normalized domains.
- Small exact batches by domain, LinkedIn company URL, or company name.
- Prefix candidate generation, such as
company_name ILIKE 'acme%'.
- Bounded ICP pulls where industry, location, size, or founding year are enough.
- Quick exploratory counts before deciding whether to use paid company search.
Use a provider-native company search instead when you need live web coverage, advanced facets, funding or investor filters, hiring signals, strict market sizing totals, or semantic discovery by what a company does.
Source
The production loader expects the raw PDL company CSV header:
country,founded,id,industry,linkedin_url,locality,name,region,size,website
Deepline transforms that source into the query table:
| Source field | Deepline column |
|---|
id | source_record_id |
website | domain, normalized_domain |
name | company_name |
industry | industry |
locality, region, country | location |
linkedin_url | linkedin_url |
size | employee_count |
founded | year_founded |
updated_at is set during Deepline’s table load. It should be read as import/update time for this corpus, not as proof that PDL recently observed the company.
Schema
Query the table as companies. public.companies also works for compatibility.
| Column | Type | Notes |
|---|
source_record_id | text | Source row id from the snapshot. |
normalized_domain | text | Lowercase root domain with protocol, www., path, and trailing dot removed. Fastest lookup key when you have a domain. |
domain | text | Original website/domain from the source. |
company_name | text | Company name. Required in the loaded table. |
industry | text | Source industry label, for example Computer Software. |
location | text | Comma-separated locality, region, country. |
linkedin_url | text | LinkedIn company URL or slug from the source. |
employee_count | integer | Upper bound of the source size bucket: 10, 50, 200, 500, 1000, 5000, 10000, or 10001 for 10000+. |
year_founded | integer | Source founding year when present. |
updated_at | timestamp | Deepline load/update timestamp. |
Query from the CLI
Inspect the live tool schema first:
deepline tools get free_simple_company_search
Exact domain lookup:
PAYLOAD=$(cat <<'JSON'
{
"sql": "SELECT normalized_domain, domain, company_name, industry, location, employee_count, year_founded FROM companies WHERE normalized_domain = 'openai.com' LIMIT 5"
}
JSON
)
deepline tools execute free_simple_company_search --payload "$PAYLOAD"
Use the same payload shape for other SQL queries.
Batch domain lookup SQL:
SELECT normalized_domain, company_name, linkedin_url
FROM companies
WHERE normalized_domain IN ('openai.com', 'anthropic.com', 'vercel.com')
LIMIT 10
LinkedIn company URL lookup SQL:
SELECT company_name, normalized_domain, linkedin_url
FROM companies
WHERE linkedin_url IN ('linkedin.com/company/openai')
LIMIT 5
Prefix fallback SQL when exact lookup misses:
SELECT company_name, normalized_domain, location, employee_count
FROM companies
WHERE company_name ILIKE 'acme%'
LIMIT 25
Bounded segment count SQL:
SELECT COUNT(*) AS company_count
FROM companies
WHERE industry ILIKE '%software%'
AND employee_count >= 50
LIMIT 1
Bounded company pull SQL:
SELECT normalized_domain, company_name, industry, location, employee_count, year_founded
FROM companies
WHERE location ILIKE 'san francisco%'
AND year_founded >= 2020
ORDER BY employee_count DESC
LIMIT 100
Query from the API
Use the normal tool execution endpoint:
curl -X POST "https://code.deepline.com/api/v2/integrations/free_simple_company_search/execute" \
-H "Authorization: Bearer $DEEPLINE_API_KEY" \
-H "Content-Type: application/json" \
--data-binary @- <<'JSON'
{
"payload": {
"sql": "SELECT normalized_domain, company_name FROM companies WHERE normalized_domain = 'openai.com' LIMIT 5"
}
}
JSON
The response wraps the SQL result in data:
{
"data": {
"schema": "public.companies",
"command": "SELECT",
"row_count_returned": 1,
"truncated": false,
"columns": [
{ "name": "company_name", "table_id": null, "data_type_id": null }
],
"sql": "SELECT company_name FROM companies WHERE normalized_domain = 'openai.com' LIMIT 5",
"executed_sql": "SELECT company_name FROM COMPANIES WHERE normalized_domain = 'openai.com' LIMIT 5",
"limit": 5,
"rows": [{ "company_name": "OpenAI" }]
}
}
SQL rules and limits
- SQL must be a single statement.
- Only
SELECT, WITH, or EXPLAIN statements are allowed.
- The statement must read from
companies or public.companies.
- A top-level
LIMIT is required and must be 100000 or less.
- Queries run with a Snowflake timeout. Broad
ILIKE '%keyword%', long OR chains, large country-wide scans, and expensive GROUP BY shapes can time out.
- Workspaces are currently rate-limited to 5 requests per second.
- Result limits above
5000 return CSV data in rows_csv; rows is empty for that path.
free_simple_company_search is for direct CLI/API use. It is not supported inside workflow executor runs.
Save matching companies to your workspace database
If you want to turn a free-company query into rows in your workspace customer database, use materialize_free_customer_companies. It runs a free-company SQL query and upserts row-shaped results into enrichments.companies.
The query must return only these company columns:
domain, company_name, industry, location, linkedin_url, employee_count, year_founded
Example:
PAYLOAD=$(cat <<'JSON'
{
"sql": "SELECT normalized_domain AS domain, company_name, industry, location, linkedin_url, employee_count, year_founded FROM companies WHERE industry ILIKE '%software%' AND employee_count >= 50 LIMIT 1000"
}
JSON
)
deepline tools execute materialize_free_customer_companies --payload "$PAYLOAD"
Then query the saved rows with query_customer_db:
deepline tools execute query_customer_db --payload '{
"sql": "SELECT domain, company_name, industry, employee_count FROM enrichments.companies ORDER BY updated_at DESC LIMIT 50"
}'
TAM builder guidance
For TAM work, start here when the ICP can be expressed with exact domains, company names, LinkedIn company URLs, industry, location, employee count, or founding year. Escalate to Dropleads, Apollo, Crustdata, Exa, or known-source extraction when the needed signal is not in this schema.