Apify Tutorial

How to Scrape and Structure Web Data with Apify

Apify gives you two paths into web scraping: running a ready-made “Actor” from its marketplace with no code, or building and deploying your own. This tutorial covers both — starting with running an existing Actor to get usable, structured data quickly, then outlining what’s involved if you later want to build a custom scraper.

What this tutorial covers: finding and running a pre-built Actor from the Apify Store, configuring inputs, retrieving structured output, and a brief overview of building your own Actor with the Apify CLI.

Prerequisites:

An Apify account (the free tier includes monthly platform credits sufficient for small-to-moderate scraping jobs)
A target website or platform you want data from (e.g., Google Maps, LinkedIn, Amazon, or a general website)
For custom Actor development only: Node.js or Python installed, and the Apify CLI

For how Apify compares to alternatives for data extraction, see our Apify vs. Bright Data comparison or the Data, Tools & Digital Infrastructure pillar page.

Step 1: Browse the Apify Store

From your dashboard, click Store in the left sidebar. Search by keyword for the type of data source you need — for example, “Google Maps scraper,” “LinkedIn profiles,” or “Amazon product data.” You can also filter results by category, price, or popularity.

Click on an Actor’s card to view its detail page, which includes a description, pricing model, input options, and example output. Reviewing the example output before running anything gives you a clear sense of whether the Actor returns the fields you actually need.

Step 2: Configure and Run the Actor

Click Try for free (or Start) on the Actor’s page. Most Actors ask for:

A target URL or list of URLs
Optional filters (e.g., date ranges, result limits, specific fields to extract)

For Actors like the AI Web Scraper, you can skip manual configuration entirely — paste the target URL, then describe what you want in plain language (e.g., “Extract all product names and prices”), and the Actor configures its own extraction logic based on that prompt.

Once configured, click Start (or Run) to begin. The Actor runs in Apify’s cloud, so you don’t need to keep your browser open or your computer running during the job.
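
The same run can also be started outside the Console through Apify’s HTTP API. The sketch below is a minimal illustration under stated assumptions, not the documented workflow of any particular Actor: the Actor ID `username~my-scraper` is hypothetical, and the input fields `startUrls` and `maxResults` are placeholders, since every Actor defines its own input names (check the Actor’s input tab for the real ones). It uses only the Python standard library.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts an Actor run.

    actor_id uses the "username~actor-name" form expected in API URLs.
    """
    url = f"{API_BASE}/acts/{urllib.parse.quote(actor_id, safe='~')}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def start_run(actor_id: str, token: str, run_input: dict) -> dict:
    """Send the request and return the run object from the response's "data" field."""
    with urllib.request.urlopen(build_run_request(actor_id, token, run_input)) as resp:
        return json.load(resp)["data"]


# Hypothetical input; the field names depend entirely on the Actor you run.
EXAMPLE_INPUT = {
    "startUrls": [{"url": "https://example.com"}],
    "maxResults": 10,
}
```

Calling `start_run("username~my-scraper", token, EXAMPLE_INPUT)` with a real Actor ID and API token returns a run object whose `defaultDatasetId` identifies where the results will be stored. Keeping the request builder separate from the network call makes the payload easy to inspect before anything is sent.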

Step 3: Retrieve Your Data

When the run completes, results are stored in an Apify Dataset, accessible from the run’s detail page. From there you can:

View results directly in a table format in the browser
Export to CSV, JSON, Excel, or other formats
Connect a webhook to push results automatically to Slack, Google Sheets, or another endpoint
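
The export options above are also reachable programmatically through the dataset items endpoint of Apify’s API. The following is a small sketch, assuming you already have a dataset ID (shown here as the placeholder `abc123`) and an API token; the helper names are illustrative and not part of any Apify library.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

API_BASE = "https://api.apify.com/v2"
# Export formats accepted by the dataset items endpoint.
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx", "xml", "html", "rss"}


def dataset_items_url(dataset_id: str, fmt: str = "json", limit: Optional[int] = None) -> str:
    """Return the URL that exports a dataset's items in the given format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt}")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = str(limit)
    return f"{API_BASE}/datasets/{dataset_id}/items?{urllib.parse.urlencode(params)}"


def fetch_items(dataset_id: str, token: str, limit: Optional[int] = None) -> list:
    """Download the dataset items as a list of dicts (JSON format)."""
    req = urllib.request.Request(
        dataset_items_url(dataset_id, "json", limit),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `dataset_items_url("abc123", "csv", limit=100)` produces a link that downloads the first 100 records as CSV, which is convenient for feeding a spreadsheet import.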

For ongoing data needs, Actors can be scheduled to run on a recurring basis, with new results appended to the dataset or sent via webhook each time.

Step 4 (Optional): Building Your Own Actor

If no existing Actor covers your use case, Apify supports building custom scrapers using the Crawlee framework (Node.js or Python). At a high level:

Install the Apify CLI (for example, with npm install -g apify-cli) and log in (apify login)
Scaffold a new project with apify create my-actor, choosing a starter template (e.g., a Crawlee Cheerio or Playwright template depending on whether the target site requires JavaScript rendering)
Write your scraping logic in the generated project
Test locally with apify run
Deploy to your account with apify push
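
To give a sense of what the scraping logic in the generated project can look like, here is a deliberately small sketch using the Apify SDK for Python. It is not the code any template produces: the input field `urls` is hypothetical, the page is fetched with the standard library instead of Crawlee to keep the example short, and only the page title is extracted.

```python
import urllib.request
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self._in_title = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    """Return the page title, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


async def main() -> None:
    # Imported here so extract_title() can be tried without the SDK installed.
    from apify import Actor

    async with Actor:
        actor_input = await Actor.get_input() or {}
        for url in actor_input.get("urls", []):  # "urls" is a hypothetical field
            # Blocking fetch: acceptable for a sketch, not for a large crawl.
            with urllib.request.urlopen(url) as resp:
                html = resp.read().decode("utf-8", errors="replace")
            # Each pushed dict becomes one record in the run's dataset.
            await Actor.push_data({"url": url, "title": extract_title(html)})
```

The coroutine would be started with `asyncio.run(main())`; generated Python templates usually supply that entry point themselves. For anything beyond a handful of pages, a Crawlee crawler is the better fit because it handles queuing, retries, and concurrency.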

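Add an input schema (.actor/input_schema.json in current projects; a root-level INPUT_SCHEMA.json is the older location) and a clear README so the Actor is usable from the Console. Both files are part of the Actor’s source, so include them before running apify push, or push again after adding them. They are also required if you intend to publish the Actor to the Apify Store for others (and potentially monetize it through pay-per-result pricing).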
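
As a rough illustration of what an input schema contains, the sketch below builds one as a Python dictionary and renders it to JSON. The two fields (`urls` and `maxPages`) are hypothetical, and the exact set of allowed keys should be checked against Apify’s input schema specification before relying on it.

```python
import json

# Hypothetical schema for an Actor that takes a list of URLs and a page limit.
INPUT_SCHEMA = {
    "title": "My Actor input",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "urls": {
            "title": "URLs to scrape",
            "type": "array",
            "description": "Pages the Actor should visit.",
            "editor": "stringList",
        },
        "maxPages": {
            "title": "Maximum pages",
            "type": "integer",
            "description": "Stop after this many pages.",
            "default": 10,
            "minimum": 1,
        },
    },
    "required": ["urls"],
}


def render_schema(schema: dict) -> str:
    """Serialize the schema as indented JSON, ready to save in the project."""
    return json.dumps(schema, indent=2) + "\n"
```

Saving the output of `render_schema(INPUT_SCHEMA)` as the project’s input schema file gives the Console enough information to render a form with one field per property.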

A simple custom scraper covering a handful of pages typically costs a small fraction of a dollar in compute units; larger jobs using browser-based rendering (Playwright) cost more per page than lightweight HTML-based scraping.
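
To make the cost reasoning concrete: one compute unit corresponds to 1 GB of memory used for one hour, so usage scales with both the memory allocated and the run time. The sketch below works through that arithmetic. The price per compute unit is passed in as a parameter because it depends on your plan; the 0.40 figure in the usage note is an assumed value for illustration only.

```python
def compute_units(memory_mb: int, runtime_seconds: float) -> float:
    """Compute units consumed: memory in GB multiplied by run time in hours."""
    return (memory_mb / 1024) * (runtime_seconds / 3600)


def estimated_cost(memory_mb: int, runtime_seconds: float, usd_per_cu: float) -> float:
    """Estimated compute cost in USD for a single run.

    usd_per_cu is plan-dependent, so it is supplied by the caller.
    """
    return compute_units(memory_mb, runtime_seconds) * usd_per_cu
```

A lightweight HTML scraper running with 512 MB for six minutes uses `compute_units(512, 360)`, or 0.05 CU, while a browser-based run with 4096 MB for fifteen minutes uses a full compute unit. At an assumed 0.40 USD per CU, that is roughly 0.02 USD versus 0.40 USD.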

Settings That Are Easy to Miss

Compute Units (CUs) vs. flat pricing: many Actors bill based on compute usage rather than a flat per-run fee, while others charge per result, so check the pricing model on the Actor’s detail page. Under compute-based billing, a scraper that uses a full browser (Playwright) to render JavaScript-heavy pages will consume credits significantly faster than one using simple HTML parsing, even for the same number of pages.
Free tier credits reset monthly but don’t roll over: unused credits don’t carry into the next month, so it’s worth timing larger one-off jobs to use available credits before they reset.
Proxy and stealth settings are often configurable per Actor: many Store Actors include options for proxy rotation or browser stealth settings — leaving these on defaults is usually fine for smaller jobs, but sites with aggressive bot detection may require adjusting these before a run succeeds.
Dataset results aren’t automatically deduplicated: if you run the same Actor multiple times against overlapping inputs, you’ll get duplicate records across runs unless you handle deduplication downstream (in your CRM, spreadsheet, or via a dedicated deduplication step).
MCP/agent integrations: Apify Actors can be run directly from AI tools like Claude Desktop or Cursor via Apify’s MCP server — useful if your workflow already involves an AI assistant pulling data on demand rather than running jobs manually from the Console.
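
For the deduplication point above, a downstream step can be as simple as keeping the first record seen for each key. This is a minimal sketch, assuming the records have already been downloaded as a list of dictionaries and that a field such as `url` (a hypothetical choice) identifies a record uniquely.

```python
from typing import Iterable, Iterator, Tuple


def dedupe_items(items: Iterable[dict], key_fields: Tuple[str, ...]) -> Iterator[dict]:
    """Yield each item whose key has not been seen yet, preserving order.

    The values of key_fields must be hashable (strings, numbers, and so on).
    """
    seen = set()
    for item in items:
        key = tuple(item.get(field) for field in key_fields)
        if key in seen:
            continue
        seen.add(key)
        yield item


# Example: two runs with one overlapping record.
run_1 = [{"url": "https://example.com/a", "price": 10}]
run_2 = [
    {"url": "https://example.com/a", "price": 12},
    {"url": "https://example.com/b", "price": 7},
]
unique = list(dedupe_items(run_1 + run_2, ("url",)))
```

Here `unique` contains two records, and the entry for `https://example.com/a` keeps the price from the first run. If the most recent value should win instead, reverse the input order before deduplicating.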
