Apify Tutorial

How to Scrape and Structure Web Data with Apify
Apify gives you two paths into web scraping: running a ready-made “Actor” from its marketplace with no code, or building and deploying your own. This tutorial covers both — starting with running an existing Actor to get usable, structured data quickly, then outlining what’s involved if you later want to build a custom scraper.
What this tutorial covers: finding and running a pre-built Actor from the Apify Store, configuring inputs, retrieving structured output, and a brief overview of building your own Actor with the Apify CLI.
Prerequisites:
- An Apify account (the free tier includes monthly platform credits sufficient for small-to-moderate scraping jobs)
- A target website or platform you want data from (e.g., Google Maps, LinkedIn, Amazon, or a general website)
- For custom Actor development only: Node.js or Python installed, and the Apify CLI
For how Apify compares to alternatives for data extraction, see our Apify vs. Bright Data comparison or the Data, Tools & Digital Infrastructure pillar page.
Step 1: Browse the Apify Store
From your dashboard, click Store in the left sidebar. Search by keyword for the type of data source you need — for example, “Google Maps scraper,” “LinkedIn profiles,” or “Amazon product data.” You can also filter results by category, price, or popularity.
Click on an Actor’s card to view its detail page, which includes a description, pricing model, input options, and example output. Reviewing the example output before running anything gives you a clear sense of whether the Actor returns the fields you actually need.
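One way to make that review concrete is to copy a few records from the example output and check them against the fields you need. The sketch below is a minimal helper for that; the function name and the sample records are hypothetical and not taken from any particular Actor.

```python
from typing import Dict, Iterable, List


def missing_fields(sample_items: List[dict], required: Iterable[str]) -> Dict[str, int]:
    """Count, per required field, how many sample records lack a usable value."""
    gaps: Dict[str, int] = {}
    for field in required:
        # Treat absent keys, None, and empty strings as "missing".
        count = sum(1 for item in sample_items if item.get(field) in (None, ""))
        if count:
            gaps[field] = count
    return gaps


# Hypothetical records shaped like an Actor's example output.
sample = [
    {"title": "Cafe Uno", "address": "1 Main St", "phone": "555-0100"},
    {"title": "Cafe Dos", "address": "2 Main St", "phone": None},
]

# prints {'phone': 1, 'website': 2}
print(missing_fields(sample, ["title", "phone", "website"]))
```

An empty result means every sampled record carries every field you asked for; anything else is worth checking before you spend credits on a full run.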
Step 2: Configure and Run the Actor
Click Try for free (or Start) on the Actor’s page. Most Actors require at minimum:
- A target URL or list of URLs
- Optional filters (e.g., date ranges, result limits, specific fields to extract)
For Actors like the AI Web Scraper, you can skip manual configuration entirely — paste the target URL, then describe what you want in plain language (e.g., “Extract all product names and prices”), and the Actor configures its own extraction logic based on that prompt.
Once configured, click Start (or Run) to begin. The Actor runs in Apify’s cloud, so you don’t need to keep your browser open or your computer running during the job.
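The same run can also be started outside the Console through Apify's REST API. The sketch below only prepares the request, using the standard library, so it can be inspected before anything is sent. The Actor ID, the token placeholder, and the input fields (startUrls, maxItems) are assumptions for illustration; each Actor defines its own input fields, so check its input tab for the real names.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request that starts an Actor run."""
    # Store Actors are addressed as "username~actor-name" in API URLs.
    url = f"{API_BASE}/acts/{actor_id}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical input: field names differ from Actor to Actor.
example_input = {
    "startUrls": [{"url": "https://example.com/products"}],
    "maxItems": 50,
}

request = build_run_request("someuser~some-scraper", "YOUR_API_TOKEN", example_input)
print(request.get_method(), request.full_url)

# To actually start the run, send the request:
#   with urllib.request.urlopen(request) as response:
#       run = json.load(response)["data"]
```

Keeping request construction separate from sending makes it easy to log or review exactly what will be submitted, which helps when an Actor rejects an input.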
Step 3: Retrieve Your Data
When the run completes, results are stored in an Apify Dataset, accessible from the run’s detail page. From there you can:
- View results directly in a table format in the browser
- Export to CSV, JSON, Excel, or other formats
- Connect a webhook to push results automatically to Slack, Google Sheets, or another endpoint
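The export options above are also available programmatically through the dataset items endpoint of the Apify API. This is a small sketch of building that URL; the dataset ID and field names are placeholders, and depending on your account settings the request may also need your API token in an Authorization header.

```python
from typing import List, Optional
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(
    dataset_id: str,
    fmt: str = "json",
    fields: Optional[List[str]] = None,
    limit: Optional[int] = None,
) -> str:
    """Build the download URL for a dataset's items in the given format."""
    # clean=true asks the API to skip empty items and hidden fields.
    params = {"format": fmt, "clean": "true"}
    if fields:
        # Restrict the export to specific columns (comma-separated list).
        params["fields"] = ",".join(fields)
    if limit is not None:
        params["limit"] = str(limit)
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


# "abc123" is a placeholder; the real ID appears on the run's detail page.
print(dataset_items_url("abc123", "csv"))
print(dataset_items_url("abc123", fields=["title", "price"], limit=10))
```

Requesting only the fields you need keeps exports small, which matters once a scheduled Actor has been appending results for a while.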
For ongoing data needs, Actors can be scheduled to run on a recurring basis, with new results appended to the dataset or sent via webhook each time.
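If you receive those webhook calls on your own endpoint, the handler usually needs just one thing: the ID of the dataset the finished run wrote to. The sketch below assumes Apify's default webhook payload, where the run object arrives under a resource key; if you customize the payload template, the keys will differ. The function name and sample values are hypothetical.

```python
import json
from typing import Optional


def dataset_id_from_webhook(body: bytes) -> Optional[str]:
    """Return the run's default dataset ID for successful runs, else None."""
    payload = json.loads(body)
    # Ignore failed, aborted, or timed-out runs.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    # Assumption: default payload template, with the run object in "resource".
    return payload.get("resource", {}).get("defaultDatasetId")


# Hypothetical payload, trimmed to the keys this handler reads.
sample_body = json.dumps(
    {
        "eventType": "ACTOR.RUN.SUCCEEDED",
        "resource": {"id": "run123", "defaultDatasetId": "abc123"},
    }
).encode("utf-8")

print(dataset_id_from_webhook(sample_body))  # prints abc123
```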
Step 4 (Optional): Building Your Own Actor
If no existing Actor covers your use case, Apify supports building custom scrapers using the Crawlee framework (Node.js or Python). At a high level:
- Install the Apify CLI and log in (apify login)
- Scaffold a new project with apify create my-actor, choosing a starter template (e.g., a Crawlee Cheerio or Playwright template depending on whether the target site requires JavaScript rendering)
- Write your scraping logic in the generated project
- Test locally with apify run
- Deploy to your account with apify push
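Before you share the Actor, add an input schema file (INPUT_SCHEMA.json in older projects; current CLI templates place it at .actor/input_schema.json) and a clear README so the Actor is usable from the Console. If you add them after the first deploy, run apify push again to upload the changes. Both are also expected if you intend to publish the Actor to the Apify Store for others (and potentially monetize it through pay-per-result pricing).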
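To give a sense of what an input schema contains, here is a minimal sketch written as a Python dictionary and printed as JSON. The two fields (startUrls, maxItems) are illustrative choices, not requirements, and the apply_schema helper is a local stand-in for experimenting; it is not Apify's own validator.

```python
import json

# Illustrative schema: one required URL list and one optional limit.
INPUT_SCHEMA = {
    "title": "My Actor input",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "startUrls": {
            "title": "Start URLs",
            "type": "array",
            "description": "Pages the scraper should visit first.",
            "editor": "requestListSources",
        },
        "maxItems": {
            "title": "Maximum items",
            "type": "integer",
            "description": "Stop after this many results.",
            "default": 50,
            "minimum": 1,
        },
    },
    "required": ["startUrls"],
}


def apply_schema(run_input: dict, schema: dict) -> dict:
    """Local illustration only: check required fields and fill in defaults."""
    missing = [name for name in schema.get("required", []) if name not in run_input]
    if missing:
        raise ValueError(f"Missing required input: {', '.join(missing)}")
    resolved = dict(run_input)
    for name, spec in schema["properties"].items():
        if name not in resolved and "default" in spec:
            resolved[name] = spec["default"]
    return resolved


# This JSON is what would go into the schema file.
print(json.dumps(INPUT_SCHEMA, indent=2))
```

The Console uses the titles, descriptions, and editor hints to render the input form, so it is worth writing them for someone who has never seen your code.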
A simple custom scraper covering a handful of pages typically costs a small fraction of a dollar in compute units; larger jobs using browser-based rendering (Playwright) cost more per page than lightweight HTML-based scraping.
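The arithmetic behind that difference is simple: one compute unit corresponds to 1 GB of memory used for one hour. The sketch below turns memory and run time into compute units and a rough cost. The memory sizes, run times, and the price per compute unit are placeholders; check your plan's current rate before relying on any figure.

```python
def compute_units(memory_mb: int, run_seconds: float) -> float:
    """Compute units consumed: memory in GB multiplied by run time in hours."""
    return (memory_mb / 1024) * (run_seconds / 3600)


def estimated_cost(memory_mb: int, run_seconds: float, usd_per_cu: float) -> float:
    """Rough cost of a run, given a price per compute unit."""
    return compute_units(memory_mb, run_seconds) * usd_per_cu


# Placeholder rate: substitute the price from your own plan.
HYPOTHETICAL_USD_PER_CU = 0.30

# Hypothetical jobs: a light HTML scraper vs. a browser-based one.
html_job = estimated_cost(memory_mb=512, run_seconds=120, usd_per_cu=HYPOTHETICAL_USD_PER_CU)
browser_job = estimated_cost(memory_mb=4096, run_seconds=600, usd_per_cu=HYPOTHETICAL_USD_PER_CU)

print(f"HTML scraper:    ${html_job:.4f}")
print(f"Browser scraper: ${browser_job:.4f}")
```

Browser-based runs tend to need both more memory and more time per page, and because the two multiply, the gap in compute units grows quickly.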
Settings That Are Easy to Miss
- Compute Units (CUs) vs. flat pricing: most Actors bill based on compute usage, not a flat per-run fee — a scraper that uses a full browser (Playwright) to render JavaScript-heavy pages will consume credits significantly faster than one using simple HTML parsing, even for the same number of pages.
- Free tier credits reset monthly but don’t roll over: unused credits don’t carry into the next month, so it’s worth timing larger one-off jobs to use available credits before they reset.
- Proxy and stealth settings are often configurable per Actor: many Store Actors include options for proxy rotation or browser stealth settings — leaving these on defaults is usually fine for smaller jobs, but sites with aggressive bot detection may require adjusting these before a run succeeds.
- Dataset results aren’t automatically deduplicated: if you run the same Actor multiple times against overlapping inputs, you’ll get duplicate records across runs unless you handle deduplication downstream (in your CRM, spreadsheet, or via a dedicated deduplication step).
- MCP/agent integrations: Apify Actors can be run directly from AI tools like Claude Desktop or Cursor via Apify’s MCP server — useful if your workflow already involves an AI assistant pulling data on demand rather than running jobs manually from the Console.
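For the deduplication point above, a downstream step can be as small as the following sketch. It merges the items from several runs and keeps the first record seen for each key. The key name (url) and the sample records are assumptions; pick whichever field uniquely identifies a record in your data.

```python
from typing import Iterable, List


def dedupe_items(runs: Iterable[List[dict]], key: str) -> List[dict]:
    """Merge items from several runs, keeping the first record per key value."""
    seen = set()
    unique: List[dict] = []
    for items in runs:
        for item in items:
            value = item.get(key)
            if value is None:
                # Keep records without the key rather than silently dropping them.
                unique.append(item)
                continue
            if value in seen:
                continue
            seen.add(value)
            unique.append(item)
    return unique


# Hypothetical results from two overlapping runs.
run_monday = [
    {"url": "https://example.com/a", "price": 10},
    {"url": "https://example.com/b", "price": 12},
]
run_tuesday = [
    {"url": "https://example.com/b", "price": 13},
    {"url": "https://example.com/c", "price": 9},
]

merged = dedupe_items([run_monday, run_tuesday], key="url")
print(len(merged))  # prints 3
```

Keeping the first occurrence is a design choice; if later runs carry fresher values (such as updated prices), pass the runs in reverse order so the newest record wins.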
Related Reading
- See our Apify vs. Bright Data comparison for how the two platforms differ on pricing, proxy infrastructure, and ease of use
- Read our Bright Data review for a closer look at proxy-based data collection
- Return to the Data, Tools & Digital Infrastructure hub for more tools in this category
Disclaimer: Workflow Dynamics is a digital blueprint and resource hub. Some links on this website may be affiliate links, which can yield a commission for us at no additional cost to you. See our Affiliate Disclosure Page.
