# facebook_mp_scraper

https://github.com/alex-berliner/facebook_mp_scraper.git

A Python-based scraper using Playwright to extract listings from Facebook Marketplace with authentication, pagination, and caching.
## Installation

```shell
pip install -r requirements.txt
playwright install chromium
```
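The contents of `requirements.txt` are not shown in this repo snapshot; a minimal version would presumably list Playwright, plus `python-dotenv` if `.env` loading is delegated to that library (both are assumptions):

```
playwright
python-dotenv
```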
## Configuration

You can set your Facebook credentials as environment variables:

```shell
# Windows PowerShell
$env:FB_EMAIL="your_email@example.com"
$env:FB_PASSWORD="your_password"

# Windows CMD
set FB_EMAIL=your_email@example.com
set FB_PASSWORD=your_password

# Linux/Mac
export FB_EMAIL="your_email@example.com"
export FB_PASSWORD="your_password"
```

Or create a `.env` file:

```
FB_EMAIL=your_email@example.com
FB_PASSWORD=your_password
```
Note: If credentials are not provided, the scraper will open a browser window for manual login.
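A credential loader with this behavior might look like the sketch below: prefer environment variables, fall back to a simple `KEY=VALUE` `.env` file, and return `(None, None)` to signal that manual login is needed. The function name and `.env` parsing are hypothetical; the project's actual `auth.py` may differ.

```python
import os


def load_credentials(env_path=".env"):
    """Return (email, password), preferring env vars over a .env file.

    Hypothetical helper; (None, None) means "fall back to manual login".
    """
    email = os.environ.get("FB_EMAIL")
    password = os.environ.get("FB_PASSWORD")
    # Fall back to a minimal KEY=VALUE .env file if either value is missing.
    if (not email or not password) and os.path.exists(env_path):
        with open(env_path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                if key == "FB_EMAIL" and not email:
                    email = value
                elif key == "FB_PASSWORD" and not password:
                    password = value
    return email, password
```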
## Usage

```shell
python main.py "search query"
```

### Examples

```shell
# Search for "laptop"
python main.py laptop

# Search with custom max listings
python main.py "gaming chair" --max-listings 50

# Run in headless mode (browser not visible)
python main.py "bicycle" --headless

# Ignore cache and scrape all listings
python main.py "car" --no-cache
```
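The CLI above can be mirrored with a small `argparse` setup. This is a sketch of what `main.py`'s argument handling might look like given the documented flags and defaults, not necessarily the actual implementation:

```python
import argparse


def build_parser():
    # Hypothetical mirror of main.py's CLI; the real flag handling may differ.
    parser = argparse.ArgumentParser(description="Facebook Marketplace scraper")
    parser.add_argument("query", help="Search query for Facebook Marketplace")
    parser.add_argument("--max-listings", type=int, default=100,
                        help="Maximum number of listings to scrape")
    parser.add_argument("--headless", action="store_true",
                        help="Run browser in headless mode")
    parser.add_argument("--no-cache", action="store_true",
                        help="Ignore cache and scrape all listings")
    return parser
```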
## Arguments

- `query` (required): Search query for Facebook Marketplace
- `--max-listings N`: Maximum number of listings to scrape (default: 100)
- `--headless`: Run browser in headless mode
- `--no-cache`: Ignore cache and scrape all listings

Session cookies are stored in `data/cookies.json`, and scraped listings are cached in `data/cache.json`.

## Project Structure

```
marketplace_apt/
├── src/
│   ├── __init__.py
│   ├── scraper.py    # Main scraper class
│   ├── auth.py       # Authentication handling
│   ├── cache.py      # Listing cache management
│   └── models.py     # Data models/classes
├── data/
│   ├── cookies.json  # Stored session cookies
│   └── cache.json    # Cached listings
├── requirements.txt
├── config.py         # Configuration settings
├── main.py           # Entry point
└── README.md
```
## Troubleshooting

- `data/cookies.json` is created automatically after login; `data/cache.json` is created automatically when listings are scraped.
- If login stops working, delete `data/cookies.json` and log in again.
- If the browser fails to launch, run `playwright install chromium`.
- Run without `--headless` to see what's happening in the browser.

## Disclaimer

This project is for educational purposes only. Use responsibly and in accordance with Facebook's Terms of Service.