Web scraping project for upskilling in Rails / MongoDB deployments
https://github.com/furkan/searchapi.git
A limited-scope web scraping project to dip my toes in developing and deploying with a Rails/MongoDB stack.
It scrapes autocomplete results from Play Store searches and uses opt-out caching to both improve performance and respect Google's bandwidth. That said, this project is for personal, educational purposes and should not be used for large-scale commercial scraping.
The backend API uses Rails to retrieve autocomplete results from an internal Play Store endpoint. All request and response data are logged in a MongoDB cluster for analysis and caching.
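The actual lookup code lives in the repo; as a rough sketch of what the cached flow could look like with Mongoid (the controller, model, and client names below are illustrative placeholders, not the project's real classes):

# Illustrative sketch only; SuggestionLog and PlayStoreClient are placeholder names.
class PlayStoreSuggestionsController < ApplicationController
  CACHE_TTL = 24.hours

  def search
    query    = params.require(:query)
    no_cache = params[:no_cache] == "true"

    # Reuse a recent cached result unless the caller opted out with no_cache=true.
    cached = SuggestionLog.where(query: query, :created_at.gte => CACHE_TTL.ago).first
    suggestions =
      if cached && !no_cache
        cached.suggestions
      else
        PlayStoreClient.new.suggestions_for(query).tap do |results|
          # Log the upstream response to MongoDB for analysis and future cache hits.
          SuggestionLog.create!(query: query, suggestions: results)
        end
      end

    render json: { count: suggestions.size, suggestions: suggestions, error: nil }
  rescue StandardError => e
    render json: { count: 0, suggestions: [], error: e.message }, status: :bad_gateway
  end
end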
The frontend is a single page for trying out the API, shown alongside the request logs of the last 24 hours. A No Cache checkbox can be used to bypass the cache and force fresh results.
I initially explored using Kamal for deployment but opted for a Docker Compose setup with Caddy, as my prior experience with this stack allowed for a faster and more straightforward deployment.
For the live demo, DNS is handled by Cloudflare, and the project is hosted on an AWS EC2 instance.
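The repository ships its own Compose and Caddy configuration; as a minimal sketch of the shape such a setup usually takes (service names, ports, and volumes here are assumptions, not the project's actual file):

# Hypothetical docker-compose.yml outline; adjust to the repository's real services.
services:
  app:
    build: .
    env_file: .env
    expose:
      - "3000"           # Rails served internally; Caddy terminates TLS in front of it
  caddy:
    image: caddy:2
    ports:
      - "80:80"
      - "443:443"
    env_file: .env       # so the Caddyfile can read DOMAIN_NAME and EMAIL
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - caddy_data:/data # persists issued certificates across restarts
volumes:
  caddy_data: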
Create a .env file with the following environment variables:
MONGODB_URI=
DOMAIN_NAME=
EMAIL= # For caddy to provide when registering the domain for SSL
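The Caddyfile reverse-proxies the domain to the Rails container and handles the SSL certificate. A minimal sketch of what it might look like, assuming Caddy reads DOMAIN_NAME and EMAIL from the environment (the repository's actual file may differ):

# Hypothetical Caddyfile; placeholder upstream name and port.
{
    email {$EMAIL}
}

{$DOMAIN_NAME} {
    reverse_proxy app:3000
}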
With the .env file and the Caddyfile in place, build and run the stack:
docker compose build
docker compose up -d
docker compose logs -f
Go to your domain in a browser and use the text input and the No Cache checkbox to interact with the API; the page also shows the request logs of the last 24 hours.
Example request:
curl "https://api.yourdomain.com/api/v1/play_store_suggestions/search?query=api"
Without caching:
curl "https://api.yourdomain.com/api/v1/play_store_suggestions/search?query=api&no_cache=true"
Example response:
{
  "count": 5,
  "suggestions": [
    "api",
    "api healthcare mobile workforce",
    "apify",
    "api tester",
    "apics"
  ],
  "error": null
}