Simple Web Scraping Colly App with Fiber

A Go application using Fiber, Colly v2, and GORM to scrape websites and persist data in PostgreSQL.

Prerequisites

  • Docker and Docker Compose (the whole stack runs in containers)

How to Run

  1. Clone the repository.
  2. Navigate to the project directory: cd colly-gorm
  3. Copy the example env file: cp app/app.env.example app/app.env
  4. Start the stack: docker compose up --build

Project Structure

colly-gorm/
├── app/
│   ├── app.env.example              # Environment variable template
│   ├── Dockerfile
│   ├── go.mod
│   ├── cmd/
│   │   └── api/
│   │       └── main.go              # App entry point, Fiber routes
│   └── internals/
│       ├── consts/
│       │   └── consts.go            # Config loading via Viper
│       └── services/
│           ├── database/
│           │   ├── database.go      # GORM connection
│           │   └── models.go        # Quote and Course models
│           └── scrapers/
│               ├── toscrape.go      # Quotes scraper
│               └── coursera_courses.go  # Coursera scraper
├── db/
│   └── create_db.sql                # DB initialization
└── docker-compose.yml

API Endpoints

Method  Path                Description
GET     /api/healthchecker  Health check; returns service status
GET     /scrape/quotes      Triggers async scraping of quotes.toscrape.com and stores results in PostgreSQL
GET     /scrape/coursera    Triggers async scraping of coursera.org/browse and stores course data in PostgreSQL

Scraping jobs run asynchronously; the endpoint returns immediately while scraping continues in the background.

Database Models

Quote

  • author — quote author
  • quote — quote text

Course

  • title, description, creator, url, rating

Environment Variables

See app/app.env.example:

POSTGRES_HOST=colly_db
POSTGRES_PORT=5432
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_DB=colly

What It Does

  • Registers Colly HTML callbacks before visiting pages (correct callback order).
  • Scrapes data from websites and stores it in a PostgreSQL database via GORM.
  • Uses Fiber middleware (logger, CORS) applied globally before sub-app routing.