JobScrapper-io/JobScraper

🕵️ JobScraper AI

Intelligent Job Aggregator with Semantic Vector Search

NestJS Prisma Python Go SQLite Handlebars Docker HTML5 CSS3


📖 About The Project

JobScraper AI is not your typical keyword-based job board. It works on a curated dataset of job offers collected by a high-performance Go (Golang) scraping engine and uses vector embeddings to match job titles by meaning rather than by exact wording.

Using a dedicated Python microservice, the system generates vector embeddings for your search queries in real time. A search for "Software Engineer" can therefore also return "Backend Developer" or "Fullstack Programmer" offers, thanks to semantic similarity search powered by sqlite-vec.
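
To make "semantic similarity" concrete, here is a minimal sketch of how cosine similarity can rank job titles against a query. The three-dimensional vectors are made-up toy values rather than real model output, and the function name is only illustrative; the project itself delegates this comparison to sqlite-vec.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (assumed values, for illustration only).
titles = {
    "Backend Developer": [0.9, 0.8, 0.1],
    "Fullstack Programmer": [0.8, 0.9, 0.2],
    "Pastry Chef": [0.1, 0.0, 0.9],
}
query = [1.0, 0.9, 0.1]  # stands in for the embedding of "Software Engineer"

# Highest similarity first.
ranked = sorted(
    titles,
    key=lambda title: cosine_similarity(query, titles[title]),
    reverse=True,
)
print(ranked)
```

The two engineering titles point in nearly the same direction as the query, so they rank ahead of the unrelated title even though none of them shares a keyword with it.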


✨ Key Features

🧠 Semantic AI Search

Don't worry about exact keywords. Type what you are looking for, and our Vector Search Engine will find the most contextually similar offers.

  • Powered by sentence-transformers & sqlite-vec for high-performance local vector similarity.

🔍 Smart Filtering & Pagination

We prioritize user experience with intuitive controls:

  • Recency Sorting: Automatically displays fresh listings first, ensuring you never miss the newest opportunities.
  • Navigation: Browse through hundreds of offers using navigation buttons.
  • Custom Limits: You decide how many offers to see per page using the dropdown selector.
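
The recency sorting and per-page limit described above can be sketched with a plain SQL query. This is only an illustration using Python's built-in sqlite3 module: the `offers` table, its columns, and the `fetch_page` helper are hypothetical, and the real application goes through Prisma rather than raw SQL.

```python
import sqlite3


def fetch_page(conn, page, per_page):
    """Return one page of offers, newest first (page numbers start at 1)."""
    offset = (page - 1) * per_page
    rows = conn.execute(
        "SELECT title, posted_at FROM offers "
        "ORDER BY posted_at DESC, id DESC LIMIT ? OFFSET ?",
        (per_page, offset),
    )
    return rows.fetchall()


# In-memory demo data; the schema is an assumption made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE offers (id INTEGER PRIMARY KEY, title TEXT, posted_at TEXT)"
)
conn.executemany(
    "INSERT INTO offers (title, posted_at) VALUES (?, ?)",
    [
        ("Go Developer", "2024-01-03"),
        ("Data Engineer", "2024-01-05"),
        ("QA Tester", "2024-01-01"),
        ("Python Developer", "2024-01-04"),
        ("DevOps Engineer", "2024-01-02"),
    ],
)

first_page = fetch_page(conn, page=1, per_page=2)
second_page = fetch_page(conn, page=2, per_page=2)
print(first_page)
print(second_page)
```

The secondary sort on `id` keeps the order stable when several offers share the same date, so an offer does not appear on two pages.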

📊 Rich Job Details

Each card provides essential insights at a glance, with a detailed view available on click (displaying all provided data):

  • 🛠️ Tech Stack & Skills: Instantly see required technologies alongside nice-to-have skills.
  • 📍 Location & Mode: Check if the offer is Remote, Hybrid, or On-site.
  • 💰 Salary & Contract: View salary ranges (Net/Gross) and contract types (B2B/UoP).
  • 📝 Full Description: Click to expand the card and read the complete requirements and responsibilities.
  • 🔗 Direct Application: One-click redirection to the original offer page.
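
As a rough picture of the data behind each card, the fields listed above could be modelled as follows. Every field name here is hypothetical and chosen for illustration; the actual schema lives in the project's Prisma models and may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JobOffer:
    """Hypothetical shape of one job card (not the project's real schema)."""

    title: str
    url: str  # link to the original offer page
    work_mode: str  # "Remote", "Hybrid", or "On-site"
    contract_type: str  # e.g. "B2B" or "UoP"
    location: Optional[str] = None
    salary_min: Optional[int] = None
    salary_max: Optional[int] = None
    salary_basis: Optional[str] = None  # "Net" or "Gross"
    required_skills: List[str] = field(default_factory=list)
    nice_to_have: List[str] = field(default_factory=list)
    description: str = ""

    def salary_label(self):
        """Short salary text for the card, or a fallback when undisclosed."""
        if self.salary_min is None or self.salary_max is None:
            return "Salary not disclosed"
        return f"{self.salary_min}-{self.salary_max} {self.salary_basis or ''}".strip()


offer = JobOffer(
    title="Backend Developer",
    url="https://example.com/offers/1",
    work_mode="Remote",
    contract_type="B2B",
    salary_min=15000,
    salary_max=20000,
    salary_basis="Net",
    required_skills=["Go", "SQL"],
)
print(offer.salary_label())
```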

🏗️ How It Works (Architecture)

The system is designed to perform fast read-heavy operations on a pre-built dataset generated by an external high-concurrency pipeline.

  1. Data Collection (External Pipeline):

    • A specialized scraping engine written in Go (Golang) crawls multiple job boards to aggregate offers.
    • The AI Service calculates vector embeddings for all collected titles.
    • The combined data (metadata + vectors) is saved into a portable SQLite database.
  2. Runtime Application (This Repo):

    • The application mounts this pre-populated .db file, so it starts without scraping or embedding any offers at runtime.
  3. AI Service (Python):

    • Acts as a runtime vectorizer using the all-MiniLM-L6-v2 model.
    • Converts the user's search query into a 384-dimensional vector (the output size of all-MiniLM-L6-v2) to match it against the database.
  4. Core Backend (NestJS + Prisma):

    • Orchestrates the flow between the UI, the AI Service, and the Database.
    • Serves the frontend using Handlebars (HBS) templates.
  5. Database (SQLite + sqlite-vec):

    • Performs the actual similarity computation.
    • Uses sqlite-vec to run a nearest-neighbor search between the query vector (from Python) and the job vectors (stored in the DB).
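
The database step can be pictured with a brute-force stand-in. The sketch below stores toy vectors as float32 BLOBs in a plain SQLite table and finds the nearest titles by Euclidean distance in Python. It deliberately does not use sqlite-vec, which performs this search inside SQLite in the real system; the `job_vectors` table, the storage layout, and the helper names are assumptions made for illustration.

```python
import math
import sqlite3
import struct


def to_blob(vector):
    """Pack a list of floats into a little-endian float32 BLOB."""
    return struct.pack(f"<{len(vector)}f", *vector)


def from_blob(blob):
    """Unpack a float32 BLOB back into a tuple of floats."""
    return struct.unpack(f"<{len(blob) // 4}f", blob)


def nearest_titles(conn, query_vector, k):
    """Brute-force k-nearest-neighbor search by Euclidean distance."""
    scored = []
    for title, blob in conn.execute("SELECT title, embedding FROM job_vectors"):
        distance = math.dist(query_vector, from_blob(blob))
        scored.append((distance, title))
    scored.sort()  # smallest distance first
    return [title for _, title in scored[:k]]


# Toy 3-dimensional vectors; real embeddings come from the AI Service.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_vectors (title TEXT, embedding BLOB)")
conn.executemany(
    "INSERT INTO job_vectors VALUES (?, ?)",
    [
        ("Backend Developer", to_blob([0.9, 0.8, 0.1])),
        ("Fullstack Programmer", to_blob([0.8, 0.9, 0.2])),
        ("Pastry Chef", to_blob([0.1, 0.0, 0.9])),
    ],
)

query_vector = [1.0, 0.9, 0.1]  # stands in for the embedded search query
print(nearest_titles(conn, query_vector, k=2))
```

Scanning every row like this is fine for a demo but scales linearly with the number of offers, which is the work a vector extension is meant to take over.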

🚀 Installation & Setup

Prerequisites

  • Docker and Docker Compose

Running with Docker

  1. Clone the repository

    git clone https://github.com/JobScrapper-io/JobScraper.git
    cd JobScraper
  2. Build and Run

    docker-compose up --build
    • The App will start at: http://localhost:3000
    • The AI Service will start at: http://localhost:8000

💻 Usage

  1. Open your browser at http://localhost:3000.
  2. Search: Enter a job title (e.g., "Python Developer"). The system sends your text to the AI Service to create a vector, then queries the database.
  3. Browse: Use the dropdown to change the number of results per page.
  4. Navigate: Use the page navigation buttons to move between pages of results.
  5. Apply: Select a job card to view full details and navigate to the original source to apply.
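
For readers curious about step 2, the call to the AI Service might look roughly like the sketch below. The base URL comes from the setup section, but the `/embed` route, the `{"text": ...}` payload, and the helper names are hypothetical; check the AI Service source for the actual contract before relying on any of them.

```python
import json
import urllib.request

AI_SERVICE_URL = "http://localhost:8000"  # address from the setup section
EMBED_PATH = "/embed"  # hypothetical route, not confirmed by the README


def build_embed_request(query, base_url=AI_SERVICE_URL):
    """Build a JSON POST request carrying the search text."""
    payload = json.dumps({"text": query}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        base_url + EMBED_PATH,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(query):
    """Send the query to the AI Service and return the parsed JSON reply."""
    request = build_embed_request(query)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request needs no network; calling embed() needs the service running.
request = build_embed_request("Python Developer")
print(request.get_method(), request.full_url)
```

In the project this call is made by the NestJS backend, which then passes the returned vector to the database query.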

🛠️ Tech Stack Details

  • Backend: NestJS (TypeScript)
  • ORM: Prisma (Schema management & Client)
  • AI Model: all-MiniLM-L6-v2 (via SentenceTransformers)
  • Database: SQLite with sqlite-vec extension
  • Frontend: Server-Side Rendering with Handlebars (HBS), HTML5, CSS3
  • Containerization: Docker & Docker Compose

About

Job Search 2.0. Go scrapes. Python embeds. NestJS serves. A semantic aggregator using SQLite-vec & Prisma to find jobs by meaning, not just keywords.
