A high-performance, concurrent HTTP/HTTPS proxy server built in C using core concepts of
Operating Systems (threads, semaphores, synchronization, caching) and Computer Networks
(socket programming, HTTP parsing, request forwarding, and tunneling).
Developed and maintained at 👉 github.com/Prayas248/MultiThreaded-Proxy-Server
## 📑 Table of Contents
- Introduction
- Key Concepts
- Features Added
- System Architecture
- How to Run
- Networking Workflow
- Demo
- Metrics & Logging
- Contributing
## 🧠 Introduction
This project implements a multi-threaded caching proxy server that acts as an intermediary between clients and web servers.
It can:
- Handle multiple simultaneous HTTP/HTTPS client requests using threads and semaphores.
- Cache frequently accessed responses using an LRU (Least Recently Used) algorithm.
- Expose runtime metrics via an HTTP endpoint (`/__metrics`).
- Log every transaction with structured logging for observability.
## ⚙️ Key Concepts
- Multithreading: Implemented using `pthread_create()` for concurrent client handling.
- Semaphores: Limit the number of active threads (`sem_wait()` / `sem_post()`).
- Mutex Locks: Protect the cache and shared resources against data races.
- LRU Cache Management: Automatically evicts least-recently-used entries when the cache is full.
- Thread Pools: Efficient reuse of worker threads to reduce context-switch overhead.
- Socket Programming (TCP/IP): End-to-end client-proxy-server communication.
- HTTP Request Parsing: Parses `GET`, `CONNECT`, headers, and HTTP versions.
- HTTPS Tunneling: Handles encrypted traffic via the `CONNECT` method.
- Timeout Handling: Prevents blocking on slow or dead connections.
- Dynamic Response Forwarding: Streams data from origin servers to clients efficiently.
## ✨ Features Added
| Feature | Description |
|---|---|
| 1️⃣ Thread Pool & Semaphores | Manages concurrency safely and prevents overload. |
| 2️⃣ HTTPS CONNECT Support | Enables secure HTTPS proxy tunneling. |
| 3️⃣ Timeout Handling | Adds recv/send timeouts for both client and server sockets. |
| 4️⃣ /__metrics Endpoint | Exposes live proxy statistics: `total_requests`, `active_clients`, `cache_hits`, `cache_misses`. |
| 5️⃣ Structured Logging | Prints a one-line log for each request: `[timestamp] METHOD HOST PATH STATUS BYTES TIME(ms)` |
| 6️⃣ LRU Cache | Caches responses to improve speed on repeated requests. |
| 7️⃣ Graceful Shutdown & Resource Handling | Ensures sockets, threads, and cache memory are safely released. |
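The timeout feature in the table can be implemented with `SO_RCVTIMEO` / `SO_SNDTIMEO` socket options. A minimal sketch, assuming POSIX sockets; the helper name and timeout value are illustrative, not taken from the project:

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Apply receive/send timeouts to one socket so a slow or dead peer
 * cannot block a worker thread forever. Illustrative helper name. */
int set_io_timeouts(int fd, int seconds) {
    struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv) < 0)
        return -1;
    return 0;
}
```

With these options set, a blocked `recv()` or `send()` returns with an error after the timeout instead of hanging indefinitely.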
## 🏗️ System Architecture
```
┌────────────┐        ┌────────────────┐        ┌────────────┐
│  Browser   │ ─────▶ │  Proxy Server  │ ─────▶ │ Web Server │
└────────────┘        └────────────────┘        └────────────┘
                              │
                              ▼
                        ┌──────────┐
                        │  Cache   │
                        │  (LRU)   │
                        └──────────┘
```
- Proxy Server: Listens on a user-defined port, parses requests, forwards them to target servers.
- Cache: Stores frequently accessed responses; managed using timestamps and LRU logic.
- Thread Pool: Spawns worker threads to handle concurrent client sessions.
- Metrics & Logger: Collects statistics and outputs structured logs for monitoring.
## 🌐 Networking Workflow
- Client Connection: The proxy accepts new TCP connections using `accept()`.
- Thread Handling: Each connection is assigned to a thread from the pool.
- Request Parsing: The HTTP request is parsed (method, host, port, headers).
- Cache Lookup:
  - If found → send from cache.
  - If not → forward to the remote server via `connect()`.
- Response Forwarding: The proxy receives data from the remote server using `recv()` and forwards it to the client using `send()`.
- Cache Update: The response is stored in the cache for future requests.
- Logging: Each request is logged with duration, bytes, and status.
- Metrics Update: Counters are updated in atomic variables for `/__metrics`.
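The response-forwarding step above is essentially a `recv()`/`send()` relay loop. A minimal sketch; the buffer size and function name are illustrative, and the project's loop may differ:

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Relay bytes from the origin server to the client until EOF.
 * Returns the total bytes forwarded, or -1 on error. */
long forward_response(int server_fd, int client_fd) {
    char buf[4096];
    ssize_t n;
    long total = 0;
    while ((n = recv(server_fd, buf, sizeof buf, 0)) > 0) {
        ssize_t sent = 0;
        while (sent < n) {           /* send() may accept fewer bytes than asked */
            ssize_t s = send(client_fd, buf + sent, (size_t)(n - sent), 0);
            if (s < 0) return -1;
            sent += s;
        }
        total += sent;
    }
    return n < 0 ? -1 : total;
}
```

Note the inner loop: `send()` is not guaranteed to write the whole buffer in one call, so partial writes must be retried.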
## 🚀 How to Run
```bash
$ git clone https://github.com/Prayas248/MultiThreaded-Proxy-Server.git
$ cd MultiThreaded-Proxy-Server
$ make all
$ ./proxy <port_number>
```
---
## 📊 Structured Logging
Every request (HTTP, HTTPS, or cache hit) is logged with time, method, host, status, size, and latency.
Example output:
```
[2025-10-14 19:55:02] CONNECT google.com (tunnel) 200 0B 14ms
[2025-10-14 19:55:06] GET example.com /index.html 200 4839B 88ms
[2025-10-14 19:55:10] CACHE - - 200 4839B 2ms
[2025-10-14 19:55:13] GET localhost /__metrics 200 84B 1ms
```
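Producing a line in this format comes down to one timestamp plus one `snprintf`. A sketch, assuming a hypothetical `format_log` helper (the project's logging function may have a different name and signature):

```c
#include <stdio.h>
#include <time.h>

/* Format one structured log line in the shape shown above.
 * Returns the formatted length, as snprintf does. */
int format_log(char *out, size_t cap, const char *method, const char *host,
               const char *path, int status, long bytes, long ms) {
    char ts[32];
    time_t now = time(NULL);
    strftime(ts, sizeof ts, "%Y-%m-%d %H:%M:%S", localtime(&now));
    return snprintf(out, cap, "[%s] %s %s %s %d %ldB %ldms",
                    ts, method, host, path, status, bytes, ms);
}
```

Keeping each request on a single line makes the output easy to grep or ship to a log aggregator.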
---
## 🧪 Demo

### Cache Behavior
- **First Visit:** Cache miss — `"url not found"` printed.
- **Subsequent Visit:** Cache hit — `"Data retrieved from Cache"` printed.
---
## 📈 Metrics Endpoint
The proxy exposes a live statistics endpoint for real-time monitoring.
**View metrics:**
```bash
curl http://localhost:8080/__metrics
```
Sample output:
```
total_requests 125
active_clients 4
cache_hits 39
cache_misses 86
```
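Counters like these can be kept as C11 atomics so worker threads update them without taking a lock. A minimal sketch; the variable names mirror the sample output above, but the project's real globals and rendering code may differ:

```c
#include <stdatomic.h>
#include <stdio.h>

/* Lock-free counters behind the /__metrics endpoint (illustrative). */
static atomic_long total_requests = 0;
static atomic_int  active_clients = 0;
static atomic_long cache_hits     = 0;
static atomic_long cache_misses   = 0;

/* Render the plain-text metrics body served on the endpoint. */
int render_metrics(char *buf, size_t cap) {
    return snprintf(buf, cap,
                    "total_requests %ld\nactive_clients %d\n"
                    "cache_hits %ld\ncache_misses %ld\n",
                    atomic_load(&total_requests),
                    atomic_load(&active_clients),
                    atomic_load(&cache_hits),
                    atomic_load(&cache_misses));
}
```

Threads bump a counter with `atomic_fetch_add(&total_requests, 1)`; no mutex is needed, so the hot path stays cheap.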