Available for interesting conversations · Kathmandu · 27.7° N · v. May 2026

Pipelines, plates of dal-bhaat, and a stubborn jogging habit.

I'm Rajeev — a data engineering team lead in Kathmandu. I spend most of my weekdays building production data pipelines for blockchain analytics and social intelligence, and most of my weekends running, watching F1, or arguing about midfield football.

01 / About

I lead a small data team at 108 Capital. We ship pipelines that index multi-TB blockchain history, scrape the strange corners of social media, and turn audio into searchable text — the kind of plumbing that's unglamorous until it isn't.

I care about systems that don't fail silently, code reviews that improve taste, and Saturdays that smell like wet pavement after a long run. I studied computer engineering at Pulchowk Campus in Nepal and have been writing data infra full-time since 2023.

This site is mostly for me. A spot to keep notes, list a few things I'm proud of, and remember that there's life outside the warehouse.

Currently shipping

ClickHouse → Trino migration

Benchmarking across cluster shapes for our analytics workloads.

Currently reading

Designing Data-Intensive Applications, again

Third pass. Different highlighter color this time.

Currently training for

Kathmandu Marathon, October

Long runs at 5:30 a.m. before the city wakes up.

Currently watching

F1 2026 + La Liga matchday 12

The new regs are weirder than expected. Loving it.

02 / Work

Three roles, one through-line — make the pipes hold.

Apr 2025 — Now

Data Engineering Team Lead

108 Capital · Remote

Lead a team of engineers shipping production data infrastructure. Owned architecture, deployment, and monitoring across five products in our first year together — from on-chain indexers to NLP-on-warehouse sentiment pipelines.

Architecture · Mentorship · Reviews

Oct 2023 — Mar 2025

Data Engineer

108 Capital · Remote

Built the on-chain indexer from scratch as sole owner: a ClickHouse warehouse with dbt + Prefect ETL that keeps EVM data under one minute of latency at multi-TB scale. Then built the Twitter pipeline and started the Spark migration.

ClickHouse · dbt · Prefect · Airflow

Jan 2023 — Oct 2023

Software Engineer — Frontend & Real-Time

Arman Group · Remote

Built a multiplayer mobile Ludo game in Flutter and owned the real-time communication layer over WebRTC. Different stack, same taste for low-latency systems.

Flutter · WebRTC · Dart

03 / Things I built

Six products, one team, one calendar year.

Production systems shipped at 108 Capital. Each one has a stack story and an outage story — happy to tell either over coffee.

01 — Production ClickHouse · Kafka · dbt

EVM On-Chain Indexer

Multi-chain blockchain data platform covering Ethereum Mainnet, Base, BSC, Arbitrum, and Optimism. 20 TB+ of on-chain history with sub-1-minute end-to-end latency using Kafka stream processing and ClickHouse UDFs for real-time ABI decoding.

ClickHouse · Kafka · dbt · Prefect · Python · UDFs
Kafka (stream) → ABI decode (ClickHouse UDF) → ClickHouse (20 TB+, live) · < 1 min end-to-end

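
For readers unfamiliar with ABI decoding, here is a minimal Python sketch of what the step involves for a single event type, the standard ERC-20 Transfer. It is an illustration only and not the production ClickHouse UDF; the function names (`decode_transfer` and its helpers) are hypothetical, and the only external fact it relies on is the well-known Transfer event signature hash.

```python
from typing import List, Optional

# keccak256("Transfer(address,address,uint256)"), the standard ERC-20 event signature hash
TRANSFER_TOPIC0 = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def _strip_0x(hex_str: str) -> str:
    """Drop a leading 0x prefix if present."""
    return hex_str[2:] if hex_str.startswith("0x") else hex_str


def _topic_to_address(topic: str) -> str:
    # Indexed address parameters are left-padded to 32 bytes;
    # the address itself is the last 20 bytes (40 hex characters).
    return "0x" + _strip_0x(topic)[-40:].lower()


def decode_transfer(topics: List[str], data: str) -> Optional[dict]:
    """Return from/to/value for an ERC-20 Transfer log, or None for any other log."""
    # ERC-20 Transfer has exactly three topics: signature hash, from, to.
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC0:
        return None
    payload = _strip_0x(data)
    if len(payload) != 64:  # the non-indexed part is a single uint256 word
        return None
    return {
        "from": _topic_to_address(topics[1]),
        "to": _topic_to_address(topics[2]),
        "value": int(payload, 16),
    }
```

A general decoder would look up each contract's ABI instead of hard-coding one event, but the shape is the same: match the signature topic, then slice fixed-width words out of the topics and data.
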
02 — Production Airflow · ClickHouse · GPU NLP

Socials Intelligence

Twitter pipeline built on Airflow and ClickHouse, with GPU-enabled NLP UDFs for in-warehouse sentiment and entity extraction. Worst-case ingest delay is 45 minutes.

Airflow · ClickHouse · GPU UDFs · NLP
Twitter → Airflow (GPU NLP) → Sentiment (45 min SLA)

03 — Production Plomberry · Celery · Whisper

AV Engine

Tracks new YouTube channels and podcast feeds, downloads the content, and transcribes it with Whisper. Plomberry handles orchestration, with Celery for async background work.

Plomberry · Celery · Whisper
YT / Podcast → Celery (Plomberry) → Whisper (transcript)

04 Browserless · SurrealDB

Narrative Intelligence

Substack + Seeking Alpha scraper for paywalled content via authenticated browser sessions. 10,000+ articles ingested into SurrealDB for downstream analysis.

Browserless · SurrealDB
Substack → Browserless (auth session) → SurrealDB (10k+ articles)

05 SurrealDB · Python

Reddit Engine

Subreddit and user tracking ingestion using SurrealDB as both lake and warehouse — one less ETL layer to babysit.

SurrealDB · Python
Reddit API → SurrealDB (lake + warehouse)

06 — In Progress Trino · ClickHouse · Benchmarks

ClickHouse → Trino

Designing a benchmarking framework across cluster configurations and sizes. Documenting findings as we go.

Trino · ClickHouse · Benchmarks
ClickHouse → Trino (benchmarking) · ⏳ in progress

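
To give a sense of what a benchmarking harness of this kind might look like, here is a rough Python sketch that times the same query callable under several named configurations and reports median, p95, and max wall-clock time. It is not the framework itself: the function names and the configuration labels are hypothetical, and each callable stands in for "submit this query to a cluster of that shape".

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(run_query: Callable[[], None], repeats: int = 5, warmup: int = 1) -> List[float]:
    """Call run_query warmup + repeats times; return seconds for the measured runs only."""
    for _ in range(warmup):
        run_query()  # warm caches and connections, result discarded
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings: List[float]) -> Dict[str, float]:
    """Median, nearest-rank p95, and max of a non-empty list of timings."""
    if not timings:
        raise ValueError("need at least one timing")
    ordered = sorted(timings)
    # Nearest-rank percentile: ceil(0.95 * n), computed in integer arithmetic
    rank = (95 * len(ordered) + 99) // 100
    return {
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


def benchmark(
    configs: Dict[str, Callable[[], None]], repeats: int = 5, warmup: int = 1
) -> Dict[str, Dict[str, float]]:
    """Run every configuration's callable and summarize its timings by name."""
    return {
        name: summarize(time_runs(fn, repeats=repeats, warmup=warmup))
        for name, fn in configs.items()
    }
```

In use, `configs` would map labels such as `"3-node"` or `"6-node"` (made-up names) to functions that run the same query against each cluster, so the summaries can be compared side by side.
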
04 / Stack

Tools I reach for, in roughly the order I reach for them.

Warehouses: ClickHouse · Trino · Snowflake
Orchestration: Prefect · Airflow · Celery · Plomberry
Transformation: dbt · SQL · Python
Streaming: Apache Kafka · ClickHouse materialized views
Databases: SurrealDB · PostgreSQL
Languages: Python · Go · Dart · SQL
Infra: Docker · Linux · GPU UDFs · Distributed clusters
Real-time / other: WebRTC · Browserless · Headless scraping

05 / Off-hours

Most of who I am happens off the keyboard.

A non-exhaustive list of obsessions that crowd out engineering on the weekend. None of them are productive. All of them are the point.

Sport · Discipline

Running, badly but daily

Started after my wrist gave up on too much typing. Now somewhere between "stubborn jogger" and "slow marathoner". Targeting sub-4 in October.

Sunday ritual

Formula 1, every weekend

Race day starts before lights out. I keep an embarrassingly detailed spreadsheet of pit-stop deltas — old habits.

The other Sunday ritual

MotoGP

The bikes lean past 60° and pretend gravity is optional.

Tribal allegiance

Football, mostly tactics

I watch midfielders like other people watch the ball. La Liga and Premier League weekends are sacred.

Slow burn

Reading non-fiction

Systems books, distributed-systems papers, the occasional novel when my brain needs a hard reset.

Background hum

Long walks, longer playlists

Kathmandu's hills are the best debugging environment I've found. One hour out, one bug fewer.

06 / Contact

If a pipeline of yours is leaking, I might be useful. Otherwise, just say hi.

Reach me

I'm slow on Twitter, fast on email. Most replies within a day.

Send a message
Kathmandu, Nepal
27.7172° N · 85.3240° E