Careers

Data Engineer

Own the data platform that powers ASOC, UPII research, and partnership engagements.

About the role

DataFrontier runs on data. ASOC ingests buying signals and call outcomes in real time; UPII research depends on clean, well-modeled corpora; our partnership engagements stand up warehouses and pipelines for other teams. The data platform under all of it is one person's to own.

As our Data Engineer, you will design, build, and operate that platform — the ingestion, the warehouses, the models, and the tooling that keeps them trustworthy. You will work across Snowflake, BigQuery, and Redshift depending on the engagement, and decide where the platform should be opinionated and where it should stay flexible.

This is not a ticket-taking role. You will set the standards for how data moves through the company, and you will be the person the research and product teams rely on when a question is only as good as the pipeline behind it. We are small, so you will own the whole surface rather than a corner of it.

What you'll do

Own the platform — design, build, and run ingestion, warehouses, and the tooling around them.
Model the data — turn raw events into models the research and product teams can trust.
Keep it trustworthy — tests, lineage, and monitoring so a number is defensible.
Serve ASOC — the real-time signal and outcome data the revenue system depends on.
Serve UPII research — clean, well-structured corpora for the research program.
Support partnerships — stand up warehouses and pipelines for partner engagements.
Set the standards — decide how data moves through the company, and document it.

Who we're looking for

Production data engineering — you have built and run pipelines others depend on.
Warehouse depth — you have used Snowflake, BigQuery, or Redshift in anger.
Strong SQL and modeling — you model for clarity, not just to make a query run.
Orchestration — you have owned scheduled, dependency-aware pipelines.
Reliability mindset — you treat a broken pipeline as a broken promise.
Range — comfortable across batch and streaming.
A finisher — you close problems and keep systems alive.
Self-direction — you do not need a manager scheduling your week.
Relevant background — hands-on experience with tools like dbt, Airflow or Dagster, and Kafka.

Nice to have

dbt or similar — you have run a modern transformation layer.
Streaming — Kafka, Pub/Sub, or equivalents in production.
GCP — we host on Google Cloud; familiarity helps.
Applied-AI adjacency — you have fed data to RAG or agent systems.
Open-source — you have contributed to tools the field uses.

How we hire

A short application: CV, links, a brief note on a data problem you'd want to own, and an optional writing sample
A 45-minute conversation with one of the founders
A paid take-home problem, scoped to roughly 6 to 8 hours
A half-day on-site in Bangalore
An offer within 7 working days of the on-site, or a clear no with feedback

We do not ghost. Whether or not we move forward, you will hear from us at every stage.

What we offer

Hybrid: 3 days in office (Tuesday/Wednesday/Thursday at HSR Layout, Bangalore), 2 days remote
Annual research budget for books, conferences, courses, and compute — your call on how to spend it
Conference attendance support for relevant venues
Health insurance for you, your spouse, two children, and two parents
28 days of paid time off, plus public holidays
Compensation is calibrated to seniority and prior outcomes, and is discussed at the offer stage