Job description

About the company

The employer is a YC-backed startup focused on AI training data infrastructure. Instead of acting like a labor marketplace, it offers a platform that helps data owners convert their existing datasets into formats that AI labs need, and then sell those datasets directly to buyers. This model opens access to more valuable data sources, and the team is scaling quickly to meet increasing demand.

Role overview

This position is the company’s most urgent hiring need and centers on a difficult research challenge: quality assurance at scale. Because data moves through a decentralized marketplace, building reliable verification systems is the main constraint on growth. In this role, you will create automated methods that confirm and improve data quality so that suppliers can consistently provide strong data to customers.

The work begins with hands-on analysis of the data to uncover failure patterns. From there, you will design scalable systems for quality checks, combining rule-based logic, AI-assisted methods, and human review when needed. This is a research-heavy engineering role, not a manual quality assurance job.

Responsibilities

Spot problems in datasets, such as mismatches, formatting errors, and issues during ingestion.
Start with manual review to understand where and how data quality breaks down.
Create automated quality-control systems that can operate at scale using rules and AI-based techniques.
Develop mixed workflows that use automation alongside human review when that produces better results.
Keep refining validation approaches as data needs and AI tools continue to change.

Requirements

Very strong technical ability, including the capacity to learn fast and adapt quickly in a changing environment.
Experience in AI/ML engineering, or in software engineering at an AI company with clear exposure to data ingestion and processing pipelines.
Ability to infer likely data-quality issues from first principles.
Comfort working through ambiguous problems independently from start to finish.
Willingness to work on-site full time in San Francisco.
Nice to have: experience with noisy or unstructured data, or sound judgment about when to rely on automation versus human review.

Additional information

Location: San Francisco, CA

Work model: In-person

Industry: AI training data infrastructure

Compensation: $140K-$250K base, plus equity

Research Engineer
