XPENG

AI Agent Data Pipeline Intern

San Clara, Manitoba, Canada · Part Time

Experience: Any
Openings: 1
Posted: 2 days ago

Job description

About the role

XPENG is a smart mobility company focused on advanced AI, autonomous driving, and connected technologies across electric vehicles, eVTOL aircraft, and robotics. The team develops platform capabilities that help build and deploy autonomous driving AI models, working closely with machine learning engineers to improve the speed, quality, and reliability of the experiment cycle from planning through deployment readiness.

The internship centers on creating the data foundation for an LLM-powered agent that helps machine learning engineers gather experiment context, track progress, and uncover useful insights from active model development work. The role involves organizing messy data sources, especially chat and meeting content, so the agent can accurately retrieve, interpret, and reason over experiment-related information. Based on performance and interest, there may also be an opportunity to assist with fine-tuning LLM-based models using curated experiment data.

Responsibilities

  • Develop pipelines to collect and structure experiment-related information from team conversations, meeting notes, experiment plans, analysis documents, metrics, and evaluation outputs.
  • Apply LLM-assisted methods to clean unstructured, noisy inputs, pull out experiment-relevant details, and turn fragmented discussions into structured records.
  • Create schemas, metadata, and validation checks that improve searchability, traceability, and downstream use of experiment context.
  • Support retrieval and indexing flows, including semantic search or RAG-style systems, so the agent can access the most relevant context.
  • Assemble curated datasets for agent testing and, where needed, LLM fine-tuning or instruction tuning.
  • Partner with machine learning engineers and platform engineers to understand experiment workflows, information gaps, and the insights most valuable for planning and analysis.
  • Check whether the agent is correctly using curated experiment data to produce summaries, comparisons, recommendations, and analytical insights.
  • Help create internal tools, dashboards, or reports that let teams monitor experiment status, results, and trends.

Requirements

  • Solid programming ability in Python and SQL, along with practical data processing experience.
  • Experience handling both structured and unstructured datasets, including text-heavy sources such as documents, notes, messages, or logs.
  • Exposure to data pipelines, ETL workflows, or large-scale data processing systems.
  • Interest in LLM development, LLM evaluation, agentic AI, RAG pipelines, semantic retrieval, prompt engineering, or LLM-assisted data workflows.
  • Familiarity with machine learning workflows, model training, evaluation metrics, or MLOps concepts.
  • Strong analytical judgment with a focus on data quality, consistency, and reliability.
  • Ability to work through ambiguous data inputs and collaborate with ML and platform teams to clarify needs.
  • Prior experience creating internal tools, automation scripts, or data quality checks.

Perks and benefits

  • Supportive, engaging, and collaborative work environment.
  • Access to infrastructure and compute resources needed for the work.
  • Chance to work on advanced technologies alongside highly skilled professionals.
  • Opportunity to contribute to the transportation revolution through autonomous driving innovation.
  • Competitive compensation package.
  • Snacks, lunches, dinners, and enjoyable team activities.

Additional information

XPENG is an equal opportunity employer. Employment decisions are made without regard to race, age, color, sex, sexual orientation, religion, national origin, disability, veteran status, marital status, or any other status protected by applicable federal or state law.
