Research Engineer, Post Training RL Job at TensorStax, Hayward, CA

  • TensorStax
  • Hayward, CA

Job Description

Research Engineer – Post Training Reinforcement Learning

Location: San Francisco (Hybrid)

About TensorStax

TensorStax is building fully autonomous AI systems to manage and maintain mission-critical data infrastructure and pipelines. We use reinforcement learning to improve language models' ability to reason over large-scale data lakes and warehouses, detect pipeline failures, and construct new pipelines with high precision. The goal is agentic behavior: systems that proactively identify and resolve issues without human intervention.

What You’ll Do

As a Research Engineer specializing in Reinforcement Learning, you will:

  • Develop and refine reward functions to optimize agent behavior for complex data engineering tasks.
  • Create RL gym environments for language model agents.
  • Fine-tune language models using reinforcement learning and preference-optimization techniques such as PPO, DPO, and KTO.
  • Stay at the forefront of research on RL for language models, incorporating advancements like GRPO, SWE-Gym, and SWE-RL into practical applications.
  • Curate and build high-quality datasets for supervised fine-tuning (SFT) and RLHF.
  • Design experiments to evaluate and improve the agentic capabilities of language models in data environments.

What We’re Looking For

  • Deep understanding of reinforcement learning, reward shaping, and optimization strategies.
  • Strong familiarity with LLM fine-tuning techniques (PPO, DPO, KTO) and their applications in reinforcement learning.
  • Knowledge of recent advancements in RL for language models (GRPO, SWE-Gym, SWE-RL).
  • Experience curating and constructing high-quality datasets for fine-tuning.
  • Strong problem-solving skills and a history of working on complex ML projects.
  • High agency—ability to work independently, experiment proactively, and drive research initiatives forward.

Bonus Points

  • Experience with distributed training in PyTorch (DDP, FSDP).
  • Hands-on experience designing RL environments for traditional RL problems.
  • Contributions to open-source projects in RL, LLMs, or ML infrastructure.
  • Familiarity with data lakes and warehouses (Snowflake, BigQuery, Redshift).

Benefits

  • 100% employer-covered health, dental, and vision insurance.
  • 401(k) with company match.
  • Access to Bay Club or Equinox in San Francisco.

Job Tags

Temporary work
