Specifications
- Apply from: Residents Only
- Salary: 8M-16M
- Location: Not Specified
- Conditions: Full-time
Requirements
- Japanese: Business (N2)
- English: Business
- Minimum Experience: Not Specified
- Status: NEW
Job Description
Responsibilities
- Evaluation Metric Development
  - Research and implement LLM-as-Judge calibration, reward modeling, and benchmark design
  - Define and validate evaluation metrics for AI quality assessment
- Automated Evaluation Pipelines
  - Design and build scalable CI/CD evaluation systems
  - Integrate research outcomes into production workflows
- Safety and Red Teaming
  - Automate adversarial testing and policy compliance verification
- Quality Improvement
  - Use statistical experiments (A/B testing) to verify prompt and model changes
  - Provide evaluation feedback to R&D teams for continuous improvement
- Product Quality Assurance
  - Ensure AI output quality for ~200 enterprise clients