As a Senior Prompt Engineer focused on Data Science and Quality Analysis, you’ll design, test, and evaluate prompts for AI systems that interact with real-world restaurant data. You’ll work cross-functionally to develop AI solutions that drive operational efficiency, improve data interpretation, and support smarter decision-making for restaurant operators.
Your work will directly influence how AI models perform in high-stakes, dynamic environments such as order processing, reporting, support automation, and performance analysis.
- Proven experience in prompt engineering and working with LLMs (GPT-4, Claude, Gemini, and LLaMA) for text generation, reasoning, and structured data extraction.
- Proficiency in Python and SQL for data analysis, evaluation scripting, and workflow automation.
- Strong background in A/B testing, statistical analysis, and performance metrics evaluation, with the ability to design experiments and interpret data-driven insights for continuous model optimization.
- Familiarity with prompt-evaluation tools such as Langfuse or Galileo, and with Weights & Biases for experiment management and regression testing.
- Deep understanding of advanced prompting techniques, including few-shot prompting, reasoning-based prompting, multi-turn dialogue design, agentic orchestration, and DSPy/AdalFlow-style programmatic prompting frameworks.
- Experience applying the CO-STAR and TIDD-EC prompting frameworks for structured reasoning, instruction design, and context control in production-grade LLM systems.
- Excellent requirement-elicitation and communication skills, with the ability to translate business objectives into prompt engineering solutions.
- Analytical mindset with a process-driven approach to optimizing model behavior, data quality, and operational workflows.
- Design, test, and optimize LLM prompts for conversational AI, text classification, and structured data extraction tasks.
- Build evaluation pipelines to analyze prompt performance using quantitative metrics, human-in-the-loop feedback, and business KPIs.
- Conduct prompt experiments and regression testing to ensure stability, accuracy, and safety as models evolve.
- Collaborate with Machine Learning, Product, and Operations teams to translate business objectives into scalable, data-driven prompt-engineering strategies that enhance model accuracy, efficiency, and real-world usability.
- Use Python/SQL to analyze model outputs, identify anomalies, and automate quality checks.
- Document best practices and contribute to internal frameworks for prompt evaluation and continuous improvement.
- Communicate findings effectively to technical and non-technical stakeholders, driving measurable business impact through insight-driven decisions.
100% Remote
Salary $145,000 - $160,000
- B.S. or higher in a quantitative discipline (Data Science, Computer Science, Engineering, or related field) or in a field relevant to language models (Linguistics, Philosophy, Cognitive Science, etc.).
- 5+ years of relevant experience with a B.S. degree, or 3+ years of experience with a Master’s degree.
- Demonstrated proficiency in Python for automation, evaluation, and experimentation with LLM workflows.
- Academic or applied research experience related to language models, prompt engineering, or LLM-based systems is a strong plus.
- Familiarity with LLM architectures, embeddings, and fine-tuning techniques preferred.
- Experience with LLM red-teaming, adversarial evaluation, or model safety testing is a plus.
- Health Care Plan (Medical, Dental & Vision)
- Retirement Plan (401k)
- Life Insurance (Basic, Voluntary & AD&D)
- Flexible Paid Time Off
- Family Leave (Maternity, Paternity)
- Short Term & Long Term Disability
- Training & Development
- Work From Home
- Stock Option Plan