Data Engineer, Chemistry Simulation Platform
Job Description
About SandboxAQ
SandboxAQ is a high-growth company delivering AI solutions that address some of the world's greatest challenges. The company’s Large Quantitative Models (LQMs) power advances in life sciences, financial services, navigation, cybersecurity, and other sectors.
We are a global team that is tech-focused and includes experts in AI, chemistry, cybersecurity, physics, mathematics, medicine, engineering, and other specialties. The company emerged from Alphabet Inc. as an independent, growth capital-backed company in 2022, funded by leading investors and supported by a braintrust of industry leaders.
At SandboxAQ, we’ve cultivated an environment that encourages creativity, collaboration, and impact. By investing deeply in our people, we’re building a thriving, global workforce poised to tackle the world's epic challenges. Join us to advance your career in pursuit of an inspiring mission, in a community of like-minded people who value entrepreneurialism, ownership, and transformative impact.
About the Role
SandboxAQ is seeking a generalist software engineer to build infrastructure, development tooling, data pipelines, and data storage systems for AI and simulation in chemistry and life sciences. In this role, you’ll play a central part in developing tools that ingest, process, and serve large volumes of data. You’ll also help improve the developer experience to increase developer velocity.
You’ll bring broad experience in data storage and processing technologies, orchestration tools, Python programming, CI/CD, and infrastructure as code. Most importantly, you’ll bring a track record of working in a small, fast-moving software development team, exploring new technologies, and solving problems across an entire software stack.
What You’ll Do
- Build and operate scalable data pipelines for data ingestion, processing, analytics, and storage. Optimize the performance and cost-effectiveness of data pipelines and storage.
- Maintain data warehouse and data lake solutions.
- Collaborate closely with R&D teams to build and operate data tooling to meet project goals.
- In collaboration with domain experts, design and implement data models for scientific data and APIs to store and manipulate data across file storage, graph databases, and relational databases.
- Contribute to the design and implementation of security-sensitive data processing and storage systems with complex tenancy and data isolation requirements.
- Collaborate closely with the product team and internal stakeholders in all phases of software development to validate the solutions you propose and implement.
- In collaboration with the rest of the engineering team, build and manage infrastructure for SandboxAQ’s simulation and data platform.
- Review code and participate in design and architectural discussions.
About You
- 3+ years of experience with Python, with strong knowledge of software design principles.
- Understanding of database principles and best practices.
- Experience with large-scale analytic databases. BigQuery preferred.
- 2+ years of experience managing public cloud infrastructure with infrastructure as code. Familiarity with Terraform. GCP preferred.
- 2+ years of experience building data pipelines or data processing systems at scale.
- Experience with orchestration tools like Airflow.
- Experience writing and optimizing database queries; graph database experience is a plus.
- Strong understanding of web and network fundamentals and experience with designing, building, and testing web APIs.
- Knowledge of CI/CD best practices and building CI/CD pipelines.
- Excellent communication and collaboration skills, with the ability to effectively influence a cross-functional team.
Nice to Have
- Experience with GraphQL and familiarity with Strawberry or similar.
- Experience with FastAPI.
- Experience with CircleCI.
The US base salary range for this full-time position is expected to be $125k - $175k per year. Our salary ranges are determined by role and level. Within the range, individual pay is determined by factors including job-related skills, experience, and relevant education or training. This role may be eligible for annual discretionary bonuses and equity.
SandboxAQ welcomes all.
Company Information
Location: Palo Alto, California, United States
Type: Hybrid