Build and operate LLM-augmented ETL pipelines and domain-specific information retrieval systems. Design, deploy, and optimize production-grade data ingestion, databases, and infrastructure on AWS. Own projects end-to-end, collaborate with leadership and customers, and refine data priorities based on user feedback.
Maven Bio builds domain-specific AI for the BioPharma industry.
Our clients include publicly traded BioPharma companies, venture capital firms, and global consultancies. We're based in Boston, the heart of the global BioPharma industry. As a Data Platform Engineer, you'll play a central role in enhancing the industry datasets that directly inform strategic decision-making across BioPharma.
What You’ll Do:
- Collaborate directly with our CEO and technical leadership on strategic decisions and data strategy
- Design, build, and optimize LLM-augmented ETL pipelines
- Implement, experiment with, and optimize domain-specific information retrieval systems
- Own projects end-to-end, from conception to deployment, ensuring robustness, scalability, and accuracy
- Interface closely with our customers and internal teams to continuously refine data priorities based on user feedback
What We’re Looking For:
- Professional experience building production-grade data ingestion pipelines
- Strong proficiency with Python and building robust APIs
- Experience designing and integrating LLM-enabled ETL pipelines
- Expertise with relational databases (PostgreSQL preferred) and infrastructure management on AWS (Kubernetes preferred)
- Demonstrated ability to rapidly learn and deeply engage with complex industries (biopharma experience is a significant plus)
- A builder mentality with side projects or professional experiences showcasing your ability to create innovative solutions
- You're in Boston or willing to relocate, you want to work in person, and you're excited to work in a low-meeting environment
What We Offer:
- Career Acceleration: Join a rapidly growing YC startup with sustained market traction that serves some of the top names in BioPharma and is backed by strong financial resources
- Impact & Ownership: Directly influence product direction and technological decisions
- Balanced Intensity: We aim for highly productive, focused work during the week (45-55 hrs) and value offline weekends
- Cutting-edge Technology: Opportunity to work at the forefront of generative AI paired with a proprietary database of BioPharma knowledge
Our Team:
- We are a ~10-person team that combines decades of experience from top-performing technology, biopharma, and consulting firms (McKinsey, Google, Airbnb, Valo Health, NeuTrace, Science.io)



