Software Architect - Ion
**Location:** Remote, US-based, with morning hours that overlap with our EU
engineering team
**Type:** Full-time, W-2 employee
**Reports to:** CTO
**Compensation:** $150,000-$175,000 base salary, commensurate with experience,
plus health, dental, vision, flexible PTO, and a remote-first environment
About Ion
Ion builds the data platform behind sub-metered water utility deployments. Our
platform turns high-volume IoT telemetry from 80,000+ meters into reliable
hourly usage, leak detection, billing, and operational insight for property
owners and operators.
Our systems ingest several million device messages per day, reconcile data
across multiple device protocols and partner platforms, and deliver that data
into analytics, billing, operations, and customer-facing products. We are
scaling the platform and need a senior technical leader who can bring
architecture discipline, operational maturity, and hands-on engineering depth
to the team.
Role Summary
We are hiring a Software Architect to own the technical direction of the Ion
platform end-to-end. This is not a pure advisory role. You will write
production code, review architecture and pull requests, make key technical
decisions, and help the engineering team consistently ship reliable,
maintainable systems.
You will own the architecture behind our ingestion pipeline, device
integrations, time-series data model, analytics layer, and production
reliability practices. You will also be the senior technical voice day-to-day,
helping translate business needs around billing, leak detection, operations,
and customer reporting into scalable platform capabilities.
What You'll Own
Core Platform Architecture
* Own and evolve our cloud-native, event-driven ingestion platform for high-volume IoT telemetry.
* Design for reliable message processing, normalization, deduplication, late-arriving data, backfills, and operational recovery.
* Make architecture decisions around where logic belongs across ingestion services, storage layers, APIs, and analytics workloads.
* Ensure the platform can support new device streams, new partner integrations, and increasing telemetry volume without constant rework.
Data Architecture and Time-Series Storage
* Own the data model that supports raw telemetry, hourly usage, device health, leak detection, billing, and reporting.
* Lead schema design, migration strategy, retention strategy, and performance tuning across operational and analytical stores.
* Help determine when data belongs in a transactional store, time-series database, object storage, or secondary analytics layer.
* Improve data quality controls so the business can trust the numbers being shown to customers.
Analytics and Detection Logic
* Own the technical design behind usage calculations, leak detection, anomaly detection, and billing-grade aggregations.
* Work with product and operations teams to improve detection accuracy as new edge cases and device types are introduced.
* Build systems that make it easier to validate outputs, compare against partner data, and identify true data loss versus expected device or collector behavior.
Device and Partner Integrations
* Lead onboarding of new device streams and partner data sources with minimal disruption to the existing platform.
* Translate vendor MQTT/API specifications into Ion's normalized data model.
* Design reconciliation tooling that proves delivery completeness and highlights discrepancies between Ion, partner systems, and customer-facing dashboards.
Infrastructure, Reliability, and Operations
* Own infrastructure-as-code, deployment practices, observability, alerting, and production readiness standards.
* Define reliability expectations around ingestion latency, completeness, backlog handling, and incident response.
* Improve monitoring and reconciliation so data issues are caught internally before customers notice them.
* Participate in a light production escalation rotation for ingestion incidents, data drops, partner outages, and other platform-critical issues.
Responsibilities
* Set and document architecture direction for the Ion platform.
* Build and ship production backend code, primarily in TypeScript and Node.js.
* Review pull requests and raise the quality bar across the engineering team.
* Make practical build-versus-buy and storage decisions based on scale, reliability, maintainability, and cost.
* Lead technical design for new platform capabilities, partner integrations, and data products.
* Improve testing, observability, code review, deployment discipline, and operational maturity.
* Partner closely with Product, Operations, Sales, and the CTO to turn business requirements into technical plans.
* Mentor engineers and help a distributed team make better architecture decisions without creating unnecessary process.
Current Technical Environment
You do not need to have used every tool in our stack, but you should be
comfortable learning quickly and making good architecture decisions in this
type of environment.
* **Cloud and infrastructure:** AWS, Lambda, SNS/SQS, S3, DynamoDB, Fargate/ECS, IAM, VPC, CloudFormation
* **Data and analytics:** TimescaleDB, PostgreSQL, S3 Parquet, Athena, Glue, time-series aggregation, data reconciliation
* **Backend:** TypeScript, Node.js, event-driven services, APIs, background jobs, deployment automation
* **Device and partner data:** MQTT, partner APIs, device telemetry, normalization, idempotency, late-arriving data, backfills
* **Frontend touchpoints:** Next.js, React, Tailwind, platform APIs that support customer-facing workflows
* **Delivery:** GitHub Actions or similar CI/CD, Docker/ECR, structured code review, production monitoring and alerting
Required Experience
* 7+ years building production backend, platform, data, or cloud systems.
* 3+ years in a senior technical role such as Software Architect, Staff Engineer, Principal Engineer, or hands-on Engineering Lead.
* Strong experience designing event-driven, distributed, or streaming systems that handle high-volume data.
* Deep AWS experience, especially with serverless, messaging, storage, security, and infrastructure-as-code.
* Strong TypeScript/Node.js experience, or comparable backend depth with the ability to become productive in TypeScript quickly.
* Experience designing reliable data models and working with relational, time-series, analytical, or large-scale operational data stores.
* Experience with React/Next.js.
* Strong understanding of idempotency, retries, dead-letter handling, ordering, backfills, monitoring, and incident response.
* Track record of owning production systems where correctness, reliability, and operational visibility matter.
* Ability to lead architecture discussions, make clear technical decisions, and hold a high quality bar across a remote team.
Strongly Preferred
* MQTT or other device/IoT telemetry experience at meaningful scale.
* TimescaleDB, InfluxDB, ClickHouse, or similar time-series database experience.
* Data reconciliation experience, especially comparing internal systems against an external source of truth.
* Experience with utility, energy, water, building operations, billing, or other telemetry-heavy domains.
* Experience with anomaly detection, signal processing, or time-series analytics.
* Hands-on experience with S3-based analytics, Athena, Glue, Parquet, or similar lake/lakehouse patterns.
* Experience with Docker, ECS/Fargate, ECR, and CI/CD pipelines.
* Experience mentoring distributed engineering teams and improving engineering process without adding unnecessary bureaucracy.
Nice to Have
* Multi-tenant SaaS experience.
* Experience integrating LLM APIs into ETL, document extraction, operational workflows, or internal tooling.
Team and How You'll Work
* You will work closely with the CTO and Product Owner, and lead technical direction for a small, senior engineering team.
* The engineering team is primarily remote and EU-based, so morning overlap is important. We generally expect an 8:00am start time for standups, architecture review, and unblocking.
* This role is expected to be hands-on. You will write code, review code, debug production issues, and make architecture decisions.
* The environment is fast-moving, practical, and business-focused. We care about clean architecture, but we care even more about reliable systems, clear ownership, and solving the right problems for customers.
What Success Looks Like
In the first 90 days, success will look like:
* You understand the current platform architecture, data flow, and major reliability risks.
* You have shipped production improvements and earned trust with the engineering team.
* You have identified the highest-leverage architecture changes needed to improve reliability, scalability, and maintainability.
* You have improved visibility into ingestion completeness, latency, data quality, and operational health.
* You have helped the team make clearer architecture decisions and raised the standard for delivery quality.