The HAL is one of the most consequential layers in an embedded SDK to design well — and one of the most commonly designed by accretion.

Three patterns dominate the production landscape, and the choice between them shapes everything downstream.

The hardware abstraction layer is where architectural intent meets the register map. It is also where most embedded SDKs accumulate their longest-lived design debt, because the HAL is usually the first thing written for a new silicon family — at the point in the project when the driver developers know the most about the hardware and the least about the platform’s eventual scope. Decisions made in week three of a new SDK end up shaping how the platform supports five product variants, three RTOSes, and a community ecosystem of third-party drivers, five years later.

Three interface patterns dominate production-grade HALs in the embedded landscape: opaque handle types, vtable-based polymorphism, and device-tree-driven driver models. They are not mutually exclusive — most mature HALs combine them — but each carries distinct implications for portability, testability, and the experience of the driver developer who has to extend the HAL to a new peripheral or a new silicon variant.

This post walks through each pattern, the problems it solves, and where it fits in the broader set of HAL design principles that any production SDK has to satisfy.


What the HAL has to do at once

Before the patterns make sense, it’s worth being explicit about what a HAL is being asked to do simultaneously. A production HAL has to satisfy at least five constraints, and the difficulty of HAL design is that these constraints partially conflict.

It has to be portable: the same HAL API call must have the same signature and behave identically on every supported device in the family, with the only exception being features the underlying hardware genuinely cannot support.

It has to be performant: HAL overhead must be negligible for high-throughput peripherals like SPI or DMA-driven UART. Driver code that adds a layer of indirection per byte is not a HAL — it’s a bottleneck.

It has to be RTOS-aware: blocking APIs must support configurable timeouts and integrate cleanly with RTOS task scheduling, without baking in a specific kernel.

It has to propagate errors cleanly: every HAL function must return a typed error code that callers can handle without invoking undefined behavior, consistent across the family regardless of which device implements which subset of features.

It has to be thread-safe and re-entrant. These are two different requirements, and both have to be specified per API function. Thread safety means HAL instances must be independently lockable for concurrent use from multiple RTOS tasks, without forcing callers to share a single global lock. Re-entrancy means a function can be entered again before an earlier call to it has finished, which is what makes specific functions safe to call from interrupt context as well as task context. A given API function may be one, both, or neither, and the contract has to be explicit in the header for every function in the HAL.
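A minimal sketch of how the thread-safety and ISR-safety contracts might be written down, assuming the HAL reaches the kernel only through a small table of hooks. Every name here (hal_os_ops_t, hal_spi_t, hal_spi_transfer, hal_spi_irq_pending, fake_os) is hypothetical, and the "mutex" is a plain flag standing in for a real RTOS mutex so the example builds on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { HAL_OK = 0, HAL_ERR_PARAM, HAL_ERR_TIMEOUT } hal_status_t;

/* Hypothetical OS hooks, filled in by the RTOS port, so the HAL itself
 * never names a specific kernel. */
typedef struct {
    bool (*lock)(void *mutex, uint32_t timeout_ms);   /* false on timeout */
    void (*unlock)(void *mutex);
} hal_os_ops_t;

/* Hypothetical instance state: every instance carries its own lock. */
typedef struct {
    const hal_os_ops_t *os;
    void               *mutex;
    volatile uint32_t   irq_flags;
} hal_spi_t;

/* Thread-safe: yes (per-instance lock). ISR-safe: no (may block). */
hal_status_t hal_spi_transfer(hal_spi_t *h, uint32_t timeout_ms)
{
    if (h == NULL) {
        return HAL_ERR_PARAM;
    }
    if (!h->os->lock(h->mutex, timeout_ms)) {
        return HAL_ERR_TIMEOUT;
    }
    /* ... program the peripheral here ... */
    h->os->unlock(h->mutex);
    return HAL_OK;
}

/* Thread-safe: yes. ISR-safe: yes (one volatile read, never blocks). */
bool hal_spi_irq_pending(const hal_spi_t *h)
{
    return h->irq_flags != 0u;
}

/* Host stand-in for an RTOS mutex: a flag that fails to lock while held. */
static bool fake_lock(void *mutex, uint32_t timeout_ms)
{
    bool *held = mutex;
    (void)timeout_ms;
    if (*held) {
        return false;
    }
    *held = true;
    return true;
}

static void fake_unlock(void *mutex)
{
    *(bool *)mutex = false;
}

const hal_os_ops_t fake_os = { fake_lock, fake_unlock };

/* Two instances lock independently: one busy instance does not stall the other. */
void hal_spi_lock_example(void)
{
    bool m1 = false;
    bool m2 = true;   /* pretend another task already holds this lock */
    hal_spi_t spi1 = { &fake_os, &m1, 0u };
    hal_spi_t spi2 = { &fake_os, &m2, 0u };

    assert(hal_spi_transfer(&spi1, 10u) == HAL_OK);
    assert(hal_spi_transfer(&spi2, 10u) == HAL_ERR_TIMEOUT);
}
```

The point of the sketch is the comment above each function: the contract is stated per function, and the timeout comes from the caller rather than from the kernel the HAL happens to be built against.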

The three interface patterns are different answers to the question of how to satisfy these constraints simultaneously, and they make different trade-offs against the porting cost, the testing experience, and the driver-developer ergonomics.


Pattern 1: Opaque handle types

The opaque handle is the most widely deployed pattern in embedded HALs. A handle is a pointer to an incomplete type — a struct declared in the public header but defined only in the implementation. Callers hold handles and pass them to driver functions but cannot inspect or modify internal fields.

/* uart_hal.h: public interface */
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uint8_t, uint32_t */

typedef struct hal_uart_s hal_uart_t;  /* incomplete type: implementation hidden */
typedef struct hal_uart_config_s hal_uart_config_t;  /* config struct (tag name assumed), defined elsewhere */

hal_uart_t *hal_uart_open(uint8_t instance, const hal_uart_config_t *cfg);
int         hal_uart_write(hal_uart_t *h, const uint8_t *buf, size_t len);
int         hal_uart_read(hal_uart_t *h, uint8_t *buf, size_t len, uint32_t timeout_ms);
void        hal_uart_close(hal_uart_t *h);

What this buys you, in practice, comes down to three concrete properties.

Driver internals can change freely without breaking the ABI or requiring recompilation of dependent modules. The handle struct can grow new fields, reorder existing ones, or replace internal state machines, and as long as the public function signatures are stable, application code keeps working. This is the property that lets a HAL evolve across SDK releases without forcing customers to recompile the world.

Unsafe direct field access — one of the most common sources of driver misuse in embedded code — is eliminated at compile time. A customer who would otherwise reach into a UART_HandleDef_t to fix some symptom they’re seeing cannot do so, because the struct is not defined where their code can see it. The compiler enforces the abstraction.

Multiple instances of the same peripheral are managed uniformly. UART1, UART2, and UART3 are all hal_uart_t *, and the HAL functions operate on whichever handle is passed to them. There is no per-instance global state, no parallel set of _uart1_send / _uart2_send symbols, and no special-casing in application code.

The opaque handle is the right default for any HAL that ships across multiple device variants and expects to evolve over time. It costs almost nothing to apply consistently from day one, and retrofitting it later requires rewriting every API signature and every call site.
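The one design question the opaque handle does raise is storage: callers cannot know the size of the struct, so the implementation has to own the memory. A minimal sketch of the private side, assuming a static pool with one slot per hardware instance and no heap. The field names, the baud_rate config member, and HAL_UART_INSTANCES are illustrative assumptions, and the public and private halves are shown in one file only so the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* --- what the public header would expose --- */
typedef struct hal_uart_s hal_uart_t;
typedef struct { uint32_t baud_rate; } hal_uart_config_t;  /* assumed field */

hal_uart_t *hal_uart_open(uint8_t instance, const hal_uart_config_t *cfg);
void        hal_uart_close(hal_uart_t *h);

/* --- what would stay private in the implementation file --- */
#define HAL_UART_INSTANCES 3u   /* assumed instance count */

struct hal_uart_s {
    bool     in_use;
    uint8_t  instance;
    uint32_t baud_rate;
};

/* One statically allocated slot per hardware instance. */
static struct hal_uart_s uart_pool[HAL_UART_INSTANCES];

hal_uart_t *hal_uart_open(uint8_t instance, const hal_uart_config_t *cfg)
{
    if (instance >= HAL_UART_INSTANCES || cfg == NULL) {
        return NULL;
    }
    struct hal_uart_s *h = &uart_pool[instance];
    if (h->in_use) {
        return NULL;            /* instance already opened */
    }
    h->in_use    = true;
    h->instance  = instance;
    h->baud_rate = cfg->baud_rate;
    return h;
}

void hal_uart_close(hal_uart_t *h)
{
    if (h != NULL) {
        h->in_use = false;
    }
}

void hal_uart_pool_example(void)
{
    const hal_uart_config_t cfg = { 115200u };
    hal_uart_t *u0 = hal_uart_open(0u, &cfg);

    assert(u0 != NULL);
    assert(hal_uart_open(0u, &cfg) == NULL);  /* double open is rejected */
    hal_uart_close(u0);
}
```

Because struct hal_uart_s is defined only next to the pool, adding or reordering fields changes nothing that application code can observe.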


Pattern 2: Vtable-based polymorphism

For systems that target multiple hardware platforms, support runtime swapping between hardware variants, or require test-double injection without hardware, vtable-based polymorphism is the idiomatic C equivalent of an abstract interface. A driver vtable is a struct of function pointers that defines the operations a peripheral must support. Middleware components receive a pointer to the vtable and call through it, remaining entirely agnostic of the underlying hardware.

typedef struct {
    int  (*init)(hal_uart_t *h, const hal_uart_config_t *cfg);
    int  (*write)(hal_uart_t *h, const uint8_t *buf, size_t len);
    int  (*read)(hal_uart_t *h, uint8_t *buf, size_t len, uint32_t timeout_ms);
    int  (*ioctl)(hal_uart_t *h, hal_uart_cmd_t cmd, void *arg);
    void (*close)(hal_uart_t *h);
} hal_uart_driver_t;

/* Middleware receives the driver vtable and an opaque handle, never a concrete driver type */
void protocol_stack_init(const hal_uart_driver_t *drv, hal_uart_t *handle);

Swapping a hardware target — or injecting a test double — becomes a single pointer assignment at the composition root. The middleware does not know whether it is talking to a real UART, a mock UART that records calls into a test harness, or a virtual UART that loops back over a Unix socket on a host build. This decoupling is what makes meaningful host-side unit testing of middleware achievable without hardware, and it is the property that turns a HAL from a porting layer into a testing primitive.
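A minimal sketch of test-double injection, assuming a cut-down vtable with only a write operation. The mock backend, its capture buffer, and protocol_send_ping are hypothetical names invented for the illustration; the middleware function stands in for whatever sits above the HAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct hal_uart_s hal_uart_t;

/* Reduced vtable: only the operation this sketch exercises. */
typedef struct {
    int (*write)(hal_uart_t *h, const uint8_t *buf, size_t len);
} hal_uart_driver_t;

/* Mock backend: in the test build, the handle is just a capture buffer. */
struct hal_uart_s {
    uint8_t captured[64];
    size_t  captured_len;
};

static int mock_uart_write(hal_uart_t *h, const uint8_t *buf, size_t len)
{
    if (len > sizeof(h->captured) - h->captured_len) {
        return -1;                       /* capture buffer full */
    }
    memcpy(&h->captured[h->captured_len], buf, len);
    h->captured_len += len;
    return (int)len;
}

const hal_uart_driver_t mock_uart_driver = { .write = mock_uart_write };

/* Middleware under test: sees only the vtable and the handle pointer. */
int protocol_send_ping(const hal_uart_driver_t *drv, hal_uart_t *handle)
{
    static const uint8_t ping[] = { 'P', 'I', 'N', 'G' };
    return drv->write(handle, ping, sizeof(ping));
}

/* Composition root of a host test: choosing the backend is one pointer. */
void protocol_mock_example(void)
{
    hal_uart_t mock = { .captured_len = 0 };

    assert(protocol_send_ping(&mock_uart_driver, &mock) == 4);
    assert(mock.captured_len == 4u);
    assert(memcmp(mock.captured, "PING", 4) == 0);
}
```

The test asserts on what the middleware sent without any hardware in the loop; pointing the same call at a silicon backend would need no change to protocol_send_ping.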

Vtables come with a real cost: the call overhead of an indirect function call, and the discipline required to keep every hardware backend’s vtable implementation behaviourally identical. In practice, the call overhead is negligible for everything except the highest-throughput inner loops, and for those paths the SDK can expose a direct-call fast path that bypasses the vtable for the specific hot operation. The behavioural-identity discipline is the harder problem: every backend has to pass the same conformance test suite, and any divergence between backends is a bug.
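The conformance idea can be sketched as one check written against the interface and run unchanged on every backend. The example below is an illustration under assumptions: the backend is wired in loopback, read returns the number of bytes received and zero when nothing arrives in time, and the host FIFO backend and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct hal_uart_s hal_uart_t;

typedef struct {
    int (*write)(hal_uart_t *h, const uint8_t *buf, size_t len);
    int (*read)(hal_uart_t *h, uint8_t *buf, size_t len, uint32_t timeout_ms);
} hal_uart_driver_t;

/* One conformance check, shared by every backend. Returns 0 on pass,
 * a negative step number on the first divergence. */
int uart_conformance_loopback(const hal_uart_driver_t *drv, hal_uart_t *h)
{
    const uint8_t tx[3] = { 0x55u, 0xAAu, 0x00u };
    uint8_t rx[3] = { 0 };

    if (drv->write(h, tx, sizeof(tx)) != (int)sizeof(tx)) {
        return -1;
    }
    if (drv->read(h, rx, sizeof(rx), 10u) != (int)sizeof(rx)) {
        return -2;
    }
    for (size_t i = 0; i < sizeof(tx); i++) {
        if (rx[i] != tx[i]) {
            return -3;
        }
    }
    /* With nothing pending, read must report zero bytes. */
    if (drv->read(h, rx, 1u, 10u) != 0) {
        return -4;
    }
    return 0;
}

/* Host backend: a small FIFO that loops TX back to RX. */
struct hal_uart_s {
    uint8_t fifo[16];
    size_t  head;
    size_t  tail;
};

static int loop_write(hal_uart_t *h, const uint8_t *buf, size_t len)
{
    size_t n = 0;
    while (n < len && (h->head - h->tail) < sizeof(h->fifo)) {
        h->fifo[h->head++ % sizeof(h->fifo)] = buf[n++];
    }
    return (int)n;
}

static int loop_read(hal_uart_t *h, uint8_t *buf, size_t len, uint32_t timeout_ms)
{
    size_t n = 0;
    (void)timeout_ms;            /* the host FIFO never waits */
    while (n < len && h->tail != h->head) {
        buf[n++] = h->fifo[h->tail++ % sizeof(h->fifo)];
    }
    return (int)n;
}

const hal_uart_driver_t loopback_driver = { loop_write, loop_read };

void uart_conformance_example(void)
{
    hal_uart_t host_uart = { { 0 }, 0, 0 };
    assert(uart_conformance_loopback(&loopback_driver, &host_uart) == 0);
}
```

A real suite would cover far more behaviour (timeouts, error codes, partial transfers), but the structure is the same: the check takes a driver pointer, so a new backend inherits the whole suite by passing its own vtable.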

The pattern is most valuable in three specific situations: when middleware components have to be portable across several hardware backends in the same SDK; when the SDK needs to support host-side testing without hardware; and when the platform offers multiple physical implementations of the same peripheral type — a software-defined UART, a hardware UART, and a USB CDC virtual UART, for instance — that all need to present the same interface to higher layers.


Pattern 3: Device-tree-driven driver model

The third pattern, modeled on the Linux kernel device model and adopted by Zephyr and several other modern embedded RTOSes, takes a different approach. Peripheral access is mediated through a device node obtained at compile time from the device tree. Driver APIs are still dispatched through a vtable-style ops structure attached to each device, but the device itself is resolved at build time, so there is no runtime lookup, no open call, and no handle allocation. What remains at run time is the indirect call through the ops structure.

/* Generic device-tree-driven driver pattern (Zephyr-style API) */
#include <errno.h>
#include <sdk/drivers/uart.h>

/* Device node resolved at build time from the board .dts */
static const struct device *const uart = DEVICE_DT_GET(DT_NODELABEL(uart0));

/* my_uart_cb, rx_buf and RX_TIMEOUT_US are defined by the application */
int app_uart_start(void)
{
    if (!device_is_ready(uart)) {
        return -ENODEV;
    }

    /* Async callback API: the device pointer identifies the instance, nothing to open */
    uart_callback_set(uart, my_uart_cb, NULL);
    return uart_rx_enable(uart, rx_buf, sizeof(rx_buf), RX_TIMEOUT_US);
}

The device tree describes the hardware topology — UART instances, pins, clock sources, DMA channels — in a platform-specific .dts file that lives separate from the driver source. The driver itself is generic; the board-specific configuration lives in the device tree, and the build system stitches them together at configuration time.

What this enables is a much cleaner separation between board support and driver code than either of the previous patterns can offer on their own. Adding a new board variant requires only a new .dts overlay, not forking the driver source. A platform that supports fifty boards has fifty device tree overlays and one driver per peripheral type, rather than fifty board-specific driver forks.
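The "one driver, many boards" split can be imitated in plain C to show the mechanism. The sketch below is not device tree tooling: an X-macro table stands in for what a build system would generate from a board's .dts, and every name, address, and baud rate in it is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for generated board data: one row per UART node
 * (label, base address, baud rate). A new board swaps only this table. */
#define BOARD_UART_NODES(X)            \
    X(uart0, 0x40002000u, 115200u)     \
    X(uart1, 0x40003000u,   9600u)

/* Generic driver side: knows the config layout, not the board. */
typedef struct {
    const char *name;
    uintptr_t   base;
    uint32_t    baud_rate;
} uart_config_t;

/* Expand every node into a const config object at compile time. */
#define DEFINE_UART(label, base_addr, baud) \
    const uart_config_t uart_cfg_##label = { #label, (base_addr), (baud) };
BOARD_UART_NODES(DEFINE_UART)

/* Hypothetical analogue of a "get device by node label" macro: it
 * resolves to the address of a named object, so nothing is looked up
 * at run time. */
#define UART_GET(label) (&uart_cfg_##label)

/* A table of all instances, generated from the same rows. */
#define UART_PTR(label, base_addr, baud) &uart_cfg_##label,
const uart_config_t *const uart_all[] = { BOARD_UART_NODES(UART_PTR) };

void board_uart_example(void)
{
    const uart_config_t *console = UART_GET(uart0);

    assert(console->baud_rate == 115200u);
    assert(sizeof(uart_all) / sizeof(uart_all[0]) == 2u);
}
```

Supporting another board in this scheme means providing a different BOARD_UART_NODES table; the struct, the macros, and any driver code that consumes uart_config_t stay untouched. Referring to a label the board does not define fails at compile time rather than on the device.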

The cost is a steeper initial learning curve. Developers new to the device tree have to learn the binding format, the overlay mechanism, and the Kconfig integration that ties them together. Customers evaluating an SDK for the first time encounter device tree before they encounter a working firmware image, which makes a quick getting-started target, such as a working image within thirty minutes, harder to hit. The pattern scales exceptionally well across large device families with diverse board configurations, but it is over-engineered for narrow platforms with a small number of board variants.

The device-tree pattern is the right answer for SDKs that already have a complex board matrix and that expect it to grow. It is the wrong answer for an SDK whose entire supported hardware set is three reference boards and a customer’s prototype.


How the patterns combine

In practice, mature HALs combine all three. The driver instance is identified by an opaque handle. The handle is wired up at compile time from a device tree binding, which resolves the peripheral instance, the pin configuration, and the DMA channel allocation. The driver dispatches through a vtable so that the same middleware code runs against the silicon driver, against a host-side mock, and against a future hardware variant whose details are not yet known.

The combination is more than the sum of the parts: the device tree handles compile-time configuration without runtime overhead, the vtable handles runtime polymorphism where it is genuinely needed, and the opaque handle prevents the entire stack from leaking implementation details to application code.
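One way the three layers can sit in a single object, sketched under assumptions: a device struct that applications see only as an opaque pointer, holding constant configuration fixed at build time, a vtable pointer, and mutable per-instance data. The layout loosely mirrors the api/config/data split used by device-model RTOSes, but the type and function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct hal_dev hal_dev_t;              /* opaque to applications */

typedef struct {                               /* vtable: runtime dispatch */
    int (*write)(const hal_dev_t *dev, const uint8_t *buf, size_t len);
} hal_uart_api_t;

typedef struct { uint32_t baud_rate; } hal_uart_cfg_t;  /* build-time config */

/* Private definition: in a real SDK this would live in the implementation. */
struct hal_dev {
    const hal_uart_api_t *api;    /* which backend */
    const hal_uart_cfg_t *cfg;    /* constant data, fixed at build time */
    void                 *data;   /* mutable per-instance state */
};

/* Public entry point: one indirect call, no knowledge of the backend. */
int hal_uart_write(const hal_dev_t *dev, const uint8_t *buf, size_t len)
{
    return dev->api->write(dev, buf, len);
}

/* Host mock backend: counts bytes instead of touching registers. */
static int mock_write(const hal_dev_t *dev, const uint8_t *buf, size_t len)
{
    (void)buf;
    *(size_t *)dev->data += len;
    return (int)len;
}

static const hal_uart_api_t mock_api  = { mock_write };
static const hal_uart_cfg_t uart0_cfg = { 115200u };
size_t uart0_bytes_written;

/* Instance wired together at build time, as a generated table would be. */
const struct hal_dev uart0_dev = { &mock_api, &uart0_cfg, &uart0_bytes_written };

void hal_combined_example(void)
{
    const uint8_t msg[2] = { 'h', 'i' };
    size_t before = uart0_bytes_written;

    assert(hal_uart_write(&uart0_dev, msg, sizeof(msg)) == 2);
    assert(uart0_bytes_written == before + 2u);
}
```

Swapping mock_api for a silicon backend changes one initializer; the configuration stays in read-only data, and the application never sees inside struct hal_dev.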

The pure pattern choices in the previous sections are useful for understanding what each pattern contributes. The actual design decision in a production HAL is rarely “which one of the three” — it is “in what proportion, and at what layer.”


The cross-cutting principles

Regardless of the architectural pattern chosen, a small set of API design principles has to govern HAL design uniformly. These are not pattern-specific; they are the difference between a HAL that scales and one that becomes a maintenance liability.

Use typed error codes, never raw integers with undocumented semantics. Every HAL function returns a value of a named enum or typedef, and every possible return code is documented in the function header. A return 0; from a HAL function should be the result of a defined success path, not an accident: in C, reaching the end of a non-void function without a return statement and then using the result is undefined behavior, not an implicit zero.
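A minimal sketch of what a typed status and a documented return contract might look like. The enum members, HAL_UART_MAX_BAUD, and hal_uart_set_baud are illustrative assumptions, not taken from a specific SDK.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HAL_UART_MAX_BAUD 3000000u   /* assumed device limit */

typedef enum {
    HAL_OK = 0,
    HAL_ERR_PARAM,
    HAL_ERR_BUSY,
    HAL_ERR_TIMEOUT,
    HAL_ERR_NOT_SUPPORTED
} hal_status_t;

/* Human-readable name for logs; unknown values map to a fixed string. */
const char *hal_status_str(hal_status_t s)
{
    switch (s) {
    case HAL_OK:                return "HAL_OK";
    case HAL_ERR_PARAM:         return "HAL_ERR_PARAM";
    case HAL_ERR_BUSY:          return "HAL_ERR_BUSY";
    case HAL_ERR_TIMEOUT:       return "HAL_ERR_TIMEOUT";
    case HAL_ERR_NOT_SUPPORTED: return "HAL_ERR_NOT_SUPPORTED";
    }
    return "HAL_ERR_UNKNOWN";
}

/**
 * Validate a requested UART baud rate.
 * @retval HAL_OK                 the rate is accepted
 * @retval HAL_ERR_PARAM          baud is zero
 * @retval HAL_ERR_NOT_SUPPORTED  baud exceeds what this device can generate
 */
hal_status_t hal_uart_set_baud(uint32_t baud)
{
    if (baud == 0u) {
        return HAL_ERR_PARAM;
    }
    if (baud > HAL_UART_MAX_BAUD) {
        return HAL_ERR_NOT_SUPPORTED;
    }
    return HAL_OK;
}

void hal_status_example(void)
{
    hal_status_t st = hal_uart_set_baud(0u);

    assert(st == HAL_ERR_PARAM);
    assert(strcmp(hal_status_str(st), "HAL_ERR_PARAM") == 0);
}
```

The header comment lists every code the function can return, so a caller can write an exhaustive switch and a reviewer can check the implementation against the contract.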

Separate lifecycle stages. A peripheral init function configures the driver struct. A separate MSP (MCU support package) or board init function configures clocks and pins. A separate enable function starts the peripheral. Conflating these stages forces callers into a specific initialisation order and makes the driver harder to use from secondary boot stages or from low-power resume paths.
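The staged lifecycle can be sketched with one flag per stage. This is an illustration under assumptions: the function names, the hal_uart_suspend helper, and the visible struct fields are hypothetical, and the struct is left non-opaque only so the example can inspect it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { HAL_OK = 0, HAL_ERR_STATE } hal_status_t;

typedef struct {
    uint32_t baud_rate;
    bool     configured;    /* set by hal_uart_init */
    bool     board_ready;   /* set by hal_uart_board_init */
    bool     enabled;       /* set by hal_uart_enable */
} hal_uart_t;

/* Stage 1: driver state only; touches no clocks or pins. */
hal_status_t hal_uart_init(hal_uart_t *h, uint32_t baud)
{
    h->baud_rate  = baud;
    h->configured = true;
    return HAL_OK;
}

/* Stage 2: clocks and pins (a real port would program them here). */
hal_status_t hal_uart_board_init(hal_uart_t *h)
{
    h->board_ready = true;
    return HAL_OK;
}

/* Stage 3: start the peripheral; needs both earlier stages, in any order. */
hal_status_t hal_uart_enable(hal_uart_t *h)
{
    if (!h->configured || !h->board_ready) {
        return HAL_ERR_STATE;
    }
    h->enabled = true;
    return HAL_OK;
}

/* Low-power entry: clocks are gated, driver configuration is kept. */
void hal_uart_suspend(hal_uart_t *h)
{
    h->enabled     = false;
    h->board_ready = false;
}

void hal_uart_lifecycle_example(void)
{
    hal_uart_t u = { 0u, false, false, false };

    assert(hal_uart_enable(&u) == HAL_ERR_STATE);   /* nothing set up yet */
    (void)hal_uart_board_init(&u);                  /* these two stages can */
    (void)hal_uart_init(&u, 115200u);               /* run in either order  */
    assert(hal_uart_enable(&u) == HAL_OK);

    hal_uart_suspend(&u);
    (void)hal_uart_board_init(&u);                  /* resume: no re-init */
    assert(hal_uart_enable(&u) == HAL_OK);
    assert(u.baud_rate == 115200u);
}
```

The resume path is where the separation pays off: clocks and pins are restored and the peripheral re-enabled without repeating the driver configuration.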

Document ISR safety explicitly. Any callback invoked from a DMA or interrupt context must be documented as such, with a clear list of which RTOS primitives are ISR-safe to call from within it. The biggest source of subtle bugs in embedded systems is callbacks that look harmless but execute in interrupt context, and the HAL is where that documentation has to live.
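A sketch of a callback whose execution context is part of its documentation, assuming a single interrupt-context producer and a single task-context consumer sharing a ring buffer. The names and the ring size are illustrative, and C11 atomics stand in for whatever ordering primitives the target toolchain provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RX_RING_SIZE 8u   /* power of two, so index wrap is a mask */

typedef struct {
    uint8_t     buf[RX_RING_SIZE];
    atomic_uint head;   /* written only by the interrupt-context producer */
    atomic_uint tail;   /* written only by the task-context consumer */
} rx_ring_t;

void rx_ring_init(rx_ring_t *r)
{
    atomic_init(&r->head, 0u);
    atomic_init(&r->tail, 0u);
}

/**
 * RX byte callback. CONTEXT: interrupt.
 * Must not block, allocate, or take a mutex; it copies one byte into the
 * ring and returns. Returns false if the ring is full and the byte is dropped.
 */
bool uart_rx_byte_cb(rx_ring_t *r, uint8_t byte)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RX_RING_SIZE) {
        return false;
    }
    r->buf[head & (RX_RING_SIZE - 1u)] = byte;
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}

/** Pop one byte. CONTEXT: task only. Returns false if the ring is empty. */
bool uart_rx_pop(rx_ring_t *r, uint8_t *out)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail) {
        return false;
    }
    *out = r->buf[tail & (RX_RING_SIZE - 1u)];
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return true;
}

void uart_rx_ring_example(void)
{
    rx_ring_t ring;
    uint8_t b = 0u;

    rx_ring_init(&ring);
    assert(uart_rx_byte_cb(&ring, 0x41u));   /* as if called from the ISR */
    assert(uart_rx_pop(&ring, &b) && b == 0x41u);
    assert(!uart_rx_pop(&ring, &b));         /* ring is empty again */
}
```

The callback does the minimum that is safe in interrupt context and defers everything else to the task that drains the ring, which is the behaviour the header comment promises.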

Expose both blocking and async APIs. Blocking APIs are correct for startup code and non-latency-sensitive paths. Async or DMA-based APIs are mandatory for high-throughput production code. Both must operate on the same handles and the same internal state machines, so that switching between them is a configuration choice rather than a different driver.
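One way to keep both API styles on a single state machine is to build the blocking call out of the asynchronous one. The sketch below is a host simulation under assumptions: all names are hypothetical, the struct is visible only for the example, and the wait loop calls the interrupt handler directly, where a target build would instead wait on a semaphore that the real interrupt releases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { HAL_OK = 0, HAL_ERR_PARAM, HAL_ERR_BUSY, HAL_ERR_TIMEOUT } hal_status_t;
typedef void (*hal_uart_done_cb_t)(void *user);

typedef struct {
    const uint8_t     *tx_buf;
    size_t             tx_len;
    size_t             tx_pos;
    bool               busy;
    hal_uart_done_cb_t on_done;
    void              *user;
} hal_uart_t;

/* Async start: records the transfer and returns immediately. */
hal_status_t hal_uart_write_async(hal_uart_t *h, const uint8_t *buf, size_t len,
                                  hal_uart_done_cb_t on_done, void *user)
{
    if (buf == NULL || len == 0u) {
        return HAL_ERR_PARAM;
    }
    if (h->busy) {
        return HAL_ERR_BUSY;
    }
    h->tx_buf = buf;  h->tx_len = len;  h->tx_pos = 0u;
    h->on_done = on_done;  h->user = user;
    h->busy = true;
    return HAL_OK;
}

/* Stand-in for the TX-empty interrupt: one byte leaves per call. */
void hal_uart_tx_irq(hal_uart_t *h)
{
    if (!h->busy) {
        return;
    }
    h->tx_pos++;   /* on hardware: write tx_buf[tx_pos] to the data register */
    if (h->tx_pos == h->tx_len) {
        h->busy = false;
        if (h->on_done != NULL) {
            h->on_done(h->user);
        }
    }
}

/* Blocking write: the same state machine plus a bounded wait. */
hal_status_t hal_uart_write(hal_uart_t *h, const uint8_t *buf, size_t len,
                            uint32_t timeout_ticks)
{
    hal_status_t st = hal_uart_write_async(h, buf, len, NULL, NULL);
    if (st != HAL_OK) {
        return st;
    }
    while (h->busy) {
        if (timeout_ticks == 0u) {
            h->busy = false;          /* abort the transfer */
            return HAL_ERR_TIMEOUT;
        }
        timeout_ticks--;
        hal_uart_tx_irq(h);           /* host simulation of the interrupt */
    }
    return HAL_OK;
}

static void count_done(void *user) { (*(int *)user)++; }

void hal_uart_async_example(void)
{
    static const uint8_t msg[2] = { 'o', 'k' };
    hal_uart_t u = { 0 };
    int done = 0;

    assert(hal_uart_write_async(&u, msg, sizeof(msg), count_done, &done) == HAL_OK);
    hal_uart_tx_irq(&u);
    hal_uart_tx_irq(&u);
    assert(done == 1 && !u.busy);
    assert(hal_uart_write(&u, msg, sizeof(msg), 10u) == HAL_OK);
}
```

Because the blocking path is only a wrapper, a fix to the transfer state machine lands in both APIs at once, and moving a call site from blocking to asynchronous does not change which driver it talks to.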

Version the API with a published changelog. Breaking changes to the HAL API require a major version increment and a published migration guide. The HAL is part of the SDK’s public surface, and its changelog is part of the contract with customers.
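Versioning is easier to enforce when the header carries it in a form the compiler can check. A minimal sketch, assuming a major.minor.patch scheme; the macro names, the encoding, and the version numbers themselves are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

/* Would live in the HAL's public version header. */
#define HAL_VERSION_MAJOR 2u
#define HAL_VERSION_MINOR 1u
#define HAL_VERSION_PATCH 0u

#define HAL_VERSION_ENCODE(maj, min, pat) (((maj) << 16) | ((min) << 8) | (pat))
#define HAL_VERSION \
    HAL_VERSION_ENCODE(HAL_VERSION_MAJOR, HAL_VERSION_MINOR, HAL_VERSION_PATCH)

/* Application side: fail the build, not the device, on a mismatch. */
#if HAL_VERSION_MAJOR != 2u
#error "This application was written against HAL major version 2"
#endif
static_assert(HAL_VERSION >= HAL_VERSION_ENCODE(2u, 1u, 0u),
              "needs HAL 2.1.0 or newer");

/* Runtime form of the same rule, e.g. for a separately built plug-in:
 * same major version, and a minor version no newer than the one provided. */
bool hal_version_satisfies(unsigned req_major, unsigned req_minor)
{
    return req_major == HAL_VERSION_MAJOR && req_minor <= HAL_VERSION_MINOR;
}
```

A breaking change bumps HAL_VERSION_MAJOR, so code written against the old contract stops compiling at the guard and the developer is sent to the migration guide instead of discovering the change on hardware.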


What this means for the team writing the HAL

Most of the failure modes that derail HAL projects are not the result of choosing the wrong pattern. They are the result of inconsistency: applying opaque handles to four peripheral types and leaving the fifth with public struct fields; introducing a vtable for the SPI driver because a contributor needed it for a test, and not applying the same pattern to UART or I2C; documenting ISR safety on the new peripherals and leaving the old ones to convention.

The patterns work because they are applied uniformly across the HAL and across the family of devices the HAL supports. A HAL that mixes opaque handles for some peripherals and direct struct access for others is harder to use than a HAL that picks one model and applies it everywhere, even if the chosen model is the less sophisticated of the two. Consistency is the property that lets driver developers extend the HAL to a new peripheral without having to re-derive the local conventions, and it is the property that lets application developers move between peripheral types without context-switching their mental model.

Choosing the pattern is the visible decision. Applying it consistently, across every peripheral and every supported device variant, is the one that determines whether the HAL ages well.


