Architecture

The v2 import tree at a glance

Import tree

hwpapi
├── App                  # core.app.App - slim facade (≤15 public members)
├── Document             # document.Document - per-document surface (app.doc)
├── collections/         # dict-like aggregations under app.doc.*
│   ├── Collection       # Protocol: __getitem__, __iter__, __len__,
│   │                    #            __contains__, names, filter
│   ├── fields           # FieldCollection + Field
│   ├── bookmarks        # BookmarkCollection + Bookmark
│   ├── hyperlinks
│   ├── images
│   ├── paragraphs       # ParagraphCollection + Paragraph + Run
│   ├── styles
│   └── tables           # TableCollection + Table + Cell
├── context/
│   └── scopes           # charshape_scope, parashape_scope, styled_text
├── io/
│   ├── open             # open_file, new_document
│   └── export           # export_pdf, export_image, export_text
├── errors               # HwpApiError hierarchy + wrap_com_error
├── units                # mm, cm, inch, pt, parse, to_mm, ...
├── logging              # get_logger, HWPAPI_LOG_LEVEL
└── low/                 # escape hatch
    ├── actions          # _Actions, _Action (900+ action wrappers)
    ├── engine           # Engine, Engines, Apps
    └── parametersets    # ParameterSet classes (CharShape, ParaShape, ...)

The two-layer model

hwpapi v2 is a layered API, not a levelled one. Both layers are first-class:

  • High layer (hwpapi.App → app.doc.*): the default surface. Opinionated, collection-oriented, one way to do each thing.
  • Low layer (hwpapi.low.*): the raw action + ParameterSet + engine primitives. Officially supported. The high layer is implemented on top of the low layer, not alongside it.

See ADR-001 for the full rationale.

The Collection protocol

Every dict-like aggregation under app.doc.* implements hwpapi.collections.Collection:

from typing import Callable, Iterator, Protocol, runtime_checkable

@runtime_checkable
class Collection(Protocol):
    def __getitem__(self, key): ...
    def __iter__(self) -> Iterator: ...
    def __len__(self) -> int: ...
    def __contains__(self, key) -> bool: ...
    def names(self) -> list[str]: ...
    def filter(self, predicate: Callable[..., bool]) -> list: ...

This is a structural contract, not a nominal one: the collection classes do not inherit from Collection. Runtime checks use isinstance(coll, Collection) thanks to @runtime_checkable.
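To make the structural contract concrete, here is a minimal sketch. The Protocol is copied from the definition above; ToyFields is a hypothetical stand-in class invented for illustration and is not part of hwpapi. It never inherits from Collection, yet it passes the isinstance check because it provides every required method.

```python
from typing import Callable, Iterator, Protocol, runtime_checkable


@runtime_checkable
class Collection(Protocol):
    def __getitem__(self, key): ...
    def __iter__(self) -> Iterator: ...
    def __len__(self) -> int: ...
    def __contains__(self, key) -> bool: ...
    def names(self) -> list[str]: ...
    def filter(self, predicate: Callable[..., bool]) -> list: ...


class ToyFields:
    """Hypothetical stand-in collection; does NOT inherit from Collection."""

    def __init__(self, data: dict[str, str]) -> None:
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data.values())

    def __len__(self):
        return len(self._data)

    def __contains__(self, key):
        return key in self._data

    def names(self):
        return list(self._data)

    def filter(self, predicate):
        return [value for value in self._data.values() if predicate(value)]


fields = ToyFields({"title": "Report", "author": "Kim"})

# Passes: every Protocol member is present on ToyFields.
is_collection = isinstance(fields, Collection)

# Fails: a plain dict has no names() or filter() method.
dict_is_collection = isinstance({}, Collection)
```

Note that a runtime Protocol check only verifies that the members exist; it does not validate their signatures or return types, so static type checking remains the stronger guarantee.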

Collections that support mutation add __setitem__, __delitem__, and clear(). Those methods are not part of the Protocol, because some collections are read-only (e.g. paragraphs: you can't delete a paragraph by index without affecting the surrounding ones).
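Because mutation sits outside the Protocol, calling code may want to test for it before writing. One possible approach, sketched below, is a second runtime-checkable Protocol. The name MutableCollection, the ReadOnlyToy class, and the clear_if_supported helper are all hypothetical and are not claimed to exist in hwpapi.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class MutableCollection(Protocol):  # hypothetical name, not from hwpapi
    def __setitem__(self, key, value) -> None: ...
    def __delitem__(self, key) -> None: ...
    def clear(self) -> None: ...


class ReadOnlyToy:
    """Toy read-only collection: lookup only, no mutation methods."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


def clear_if_supported(coll) -> bool:
    """Clear the collection when it is mutable; report whether it was cleared."""
    if isinstance(coll, MutableCollection):
        coll.clear()
        return True
    return False


mutable = {"a": 1}                       # a dict has all three mutation methods
read_only = ReadOnlyToy(["first paragraph"])

cleared_mutable = clear_if_supported(mutable)
cleared_read_only = clear_if_supported(read_only)
```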

The element value-object pattern

Subscripting a collection returns a lightweight element: a reference object that carries (app, identifier) and looks up the underlying state lazily. Elements:

  • are cheap to construct (no COM calls)
  • hit COM at property-read time
  • never cache (reading twice hits COM twice)

Why not dataclasses? Because HWP state can change under you (user undo, concurrent script) and caching a snapshot would lie. When you want a snapshot, take one explicitly (e.g. pset.clone() on a ParameterSet, or a plain dict for element state).
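The pattern can be illustrated with a toy backend that counts reads. This is a sketch under stated assumptions: FakeBackend stands in for the COM layer, and the Field class, its text property, and its snapshot() method are hypothetical illustrations rather than hwpapi's actual implementation.

```python
class FakeBackend:
    """Hypothetical stand-in for the COM layer; counts every read."""

    def __init__(self):
        self.values = {"title": "Draft"}
        self.reads = 0

    def get_field_text(self, name):
        self.reads += 1
        return self.values[name]


class Field:
    """Reference object: carries (backend, name) and stores no state itself."""

    __slots__ = ("_backend", "_name")

    def __init__(self, backend, name):
        # Construction only stores references; the backend is not touched.
        self._backend = backend
        self._name = name

    @property
    def text(self):
        # Every read goes to the backend; nothing is cached.
        return self._backend.get_field_text(self._name)

    def snapshot(self):
        # Explicit, opt-in snapshot as a plain dict (costs one backend read).
        return {"name": self._name, "text": self.text}


backend = FakeBackend()
field = Field(backend, "title")
reads_after_construction = backend.reads   # still 0: no read yet

first = field.text                         # first backend read
backend.values["title"] = "Final"          # state changes underneath the element
second = field.text                        # second read sees the new value

snap = field.snapshot()                    # third read, frozen into a dict
backend.values["title"] = "Changed again"  # the snapshot does not follow this
```

The element always reflects live state, while the dict returned by snapshot() stays fixed at the moment it was taken.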

Ownership and lifecycle

App
 ├── Engine              (via app.engine)
 │    └── HwpObject      (raw COM; via app.api or app.engine.impl)
 └── Document            (via app.doc; cached_property)
      └── Collections    (per-collection, via cached_property)

  • App owns the Engine.
  • Document does not own a COM handle. It holds a reference to its owning App and fetches app.engine.impl at the point of use.
  • Collections similarly reach into self._app.engine.impl on every call; they are stateless wrappers over HWP's live state.

This means that once app.close() runs, reading from the cached Document or any cached collection will raise: the indirection through app.engine.impl surfaces the invalidation.
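The ownership rules above can be sketched with toy classes. ToyApp, ToyEngine, ToyDocument, and the page_count() method are hypothetical stand-ins, and RuntimeError is an assumed exception type; the source does not say which error the real library raises after close().

```python
from functools import cached_property


class ToyEngine:
    """Owns the (fake) COM handle and invalidates it on close()."""

    def __init__(self):
        self._impl = object()  # stand-in for the raw COM object

    @property
    def impl(self):
        if self._impl is None:
            raise RuntimeError("engine is closed")  # assumed error type
        return self._impl

    def close(self):
        self._impl = None


class ToyDocument:
    """Holds a reference to its App, never a COM handle of its own."""

    def __init__(self, app):
        self._app = app

    def page_count(self):
        self._app.engine.impl  # fetched at the point of use, not stored
        return 1               # placeholder value for the sketch


class ToyApp:
    def __init__(self):
        self.engine = ToyEngine()  # App owns the Engine

    @cached_property
    def doc(self):
        return ToyDocument(self)

    def close(self):
        self.engine.close()


app = ToyApp()
doc = app.doc
same_object = app.doc is doc          # cached_property returns the same instance
pages_before_close = doc.page_count()

app.close()
try:
    doc.page_count()
    raised_after_close = False
except RuntimeError:
    raised_after_close = True         # the cached Document surfaces the invalidation
```

Because the Document resolves the handle on every call instead of storing it, there is no stale handle to guard against separately; the failure appears at the first read after close().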

The facade boundary

The v2 facade (App + Document + collections + context + io + errors + units) is intentionally small. The test of "what belongs on the facade" is:

Does removing this symbol break the one-screen dir(App) guarantee, or does it trade a clear high-level affordance for a low-level call?

If the answer is "break the guarantee" or "trade a clear affordance", it stays. Everything else goes to hwpapi.low or a sibling package.
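A size budget like the one-screen dir(App) guarantee could be enforced mechanically, for example in a unit test. The sketch below assumes the limit of 15 public members from the import tree; the public_members helper and the ToyApp class are hypothetical and stand in for the real App.

```python
MAX_PUBLIC_MEMBERS = 15  # budget taken from the "≤15 public members" note


def public_members(cls) -> list[str]:
    """Return the sorted names in dir(cls) that do not start with an underscore."""
    return sorted(name for name in dir(cls) if not name.startswith("_"))


class ToyApp:
    """Hypothetical facade used only to exercise the check."""

    def open(self): ...

    def close(self): ...

    @property
    def doc(self): ...

    def _internal_helper(self): ...  # private: excluded from the count


members = public_members(ToyApp)
within_budget = len(members) <= MAX_PUBLIC_MEMBERS
```

A check of this kind only covers the size half of the test; whether a symbol is a clear high-level affordance remains a design judgment.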
