Deadlocks and COM

Computer Programs: COM and Deadlocks

Before getting into what a deadlock is, we will go over some prerequisite computing topics.

A program is an executable file stored on disk that has not yet been loaded into memory. A process is a running instance of a program loaded into memory. So notepad.exe and excel.exe are programs, but once they are launched and loaded, each becomes a process.
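The program/process distinction can be seen by starting the same program file twice. The sketch below is a minimal illustration that uses the Python interpreter as the program (an assumption made so the example runs anywhere, not only on Windows); launch_twice is a hypothetical helper name.

```python
import subprocess
import sys


def launch_twice() -> tuple:
    """Start two processes from the same program file and return their PIDs."""
    # sys.executable is the interpreter's program file on disk.
    command = [sys.executable, "-c", "pass"]
    first = subprocess.Popen(command)
    second = subprocess.Popen(command)
    pids = (first.pid, second.pid)
    # Wait for both processes to exit so nothing is left running.
    first.wait()
    second.wait()
    return pids
```

One program file produced two separate processes, each with its own process ID.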

A process has:

-Its own memory space.
-Its own resources.
-Its own security context.
-At least 1 thread.

A thread is a unit of execution inside a process. A process can have 1 or more threads. Threads in the same process share the same memory and the same global variables, but each thread has its own stack and its own instruction pointer.
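As a rough illustration of shared versus private data, the sketch below has several threads update one global counter while each keeps its own local total. The names counter, work, and run are hypothetical, and the lock is there only to keep the shared update safe.

```python
import threading

counter = 0                       # global: one copy, visible to every thread
counter_lock = threading.Lock()   # protects updates to the shared counter


def work(iterations: int, results: list, slot: int) -> None:
    global counter
    local_total = 0               # local variable: private to this thread's call
    for _ in range(iterations):
        local_total += 1
        with counter_lock:
            counter += 1
    results[slot] = local_total


def run(num_threads: int = 4, iterations: int = 1000) -> tuple:
    """Run the workers and return (shared counter, per-thread local totals)."""
    global counter
    counter = 0
    results = [0] * num_threads
    threads = [
        threading.Thread(target=work, args=(iterations, results, i))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter, results
```

With 4 threads and 1000 iterations, the shared counter ends at 4000 while each thread's private total is 1000.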

A stack is a special region of memory that a thread uses to keep track of function calls, local variables, and return addresses. It works in LIFO (last-in, first-out) order: the most recently called function is the first to return.
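The LIFO behavior can be sketched with a plain list standing in for the call stack. This is only a model; the function names main, load_file, and parse_line are made up for the illustration.

```python
def trace_calls() -> list:
    """Simulate the call stack for main -> load_file -> parse_line."""
    stack = []     # the end of the list is the top of the stack
    events = []

    def call(name: str) -> None:
        stack.append(name)              # entering a function pushes a frame
        events.append(f"push {name}")

    def ret() -> None:
        name = stack.pop()              # returning pops the most recent frame
        events.append(f"pop {name}")

    call("main")
    call("load_file")
    call("parse_line")
    ret()   # parse_line was pushed last, so it returns first
    ret()
    ret()
    return events
```

The returned log shows the pops in the reverse order of the pushes: parse_line, then load_file, then main.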

How Windows opens a program.

When you double-click an app:

1. Explorer.exe calls the Windows API function: CreateProcess().
2. Windows creates a new process for the .exe.
3. The OS loader maps the executable into memory, loads required DLLs, and initializes runtime libraries.
4. The program’s WinMain() function runs.
5. The app creates its 1st window.
6. The message loop starts.
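The message loop in the last step can be sketched as a queue that is drained until a quit message arrives. This is a toy model, not the Win32 API: a real Windows program uses GetMessage, TranslateMessage, and DispatchMessage, and the handler table here is an assumption made for the example.

```python
import queue


def message_loop(messages: queue.Queue, handlers: dict) -> list:
    """Pull messages until WM_QUIT arrives; return a log of what was handled."""
    handled = []
    while True:
        message = messages.get()        # blocks until a message is available
        if message == "WM_QUIT":        # the quit message ends the loop
            break
        handler = handlers.get(message)
        if handler is not None:
            handled.append(handler())
    return handled


def demo() -> list:
    pending = queue.Queue()
    # The last WM_PAINT is never handled because WM_QUIT comes first.
    for message in ("WM_PAINT", "WM_KEYDOWN", "WM_QUIT", "WM_PAINT"):
        pending.put(message)
    handlers = {
        "WM_PAINT": lambda: "painted window",
        "WM_KEYDOWN": lambda: "handled key press",
    }
    return message_loop(pending, handlers)
```

The important property for the deadlock examples later on is that a thread stuck inside one handler cannot come back to the queue to process the next message.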

Examples: Notepad, WordPad, and Microsoft Word.

Notepad has almost no COM dependency. It has no licensing layer, no service dependency, and no Click-to-Run virtualization.

Launch flow: Explorer -> CreateProcess -> notepad.exe -> WinMain -> CreateWindow -> Done.

WordPad.

WordPad is more complex. It uses RichEdit controls (riched20.dll), uses some COM components internally, and loads shared libraries for text formatting.

Flow: Explorer -> CreateProcess -> wordpad.exe.

-> loads riched20.dll
-> initializes COM (CoInitialize)
-> creates document object
-> builds UI

If COM is damaged, WordPad can sometimes fail, but it doesn’t rely on heavy licensing services the way Microsoft Office does.

Microsoft Word.

Word is not just a program; it is also a COM server. When you open Word:

1: Shell Launch

Explorer calls CreateProcess on: WINWORD.EXE

2: Word Initializes COM

Word calls: CoInitializeEx(), then registers itself as a COM server.

3: OLE / Automation Layer

Registers automation objects.
Registers OLE document handlers.
Connects to licensing service.
Loads add-ins.
Checks registry COM class IDs.

4: Licensing Layer

In Office 2010:

Word connects to Office Software Protection Platform.
It verifies activation.
It may talk to a service (osppsvc).

If that fails -> licensing error -> Word closes.

COM (Component Object Model).

Think of it as a standardized way for programs to expose functionality to other programs. Example:

Excel exposes Excel.Application.
Word exposes Word.Application.

Other apps can create them like:

CoCreateInstance(CLSID_WordApplication)

That launches Word invisibly if needed.

Why “Waiting for OLE Action” happens.

When Word embeds an Excel spreadsheet, chart, PDF, or an ActiveX control, it waits for the COM object to respond. If that object is frozen, has permission issues, or has registry corruption, you get “Word is waiting for another application to complete an OLE action.” That’s COM inter-process communication hanging.

COM is a system created by Microsoft in the 1990s that lets separate software components communicate with each other, even if they are written in different programming languages, run in different processes, or run on different machines (via extensions such as Distributed COM, or DCOM).

Without a standard like this, 2 programs that want to interact normally have to be written in the same language, compiled together, and share the same runtime. With COM, a C++ program can call a component written in Visual Basic, which might internally call code written in Delphi. As long as they all follow COM rules, they can communicate.

An interface is a table of function pointers. A COM object exposes 1 or more interfaces. Every COM object must support the base interface called IUnknown. Its 3 methods (QueryInterface, AddRef, and Release) let programs discover which interfaces are available, manage the object's lifetime through reference counting, and safely interact with the object.
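The idea of an object that hands out interface tables and counts its references can be modeled in a few lines. This is a toy model, not real COM: the class ToyComObject and the interface names IPrintable and ISpellCheck are hypothetical, and a dictionary of callables stands in for the table of function pointers.

```python
class ToyComObject:
    """Toy model of a COM object; it only mimics the IUnknown idea."""

    def __init__(self, interfaces: dict):
        # interface name -> table of callables (the "function pointer" table)
        self._interfaces = dict(interfaces)
        self._ref_count = 1           # the creator holds the first reference
        self.destroyed = False

    def query_interface(self, name: str):
        """Return the interface table, or None if it is not supported."""
        table = self._interfaces.get(name)
        if table is not None:
            self.add_ref()            # a successful query hands out a reference
        return table

    def add_ref(self) -> int:
        self._ref_count += 1
        return self._ref_count

    def release(self) -> int:
        self._ref_count -= 1
        if self._ref_count == 0:
            self.destroyed = True     # a real object would free itself here
        return self._ref_count


def demo() -> tuple:
    document = ToyComObject({"IPrintable": {"print": lambda: "printing"}})
    printable = document.query_interface("IPrintable")   # count is now 2
    missing = document.query_interface("ISpellCheck")    # unsupported: None
    output = printable["print"]()
    document.release()                # drop the reference from the query
    document.release()                # drop the creator's reference
    return output, missing, document.destroyed
```

The caller never needs to know how the object is implemented; it only asks for an interface by name and releases it when done.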

How are COM objects located? Programs don't load COM objects directly; instead, Windows uses a registry lookup. Each COM class has a CLSID (class ID), for example {000209FF-0000-0000-C000-000000000046}. Windows looks up the CLSID in the registry and loads the DLL or EXE registered for it.

In Java, objects are created by MyObject obj = new MyObject();
With COM, they are created by CoCreateInstance(CLSID_Something)
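The lookup step behind that call can be sketched with a dictionary standing in for the registry. This is a toy model under stated assumptions: TOY_REGISTRY, ToyApplication, and create_instance are hypothetical names, and the real CoCreateInstance takes more arguments and returns an interface pointer rather than a Python object.

```python
# Example CLSID taken from the text above.
EXAMPLE_CLSID = "{000209FF-0000-0000-C000-000000000046}"


class ToyApplication:
    """Stand-in for whatever server the CLSID is registered to."""
    prog_id = "Toy.Application"


# Toy stand-in for the registry: CLSID -> factory that builds the object.
TOY_REGISTRY = {EXAMPLE_CLSID: ToyApplication}


def create_instance(clsid: str):
    """Look up the CLSID and build the registered class."""
    factory = TOY_REGISTRY.get(clsid.upper())
    if factory is None:
        # Real COM reports an unregistered class with an error code instead.
        raise LookupError(f"class not registered: {clsid}")
    return factory()
```

The caller names only the CLSID; which file actually implements the class is decided by the lookup, which is why a damaged registry entry can break an otherwise intact program.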

Deadlock.

The formal terminology for deadlock cycles comes from graph theory. The cycle itself is 1 of the 4 Coffman conditions (1971) for deadlock, all of which must hold at the same time:

1. Mutual exclusion: a resource cannot be shared (only 1 thread can use it at a time).
2. Hold and wait: a thread holds one resource while waiting for another.
3. No preemption: resources cannot be forcibly taken away.
4. Circular wait: a closed chain of threads exists in which each thread waits for a resource held by the next.

2-process (or 2-thread) circular deadlock:

ThreadA is waiting on ThreadB while ThreadB is waiting on ThreadA.

3-process (or 3-thread) circular deadlock:

ThreadA is waiting on ThreadB, which is waiting on ThreadC, which is waiting on ThreadA.

However, ThreadA is not waiting for ThreadB directly. More specifically:

ThreadA holds Resource1.
ThreadB holds Resource2.

Each waits for the other's resource, so the circular wait between the threads runs through the resources they hold. Threads compete for (and acquire) resources.

ThreadA waits for Resource2, held by ThreadB.
ThreadB waits for Resource1, held by ThreadA.
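The link to graph theory can be made concrete: from "who holds what" and "who wants what" one can build a wait-for graph and search it for a cycle. The sketch below is one simple way to do that with a depth-first search; build_wait_for and find_cycle are hypothetical helper names, and each thread is assumed to want at most one resource.

```python
def build_wait_for(holds: dict, wants: dict) -> dict:
    """Turn holdings and requests into thread -> [threads it waits for]."""
    owner = {res: thread for thread, held in holds.items() for res in held}
    graph = {thread: [] for thread in holds}
    for thread, resource in wants.items():
        holder = owner.get(resource)
        if holder is not None and holder != thread:
            graph.setdefault(thread, []).append(holder)
    return graph


def find_cycle(waits_for: dict):
    """Return one cycle as a list of threads, or None if there is none."""
    visiting, done, path = set(), set(), []

    def visit(node):
        visiting.add(node)
        path.append(node)
        for nxt in waits_for.get(node, ()):
            if nxt in visiting:                     # back edge: cycle found
                return path[path.index(nxt):] + [nxt]
            if nxt not in done:
                found = visit(nxt)
                if found:
                    return found
        visiting.discard(node)
        done.add(node)
        path.pop()
        return None

    for start in waits_for:
        if start not in done:
            found = visit(start)
            if found:
                return found
    return None


holds = {"ThreadA": ["Resource1"], "ThreadB": ["Resource2"]}
wants = {"ThreadA": "Resource2", "ThreadB": "Resource1"}
cycle = find_cycle(build_wait_for(holds, wants))
```

For the 2-thread case above, cycle is ["ThreadA", "ThreadB", "ThreadA"]; if ThreadB wanted nothing, find_cycle would return None.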

So does ThreadA, which holds ResourceA, release ResourceA 1st and then acquire ResourceB, or does it acquire ResourceB 1st and then release ResourceA?

A well-designed thread releases ResourceA first, then acquires ResourceB.
A deadlocking thread tries to acquire ResourceB while still holding ResourceA.

That difference is everything.
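The two behaviors can be compared in a small experiment. The sketch below is a minimal demonstration, assuming two locks stand in for ResourceA and ResourceB; the timeout is there only so the demonstration ends instead of hanging forever, and the barriers force both threads to hold their first resource before either asks for the second. The name attempt is hypothetical.

```python
import threading


def attempt(hold_while_waiting: bool) -> list:
    """Return, for each of 2 threads, whether it obtained its second resource."""
    resource_a, resource_b = threading.Lock(), threading.Lock()
    both_hold_first = threading.Barrier(2)
    both_tried = threading.Barrier(2)
    got_second = [False, False]

    def worker(slot, first, second):
        first.acquire()
        both_hold_first.wait()          # both threads now hold their 1st resource
        if not hold_while_waiting:
            first.release()             # release before asking for the next one
        got_second[slot] = second.acquire(timeout=1.0)
        both_tried.wait()               # keep holdings until both attempts end
        if got_second[slot]:
            second.release()
        if hold_while_waiting:
            first.release()

    threads = [
        threading.Thread(target=worker, args=(0, resource_a, resource_b)),
        threading.Thread(target=worker, args=(1, resource_b, resource_a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second
```

Calling attempt(True) returns [False, False] because each thread waits on a lock the other never gives up, while attempt(False) returns [True, True].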

And to clarify, resources are not actually owned by the threads; they are owned by the process. Threads just use and temporarily control shared resources that the process owns. Resources are also not the same thing as stacks.

Threads also do not have their own independent memory space (like processes do). Processes have memory spaces, and threads share that space, except for their stacks, which are private slices within that space.

You can think of a process as a house and its threads as roommates: they share the same memory space, heap, and global variables, but each roommate has a private desk (the stack).

Excel deadlock example: 2 VBA event handlers lock each other.

Excel is heavily multi-threaded (calculation engine, UI thread, background refresh, COM events, etc.).

Imagine:

You have Worksheet_Change() event code.
You also have Worksheet_Calculate() event code.
Both try to modify cells.
Both temporarily disable/enable events.
Both depend on a calculation finishing.

What can happen:

1. Worksheet_Change() fires.
2. It edits another cell.
3. That triggers recalculation.
4. Worksheet_Calculate() fires.
5. It tries to modify a cell that the first handler still has locked.
6. 1st handler is waiting for calculation to finish.
7. 2nd handler is waiting for 1st handler to release.

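Then the wait becomes circular:

The 1st handler waits for the 2nd.
The 2nd handler waits for the 1st.

Note that VBA event handlers run on Excel's main thread, so this particular hang comes from nested (re-entrant) event code on one thread rather than from 2 separate threads, but the circular wait has the same shape.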

Excel appears frozen.

Another Excel example: Add-In Deadlock.

Excel loads:

-A COM Add-In.
-A Real-Time Data (RTD) server.
-A background Power Query refresh.

Scenario:

1. Excel UI thread calls into Add-In.
2. Add-In waits for a calculation result.
3. Calculation thread is blocked waiting for UI thread message pump.
4. UI thread is blocked waiting for Add-In.

Circular wait: Excel hangs.
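The core of this scenario, a UI thread that blocks while another thread needs the UI message pump, can be reproduced in miniature. The sketch below is a simplified model, not Excel's actual architecture: the add-in call is folded into the UI thread's wait, a queue stands in for the message queue, and the timeouts exist only so the demonstration finishes. The name run_scenario is hypothetical.

```python
import queue
import threading
import time


def run_scenario(ui_pumps_while_waiting: bool) -> bool:
    """Return True if the calculation finished, False if both sides gave up."""
    ui_messages = queue.Queue()          # stands in for the UI message queue
    calculation_done = threading.Event()

    def calculation_thread():
        # The calculation needs the UI thread to process one message first.
        processed = threading.Event()
        ui_messages.put(processed)
        if processed.wait(timeout=1.0):  # timeout only so the demo can end
            calculation_done.set()

    worker = threading.Thread(target=calculation_thread)
    worker.start()

    if ui_pumps_while_waiting:
        # Keep processing messages while waiting for the result.
        deadline = time.monotonic() + 2.0
        while not calculation_done.is_set() and time.monotonic() < deadline:
            try:
                message = ui_messages.get(timeout=0.05)
            except queue.Empty:
                continue
            message.set()                # "dispatch" the message
    else:
        # Blocking wait: the UI thread never looks at its message queue.
        calculation_done.wait(timeout=1.0)

    worker.join()
    return calculation_done.is_set()
```

run_scenario(False) returns False because neither side can make progress, while run_scenario(True) returns True because the UI thread keeps servicing its queue while it waits.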

Word deadlock example:

Word relies on components such as a UI thread, a background grammar checker, an AutoSave thread, a printer driver thread, and add-ins.

Example: AutoSave + Add-In

1. AutoSave thread locks document structure.
2. Add-In tries to read document content.
3. Add-In calls back to UI thread.
4. UI thread tries to access locked document.

Everyone waits. Word hangs.

Other deadlock examples can happen between Word and Excel.

Deadlocks can happen between threads, between processes, or across both, but they most commonly happen between threads inside a single process.