PyObjects

1. What exactly is a PyObject?

A PyObject is the base structure that every object in CPython begins with. You can think of it as the root class in an object-oriented hierarchy. In C, it is a small struct that holds only what the interpreter needs to manage any object: a reference count and a type pointer.

The Structure (Simplified)

If you peek into the CPython source code (Include/object.h), a PyObject looks like this in simplified form (recent releases wrap these two fields in build-specific details):

typedef struct _object {
    Py_ssize_t ob_refcnt;    // Reference count
    PyTypeObject *ob_type;   // Pointer to the object's type
} PyObject;
  • ob_refcnt (Reference Count): This is how CPython manages memory. Every new reference to the object (a variable, a list slot, an attribute) increments this number. When a reference goes out of scope or is deleted, the number is decremented. When it reaches zero, CPython calls the type's deallocator, which releases the memory.
  • ob_type (Type Pointer): This points to another structure (a PyTypeObject) that defines what this object is (e.g., an int, a list, or a function) and which C functions implement its operations. This is how Python knows that + means "addition" for integers but "concatenation" for strings.
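
To make the role of the type pointer concrete, here is a minimal standalone sketch. It does not use Python.h: ToyObject, ToyType, toy_add, and the two toy types are hypothetical names invented for this illustration, and the real PyTypeObject has many more slots. The point is only that generic code follows the type pointer to find the function that implements +.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct toy_type;

/* Toy stand-in for PyObject: a refcount, a type pointer, then the payload. */
typedef struct toy_object {
    long refcnt;                  /* plays the role of ob_refcnt */
    const struct toy_type *type;  /* plays the role of ob_type */
    long number;                  /* payload used by the toy int type */
    const char *text;             /* payload used by the toy str type */
} ToyObject;

/* Toy stand-in for PyTypeObject: a name plus one behavior slot for "+". */
typedef struct toy_type {
    const char *name;
    int (*add)(const ToyObject *a, const ToyObject *b, char *out, size_t cap);
} ToyType;

/* "+" for the toy int type: numeric addition, written as text into out. */
static int int_add(const ToyObject *a, const ToyObject *b, char *out, size_t cap) {
    int n = snprintf(out, cap, "%ld", a->number + b->number);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* "+" for the toy str type: concatenation. */
static int str_add(const ToyObject *a, const ToyObject *b, char *out, size_t cap) {
    if (strlen(a->text) + strlen(b->text) >= cap) return -1;
    snprintf(out, cap, "%s%s", a->text, b->text);
    return 0;
}

static const ToyType ToyInt_Type = {"int", int_add};
static const ToyType ToyStr_Type = {"str", str_add};

/* Generic "+": follow the left operand's type pointer and call its slot. */
static int toy_add(const ToyObject *a, const ToyObject *b, char *out, size_t cap) {
    assert(a != NULL && b != NULL && out != NULL);
    if (a->type != b->type) return -1;  /* this toy model refuses mixed types */
    return a->type->add(a, b, out, cap);
}
```

Calling toy_add on two toy ints holding 2 and 3 writes "5" into the buffer, while the same call on two toy strs holding "ab" and "cd" writes "abcd". The caller never names the concrete type; the type pointer selects the behavior.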

2. Inheritance in C

Since C doesn't have native inheritance, CPython uses struct embedding (sometimes called member inclusion). More specific types (like PyLongObject for integers) include a PyObject as their first member. In simplified form:

typedef struct {
    PyObject ob_base;  // The "header"
    // ... specialized fields for integers ...
} PyLongObject;

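Because ob_base sits at the very beginning of the memory block, and C guarantees that a pointer to a struct can be converted to a pointer to its first member, a pointer to a PyLongObject is also a valid pointer to a PyObject. This allows the CPython API to pass almost everything around as a PyObject* and cast back to the concrete type when needed.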
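
The cast in both directions can be sketched in plain C without Python.h. ToyHeader, ToyFloat, TOY_FLOAT, and the two helper functions are hypothetical names for this illustration only; the type_tag field is a simplification of the real ob_type pointer.

```c
#include <assert.h>

enum { TOY_FLOAT = 1 };  /* hypothetical type tag standing in for ob_type */

/* Toy stand-in for the PyObject header. */
typedef struct {
    long refcnt;
    int type_tag;
} ToyHeader;

/* A more specific type embeds the header as its FIRST member. */
typedef struct {
    ToyHeader ob_base;
    double value;  /* type-specific payload */
} ToyFloat;

/* Generic code only needs the header, whatever the concrete type is. */
static long toy_refcnt(const ToyHeader *obj) {
    return obj->refcnt;
}

/* Downcast: check the tag first, then view the same memory as a ToyFloat. */
static double toy_float_value(const ToyHeader *obj) {
    assert(obj->type_tag == TOY_FLOAT);
    return ((const ToyFloat *)obj)->value;
}
```

A ToyFloat initialized as {{1, TOY_FLOAT}, 2.5} can be passed to both helpers through a (ToyHeader *) cast of its address. The pointer value does not change, only the type through which the memory is viewed.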


3. Working with PyObject* in your Code

When you write a C extension, you will almost exclusively pass and receive PyObject* pointers.

Type Checking

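Before calling type-specific functions on a PyObject, verify its type with the C API's check macros. These accept subclasses too; the *_CheckExact variants (for example PyList_CheckExact) match only the exact type: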

if (PyList_Check(my_obj)) {
    // Treat as a list
} else if (PyUnicode_Check(my_obj)) {
    // Treat as a string
}

Reference Management

Because the count lives in the PyObject header and C has no automatic cleanup, your code is responsible for keeping ob_refcnt accurate:

  • Py_INCREF(obj): "I'm keeping a copy of this pointer, don't delete it."
  • Py_DECREF(obj): "I'm done with this, feel free to delete it if no one else wants it."
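
The ownership rule behind these two macros can be modeled in a few lines of standalone C. This is a sketch of the idea, not CPython's implementation: RcObject, rc_new, rc_incref, and rc_decref are hypothetical names, and freed_flag exists only so the moment of destruction can be observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted object. */
typedef struct {
    long refcnt;
    int *freed_flag;  /* illustration hook: set to 1 when the object is destroyed */
} RcObject;

/* Creating an object hands the caller the first (owned) reference. */
static RcObject *rc_new(int *freed_flag) {
    RcObject *obj = malloc(sizeof *obj);
    if (obj == NULL) return NULL;
    obj->refcnt = 1;
    obj->freed_flag = freed_flag;
    return obj;
}

/* "I'm keeping a copy of this pointer." */
static void rc_incref(RcObject *obj) {
    assert(obj->refcnt > 0);
    obj->refcnt++;
}

/* "I'm done with this." The last owner to let go triggers destruction. */
static void rc_decref(RcObject *obj) {
    assert(obj->refcnt > 0);
    if (--obj->refcnt == 0) {
        if (obj->freed_flag != NULL) *obj->freed_flag = 1;
        free(obj);
    }
}
```

With one rc_new, one rc_incref, and two rc_decref calls, the object survives the first decref and is destroyed by the second. A missing decref leaks the object; an extra one frees it while another owner still holds the pointer.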

4. Why this Design Matters

This "Universal Pointer" design is what gives Python its dynamic nature.

  1. Polymorphism: A C function can take a PyObject* and not care if it's a list, a dictionary, or an integer until runtime.
  2. Uniform Memory Management: Because the reference count lives in the PyObject header, every object, built-in or custom, is kept alive and released by the same mechanism.
  3. Extensibility: When you create a C extension, you simply wrap your C data inside a PyObject structure, and Python treats it just like a native object.

The PyVarObject (The Sibling)

For objects whose size varies from instance to instance (like tuples, lists, or bytes), Python uses PyVarObject, which is just a PyObject with an extra ob_size field:

typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size; /* Number of items in variable part */
} PyVarObject;
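
As a rough standalone model of how a variable-size object is laid out, the sketch below puts a header with a size field in front of a C flexible array member. ToyBase, ToyVarBase, ToyTuple, toy_tuple_new, and toy_size are hypothetical names; CPython's own allocation goes through the type's tp_basicsize and tp_itemsize rather than a direct calloc.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { long refcnt; } ToyBase;  /* stand-in for PyObject */

typedef struct {
    ToyBase ob_base;
    size_t ob_size;  /* number of items in the variable part */
} ToyVarBase;        /* stand-in for PyVarObject */

typedef struct {
    ToyVarBase ob_base;
    long items[];    /* variable part, sized at allocation time */
} ToyTuple;

/* Allocate the fixed header plus room for n items in one block. */
static ToyTuple *toy_tuple_new(size_t n) {
    ToyTuple *t = calloc(1, sizeof(ToyTuple) + n * sizeof(long));
    if (t == NULL) return NULL;
    t->ob_base.ob_base.refcnt = 1;
    t->ob_base.ob_size = n;
    return t;
}

/* Generic length lookup: works on anything that starts with a ToyVarBase. */
static size_t toy_size(const void *obj) {
    assert(obj != NULL);
    return ((const ToyVarBase *)obj)->ob_size;
}
```

A ToyTuple created with toy_tuple_new(3) reports a size of 3 through toy_size and has three usable slots in items. The generic function reads the length without knowing what kind of items follow the header.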

Creating a custom PyObject (often called a "C extension type") is the most direct way to extend Python with new classes. Instances store their data in a plain C struct, which typically uses less memory and allows faster field access than a standard Python class that keeps its attributes in a __dict__.


1. The Anatomy of a Custom Type

When you define a custom type, you are essentially creating a C struct that Python can treat as an object. This requires three distinct layers:

  1. The Data Structure: The C struct defining the object's fields.
  2. The Behavior: C functions that handle initialization, deallocation, and methods.
  3. The Type Blueprint: The PyTypeObject that tells Python how to use the struct.

2. Defining the Data Structure

Every custom type must start with PyObject_HEAD. This macro expands to a PyObject member named ob_base, which supplies the reference count and the pointer to the type object.

#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <stddef.h>        // offsetof
#include <structmember.h>  // T_INT and friends (3.12+ also offers Py_T_INT via Python.h)

typedef struct {
    PyObject_HEAD
    PyObject *first_name; /* Python Object field */
    PyObject *last_name;  /* Python Object field */
    int age;              /* Standard C field */
} PersonObject;

3. Implementing Core Lifecycle Methods

To make the object "alive," you must define how it is created, initialized, and destroyed.

The Deallocator (tp_dealloc)

This function is called when the reference count hits zero. You must clean up any Python objects held within your struct.

static void Person_dealloc(PersonObject *self) {
    Py_XDECREF(self->first_name);
    Py_XDECREF(self->last_name);
    Py_TYPE(self)->tp_free((PyObject *)self);
}

The Constructor (tp_init)

This is the C equivalent of __init__. It parses arguments passed from Python.

static int Person_init(PersonObject *self, PyObject *args, PyObject *kwds) {
    static char *kwlist[] = {"first", "last", "age", NULL};
    PyObject *first = NULL, *last = NULL, *tmp;

    if (!PyArg_ParseTupleAndKeywords(args, kwds, "OOi", kwlist, 
                                     &first, &last, &self->age)) {
        return -1;
    }

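    // Take the new reference before releasing the old value, so the field
    // never points at an object that may already have been freed.
    if (first) {
        tmp = self->first_name;
        Py_INCREF(first);
        self->first_name = first;
        Py_XDECREF(tmp);
    }
    if (last) {
        tmp = self->last_name;
        Py_INCREF(last);
        self->last_name = last;
        Py_XDECREF(tmp);
    }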
    return 0;
}

4. Defining Methods and Members

You need to tell Python which fields are accessible as attributes and which functions are callable as methods.

Method Table (tp_methods)

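static PyObject* Person_greet(PersonObject *self, PyObject *Py_UNUSED(ignored)) {
    // The fields stay NULL if __init__ never ran (e.g. a subclass skipped it).
    if (self->first_name == NULL || self->last_name == NULL) {
        PyErr_SetString(PyExc_AttributeError, "first_name and last_name must be set");
        return NULL;
    }
    return PyUnicode_FromFormat("Hello, %S %S!", self->first_name, self->last_name);
}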

static PyMethodDef Person_methods[] = {
    {"greet", (PyCFunction)Person_greet, METH_NOARGS, "Return a greeting"},
    {NULL}  /* Sentinel */
};

Member Table (tp_members)

Exposes C fields directly to Python.

static PyMemberDef Person_members[] = {
    {"age", T_INT, offsetof(PersonObject, age), 0, "Person's age"},
    {NULL}  /* Sentinel */
};
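
To show why a table of names and offsets is enough for generic attribute access, here is a small standalone sketch that does not use Python.h. ToyPerson, ToyMember, toy_person_members, and toy_get_int are hypothetical names, and the model handles only int fields, whereas the real PyMemberDef also records a type code and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy instance struct with two int fields after a header-like field. */
typedef struct {
    long refcnt;
    int age;
    int height_cm;
} ToyPerson;

/* One table row: attribute name and where the field lives in the struct. */
typedef struct {
    const char *name;  /* NULL marks the end of the table */
    size_t offset;     /* byte offset of the int field */
} ToyMember;

static const ToyMember toy_person_members[] = {
    {"age", offsetof(ToyPerson, age)},
    {"height_cm", offsetof(ToyPerson, height_cm)},
    {NULL, 0}  /* sentinel */
};

/* Generic getter: knows nothing about ToyPerson except what the table says. */
static int toy_get_int(const void *obj, const ToyMember *table,
                       const char *name, int *out) {
    assert(obj != NULL && table != NULL && name != NULL && out != NULL);
    for (const ToyMember *m = table; m->name != NULL; m++) {
        if (strcmp(m->name, name) == 0) {
            memcpy(out, (const char *)obj + m->offset, sizeof *out);
            return 0;
        }
    }
    return -1;  /* unknown attribute; Python would raise AttributeError here */
}
```

Looking up "age" on a ToyPerson initialized as {1, 42, 180} yields 42, and an unknown name returns -1. The sentinel row plays the same role as the {NULL} entry that ends Person_members.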

5. The Type Definition (PyTypeObject)

This is the master blueprint. In a production-ready extension, you fill out the slots for the behaviors defined above.

static PyTypeObject PersonType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    // Initializes the variable-size object header that every type object starts with.
    // The type pointer is left NULL here; PyType_Ready() fills it in later.

    .tp_name = "my_module.Person",
    // Fully qualified name of the type as seen from Python.
    // Format: "module_name.ClassName".
    // This is what appears in repr() and error messages.

    .tp_doc = "Person objects",
    // Documentation string for the class.
    // Accessible in Python via: help(my_module.Person)

    .tp_basicsize = sizeof(PersonObject),
    // Size of the C structure used to represent the object instance.
    // Python uses this to allocate memory when creating a new object.

    .tp_itemsize = 0,
    // Used for variable-sized objects (like lists or tuples).
    // Set to 0 because PersonObject has a fixed size.

    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
    // Flags controlling behavior of the type.
    // Py_TPFLAGS_DEFAULT: enables standard Python object features.
    // Py_TPFLAGS_BASETYPE: allows this type to be subclassed in Python.

    .tp_new = PyType_GenericNew,
    // Responsible for allocating a new instance of the object.
    // PyType_GenericNew calls the type's tp_alloc slot, which returns zero-filled memory.

    .tp_init = (initproc)Person_init,
    // Initialization function called after memory allocation.
    // Equivalent to the Python __init__ method.

    .tp_dealloc = (destructor)Person_dealloc,
    // Called when the object's reference count reaches zero.
    // Used to release resources and free memory.

    .tp_methods = Person_methods,
    // Table of methods exposed to Python.
    // Each entry becomes an instance method, callable as person.greet().

    .tp_members = Person_members,
    // Table describing attributes stored directly in the C struct.
    // Allows Python code to access struct fields as object attributes.
};

6. Registration and Export

Finally, inside your module's PyInit_<name> function, you must "ready" the type and add it to your module.

static struct PyModuleDef mymodule = {
    PyModuleDef_HEAD_INIT, "my_module", NULL, -1, NULL,
};

PyMODINIT_FUNC PyInit_my_module(void) {
    PyObject *m;
    if (PyType_Ready(&PersonType) < 0) return NULL;

    m = PyModule_Create(&mymodule);
    if (m == NULL) return NULL;

    Py_INCREF(&PersonType);
    if (PyModule_AddObject(m, "Person", (PyObject *)&PersonType) < 0) {
        Py_DECREF(&PersonType);
        Py_DECREF(m);
        return NULL;
    }
    return m;
}

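The same steps apply when a module exports more than one type. The kv_store example below assumes that StoreType, AdminType, and the kvmodule definition are declared earlier in the same file:

PyMODINIT_FUNC PyInit_kv_store(void) {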
    // Module initialization function. Python calls this when `import kv_store` runs.
    // It must return a fully constructed module object or NULL on failure.

    PyObject* m;  // Will hold the newly created module object.

    /* --- 1. Finalize custom Python types --- */
    // PyType_Ready prepares the type objects for use by the interpreter.
    // It validates the structure, resolves inheritance, and fills internal slots.
    // Every custom PyTypeObject must be finalized before it can be exposed to Python.
    if (PyType_Ready(&StoreType) < 0)
        return NULL;

    if (PyType_Ready(&AdminType) < 0)
        return NULL;

    /* --- 2. Create the module object --- */
    // PyModule_Create allocates the module based on the module definition
    // (methods, documentation string, module state, etc.).
    m = PyModule_Create(&kvmodule);
    if (m == NULL)
        return NULL;

    /* --- 3. Expose the Store class to Python --- */
    // PyModule_AddObject steals a reference on success, so take one first.
    // (Static type objects are never freed, but the counts should still balance.)
    Py_INCREF(&StoreType);

    // Register the type in the module namespace so Python code can use:
    //     from kv_store import Store
    if (PyModule_AddObject(m, "Store", (PyObject*)&StoreType) < 0) {
        // If registration fails, undo the reference increment
        // and destroy the partially created module.
        Py_DECREF(&StoreType);
        Py_DECREF(m);
        return NULL;
    }

    /* --- 4. Expose the Admin class to Python --- */
    Py_INCREF(&AdminType);

    // After this call, Python users can access it as:
    //     from kv_store import Admin
    if (PyModule_AddObject(m, "Admin", (PyObject*)&AdminType) < 0) {
        Py_DECREF(&AdminType);
        // StoreType's reference now belongs to the module; releasing m drops it.
        Py_DECREF(m);
        return NULL;
    }

    // Return the fully initialized module to the Python interpreter.
    return m;
}

Why Use Custom PyObjects?

Feature           | Standard Python Class       | Custom C PyObject
Attribute Storage | Dynamic __dict__ (Slow)     | Static C Struct (Blazing Fast)
Memory Usage      | High (Object overhead)      | Minimal (Raw C memory)
Type Safety       | Low (Everything is dynamic) | High (C type checking)
Binary Logic      | Requires conversion         | Native access