Pyobjects
1. What exactly is a PyObject?
A PyObject is the fundamental base structure for all Python objects. You can think of it as the root class in an object-oriented hierarchy. In C, it is defined as a relatively simple structure that contains only the bare essentials needed for Python to manage the object.
The Structure (Simplified)
If you peek into the CPython source code (Include/object.h), a standard PyObject looks like this:
typedef struct _object {
Py_ssize_t ob_refcnt; // Reference count
PyTypeObject *ob_type; // Pointer to the object's type
} PyObject;
- ob_refcnt (Reference Count): This is how Python manages memory. Every time you point a new variable to this object, this number goes up. When a variable goes out of scope or is deleted, it goes down. When it hits zero, Python frees the memory.
- ob_type (Type Pointer): This points to another structure (a PyTypeObject) that defines what this object is (e.g., an int, a list, or a function). This is how Python knows that + means "addition" for integers but "concatenation" for strings.
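The counting rule can be sketched in plain C. This is a toy model, not CPython's implementation: ToyObject, toy_new, toy_incref, toy_decref, and the toy_freed counter are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for PyObject: a reference count plus a payload. */
typedef struct {
    long refcnt;   /* plays the role of ob_refcnt */
    int  payload;
} ToyObject;

/* Counts freed objects so the effect of the last decref is observable. */
static int toy_freed = 0;

/* A new object starts with one reference, owned by the caller. */
static ToyObject *toy_new(int payload) {
    ToyObject *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->refcnt = 1;
    o->payload = payload;
    return o;
}

/* Analogous to Py_INCREF: another holder now keeps the pointer. */
static void toy_incref(ToyObject *o) {
    o->refcnt++;
}

/* Analogous to Py_DECREF: drop one reference and free at zero. */
static void toy_decref(ToyObject *o) {
    assert(o->refcnt > 0);
    if (--o->refcnt == 0) {
        toy_freed++;
        free(o);
    }
}
```

Calling toy_new, then toy_incref, then toy_decref twice frees the object on the second toy_decref only, which mirrors how an object lives as long as at least one reference remains.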
2. Inheritance in C
Since C doesn't have native inheritance, CPython emulates it by embedding one struct inside another (member inclusion). More specific types (like PyLongObject for integers) include a PyObject as their first member.
typedef struct {
PyObject ob_base; // The "header"
// ... specialized fields for integers ...
} PyLongObject;
Because ob_base is at the very beginning of the memory block, a pointer to a PyLongObject is also a valid pointer to a PyObject. This allows the CPython API to treat almost everything as a PyObject* through simple type casting.
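A minimal sketch of the first-member cast, written in plain C so it compiles without the Python headers. MiniType, MiniObject, MiniInt, and the helper functions are hypothetical miniatures, not the real CPython definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of PyTypeObject: just a name. */
typedef struct {
    const char *name;
} MiniType;

/* Hypothetical miniature of PyObject: the shared header. */
typedef struct {
    long refcnt;
    const MiniType *type;
} MiniObject;

/* A "subclass": the header comes first, then type-specific data. */
typedef struct {
    MiniObject base;
    long value;
} MiniInt;

static const MiniType MiniInt_Type = { "int" };

/* Generic code sees only the header and asks the type pointer what it has. */
static const char *mini_type_name(const MiniObject *o) {
    assert(o != NULL && o->type != NULL);
    return o->type->name;
}

/* Initialise a MiniInt in caller-provided storage and return it as the base type. */
static MiniObject *mini_int_init(MiniInt *storage, long value) {
    assert(offsetof(MiniInt, base) == 0);  /* the header sits at offset zero */
    storage->base.refcnt = 1;
    storage->base.type = &MiniInt_Type;
    storage->value = value;
    return (MiniObject *)storage;          /* valid because base is the first member */
}

/* Casting back down is only safe after checking the concrete type. */
static long mini_int_value(const MiniObject *o) {
    assert(o->type == &MiniInt_Type);
    return ((const MiniInt *)o)->value;
}
```

The upcast needs no pointer arithmetic because C guarantees that a pointer to a struct, suitably converted, points to its first member. The downcast is the risky direction, which is why the sketch checks the type pointer first.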
3. Working with PyObject* in your Code
When you write a C extension, you will almost exclusively pass and receive PyObject* pointers.
Type Checking
Before doing anything with a PyObject, you should verify its type using the built-in macros:
if (PyList_Check(my_obj)) {
// Treat as a list
} else if (PyUnicode_Check(my_obj)) {
// Treat as a string
}
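Check macros such as PyList_Check accept the exact type as well as its subtypes. The idea can be sketched as a walk up a chain of base types. This is a simplified toy model: DemoType, DemoObject, and demo_type_check are hypothetical names, and the real interpreter uses faster mechanisms than a linear walk.

```c
#include <assert.h>
#include <stddef.h>

/* Toy type descriptor with a link to its parent type. */
typedef struct DemoType {
    const char *name;
    const struct DemoType *base;  /* parent type, or NULL at the root */
} DemoType;

/* Toy object: only the type pointer matters for this sketch. */
typedef struct {
    const DemoType *type;
} DemoObject;

static const DemoType Demo_ObjectType = { "object", NULL };
static const DemoType Demo_ListType   = { "list",   &Demo_ObjectType };
static const DemoType Demo_MyListType = { "MyList", &Demo_ListType };
static const DemoType Demo_StrType    = { "str",    &Demo_ObjectType };

/* Return 1 if obj's type is `wanted` or derives from it, else 0. */
static int demo_type_check(const DemoObject *obj, const DemoType *wanted) {
    assert(obj != NULL && wanted != NULL);
    for (const DemoType *t = obj->type; t != NULL; t = t->base) {
        if (t == wanted) return 1;
    }
    return 0;
}
```

With these definitions, an object of type MiniList-style subclass MyList passes a check against Demo_ListType, while a str object does not, so code written for lists keeps working with list subclasses.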
Reference Management
Because PyObject contains the ob_refcnt, you are responsible for maintaining it.
- Py_INCREF(obj): "I'm keeping a copy of this pointer, don't delete it."
- Py_DECREF(obj): "I'm done with this, feel free to delete it if no one else wants it."
4. Why this Design Matters
This "Universal Pointer" design is what gives Python its dynamic nature.
- Polymorphism: A C function can take a PyObject* and not care if it's a list, a dictionary, or an integer until runtime.
- Memory Safety: By centralizing the reference count in the PyObject header, Python ensures a consistent memory management strategy across the entire language.
- Extensibility: When you create a C extension, you simply wrap your C data inside a PyObject structure, and Python treats it just like a native object.
The PyVarObject (The Sibling)
For objects whose length varies from instance to instance (like tuples or lists), Python uses PyVarObject, which is just a PyObject with an extra ob_size field:
typedef struct {
PyObject ob_base;
Py_ssize_t ob_size; /* Number of items in variable part */
} PyVarObject;
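To see why a length field in the header is useful, here is a toy variable-size object that stores its items in the same allocation as the header. ToyTuple, toy_tuple_new, and toy_tuple_len are hypothetical names, and the int payload is an assumption chosen to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy variable-size object: header fields followed by the items themselves. */
typedef struct {
    long refcnt;
    long size;      /* plays the role of ob_size */
    int  items[];   /* variable part, allocated right after the header */
} ToyTuple;

/* Allocate header and n items in one block: fixed size + n * item size. */
static ToyTuple *toy_tuple_new(long n) {
    assert(n >= 0);
    ToyTuple *t = malloc(sizeof(ToyTuple) + (size_t)n * sizeof(int));
    if (t == NULL) return NULL;
    t->refcnt = 1;
    t->size = n;
    for (long i = 0; i < n; i++) {
        t->items[i] = 0;   /* start with zeroed items */
    }
    return t;
}

/* The length is read straight from the header, without scanning the items. */
static long toy_tuple_len(const ToyTuple *t) {
    return t->size;
}
```

The size computation mirrors the roles of tp_basicsize (the fixed part) and tp_itemsize (the per-item part) in a type definition.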
Creating a custom PyObject (often called a "C extension type") is the most direct way to add new classes to Python from C. Instances store their data in a plain C struct, which typically lowers per-object memory use and attribute-access cost compared to an equivalent pure-Python class.
1. The Anatomy of a Custom Type
When you define a custom type, you are essentially creating a C struct that Python can treat as an object. This requires three distinct layers:
- The Data Structure: The C struct defining the object's fields.
- The Behavior: C functions that handle initialization, deallocation, and methods.
- The Type Blueprint: The PyTypeObject that tells Python how to use the struct.
2. Defining the Data Structure
Every custom type must start with PyObject_HEAD. This macro includes the reference count and a pointer to the type object.
#include <Python.h>
#include "structmember.h" // Required for attribute access
typedef struct {
PyObject_HEAD
PyObject *first_name; /* Python Object field */
PyObject *last_name; /* Python Object field */
int age; /* Standard C field */
} PersonObject;
3. Implementing Core Lifecycle Methods
To make the object "alive," you must define how it is created, initialized, and destroyed.
The Deallocator (tp_dealloc)
This function is called when the reference count hits zero. You must clean up any Python objects held within your struct.
static void Person_dealloc(PersonObject *self) {
Py_XDECREF(self->first_name);
Py_XDECREF(self->last_name);
Py_TYPE(self)->tp_free((PyObject *)self);
}
The Initializer (tp_init)
This is the C equivalent of __init__. It parses arguments passed from Python.
static int Person_init(PersonObject *self, PyObject *args, PyObject *kwds) {
    static char *kwlist[] = {"first", "last", "age", NULL};
    PyObject *first = NULL, *last = NULL, *tmp;
    if (!PyArg_ParseTupleAndKeywords(args, kwds, "OOi", kwlist,
                                     &first, &last, &self->age)) {
        return -1;
    }
    if (first) {
        // Take the new reference before releasing the old one.
        tmp = self->first_name;
        Py_INCREF(first);
        self->first_name = first;
        Py_XDECREF(tmp);
    }
    if (last) {
        tmp = self->last_name;
        Py_INCREF(last);
        self->last_name = last;
        Py_XDECREF(tmp);
    }
    return 0;
}
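The swap through tmp in Person_init follows a general rule: acquire the new reference before releasing the old one. A toy model in plain C shows why the order matters when the new value is the object already stored. Ref, Holder, holder_set, and the refs_freed counter are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy object that carries nothing but a reference count. */
typedef struct {
    long refcnt;
} Ref;

/* Counts freed objects so premature frees are observable. */
static int refs_freed = 0;

static Ref *ref_new(void) {
    Ref *r = malloc(sizeof *r);
    if (r != NULL) r->refcnt = 1;
    return r;
}

static void ref_incref(Ref *r) {
    r->refcnt++;
}

/* Like Py_XDECREF: tolerates NULL, frees at zero. */
static void ref_xdecref(Ref *r) {
    if (r != NULL && --r->refcnt == 0) {
        refs_freed++;
        free(r);
    }
}

/* A container with one owned slot, like a struct field holding a PyObject*. */
typedef struct {
    Ref *slot;
} Holder;

/* Store value in the holder; the holder takes its own reference. */
static void holder_set(Holder *h, Ref *value) {
    assert(h != NULL && value != NULL);
    Ref *old = h->slot;
    ref_incref(value);   /* take the new reference first */
    h->slot = value;
    ref_xdecref(old);    /* then release the old one */
}
```

If holder_set released old before taking the new reference, re-assigning the object that the holder alone keeps alive would drop its count to zero and free it while it is still in use.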
4. Defining Methods and Members
You need to tell Python which fields are accessible as attributes and which functions are callable as methods.
Method Table (tp_methods)
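static PyObject* Person_greet(PersonObject *self, PyObject *Py_UNUSED(ignored)) {
    // The name fields are still NULL if tp_init never ran, for example when a
    // subclass overrides __init__ without calling the base implementation.
    if (self->first_name == NULL || self->last_name == NULL) {
        PyErr_SetString(PyExc_AttributeError, "first_name and last_name must be set");
        return NULL;
    }
    return PyUnicode_FromFormat("Hello, %S %S!", self->first_name, self->last_name);
}
static PyMethodDef Person_methods[] = {
    {"greet", (PyCFunction)Person_greet, METH_NOARGS, "Return a greeting"},
    {NULL} /* Sentinel */
};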
Member Table (tp_members)
Exposes C fields directly to Python.
static PyMemberDef Person_members[] = {
{"age", T_INT, offsetof(PersonObject, age), 0, "Person's age"},
{NULL} /* Sentinel */
};
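The offsetof entry is what lets generic code reach a field without knowing the struct layout at compile time. A toy version of that lookup in plain C is sketched here. DemoPerson, DemoMember, demo_members, and demo_get_int are hypothetical names, and only int fields are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy instance struct with two plain C fields. */
typedef struct {
    long refcnt;
    int age;
    int height_cm;
} DemoPerson;

/* Toy member descriptor: attribute name plus byte offset into the struct. */
typedef struct {
    const char *name;
    size_t offset;
} DemoMember;

static const DemoMember demo_members[] = {
    {"age",       offsetof(DemoPerson, age)},
    {"height_cm", offsetof(DemoPerson, height_cm)},
    {NULL, 0}  /* sentinel */
};

/* Find name in the table and copy the int stored at its offset into *out.
   Returns 0 on success, -1 if the name is not in the table. */
static int demo_get_int(const void *obj, const char *name, int *out) {
    assert(obj != NULL && name != NULL && out != NULL);
    for (const DemoMember *m = demo_members; m->name != NULL; m++) {
        if (strcmp(m->name, name) == 0) {
            memcpy(out, (const char *)obj + m->offset, sizeof *out);
            return 0;
        }
    }
    return -1;
}
```

The real member table also records the field's C type (T_INT in Person_members) so the interpreter knows how to convert the raw bytes into a Python object. The toy version hard-codes int to stay short.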
5. The Type Definition (PyTypeObject)
This is the master blueprint. In a production-ready extension, you fill out the slots for the behaviors defined above.
static PyTypeObject PersonType = {
PyVarObject_HEAD_INIT(NULL, 0)
// Initializes the object header of the type object itself.
// PyTypeObject is a variable-size object, so the PyVarObject form of the macro is the correct one.
// The NULL type pointer is filled in later by PyType_Ready.
.tp_name = "my_module.Person",
// Fully qualified name of the type as seen from Python.
// Format: "module_name.ClassName".
// This is what appears in repr() and error messages.
.tp_doc = "Person objects",
// Documentation string for the class.
// Accessible in Python via: help(my_module.Person)
.tp_basicsize = sizeof(PersonObject),
// Size of the C structure used to represent the object instance.
// Python uses this to allocate memory when creating a new object.
.tp_itemsize = 0,
// Used for variable-sized objects (like lists or tuples).
// Set to 0 because PersonObject has a fixed size.
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
// Flags controlling behavior of the type.
// Py_TPFLAGS_DEFAULT: enables standard Python object features.
// Py_TPFLAGS_BASETYPE: allows this type to be subclassed in Python.
.tp_new = PyType_GenericNew,
// Responsible for allocating a new instance of the object.
// PyType_GenericNew is the default allocator used by most extensions.
.tp_init = (initproc)Person_init,
// Initialization function called after memory allocation.
// Equivalent to the Python __init__ method.
.tp_dealloc = (destructor)Person_dealloc,
// Called when the object's reference count reaches zero.
// Used to release resources and free memory.
.tp_methods = Person_methods,
// Table of methods exposed to Python.
// Each entry becomes an instance method that receives the object as self.
.tp_members = Person_members,
// Table describing attributes stored directly in the C struct.
// Allows Python code to access struct fields as object attributes.
};
6. Registration and Export
Finally, inside your PyInit function, you must "ready" the type and add it to your module.
static struct PyModuleDef mymodule = {
PyModuleDef_HEAD_INIT, "my_module", NULL, -1, NULL,
};
PyMODINIT_FUNC PyInit_my_module(void) {
PyObject *m;
if (PyType_Ready(&PersonType) < 0) return NULL;
m = PyModule_Create(&mymodule);
if (m == NULL) return NULL;
Py_INCREF(&PersonType);
if (PyModule_AddObject(m, "Person", (PyObject *)&PersonType) < 0) {
Py_DECREF(&PersonType);
Py_DECREF(m);
return NULL;
}
return m;
}
The same pattern extends to a module that exports several types. The annotated initializer below is for a separate module named kv_store and assumes that StoreType, AdminType, and the module definition kvmodule are defined elsewhere in that module's source:
PyMODINIT_FUNC PyInit_kv_store(void) {
// Module initialization function. Python calls this when `import kv_store` runs.
// It must return a fully constructed module object or NULL on failure.
PyObject* m; // Will hold the newly created module object.
/* --- 1. Finalize custom Python types --- */
// PyType_Ready prepares the type objects for use by the interpreter.
// It validates the structure, resolves inheritance, and fills internal slots.
// Every custom PyTypeObject must be finalized before it can be exposed to Python.
if (PyType_Ready(&StoreType) < 0)
return NULL;
if (PyType_Ready(&AdminType) < 0)
return NULL;
/* --- 2. Create the module object --- */
// PyModule_Create allocates the module based on the module definition
// (methods, documentation string, module state, etc.).
m = PyModule_Create(&kvmodule);
if (m == NULL)
return NULL;
/* --- 3. Expose the Store class to Python --- */
// PyModule_AddObject steals a reference on success, so take one first.
// That reference then belongs to the module.
Py_INCREF(&StoreType);
// Register the type in the module namespace so Python code can use:
// from kv_store import Store
if (PyModule_AddObject(m, "Store", (PyObject*)&StoreType) < 0) {
// If registration fails, undo the reference increment
// and destroy the partially created module.
Py_DECREF(&StoreType);
Py_DECREF(m);
return NULL;
}
/* --- 4. Expose the Admin class to Python --- */
Py_INCREF(&AdminType);
// After this call, Python users can access it as:
// from kv_store import Admin
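if (PyModule_AddObject(m, "Admin", (PyObject*)&AdminType) < 0) {
// Only the Admin reference is still ours. The module already owns the
// Store reference and releases it when the module is destroyed.
Py_DECREF(&AdminType);
Py_DECREF(m);
return NULL;
}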
// Return the fully initialized module to the Python interpreter.
return m;
}
Why Use Custom PyObjects?
| Feature | Standard Python Class | Custom C PyObject |
|---|---|---|
| Attribute Storage | Dynamic __dict__ (Slow) | Static C Struct (Blazing Fast) |
| Memory Usage | High (Object overhead) | Minimal (Raw C memory) |
| Type Safety | Low (Everything is dynamic) | High (C type checking) |
| Binary Logic | Requires conversion | Native access |