Embedded Python
Listing 1
Embedding Python is a relatively straightforward process. If your goal is merely to execute vanilla Python code from within a C program, it's actually quite easy. Listing 1 is the complete source to a program that embeds the Python interpreter. This illustrates one of the simplest programs you could write making use of the Python interpreter.
Listing 1 uses three Python-specific function calls. Py_Initialize starts up the Python interpreter library, causing it to allocate whatever internal resources it needs. You must call this function before calling most other functions in the Python API. PyRun_SimpleString provides a quick, no-frills way to execute arbitrary Python code. Interpretation of the code is immediate. In Listing 1, for instance, the import sys line causes Python to import the sys module before returning control to the C/C++ program. Each string passed to PyRun_SimpleString must be a complete Python statement of some kind. In other words, half statements are illegal, even if they are completed with another call to PyRun_SimpleString. For example, the following code will not work properly:
// Python will print first error here
PyRun_SimpleString("import ");
// Python will print second error here
PyRun_SimpleString("sys\n");
Py_Finalize is the last Python function that any application embedding Python must call. This function shuts down the interpreter and frees any resources it allocated during its lifetime. You should call this when you are completely finished using the Python library. When you call Py_Finalize, Python will unload all imported modules one by one. Many modules must execute their own clean-up code when they are unloaded in order to free any global resources they may have allocated. For this reason, calling Py_Finalize can have the side effect of causing quite a bit of other code to run.
PyRun_SimpleString is just one way to execute Python code from within your C applications. In fact, there is a whole collection of similar high-level functions. PyRun_SimpleFile is just like PyRun_SimpleString, except it reads its input from a FILE pointer rather than a character buffer. See the Python documentation at www.python.org/docs/api/veryhigh.html for complete documentation on these high-level functions.
In addition to evaluating Python scripts, you can also manipulate Python objects and call Python functions directly from your C code. While this involves more complex C code than using PyRun_SimpleString, it also allows access to more detailed information. For example, you can access objects returned from Python functions or determine if an exception has been thrown.
Extending Python
When you embed Python within your application, it is often desirable to provide a small module that exposes an API related to your application so that scripts executing within the embedded interpreter have a way to call back into the application. This is done by providing your own Python module, written in C, and is exactly the same as writing normal Python modules. The only difference is your module will function properly only within the embedded interpreter.
Extending Python requires some understanding of how the Python interpreter manipulates objects from C. All function arguments and return values are pointers to PyObject structures, which are the C representation of real Python objects. You can make use of various function calls to manipulate PyObjects. Listing 2 is a simple example of a Python module extension written in C. This is the source to the Python crypt module, which provides one-way hashing used in password authentication.
Listing 2
All C implementations of Python-callable functions take two arguments of type PyObject *. The first argument is always “self”, the object whose method is being called (similar to the infamous “this” pointer in C++). The second argument contains all the arguments to the function. PyArg_Parse is used to extract values from a PyObject containing function arguments. You do this by passing in the PyObject that contains the values, a format string that describes the data types you expect to find there, and one or more pointers to variables to be filled in with values from the PyObject. In Listing 2, the function takes two strings, represented by "(ss)". PyArg_Parse is similar to the C function sscanf, except it operates on a PyObject rather than a character buffer. In order to return a string value from the function, call PyString_FromString. This helper function takes a char* value and converts it into a PyObject.
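The sscanf analogy can be made concrete with plain C. The sketch below is not Python API code. It only shows the same calling shape that PyArg_Parse uses: a source of values, a format string, and pointers that receive the results. The helper name parse_two_words and the 63-character field limit are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: pulls two whitespace-separated words out of a
   character buffer. Each output buffer must hold at least 64 bytes.
   Returns 1 if both words were found, 0 otherwise. */
int parse_two_words(const char *input, char *first, char *second)
{
    /* "%63s %63s" plays the role that "(ss)" plays for PyArg_Parse:
       it names the two string values expected in the input. The
       pointers that follow are filled in with the values found. */
    return sscanf(input, "%63s %63s", first, second) == 2;
}
```

As with PyArg_Parse, the caller checks the return value before touching the outputs, since a mismatch between the format and the input leaves them unreliable.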
Python, C and Threads
C programs can easily create new threads of execution. Under Linux, this is most commonly done using the POSIX Threads (pthreads) API and the function call pthread_create. For an overview of how to use pthreads, see “POSIX Thread Libraries” by Felix Garcia and Javier Fernandez at http://www.linuxjournal.com/lj-issues/issue70/3184.html in the “Strictly On-line” section of LJ, February 2000. In order to support multi-threading, Python uses a mutex to serialize access to its internal data structures. I will refer to this mutex as the “global interpreter lock”. Before a given thread can make use of the Python C API, it must hold the global interpreter lock. This avoids race conditions that could lead to corruption of the interpreter state.
The act of locking and releasing this mutex is abstracted by the Python functions PyEval_AcquireLock and PyEval_ReleaseLock. After calling PyEval_AcquireLock, you can safely assume your thread holds the lock; all other cooperating threads are either blocked or executing code unrelated to the internals of the Python interpreter, and you may now call arbitrary Python functions. Once you have acquired the lock, however, you must be certain to release it later by calling PyEval_ReleaseLock. Failure to do so will cause a thread deadlock and freeze all other Python threads.
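The locking discipline can be illustrated without Python at all. The sketch below uses an ordinary pthread mutex as a stand-in for the global interpreter lock and a counter as a stand-in for interpreter state. The names big_lock, shared_state, and run_locked_workers are hypothetical and are not part of the Python API.

```c
#include <pthread.h>

#define NUM_THREADS 4
#define INCREMENTS  100000

/* Stand-in for the global interpreter lock: an ordinary mutex. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
/* Stand-in for interpreter state that must never be modified
   by two threads at once. */
static long shared_state = 0;

static void *worker(void *unused)
{
    int i;
    (void)unused;
    for (i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&big_lock);    /* role of PyEval_AcquireLock */
        shared_state++;                   /* touch the shared state */
        pthread_mutex_unlock(&big_lock);  /* role of PyEval_ReleaseLock */
    }
    return NULL;
}

/* Starts the workers, waits for them, and returns the final counter,
   or -1 if a thread could not be created. */
long run_locked_workers(void)
{
    pthread_t threads[NUM_THREADS];
    int i, created;

    shared_state = 0;
    for (i = 0; i < NUM_THREADS; i++)
        if (pthread_create(&threads[i], NULL, worker, NULL) != 0)
            break;
    created = i;
    for (i = 0; i < created; i++)
        pthread_join(threads[i], NULL);

    return created == NUM_THREADS ? shared_state : -1;
}
```

Because every increment happens while the mutex is held, no update is lost and the result is always NUM_THREADS * INCREMENTS. If a worker returned without unlocking, the remaining threads would block forever, which is the deadlock described above.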
To complicate matters further, each thread running Python maintains its own state information. This thread-specific data is stored in an object called PyThreadState. When calling Python API functions from C in a multi-threaded application, you must maintain your own PyThreadState objects in order to safely execute concurrent Python code.
If you are experienced in developing threaded applications, you might find the idea of a global interpreter lock rather unpleasant. Well, it's not as bad as it first appears. While Python is interpreting scripts, it periodically yields control to other threads by swapping out the current PyThreadState object and releasing the global interpreter lock. Threads previously blocked while attempting to lock the global interpreter lock will now be able to run. At some point, the original thread will regain control of the global interpreter lock and swap itself back in.
This means when you call PyRun_SimpleString, you are faced with the unavoidable side effect that other threads will have a chance to execute, even though you hold the global interpreter lock. In addition, making calls to Python modules written in C (including many of the built-in modules) opens the possibility of yielding control to other threads. For this reason, two C threads that execute computationally intensive Python scripts will indeed appear to share CPU time and run concurrently. The downside is that, due to the existence of the global interpreter lock, Python cannot fully utilize CPUs on multi-processor machines using threads.
Enabling Thread Support
Before your threaded C program is able to make use of the Python API, it must call some initialization routines. If the interpreter library is compiled with thread support enabled (as is usually the case), you have the runtime option of enabling threads or not. Do not enable runtime threading support unless you plan on using threads. If runtime support is not enabled, Python will be able to avoid the overhead associated with mutex locking its internal data structures. If you are using Python to extend a threaded application, you will need to enable thread support when you initialize the interpreter. I recommend initializing Python from within your main thread of execution, preferably during application startup, using the following two lines of code:
// initialize Python
Py_Initialize();
// initialize thread support
PyEval_InitThreads();
Both functions return void, so there are no error codes to check. You can now assume the Python interpreter is ready to execute Python code. Py_Initialize allocates global resources used by the interpreter library. Calling PyEval_InitThreads turns on the runtime thread support. This causes Python to enable its internal mutex lock mechanism, used to serialize access to critical sections of code within the interpreter. This function also has the side effect of locking the global interpreter lock. Once the function completes, you are responsible for releasing the lock. Before releasing the lock, however, you should grab a pointer to the current PyThreadState object. You will need this later in order to create new Python threads and to shut down the interpreter properly when you are finished using Python. Use the following bit of code to do this:
PyThreadState * mainThreadState = NULL;
// save a pointer to the main PyThreadState object
mainThreadState = PyThreadState_Get();
// release the lock
PyEval_ReleaseLock();
Creating a New Thread of Execution
Python requires a PyThreadState object for each thread that is executing Python code. The interpreter uses this object to manage a separate interpreter data space for each thread. In theory, this means that actions taken in one thread should not interfere with the state of another thread. For instance, if you throw an exception in one thread, the other snippets of Python code keep running as if nothing happened. You must help Python to manage per-thread data. To do this, manually create a PyThreadState object for each C thread that will execute Python code. In order to create a new PyThreadState object, you need a pre-existing PyInterpreterState object. The PyInterpreterState object holds information that is shared across all cooperating threads. When you initialized Python, it created a PyInterpreterState object and attached it to the main PyThreadState object. You can use this interpreter object to create a new PyThreadState for your own C thread. Here's some example code which does just that (ignore line wrapping):
// get the global lock
PyEval_AcquireLock();
// get a reference to the PyInterpreterState
PyInterpreterState * mainInterpreterState = mainThreadState->interp;
// create a thread state object for this thread
PyThreadState * myThreadState = PyThreadState_New(mainInterpreterState);
// free the lock
PyEval_ReleaseLock();
Executing Python Code
Now that you have created a PyThreadState object, your C thread can begin to use the Python API to execute Python scripts. You must adhere to a few simple rules when executing Python code from a C thread. First, you must hold the global interpreter lock before doing anything that alters the state of the current thread state. Second, you must load your thread-specific PyThreadState object into the interpreter before executing any Python code. Once you have satisfied these constraints, you can execute arbitrary Python code by using functions such as PyRun_SimpleString. Remember to swap out your PyThreadState object and release the global interpreter lock when done. Note the symmetry of “lock, swap, execute, swap, unlock” in the code (ignore line wrapping):
// grab the global interpreter lock
PyEval_AcquireLock();
// swap in my thread state
PyThreadState_Swap(myThreadState);
// execute some python code
PyRun_SimpleString("import sys\n");
// the doubled backslash keeps the newline escape inside the Python string
PyRun_SimpleString("sys.stdout.write('Hello from a C thread!\\n')\n");
// clear the thread state
PyThreadState_Swap(NULL);
// release our hold on the global interpreter
PyEval_ReleaseLock();
Cleaning Up a Thread
Once your C thread is no longer using the Python interpreter, you must dispose of its resources. To do this, delete your PyThreadState object. This is accomplished with the following code:
// grab the lock
PyEval_AcquireLock();
// swap my thread state out of the interpreter
PyThreadState_Swap(NULL);
// clear out any cruft from thread state object
PyThreadState_Clear(myThreadState);
// delete my thread state object
PyThreadState_Delete(myThreadState);
// release the lock
PyEval_ReleaseLock();
This thread is now effectively done using the Python API. You may safely call pthread_exit at this point to halt execution of the thread.
Shutting Down the Interpreter
Once your application has finished using the Python interpreter, you can shut down Python support with the following code:
// shut down the interpreter
PyEval_AcquireLock();
Py_Finalize();
Note there is no reason to release the lock, because Python has been shut down. Be certain to delete all your thread-state objects with PyThreadState_Clear and PyThreadState_Delete before calling Py_Finalize.
Conclusion
Python is a good choice for use as an embedded language. The interpreter provides support for both embedding and extending, which allows two-way communication between C application code and embedded Python scripts. In addition, the threading support facilitates integration with multi-threaded applications without compromising performance.
You can download example source code at ftp.linuxjournal.com/pub/lj/listings/issue73/3641.tgz. This includes an example implementation of a multi-threaded HTTP server with an embedded Python interpreter. In order to learn more about the implementation details, I recommend reading the Python C API documentation at http://www.python.org/docs/api/. In addition, I have found the Python interpreter code itself to be an invaluable reference.
Comments
Still getting crashes...
Thanks for the article; it helped me understand the GIL a little more.
Since Python 2.3 you can handle all the GIL locking with the PyGILState_Ensure and PyGILState_Release functions. Look at my code:
Okay, this was basically the code. So again, the handler class saves a pointer to the Python function for a certain event. E.g. if I want to call a Python function when (let's suppose you coded a chat program) someone sends a message to others, you call the CHandler fitting "ChatMessage" with arguments built like Py_BuildValue("(ss)", playerName, message) and call ExecuteHandler(handler, args /* built with above BuildValue */). The problem is that if someone spams excessively and many, many threads call the function, the program eventually crashes.
Full code can be seen at:
http://pyghost.googlecode.com