Thank you, Ruslan. Beam handles serialization for you, but if you want to write your own IO module, you can find the serialization details in section 4.3 of the programming guide: https://beam.apache.org/documentation/programming-guide/
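To give a rough idea of what "serialization" means here: Beam encodes each element to bytes with a coder and decodes it again on the receiving side. Below is a minimal sketch of that encode/decode contract in plain Python. The class name JsonRecordCoder and the sample record are hypothetical, and the class does not depend on Beam at all; in a real pipeline you would subclass apache_beam.coders.Coder, which exposes the same three methods.

```python
import json


class JsonRecordCoder:
    """Hypothetical coder sketch: mirrors the encode/decode/is_deterministic
    methods of apache_beam.coders.Coder without importing Beam."""

    def encode(self, value):
        # Sort keys so equal dicts always produce the same bytes.
        return json.dumps(value, sort_keys=True).encode("utf-8")

    def decode(self, encoded):
        # Inverse of encode: bytes back to a Python dict.
        return json.loads(encoded.decode("utf-8"))

    def is_deterministic(self):
        # Equal values encode to equal bytes, which matters when the
        # encoded form is used as a grouping key.
        return True


coder = JsonRecordCoder()
record = {"user": "example", "visits": 3}  # made-up sample element
encoded = coder.encode(record)
decoded = coder.decode(encoded)
```

The important property is the round trip: decoding the encoded bytes should give back an element equal to the original.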
You can find more details on the SparkRunner here:
And on the execution model for parallelized tasks:
https://beam.apache.org/documentation/execution-model/
Please keep in mind that the Python SDK is still quite experimental and lags well behind the Java SDK. That gap should narrow in upcoming versions, though.