Vincent Teyssier
1 min read · Nov 6, 2018

Thank you, Ruslan. Serialization is handled by Beam, but if you want to write your own IO module, you can find the serialization details in chapter 4.3 of the programming guide: https://beam.apache.org/documentation/programming-guide/
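To give a rough idea of what that serialization involves: Beam turns each element into bytes through a coder, an object with encode and decode methods. The sketch below mirrors the encode / decode / is_deterministic interface of apache_beam.coders.Coder without importing Beam, so it runs on its own. The Reading type and the ReadingCoder name are made up for this illustration, and JSON is just one possible byte format, not something Beam prescribes.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical record type, used only for this illustration."""
    sensor: str
    value: float


class ReadingCoder:
    """Plain-Python stand-in with the same method names as a Beam coder.

    Assumption: in a real pipeline this class would subclass
    apache_beam.coders.Coder instead of being a bare class.
    """

    def encode(self, reading: Reading) -> bytes:
        # Serialize the record to bytes (JSON chosen for readability).
        return json.dumps([reading.sensor, reading.value]).encode("utf-8")

    def decode(self, encoded: bytes) -> Reading:
        # Rebuild the record from the bytes produced by encode().
        sensor, value = json.loads(encoded.decode("utf-8"))
        return Reading(sensor, value)

    def is_deterministic(self) -> bool:
        # Equal records always produce identical bytes with this encoding.
        return True


coder = ReadingCoder()
original = Reading("s1", 21.5)
round_tripped = coder.decode(coder.encode(original))
```

With Beam installed, a coder like this is typically associated with its type through the coder registry, so that Beam picks it up whenever it needs to serialize that type.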

You can find more details on the SparkRunner here:

And on the execution model for parallelized tasks:
https://beam.apache.org/documentation/execution-model/

Please keep in mind that the Python SDK is still quite experimental and lags well behind the Java SDK. This gap should narrow in the coming versions, though.
