My Journey With Spark On Kubernetes... In Python (Part 3 of 3)

We need to operate Kubernetes from a Python client application, so we need to interact with the Kubernetes REST API. Luckily, we do not need to implement the API calls and manage HTTP requests and responses ourselves: we can rely on the Kubernetes Python client, one of several officially supported Kubernetes client libraries, alongside those for other languages such as Go, Java, .NET, JavaScript and Haskell (there are also many community-maintained client libraries for other languages). ...

April 14, 2021 · 18 min · Pascal Gillet

My Journey With Spark On Kubernetes... In Python (Part 2 of 3)

In the previous article, we saw how to launch Spark applications with the Spark Operator. In this article, we’ll see how to do the same thing, but natively with spark-submit. Let’s first explain the differences between the two ways of deploying your driver on the worker nodes. ...

April 13, 2021 · 10 min · Pascal Gillet

My Journey With Spark On Kubernetes... In Python (Part 1 of 3)

Je vous parle d’un temps Que les moins de vingt ans Ne peuvent pas connaître 🎶 Until not long ago, the usual way to run Spark on a cluster was with Spark’s own standalone cluster manager, Mesos or YARN. In the meantime, the Kingdom of Kubernetes has risen and spread widely. And when it comes to running Spark on Kubernetes, you now have two choices: Use Spark’s “native” Kubernetes capabilities: Spark can run on clusters managed by Kubernetes since Spark 2.3. Kubernetes support was still flagged as experimental until very recently, but as per SPARK-33005 Kubernetes GA Preparation, Spark on Kubernetes is now fully supported and production ready! 🎊 Use the Spark Operator, proposed and maintained by Google, which is still in beta version (and always will be). This series of three articles tells the story of my experiments with both methods, and how I launch Spark applications from Python code. “Cabin crew, arm doors and cross check”. Let’s go! ✈️ ...

April 12, 2021 · 9 min · Pascal Gillet