Scala and Python resources for Spark are plentiful on the Internet, so let’s make Spark shine with Java!


Want to learn Machine Learning using Spark?

Site like, is an example for high quality examples, mostly in Scala and PySpark and some for Java.

There are a lot of Java examples in the Apache Spark repo; they are a nice way to dig deeper and understand how Spark works.

Apart from these resources, Favio Vázquez has written one of the highest-quality and most beginner-friendly posts on building a Spark MLlib model:

Credit to the original author for the content; it is a must-read before you move on to the next section of this post!


Outdated warning: this is an old blog post from 2016 that I’m moving from Blogspot to Medium.

Spring Batch is a project under the Spring Framework. Interestingly, this project began life as a collaboration between SpringSource (now Pivotal) and Accenture, who developed it together and released it as open source (Apache license)!

Spring Batch has many features that allow enterprises to write batch processing applications, and it also allows programs to support batch processing in a worker-model style.

Say you have an Application Instance (AI) that runs in Pivotal CloudFoundry, or you have a program deployed in AWS with an AutoScaling group, your Spring…


In this post, we will dive into Custom Resource Definitions (CRDs) as a way to extend K8S.

We will first describe the use cases for CRDs and some hygiene rules for writing them, and finally you will learn how to use Java to generate a CRD instead of editing it manually.

This post will demonstrate how to implement a controller and a CRD using Java: building the CRD without YAML, running a custom controller for your CRD, and detailing the controller implementation step by step.

This post will not dive deep into whether you should use a CRD or not; read the great comparison documentation from K8S.


Let’s get our hands dirty by writing a custom controller for K8S

In part 1, we walked through how a controller works, how the controller and the API server interact, and how controllers share responsibility while keeping a separation of duties.

In this blog, part 2, we will dive deeper into writing a controller that takes care of an existing resource definition.

A prerequisite for this post: you must understand both the K8S API and the Java SDK to follow the code embedded in this blog. See previous blog posts:

Coding K8S resource in Java — Part 1 of 2 (K8S API) Coding…

How to use Java Client SDK to patch and update a live resource easily?


Patching vs Update:

As stated in my other blog, Coding K8S resource in Java — Part 1 of 2 (K8S API), the K8S API server is RESTful, and both HTTP PATCH and PUT are supported.

In short:

  • Update = let me read the current version of the object, make some changes, then tell the API server to replace the entire object (except the status part of the object) = HTTP PUT request
  • Patching = Even if I (client side) haven’t read current version of the object, follow my instruction and…
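
To make the contrast concrete, here is a minimal sketch that only builds (and never sends) the two kinds of request using the JDK's own `java.net.http` classes rather than the K8S Java SDK. The host name, the `demo` Deployment and the class name `PatchVsUpdate` are assumptions made up for illustration; the merge-patch content type is just one of the patch formats the API server accepts.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;

final class PatchVsUpdate {
    // Hypothetical API server address and Deployment name, for illustration only.
    static final String DEPLOYMENT_URL =
        "https://k8s.example.invalid:6443/apis/apps/v1/namespaces/default/deployments/demo";

    // Update: the client sends back the complete object it read earlier (HTTP PUT).
    static HttpRequest buildUpdate(String fullObjectJson) {
        return HttpRequest.newBuilder(URI.create(DEPLOYMENT_URL))
            .header("Content-Type", "application/json")
            .PUT(BodyPublishers.ofString(fullObjectJson))
            .build();
    }

    // Patch: the client sends only the fields it wants changed (HTTP PATCH).
    static HttpRequest buildPatch(String partialJson) {
        return HttpRequest.newBuilder(URI.create(DEPLOYMENT_URL))
            .header("Content-Type", "application/merge-patch+json")
            .method("PATCH", BodyPublishers.ofString(partialJson))
            .build();
    }

    public static void main(String[] args) {
        // The update body stands in for a full object that was read beforehand.
        HttpRequest update = buildUpdate(
            "{\"apiVersion\":\"apps/v1\",\"kind\":\"Deployment\",\"metadata\":{\"name\":\"demo\"}}");
        // The patch body names only the field to change.
        HttpRequest patch = buildPatch("{\"spec\":{\"replicas\":3}}");
        System.out.println(update.method() + " " + update.uri());
        System.out.println(patch.method() + " " + patch.uri());
    }
}
```

Both requests target the same resource URL; only the HTTP method, the content type and how much of the object travels in the body differ.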

How to use K8S Java SDK to write a custom controller

Update 1-Jun-2020: Part 2 has been released!

What is a controller?


K8S has a lot of built-in resources such as Pod, Deployment, StatefulSet, ReplicaSet, CronJob, DaemonSet and many others.

However, the API server does not understand the context. For example, when the user puts a Deployment resource definition to the RESTful endpoint, the API server just stores the definition in the etcd cluster as a record of intent.

So there must be another set of components to deal with the resource object definition — Controller.

You can easily guess the name…
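
The split between "recording intent" and "acting on intent" can be sketched as a tiny reconcile function. This is a deliberately simplified illustration, not how the built-in controllers or the K8S Java SDK are implemented: the class name `ReconcileSketch`, the replica counts and the action strings are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ReconcileSketch {
    // Compare the recorded intent (desired) with the observed state (actual)
    // and return the actions needed to make the two converge.
    static List<String> reconcile(int desiredReplicas, int actualReplicas) {
        List<String> actions = new ArrayList<>();
        // Too few instances running: one "create" per missing instance.
        for (int i = actualReplicas; i < desiredReplicas; i++) {
            actions.add("create pod");
        }
        // Too many instances running: one "delete" per surplus instance.
        for (int i = desiredReplicas; i < actualReplicas; i++) {
            actions.add("delete pod");
        }
        return actions;
    }

    public static void main(String[] args) {
        // Hypothetical example: the stored definition asks for 3 replicas, 1 is running.
        System.out.println(reconcile(3, 1));
    }
}
```

When desired and actual already match, the function returns an empty list, which is the steady state a controller tries to reach.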

Last mile extension to K8S Java SDK for more user friendliness

This is an extension of my previous post on How Kubernetes support so many client libraries? and Coding K8S resource in Java — Part 1 of 2 (K8S API)

OpenAPI specification provides us a great way to automate the generation of essential model and api Java classes so that anyone can talk to K8S API server with the same data structure and endpoints.

There are several items that the OpenAPI generator didn’t cover:

  1. Dealing with Quantity (e.g. 100Ki vs 100k), parsing Int from String
  2. Helper function for writing controller: reconciler…
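
To show why quantities need special handling, here is a small sketch that converts a suffixed quantity string into a plain number. It is not the SDK's own implementation: the class name `QuantitySketch` is made up, only a handful of suffixes are covered, and fractional suffixes such as `m` (milli) are deliberately left out. In Kubernetes quantities the binary suffix `Ki` means 1024, while the decimal kilo suffix is written as a lowercase `k` and means 1000.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

final class QuantitySketch {
    // A partial table of suffixes and their multipliers (assumption: enough for a demo).
    private static final Map<String, BigDecimal> SUFFIXES = new LinkedHashMap<>();
    static {
        SUFFIXES.put("Ki", BigDecimal.valueOf(1024L));
        SUFFIXES.put("Mi", BigDecimal.valueOf(1024L * 1024));
        SUFFIXES.put("Gi", BigDecimal.valueOf(1024L * 1024 * 1024));
        SUFFIXES.put("k", BigDecimal.valueOf(1_000L));
        SUFFIXES.put("M", BigDecimal.valueOf(1_000_000L));
        SUFFIXES.put("G", BigDecimal.valueOf(1_000_000_000L));
    }

    // Parse strings such as "100Ki" or "100k"; a bare number is returned unchanged.
    static BigDecimal parse(String text) {
        for (Map.Entry<String, BigDecimal> entry : SUFFIXES.entrySet()) {
            String suffix = entry.getKey();
            if (text.endsWith(suffix)) {
                String number = text.substring(0, text.length() - suffix.length());
                return new BigDecimal(number).multiply(entry.getValue());
            }
        }
        // No known suffix: treat the whole string as a number
        // (throws NumberFormatException for unsupported suffixes).
        return new BigDecimal(text);
    }

    public static void main(String[] args) {
        System.out.println("100Ki = " + parse("100Ki"));
        System.out.println("100k  = " + parse("100k"));
    }
}
```

The two inputs look almost identical but differ by 2,400 units, which is the kind of mismatch a generated model class cannot catch if it treats the field as a plain String.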

Your goal: Pass

For the CKAD Exam, a score of 66% or above must be earned to pass.

You’ve got 2 hours, and there is never enough time!


Read Candidate Handbook (v1.8)

Read Exam tips

Editor: VIM/Nano

Get familiar with VIM (default) or Nano

export KUBE_EDITOR="nano"


Unless you are really familiar with it, switching between screens may not save you time.

Remember the Crontab format

Bookmark and find information

Useful to look up resource syntax detail in YAML

I created a new person (profile) in the Chrome browser and set up the bookmarks as below for quick reference.

Image for post

Practice environment

As of 28-Jan-2020, the exam uses Kubernetes v1.17

To provision an environment with 1.17, …

Manipulating state in the K8S API server using static YAML files is a practice that many follow, but one could ask whether there is a better way.


This post aims to provide answers for:

  1. Can we interact with the API server directly without YAML?
  2. Can we generate YAML programmatically?
  3. What are the useful features of the Java Client?

Again, this post won’t cover anything related to running a Java application inside a Pod that interacts with the K8S environment; rather, it shows how to interact with the K8S API via the Java Client.

This is a multi-part post:


Hin Lam

Machine Learning; Cloud Native App; Cloud;
