Wednesday, December 7, 2022

From zero to 10 million lines of Kotlin

We’re sharing lessons learned from shifting our Android development from Java to Kotlin.
Kotlin is a popular language for Android development and offers some key advantages over Java. 
As of today, our Android codebase contains over 10 million lines of Kotlin code.
We’re open sourcing various examples and utilities we used to manipulate Kotlin code as part of this migration.

In recent years, Kotlin has become a popular language for Android development. So it only makes sense that we would shift our Android development at Meta to Kotlin as we work to make our development workflows more efficient. 

Meta’s Android repository is very large and reaches across our family of apps and technologies, including Facebook, Instagram, Messenger, Portal, and the Quest. Shifting away from Java, which we currently use for Android development, and over to Kotlin is not a trivial task.

Why we’re converting our codebase to Kotlin

Kotlin is generally regarded as a better language than Java, with higher favorability ratings than Java in the yearly Stack Overflow developer survey. We also compared the latest Kotlin version with Java 11, which is the latest version that can be used for Android development.

Aside from its popularity, Kotlin holds some major advantages:

Nullability: Null pointer exceptions are a common problem at Meta, as everywhere else. We are very good at fixing them before releasing our apps, but dealing with those issues is still time-consuming. We use internal tools to detect null safety issues earlier, and we rigorously annotate our code as part of our work to detect such issues in Java earlier. But even with that, Kotlin’s built-in nullability handling is more robust and easier to work with.
Functional programming: Kotlin’s support for inline functions and lambda expressions allows us to use a functional programming style without compromising execution speed. Although Java 8 adds support for lambdas and is available for Android, it comes at the cost of more anonymous objects, which affect performance negatively on low-end Android devices. Meta’s home-brewed Redex minimizes these issues, but they still exist, making Kotlin a better alternative.
Shorter code: Kotlin’s modern design makes its code shorter. Kotlin allows for dropping explicit types (as does Java 11), and together with the standard library, which is based on the functional style mentioned above, it shortens many repetitive loops into simpler statements. This shorter code is also more explicit, which can make it easier to follow.
Domain-specific language (DSL) / Type-safe builders: Kotlin’s various features come together and let us define a DSL, which is essentially a way to move definitions such as Android XML files directly into Kotlin code. But this tool should be wielded carefully because implementing DSLs in Kotlin can either be useful or turn into overengineering.
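
As a minimal sketch of the type-safe builder idea, the snippet below declares a small UI-like structure directly in Kotlin. The Column and Row classes and the column, row, and settingsScreen functions are hypothetical names invented for this illustration; they are not Android or Meta APIs.

```kotlin
// Hypothetical model classes, invented for this illustration.
class Row(val label: String)

class Column {
    val rows = mutableListOf<Row>()

    // Called inside the builder lambda to add one row.
    fun row(label: String) {
        rows += Row(label)
    }
}

// A lambda with receiver: inside `build`, `this` is the Column being configured.
// `inline` means no function object is allocated for the lambda at the call site.
inline fun column(build: Column.() -> Unit): Column = Column().apply(build)

// The declaration reads like markup, but it is ordinary, type-checked Kotlin.
fun settingsScreen(): Column = column {
    row("Notifications")
    row("Privacy")
}
```

Calling settingsScreen() returns a Column holding the two rows in declaration order. A misspelled builder call or a wrong argument type is a compile error rather than a runtime failure.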

However, adopting Kotlin also has a few disadvantages that we could not ignore:

Adopting another language could mean we’ll have to deal with a mixed codebase of two languages for a long time. Kotlin is very good at interacting with Java, but quirks do pop up at times.
Kotlin is a popular language, but compared with Java, the popularity gap is clear. Java is the world’s second or third most popular language (depending on how one measures this). This means fewer tools are available. Worse than that, all the Kotlin tools need to account for Kotlin and Java interoperability, which complicates their implementation.

Lastly, our biggest worry was build times. We knew from the start that Kotlin’s build times would be longer than Java’s. The language and its ecosystem are more complicated, and Java had two decades of a head start to optimize its compiler. Since we own several large apps, the consequences of longer build times could negatively impact our developers’ experience. Hearing anecdotes such as OkHttp’s experience migrating to Kotlin painted a less-than-ideal picture.

How we’re approaching the migration

Migrating to Kotlin is both surprisingly easy and very complicated. It’s easy because Kotlin’s design allows simple conversion from Java with well-thought-out interoperability. This design made it possible for JetBrains to supply the developer community with J2K, the Java to Kotlin converter that comes with IntelliJ/Android Studio.

But even with J2K, the migration is still complicated. J2K doesn’t always get things correct, and the interoperability of Java and Kotlin exposes us to several edge cases. These run the gamut from style fixes to make the code cleaner all the way to tricky runtime behavior changes (which we will discuss later).

Going into this migration, we had two options:

We could make it possible to write new code at Meta using Kotlin but leave most of the existing code in Java.
We could attempt to convert almost all our in-house code into Kotlin.

The advantage of the first option is clear: it’s much less work. But there are two notable disadvantages to this approach. First, enabling interoperability between Kotlin and Java code introduces the use of platform types in Kotlin. Platform types give rise to runtime null pointer dereferences that result in crashes instead of the static safety offered by pure Kotlin code. In some complicated cases, Kotlin’s null check elision can let nulls through and create surprising null pointer exceptions later. This could happen if, for example, Kotlin code calls a Kotlin interface implemented by a Java class.
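
To make the platform-type problem concrete, here is a small sketch that uses only the JDK. It assumes a standard JVM, where System.getProperty carries no nullability annotations, so Kotlin treats its result as the platform type String!. The function names and the property key are made up for the example.

```kotlin
// System.getProperty is declared in Java without nullability information,
// so Kotlin infers the platform type `String!` for its result.
fun unsafeLength(key: String): Int {
    val value = System.getProperty(key) // inferred as String!
    return value.length // compiles without any null check; throws if value is null
}

// Declaring the type as nullable restores compile-time checking:
// the compiler now rejects `value.length` without `?.` or an explicit check.
fun safeLength(key: String): Int {
    val value: String? = System.getProperty(key)
    return value?.length ?: 0
}
```

For a key that is not set, safeLength returns 0, while unsafeLength fails at runtime with a NullPointerException even though the compiler reported no warning.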

Other issues include Java’s inability to tag type parameters as nullable (until recently), and Kotlin’s overloading rules taking nullability into account, while Java’s overloading rules do not.

The second disadvantage comes when considering that most software development at Meta — as with anywhere else — entails modifying existing code. If most of our code is in Java, we aren’t allowing our developers to fully enjoy Kotlin. Since the migration is a long process, expecting every engineer to convert a file to Kotlin before they touch it is exhausting and inefficient.

How we’re migrating to Kotlin

We considered these two options and decided our goal would be to convert almost all our code into Kotlin. After a slow start, where we had to fix a few blockers, we were able to begin converting a lot of code in bulk. Today, our Android apps for Facebook, Messenger, and Instagram each have more than 1 million lines of Kotlin code, and the rate of conversion is increasing. In total, our Android codebase has more than 10 million lines of Kotlin code.

Unblocking

As soon as we started trying to use Kotlin in our existing apps, we hit some issues. For example, we needed to update Redex to support bytecode patterns that Java did not generate. In addition, some internal libraries we use depend on transforming bytecode during compilation to achieve better performance. This code did not work when run as part of a Kotlin compilation.

We built workarounds for our tools to solve these issues. If you migrate your code to Kotlin and have a bunch of in-house optimizations, you should expect similar problems. However, we expect most people will not face such issues.

We also identified various gaps with existing tooling. For example, Kotlin syntax highlighting in our code review or wiki was lacking. We updated Pygments, the library we are using, to bring the experience up to par with Java. We updated some of our internal code-modding tools to be able to handle Kotlin. We also built Ktfmt, a deterministic Kotlin formatter based on the code and philosophy of google-java-format.

Accelerating the migration

With our tools ready, we could now convert any part of our code to Kotlin. But each migration required a bunch of boilerplate work that had to be done manually. J2K is a general tool and, as such, avoids understanding the code it is converting. This creates many cases that require manual work.

One common example is JUnit rules, which appear in many of our tests.

For example, you may want to verify the correct exceptions are thrown using the ExpectedException rule:

@Rule public ExpectedException expectedException = ExpectedException.none();

When J2K converts this code to Kotlin, we get: 

@Rule var expectedException = ExpectedException.none()

This code looks equivalent to the original Java at first, but because of the way Kotlin picks an annotation’s use-site target, it is actually equivalent to:

@Rule private ExpectedException expectedException = ExpectedException.none();

public ExpectedException getExpectedException() {
    return expectedException;
}

Trying to run this test will fail and return an error: “The @Rule expectedException must be public” since JUnit will see a private field annotated with @Rule. This is a common problem that has been answered many times in forums and can be fixed in one of two ways: either add `@JvmField` to the field or add an annotation use-site to the annotation so it is `@get:Rule`:

// solution 1: use `get` as the use-site for the annotation
@get:Rule var expectedException = ExpectedException.none()

// solution 2: generate JVM code only for a Java field without a getter
@JvmField @Rule var expectedException = ExpectedException.none()
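
The difference between these declarations can be checked with plain Java reflection. The sketch below is self-contained, so it uses a hypothetical stand-in annotation named Rule instead of org.junit.Rule; like the JUnit annotation, the stand-in can target fields and getters but not Kotlin properties. SampleTest, describeField, and getterHasRule are likewise invented for the illustration.

```kotlin
import java.lang.reflect.Modifier

// Hypothetical stand-in for org.junit.Rule (assumption: same field/method targeting).
@Retention(AnnotationRetention.RUNTIME)
@Target(AnnotationTarget.FIELD, AnnotationTarget.PROPERTY_GETTER)
annotation class Rule

class SampleTest {
    @Rule var plain = "as converted by J2K"
    @get:Rule var onGetter = "solution 1"
    @JvmField @Rule var onField = "solution 2"
}

// Describes the JVM field behind a property: its visibility and whether it carries @Rule.
fun describeField(name: String): String {
    val field = SampleTest::class.java.getDeclaredField(name)
    val visibility = if (Modifier.isPublic(field.modifiers)) "public" else "private"
    val annotated = field.isAnnotationPresent(Rule::class.java)
    return "$name: $visibility field, @Rule on field = $annotated"
}

// Reports whether the generated public getter carries @Rule.
fun getterHasRule(getterName: String): Boolean =
    SampleTest::class.java.getMethod(getterName).isAnnotationPresent(Rule::class.java)
```

Here describeField("plain") reports a private field that carries the annotation, which is the combination JUnit rejects. With @get:Rule the annotation moves to the public getter (getterHasRule("getOnGetter") is true), and with @JvmField the annotated field itself becomes public.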

Since J2K does not (and probably should not) know the intricacies of JUnit, it cannot do the right thing. Even if we thought JUnit were popular enough that it might warrant having J2K know about it, we would still have this same problem with many niche frameworks.

For example, a lot of Android Java code will use the utility methods from android.text.TextUtils, such as isEmpty to simplify the check of some strings. In Kotlin, however, we have the built-in standard library method String.isNullOrEmpty. This method is preferable not only because it’s in the standard library, but also because it has a contract that tells the Kotlin compiler that if it returns false, the object being tested can no longer be null and can be smart-cast to a String.
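
A short sketch of the smart cast that the contract enables (displayName is a made-up function for illustration):

```kotlin
fun displayName(name: String?): String {
    if (name.isNullOrEmpty()) return "(no name)"
    // The contract of isNullOrEmpty tells the compiler that `name` is non-null
    // on this path, so it is smart-cast from String? to String here and needs
    // neither `?.` nor `!!`.
    return name.trim()
}
```

With a Java helper such as TextUtils.isEmpty, the compiler learns nothing from the check, so the last line would still need an explicit null check or a safe call.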

Java code has many other similar helper methods, and many libraries implement these same basic methods. All of these should be replaced with the standard Kotlin methods to simplify the code and allow the compiler to properly detect non-nullable types.

We have found many instances of these small fixes. Some are easy to do (such as replacing isEmpty), some require research to figure out the first time (as in the case of JUnit rules), and a few are workarounds for actual J2K bugs that can result in anything from a build error to different runtime behavior.

To solve these issues, we put J2K in the middle of a three-step pipeline:

In the first step, we take one Java package and prepare it to be converted to Kotlin. This step mostly works around bugs and does conversions needed for our internal tools.
The second step is running J2K. We have been able to run Android Studio in a headless mode and invoke J2K, which allows us to run the entire pipeline as a script.
In the last step, we postprocess the new Kotlin files. This step contains the majority of our automated refactors and fixes, such as tagging a JUnit rule with @JvmField. As part of this step, we also apply our autocorrecting linters and apply various Android Studio suggestions in headless mode.

These automations do not resolve all the problems, but we are able to prioritize the most common ones. We run our conversion script (aptly named Kotlinator) on modules, prioritizing active and simpler modules first. We then observe the resulting commit: Does it compile? Does it pass our continuous integration smoothly? If it does, we commit it. And if not, we look at the issues and devise new automatic refactors to fix them. For issues that don’t seem systematic or new, we simply fix them manually and commit the change.
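
To give a feel for what a single postprocessing fix does, here is a deliberately simplified sketch of the @JvmField fix for JUnit rules. It works on raw text with a regular expression, which is an assumption made to keep the example short; it is not how Kotlinator is implemented. The names bareRuleProperty and addJvmFieldToRules are invented for this illustration.

```kotlin
// Matches a property declaration that starts with a bare @Rule annotation.
// Lines that already start with @JvmField, or that use @get:Rule, do not match.
val bareRuleProperty = Regex("""^([ \t]*)@Rule\s+((?:val|var)\s)""", RegexOption.MULTILINE)

// Prefixes every bare `@Rule val/var` declaration with @JvmField, keeping indentation.
fun addJvmFieldToRules(source: String): String =
    bareRuleProperty.replace(source) { match ->
        val (indent, keyword) = match.destructured
        "$indent@JvmField @Rule $keyword"
    }
```

Running it twice gives the same result as running it once, which matters when a pipeline is re-run on the same module. A text-based version like this misses declarations with modifiers between the annotation and the keyword, which is one reason to work on a parsed syntax tree instead.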

For the Java refactors, we use JavaASTParser, which allows us to resolve some types, along with other internal tools.

For the Kotlin side, we don’t yet have a good solution that can resolve types, so we opt to use the Kotlin compiler APIs. Loading Kotlin code into its PSI AST is simple and, in practice, gives us all the power we need to continuously improve Kotlinator.

Since we started this process, we’ve learned a bit while using the Kotlin compiler APIs, so we’re also releasing a limited set of some of the automated refactorings in the hope that it will help more developers use the Kotlin compiler parser to their advantage.

Here is a quick example of using a template-matching utility we built to handle the Android TextUtils.isEmpty case mentioned above: 

val ktFile = load(path)
// make sure the correct class is imported
if (ktFile.imports.none {
    it.importedReference?.text == "android.text.TextUtils"
}) {
    return
}
val newContent = ktFile.replaceAll<KtExpression>(
    matcher = template {
        val a by match<KtExpression> {}
        "TextUtils.isEmpty($a)"
    },
    replaceWith = {
        val a by it.variables
        "$a.isNullOrEmpty()"
    })
write(path, newContent)

If you have an adversarial mind, you can probably see a bunch of ways to break this refactor. In practice, we find that they don’t show up in our code, and that this is enough for us to move forward.

What we’ve learned from our Kotlin migration

With our tooling improvements, we were already able to convert a sizable chunk of our code into Kotlin. We already have more than 10 million lines of Kotlin code in our codebase, and the majority of Android developers at Meta are now writing Kotlin code.  

This scale has led us to a few conclusions:

Reduced code length

We expected Kotlin code to be shorter going into this migration. Some files were indeed cut in half (and even more), especially when the Java code had to null-check many fields, or when simple repetitive loops could be replaced with standard Kotlin methods that accept a lambda, such as “first,” “single,” and “any.”
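
As an illustration of the kind of loop that shrinks, the sketch below shows the same check written as a loop and with the standard library. The Message class and the function names are hypothetical.

```kotlin
data class Message(val sender: String, val unread: Boolean)

// Loop style, as a direct conversion from Java might leave it.
fun hasUnreadLoop(messages: List<Message>): Boolean {
    for (message in messages) {
        if (message.unread) return true
    }
    return false
}

// The same check with the standard library.
fun hasUnread(messages: List<Message>): Boolean = messages.any { it.unread }

// `first` returns the first match and throws NoSuchElementException if there is none.
fun firstUnreadSender(messages: List<Message>): String =
    messages.first { it.unread }.sender
```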

However, a lot of our code is simply about passing values around. For example, a Litho class, which defines UI and its styling, stays about the same length regardless of whether it’s in Java or Kotlin.

On average, we’ve seen a reduction of 11 percent in the number of lines of code from this migration. We have seen much higher numbers quoted online, but we suspect these numbers are derived from specific examples.

We are still happy about this number, as the lines removed are usually boilerplate code, which is less implicit than its shorter Kotlin counterpart.

Maintaining execution speed

Since Kotlin compiles to the same JVM bytecode, we did not expect to see any execution speed performance regressions from this migration.

To verify this, we ran multiple A/B tests comparing a Java implementation with a Kotlin implementation, using Kotlin features such as lambdas, nullability, and more. We found that Kotlin matched the performance of Java, as we expected.

Build size is not an issue

The Kotlin standard library is pretty small, and since all of our releases use ProGuard and Redex, only some of it even makes it into a release APK. Therefore, size hasn’t proved to be a problem except in a situation where a few KBs of extra code matter. In those cases, we found that by avoiding Kotlin’s standard library and using the already available Java methods, the problem can be solved. For example, using CharSequence.split from kotlin.text would add a few classes and constants when compared with using Java’s String.split.

Addressing longer build times

We expected build times would be longer with Kotlin since it’s a relatively new language, compared with Java. We guessed right, and our developers noticed that build times increased as we used more Kotlin in our codebase.

While the Kotlin compiler keeps improving, we looked at ways we can improve build times on our end. One of them is source-only ABI support in our Buck build system, which can generate ABI jars for dependencies in the build graph without actually compiling them. This is already supported for Java, and we’re working on a Kotlin version, which we believe will flatten the build graph and vastly improve incremental build speeds.

The other area we investigated is annotation processing, a known pain point for build speed. Kotlin supports annotation processors using KAPT, which is currently in maintenance mode. KAPT works by generating a Java code stub for the existing Java annotation processor code to run. It’s nice since it lets your existing code work without modifications, but it’s slow due to the generation of the Java stub.

The solution is to use KSP, the new, recommended way to handle annotation processing. We added support for it in Buck and are working on porting our existing processors to KSP using an adapter we developed. This does minimize the cost of running annotation processors, but only if no KAPT-based processors remain. The downside is that this requires a lot of work to update all the annotation processors. We found the interop library by the Room developers as another option to reuse existing code, but migration work is still needed for each processor.

What’s next for Kotlin at Meta?

Our migration to Kotlin is still ongoing and accelerating. We have been able to allow any Android developer at Meta who wants to use Kotlin to do so and have supplied them with tools to easily migrate existing code to Kotlin.

Kotlin still lacks some of the tools and optimizations that we have grown used to from working with Java. But we’re working to close those gaps. As we make progress and these tools and libraries mature, we will also work to release them back to the community.

The post From zero to 10 million lines of Kotlin appeared first on Engineering at Meta.
