Upgrade to Scala 2.13.16, Spark 4.0.1, Java 17, and XGBoost 2.0.3#887
Upgrade to Scala 2.13.16, Spark 4.0.1, Java 17, and XGBoost 2.0.3#887Urvashi2303 wants to merge 1 commit intocombust:masterfrom
Conversation
|
Thanks for the PR. Also: |
|
Hey @Urvashi2303 I have created a PR to added github action and fix the python issues. |
There was a problem hiding this comment.
Pull request overview
This PR upgrades MLeap to modern versions across the technology stack: Scala 2.13.16, Apache Spark 4.0.1, Java 17, and XGBoost 2.0.3. The changes address compatibility requirements and API changes introduced by these major version upgrades.
Key Changes:
- Core dependency upgrades in build configuration (Scala, Spark, Java, XGBoost, Spring Boot, Logback, Hadoop)
- Scala 2.13 collection API compatibility fixes throughout the codebase (
.toSeq,.toMapconversions) - Java namespace migration from
javax.*tojakarta.*for Spring Boot 3.x - XGBoost 2.0 API adjustments (file format parameters, ArrayRow constructor changes)
Reviewed changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| project/Dependencies.scala | Updated version numbers for core dependencies and supporting libraries |
| project/Common.scala | Changed Scala version to 2.13.16 and added Java 17 compiler options |
| mleap-xgboost-spark test files | Refactored implicit imports to avoid Spark 4.0 compilation issues |
| mleap-xgboost-runtime/CachedDatasetUtils.scala | Added XGBoost 2.0 format parameter and fixed ArrayRow constructor |
| mleap-tensorflow files | Added ClassTag imports and .toSeq conversions for Scala 2.13 |
| mleap-spring-boot files | Migrated from javax to jakarta namespaces for Spring Boot 3.x |
| mleap-spark-testkit/SparkParityBase.scala | Adjusted Spark configuration and added .toSet conversion |
| mleap-spark-extension/OneVsRest.scala | Removed parallel processing from Range.map |
| mleap-spark-base files | Updated DataFrame API usage for Spark 4.0 |
| mleap-runtime files | Extensive collection API updates for Scala 2.13 compatibility |
| mleap-core files | Added .toMap conversions and type annotations for Scala 2.13 |
| mleap-benchmark/SparkTransformBenchmark.scala | Removed deprecated config parameter |
| mleap-avro files | Collection API conversions throughout value and schema converters |
| bundle-ml files | Collection API fixes and empty parameter list syntax updates |
| bundle-hdfs/HadoopBundleFileSystem.scala | Added .toSeq conversion for config list |
| README.md | Updated artifact IDs and added version compatibility row |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /** Object for serializing/deserializing model definitions with JSON. | ||
| */ | ||
| case class JsonFormatModelSerializer(implicit hr: HasBundleRegistry) extends FormatModelSerializer { | ||
| case class JsonFormatModelSerializer()(implicit hr: HasBundleRegistry) extends FormatModelSerializer { |
There was a problem hiding this comment.
Empty parameter lists () on case classes are unnecessary in Scala. The empty parentheses can be removed: case class JsonFormatModelSerializer(implicit hr: HasBundleRegistry)
| /** Object for serializing/deserializing model definitions with Protobuf. | ||
| */ | ||
| case class ProtoFormatModelSerializer(implicit hr: HasBundleRegistry) extends FormatModelSerializer { | ||
| case class ProtoFormatModelSerializer()(implicit hr: HasBundleRegistry) extends FormatModelSerializer { |
There was a problem hiding this comment.
Empty parameter lists () on case classes are unnecessary in Scala. The empty parentheses can be removed: case class ProtoFormatModelSerializer(implicit hr: HasBundleRegistry)
| case class ProtoFormatModelSerializer()(implicit hr: HasBundleRegistry) extends FormatModelSerializer { | |
| case class ProtoFormatModelSerializer(implicit hr: HasBundleRegistry) extends FormatModelSerializer { |
| .collect(Collectors.toList()) | ||
| .asScala | ||
| .map(removeElement) | ||
| .map(removeElement(_)) |
There was a problem hiding this comment.
The explicit underscore in .map(removeElement(_)) is redundant. Scala can infer this automatically, so it can be simplified to .map(removeElement)
| .map(removeElement(_)) | |
| .map(removeElement) |
| namedIndices: Array[(String, Int)] = Array(), | ||
| inputSize: Int) extends Model { | ||
| val allIndices: Array[Int] = indices.union(namedIndices.map(_._2)) | ||
| val allIndices: Array[Int] = (indices.union(namedIndices.map(_._2))).toArray |
There was a problem hiding this comment.
The parentheses around indices.union(namedIndices.map(_._2)) are unnecessary. The .toArray call can be applied directly: val allIndices: Array[Int] = indices.union(namedIndices.map(_._2)).toArray
| val allIndices: Array[Int] = (indices.union(namedIndices.map(_._2))).toArray | |
| val allIndices: Array[Int] = indices.union(namedIndices.map(_._2)).toArray |
Summary
This PR upgrades MLeap to Scala 2.13.16, Apache Spark 4.0.1, Java 17 and XGBoost 2.0.3. This brings the project up to date with the latest and stable versions across the entire stack.
Version Upgrades
Core Dependencies:
Supporting Libraries:
Build and Test Summary
Build Results:
Test Results:
mleap-benchmark
Pass
mleap-xgboost-runtime
Tests: 22 | Pass: 22 | Fail: 0
mleap-executor-tests
Tests: 31 | Pass: 31 | Fail: 0
root tests : mleap-base, mleap-tensor, bundle-ml, mleap-core, mleap-runtime, mleap-avro, mleap-spark
Tests: 8262 | Pass: 8261 | Fail: 1 (SparkTransformBuilderSpec fails with RemoteClassLoaderError in local mode)
Aborted(All these are local test environment limitations):