
Symptom:

The following exceptions appear in the node logs:

Caused by: java.lang.OutOfMemoryError: Java heap space
at org.apache.spark.sql.execution.vectorized.OnHeapColumnVector.reserveInternal(OnHeapColumnVector.java:575)

........

Caused by: java.lang.RuntimeException: Cannot reserve additional contiguous bytes in the vectorized reader (requested 24859017 bytes). As a workaround, you can reduce the vectorized reader batch size, or disable the vectorized reader, or disable spark.sql.sources.bucketing.enabled if you read from bucket table. For Parquet file format, refer to spark.sql.parquet.columnarReaderBatchSize (default 4096) and spark.sql.parquet.enableVectorizedReader; for ORC file format, refer to spark.sql.orc.columnarReaderBatchSize (default 4096) and spark.sql.orc.enableVectorizedReader.
at org.apache.spark.sql.execution.vectorized.WritableColumnVector.throwUnsupportedException(WritableColumnVector.java:113)


Cause:

Reference: Solved: Cannot reserve additional contiguous bytes in the ... - Databricks - 13774

Per the stack trace, the vectorized Parquet reader allocates its column batches on the JVM heap (OnHeapColumnVector.reserveInternal), and the job fails when the heap cannot satisfy a single large contiguous reservation (24859017 bytes in this case).
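The failed reservation scales roughly linearly with the reader batch size. A back-of-the-envelope check using the numbers from the log above (the per-row width is an inference, not something the log states directly):

    requested bytes ≈ batch size × bytes per row (for one column vector)
    24859017 bytes / 4096 rows ≈ 6069 bytes per row

So columns with very wide values (e.g., long strings) inflate each batch, and halving spark.sql.parquet.columnarReaderBatchSize roughly halves each reservation.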


Solution:

Because the ETL job generates and reads Parquet files once caching is enabled, and this exception is thrown by Spark while reading/writing Parquet files, it can be worked around as follows (a Spark-level alternative is sketched after the list):

1. Disable the ETL cache

2. Set the target's tuning parameter WRITE_JDBC_BATCHSIZE to 1
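Besides the two product-level workarounds above, the exception text itself points at Spark-level mitigations: shrink the vectorized reader's batch size or disable vectorized reading entirely. A minimal sketch in Scala, assuming the job builds its own SparkSession (the application name and the value 1024 are illustrative assumptions, not values from this incident):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("etl-job") // hypothetical application name
      // Shrink the per-batch allocation of the vectorized Parquet reader
      // (default is 4096 rows per batch, per the error message).
      .config("spark.sql.parquet.columnarReaderBatchSize", "1024")
      // Or fall back to the row-based reader entirely:
      // .config("spark.sql.parquet.enableVectorizedReader", "false")
      .getOrCreate()

Both configuration keys are taken verbatim from the exception text; the equivalent ORC keys it names (spark.sql.orc.columnarReaderBatchSize, spark.sql.orc.enableVectorizedReader) work the same way for ORC sources.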




