Release Notes - ASF JIRA

Release Notes - Spark - Version 2.0.1 - HTML format

Sub-task

[SPARK-15232] - Add subquery SQL building tests to LogicalPlanToSQLSuite
[SPARK-15698] - Ability to remove old metadata for Structured Streaming MetadataLog
[SPARK-15814] - Aggregator can return null result
[SPARK-16287] - Implement str_to_map SQL function
[SPARK-16312] - Docs for Kafka 0.10 consumer integration
[SPARK-16380] - Update SQL examples and programming guide for Python language binding
[SPARK-16391] - KeyValueGroupedDataset.reduceGroups should support partial aggregation
[SPARK-16508] - Fix documentation warnings found by R CMD check
[SPARK-16510] - Move SparkR test JAR into Spark, include its source code
[SPARK-16519] - Handle SparkR RDD generics that create warnings in R CMD check
[SPARK-16577] - Add check-cran script to Jenkins
[SPARK-16579] - Add a spark install function
[SPARK-16581] - Making JVM backend calling functions public
[SPARK-16621] - Generate stable SQLs in SQLBuilder
[SPARK-16734] - Make sure examples in all language bindings are consistent
[SPARK-16735] - Fail to create a map containing decimal type with literals having different inferred precisions and scales
[SPARK-16774] - Fix use of deprecated TimeStamp constructor (also providing incorrect results)
[SPARK-16776] - Fix Kafka deprecation warnings
[SPARK-16778] - Fix use of deprecated SQLContext constructor
[SPARK-16800] - Fix Java Examples that throw exception
[SPARK-16866] - Basic infrastructure for file-based SQL end-to-end tests
[SPARK-17007] - Move test data files into a test-data folder
[SPARK-17008] - Normalize query results using sorting
[SPARK-17009] - Use a new SparkSession for each test case
[SPARK-17011] - Support testing exceptions in queries
[SPARK-17015] - group-by-ordinal and order-by-ordinal test cases
[SPARK-17018] - literals.sql for testing literal parsing
[SPARK-17042] - Repl-defined classes cannot be replicated
[SPARK-17096] - Fix StreamingQueryListener to return message and stacktrace of actual exception
[SPARK-17149] - array.sql for testing array related functions
[SPARK-17165] - FileStreamSource should not track the list of seen files indefinitely
[SPARK-17235] - MetadataLog should support purging old logs
[SPARK-17269] - Move finish analysis stage into its own file
[SPARK-17270] - Move object optimization rules into its own file
[SPARK-17274] - Move join optimizer rules into a separate file
[SPARK-17372] - Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError
[SPARK-17513] - StreamExecution should discard unneeded metadata
[SPARK-17586] - Use Static member not via instance reference
[SPARK-18151] - CLONE - MetadataLog should support purging old logs
[SPARK-18152] - CLONE - FileStreamSource should not track the list of seen files indefinitely
[SPARK-18153] - CLONE - Ability to remove old metadata for Structured Streaming MetadataLog
[SPARK-18156] - CLONE - StreamExecution should discard unneeded metadata

Bug

[SPARK-10683] - Source code missing for SparkR test JAR
[SPARK-11227] - Spark1.5+ HDFS HA mode throw java.net.UnknownHostException: nameservice1
[SPARK-12666] - spark-shell --packages cannot load artifacts which are publishLocal'd by SBT
[SPARK-14204] - [SQL] Failure to register URL-derived JDBC driver on executors in cluster mode
[SPARK-14209] - Application failure during preemption.
[SPARK-14818] - Move sketch and mllibLocal out from mima exclusion
[SPARK-15083] - History Server would OOM due to unlimited TaskUIData in some stages
[SPARK-15285] - Generated SpecificSafeProjection.apply method grows beyond 64 KB
[SPARK-15382] - monotonicallyIncreasingId doesn't work when data is upsampled
[SPARK-15390] - Memory management issue in complex DataFrame join and filter
[SPARK-15541] - SparkContext.stop throws error
[SPARK-15869] - HTTP 500 and NPE on streaming batch details page
[SPARK-15899] - file scheme should be used correctly
[SPARK-15989] - PySpark SQL python-only UDTs don't support nested types
[SPARK-16062] - PySpark SQL python-only UDTs don't work well
[SPARK-16321] - [Spark 2.0] Performance regression when reading parquet and using PPD and non-vectorized reader
[SPARK-16334] - SQL query on parquet table java.lang.ArrayIndexOutOfBoundsException
[SPARK-16409] - regexp_extract with optional groups causes NPE
[SPARK-16439] - Incorrect information in SQL Query details
[SPARK-16440] - Undeleted broadcast variables in Word2Vec causing OoM for long runs
[SPARK-16457] - Wrong messages when CTAS with a Partition By clause
[SPARK-16460] - Spark 2.0 CSV ignores NULL value in Date format
[SPARK-16462] - Spark 2.0 CSV does not cast null values to certain data types properly
[SPARK-16522] - [MESOS] Spark application throws exception on exit
[SPARK-16533] - Spark application not handling preemption messages
[SPARK-16550] - Caching data with replication doesn't replicate data
[SPARK-16558] - examples/mllib/LDAExample should use MLVector instead of MLlib Vector
[SPARK-16563] - Repeat calling Spark SQL thrift server fetchResults return empty for ExecuteStatement operation
[SPARK-16586] - spark-class crash with "[: too many arguments" instead of displaying the correct error message
[SPARK-16597] - DataFrame DateType is written as an int(Days since epoch) by csv writer
[SPARK-16610] - When writing ORC files, orc.compress should not be overridden if users do not set "compression" in the options
[SPARK-16613] - RDD.pipe returns values for empty partitions
[SPARK-16632] - Vectorized parquet reader fails to read certain fields from Hive tables
[SPARK-16633] - lag/lead using constant input values does not return the default value when the offset row does not exist
[SPARK-16634] - GenericArrayData can't be loaded in certain JVMs
[SPARK-16639] - query fails if having condition contains grouping column
[SPARK-16642] - ResolveWindowFrame should not be triggered on UnresolvedFunctions.
[SPARK-16644] - constraints propagation may fail the query
[SPARK-16646] - LEAST doesn't accept numeric arguments with different data types
[SPARK-16648] - LAST_VALUE(FALSE) OVER () throws IndexOutOfBoundsException
[SPARK-16656] - CreateTableAsSelectSuite is flaky
[SPARK-16664] - Spark 1.6.2 - Persist call on Data frames with more than 200 columns is wiping out the data.
[SPARK-16672] - SQLBuilder should not raise exceptions on EXISTS queries
[SPARK-16686] - Dataset.sample with seed: result seems to depend on downstream usage
[SPARK-16698] - json parsing regression - "." in keys
[SPARK-16699] - Fix performance bug in hash aggregate on long string keys
[SPARK-16700] - StructType doesn't accept Python dicts anymore
[SPARK-16703] - Extra space in WindowSpecDefinition SQL representation
[SPARK-16711] - YarnShuffleService doesn't re-init properly on YARN rolling upgrade
[SPARK-16714] - Fail to create decimal arrays with literals having different inferred precisions and scales
[SPARK-16715] - Fix a potential ExprId conflict for SubexpressionEliminationSuite."Semantic equals and hash"
[SPARK-16721] - Lead/lag needs to respect nulls
[SPARK-16724] - Expose DefinedByConstructorParams
[SPARK-16729] - Spark should throw analysis exception for invalid casts to date type
[SPARK-16730] - Spark 2.0 breaks various Hive cast functions
[SPARK-16740] - joins.LongToUnsafeRowMap crashes with NegativeArraySizeException
[SPARK-16748] - Errors thrown by UDFs cause TreeNodeException when the query has an ORDER BY clause
[SPARK-16750] - ML GaussianMixture training failed due to feature column type mistake
[SPARK-16751] - Upgrade derby to 10.12.1.1 from 10.11.1.1
[SPARK-16770] - Spark shell not usable with german keyboard due to JLine version
[SPARK-16781] - java launched by PySpark as gateway may not be the same java used in the spark environment
[SPARK-16785] - dapply doesn't return array or raw columns
[SPARK-16787] - SparkContext.addFile() should not fail if called twice with the same file
[SPARK-16791] - casting structs fails on Timestamp fields (interpreted mode only)
[SPARK-16802] - joins.LongToUnsafeRowMap crashes with ArrayIndexOutOfBoundsException
[SPARK-16818] - Exchange reuse incorrectly reuses scans over different sets of partitions
[SPARK-16831] - PySpark CrossValidator reports incorrect avgMetrics
[SPARK-16836] - Hive date/time function error
[SPARK-16837] - TimeWindow incorrectly drops slideDuration in constructors
[SPARK-16850] - Improve error message for greatest/least
[SPARK-16873] - force spill NPE
[SPARK-16880] - Improve ANN training, add training data persist if needed
[SPARK-16883] - SQL decimal type is not properly cast to number when collecting SparkDataFrame
[SPARK-16901] - Hive settings in hive-site.xml may be overridden by Hive's default values
[SPARK-16905] - Support SQL DDL: MSCK REPAIR TABLE
[SPARK-16907] - Parquet table reading performance regression when vectorized record reader is not used
[SPARK-16922] - Query with Broadcast Hash join fails due to executor OOM in Spark 2.0
[SPARK-16925] - Spark tasks which cause JVM to exit with a zero exit code may cause app to hang in Standalone mode
[SPARK-16926] - Partition columns are present in columns metadata for partition but not table
[SPARK-16936] - Case Sensitivity Support for Refresh Temp Table
[SPARK-16942] - CREATE TABLE LIKE generates External table when source table is an External Hive Serde table
[SPARK-16943] - CREATE TABLE LIKE generates a non-empty table when source is a data source table
[SPARK-16950] - fromOffsets parameter in Kafka's Direct Streams does not work in python3
[SPARK-16953] - Make requestTotalExecutors public to be consistent with requestExecutors/killExecutors
[SPARK-16955] - Using ordinals in ORDER BY causes an analysis error when the query has a GROUP BY clause using ordinals
[SPARK-16959] - Table Comment in the CatalogTable returned from HiveMetastore is Always Empty
[SPARK-16961] - Utils.randomizeInPlace does not shuffle arrays uniformly
[SPARK-16966] - App Name is a randomUUID even when "spark.app.name" exists
[SPARK-16975] - Spark-2.0.0 unable to infer schema for parquet data written by Spark-1.6.2
[SPARK-16991] - Full outer join followed by inner join produces wrong results
[SPARK-16994] - Filter and limit are illegally permuted.
[SPARK-16995] - TreeNodeException when flat mapping RelationalGroupedDataset created from DataFrame containing a column created with lit/expr
[SPARK-17010] - [MINOR]Wrong description in memory management document
[SPARK-17013] - negative numeric literal parsing
[SPARK-17016] - group-by/order-by ordinal should throw AnalysisException instead of UnresolvedException
[SPARK-17022] - Potential deadlock in driver handling message
[SPARK-17027] - PolynomialExpansion.choose is prone to integer overflow
[SPARK-17038] - StreamingSource reports metrics for lastCompletedBatch instead of lastReceivedBatch
[SPARK-17051] - we should use hadoopConf in InsertIntoHiveTable
[SPARK-17056] - Fix a wrong assert in MemoryStore
[SPARK-17061] - Incorrect results returned following a join of two datasets and a map step where total number of columns >100
[SPARK-17065] - Improve the error message when encountering an incompatible DataSourceRegister
[SPARK-17066] - dateFormat should be used when writing dataframes as csv files
[SPARK-17086] - QuantileDiscretizer throws InvalidArgumentException (parameter splits given invalid value) on valid data
[SPARK-17093] - Roundtrip encoding of array<struct<>> fields is wrong when whole-stage codegen is disabled
[SPARK-17098] - "SELECT COUNT(NULL) OVER ()" throws UnsupportedOperationException during analysis
[SPARK-17099] - Incorrect result when HAVING clause is added to group by query
[SPARK-17100] - pyspark filter on a udf column after join gives java.lang.UnsupportedOperationException
[SPARK-17104] - LogicalRelation.newInstance should follow the semantics of MultiInstanceRelation
[SPARK-17110] - Pyspark with locality ANY throw java.io.StreamCorruptedException
[SPARK-17113] - Job failure due to Executor OOM in offheap mode
[SPARK-17114] - Adding a 'GROUP BY 1' where first column is literal results in wrong answer
[SPARK-17115] - Improve the performance of UnsafeProjection for wide table
[SPARK-17117] - 'SELECT 1 / NULL' throws AnalysisException, while 'SELECT 1 * NULL' works
[SPARK-17120] - Analyzer incorrectly optimizes plan to empty LocalRelation
[SPARK-17124] - RelationalGroupedDataset.agg should be order preserving and allow duplicate column names
[SPARK-17158] - Improve error message for numeric literal parsing
[SPARK-17160] - GetExternalRowField does not properly escape field names, causing generated code not to compile
[SPARK-17162] - Range does not support SQL generation
[SPARK-17167] - Issue Exceptions when Analyze Table on In-Memory Cataloged Tables
[SPARK-17180] - Unable to Alter the Temporary View Using ALTER VIEW command
[SPARK-17182] - CollectList and CollectSet should be marked as non-deterministic
[SPARK-17194] - When emitting SQL for string literals Spark should use single quotes, not double
[SPARK-17205] - Literal.sql does not properly convert NaN and Infinity literals
[SPARK-17210] - sparkr.zip is not distributed to executors when run sparkr in RStudio
[SPARK-17211] - Broadcast join produces incorrect results when compressed Oops differs between driver, executor
[SPARK-17216] - Event timeline for a stage doesn't cover 100% of the timeline bar in Chrome
[SPARK-17228] - Not infer/propagate non-deterministic constraints
[SPARK-17230] - Writing decimal to csv will result empty string if the decimal exceeds (20, 18)
[SPARK-17243] - Spark 2.0 history server summary page gets stuck at "loading history summary" with 10K+ application history
[SPARK-17244] - Joins should not pushdown non-deterministic conditions
[SPARK-17252] - Performing arithmetic in VALUES can lead to ClassCastException / MatchErrors during query parsing
[SPARK-17253] - Left join where ON clause does not reference the right table produces analysis error
[SPARK-17261] - Using HiveContext after re-creating SparkContext in Spark 2.0 throws "java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext"
[SPARK-17264] - DataStreamWriter should document that it only supports Parquet for now
[SPARK-17296] - Spark SQL: cross join + two joins = BUG
[SPARK-17299] - TRIM/LTRIM/RTRIM strips characters other than spaces
[SPARK-17306] - QuantileSummaries doesn't compress
[SPARK-17309] - ALTER VIEW should throw exception if view not exist
[SPARK-17323] - ALTER VIEW AS should keep the previous table properties, comment, create_time, etc.
[SPARK-17335] - Creating Hive table from Spark data
[SPARK-17336] - Repeated calls sbin/spark-config.sh file Causes ${PYTHONPATH} Value duplicate
[SPARK-17339] - Fix SparkR tests on Windows
[SPARK-17342] - Style of event timeline is broken
[SPARK-17352] - Executor computing time can be negative-number because of calculation error
[SPARK-17353] - CREATE TABLE LIKE statements when Source is a VIEW
[SPARK-17354] - java.lang.ClassCastException: java.lang.Integer cannot be cast to java.sql.Date
[SPARK-17355] - Work around exception thrown by HiveResultSetMetaData.isSigned
[SPARK-17356] - A large Metadata field in Alias can cause OOM when calling TreeNode.toJSON
[SPARK-17358] - Cached table (parquet/orc) should be shared between beelines
[SPARK-17364] - Can not query hive table starting with number
[SPARK-17369] - MetastoreRelation toJSON throws exception
[SPARK-17370] - Shuffle service files not invalidated when a slave is lost
[SPARK-17376] - Spark version should be available in R
[SPARK-17391] - Fix Two Test Failures After Backport
[SPARK-17396] - Threads number keep increasing when query on external CSV partitioned table
[SPARK-17418] - Spark release must NOT distribute Kinesis related assembly artifact
[SPARK-17438] - Master UI should show the correct core limit when `ApplicationInfo.executorLimit` is set
[SPARK-17439] - QuantileSummaries returns the wrong result after compression
[SPARK-17442] - Additional arguments in write.df are not passed to data source
[SPARK-17463] - Serialization of accumulators in heartbeats is not thread-safe
[SPARK-17465] - Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak
[SPARK-17474] - Python UDF does not work between Sort and Limit
[SPARK-17491] - MemoryStore.putIteratorAsBytes() may silently lose values when KryoSerializer is used
[SPARK-17494] - Floor/ceil of decimal returns wrong result if it's in compact format
[SPARK-17502] - Multiple Bugs in DDL Statements on Temporary Views
[SPARK-17503] - Memory leak in Memory store when unable to cache the whole RDD in memory
[SPARK-17511] - Dynamic allocation race condition: Containers getting marked failed while releasing
[SPARK-17512] - Specifying remote files for Python based Spark jobs in Yarn cluster mode not working
[SPARK-17514] - df.take(1) and df.limit(1).collect() perform differently in Python
[SPARK-17515] - CollectLimit.execute() should perform per-partition limits
[SPARK-17521] - Error when I use sparkContext.makeRDD(Seq())
[SPARK-17525] - SparkContext.clearFiles() still present in the PySpark bindings though the underlying Scala method was removed in Spark 2.0
[SPARK-17531] - Don't initialize Hive Listeners for the Execution Client
[SPARK-17541] - fix some DDL bugs about table management when same-name temp view exists
[SPARK-17545] - Spark SQL Catalyst doesn't handle ISO 8601 date without colon in offset
[SPARK-17546] - start-* scripts should use hostname -f
[SPARK-17547] - Temporary shuffle data files may be leaked following exception in write
[SPARK-17548] - Word2VecModel.findSynonyms can spuriously reject the best match when invoked with a vector
[SPARK-17567] - Broken link to Spark paper
[SPARK-17571] - AssertOnQuery.condition should be consistent in requiring Boolean return type
[SPARK-17599] - Folder deletion after globbing may fail StructuredStreaming jobs
[SPARK-17613] - PartitioningAwareFileCatalog.allFiles doesn't handle URI specified path at parent
[SPARK-17616] - Getting "java.lang.RuntimeException: Distinct columns cannot exist in Aggregate "
[SPARK-17617] - Remainder(%) expression.eval returns incorrect result
[SPARK-17618] - Dataframe except returns incorrect results when combined with coalesce
[SPARK-17627] - Streaming Providers should be labeled Experimental
[SPARK-17641] - collect_set should ignore null values
[SPARK-17644] - The failed stage never resubmitted due to abort stage in another thread
[SPARK-17650] - Adding a malformed URL to sc.addJar and/or sc.addFile bricks Executors
[SPARK-17652] - Fix confusing exception message while reserving capacity
[SPARK-17666] - take() or isEmpty() on dataset leaks s3a connections
[SPARK-17672] - Spark 2.0 history server web Ui takes too long for a single application
[SPARK-17673] - Reused Exchange Aggregations Produce Incorrect Results
[SPARK-17752] - Spark returns incorrect result when 'collect()'ing a cached Dataset with many columns
[SPARK-17809] - scala.MatchError: BooleanType when casting a struct

New Feature

[SPARK-16956] - Make ApplicationState.MAX_NUM_RETRY configurable
[SPARK-17069] - Expose spark.range() as table-valued function in SQL
[SPARK-17150] - Support SQL generation for inline tables
[SPARK-17456] - Utility for parsing Spark versions

Improvement

[SPARK-2424] - ApplicationState.MAX_NUM_RETRY should be configurable
[SPARK-10835] - Word2Vec should accept non-null string array, in addition to existing null string array
[SPARK-12370] - Documentation should link to examples from its own release version
[SPARK-13286] - JDBC driver doesn't report full exception
[SPARK-15639] - Try to push down filter at RowGroups level for parquet reader
[SPARK-15703] - Make ListenerBus event queue size configurable
[SPARK-15923] - Spark Application rest api returns "no such app: <appId>"
[SPARK-16216] - CSV data source does not write date and timestamp correctly
[SPARK-16240] - model loading backward compatibility for ml.clustering.LDA
[SPARK-16320] - Document G1 heap region's effect on spark 2.0 vs 1.6
[SPARK-16324] - regexp_extract should doc that it returns empty string when match fails
[SPARK-16568] - Update SQL programming guide refreshTable API
[SPARK-16650] - Improve documentation of spark.task.maxFailures
[SPARK-16651] - Document no exception using DataFrame.withColumnRenamed when existing column doesn't exist
[SPARK-16663] - desc table should be consistent between data source and hive serde tables
[SPARK-16764] - Recommend disabling vectorized parquet reader on OutOfMemoryError
[SPARK-16772] - Correct API doc references to PySpark classes + formatting fixes
[SPARK-16796] - Visible passwords on Spark environment page
[SPARK-16805] - Log timezone when query result does not match
[SPARK-16812] - Open up SparkILoop.getAddedJars
[SPARK-16813] - Remove private[sql] and private[spark] from catalyst package
[SPARK-16865] - A file-based end-to-end SQL query suite
[SPARK-16870] - Add "spark.sql.broadcastTimeout" to docs/sql-programming-guide.md to help people fix this timeout error when it happens
[SPARK-16875] - Add args checking for DataSet randomSplit and sample
[SPARK-16877] - Add a rule for preventing use Java's Override annotation
[SPARK-16932] - Programming-guide Accumulator section should be more clear w.r.t new API
[SPARK-16935] - Verification of Function-related ExternalCatalog APIs
[SPARK-16947] - Support type coercion and foldable expression for inline tables
[SPARK-16964] - Remove private[sql] and private[spark] from sql.execution package
[SPARK-17023] - Update Kafka connector to use Kafka 0.10.0.1
[SPARK-17063] - MSCK REPAIR TABLE is super slow with Hive metastore
[SPARK-17084] - Rename ParserUtils.assert to validate
[SPARK-17186] - remove catalog table type INDEX
[SPARK-17193] - HadoopRDD NPE at DEBUG log level when getLocationInfo == null
[SPARK-17231] - Avoid building debug or trace log messages unless the respective log level is enabled
[SPARK-17246] - Support BigDecimal literal parsing
[SPARK-17279] - better error message for exceptions during ScalaUDF execution
[SPARK-17297] - Clarify window/slide duration as absolute time, not relative to a calendar
[SPARK-17301] - Remove unused classTag field from AtomicType base class
[SPARK-17316] - Don't block StandaloneSchedulerBackend.executorRemoved
[SPARK-17347] - Encoder in Dataset example has incorrect type
[SPARK-17378] - Upgrade snappy-java to 1.1.2.6
[SPARK-17421] - Document warnings about "MaxPermSize" parameter when building with Maven and Java 8
[SPARK-17445] - Reference an ASF page as the main place to find third-party packages
[SPARK-17480] - CompressibleColumnBuilder inefficiently call gatherCompressibilityStats
[SPARK-17483] - Minor refactoring and cleanup in BlockManager block status reporting and block removal
[SPARK-17484] - Race condition when cancelling a job during a cache write can lead to block fetch failures
[SPARK-17485] - Failed remote cached block reads can lead to whole job failure
[SPARK-17486] - Remove unused TaskMetricsUIData.updatedBlockStatuses field
[SPARK-17558] - Bump Hadoop 2.7 version from 2.7.2 to 2.7.3
[SPARK-17569] - Don't recheck existence of files when generating File Relation resolution in StructuredStreaming
[SPARK-17577] - SparkR support add files to Spark job and get by executors
[SPARK-17609] - SessionCatalog.tableExists should not check temp view
[SPARK-17638] - Stop JVM StreamingContext when the Python process is dead
[SPARK-17640] - Avoid using -1 as the default batchId for FileStreamSource.FileEntry
[SPARK-17649] - Log how many Spark events got dropped in LiveListenerBus
[SPARK-17651] - Automate Spark version update for documentations
[SPARK-18391] - Openstack deployment scenarios

Test

[SPARK-16690] - rename SQLTestUtils.withTempTable to withTempView
[SPARK-16722] - Fix a StreamingContext leak in StreamingContextSuite when eventually fails
[SPARK-17102] - bypass UserDefinedGenerator for json format check
[SPARK-17318] - Fix flaky test: o.a.s.repl.ReplSuite replicating blocks of object with class defined in repl
[SPARK-17326] - Tests with HiveContext in SparkR being skipped always
[SPARK-17473] - jdbc docker tests are failing with java.lang.AbstractMethodError:
[SPARK-17589] - Fix test case `create external table`

Question

[SPARK-17794] - 2.0.1 not in maven central repo?

Documentation

[SPARK-16295] - Extract SQL programming guide example snippets from source files instead of hard code them
[SPARK-16761] - Fix doc link in docs/ml-guide.md
[SPARK-16911] - Remove migrating to a Spark 1.x version in programming guide documentation
[SPARK-17085] - Documentation and actual code differs - Unsupported Operations
[SPARK-17089] - Remove link to API doc for mapReduceTriplets because it has been removed from the API
[SPARK-17242] - Update links of external dstream projects
[SPARK-17561] - DataFrameWriter documentation formatting problems
[SPARK-17575] - Make correction in configuration documentation table tags

Release Notes - Spark - Version 2.0.1
    
<h2>        Sub-task
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15232'>SPARK-15232</a>] -         Add subquery SQL building tests to LogicalPlanToSQLSuite
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15698'>SPARK-15698</a>] -         Ability to remove old metadata for Structured Streaming MetadataLog
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15814'>SPARK-15814</a>] -         Aggregator can return null result
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16287'>SPARK-16287</a>] -         Implement str_to_map SQL function
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16312'>SPARK-16312</a>] -         Docs for Kafka 0.10 consumer integration
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16380'>SPARK-16380</a>] -         Update SQL examples and programming guide for Python language binding
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16391'>SPARK-16391</a>] -         KeyValueGroupedDataset.reduceGroups should support partial aggregation
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16508'>SPARK-16508</a>] -         Fix documentation warnings found by R CMD check
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16510'>SPARK-16510</a>] -         Move SparkR test JAR into Spark, include its source code
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16519'>SPARK-16519</a>] -         Handle SparkR RDD generics that create warnings in R CMD check
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16577'>SPARK-16577</a>] -         Add check-cran script to Jenkins
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16579'>SPARK-16579</a>] -         Add a spark install function
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16581'>SPARK-16581</a>] -         Making JVM backend calling functions public
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16621'>SPARK-16621</a>] -         Generate stable SQLs in SQLBuilder
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16734'>SPARK-16734</a>] -         Make sure examples in all language bindings are consistent
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16735'>SPARK-16735</a>] -         Fail to create a map containing decimal type with literals having different inferred precisions and scales
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16774'>SPARK-16774</a>] -         Fix use of deprecated TimeStamp constructor (also providing incorrect results)
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16776'>SPARK-16776</a>] -         Fix Kafka deprecation warnings
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16778'>SPARK-16778</a>] -         Fix use of deprecated SQLContext constructor
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16800'>SPARK-16800</a>] -         Fix Java Examples that throw exception
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16866'>SPARK-16866</a>] -         Basic infrastructure for file-based SQL end-to-end tests
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17007'>SPARK-17007</a>] -         Move test data files into a test-data folder
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17008'>SPARK-17008</a>] -         Normalize query results using sorting
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17009'>SPARK-17009</a>] -         Use a new SparkSession for each test case
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17011'>SPARK-17011</a>] -         Support testing exceptions in queries
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17015'>SPARK-17015</a>] -         group-by-ordinal and order-by-ordinal test cases
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17018'>SPARK-17018</a>] -         literals.sql for testing literal parsing
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17042'>SPARK-17042</a>] -         Repl-defined classes cannot be replicated
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17096'>SPARK-17096</a>] -         Fix StreamingQueryListener to return message and stacktrace of actual exception
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17149'>SPARK-17149</a>] -         array.sql for testing array related functions
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17165'>SPARK-17165</a>] -         FileStreamSource should not track the list of seen files indefinitely
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17235'>SPARK-17235</a>] -         MetadataLog should support purging old logs
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17269'>SPARK-17269</a>] -         Move finish analysis stage into its own file
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17270'>SPARK-17270</a>] -         Move object optimization rules into its own file
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17274'>SPARK-17274</a>] -         Move join optimizer rules into a separate file
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17372'>SPARK-17372</a>] -         Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17513'>SPARK-17513</a>] -         StreamExecution should discard unneeded metadata
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17586'>SPARK-17586</a>] -         Use Static member not via instance reference
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-18151'>SPARK-18151</a>] -         CLONE - MetadataLog should support purging old logs
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-18152'>SPARK-18152</a>] -         CLONE - FileStreamSource should not track the list of seen files indefinitely
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-18153'>SPARK-18153</a>] -         CLONE - Ability to remove old metadata for Structured Streaming MetadataLog
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-18156'>SPARK-18156</a>] -         CLONE - StreamExecution should discard unneeded metadata
</li>
</ul>
            
<h2>        Bug
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-10683'>SPARK-10683</a>] -         Source code missing for SparkR test JAR
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-11227'>SPARK-11227</a>] -         Spark1.5+ HDFS HA mode throw java.net.UnknownHostException: nameservice1
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-12666'>SPARK-12666</a>] -         spark-shell --packages cannot load artifacts which are publishLocal&#39;d by SBT
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-14204'>SPARK-14204</a>] -         [SQL] Failure to register URL-derived JDBC driver on executors in cluster mode
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-14209'>SPARK-14209</a>] -         Application failure during preemption.
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-14818'>SPARK-14818</a>] -         Move sketch and mllibLocal out from mima exclusion
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15083'>SPARK-15083</a>] -         History Server would OOM due to unlimited TaskUIData in some stages
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15285'>SPARK-15285</a>] -         Generated SpecificSafeProjection.apply method grows beyond 64 KB
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15382'>SPARK-15382</a>] -         monotonicallyIncreasingId doesn&#39;t work when data is upsampled
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15390'>SPARK-15390</a>] -         Memory management issue in complex DataFrame join and filter
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15541'>SPARK-15541</a>] -         SparkContext.stop throws error
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15869'>SPARK-15869</a>] -         HTTP 500 and NPE on streaming batch details page
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15899'>SPARK-15899</a>] -         file scheme should be used correctly
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15989'>SPARK-15989</a>] -         PySpark SQL python-only UDTs don&#39;t support nested types
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16062'>SPARK-16062</a>] -         PySpark SQL python-only UDTs don&#39;t work well
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16321'>SPARK-16321</a>] -         [Spark 2.0] Performance regression when reading parquet and using PPD and non-vectorized reader
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16334'>SPARK-16334</a>] -         SQL query on parquet table java.lang.ArrayIndexOutOfBoundsException
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16409'>SPARK-16409</a>] -         regexp_extract with optional groups causes NPE
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16439'>SPARK-16439</a>] -         Incorrect information in SQL Query details
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16440'>SPARK-16440</a>] -         Undeleted broadcast variables in Word2Vec causing OoM for long runs 
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16457'>SPARK-16457</a>] -         Wrong messages when CTAS with a Partition By clause
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16460'>SPARK-16460</a>] -         Spark 2.0 CSV ignores NULL value in Date format
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16462'>SPARK-16462</a>] -         Spark 2.0 CSV does not cast null values to certain data types properly
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16522'>SPARK-16522</a>] -         [MESOS] Spark application throws exception on exit
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16533'>SPARK-16533</a>] -         Spark application not handling preemption messages
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16550'>SPARK-16550</a>] -         Caching data with replication doesn&#39;t replicate data
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16558'>SPARK-16558</a>] -         examples/mllib/LDAExample should use MLVector instead of MLlib Vector
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16563'>SPARK-16563</a>] -         Repeat calling Spark SQL thrift server fetchResults return empty for ExecuteStatement operation
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16586'>SPARK-16586</a>] -         spark-class crash with &quot;[: too many arguments&quot; instead of displaying the correct error message
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16597'>SPARK-16597</a>] -         DataFrame DateType is written as an int(Days since epoch) by csv writer
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16610'>SPARK-16610</a>] -         When writing ORC files, orc.compress should not be overridden if users do not set &quot;compression&quot; in the options
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16613'>SPARK-16613</a>] -         RDD.pipe returns values for empty partitions
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16632'>SPARK-16632</a>] -         Vectorized parquet reader fails to read certain fields from Hive tables
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16633'>SPARK-16633</a>] -         lag/lead using constant input values does not return the default value when the offset row does not exist
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16634'>SPARK-16634</a>] -         GenericArrayData can&#39;t be loaded in certain JVMs
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16639'>SPARK-16639</a>] -         query fails if having condition contains grouping column
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16642'>SPARK-16642</a>] -         ResolveWindowFrame should not be triggered on UnresolvedFunctions.
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16644'>SPARK-16644</a>] -         constraints propagation may fail the query
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16646'>SPARK-16646</a>] -         LEAST doesn&#39;t accept numeric arguments with different data types
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16648'>SPARK-16648</a>] -         LAST_VALUE(FALSE) OVER () throws IndexOutOfBoundsException
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16656'>SPARK-16656</a>] -         CreateTableAsSelectSuite is flaky
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16664'>SPARK-16664</a>] -         Spark 1.6.2 - Persist call on Data frames with more than 200 columns is wiping out the data.
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16672'>SPARK-16672</a>] -         SQLBuilder should not raise exceptions on EXISTS queries
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16686'>SPARK-16686</a>] -         Dataset.sample with seed: result seems to depend on downstream usage
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16698'>SPARK-16698</a>] -         json parsing regression - &quot;.&quot; in keys
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16699'>SPARK-16699</a>] -         Fix performance bug in hash aggregate on long string keys
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16700'>SPARK-16700</a>] -         StructType doesn&#39;t accept Python dicts anymore
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16703'>SPARK-16703</a>] -         Extra space in WindowSpecDefinition SQL representation
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16711'>SPARK-16711</a>] -         YarnShuffleService doesn&#39;t re-init properly on YARN rolling upgrade
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16714'>SPARK-16714</a>] -         Fail to create a decimal array with literals having different inferred precisions and scales
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16715'>SPARK-16715</a>] -         Fix a potential ExprId conflict for SubexpressionEliminationSuite.&quot;Semantic equals and hash&quot;
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16721'>SPARK-16721</a>] -         Lead/lag needs to respect nulls 
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16724'>SPARK-16724</a>] -         Expose DefinedByConstructorParams
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16729'>SPARK-16729</a>] -         Spark should throw analysis exception for invalid casts to date type
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16730'>SPARK-16730</a>] -         Spark 2.0 breaks various Hive cast functions
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16740'>SPARK-16740</a>] -         joins.LongToUnsafeRowMap crashes with NegativeArraySizeException
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16748'>SPARK-16748</a>] -         Errors thrown by UDFs cause TreeNodeException when the query has an ORDER BY clause
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16750'>SPARK-16750</a>] -         ML GaussianMixture training failed due to feature column type mistake
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16751'>SPARK-16751</a>] -         Upgrade derby to 10.12.1.1 from 10.11.1.1
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16770'>SPARK-16770</a>] -         Spark shell not usable with German keyboard due to JLine version
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16781'>SPARK-16781</a>] -         java launched by PySpark as gateway may not be the same java used in the spark environment
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16785'>SPARK-16785</a>] -         dapply doesn&#39;t return array or raw columns
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16787'>SPARK-16787</a>] -         SparkContext.addFile() should not fail if called twice with the same file
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16791'>SPARK-16791</a>] -         casting structs fails on Timestamp fields (interpreted mode only)
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16802'>SPARK-16802</a>] -         joins.LongToUnsafeRowMap crashes with ArrayIndexOutOfBoundsException
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16818'>SPARK-16818</a>] -         Exchange reuse incorrectly reuses scans over different sets of partitions
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16831'>SPARK-16831</a>] -         PySpark CrossValidator reports incorrect avgMetrics
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16836'>SPARK-16836</a>] -         Hive date/time function error
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16837'>SPARK-16837</a>] -         TimeWindow incorrectly drops slideDuration in constructors
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16850'>SPARK-16850</a>] -         Improve error message for greatest/least
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16873'>SPARK-16873</a>] -         force spill NPE
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16880'>SPARK-16880</a>] -         Improve ANN training, add training data persist if needed
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16883'>SPARK-16883</a>] -         SQL decimal type is not properly cast to number when collecting SparkDataFrame
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16901'>SPARK-16901</a>] -         Hive settings in hive-site.xml may be overridden by Hive&#39;s default values
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16905'>SPARK-16905</a>] -         Support SQL DDL: MSCK REPAIR TABLE
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16907'>SPARK-16907</a>] -         Parquet table reading performance regression when vectorized record reader is not used
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16922'>SPARK-16922</a>] -         Query with Broadcast Hash join fails due to executor OOM in Spark 2.0
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16925'>SPARK-16925</a>] -         Spark tasks which cause JVM to exit with a zero exit code may cause app to hang in Standalone mode
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16926'>SPARK-16926</a>] -         Partition columns are present in columns metadata for partition but not table
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16936'>SPARK-16936</a>] -         Case Sensitivity Support for Refresh Temp Table
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16942'>SPARK-16942</a>] -         CREATE TABLE LIKE generates External table when source table is an External Hive Serde table
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16943'>SPARK-16943</a>] -         CREATE TABLE LIKE generates a non-empty table when source is a data source table
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16950'>SPARK-16950</a>] -         fromOffsets parameter in Kafka&#39;s Direct Streams does not work in python3
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16953'>SPARK-16953</a>] -         Make requestTotalExecutors public to be consistent with requestExecutors/killExecutors
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16955'>SPARK-16955</a>] -         Using ordinals in ORDER BY causes an analysis error when the query has a GROUP BY clause using ordinals
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16959'>SPARK-16959</a>] -         Table Comment in the CatalogTable returned from HiveMetastore is Always Empty
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16961'>SPARK-16961</a>] -         Utils.randomizeInPlace does not shuffle arrays uniformly
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16966'>SPARK-16966</a>] -         App Name is a randomUUID even when &quot;spark.app.name&quot; exists
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16975'>SPARK-16975</a>] -         Spark-2.0.0 unable to infer schema for parquet data written by Spark-1.6.2
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16991'>SPARK-16991</a>] -         Full outer join followed by inner join produces wrong results
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16994'>SPARK-16994</a>] -         Filter and limit are illegally permuted.
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16995'>SPARK-16995</a>] -         TreeNodeException when flat mapping RelationalGroupedDataset created from DataFrame containing a column created with lit/expr
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17010'>SPARK-17010</a>] -         [MINOR]Wrong description in memory management document
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17013'>SPARK-17013</a>] -         negative numeric literal parsing
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17016'>SPARK-17016</a>] -         group-by/order-by ordinal should throw AnalysisException instead of UnresolvedException
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17022'>SPARK-17022</a>] -         Potential deadlock in driver handling message
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17027'>SPARK-17027</a>] -         PolynomialExpansion.choose is prone to integer overflow 
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17038'>SPARK-17038</a>] -         StreamingSource reports metrics for lastCompletedBatch instead of lastReceivedBatch
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17051'>SPARK-17051</a>] -         we should use hadoopConf in InsertIntoHiveTable
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17056'>SPARK-17056</a>] -         Fix a wrong assert in MemoryStore
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17061'>SPARK-17061</a>] -         Incorrect results returned following a join of two datasets and a map step where total number of columns &gt;100
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17065'>SPARK-17065</a>] -         Improve the error message when encountering an incompatible DataSourceRegister
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17066'>SPARK-17066</a>] -         dateFormat should be used when writing dataframes as csv files
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17086'>SPARK-17086</a>] -         QuantileDiscretizer throws InvalidArgumentException (parameter splits given invalid value) on valid data
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17093'>SPARK-17093</a>] -         Roundtrip encoding of array&lt;struct&lt;&gt;&gt; fields is wrong when whole-stage codegen is disabled
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17098'>SPARK-17098</a>] -         &quot;SELECT COUNT(NULL) OVER ()&quot; throws UnsupportedOperationException during analysis
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17099'>SPARK-17099</a>] -         Incorrect result when HAVING clause is added to group by query
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17100'>SPARK-17100</a>] -         pyspark filter on a udf column after join gives java.lang.UnsupportedOperationException
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17104'>SPARK-17104</a>] -         LogicalRelation.newInstance should follow the semantics of MultiInstanceRelation
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17110'>SPARK-17110</a>] -         PySpark with locality ANY throws java.io.StreamCorruptedException
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17113'>SPARK-17113</a>] -         Job failure due to Executor OOM in offheap mode
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17114'>SPARK-17114</a>] -         Adding a &#39;GROUP BY 1&#39; where first column is literal results in wrong answer
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17115'>SPARK-17115</a>] -         Improve the performance of UnsafeProjection for wide table
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17117'>SPARK-17117</a>] -         &#39;SELECT 1 / NULL&#39; throws AnalysisException, while &#39;SELECT 1 * NULL&#39; works
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17120'>SPARK-17120</a>] -         Analyzer incorrectly optimizes plan to empty LocalRelation
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17124'>SPARK-17124</a>] -         RelationalGroupedDataset.agg should be order preserving and allow duplicate column names
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17158'>SPARK-17158</a>] -         Improve error message for numeric literal parsing
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17160'>SPARK-17160</a>] -         GetExternalRowField does not properly escape field names, causing generated code not to compile
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17162'>SPARK-17162</a>] -         Range does not support SQL generation
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17167'>SPARK-17167</a>] -         Issue Exceptions when Analyze Table on In-Memory Cataloged Tables
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17180'>SPARK-17180</a>] -         Unable to Alter the Temporary View Using ALTER VIEW command
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17182'>SPARK-17182</a>] -         CollectList and CollectSet should be marked as non-deterministic
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17194'>SPARK-17194</a>] -         When emitting SQL for string literals Spark should use single quotes, not double
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17205'>SPARK-17205</a>] -         Literal.sql does not properly convert NaN and Infinity literals
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17210'>SPARK-17210</a>] -         sparkr.zip is not distributed to executors when run sparkr in RStudio
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17211'>SPARK-17211</a>] -         Broadcast join produces incorrect results when compressed Oops differs between driver, executor
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17216'>SPARK-17216</a>] -         Event timeline for a stage doesn&#39;t cover 100% of the timeline bar in Chrome
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17228'>SPARK-17228</a>] -         Do not infer/propagate non-deterministic constraints
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17230'>SPARK-17230</a>] -         Writing a decimal to csv results in an empty string if the decimal exceeds (20, 18)
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17243'>SPARK-17243</a>] -         Spark 2.0 history server summary page gets stuck at &quot;loading history summary&quot; with 10K+ application history
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17244'>SPARK-17244</a>] -         Joins should not pushdown non-deterministic conditions
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17252'>SPARK-17252</a>] -         Performing arithmetic in VALUES can lead to ClassCastException / MatchErrors during query parsing
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17253'>SPARK-17253</a>] -         Left join where ON clause does not reference the right table produces analysis error
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17261'>SPARK-17261</a>] -         Using HiveContext after re-creating SparkContext in Spark 2.0 throws &quot;java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext&quot;
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17264'>SPARK-17264</a>] -         DataStreamWriter should document that it only supports Parquet for now
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17296'>SPARK-17296</a>] -         Spark SQL: cross join + two joins = BUG
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17299'>SPARK-17299</a>] -         TRIM/LTRIM/RTRIM strips characters other than spaces
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17306'>SPARK-17306</a>] -         QuantileSummaries doesn&#39;t compress
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17309'>SPARK-17309</a>] -         ALTER VIEW should throw exception if view does not exist
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17323'>SPARK-17323</a>] -         ALTER VIEW AS should keep the previous table properties, comment, create_time, etc.
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17335'>SPARK-17335</a>] -         Creating Hive table from Spark data
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17336'>SPARK-17336</a>] -         Repeated calls to sbin/spark-config.sh cause duplicate values in ${PYTHONPATH}
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17339'>SPARK-17339</a>] -         Fix SparkR tests on Windows
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17342'>SPARK-17342</a>] -         Style of event timeline is broken
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17352'>SPARK-17352</a>] -         Executor computing time can be negative-number because of calculation error
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17353'>SPARK-17353</a>] -         CREATE TABLE LIKE statements when Source is a VIEW
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17354'>SPARK-17354</a>] -         java.lang.ClassCastException: java.lang.Integer cannot be cast to java.sql.Date
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17355'>SPARK-17355</a>] -         Work around exception thrown by HiveResultSetMetaData.isSigned
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17356'>SPARK-17356</a>] -         A large Metadata field in Alias can cause OOM when calling TreeNode.toJSON
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17358'>SPARK-17358</a>] -         Cached table (parquet/orc) should be shared between beelines
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17364'>SPARK-17364</a>] -         Cannot query Hive table whose name starts with a number
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17369'>SPARK-17369</a>] -         MetastoreRelation toJSON throws exception
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17370'>SPARK-17370</a>] -         Shuffle service files not invalidated when a slave is lost
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17376'>SPARK-17376</a>] -         Spark version should be available in R
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17391'>SPARK-17391</a>] -         Fix Two Test Failures After Backport
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17396'>SPARK-17396</a>] -         Thread count keeps increasing when querying an external CSV partitioned table
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17418'>SPARK-17418</a>] -         Spark release must NOT distribute Kinesis related assembly artifact
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17438'>SPARK-17438</a>] -         Master UI should show the correct core limit when `ApplicationInfo.executorLimit` is set
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17439'>SPARK-17439</a>] -         QuantilesSummaries returns the wrong result after compression
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17442'>SPARK-17442</a>] -         Additional arguments in write.df are not passed to data source
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17463'>SPARK-17463</a>] -         Serialization of accumulators in heartbeats is not thread-safe
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17465'>SPARK-17465</a>] -         Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17474'>SPARK-17474</a>] -         Python UDF does not work between Sort and Limit
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17491'>SPARK-17491</a>] -         MemoryStore.putIteratorAsBytes() may silently lose values when KryoSerializer is used
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17494'>SPARK-17494</a>] -         Floor/ceil of decimal returns wrong result if it&#39;s in compact format
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17502'>SPARK-17502</a>] -         Multiple Bugs in DDL Statements on Temporary Views 
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17503'>SPARK-17503</a>] -         Memory leak in Memory store when unable to cache the whole RDD in memory
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17511'>SPARK-17511</a>] -         Dynamic allocation race condition: Containers getting marked failed while releasing
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17512'>SPARK-17512</a>] -         Specifying remote files for Python based Spark jobs in Yarn cluster mode not working
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17514'>SPARK-17514</a>] -         df.take(1) and df.limit(1).collect() perform differently in Python
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17515'>SPARK-17515</a>] -         CollectLimit.execute() should perform per-partition limits
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17521'>SPARK-17521</a>] -         Error when I use sparkContext.makeRDD(Seq())
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17525'>SPARK-17525</a>] -         SparkContext.clearFiles() still present in the PySpark bindings though the underlying Scala method was removed in Spark 2.0
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17531'>SPARK-17531</a>] -         Don&#39;t initialize Hive Listeners for the Execution Client
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17541'>SPARK-17541</a>] -         fix some DDL bugs about table management when same-name temp view exists
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17545'>SPARK-17545</a>] -         Spark SQL Catalyst doesn&#39;t handle ISO 8601 date without colon in offset
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17546'>SPARK-17546</a>] -         start-* scripts should use hostname -f
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17547'>SPARK-17547</a>] -         Temporary shuffle data files may be leaked following exception in write
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17548'>SPARK-17548</a>] -         Word2VecModel.findSynonyms can spuriously reject the best match when invoked with a vector
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17567'>SPARK-17567</a>] -         Broken link to Spark paper
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17571'>SPARK-17571</a>] -         AssertOnQuery.condition should be consistent in requiring Boolean return type
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17599'>SPARK-17599</a>] -         Folder deletion after globbing may fail StructuredStreaming jobs
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17613'>SPARK-17613</a>] -         PartitioningAwareFileCatalog.allFiles doesn&#39;t handle URI specified path at parent
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17616'>SPARK-17616</a>] -         Getting &quot;java.lang.RuntimeException: Distinct columns cannot exist in Aggregate &quot;
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17617'>SPARK-17617</a>] -         Remainder(%) expression.eval returns incorrect result
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17618'>SPARK-17618</a>] -         Dataframe except returns incorrect results when combined with coalesce
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17627'>SPARK-17627</a>] -         Streaming Providers should be labeled Experimental
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17641'>SPARK-17641</a>] -         collect_set should ignore null values
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17644'>SPARK-17644</a>] -         The failed stage never resubmitted due to abort stage in another thread
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17650'>SPARK-17650</a>] -         Adding a malformed URL to sc.addJar and/or sc.addFile bricks Executors
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17652'>SPARK-17652</a>] -         Fix confusing exception message while reserving capacity
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17666'>SPARK-17666</a>] -         take() or isEmpty() on dataset leaks s3a connections
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17672'>SPARK-17672</a>] -         Spark 2.0 history server web Ui takes too long for a single application
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17673'>SPARK-17673</a>] -         Reused Exchange Aggregations Produce Incorrect Results
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17752'>SPARK-17752</a>] -         Spark returns incorrect result when &#39;collect()&#39;ing a cached Dataset with many columns
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17809'>SPARK-17809</a>] -         scala.MatchError: BooleanType when casting a struct
</li>
</ul>
            
<h2>        New Feature
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16956'>SPARK-16956</a>] -         Make ApplicationState.MAX_NUM_RETRY configurable
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17069'>SPARK-17069</a>] -         Expose spark.range() as table-valued function in SQL
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17150'>SPARK-17150</a>] -         Support SQL generation for inline tables
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17456'>SPARK-17456</a>] -         Utility for parsing Spark versions
</li>
</ul>
    
<h2>        Improvement
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-2424'>SPARK-2424</a>] -         ApplicationState.MAX_NUM_RETRY should be configurable
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-10835'>SPARK-10835</a>] -         Word2Vec should accept non-null string array, in addition to existing null string array
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-12370'>SPARK-12370</a>] -         Documentation should link to examples from its own release version
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-13286'>SPARK-13286</a>] -         JDBC driver doesn&#39;t report full exception
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15639'>SPARK-15639</a>] -         Try to push down filter at RowGroups level for parquet reader
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15703'>SPARK-15703</a>] -         Make ListenerBus event queue size configurable
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-15923'>SPARK-15923</a>] -         Spark Application rest api returns &quot;no such app: &lt;appId&gt;&quot;
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16216'>SPARK-16216</a>] -         CSV data source does not write date and timestamp correctly
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16240'>SPARK-16240</a>] -         model loading backward compatibility for ml.clustering.LDA
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16320'>SPARK-16320</a>] -         Document G1 heap region&#39;s effect on spark 2.0 vs 1.6
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16324'>SPARK-16324</a>] -         regexp_extract should doc that it returns empty string when match fails
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16568'>SPARK-16568</a>] -         Update SQL programming guide refreshTable API
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16650'>SPARK-16650</a>] -         Improve documentation of spark.task.maxFailures 
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16651'>SPARK-16651</a>] -         Document no exception using DataFrame.withColumnRenamed when existing column doesn&#39;t exist
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16663'>SPARK-16663</a>] -         desc table should be consistent between data source and hive serde tables
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16764'>SPARK-16764</a>] -         Recommend disabling vectorized parquet reader on OutOfMemoryError
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16772'>SPARK-16772</a>] -         Correct API doc references to PySpark classes + formatting fixes
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16796'>SPARK-16796</a>] -         Visible passwords on Spark environment page
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16805'>SPARK-16805</a>] -         Log timezone when query result does not match
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16812'>SPARK-16812</a>] -         Open up SparkILoop.getAddedJars
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16813'>SPARK-16813</a>] -         Remove private[sql] and private[spark] from catalyst package
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16865'>SPARK-16865</a>] -         A file-based end-to-end SQL query suite
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16870'>SPARK-16870</a>] -         Add &quot;spark.sql.broadcastTimeout&quot; to docs/sql-programming-guide.md to help people fix this timeout error when it happens
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16875'>SPARK-16875</a>] -         Add args checking for Dataset randomSplit and sample
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16877'>SPARK-16877</a>] -         Add a rule preventing use of Java&#39;s Override annotation
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16932'>SPARK-16932</a>] -         Programming-guide Accumulator section should be more clear w.r.t new API
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16935'>SPARK-16935</a>] -         Verification of Function-related ExternalCatalog APIs
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16947'>SPARK-16947</a>] -         Support type coercion and foldable expression for inline tables
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16964'>SPARK-16964</a>] -         Remove private[sql] and private[spark] from sql.execution package
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17023'>SPARK-17023</a>] -         Update Kafka connector to use Kafka 0.10.0.1
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17063'>SPARK-17063</a>] -         MSCK REPAIR TABLE is super slow with Hive metastore
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17084'>SPARK-17084</a>] -         Rename ParserUtils.assert to validate
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17186'>SPARK-17186</a>] -         remove catalog table type INDEX
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17193'>SPARK-17193</a>] -         HadoopRDD NPE at DEBUG log level when getLocationInfo == null
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17231'>SPARK-17231</a>] -         Avoid building debug or trace log messages unless the respective log level is enabled
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17246'>SPARK-17246</a>] -         Support BigDecimal literal parsing
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17279'>SPARK-17279</a>] -         better error message for exceptions during ScalaUDF execution
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17297'>SPARK-17297</a>] -         Clarify window/slide duration as absolute time, not relative to a calendar
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17301'>SPARK-17301</a>] -         Remove unused classTag field from AtomicType base class
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17316'>SPARK-17316</a>] -         Don&#39;t block StandaloneSchedulerBackend.executorRemoved
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17347'>SPARK-17347</a>] -         Encoder in Dataset example has incorrect type
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17378'>SPARK-17378</a>] -         Upgrade snappy-java to 1.1.2.6
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17421'>SPARK-17421</a>] -         Document warnings about &quot;MaxPermSize&quot; parameter when building with Maven and Java 8
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17445'>SPARK-17445</a>] -         Reference an ASF page as the main place to find third-party packages
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17480'>SPARK-17480</a>] -         CompressibleColumnBuilder inefficiently calls gatherCompressibilityStats
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17483'>SPARK-17483</a>] -         Minor refactoring and cleanup in BlockManager block status reporting and block removal
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17484'>SPARK-17484</a>] -         Race condition when cancelling a job during a cache write can lead to block fetch failures
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17485'>SPARK-17485</a>] -         Failed remote cached block reads can lead to whole job failure
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17486'>SPARK-17486</a>] -         Remove unused TaskMetricsUIData.updatedBlockStatuses field
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17558'>SPARK-17558</a>] -         Bump Hadoop 2.7 version from 2.7.2 to 2.7.3
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17569'>SPARK-17569</a>] -         Don&#39;t recheck existence of files when generating File Relation resolution in StructuredStreaming
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17577'>SPARK-17577</a>] -         SparkR support for adding files to a Spark job and retrieving them on executors
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17609'>SPARK-17609</a>] -         SessionCatalog.tableExists should not check temp view
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17638'>SPARK-17638</a>] -         Stop JVM StreamingContext when the Python process is dead
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17640'>SPARK-17640</a>] -         Avoid using -1 as the default batchId for FileStreamSource.FileEntry
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17649'>SPARK-17649</a>] -         Log how many Spark events got dropped in LiveListenerBus
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17651'>SPARK-17651</a>] -         Automate Spark version update for documentations
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-18391'>SPARK-18391</a>] -         Openstack deployment scenarios
</li>
</ul>
    
<h2>        Test
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16690'>SPARK-16690</a>] -         rename SQLTestUtils.withTempTable to withTempView
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16722'>SPARK-16722</a>] -         Fix a StreamingContext leak in StreamingContextSuite when eventually fails
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17102'>SPARK-17102</a>] -         bypass UserDefinedGenerator for json format check
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17318'>SPARK-17318</a>] -         Fix flaky test: o.a.s.repl.ReplSuite replicating blocks of object with class defined in repl
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17326'>SPARK-17326</a>] -         Tests with HiveContext in SparkR being skipped always
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17473'>SPARK-17473</a>] -         jdbc docker tests are failing with java.lang.AbstractMethodError:
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17589'>SPARK-17589</a>] -         Fix test case `create external table`
</li>
</ul>
                                                                    
<h2>        Question
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17794'>SPARK-17794</a>] -         2.0.1 not in maven central repo?
</li>
</ul>
                                                                            
<h2>        Documentation
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16295'>SPARK-16295</a>] -         Extract SQL programming guide example snippets from source files instead of hard-coding them
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16761'>SPARK-16761</a>] -         Fix doc link in docs/ml-guide.md
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-16911'>SPARK-16911</a>] -         Remove migrating to a Spark 1.x version in programming guide documentation
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17085'>SPARK-17085</a>] -         Documentation and actual code differ - Unsupported Operations
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17089'>SPARK-17089</a>] -         Remove link to API doc for mapReduceTriplets because it&#39;s removed from the API
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17242'>SPARK-17242</a>] -         Update links of external dstream projects
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17561'>SPARK-17561</a>] -         DataFrameWriter documentation formatting problems
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-17575'>SPARK-17575</a>] -         Make correction in configuration documentation table tags
</li>
</ul>