pyspark.sql.datasource.DataSourceStreamReader
- class pyspark.sql.datasource.DataSourceStreamReader
A base class for streaming data source readers. Data source stream readers are responsible for outputting data from a streaming data source.
Methods
commit(end): Informs the source that Spark has completed processing all data for offsets less than or equal to end and will only request offsets greater than end in the future.
getDefaultReadLimit(): Returns the read limits potentially passed to the data source through options when creating the data source.
initialOffset(): Returns the initial offset of the streaming data source.
latestOffset(start, limit): Returns the most recent offset available given a read limit.
partitions(start, end): Returns a list of InputPartition given the start and end offsets.
read(partition): Generates data for a given partition and returns an iterator of tuples or rows.
reportLatestOffset(): Returns the most recent offset available.
stop(): Stops this source and frees any resources it has allocated.
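Example
The methods above can be sketched together in a small reader. This is a minimal illustration under stated assumptions, not the reference implementation: CounterStreamReader and RangePartition are hypothetical names, offsets are assumed to be plain dicts, the limit argument of latestOffset is accepted but ignored, and the one microbatch at the end is driven by hand where Spark would normally make these calls. The fallback classes exist only so the sketch also runs where the installed PySpark does not ship this API.

```python
try:
    from pyspark.sql.datasource import DataSourceStreamReader, InputPartition
except ImportError:
    # Stand-ins (assumption) so the sketch runs without a PySpark build
    # that provides pyspark.sql.datasource.
    class InputPartition:
        def __init__(self, value):
            self.value = value

    class DataSourceStreamReader:
        pass


class RangePartition(InputPartition):
    """Hypothetical partition covering the half-open offset range [start, end)."""

    def __init__(self, start, end):
        self.start = start
        self.end = end


class CounterStreamReader(DataSourceStreamReader):
    """Hypothetical reader that emits consecutive integers, 10 per microbatch."""

    def __init__(self):
        self.current = 0

    def initialOffset(self):
        # Offset from which the very first microbatch starts.
        return {"offset": 0}

    def latestOffset(self, start=None, limit=None):
        # Pretend 10 new records arrived; the read limit is ignored here.
        self.current += 10
        return {"offset": self.current}

    def partitions(self, start, end):
        # Split the offset range of one microbatch into two partitions.
        mid = (start["offset"] + end["offset"]) // 2
        return [
            RangePartition(start["offset"], mid),
            RangePartition(mid, end["offset"]),
        ]

    def read(self, partition):
        # Yield one single-column tuple per offset in the partition.
        for i in range(partition.start, partition.end):
            yield (i,)

    def commit(self, end):
        # Nothing to clean up in this sketch; a real source might
        # discard data up to and including `end`.
        pass

    def stop(self):
        # No resources are held by this sketch.
        pass


# Drive one microbatch by hand, in the order the method descriptions imply.
reader = CounterStreamReader()
start = reader.initialOffset()
end = reader.latestOffset(start)
rows = [row for p in reader.partitions(start, end) for row in reader.read(p)]
reader.commit(end)
reader.stop()
```

Keeping the offset as a small dict of primitive values makes it easy to serialize, and deriving partitions purely from the start and end offsets means the same microbatch can be replanned from those two values alone.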