Class DataflowGraph
Object
org.apache.spark.sql.pipelines.graph.DataflowGraph
- All Implemented Interfaces:
Serializable,org.apache.spark.internal.Logging,GraphOperations,GraphValidations,scala.Equals,scala.Product
public class DataflowGraph
extends Object
implements GraphOperations, GraphValidations, scala.Product, Serializable
DataflowGraph represents the core graph structure for Spark declarative pipelines.
It manages the relationships between logical flows, tables, and views, providing
operations for graph traversal, validation, and transformation.
- See Also:
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter -
Constructor Summary
ConstructorsConstructorDescriptionDataflowGraph(scala.collection.immutable.Seq<Flow> flows, scala.collection.immutable.Seq<Table> tables, scala.collection.immutable.Seq<View> views) -
Method Summary
Modifier and TypeMethodDescriptionscala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,Flow> flow()scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,FlowNode> A map from flow identifier to `FlowNode`, which contains the input/output nodes.scala.collection.immutable.Seq<Flow>flows()scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,scala.collection.immutable.Seq<Flow>> flowsTo()scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,StructType> scala.collection.immutable.Set<org.apache.spark.sql.catalyst.TableIdentifier>scala.collection.immutable.Set<org.apache.spark.sql.catalyst.TableIdentifier>The identifiers ofmaterializedFlows().scala.collection.immutable.Seq<ResolvedFlow>scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,Output> output()scala.collection.immutable.Seq<PersistedView>scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,ResolutionFailedFlow> scala.collection.immutable.Seq<ResolutionFailedFlow>resolve()booleanresolved()Returns true iff allFlows are successfully analyzed.scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,ResolvedFlow> scala.collection.immutable.Seq<ResolvedFlow>scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,scala.collection.immutable.Seq<ResolvedFlow>> scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,Table> table()scala.collection.immutable.Seq<Table>tables()validate()Ensure that theDataflowGraphis valid and throws errors if not.scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,View> view()scala.collection.immutable.Seq<View>views()Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitMethods inherited from interface scala.Equals
canEqual, equalsMethods inherited from interface org.apache.spark.sql.pipelines.graph.GraphOperations
dfsInternal, downstreamFlows, reachabilitySet, reachabilitySet, upstreamDatasets, upstreamDatasets, upstreamFlowsMethods inherited from interface org.apache.spark.sql.pipelines.graph.GraphValidations
detectCycle, validateFlowStreamingness, validateGraphIsTopologicallySorted, validateMultiQueryTables, validatePersistedViewSources, validateSuccessfulFlowAnalysis, validateTablesAreResettable, validateTablesAreResettable, validateUserSpecifiedSchemasMethods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContextMethods inherited from interface scala.Product
productArity, productElement, productElementName, productElementNames, productIterator, productPrefix
-
Constructor Details
-
DataflowGraph
-
-
Method Details
-
flowNodes
public scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,FlowNode> flowNodes()Description copied from interface:GraphOperationsA map from flow identifier to `FlowNode`, which contains the input/output nodes.- Specified by:
flowNodesin interfaceGraphOperations
-
flows
-
tables
-
views
-
output
public scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,Output> output() -
materializedFlows
-
materializedFlowIdentifiers
public scala.collection.immutable.Set<org.apache.spark.sql.catalyst.TableIdentifier> materializedFlowIdentifiers()The identifiers ofmaterializedFlows(). -
table
-
flow
-
view
-
persistedViews
-
inputIdentifiers
public scala.collection.immutable.Set<org.apache.spark.sql.catalyst.TableIdentifier> inputIdentifiers() -
flowsTo
public scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,scala.collection.immutable.Seq<Flow>> flowsTo() -
resolvedFlows
-
resolvedFlow
public scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,ResolvedFlow> resolvedFlow() -
resolvedFlowsTo
public scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,scala.collection.immutable.Seq<ResolvedFlow>> resolvedFlowsTo() -
resolutionFailedFlows
-
resolutionFailedFlow
public scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,ResolutionFailedFlow> resolutionFailedFlow() -
inferredSchema
public scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,StructType> inferredSchema() -
validate
Ensure that theDataflowGraphis valid and throws errors if not. -
resolved
public boolean resolved()Returns true iff allFlows are successfully analyzed. -
resolve
-