public class GraphImpl<VD,ED> extends Graph<VD,ED> implements scala.Serializable
Graph to support computation on graphs.
Graphs are represented using two RDDs: vertices, which contains vertex attributes and the
routing information for shipping vertex attributes to edge partitions, and
replicatedVertexView, which contains edges and the vertex attributes mentioned by each edge.
| Modifier and Type | Method and Description |
|---|---|
<A> VertexRDD<A> |
aggregateMessagesWithActiveSet(scala.Function1<EdgeContext<VD,ED,A>,scala.runtime.BoxedUnit> sendMsg,
scala.Function2<A,A,A> mergeMsg,
TripletFields tripletFields,
scala.Option<scala.Tuple2<VertexRDD<?>,EdgeDirection>> activeSetOpt,
scala.reflect.ClassTag<A> evidence$10) |
static <VD,ED> GraphImpl<VD,ED> |
apply(RDD<Edge<ED>> edges,
VD defaultVertexAttr,
StorageLevel edgeStorageLevel,
StorageLevel vertexStorageLevel,
scala.reflect.ClassTag<VD> evidence$13,
scala.reflect.ClassTag<ED> evidence$14)
Create a graph from edges, setting referenced vertices to
defaultVertexAttr. |
static <VD,ED> GraphImpl<VD,ED> |
apply(RDD<scala.Tuple2<Object,VD>> vertices,
RDD<Edge<ED>> edges,
VD defaultVertexAttr,
StorageLevel edgeStorageLevel,
StorageLevel vertexStorageLevel,
scala.reflect.ClassTag<VD> evidence$17,
scala.reflect.ClassTag<ED> evidence$18)
Create a graph from vertices and edges, setting missing vertices to
defaultVertexAttr. |
static <VD,ED> GraphImpl<VD,ED> |
apply(VertexRDD<VD> vertices,
EdgeRDD<ED> edges,
scala.reflect.ClassTag<VD> evidence$19,
scala.reflect.ClassTag<ED> evidence$20)
Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices.
|
Graph<VD,ED> |
cache()
Caches the vertices and edges associated with this graph at the previously-specified target
storage levels, which default to
MEMORY_ONLY. |
void |
checkpoint()
Mark this Graph for checkpointing.
|
EdgeRDDImpl<ED,VD> |
edges()
An RDD containing the edges and their associated attributes.
|
static <VD,ED> GraphImpl<VD,ED> |
fromEdgePartitions(RDD<scala.Tuple2<Object,org.apache.spark.graphx.impl.EdgePartition<ED,VD>>> edgePartitions,
VD defaultVertexAttr,
StorageLevel edgeStorageLevel,
StorageLevel vertexStorageLevel,
scala.reflect.ClassTag<VD> evidence$15,
scala.reflect.ClassTag<ED> evidence$16)
Create a graph from EdgePartitions, setting referenced vertices to
defaultVertexAttr. |
static <VD,ED> GraphImpl<VD,ED> |
fromExistingRDDs(VertexRDD<VD> vertices,
EdgeRDD<ED> edges,
scala.reflect.ClassTag<VD> evidence$21,
scala.reflect.ClassTag<ED> evidence$22)
Create a graph from a VertexRDD and an EdgeRDD with the same replicated vertex type as the
vertices.
|
scala.collection.Seq<String> |
getCheckpointFiles()
Gets the name of the files to which this Graph was checkpointed.
|
Graph<VD,ED> |
groupEdges(scala.Function2<ED,ED,ED> merge)
Merges multiple edges between two vertices into a single edge.
|
boolean |
isCheckpointed()
Return whether this Graph has been checkpointed or not.
|
<ED2> Graph<VD,ED2> |
mapEdges(scala.Function2<Object,scala.collection.Iterator<Edge<ED>>,scala.collection.Iterator<ED2>> f,
scala.reflect.ClassTag<ED2> evidence$6)
Transforms each edge attribute using the map function, passing it a whole partition at a
time.
|
<ED2> Graph<VD,ED2> |
mapTriplets(scala.Function2<Object,scala.collection.Iterator<EdgeTriplet<VD,ED>>,scala.collection.Iterator<ED2>> f,
TripletFields tripletFields,
scala.reflect.ClassTag<ED2> evidence$7)
Transforms each edge attribute a partition at a time using the map function, passing it the
adjacent vertex attributes as well.
|
<VD2> Graph<VD2,ED> |
mapVertices(scala.Function2<Object,VD,VD2> f,
scala.reflect.ClassTag<VD2> evidence$5,
scala.Predef.$eq$colon$eq<VD,VD2> eq)
Transforms each vertex attribute in the graph using the map function.
|
<VD2,ED2> Graph<VD,ED> |
mask(Graph<VD2,ED2> other,
scala.reflect.ClassTag<VD2> evidence$8,
scala.reflect.ClassTag<ED2> evidence$9)
Restricts the graph to only the vertices and edges that are also in
other, but keeps the
attributes from this graph. |
<U,VD2> Graph<VD2,ED> |
outerJoinVertices(RDD<scala.Tuple2<Object,U>> other,
scala.Function3<Object,VD,scala.Option<U>,VD2> updateF,
scala.reflect.ClassTag<U> evidence$11,
scala.reflect.ClassTag<VD2> evidence$12,
scala.Predef.$eq$colon$eq<VD,VD2> eq)
Joins the vertices with entries in the
table RDD and merges the results using mapFunc. |
Graph<VD,ED> |
partitionBy(PartitionStrategy partitionStrategy)
Repartitions the edges in the graph according to
partitionStrategy. |
Graph<VD,ED> |
partitionBy(PartitionStrategy partitionStrategy,
int numPartitions)
Repartitions the edges in the graph according to
partitionStrategy. |
Graph<VD,ED> |
persist(StorageLevel newLevel)
Caches the vertices and edges associated with this graph at the specified storage level,
ignoring any target storage levels previously set.
|
org.apache.spark.graphx.impl.ReplicatedVertexView<VD,ED> |
replicatedVertexView() |
Graph<VD,ED> |
reverse()
Reverses all edges in the graph.
|
Graph<VD,ED> |
subgraph(scala.Function1<EdgeTriplet<VD,ED>,Object> epred,
scala.Function2<Object,VD,Object> vpred)
Restricts the graph to only the vertices and edges satisfying the predicates.
|
RDD<EdgeTriplet<VD,ED>> |
triplets()
An RDD containing the edge triplets, which are edges along with the vertex data associated with
the adjacent vertices.
|
Graph<VD,ED> |
unpersist(boolean blocking)
Uncaches both vertices and edges of this graph.
|
Graph<VD,ED> |
unpersistVertices(boolean blocking)
Uncaches only the vertices of this graph, leaving the edges alone.
|
VertexRDD<VD> |
vertices()
An RDD containing the vertices and their associated attributes.
|
aggregateMessages, fromEdges, fromEdgeTuples, graphToGraphOps, mapEdges, mapTriplets, mapTriplets, opspublic static <VD,ED> GraphImpl<VD,ED> apply(RDD<Edge<ED>> edges, VD defaultVertexAttr, StorageLevel edgeStorageLevel, StorageLevel vertexStorageLevel, scala.reflect.ClassTag<VD> evidence$13, scala.reflect.ClassTag<ED> evidence$14)
defaultVertexAttr.edges - (undocumented)defaultVertexAttr - (undocumented)edgeStorageLevel - (undocumented)vertexStorageLevel - (undocumented)evidence$13 - (undocumented)evidence$14 - (undocumented)public static <VD,ED> GraphImpl<VD,ED> fromEdgePartitions(RDD<scala.Tuple2<Object,org.apache.spark.graphx.impl.EdgePartition<ED,VD>>> edgePartitions, VD defaultVertexAttr, StorageLevel edgeStorageLevel, StorageLevel vertexStorageLevel, scala.reflect.ClassTag<VD> evidence$15, scala.reflect.ClassTag<ED> evidence$16)
defaultVertexAttr.edgePartitions - (undocumented)defaultVertexAttr - (undocumented)edgeStorageLevel - (undocumented)vertexStorageLevel - (undocumented)evidence$15 - (undocumented)evidence$16 - (undocumented)public static <VD,ED> GraphImpl<VD,ED> apply(RDD<scala.Tuple2<Object,VD>> vertices, RDD<Edge<ED>> edges, VD defaultVertexAttr, StorageLevel edgeStorageLevel, StorageLevel vertexStorageLevel, scala.reflect.ClassTag<VD> evidence$17, scala.reflect.ClassTag<ED> evidence$18)
defaultVertexAttr.vertices - (undocumented)edges - (undocumented)defaultVertexAttr - (undocumented)edgeStorageLevel - (undocumented)vertexStorageLevel - (undocumented)evidence$17 - (undocumented)evidence$18 - (undocumented)public static <VD,ED> GraphImpl<VD,ED> apply(VertexRDD<VD> vertices, EdgeRDD<ED> edges, scala.reflect.ClassTag<VD> evidence$19, scala.reflect.ClassTag<ED> evidence$20)
VertexRDD.withEdges or an appropriate VertexRDD constructor.vertices - (undocumented)edges - (undocumented)evidence$19 - (undocumented)evidence$20 - (undocumented)public static <VD,ED> GraphImpl<VD,ED> fromExistingRDDs(VertexRDD<VD> vertices, EdgeRDD<ED> edges, scala.reflect.ClassTag<VD> evidence$21, scala.reflect.ClassTag<ED> evidence$22)
VertexRDD.withEdges or an appropriate VertexRDD constructor.vertices - (undocumented)edges - (undocumented)evidence$21 - (undocumented)evidence$22 - (undocumented)public VertexRDD<VD> vertices()
Graphpublic org.apache.spark.graphx.impl.ReplicatedVertexView<VD,ED> replicatedVertexView()
public EdgeRDDImpl<ED,VD> edges()
Graphpublic RDD<EdgeTriplet<VD,ED>> triplets()
Graphedges if the vertex data are not needed, i.e.
if only the edge data and adjacent vertex ids are needed.
public Graph<VD,ED> persist(StorageLevel newLevel)
Graphpublic Graph<VD,ED> cache()
GraphMEMORY_ONLY. This is used to pin a graph in memory enabling
multiple queries to reuse the same construction process.public void checkpoint()
Graphcheckpoint in class Graph<VD,ED>public boolean isCheckpointed()
GraphisCheckpointed in class Graph<VD,ED>public scala.collection.Seq<String> getCheckpointFiles()
GraphgetCheckpointFiles in class Graph<VD,ED>public Graph<VD,ED> unpersist(boolean blocking)
Graphpublic Graph<VD,ED> unpersistVertices(boolean blocking)
GraphunpersistVertices in class Graph<VD,ED>blocking - Whether to block until all data is unpersisted (default: false)public Graph<VD,ED> partitionBy(PartitionStrategy partitionStrategy)
GraphpartitionStrategy.
partitionBy in class Graph<VD,ED>partitionStrategy - the partitioning strategy to use when partitioning the edges
in the graph.public Graph<VD,ED> partitionBy(PartitionStrategy partitionStrategy, int numPartitions)
GraphpartitionStrategy.
partitionBy in class Graph<VD,ED>partitionStrategy - the partitioning strategy to use when partitioning the edges
in the graph.numPartitions - the number of edge partitions in the new graph.public Graph<VD,ED> reverse()
Graphpublic <VD2> Graph<VD2,ED> mapVertices(scala.Function2<Object,VD,VD2> f, scala.reflect.ClassTag<VD2> evidence$5, scala.Predef.$eq$colon$eq<VD,VD2> eq)
GraphmapVertices in class Graph<VD,ED>f - the function from a vertex object to a new vertex value
evidence$5 - (undocumented)eq - (undocumented)public <ED2> Graph<VD,ED2> mapEdges(scala.Function2<Object,scala.collection.Iterator<Edge<ED>>,scala.collection.Iterator<ED2>> f, scala.reflect.ClassTag<ED2> evidence$6)
GraphmapTriplets.
public <ED2> Graph<VD,ED2> mapTriplets(scala.Function2<Object,scala.collection.Iterator<EdgeTriplet<VD,ED>>,scala.collection.Iterator<ED2>> f, TripletFields tripletFields, scala.reflect.ClassTag<ED2> evidence$7)
GraphmapEdges instead.
mapTriplets in class Graph<VD,ED>f - the iterator transformtripletFields - which fields should be included in the edge triplet passed to the map
function. If not all fields are needed, specifying this can improve performance.
evidence$7 - (undocumented)public Graph<VD,ED> subgraph(scala.Function1<EdgeTriplet<VD,ED>,Object> epred, scala.Function2<Object,VD,Object> vpred)
Graph
V' = {v : for all v in V where vpred(v)}
E' = {(u,v): for all (u,v) in E where epred((u,v)) && vpred(u) && vpred(v)}
subgraph in class Graph<VD,ED>epred - the edge predicate, which takes a triplet and
evaluates to true if the edge is to remain in the subgraph. Note
that only edges where both vertices satisfy the vertex
predicate are considered.
vpred - the vertex predicate, which takes a vertex object and
evaluates to true if the vertex is to be included in the subgraph
public <VD2,ED2> Graph<VD,ED> mask(Graph<VD2,ED2> other, scala.reflect.ClassTag<VD2> evidence$8, scala.reflect.ClassTag<ED2> evidence$9)
Graphother, but keeps the
attributes from this graph.public Graph<VD,ED> groupEdges(scala.Function2<ED,ED,ED> merge)
GraphpartitionBy.
groupEdges in class Graph<VD,ED>merge - the user-supplied commutative associative function to merge edge attributes
for duplicate edges.
public <A> VertexRDD<A> aggregateMessagesWithActiveSet(scala.Function1<EdgeContext<VD,ED,A>,scala.runtime.BoxedUnit> sendMsg, scala.Function2<A,A,A> mergeMsg, TripletFields tripletFields, scala.Option<scala.Tuple2<VertexRDD<?>,EdgeDirection>> activeSetOpt, scala.reflect.ClassTag<A> evidence$10)
public <U,VD2> Graph<VD2,ED> outerJoinVertices(RDD<scala.Tuple2<Object,U>> other, scala.Function3<Object,VD,scala.Option<U>,VD2> updateF, scala.reflect.ClassTag<U> evidence$11, scala.reflect.ClassTag<VD2> evidence$12, scala.Predef.$eq$colon$eq<VD,VD2> eq)
Graphtable RDD and merges the results using mapFunc.
The input table should contain at most one entry for each vertex. If no entry in other is
provided for a particular vertex in the graph, the map function receives None.
outerJoinVertices in class Graph<VD,ED>other - the table to join with the vertices in the graph.
The table should contain at most one entry for each vertex.updateF - the function used to compute the new vertex values.
The map function is invoked for all vertices, even those
that do not have a corresponding entry in the table.
evidence$11 - (undocumented)evidence$12 - (undocumented)eq - (undocumented)