Class AbstractFlowScanner

  • All Implemented Interfaces:
    Iterable<FlowNode>, Iterator<FlowNode>, Filterator<FlowNode>
    Direct Known Subclasses:
    DepthFirstScanner, ForkScanner, LinearScanner

    @NotThreadSafe
    public abstract class AbstractFlowScanner
    extends Object
    implements Iterable<FlowNode>, Filterator<FlowNode>
    Core APIs and base logic for FlowScanners that extract information from a pipeline execution.

    These iterate through the directed acyclic graph (DAG) or "flow graph" of FlowNodes produced when a pipeline runs.

    This provides 6 base APIs to use, in decreasing expressiveness and increasing genericity:

    All APIs visit the parent nodes, walking backward from heads(inclusive) until they they hit myBlackList nodes (exclusive) or reach the end of the DAG. If denyList nodes are an empty collection or null, APIs will walk to the beginning of the FlowGraph. Multiple denyList nodes are helpful for putting separate bounds on walking different parallel branches.

    Key Points:

    • There are many helper methods offering syntactic sugar for the above APIs in common use cases (simpler method signatures).
    • Each implementation provides its own iteration order (described in its javadoc comments), but it is generally unsafe to rely on parallel branches being visited in a specific order.
    • Implementations may visit some or all points in the DAG, this should be called out in the class's javadoc comments
    • FlowScanners are NOT thread safe, for performance reasons and because it is too hard to guarantee.
    • Many fields and methods are protected: this is intentional to allow building upon the implementations for more complex analyses.
    • Each FlowScanner stores state internally for several reasons:
      • This state can be used to construct more advanced analyses.
      • FlowScanners can be reinitialized and reused repeatedly: avoids the overheads of creating scanners repeatedly.
      • Allows for caching to be added inside a FlowScanner if desired, but caching is only useful when reused.

    Suggested uses:

    Implementations are generally NOT threadsafe and should be so annotated
    Author:
    Sam Van Oort
    • Field Detail

      • myCurrent

        protected FlowNode myCurrent
      • MAX_LIST_CHECK_SIZE

        protected static final int MAX_LIST_CHECK_SIZE
        When checking for denylist membership, we convert to a hashset when checking more than this many elements
        See Also:
        Constant Field Values
    • Constructor Detail

      • AbstractFlowScanner

        public AbstractFlowScanner()
    • Method Detail

      • convertToFastCheckable

        @NonNull
        protected Collection<FlowNode> convertToFastCheckable​(@CheckForNull
                                                              Collection<FlowNode> nodeCollection)
        Helper: convert stop nodes to a collection that can efficiently be checked for membership, handling null if needed
      • setup

        public boolean setup​(@CheckForNull
                             Collection<FlowNode> heads,
                             @CheckForNull
                             Collection<FlowNode> blackList)
        Set up for iteration/analysis on a graph of nodes, initializing the internal state Includes null-checking on arguments to allow directly calling with unchecked inputs (simplifies use).
        Parameters:
        heads - The head nodes we start walking from (the most recently executed nodes, i.e. FlowExecution.getCurrentHeads()
        blackList - Nodes that we cannot visit or walk past (useful to limit scanning to only nodes after a specific point)
        Returns:
        True if we can have nodes to work with, otherwise false
      • reset

        protected abstract void reset()
        Reset internal state so that we can begin walking a new flow graph Public APIs need to invoke this before searches
      • setHeads

        protected abstract void setHeads​(@NonNull
                                         Collection<FlowNode> filteredHeads)
        Set up to begin flow scanning using the filteredHeads as starting points This method makes several assumptions: - reset() has already been invoked to reset state - filteredHeads has already had any points in myBlackList removed - none of the filteredHeads are null
        Parameters:
        filteredHeads - Head nodes that have been filtered against denyList
      • next

        @CheckForNull
        protected abstract FlowNode next​(@NonNull
                                         FlowNode current,
                                         @NonNull
                                         Collection<FlowNode> blackList)
        Actual meat of the iteration, get the next node to visit, using and updating state as needed
        Parameters:
        current - Current node to use in generating next value
        blackList - Nodes that are not eligible for visiting
        Returns:
        Next node to visit, or null if we've exhausted the node list
      • filter

        @NonNull
        public Filterator<FlowNode> filter​(@NonNull
                                           com.google.common.base.Predicate<FlowNode> filterCondition)
        Expose a filtered view of this FlowScanner's output.
        Specified by:
        filter in interface Filterator<FlowNode>
        Parameters:
        filterCondition - Filterator only returns FlowNodes matching this predicate.
        Returns:
        A Filterator against this FlowScanner, which can be filtered in additional ways.
      • findFirstMatch

        @CheckForNull
        public FlowNode findFirstMatch​(@CheckForNull
                                       Collection<FlowNode> heads,
                                       @CheckForNull
                                       Collection<FlowNode> blackListNodes,
                                       com.google.common.base.Predicate<FlowNode> matchCondition)
        Find the first FlowNode within the iteration order matching a given condition Includes null-checking on arguments to allow directly calling with unchecked inputs (simplifies use).
        Parameters:
        heads - Head nodes to start walking from
        blackListNodes - Nodes that are never visited, search stops here (bound is exclusive). If you want to create an inclusive bound, just use a node's parents.
        matchCondition - Predicate to match when we've successfully found a given node type
        Returns:
        First matching node, or null if no matches found
      • filteredNodes

        @NonNull
        public List<FlowNode> filteredNodes​(@CheckForNull
                                            Collection<FlowNode> heads,
                                            @CheckForNull
                                            Collection<FlowNode> blackList,
                                            com.google.common.base.Predicate<FlowNode> matchCondition)
        Return a filtered list of FlowNodes matching a condition, in the order encountered. Includes null-checking on arguments to allow directly calling with unchecked inputs (simplifies use).
        Parameters:
        heads - Nodes to start iterating backward from by visiting their parents.
        blackList - Nodes we may not visit or walk beyond.
        matchCondition - Predicate that must be met for nodes to be included in output. Input is always non-null.
        Returns:
        List of flownodes matching the predicate.
      • allNodes

        @NonNull
        public List<FlowNode> allNodes​(@CheckForNull
                                       Collection<FlowNode> heads)
        Convenience method to get the list all flownodes in the iterator order.
      • allNodes

        @NonNull
        public List<FlowNode> allNodes​(@CheckForNull
                                       FlowExecution exec)
        Convenience method to get the list of all FlowNodes for the execution, in iterator order.
      • filteredNodes

        @NonNull
        public List<FlowNode> filteredNodes​(@CheckForNull
                                            FlowExecution exec,
                                            @NonNull
                                            com.google.common.base.Predicate<FlowNode> matchPredicate)
      • visitAll

        public void visitAll​(@CheckForNull
                             Collection<FlowNode> heads,
                             @CheckForNull
                             Collection<FlowNode> blackList,
                             @NonNull
                             FlowNodeVisitor visitor)
        Given a FlowNodeVisitor, invoke FlowNodeVisitor.visit(FlowNode) on each node and halt early if it returns false. Includes null-checking on all but the visitor, to allow directly calling with unchecked inputs (simplifies use). Useful if you wish to collect some information from every node in the FlowGraph. To do that, accumulate internal state in the visitor, and invoke a getter when complete.
        Parameters:
        heads - Nodes to start walking the DAG backwards from.
        blackList - Nodes we can't visit or pass beyond.
        visitor - Visitor that will see each FlowNode encountered.